Introduction
Founder effects, genetic drift and recombination associated with
the global spread of HIV-1 infection have given rise to genetically
distinct viral strains referred to as ‘subtypes’ and ‘circulating
recombinant forms’ [1]. HIV-1 genetic diversity may impact on
disease progression and response to antiretroviral therapy, and has
implications for vaccine development [2]. It is therefore important
to monitor changes in the genetic and geographic complexity of
the HIV-1 epidemic, and to identify the processes that drive these
changes. Of the various HIV-1 strains that have been described, the most
prevalent worldwide is subtype C [3]. First described in East and
Southern Africa [4], infections with viruses belonging to (or
partially derived from) subtype C are now prevalent in regions
throughout the world, including India, China, and South America. In many of the regions where it has been introduced,
subtype C has overtaken other HIV-1 strains introduced at earlier
times [6–9]. Notably, studies suggest that subtype C may acquire
multi-drug resistance more rapidly than other HIV-1 subtypes.
The rapid spread of subtype C in regions of South America -
including Brazil, Argentina and Uruguay - has drawn particular
attention [12–16]. Recent studies indicate that the South
American subtype C epidemic likely derives from a single founder
virus that entered the continent via Southern Brazil, and was
derived from viral strains prevalent in East Africa [12,13].
However, the external route by which this virus spread from
East Africa to South America has remained unclear.
In the United Kingdom (UK), the prevalence of subtype C has
increased steadily since the early 1990s, and it now ranks as the
second most prevalent HIV-1 subtype after subtype B [17]. The overwhelming majority of subtype C infections in the UK occur in
individuals whose reported exposure risk is heterosexual contact,
and who were likely infected in Southern or Eastern Africa [18].
However, in a previous analysis of HIV-1 genetic diversity [19],
we observed that some subtype C isolates sampled within the UK
exhibit high levels of genetic similarity to isolates obtained in
South America. To explore this finding in greater detail, we
screened 8,309 subtype C sequences sampled within the UK to
identify isolates genetically linked to the South American
epidemic. We then examined the genetic relationships of these
isolates to subtype C isolates sampled worldwide.
Methods
Study Group and Reference Sequences
8,309 subtype C pol gene sequences sampled within the UK
were obtained from the UK HIV Drug Resistance Database
(www.hivrdb.org.uk). These sequences were generated by population
sequencing from plasma samples collected between 1996
and 2008, and were anonymously linked to data (obtained under
voluntary agreement of patients) describing the ethnicity, nationality
(country of birth) and exposure risk group of infected
individuals. Sequences were at least 1,000 nucleotides in length,
spanning the genomic region between nucleotide positions 2,253 and 3,251
(HXB2 coordinates). Sequences are available on request from the
UK HIV Drug Resistance Database.
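The inclusion criteria above (minimum length and coverage of the stated pol region) could be expressed as a simple filter. The following is a minimal sketch only: the record layout and the names `PolSequence` and `passes_inclusion` are hypothetical and are not taken from the UK HIV Drug Resistance Database.

```python
from dataclasses import dataclass

# Thresholds quoted in the text (HXB2 coordinates).
REGION_START = 2253
REGION_END = 3251
MIN_LENGTH = 1000


@dataclass
class PolSequence:
    """Hypothetical record for one pol sequence (illustrative only)."""
    seq_id: str
    hxb2_start: int  # first HXB2 position covered (inclusive)
    hxb2_end: int    # last HXB2 position covered (inclusive)
    length: int      # number of nucleotides in the sequence


def passes_inclusion(seq: PolSequence) -> bool:
    """Return True if the sequence is long enough and spans the whole region."""
    covers_region = seq.hxb2_start <= REGION_START and seq.hxb2_end >= REGION_END
    return seq.length >= MIN_LENGTH and covers_region


candidates = [
    PolSequence("example_1", 2200, 3300, 1101),  # kept
    PolSequence("example_2", 2300, 3300, 1001),  # starts too late
    PolSequence("example_3", 2253, 3251, 999),   # too short
]
kept = [s.seq_id for s in candidates if passes_inclusion(s)]
```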
A globally sampled reference sequence set comprising 1,289
previously published subtype C pol gene sequences annotated by
country of sampling was obtained from the Los Alamos HIV
Sequence Database (www.hiv.lanl.gov). The reference set included
sequences from Argentina (n = 8), Burundi (n = 92), Brazil
(n = 122), Botswana (n = 144), Ethiopia (n = 101), India (n = 74),
Kenya (n = 3), Tanzania (n = 65), Uganda (n = 11), South Africa
(n = 667) and the UK (n = 2). The IDs of reference sequences used
in this study are provided as supplementary information (File S1).
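As a quick consistency check, the per-country counts listed above can be summed and compared with the stated size of the reference set. The dictionary below simply transcribes the numbers given in the text; the variable names are illustrative.

```python
# Per-country counts of reference sequences, as listed in the text.
reference_counts = {
    "Argentina": 8,
    "Burundi": 92,
    "Brazil": 122,
    "Botswana": 144,
    "Ethiopia": 101,
    "India": 74,
    "Kenya": 3,
    "Tanzania": 65,
    "Uganda": 11,
    "South Africa": 667,
    "UK": 2,
}

STATED_TOTAL = 1289  # size of the reference set given in the text

total = sum(reference_counts.values())
counts_match = total == STATED_TOTAL
```

The counts sum to 1,289, which agrees with the stated total.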
Sequence Analysis
Sequences were classified into phylogenetic groups (i.e.
subtypes, circulating recombinant forms and within-subtype
lineages) using the REGA HIV-1 subtyping tool (version 2.0,
available at: www.bioafrica.net) [20–22]. Sequence alignments
were created using MUSCLE [23] and manually edited.
Maximum likelihood phylogenies were constructed using PhyML
[24], with parameters estimated from the dataset (nucleotide
substitution model = HKY85, transition/transversion ratio = 4.0,
gamma shape parameter = 0.780). Bayesian phylogenetic analysis
was performed using MrBayes v3.1.2 [25]. Bayesian phylogenies
were inferred using the GTR+I+Γ nucleotide substitution model selected
using Modeltest [26]. For each dataset, two runs of four chains each
(one cold and three heated, temp = 0.20) were run for 10^7
generations, with trees sampled every 1,000th generation. The
first 10% of samples were excluded as burn-in. Convergence of
parameters was assessed by calculating the effective sample size
(ESS) using TRACER v1.4 [27], excluding an initial 10% for each
run. All parameter estimates for each run showed ESS values
greater than 300. Shared drug resistance mutations were identified
using the calibrated population resistance (CPR) tool [28].
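To illustrate what an effective sample size measures, the sketch below estimates ESS from a single parameter trace after discarding a 10% burn-in. It uses a simple autocorrelation-based estimator (summing autocorrelations up to the first non-positive lag), which is an assumption made for illustration and is not necessarily the estimator implemented in TRACER. The function name and the simulated traces are hypothetical.

```python
import random


def effective_sample_size(samples, burn_in_fraction=0.10):
    """Estimate ESS as n / (1 + 2 * sum of positive-lag autocorrelations).

    The sum is truncated at the first lag whose autocorrelation is
    non-positive. This is a simple illustrative estimator only.
    """
    start = int(len(samples) * burn_in_fraction)
    x = samples[start:]  # discard burn-in
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    var = sum(c * c for c in centered) / n
    if var == 0:
        return float(n)  # constant trace: no autocorrelation to estimate
    rho_sum = 0.0
    for lag in range(1, n):
        acov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = acov / var
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


# Simulated traces (not real MCMC output): independent draws versus a
# strongly autocorrelated AR(1) series of the same length.
rng = random.Random(0)
iid_trace = [rng.gauss(0, 1) for _ in range(2000)]
ar_trace = [0.0]
for _ in range(1999):
    ar_trace.append(0.9 * ar_trace[-1] + rng.gauss(0, 1))

ess_iid = effective_sample_size(iid_trace)
ess_ar = effective_sample_size(ar_trace)
```

The autocorrelated trace yields a much smaller ESS than the independent one, which is why a threshold such as ESS > 300 is used as an indication that enough effectively independent samples have been collected.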
Estimation of Evolutionary Rates and Dates
All sequences used for estimation of dates were examined for
evidence of inter- and intra-subtype recombination. Sequences
that were not classified as pure (non-recombinant) subtype C by
REGA (inter-subtype recombination) [21,22] and SCUEL (intra-subtype
recombination) [29] were excluded.
Estimates of the evolutionary rate and the date of the most recent common
ancestor (MRCA) were obtained using a Bayesian Markov chain
Monte Carlo (MCMC) approach as implemented in BEAST v1.7.
Analyses were performed with a Bayesian Skyline coalescent tree
prior, under the GTR + I + Γ model of nucleotide substitution,
and using both a strict and a relaxed molecular clock (uncorrelated
lognormal model). Two separate MCMC chains were run for 10^8
generations for each dataset, sampled every 10,000th generation.
BEAST output was analyzed using TRACER v1.4 [27], with
uncertainty in parameter estimates reflected in the 95% highest
posterior density (HPD) intervals after excluding a burn-in of 10%.
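For readers unfamiliar with the 95% HPD summary, the following sketch shows one common way to compute it from posterior samples: discard the burn-in, sort the remaining values, and take the narrowest window containing 95% of them. This is an illustrative implementation under those assumptions, not the code used by TRACER; the function names `discard_burn_in` and `hpd_interval` are hypothetical.

```python
import math


def discard_burn_in(samples, fraction=0.10):
    """Drop the first `fraction` of the samples as burn-in."""
    return samples[int(len(samples) * fraction):]


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    x = sorted(samples)
    n = len(x)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: x[i + k - 1] - x[i])
    return x[best], x[best + k - 1]


# A chain sampled every 10,000th generation over 10^8 generations yields
# 10,000 samples; a 10% burn-in leaves 9,000 for summary statistics.
n_samples = 10**8 // 10_000
n_retained = len(discard_burn_in(list(range(n_samples))))

# Small skewed example: the narrowest window excludes the two large outliers.
example = [0, 1, 2, 3, 4, 5, 6, 7, 50, 100]
low, high = hpd_interval(example, mass=0.75)
```

Unlike an equal-tailed interval, the HPD interval is not forced to be symmetric in probability about the median, which makes it a natural summary for skewed posteriors such as MRCA dates.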