Samples of C. harmani were collected from the Nagqu
area in Tibet, China, and a voucher specimen was deposited
in the Institute of Zoology, Shaanxi Normal University;
C. mantchuricum and C. crossoptilon were collected
from the Beijing Zoo, and samples were obtained from
bird specimen collections of the National Zoological
Museum, Institute of Zoology, Chinese Academy of
Sciences. The collection was conducted under a permit from the
Forestry Department and conformed to the National
Wildlife Conservation Law in China. No living animal
experiments were conducted in the current research.
All the samples were preserved in 100% ethanol and
stored at −20°C. The total genomic DNA was extracted
from the liver/muscle tissue using the standard phenol/
chloroform method [16].
PCR amplification and sequencing
The PCRs were performed under the following conditions:
2 min initial denaturation at 93°C; 40 cycles, consisting of
10 s denaturation at 92°C, 30 s annealing at 58–53°C and
10 min elongation at 68°C in the first 20 cycles,
and 10 s denaturation at 92°C, 30 s annealing at 53°C
and 10 min elongation at 68°C, with 20 s per cycle added
to the elongation step, in the subsequent 20 cycles; and
a final extension for 7 min at 68°C. The amplifications
were performed in 15 μL reactions containing
2.4 μL of 2.5 mM dNTPs, 2.1 μL of each primer at 10
μM, 1.5 μL of 10× LA PCR Buffer I (Mg2+-free), 1.5 μL
of 25 mM MgCl2, 1 μL of DNA template, 0.18 μL of 5
U/μL LA Taq polymerase (Takara, Dalian, China) and
4.22 μL ddH2O.
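The reaction recipe above can be checked with simple dilution arithmetic. The following is a minimal sketch, not part of the original protocol: the dictionary layout, the function names and the "forward"/"reverse" primer labels are illustrative assumptions. It confirms that the listed volumes sum to the stated 15 μL and derives final concentrations as stock × volume / reaction volume.

```python
# Components as listed in the text: (volume in uL, stock concentration, unit).
# Template and water carry no stock concentration in the recipe, so None is used.
# The "forward"/"reverse" labels are illustrative; the text says "each primer".
REACTION_UL = 15.0

components = {
    "dNTPs": (2.4, 2.5, "mM"),
    "forward primer": (2.1, 10.0, "uM"),
    "reverse primer": (2.1, 10.0, "uM"),
    "10x LA PCR Buffer I": (1.5, 10.0, "x"),
    "MgCl2": (1.5, 25.0, "mM"),
    "DNA template": (1.0, None, None),
    "LA Taq": (0.18, 5.0, "U/uL"),
    "ddH2O": (4.22, None, None),
}


def total_volume(mix):
    """Sum of all component volumes in microlitres."""
    return sum(volume for volume, _, _ in mix.values())


def final_concentration(mix, name, reaction_ul=REACTION_UL):
    """Final concentration after dilution: stock * volume / reaction volume."""
    volume, stock, _ = mix[name]
    if stock is None:
        raise ValueError(f"no stock concentration recorded for {name}")
    return stock * volume / reaction_ul


print(f"total volume: {total_volume(components):.2f} uL")
for name, (_, stock, unit) in components.items():
    if stock is not None:
        print(f"{name}: {final_concentration(components, name):.3g} {unit}")
```

Under these assumptions the mix corresponds to 0.4 mM dNTPs, 1.4 μM of each primer, 2.5 mM MgCl2 and 1× buffer in the final reaction.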
The sequences of the primers used for PCR amplification
and sequencing of the mitochondrial genes were
obtained from Sorenson (2003) [17] with minor changes
(Additional file 1). The mitogenome of C. harmani was
amplified in seven parts and the gaps were bridged using
other adjacent primers. The PCR products were purified
using the DNA Agarose Gel Extraction Kit (Bioteke,
Beijing, China) after separation by electrophoresis on a
0.8% agarose gel. After separation and purification, the
PCR products were sequenced by Sangon Biotech
(Shanghai) Co., LTD. using the primer-walking method.
The C. mantchuricum and C. crossoptilon mitogenomes
were amplified in 12 or 13 fragments, and sequencing was
performed using the Illumina HiSeq 2000 high-throughput
sequencing system of Shenzhen Huada Gene Technology
Co., LTD. The gaps in the assembly after high-throughput
sequencing were filled in by direct sequencing using the
ABI 3730 DNA sequencer by Sangon Biotech (Shanghai)
Co., LTD., using adjacent PCR primers.
Gene identification and genome analyses
The Staden sequence analysis package [18] was used
for sequence assembly and annotation of the C. harmani
mitogenome. The mitogenomes of C. mantchuricum
and C. crossoptilon were assembled
using the SOAPdenovo software.
Most tRNA genes were identified using tRNAscan-SE
1.21 [19] under the ‘tRNAscan only’ search mode, with
the vertebrate mitochondrial genetic code and ‘mito/
chloroplast’ source. The protein-coding genes (PCGs),
rRNA genes and the remaining putative tRNA genes
that were not identified by tRNAscan-SE were identified
by sequence comparison with other Galliformes
species. The rrnS secondary structure of Crossoptilon
was predicted based on the structures of Gallus gallus
and Anas platyrhynchos obtained from the Comparative
RNA Web (CRW) site [20] and on the Pseudopodoces humilis
(now Parus humilis) structure [16]. The rrnL secondary
structure was predicted based on the structure
of Xenopus laevis obtained from the CRW database and on
the Bos taurus [21] and P. humilis [16] structures. The
RNAstructure software was used to identify and draw
potential secondary structures in the single-stranded
control region. The nucleotide compositions of the
mitogenomes and amino acid information were analysed
using MEGA 4.1 [22].
Sequence alignments
Along with the entire mitogenomes obtained in this
study, 42 Galliformes sequences were used in the phylogenetic
analysis, including two outgroups (Numida
meleagris and Alectura lathami). The DNA sequences of
the other species used in the phylogenetic analyses were
downloaded from GenBank (the accession numbers and
key information are shown in Additional file 2). The
tRNA and rRNA genes and the CR were individually
aligned using ClustalX 1.83 [23] with the default settings.
Each of the 13 protein-coding genes was translated into
amino acids, aligned using MEGA 4.1 [22] with
default parameters, and finally back-translated
into nucleotide sequences.
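The translate, align and back-translate step described above keeps gaps in whole-codon units so that the reading frame is preserved. The following is a minimal sketch of the final back-translation only, written independently of MEGA; the function name and the toy sequences are illustrative assumptions, and it assumes that any terminal stop codon (complete or incomplete) has already been trimmed from the coding sequence.

```python
def backtranslate(aligned_protein, cds, gap="-"):
    """Thread an unaligned coding sequence onto its aligned protein.

    Each residue in the aligned protein consumes the next codon of `cds`;
    each gap in the protein becomes a three-nucleotide gap.
    Assumes stop codons have been removed from `cds` beforehand.
    """
    n_residues = sum(1 for residue in aligned_protein if residue != gap)
    if len(cds) != 3 * n_residues:
        raise ValueError("CDS length must be three times the residue count")

    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    position = 0
    aligned_codons = []
    for residue in aligned_protein:
        if residue == gap:
            aligned_codons.append(gap * 3)
        else:
            aligned_codons.append(codons[position])
            position += 1
    return "".join(aligned_codons)


# Toy example: a three-residue protein aligned with one gap.
print(backtranslate("M-TL", "ATGACCCTA"))  # ATG---ACCCTA
```

Because the nucleotides are copied from the original sequence rather than re-derived from the amino acids, synonymous differences between species are retained in the codon alignment.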
Phylogenetic analyses of Phasianidae
Datasets containing 13 protein-coding genes (PCG) and
all 37 genes plus the control region (mitogenome) were
used to study the phylogenetic relationships within
Phasianidae. Phylogenetic analysis based on nucleotide
sequences was performed using PAUP*4.0b10 for the
maximum parsimony (MP) method [24], RAxML-7.0.3
for maximum likelihood (ML) [25] and MrBayes 3.1.2
for Bayesian inference (BI) [26]. For the ML and BI analyses,
models for the concatenated nucleotide sequence datasets
were assessed independently using AICc in MrModeltest 2.2
[27]. The best-fit model, GTR + I + G, was
chosen for the likelihood and Bayesian analyses. A consensus
tree was generated for MP analysis under the majority
rule. The reliability of the clades in the phylogenetic trees
was assessed by bootstrap probabilities (BSP) computed
using 1000 replicates, with random addition for each bootstrap
replicate. Bootstrap support from 1000 replicates was
also computed in the ML analysis. Bayesian analysis with
Markov chain Monte Carlo sampling was run for
1,000,000 generations, saving a tree every 100 generations,
with one cold and three heated chains, and the burn-in
period was determined by the time to convergence of the
likelihood scores. The Bayesian posterior probabilities
(BPP) were estimated on a 50% majority rule consensus
tree of the remaining trees.
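The Bayesian posterior probability of a clade is, in practice, the fraction of post-burn-in sampled trees that contain it, and a 50% majority-rule consensus keeps the clades whose fraction exceeds 0.5. The following is a minimal sketch of that bookkeeping, not the MrBayes implementation: trees are represented as sets of clades (each clade a frozenset of taxon names), and the taxa A to D, the toy sample and the burn-in of two trees are hypothetical.

```python
from collections import Counter


def n_samples(generations, sample_every):
    """Number of trees saved after the starting tree."""
    return generations // sample_every


def clade_posteriors(sampled_trees, n_burnin):
    """Fraction of post-burn-in trees that contain each clade."""
    kept = sampled_trees[n_burnin:]
    if not kept:
        raise ValueError("burn-in discards every sampled tree")
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: count / len(kept) for clade, count in counts.items()}


def majority_rule(posteriors, threshold=0.5):
    """Clades retained in a majority-rule consensus at the given threshold."""
    return {clade: p for clade, p in posteriors.items() if p > threshold}


# Toy sample over hypothetical taxa A-D; each tree is a set of clades.
tree_1 = {frozenset("AB"), frozenset("ABC")}
tree_2 = {frozenset("AC"), frozenset("ABC")}
samples = [tree_2, tree_2, tree_1, tree_1, tree_1, tree_2]

posteriors = clade_posteriors(samples, n_burnin=2)
consensus = majority_rule(posteriors)
print(n_samples(1_000_000, 100))
for clade, p in sorted(consensus.items(), key=lambda item: sorted(item[0])):
    print(sorted(clade), p)
```

With the run length and sampling interval given above, roughly 10,000 trees are saved per run before the burn-in is discarded.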
We examined the performance of individual genes and
datasets based on nucleotides; partitioned Bremer support
(PBS) analyses were performed using TreeRot.v3 [28] in
combination with PAUP*4.0b10 [24]. The following datasets were used for
analysis: 13 protein-coding genes, rrnS, rrnL, CR partitions,
the first, second and third codon positions of the PCGs, the
three tRNA gene clusters (IQM, WANCY and HSL), ATP
(atp6 + atp8), COX (cox1 + cox2 + cox3) and NADH
(nad1 + nad2 + nad3 + nad4 + nad4L + nad5 + nad6).
MEGA 4.1 [22] was used to calculate the pairwise
genetic distances among the four Crossoptilon species with default
parameters. The aligned mitogenome data and four
single genes (nad2, CR, cytb and rrnS) obtained from
GenBank (Additional file 3) were used to calculate the
genetic distances. These genes were aligned individually and
manually adjusted to consistent sequence lengths.
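To make the notion of a pairwise genetic distance concrete, the following is a minimal sketch of the uncorrected p-distance (the proportion of compared sites that differ). The distance model used under the MEGA default settings is not stated here, so the p-distance is a simplifying assumption, and the function names and toy sequences are hypothetical rather than data from this study.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def pairwise_distances(alignment):
    """p-distance for every pair of named sequences in an alignment."""
    return {
        (name_a, name_b): p_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(alignment, 2)
    }


# Hypothetical toy alignment, not sequences from this study.
toy_alignment = {
    "taxon_1": "ACGTACGT",
    "taxon_2": "ACGTACGA",
    "taxon_3": "ACGAACGA",
}
for pair, distance in pairwise_distances(toy_alignment).items():
    print(pair, distance)
```

The length check mirrors the manual adjustment described above: sequences must share a common aligned length before distances are comparable.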
Divergence time estimates focused on Crossoptilon
Along with the PCG dataset obtained in this study, 42
Galliformes sequences were used to estimate the divergence
times of the Crossoptilon species. Divergence
times were calculated using the
Bayesian procedure implemented in BEAST v. 1.7.2
[29,30]. A relaxed clock was used, with rates following
a log-normal distribution [31]. The GTR + I + G
model and a Yule prior were used in the analysis. The
calibration points were based on the fossil records showing
that stem Numididae-Phasianidae split at 50–54
Mya (million years ago) [32]; Arborophila rufipectus diverged
from the other lineages in the Galliformes around
39 Mya [33]; Coturnix-Gallus split at 35 Mya [34-36].
The results of runs of 10 million generations were used
after a burn-in of 100.
Ka and Ks analysis
To better understand the evolution at the DNA level
and the role of selection in the four Crossoptilon species,
we calculated the nonsynonymous and synonymous substitution
rates using KaKs_Calculator 2.0 [37] for six
groups [C. harmani-C. mantchuricum (C.har-C.man),
C. mantchuricum-C. crossoptilon (C.man-C.cro), C. harmani-
C. crossoptilon (C.har-C.cro), C. harmani-C. auritum
(C.har-C.aur), C. mantchuricum-C. auritum (C.man-C.aur),
and C. crossoptilon-C. auritum (C.cro-C.aur)].
The ratio of nonsynonymous substitution rate
(Ka) to synonymous substitution rate (Ks) is widely used
as an indicator of selective pressure at the sequence level
among different species. It is commonly accepted that
Ka > Ks, Ka = Ks, and Ka < Ks generally indicate positive
selection, neutral mutation, and negative selection, respectively
[38,39]. To calculate Ka, Ks and Ka/Ks, a
model averaging method was selected. This method fits
14 different models and derives
the average values for Ka, Ks, and Ka/Ks [37]. The genetic
code selected was the ‘vertebrate mitochondrial
code’. To further study the selective pressure acting on
each protein-coding gene in the genus Crossoptilon,
CodeML in the PAMLX software [40] was used to find sites
under strong selective pressure. The secondary structure
analysis of the amino acid sequences was performed using the online
software TOPCONS [4
