"DNA barcoding" is of limited value for identifying adelgids (Hemiptera: Adelgidae) but supports traditional morphological taxonomy

The sequence diversity in the mitochondrial cytochrome-c oxidase I (COI) gene was evaluated as a tool for resolving differences among species of European adelgids collected from several localities across the Czech Republic. Members of 7 genera and 16 species were examined, and as outgroups, two species of Phylloxeridae were used. Sequence divergences within species were on average less than 0.15%, whereas divergences between species ranged from 0.0 to 4.12% for congeneric and to 13.24% for intergeneric comparisons. It is concluded that DNA barcoding of Adelgidae is a powerful tool for identifying genera, but at the species level it works only in those cases where there are no species complexes. Nevertheless, it can be used as a complement to traditional, morphological taxonomy.


INTRODUCTION
Adelgids are a small group of aphids that feed on a variety of coniferous trees. With their very complex life cycles and biology (cyclical parthenogenesis, multiple generations, host-switching), they present a great taxonomic challenge. Their larval stages are morphologically almost indistinguishable, which makes species identification by traditional methods (i.e. microscopic inspection of a series of individuals from the same clone) very cumbersome and time-consuming. In fact, only a very skilled professional can distinguish adelgids with certainty, which is not very practical for everyday forestry management.
The classification of adelgids at both the species and genus levels is not completely satisfactory, and it is clear that, besides the morphological structures and bionomical details that have been studied since the end of the 19th century, modern molecular approaches are needed.
A first attempt to utilize molecular markers for adelgid systematics was that of Havill et al. (2007). Phylogenetic relationships were reconstructed based on the mitochondrial cytochrome oxidase subunit I (COI), cytochrome oxidase subunit II (COII), and cytochrome b (cytb) genes and the nuclear elongation factor 1α (EF-1α) locus. The results support the previous morphology-based division into two genera proposed by Annand (1928) or the two subfamilies of Börner & Heinze (1957), respectively. However, this study focused on higher level phylogenetic classification and host specificity. Several species were represented by only one individual or not at all.
Recently, one approach to DNA-based species identification has quickly gained many supporters as well as opponents. The method is called "DNA barcoding" and is based on the assumption that the variability found in a segment of ca. 650 bp of the mitochondrial gene for cytochrome oxidase subunit I (COI) is sufficient for the identification of species (Hebert et al., 2004) or even for species delimitation (Tautz et al., 2003). As the literature on "DNA barcoding" grows, it is clear that these generalizations do not apply across the whole animal kingdom (Vences et al., 2005; Shearer & Coffroth, 2008), and that a more careful statistical approach should be applied (Meyer & Paulay, 2005; Meier et al., 2006, 2008; Wiemers & Fiedler, 2007).
Despite its limitations, "DNA barcoding" can nevertheless help with species identification if applied correctly. However, it is first necessary to generate a database containing the "DNA barcodes" of the target taxa and then determine in a pilot study whether it can be used for species identification. Foottit et al. (2009) did this based on a set of 17 adelgid species. They concluded that "DNA barcoding" has potential for the detection of cryptic species, but cannot distinguish species defined by life-cycle characteristics.
Here the goal is to establish a "DNA barcoding database" of adelgid species found in the Czech Republic, and to determine its usefulness for identification compared to using morphological and ecological characters.

Samples
Adelgid samples were collected in 2005, 2007, and 2008 from several different localities in the Czech Republic (95 samples), Lithuania (1 sample), and Serbia (1 sample) (Table 1). Specimens (several individuals per clone) were either preserved in 100% ethanol or frozen and kept at -70°C until further analysis. Several individuals of Phylloxera coccinea and Viteus vitifoliae (both Phylloxeridae) were used as an outgroup. Adelgid collections are available in the Institute of Entomology, Biology Centre ASCR, České Budějovice. Microscopic identifications were made by J. Havelka.
Total genomic DNA was extracted from 3-5 specimens per clone using two different methods. A "Quick protocol" (Frati et al., 2001) was used for the frozen samples, while the ethanol-preserved samples were extracted with the ZR Genomic DNA Kit (Zymoresearch Inc., Orange, CA, USA) according to the manufacturer's protocol. PCR amplification was carried out using the TaKaRa Ex Taq system (TaKaRa Bio Inc., Otsu, Shiga, Japan) or TopBio polymerase Unis (TopBio s.r.o., Prague, Czech Republic) and the universal primers LCO1490 and HCO2198 (Folmer et al., 1994) to amplify ca. 670 bp of the 5' end of mitochondrial cytochrome oxidase I (cox1). Reaction volumes (50 µl) consisted of 5 µl of template DNA (not quantified), 5 µl of 10 × reaction buffer, 4 µl of dNTP mixture (2.5 mM each), 0.5 mM of each primer, and 1.25 units of Taq polymerase. The PCR reactions were carried out in a Mastercycler ep gradient S thermocycler (Eppendorf AG, Hamburg, Germany) with the following profile: 94°C for 1 min, followed by 35 cycles of 94°C for 30 s, 47°C for 45 s, and 72°C for 1 min, and a final extension at 72°C for 3 min. PCR products were cleaned either with the DNA Clean&Concentrator-5 Kit (Zymoresearch) or enzymatically with ExoSAP-IT (USB Corporation, Cleveland, Ohio, USA) before direct sequencing.
Sequencing was performed in both directions using the above primers in a BigDye v. 3.1 sequencing reaction on an ABI 310 automated sequencer (Applied Biosystems, Inc., Carlsbad, CA, USA) at the sequencing facility of the Laboratory of Genomics (Biology Centre ASCR, České Budějovice). Sequences were edited and aligned both manually and with the assistance of SeqMan (Lasergene 8 package of programs from DNASTAR, Inc., Madison, WI, USA). Morphological vouchers are kept within the collection of the Institute of Entomology, Biology Centre ASCR, České Budějovice.
Distance analysis was performed using the MEGA v.4 software (Tamura et al., 2007), which was also used to create a Neighbour-Joining tree based on uncorrected p-distances (bootstrap analysis with 10,000 replicates) and divergence time estimates. A χ² test as implemented in PAUP* 4.0b10 (Swofford, 2003) was used to determine the homogeneity of base frequencies among sequences. The same program and Modeltest (Posada & Crandall, 1998) were used to select the best fitting model for further phylogenetic analysis.
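As an illustration of the distance measure only, the uncorrected p-distance is the proportion of compared sites at which two aligned sequences differ. The following is a minimal sketch, not the MEGA implementation; the function names and the choice to skip sites with gaps or ambiguous bases (pairwise deletion) are assumptions made for this example.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption of this sketch: sites where either sequence carries a gap
    or an ambiguous base are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def mean_distance(group_x, group_y):
    """Mean p-distance over all pairs drawn from two groups of sequences."""
    pairs = [(x, y) for x in group_x for y in group_y]
    return sum(p_distance(x, y) for x, y in pairs) / len(pairs)


# Toy sequences (hypothetical, not from the data set): one difference in eight sites.
print(p_distance("ACGTACGT", "ACGTACGA"))
```

Averaging such pairwise values within and between species or genera gives the kind of mean divergences reported in the Results.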
To explore the sensitivity of our results to the choice of reconstruction method, we also conducted Bayesian analyses using the GTR site-specific rate model with MrBayes v3.1.2 (Ronquist & Huelsenbeck, 2003) and maximum-likelihood (ML) analysis with PHYML (Guindon & Gascuel, 2003). In both of these analyses the GTR + I + G model selected by Modeltest was used. In the Bayesian analysis, the 1st, 2nd, and 3rd codon positions were treated as separate partitions. Two different settings were explored. In the first, node support was assessed as the posterior probability from five independent runs, each with one chain of 2,500,000 generations (sampled at intervals of 100 generations with a burn-in of 6250 trees). In the second, node support was assessed as the posterior probability from two independent runs, each with four chains (temperature for hot chains lowered to 0.1) of 10,000,000 generations (sampled at intervals of 100 generations with a burn-in of 25,000 trees). For the maximum likelihood analysis, a random initial tree, the best of NNI and SPR searches for the tree topology estimate, 5 independent runs, and 1000 bootstrap replicates were used.
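The two Bayesian settings can be compared by the share of sampled trees discarded as burn-in. The short sketch below only does this arithmetic; the function name is hypothetical, and it assumes the number of sampled trees is simply the number of generations divided by the sampling interval (the tree saved for the starting state is ignored).

```python
def burnin_fraction(generations: int, sample_every: int, burnin_trees: int) -> float:
    """Fraction of sampled trees discarded as burn-in.

    Assumption: sampled trees = generations // sample_every.
    """
    sampled = generations // sample_every
    return burnin_trees / sampled


settings = {
    "five runs, one chain each": (2_500_000, 100, 6_250),
    "two runs, four chains each": (10_000_000, 100, 25_000),
}
for label, (gens, every, burnin) in settings.items():
    print(label, gens // every, burnin_fraction(gens, every, burnin))
```

Under this assumption both settings discard the first quarter of the sampled trees.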
We also utilized TaxonDNA (Meier et al., 2006) to obtain a frequency distribution for intra- and interspecific congeneric genetic variability and to evaluate the potential of "DNA barcoding" for identifying species of adelgids. In short, this program evaluates the similarity of the species sequences, scoring those whose closest match carries the same species name as successfully identified. If there are several equally good best matches from different species, the identification is considered ambiguous, which also applies to species represented by a single sequence.
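The "Best Match" logic described above can be sketched as follows. This is an illustrative reimplementation under simplifying assumptions, not TaxonDNA's code: distances are plain p-distances on gap-free aligned sequences, no distance threshold is applied, and all names and toy sequences are hypothetical.

```python
def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance for two gap-free aligned sequences."""
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def best_match(query_index: int, sequences: list, species: list) -> str:
    """Score one query as 'correct', 'incorrect' or 'ambiguous'."""
    query = sequences[query_index]
    # Distance from the query to every other sequence in the data set.
    hits = [
        (p_distance(query, seq), species[i])
        for i, seq in enumerate(sequences)
        if i != query_index
    ]
    best = min(d for d, _ in hits)
    best_species = {name for d, name in hits if d == best}
    if len(best_species) > 1:
        # Equally good best matches from different species.
        return "ambiguous"
    return "correct" if best_species == {species[query_index]} else "incorrect"


# Toy data: sp2 and sp3 share a haplotype, sp4 is a singleton.
sequences = ["AAAAAAAA", "AAAAAAAT", "CCCCAAAA", "CCCCAAAA", "CCCCAAAA", "GGGGGGGG"]
species = ["sp1", "sp1", "sp2", "sp2", "sp3", "sp4"]
results = [best_match(i, sequences, species) for i in range(len(sequences))]
print(results)
```

In this sketch a species represented by a single sequence can never be scored as correct, because no conspecific sequence is available to match, and species sharing a haplotype cannot be told apart.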

RESULTS
The 687 bp fragment of the COI gene was sequenced for 97 individuals, representing 16 species of adelgids and 2 phylloxerids. Two species, Aphrastasia pectinatae and Pineus pineoides, were represented by a single sample. No insertions, deletions, or stop codons were detected. It is therefore likely that only mitochondrial loci were sequenced and that there are no nuclear pseudogenes (NUMT) in our data set. Of the 195 variable sites, 181 were parsimony informative. An alignment of the sequences is available from the authors upon request. All sequences have been deposited in GenBank (Table 1).
Nucleotide composition averaged over all adelgids showed an A+T bias (A = 36.4%, T = 39.1%, C = 14.8%, G = 9.7%), a common feature of insect mitochondrial DNA (Simon et al., 1994). Base frequencies were homogeneous among all sequences (χ² = 37.7, df = 288, P = 1.0) and the overall transition/transversion bias was R = 1.02. According to the guidelines given in Kumar et al. (1994), this value allowed us to use the uncorrected p-distance in further analyses, such as the estimates of intraspecific and interspecific divergence and phylogenetic tree construction. The phylogenetic analysis showed that all genera formed monophyletic groups. Since the topology of the trees generated by the different methods was similar, only the NJ and Bayesian trees are presented (Fig. 1). The only difference between the NJ and the Bayesian trees is the position of the Pineus/Eopineus cluster. While in the NJ tree Pineus/Eopineus stands as a sister group to the Sacchiphantes/Adelges/Cholodkovskya cluster (Fig. 1a), in the Bayesian tree it forms a polytomy with the branches leading to the Gilletteella/Dreyfusia and Sacchiphantes/Adelges/Cholodkovskya clusters (Fig. 1b). Note that the bootstrap support for this clustering on the NJ tree is very low (21%).
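The homogeneity test can be illustrated with a standard χ² test on a contingency table of sequences by bases. The sketch below is an assumption about the general form of such a test, not PAUP*'s code, and its function names are hypothetical. It returns the statistic and the degrees of freedom, (number of sequences - 1) × (4 - 1); for 97 sequences this gives 96 × 3 = 288, which agrees with the df reported above. No P value is computed here.

```python
from collections import Counter

BASES = "ACGT"


def base_counts(seq: str) -> list:
    """Counts of A, C, G and T in one sequence (other symbols are ignored)."""
    counts = Counter(seq.upper())
    return [counts[b] for b in BASES]


def chi_square_homogeneity(sequences: list):
    """Chi-square statistic and degrees of freedom for base-frequency homogeneity."""
    table = [base_counts(s) for s in sequences]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    if 0 in col_totals or 0 in row_totals:
        raise ValueError("every base and every sequence must have a non-zero count")
    grand_total = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            # Expected count if all sequences shared the pooled base frequencies.
            expected = row_total * col_total / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(BASES) - 1)
    return stat, df


# Toy input (hypothetical): identical compositions give a statistic of zero.
print(chi_square_homogeneity(["AACGT", "AACGT", "AACGT"]))
```

A small statistic relative to the degrees of freedom, as in the reported result, indicates no detectable heterogeneity in base composition among sequences.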
Most clusters were strongly supported and formed either by single species (e.g. Cholodkovskya) or by species complexes (Adelges, Sacchiphantes, Dreyfusia, Gilletteella, and Pineus pini/orientalis). Eopineus is nested within Pineus; thus either it should be synonymized or three genera have to be recognized in this cluster. On the other hand, nodes representing the subfamilies sensu Annand (1928), Börner & Heinze (1957) or Steffan (1968) were not monophyletic.
The threshold for species identification evaluated by TaxonDNA was 0.6%; nevertheless, only 41 (44.56%) of the sequences were correctly identified according to the "Best Match" criterion, 42 sequences (45.65%) were ambiguous, and 9 (9.78%) were incorrectly identified (including 2 species represented by only a single specimen). Higher threshold values (1%, 2% and 3%) were also tested, as recommended by Meier et al. (2008) and Ratnasingham & Hebert (2007). As expected, these values did not increase the identification rate, which remained the same as at the 0.6% level in all these tests.

DISCUSSION AND CONCLUSIONS
Our results indicate that "DNA barcoding" using the COI gene can successfully identify adelgid species that are clearly delineated by classical taxonomy. Molecular data produced monophyletic groups representing single genera. However, this marker was not sufficient to distinguish morphologically identical species in species complexes, whose description is based on ecology.
Mean values of overall intraspecific (0.15%) and interspecific (8.23%) divergence are comparable to those of other insects, such as 0.46% and 4.41-6.02% reported for tropical Lepidoptera (Hajibabaei et al., 2006) or 0.17% and 5.78% for parasitoid flies (Smith et al., 2006). They are also comparable to the results obtained by Havill et al. (2007) and Foottit et al. (2009), although direct comparison is difficult since these authors use a different generic taxonomy and the species sampled only partially overlapped. In addition, figures in the Foottit et al. (2009) study are also higher because the extensive sampling of two pest species suggested the existence of two or three cryptic species.
According to the guidelines proposed by Hebert et al. (2004), successful species identification is possible if the mean interspecific divergence is at least 10 × the mean intraspecific divergence, and in an ideal case there should be an observable "barcoding gap", that is, a separation between intra- and interspecific congeneric COI distances (Meyer & Paulay, 2005). However, Meier et al. (2006) argue that instead of the mean value, only the smallest interspecific distance should be used. Although the observed difference in the mean values for our data set fulfills the first criterion, this is not so for the other two. The biggest issue is the large overlap of intra- and interspecific divergence, which is most likely caused by the inclusion of species complexes. But there are also examples of interspecific congeneric distances as low as 0.001 for the well defined species D. prelli and D. piceae, which makes the use of "DNA barcoding" doubtful. Since the evolution of the species complexes is recent, it is likely that the COI locus is not the best marker for individual species identification.
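The three criteria discussed above can be checked mechanically once the pairwise distances are available. The following is a minimal sketch with a hypothetical function name and made-up distance values (in %), chosen only to mimic a data set in which the means pass the 10 × rule while the smallest interspecific distance falls inside the intraspecific range.

```python
def barcoding_gap_report(intra: list, inter: list, ratio: float = 10.0) -> dict:
    """Evaluate the 10x rule and the presence of a barcoding gap.

    intra: pairwise intraspecific distances
    inter: pairwise interspecific congeneric distances
    """
    mean_intra = sum(intra) / len(intra)
    mean_inter = sum(inter) / len(inter)
    return {
        # Mean-based criterion (10x rule).
        "ten_fold_rule": mean_inter >= ratio * mean_intra,
        # Strict gap: smallest interspecific exceeds largest intraspecific distance.
        "gap": min(inter) > max(intra),
        # Number of interspecific distances that fall within the intraspecific range.
        "overlap_count": sum(1 for d in inter if d <= max(intra)),
    }


# Hypothetical distances in percent, not taken from the data set.
report = barcoding_gap_report(intra=[0.0, 0.1, 0.3], inter=[0.0, 0.1, 2.5, 4.1])
print(report)

# The reported overall means (0.15% and 8.23%) satisfy the mean-based rule.
print(8.23 >= 10 * 0.15)
```

The example shows why a criterion based on means alone can be satisfied even when individual species pairs are indistinguishable.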
Our analysis supports the recognition of eight genera as proposed by Börner & Heinze (1957) or Steffan (1968) rather than the two-genera system of Annand (1928). Phylogenetic trees distinguished most genera very clearly. However, our results differ from those of Havill et al. (2007). The first difference is the placement of the Pineus cluster, which is not distinctly separated from all the other adelgids. In our NJ tree and Bayesian analysis, it is positioned closer to the Adelges and Sacchiphantes groups. Second, both trees support a sister group relationship between Gilletteella and Dreyfusia, while Havill et al. (2007) placed Gilletteella closer to the Sacchiphantes/Adelges group. Taken together, our data best fit the system presented by Börner & Heinze (1957).
The most problematic group appears to be Dreyfusia. The status of its species has been disputed. Mantovani et al. (2001) conclude, based on mitochondrial DNA sequences, that at least three of the five Dreyfusia species known from conifers in Italy are doubtful. Our results are similar. Although some species form a species complex (nordmannianae/piceae), the observed pattern is striking. D. prelli is very distinct, based on morphology and life cycle characteristics. It is therefore quite surprising that the DNA marker did not reveal any differences with respect to the other species.
On the other hand, two sister species, G. cooleyi and G. coweni, can be distinguished by a single silent substitution at a 3rd codon position. In this case, adding more sympatric samples from different localities is highly desirable, as is using other, more quickly evolving markers.
In conclusion, Adelgidae are another group for which "DNA barcoding" is not the tool of first choice for species identification, although it can provide helpful suggestions for identification at the genus level. The main problem is that the intraspecific and interspecific congeneric variability do not form two separate intervals with a distinct "barcoding gap". In addition, several species share the same haplotypes, so the identification of these species is impossible. Furthermore, it is suggested that the eight-genus system (with revision of the genus Pineus), previously proposed based on morphological studies, should continue to be used. Species complexes still remain an interesting puzzle at both the ecological and genetic level, and as suggested by Foottit et al. (2009), further studies are needed to resolve their species status.
ACKNOWLEDGEMENTS. We thank A. Tatarenkov for comments on an early version of this manuscript. We also thank two anonymous reviewers and R. Meier for valuable advice, which greatly improved the manuscript. The technical assistance of V. Chalušová (Institute of Entomology, BC ASCR, České Budějovice, Czech Republic) is greatly appreciated. This work was funded by Grant No. A600960705 of the Grant Agency of the Academy of Sciences of the Czech Republic (Prague) and by the Institute of Entomology (Z50070508). L.K. acknowledges that her Ph.D. study was funded by grant 521/08/H042 from the Grant Agency of the Czech Republic, Prague.

Fig. 1. Evolutionary relationships of 97 specimens representing 16 species of adelgids and 2 phylloxerids. 1a: Neighbour-Joining tree; the bootstrap values (10,000 replicates) are shown next to the branches. 1b: Phylogram from the MrBayes analysis; numbers above each node represent the posterior probability support.

TABLE 1. List of the species of Adelgidae sampled (in alphabetical order) and the Phylloxeridae outgroup species.

TABLE 3. Estimates of evolutionary divergence (uncorrected p-distance) among genera. The number of base differences per site, obtained by averaging over all sequence pairs between genera, is shown above the diagonal. Standard error estimates are shown in the lower-left part of the matrix and were obtained by a bootstrap procedure (1000 replicates). All results are based on the pairwise analysis of 97 sequences.