Phylogeny of Chrysotoxum species (Diptera: Syrphidae) inferred from morphological and molecular characters

Relationships of nine Italian Chrysotoxum species were analysed using morphological and molecular data. The morphology-derived cladogram revealed three well-defined groups: (i) C. cautum, (ii) the arcuatum group (C. arcuatum, C. fasciolatum) and (iii) the festivum group (C. festivum – C. vernale, C. bicinctum, C. elegans, C. octomaculatum and C. parmense). Trees inferred from COI-tRNALeu-COII sequences were largely in agreement, but they identified (i) C. parmense as an isolated branch, (ii) C. festivum and C. vernale as separate entities, and (iii) C. elegans within a paraphyletic C. festivum clade. ITS2 trees were partially unresolved, but the C. parmense sequence emerged as sister to the festivum group. The monophyly of the festivum group derived from morphological data was rejected by a phylogenetic test performed on the combined molecular data set. The diagnostic value of some morphological characters commonly used to identify Chrysotoxum species is therefore questioned.

* Corresponding author; e-mail: barman@alma.unibo.it
INTRODUCTION
Hoverflies (Syrphidae) are distributed worldwide, with the greatest species diversity in the New World Tropics. More than 6000 species have been described so far, but about 14000 species could belong to the family (F.C. Thompson, pers. commun.). Many books are available for the identification of adult stages, at least for western palearctic taxa (see Speight, 2004, for a detailed list).
The genus Chrysotoxum has been recognised since the beginning of the XIX century as Antiopa (Meigen, 1800). While most authors agree in placing the genus within the Syrphini (e.g. Rotheray & Gilbert, 1999; Vockeroth, 1992), the taxonomic ranks of several entities it comprises are contested (e.g. Chrysotoxum impressum Becker, 1921, C. gracile Becker, 1921, C. latifasciatum Becker, 1921 and C. rhodopense Drenski, 1934; see Speight, 2004). More than 110 Chrysotoxum species have been described so far and, according to Peck (1988), 19 are present in Europe (Caucasus excluded). Several of these species have been described on the basis of small differences in abdomen or leg colour and in other subtle morphological traits. In the past, several revisions were carried out (Giglio Tos, 1890; Loew, 1841; Rondani, 1845; Shannon, 1926; Violovitch, 1974), but in some cases these contributions produced more confusion than clarity. For example, Giglio Tos (1890) revised the European fauna describing 11 new taxa; only one of these is now accepted as a valid species (Sommaggio, 2001).
Further problems result from the use of morphological characters that are not diagnostic, such as femora colour or the extension of yellow spots on the abdomen. For example, the first feature is commonly used to distinguish Chrysotoxum festivum (Linné, 1758), the type characterised by completely yellow femora, from Chrysotoxum vernale Loew, 1841, which shows fore and middle femora black in the basal fourth (e.g. Bradescu, 1991; Séguy, 1961; Stubbs & Falk, 1983). Yet all the possible intermediate states occur between these two extreme phenotypes, preventing a clear-cut morphological separation of single specimens. Although C. vernale is usually smaller than C. festivum, body dimension is a continuous character as well and a clear separation between states is not possible. In C. vernale "bars in abdomen are almost straight", while in C. festivum these bars are "usually with distinct elbow bend" (Stubbs & Falk, 1983, p. 59). Even if these features may help in identifying the species, they cannot be used in a phylogenetic analysis. An additional character, traditionally reported in the keys, is the extension of yellow bands on the abdomen and, in particular, the connection of these bands with the posterior band of the tergite. Unfortunately, the two yellow spots are sometimes marginally connected and the difference between the two states is again not clear. Moreover, in other species, such as Chrysotoxum arcuatum (Linné, 1758), the whole range of phenotypes between the two extreme states can be observed.
Chrysotoxum octomaculatum Curtis, 1837 and C. elegans Loew, 1841 are usually separated by the yellow spots on the anterior black margin of tergites III-V: these are present in the former species and typically absent in the latter (e.g. Stubbs & Falk, 1983). But, again, all the intermediate forms between the two extreme phenotypes occur and the diagnosis of the two taxa is not always possible.
Recently, some authors have begun to study the taxonomy of hoverflies using molecular approaches (Cheng et al., 2000; Skevington & Yeates, 2000; Ståhls & Nyblom, 2000; Milankov et al., 2005), and a phylogeny of syrphids at the intra-familial level inferred from combined molecular and morphological analyses has been proposed (Ståhls et al., 2003). Genetic data could provide an important source of additional evidence to corroborate and test the confusing and often controversial morphology-based species definitions in Chrysotoxum.
In the present study, we analyzed DNA regions from both nuclear (internal transcribed spacer 2) and mitochondrial (a region spanning from a portion of the cytochrome oxidase I to the cytochrome oxidase II, including the leucine transfer RNA) genomes in some palearctic Chrysotoxum species. These markers have been used in many systematic studies and were successful in reconstructing the phylogeny of several insect groups at different taxonomic levels (Caterino et al., 2000).
Our aims include (a) the reconstruction of phylogenetic relationships among Chrysotoxum species of the Italian fauna, and (b) a critical comparison between molecular- and morphology-derived phylogenies.

MATERIAL AND METHODS

Taxon sampling
Nine Chrysotoxum species were included in this study. Most specimens were field-collected as adults by hand net in Northern Italy, and preserved in 100% ethanol for subsequent DNA extraction. Samples of C. octomaculatum and C. vernale were available only as dried, pinned specimens. As the only record in Italy for C. parmense was given by Rondani (1845), we included in the present study two pinned specimens collected in Turkey. Detailed sampling information is given in Table 1.
Leucozona laternaria (Müller, 1776), which belongs to a genus closely related to Chrysotoxum, was selected as outgroup.
The morphological data set comprises 23 characters (Appendix). These are part of a larger matrix used for the morphological analyses of the whole Chrysotoxum genus (Sommaggio, in prep.). In the present analysis, only characters showing clear alternative states were considered. We reduced as much as possible the use of characters based on or related to colour differences (e.g. femora colour; the extension of yellow spots on abdomen).

DNA extraction
For ethanol-preserved individuals, genomic DNA extraction was performed on three legs, following the CTAB method (Winnepenninckx et al., 1993). Genomic DNA from dried specimens was extracted from either six legs or part of the thorax using the DNeasy Tissue Kit (Qiagen, Valencia, CA).
Specimen vouchers have been deposited at Dipartimento di Scienze e Tecnologie Agroambientali, Alma Mater Studiorum - Università di Bologna. Genomic DNA vouchers are conserved at Dipartimento di Biologia Evoluzionistica e Sperimentale, Alma Mater Studiorum - Università di Bologna.

PCR amplification and sequencing
PCR amplifications were performed in a 50-µl mixture using the Taq Polymerase Recombinant (Invitrogen, Carlsbad, CA) kit following the standard protocol. Thermal cycling was carried out in a Gene Amp PCR System 2400 (Applied Biosystems, Foster City, CA), using the following program: initial denaturation at 94°C for 5 min, 30 cycles of 30 s at 94°C, 30 s at 48-50°C, 30 s at 72°C, and final extension at 72°C for 7 min.
The C1-J-2797 and TK-N-3785 (Simon et al., 1994) primers were used to amplify the mtDNA region including part of the 3' end of cytochrome oxidase I (COI), the leucine tRNA (tRNALeu) and most of the cytochrome oxidase II (COII). C2-N-3389 and C2-J-3279 primers (Simon et al., 1994) were also used in sequencing, to obtain complete sequences of complementary strands.
The Internal Transcribed Spacer 2 (ITS2) region was amplified using the primers described in Manonmani et al. (2001). Direct cycle-sequencing of some of these PCR products gave equivocal nucleotide chromatograms, indicating that some Chrysotoxum specimens had more than one ITS2 variant. Therefore, PCR products from single individuals were cloned. Amplicons were ligated in pGEM-T Easy Vector (Promega, Madison, WI) and used to transform E. coli DH5α competent cells. Recombinant colonies were identified using the β-galactosidase gene blue-white colour system (Sambrook et al., 1989). Positive clones were amplified and sequenced with M13 primers.
All sequencing reactions were performed using the Big Dye Terminator Kit (Applied Biosystems, Foster City, CA) according to the manufacturer's protocol; sequences were generated on an ABI 310 capillary sequencer (Applied Biosystems, Foster City, CA).
Two to five ITS2 clones per individual were checked through the mfold server program (Zuker, 2003; http://www.bioinfo.rpi.edu/applications/mfold/old/rna/form1-2.3.cgi) for possible pseudogenic variants. The program finds the optimal and suboptimal secondary structures by calculating the free energy for folded RNA molecules. ITS2 rRNA, in fact, presents a well-defined secondary structure required for transcript processing into functional RNA units. We used the secondary structure of various dipteran sequences (Young & Coleman, 2004) as a term of comparison. ITS2 sequences with an anomalous folding pattern were indicated as pseudogenic variants.

Phylogenetic analyses
The morphological data set was analysed with the Maximum Parsimony (MP) method (PAUP*, version 4.0b; Swofford, 2001) using the heuristic option with 100 random addition searches. Character states were unordered and weighted equally. Bootstrap values were calculated after 1000 replicates.
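Bootstrap support rests on resampling characters (matrix columns) with replacement and repeating the tree search on each pseudoreplicate. The following is a minimal sketch of the resampling step only; the toy matrix, taxon labels and function name are hypothetical, this is not PAUP* code, and the parsimony search itself is not shown.

```python
import random

def bootstrap_replicate(matrix, rng):
    """Return one pseudoreplicate: columns drawn with replacement,
    same number of characters as the original matrix."""
    n_chars = len(next(iter(matrix.values())))
    cols = [rng.randrange(n_chars) for _ in range(n_chars)]
    return {taxon: [states[c] for c in cols] for taxon, states in matrix.items()}

# Hypothetical toy matrix (3 taxa, 4 binary characters), for illustration only.
toy = {
    "taxon_A": [0, 1, 1, 0],
    "taxon_B": [0, 1, 0, 0],
    "taxon_C": [1, 0, 0, 1],
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_replicate(toy, rng) for _ in range(1000)]
```

Each pseudoreplicate would then be analysed with the same search settings, and the bootstrap value of a node is the percentage of replicate trees in which that node appears.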
Sequences were aligned with the CLUSTAL algorithm included in the Sequence Navigator program (version 1.0.1, Applied Biosystems, Foster City, CA). Alignments were also edited by eye.
Uncorrected p-distances and nucleotide/amino acid compositions were obtained using the Mega 3.0 package (Kumar et al., 2004).
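For reference, the uncorrected p-distance is simply the proportion of compared sites at which two aligned sequences differ. A minimal sketch follows; it is not the Mega implementation, and skipping sites with a gap or missing data in either sequence (pairwise deletion) is an assumption made here for illustration, as are the function name and the example sequences.

```python
def p_distance(seq_a, seq_b, skip="-?N"):
    """Proportion of differing sites between two aligned sequences.
    Sites where either sequence has a gap/missing symbol are skipped
    (pairwise deletion; an assumption of this sketch)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared

# Hypothetical sequences: one difference over ten comparable sites.
d = p_distance("ACGTACGTAC", "ACGTACGTAT")
```

A value of 0.1 corresponds to 10% uncorrected divergence, the scale on which the inter- and intra-specific values in this paper are expressed.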
Molecular phylogenetic analyses were performed with the MP (as described for the morphological data set) and Maximum Likelihood (ML) methods using the PAUP* program. For ML analyses, Modeltest (version 3.06; Posada & Crandall, 1998) was run to determine the best substitution models (F81 + for COI/tRNALeu/COII; HKY + for ITS2 and combined data set); bootstrap support was calculated after 100 replicates.

RESULTS

COI/tRNALeu/COII sequence variation
The sequenced mtDNA fragment covered 925-946 bp: this region included 195 bp pertaining to the 3' end of COI, 66 bp of the leucine tRNA, and 666 bp of the COII gene. Furthermore, the two C. cautum individuals revealed an (AT)9-11 microsatellite locus located between the tRNA and the COII genes. The average AT proportion was 76.2%.
The complete sequence alignment comprised 946 residues, with 137 variable sites and 90 parsimony-informative sites.
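The distinction between variable and parsimony-informative sites can be made explicit with a short sketch. A site is counted as variable when at least two states occur, and as parsimony-informative when at least two states each occur in at least two sequences. The function name, the toy alignment and the choice to ignore gap and missing-data symbols are assumptions for illustration, not the procedure of any particular package.

```python
from collections import Counter

def classify_sites(alignment, skip="-?N"):
    """Return (variable, parsimony_informative) site counts for a list
    of equal-length aligned sequences."""
    variable = 0
    informative = 0
    for column in zip(*alignment):
        counts = Counter(c for c in column if c not in skip)
        if len(counts) < 2:
            continue  # constant (or empty) column
        variable += 1
        # informative: at least two states, each shared by >= 2 sequences
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return variable, informative

# Hypothetical four-sequence alignment: column 1 is constant, columns 2-3
# are informative, column 4 is variable but carries a singleton only.
toy_alignment = ["AAGT", "AAGT", "ACTT", "ACTA"]
n_variable, n_informative = classify_sites(toy_alignment)
```

Singleton changes, as in the last column of the toy alignment, add to the variable-site count but cannot discriminate among tree topologies under parsimony.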
Of the 16 variable sites in the COI fragment, three were at the first codon position, and 13 at the third position. The inferred protein sequence (65 amino acids long) showed two substitutions. In the COII region, 83 variable sites were observed, 16 of which occurred at the first codon position, six at the second, and 61 at the third codon position. The inferred COII protein sequence (222 amino acids) showed 15 replacements. In the tRNALeu gene, seven variable sites were scored.
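A tally of variable sites by codon position, as reported above, can be sketched as follows. The sketch assumes a gap-free coding alignment whose reading frame is known; the function name, the frame_offset parameter and the toy sequences are hypothetical.

```python
def variable_sites_by_codon_position(alignment, frame_offset=0):
    """Count variable columns at codon positions 1, 2 and 3.
    frame_offset is the 0-based index of the first complete codon's
    first base (an assumption supplied by the user)."""
    counts = {1: 0, 2: 0, 3: 0}
    for i, column in enumerate(zip(*alignment)):
        if i < frame_offset:
            continue  # bases before the first complete codon
        if len(set(column)) > 1:
            counts[(i - frame_offset) % 3 + 1] += 1
    return counts

# Hypothetical two-codon alignment with changes at third positions only.
toy_coding = ["ATGAAA", "ATGAAG", "ATAAAG"]
by_position = variable_sites_by_codon_position(toy_coding)

# Consistency check on the counts reported in the text.
coi_total = 3 + 13        # first + third position sites in COI
coii_total = 16 + 6 + 61  # first + second + third position sites in COII
```

The per-position counts in the text add up to the reported totals (16 for COI, 83 for COII), and the excess of third-position changes is what is expected for protein-coding genes under purifying selection.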

ITS2 sequence variation
The cloning and sequencing of ITS2 amplicons revealed the presence of pseudogenic variants in all taxa, with the exception of C. cautum and C. elegans. These pseudogenic sequences were easily recognised by their aberrant secondary structures and were not considered further.
The length of the sequenced ITS2 region varied between 482 bp, in C. cautum, and 564 bp, in C. parmense. The sequence of the unique C. arcuatum male here analysed was quite different from the others obtained from conspecific females, since a large deletion occurred between bp 408 and 429. The mean A-T content is 73.7%.
The considerable length divergence of the ITS2 sequences observed among Chrysotoxum species produces ambiguous alignment of some nucleotide tracts (Fig. 1); therefore, a region spanning from position 330 to position 440 of the first alignment was excluded from further analyses. The final alignment had 503 positions, with 94 variable sites and 35 parsimony-informative nucleotides.
Fig. 1. ITS2 sequence alignment with flanking 5.8S and 28S regions of Chrysotoxum taxa. Since the outgroup species Leucozona laternaria is not included in this alignment, the region of ambiguous alignment trimmed from phylogenetic analyses spans here from residues 317 to 424. Identities are denoted by dots (.), gaps by dashes (-), and missing data by question marks (?). Acronyms are as in Table 1.

Phylogenetic analysis
An MP heuristic search on the morphological data matrix produced one most parsimonious tree (TL = 29; CI = 0.862). In the cladogram (Fig. 2A) [...] same branching topology (Fig. 2B). Chrysotoxum cautum is identified as the most basal clade. A clear differentiation of the C. parmense haplotypes emerges: these appear very distantly related to the festivum group. Chrysotoxum arcuatum and C. fasciolatum cluster together and are recognised as sister taxa of the festivum group, which includes C. bicinctum, C. festivum, C. elegans, C. octomaculatum, and C. vernale. All haplotypes belonging to the same morphospecies form separate monophyletic groups, the only exception being C. elegans, which appears closely related to C. festivum. Bootstrap consensus trees reveal overall high nodal support; the only exception is the weakly supported node connecting C. octomaculatum with the two haplotypes of C. vernale.
MP (136 shortest trees; TL = 269; CI = 0.836) and ML (-ln L = 1252.70450) trees inferred from ITS2 sequences show some polytomic branching patterns (Fig. 2C). The main topological feature is the identification of C. parmense as sister taxon of the festivum species group. On the other hand, the close relationship between C. festivum and C. elegans, evidenced by the mtDNA trees, is here lacking, possibly owing to the high divergence of C. festivum haplotypes. Chrysotoxum octomaculatum and C. vernale sequences, although well differentiated, form a highly supported cluster. Outside this cluster, MP and ML analyses are incongruent. In the MP cladogram, C. cautum and C. arcuatum form a single cluster, albeit with weak support (59%), and the two C. fasciolatum samples cluster together (99% bootstrap support). The ML analysis does not support these nodes.
MP and ML trees obtained from the analyses of combined molecular data sets (not shown) are largely coincident with those obtained from mitochondrial genes, the only differences being the grouping of C. octomaculatum and C. vernale, which becomes highly supported (91%-95%), and the bootstrap value of the C. festivum-C. elegans cluster, which is lowered to 78%-59%.
In both MP and ML analyses of combined data sets we also tested the hypothesis of monophyly of C. parmense sequences and of the festivum group haplotypes, as suggested by morphological data evaluation. The trees obtained by constraining C. parmense within the festivum group are significantly different from the optimal one (P < 0.05, Shimodaira-Hasegawa test): thus, the hypothesis of monophyly can be rejected.
Since the number of molecular characters largely exceeds the morphological ones and overwhelms them in the phylogenetic analyses, the MP cladogram obtained combining molecular and morphological data (not shown) is largely coincident with the one generated on mitochondrial sequences alone.
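The tree lengths (TL) and consistency indices (CI) reported in the text and in the Fig. 2 caption are linked by the ensemble consistency index, CI = M / S, where S is the observed tree length and M is the minimum number of steps the data could require on any tree. The sketch below illustrates this relation; the M values (25 for morphology, 150 for mtDNA, 225 for ITS2) are not stated in the paper and are back-calculated here as the integers that reproduce the reported CI values, so they should be read as assumptions.

```python
def consistency_index(min_steps, tree_length):
    """Ensemble consistency index: minimum possible steps / observed steps."""
    if min_steps <= 0 or tree_length <= 0 or min_steps > tree_length:
        raise ValueError("expected 0 < min_steps <= tree_length")
    return min_steps / tree_length

# (label, assumed minimum steps M, reported TL, reported CI)
reported = [
    ("morphology, most parsimonious tree", 25, 29, 0.862),
    ("morphology, bootstrap consensus", 25, 31, 0.806),
    ("mtDNA, bootstrap consensus", 150, 185, 0.811),
    ("ITS2, shortest trees", 225, 269, 0.836),
    ("ITS2, bootstrap consensus", 225, 280, 0.804),
]

computed = {label: round(consistency_index(m, s), 3) for label, m, s, _ in reported}
```

Under these assumed M values, the lower CI of each bootstrap consensus tree relative to the corresponding shortest tree follows directly from its greater length.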

DISCUSSION
In this study, we have analysed nine out of the 15 Chrysotoxum species known from Italy (Daccordi & Sommaggio, 2002). With the exception of some species like C. cautum and C. bicinctum, the sampling of adult specimens involves considerable practical difficulties in Italy, owing both to the short flight period and to the relative rarity of some species. In particular, the species that are not included in our study are considered rare in Italy (i.e. Chrysotoxum lessonae Giglio Tos, 1890; C. verralli Collin, 1940; C. cisalpinum Rondani, 1845; C. intermedium Meigen, 1822) or characterised by an uncertain status (C. impressum Becker, 1921).
Ståhls & Nyblom (2000) report values of uncorrected pairwise sequence divergences ranging from 5 to 10% between Cheilosia species for a 1341 nucleotide fragment of the COI. The variation detected by Ståhls and coworkers for a COI sequence of 1128 nucleotides in intraspecific comparisons within the family was 0.1-0.3% (Ståhls et al., 2003). These data (albeit from different fragments) agree with the level of variation here detected, which ranges between 1.3 and 6.9% in inter-specific comparisons (here excluding C. elegans vs C. festivum), and 0.1-0.8% in intra-specific comparisons.
A comparison of the phylogeny based on adult morphological traits with the hypothesis inferred from mitochondrial and nuclear sequences supports several interesting conclusions. First of all, as evident from both morphological and mitochondrial trees, C. cautum should be considered the most basal taxon, sister group of the remaining eight species. The phylogenetic hypothesis inferred with the ML method from ITS2 sequences, where a polytomy emerges, is not incongruent with this interpretation, as the higher differentiation of C. cautum is reflected in branch length. On the other hand, the nuclear MP tree suggests a closer affinity between C. cautum and C. arcuatum. However, the bootstrap support is weak and the mutational steps between C. cautum and C. arcuatum (29 to 31) are higher than between C. fasciolatum and C. arcuatum (20 to 23).
Both the tree topologies and the monophyly test point out that the morphological assignment of C. parmense to the festivum group is not supported by molecular data. Although in the morphology-derived tree C. parmense is unequivocally placed within the festivum group, the striking variation in number of character states (from eight to 11 between C. parmense and the remaining species; from one to three in other interspecific comparisons) probably mirrors the relative isolation of this taxon.
Except for the position of C. parmense, in all phylogenetic hypotheses the festivum species group is always monophyletic. In the morphological analysis, a closer relationship between C. festivum, C. bicinctum, and C. vernale is suggested. On the contrary, in both mtDNA and ITS2 trees, C. octomaculatum clusters with the haplotypes of C. vernale, with high nodal support in the latter and in the combined data sets analysis. This clustering pattern points to a considerable genetic divergence between C. festivum and C. vernale, which is poorly characterized by morphological traits. The hypothesis should be considered that the diagnostic characters claimed for the separation of these species, even if not clearly defined (see Introduction), could have some relevant significance, at least in their extreme states or in combination with others. The study of larval morphology, which proved reliable in syrphid systematics (Rotheray & Gilbert, 1999; Ståhls et al., 2003), could be useful to confirm this separation.
The taxa of the arcuatum group form a well-supported monophyletic clade in both morphological and mitochondrial gene trees. It is not possible to draw any conclusion about the phylogeny of this group from ITS2 trees, since the topologies obtained following MP and ML analyses are divergent.
The taxonomic status of C. elegans and C. festivum as two separate species could be questioned. P-distance values are the lowest scored for inter-specific comparison and overlap with divergence values between haplotypes of the same species. The two taxa cluster together with high nodal support in both MP and ML analyses of the mtDNA region; this node is not supported in ITS2 trees, but the close relationship between C. elegans and C. festivum is reflected by the short lengths of the branches connecting these entities. This significant genetic similarity suggests the possibility that the differences utilised to diagnose the two taxa are indeed due to intra-specific variability. However, given that the two taxa are sympatric and that the number of specimens analyzed here is low, neither hybridisation nor incomplete lineage sorting, although never described in syrphids, can be ruled out. To resolve the relationships between these taxa, a wider sampling of individuals belonging to sympatric and allopatric populations must be analysed.
On the whole, the COI/COII mitochondrial fragment seems to be valuable for phylogenetic reconstruction within the genus Chrysotoxum. The ITS2 region appears more problematic because of the ambiguity in the alignment, and because only a subset of the PCRs and sequencing reactions performed on dried specimens were successful. The problems in sampling make it difficult to include more taxa in further systematic studies based on this nuclear marker, since fresh specimens are needed. On the other hand, the sequencing of mtDNA fragments, which is simpler to achieve from poorly preserved specimens (Dean & Ballard, 2001; present study), could be more suitable for additional phylogenetic studies. From a taxonomic point of view, the species-specific sequences here generated could be valuable to start a DNA bar-coding system for Chrysotoxum taxa. The COI region, in particular, has been proposed as metazoan barcode target (Hebert et al., 2003; Blaxter, 2004; http://www.barcodinglife.com).
Fig. 2. A - MP bootstrap consensus tree obtained on the morphological data set (TL = 31, CI = 0.806); B - MP bootstrap consensus tree and ML tree derived from COI/tRNALeu/COII data set (TL = 185, CI = 0.811 / -ln L = 2354.64627); C - MP bootstrap consensus tree and ML tree obtained on ITS2 data set (TL = 280, CI = 0.804 / -ln L = 1252.70450). Values above branches indicate mutational steps (MP) or genetic distances (ML). Values below branches are the bootstrap percentages. Asterisks mark nodes that are not supported in MP or ML analyses. The dotted line in C indicates a node supported by ML but not by MP. Chrysotoxum species are given with haplotype in parentheses.

TABLE 1. Species, sex, acronym, locality and date of collection, and haplotype/GenBank accession number of each analysed individual. a = pinned specimen; b = specimen caught by Malaise trap; nd = not determined.