Phylogenetic evaluation of the taxonomic status of Timandra griseata and T. comae (Lepidoptera: Geometridae: Sterrhinae)

The sterrhine loopers Timandra griseata and T. comae have been treated as distinct species since 1994. However, morphological differences between the taxa are minor and therefore their status has often been disputed. Here, we present a molecular phylogenetic study, which separates T. griseata and T. comae into different clades. Altogether, 43 Timandra specimens from eight European countries were studied. The phylogeny is based on a comparative sequence analysis of mitochondrial genes coding for the cytochrome C oxidase subunit I (COI) and NADH dehydrogenase subunit 1 (ND1). Nevertheless, one individual of each species was assigned to the “wrong” clade. The symplesiomorphy of T. griseata and T. comae is considered to be a result of introgressive hybridization. Conditions that could lead to the hybridization of T. griseata and T. comae are discussed, as well as the likely distribution history of these taxa in Northern Europe. Results of the current analysis are in favour of retaining the species status of T. griseata and T. comae.


INTRODUCTION
The sterrhine genus Timandra Duponchel, 1829 is distributed mainly in the Eastern Palaearctic and Oriental region (Kaila & Albrecht, 1994). Only three species occur in the Western Palaearctic (Müller, 1996) and one in the Nearctic (Prout, 1935-38). The total number of species in this genus is 21 (Scoble, 1999). Most species in this genus have a similar wing pattern, but differences in genitalia enable the species to be distinguished quite easily (Inoue et al., 1982). Kaila & Albrecht (1994) defined the Timandra griseata group as including species that have a short and blunt sacculus and a sclerotized costa of the valva branching at three quarters from the base, forming two branches of about equal length. Three species were considered to belong to this group: the sibling species T. griseata (Petersen, 1902) and T. comae (Schmidt, 1931) in the Western Palaearctic and T. recompta (Prout, 1930) in the Eastern Palaearctic. Of these species, T. recompta clearly differs from the others, especially in the characters of the female genitalia.
T. griseata and T. comae were treated as separate species for the first time by Kaila & Albrecht (1994), who also used the name comai. The name comae of Schmidt (1931), originally given in honour of Mr. Pedro Coma, was found to be an incorrect original spelling (Kaila & Albrecht, 1994). Their emendation from comae to comai was based on the third edition of the Code ICZN (1985), article 31(a)(ii), which states that if a noun in the genitive case is formed directly from a modern personal name of a man, -i has to be added to the stem of that name; see details also in Kullberg et al. (2002). The name comai was subsequently used in several publications (e.g. Kaila & Albrecht, 1995; Palmqvist, 1997; Kaila et al., 1999; Sihvonen, 2001; Kullberg et al., 2002). On the other hand, an alternative interpretation is possible, as article 31(a)(i) in Code ICZN (1985) and article 31.1.1 in Code ICZN (1999) state that a noun in the genitive case formed from a personal name that is Latin or has been latinized is to be formed in accordance with the rules of Latin grammar. Following this provision results in comae, as used in three major recent moth catalogues by Müller (1996), Scoble (1999) and Hausmann (2004). Based on this interpretation and the principle of priority, we consider comae to be the valid name and comai to be an unnecessary emendation.
Morphological differences between T. griseata and T. comae are often regarded as insufficient to treat them as separate species. Therefore, their specific status has been questioned by lepidopterists and criticized in the literature (Hausmann, 1997, 2004). The external morphology of T. griseata and T. comae differs as follows: the ground colour of the forewing is whitish in T. griseata and yellowish in T. comae; dusting on the wings is dense and grey in T. griseata but sparse and brownish-grey in T. comae; and the average wingspan is bigger and sexual size dimorphism more accentuated in T. griseata (Kaila & Albrecht, 1994). There are small differences in the position of the junction between the anterior and posterior parts of the corpus bursae and in the angle of the appendix bursae of the females. The external genitalia of male T. griseata and T. comae, however, are indistinguishable (Kaila & Albrecht, 1994). On the other hand, a detailed study of everted vesicae by Sihvonen (2001) supports the original hypothesis of Kaila & Albrecht (1994).
In addition to the differences in morphology, the distributions of T. griseata and T. comae are also different (Kaila & Albrecht, 1994). These species are allopatric over most of their geographic range, with a narrow contact zone in Fennoscandia and the northern part of the Baltic countries. T. comae is widely distributed in the Western Palaearctic: its range covers all of western, southern and eastern Europe, reaching southern Finland, central Sweden and southern Norway in the north and Turkmenistan in the east (Müller, 1996; Viidalepp, 1996; Aarvik et al., 2000; Huldén et al., 2000; Hausmann, 2004). T. griseata is restricted to northern Europe. This species is widely distributed in Finland and Sweden (Huldén et al., 2000; Hausmann, 2004) and has a wider range in southern Norway than T. comae (Aarvik et al., 2000). T. griseata has also been recorded in northwestern Russia (Kaila & Albrecht, 1994), but is rare in Estonia and Latvia (Kaila et al., 1999; Savenkov & Šulcs, 2004).
The only report of T. griseata from Denmark (Larsen, 1995), as well as notes that this species occurs in Belorussia, Ukraine, Crimea, the Ural Mountains, Caucasus and Transcaucasus (Viidalepp, 1996), have been interpreted as erroneous (Hausmann, 2004). Moreover, T. griseata and T. comae differ in their phenology. In Finland, where T. griseata and T. comae are sympatric, the first imagoes of the bivoltine T. comae appear about two weeks earlier than the earliest specimens of the usually univoltine T. griseata. The first generation of T. comae is less abundant than the second; therefore, this species is most numerous in August, while the number of T. griseata peaks in late June (Kaila & Albrecht, 1994).
Recent short reports on the differences in mitochondrial NADH dehydrogenase subunit 1 (ND1) (Miller et al., 2001) and cytochrome C oxidase subunit I (COI) (Trusch et al., 2002) sequences of T. griseata and T. comae also indicate that these taxa might represent separate species. However, since Miller et al. (2001) used only one specimen of T. griseata and two of T. comae and Trusch et al. (2002) have not published the details of their study, an additional and more extensive molecular survey is needed to resolve the taxonomic status of T. griseata and T. comae; the need for further investigation is emphasized also by Hausmann (2004).
In this study, we analysed specimens of T. griseata and T. comae from Estonia and Finland, where these taxa are sympatric (Kaila & Albrecht, 1994; Müller, 1996; Kaila et al., 1999), and a few additional individuals from other European countries (Latvia, Sweden, Germany, Belorussia, Ukraine and Bulgaria). Partial sequences from mitochondrial COI and ND1 genes were used for phylogenetic inference, since mitochondrial genes are reported to be more useful than nuclear genes in most cases for resolving the boundaries of recently diverged species (Caterino et al., 2000; Wiens & Penkrot, 2002).

MATERIAL AND METHODS

Moths
Forty-three specimens of the European T. griseata group from 27 localities were studied (Table 1). Most of the material (36 individuals) was collected from 20 localities in Estonia and Finland (Fig. 1). Both older pinned moths from the collection of the Institute of Agricultural and Environmental Sciences (IAE; individuals collected 1990-2000, Table 1) and fresh specimens were studied. All specimens were identified following the criteria given by Kaila & Albrecht (1994) and Kaila et al. (1999). L. Kaila kindly re-examined the critical specimens. Abdomens of the moths were removed, cut into two parts and stored at -20°C. Two to three anterior segments of the abdomen were used for DNA extraction; posterior parts with genitalia were retained at -20°C as vouchers and as a backup of the genetic material. Thoraces with head, wings and legs were pinned and kept in the IAE collection as vouchers, except for T. griseata 241, T. griseata 242 and T. comae 243, which are in the collection of the Finnish Museum of Natural History.

DNA purification, amplification and sequencing of gene fragments
Total genomic DNA was purified using a High Pure PCR Template Preparation Kit (Roche Diagnostics GmbH, Mannheim, Germany) according to the manufacturer's instructions for isolating nucleic acids from mammalian tissue, with the exception that the first incubation step was 55°C for 12 h rather than 1 h.
PCR was performed on a T1 Thermocycler (Biometra, Goettingen, Germany). For COI amplification, the cycling parameters were a 2-min denaturing step at 94°C, followed by 35 cycles of 30 s at 94°C, 30 s at 49°C and 60 s at 68°C, with a subsequent 7-min final extension at 68°C. ND1 amplification was carried out under similar conditions, except for the annealing step, which was 30 s at 54°C. The presence of PCR products was checked on a 1.6% agarose gel. The PCR solution was treated with shrimp alkaline phosphatase and exonuclease I (USB, Cleveland, USA): one unit of each enzyme was added to the PCR solution, which was incubated for 30 min at 37°C, followed by 15 min of inactivation at 80°C. DNA cycle sequencing was performed using a DYEnamic ET Terminator Cycle Sequencing Kit (Amersham Biosciences, Uppsala, Sweden). Thirty-three cycles (15 s at 95°C, 15 s at 53°C and 60 s at 60°C for primer RON; 15 s at 95°C, 15 s at 47°C and 60 s at 60°C for primer NAN; 15 s at 95°C, 15 s at 54°C and 60 s at 60°C for primers NF and NR) were performed in a total volume of 10 µl. Both DNA strands were sequenced with 5 pmol of primers RON and NAN for COI, and NF and NR for ND1. Sequences were resolved using an ABI PRISM 377 automated sequencer (Applied Biosystems, Foster City, USA).

Table 1. Specimens, collecting localities and dates, names of collectors and identifiers, and the NCBI accession codes for the mtDNA sequences. T. gri = Timandra griseata; T. com = Timandra comae. Est = Estonia, Swe = Sweden, Fin = Finland, Lat = Latvia, Bul = Bulgaria, Ukr = Ukraine, Ger = Germany, Blr = Belorussia.
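The cycling conditions described above can be written down as a small data structure, which makes the single difference between the COI and ND1 programs (the annealing temperature) explicit. The Python sketch below is illustrative only: the names `Step`, `pcr_program` and `total_hold_seconds` are hypothetical and do not belong to any thermocycler software, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # block temperature in degrees Celsius
    seconds: int  # hold time in seconds


def pcr_program(annealing_temp_c: int, cycles: int = 35):
    """Return the hold steps of one amplification run.

    Values follow the text: 2 min at 94, then `cycles` rounds of
    30 s at 94 / 30 s annealing / 60 s at 68, then 7 min at 68.
    """
    cycle = [Step(94, 30), Step(annealing_temp_c, 30), Step(68, 60)]
    return [Step(94, 120)] + cycle * cycles + [Step(68, 420)]


def total_hold_seconds(program):
    # Sum of hold times only; ramping between temperatures is not modelled.
    return sum(step.seconds for step in program)


COI_PROGRAM = pcr_program(annealing_temp_c=49)  # COI: annealing at 49 degrees
ND1_PROGRAM = pcr_program(annealing_temp_c=54)  # ND1: annealing at 54 degrees

print(total_hold_seconds(COI_PROGRAM) / 60, "min of hold time per run")
```

Under these assumptions both programs have the same total hold time, since they differ only in the annealing temperature.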

Phylogenetic analysis
Consensus sequences were created with Consed (Gordon et al., 1998), using sequence data from both DNA strands. Sequences were double-checked by eye and aligned with Clustal W (Thompson et al., 1994). BioEdit was used as a sequence editor (Hall, 1999).
Homogeneity of the COI and ND1 sequences was calculated using the partition homogeneity test in PAUP 4.0b10 (Swofford, 1998). The test was performed with the random addition heuristic search option, using 1000 replicates. A reduced median joining network was calculated with Netw 4106 (Bandelt et al., 1999). A neighbour-joining tree was constructed from the sequence data following the Kimura 2-parameter model using MEGA 2.1 (Kumar et al., 2001); branch supports were assessed by 1000 bootstrap pseudoreplicates. Modeltest 3.06 (Posada & Crandall, 1998) was used to search for the model of DNA substitution that best fitted the data. MrBayes 3.04b (Ronquist & Huelsenbeck, 2003) was used for the Bayesian estimation of phylogeny. Since the TrN+I+G model selected by Modeltest is not implemented in MrBayes, the GTR+I+G model was used for Bayesian inference. An initial run was performed to determine the burn-in value, which was used in further analyses to exclude all trees prior to the stable log-likelihood estimate. Searches were conducted with four simultaneous Markov chains over 2 million generations, sampling every 100 generations. To estimate posterior probabilities of recovered branches, the 50% majority rule was applied. To ensure that Bayesian inference was not trapped in local optima, the analysis was performed three times, starting from different random trees. Average log-likelihood (lnL) values with standard deviation (SD) at stationarity were calculated using MS Excel and compared for convergence. Phylograms were created as average-branch-length consensus trees in MrBayes and visualised with TreeView 1.6.6 (Page, 1996).
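For readers unfamiliar with the Kimura 2-parameter model mentioned above, the distance between two aligned sequences is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The following Python sketch implements that formula directly; it is not the MEGA implementation, the function name `k2p_distance` is hypothetical, and sites with gaps or ambiguous bases are simply skipped, which is an assumption of this sketch.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (assumption of this sketch)
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For closely related sequences such as those studied here, the corrected distance is only slightly larger than the raw proportion of differing sites; the correction grows as sequences diverge.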

RESULTS
Altogether, 43 specimens of the Timandra griseata group were successfully analysed. Based on morphological characters, 16 of them were identified as T. griseata and 27 as T. comae (Table 1).
Alignment of COI and ND1 sequences did not reveal any insertions or deletions. Sequences were also aligned to the complete sequence of the mitochondrial genome of Bombyx mori (Linnaeus, 1758) (AY048187). The sequenced COI gene fragment (392 bp) corresponds to positions 1808-2199, and the ND1 gene fragment (398 bp) to positions 12064-12461, of the B. mori mitochondrial genome.
Combined analysis of the COI and ND1 sequences yielded 25 variable positions out of 790 (3.16%) (Table 2). Four substitutions were at the first, two at the second and twenty at the third codon position. The transition-transversion ratio was 22 : 4. Nineteen substitutions were synonymous and six non-synonymous. Intraspecific variation occurred at six positions (0.76%) in the T. comae and 12 positions (1.52%) in the T. griseata sequences. Five transitions and two transversions (0.89%) distinguished T. griseata from T. comae. In the COI locus, homoplasy was found at position 591: both A and G were recorded in T. griseata individuals, while all T. comae specimens carried T at this site.

Table 2. Nucleotide variation in mtDNA COI and ND1 sequences of T. griseata and T. comae. The number of specimens for each haplotype and the number of localities for each haplotype are given in the last two columns. Dots indicate nucleotide identity to the T. comae 24 sequence. Numbering corresponds to the homologous sequence of Bombyx mori (AY048187). Substitution types are indicated as tv for transversion, ts for transition, and * for both transition and transversion.
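The percentages quoted above all use the combined alignment length (392 + 398 = 790 bp) as the denominator. A short Python check, offered only as an arithmetic aid, reproduces them from the counts given in the text; the helper name `percent_of_alignment` is hypothetical.

```python
COI_BP = 392
ND1_BP = 398
ALIGNMENT_BP = COI_BP + ND1_BP  # 790 bp in the combined data set


def percent_of_alignment(n_positions: int, length: int = ALIGNMENT_BP) -> float:
    """Share of alignment positions, in percent, rounded to two decimals."""
    return round(100 * n_positions / length, 2)


# (count, percentage stated in the text)
reported = {
    "variable positions": (25, 3.16),
    "intraspecific, T. comae": (6, 0.76),
    "intraspecific, T. griseata": (12, 1.52),
    "fixed between species": (5 + 2, 0.89),
}

for label, (count, stated) in reported.items():
    computed = percent_of_alignment(count)
    print(f"{label}: {count}/{ALIGNMENT_BP} = {computed}% (text: {stated}%)")

# Substitution tallies from the text, by codon position and by type.
by_codon_position = 4 + 2 + 20
by_type = 22 + 4
print(by_codon_position, by_type)
```

Both substitution tallies sum to 26, one more than the 25 variable positions. This would be expected if the site marked * in Table 2 carries both a transition and a transversion, although that reading is our assumption rather than something the text states.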
On a minimum spanning network (Fig. 2), T. griseata is clearly separated from T. comae; the shortest distance between the closest haplotypes of these species is 10 mutations. T. comae forms a single haplogroup with seven haplotypes. Haplotype C24 is the central and by far the most numerous one in this haplogroup (21 specimens; all other T. comae haplotypes are represented by a single specimen). T. griseata can be divided into two haplogroups. The first consists of haplotypes G183, G254 and G182. The second, consisting of haplotypes G157, G190, G82 and G241, is six mutations away from haplotype G183. No haplotype of T. griseata was significantly more abundant than the others (Fig. 2).
The TrN+I+G model was selected by Modeltest according to the Akaike Information Criterion. As it is not possible to implement the TrN+I+G model in MrBayes, the GTR+I+G model was used. Three independent Bayesian analyses converged on statistically equivalent log-likelihood scores and reached an asymptotic level after no more than 20,000 iterations (not shown). Majority rule consensus trees of the three rounds of Bayesian analysis were identical and posterior probability values highly correlated, indicating that the analyses converged on a single optimum. Neighbour-joining and Bayesian inference gave similar tree topologies, albeit with different statistical support (Fig. 3). A major feature of the phylogenetic analysis, strongly supported by both methods, is that it divides T. comae and T. griseata into separate clades. T. griseata has a more pronounced internal structure, consisting of two subclades. Three of the four individuals in subclade I were collected from northern Estonia (Palmse, Narva-Jõesuu and Endla Nature Reserve), and one from the Tatra Valley near Tartu in eastern Estonia (Fig. 1). Specimens in subclade II originate from a variety of localities in eastern, southwestern and northwestern Estonia and southern Finland (Table 1).
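The convergence check described above (mean and SD of lnL after burn-in, compared across independent runs) was done in a spreadsheet. A rough Python equivalent is sketched below. The lnL traces are invented toy values, not data from this study, and the agreement rule in `runs_agree` (run means differ by less than the larger SD) is a simple assumption of this sketch, not the authors' stated criterion.

```python
from statistics import mean, stdev


def summarize_run(lnl_samples, burn_in):
    """Mean and sample SD of lnL after discarding the first `burn_in` samples."""
    kept = lnl_samples[burn_in:]
    return mean(kept), stdev(kept)


def runs_agree(summaries):
    """True if every pair of run means differs by less than the larger SD."""
    for i, (mean_1, sd_1) in enumerate(summaries):
        for mean_2, sd_2 in summaries[i + 1:]:
            if abs(mean_1 - mean_2) >= max(sd_1, sd_2):
                return False
    return True


# With sampling every 100 generations, a 20,000-generation burn-in
# corresponds to discarding the first 200 samples of a real run.
BURN_IN_SAMPLES = 20_000 // 100

# Hypothetical, very short traces used only to exercise the functions.
run_a = [-1500.0, -1400.0, -1338.0, -1333.0, -1336.0, -1334.0]
run_b = [-1480.0, -1390.0, -1337.0, -1335.0, -1334.0, -1336.0]

summaries = [summarize_run(run, burn_in=2) for run in (run_a, run_b)]
print(summaries, runs_agree(summaries))
```

Comparing summary statistics in this way is a coarse check; it complements, rather than replaces, the comparison of consensus topologies and posterior probabilities reported in the text.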

DISCUSSION
The phylogenetic analysis separated T. griseata and T. comae into different clades (Fig. 3), supporting the suggestion that these taxa represent separate species (Kaila & Albrecht, 1994; Miller et al., 2001) or, as expressed by Trusch et al. (2002), "natural entities". Nonetheless, two individuals, T. griseata 180 and T. comae 137, were included in the clade of the other species. In order to exclude the possibility of misidentification, the morphological characters of both individuals were re-examined; our identifications were later confirmed by L. Kaila (University of Helsinki). To rule out mistakes in genetic analysis (picking the wrong individual, contamination, sequencing errors) and subsequent misinterpretation of data, the whole analysis from dissection to DNA purification and sequencing was repeated for these two specimens, using the posterior parts of their abdomens as the source of genomic DNA (see also Material and Methods). As this re-examination confirmed the results of the first analysis, we conclude that the identity and positions of T. griseata 180 and T. comae 137 in the phylogenetic tree are valid.

Fig. 2. Median joining network of T. griseata (black, haplotype name begins with G) and T. comae (white, haplotype name begins with C), based on COI and ND1 sequences (790 bp).

Fig. 3. Neighbour-joining tree of T. griseata and T. comae, based on sequences of the mtDNA gene fragments: COI (392 bp) and ND1 (398 bp). As the Bayesian analyses yielded trees with similar topology, Bayesian posterior probabilities are included. Above the branches are bootstrap supports of the neighbour-joining tree (NJ) and below are posterior probabilities for Bayesian inference (BY). The average branch-length consensus tree was constructed from 9980 Bayesian trees recovered using the GTR+I+G model, lnL = -1335.49 ± 5.29.
Following Mayr's (1963) biological species concept, it is generally assumed that species must be monophyletic entities (Harrison, 1998). However, as shown by Pamilo & Nei (1988) and summarized from various sources by Wahlberg et al. (2003) and Funk & Omland (2003), both polyphyly and paraphyly may arise during the speciation process. Complete lineage sorting may or may not have occurred during speciation and it cannot be assumed to be an ineluctable event (Wahlberg et al., 2003).
Incomplete lineage sorting can result either from a recent divergence of populations, or from hybridization, which does not allow complete segregation of lineages. In the case of recent divergence, not enough time has elapsed for mutations to become fixed; both species still share identical or very similar haplotypes (i.e. have retained the polymorphism of their ancestral population). However, the claim of incomplete lineage sorting holds only for the genetic markers under investigation, usually the genes of mtDNA. It is conceivable that complete lineage sorting has occurred at other loci in the genome; candidates are primarily genes related to reproduction and the formation of a reproductive barrier between diverged populations, either at a pre- or postzygotic level (genes associated with genitalia formation, the pheromone system, etc.). Finding the sorted locus or loci in animals other than model organisms is an unrealistic task at present. A statement solely in favour of incomplete lineage sorting between species presumes that hybridization is ruled out. If hybridization occurs and the hybrids yield fertile offspring, then successful backcrossing does not allow lineages to diverge. On the other hand, when diverged populations hybridize (i.e. after the disappearance of geographic barriers) but do not produce fertile offspring, one can still observe individuals that carry haplotypes of another species. However, since they do not pass genes to the next generation, the parent population retains its genetic status.
Here, the closest mtDNA haplotypes of T. griseata and T. comae differ from each other by at least 10 substitutions (Fig. 2), indicating that these taxa did not separate from each other recently. Moreover, T. griseata has a significantly diverged intraspecific genetic structure (Fig. 3), which supports the hypothesis that the intraspecific diversification of T. griseata has an ancient history. It is therefore unlikely that the haplotypes of T. comae 137 and T. griseata 180 have retained their ancestral mtDNA sequence while a remarkable number of mutations have accumulated in the population of T. griseata (and to a lesser extent in T. comae). Consequently, we favour the scenario that the few atypical sequences in T. griseata and T. comae most likely originated from hybridization. Additional support for the hybridization scenario comes from the phenology of T. griseata and T. comae. T. comae is bivoltine, while T. griseata is usually univoltine. Both species are protandric (i.e. males appear earlier than females), as are most insects with discrete generations (e.g. Carvalho et al., 1998). Flight periods of T. griseata and T. comae differ, but partly overlap in southern Finland (Kaila & Albrecht, 1994).
In Estonia, a similar pattern was observed (Fig. 4): the flight period of the first generation of T. comae ranges from late May to the end of June; the second and more abundant brood is on wing from late July to September. The flight period of T. griseata begins in late June and peaks in the middle of July, lasting until the end of the month. Therefore, the flight periods of the species overlap at least twice in each year, providing an opportunity for the species to hybridize. The earliest possible time when hybridization can occur is the end of June, when both sexes of the first generation of T. comae and only males of T. griseata are flying, but females of T. griseata have not emerged yet. Males of T. griseata not finding a receptive female of their own species can copulate with females of T. comae.
Hybridization may also occur at the end of July, when both sexes of T. griseata, but only males of the second generation of T. comae, are on wing and females of the second T. comae brood have not yet emerged. At this time, males of T. comae may hybridize with females of T. griseata. However, it is unlikely that there are many unfertilized females of either Timandra species searching for mates at the end of their flight period. Therefore the probability of hybridization is low, which can explain why only two Timandra specimens of possible hybrid origin were found during this study.
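The argument that the flight periods overlap twice a year can be illustrated by treating each flight period as a date interval and intersecting them. In the Python sketch below, the calendar dates are our own stand-ins for the verbal descriptions of the Estonian data ("late" is taken as the 21st of the month, and "September" as running to the 30th); they are assumptions, not observed values, and the sex-specific emergence times discussed above are not modelled.

```python
from datetime import date

YEAR = 2001  # arbitrary non-leap year; only month and day matter here

# Assumed dates standing in for the verbal descriptions in the text.
COMAE_GEN1 = (date(YEAR, 5, 21), date(YEAR, 6, 30))  # late May to end of June
GRISEATA = (date(YEAR, 6, 21), date(YEAR, 7, 31))    # late June to end of July
COMAE_GEN2 = (date(YEAR, 7, 21), date(YEAR, 9, 30))  # late July to September


def overlap(period_a, period_b):
    """Return the shared (start, end) of two inclusive date ranges, or None."""
    start = max(period_a[0], period_b[0])
    end = min(period_a[1], period_b[1])
    return (start, end) if start <= end else None


def days_inclusive(period):
    """Number of calendar days in an inclusive (start, end) range."""
    return (period[1] - period[0]).days + 1


for name, generation in (("first", COMAE_GEN1), ("second", COMAE_GEN2)):
    shared = overlap(generation, GRISEATA)
    if shared:
        print(f"T. griseata overlaps the {name} T. comae generation "
              f"for {days_inclusive(shared)} days")
```

With these assumed dates, each overlap lasts only about a week and a half, which is in line with the low hybridization probability argued for in the text.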
The hypothesis of hybridization between T. griseata and T. comae receives further support from the discussion of Pittaway (1993) for the closely related sphingid species Hyles euphorbiae euphorbiae (Linnaeus, 1758) and Hyles vespertilio (Esper, 1780). A likely explanation is that T. comae 137 and T. griseata 180 are not F1 hybrids and the hybridization event is more ancient. The eastern limit of T. griseata is unknown, but it is probably significantly further east than current data indicate. Notes that T. griseata occurs south of Scandinavia and the Baltic States (Viidalepp, 1996) should be interpreted with care until validated (as the identification was carried out before the recognition of T. griseata and T. comae as separate species, all individuals were identified as T. griseata).
T. griseata and T. comae became sympatric during the 20th century. T. griseata has been resident in Finland since the 19th century (the oldest records date back to 1869), while T. comae expanded its distribution into this area during the 20th century. The earliest Finnish specimen of T. comae was taken in 1920 (Kaisila, 1954; Kaila & Albrecht, 1994). Our survey of the Estonian insect collections gave similar results: all Timandra individuals collected between 1874 and 1939 were identified as T. griseata; the earliest Estonian T. comae specimen was collected in 1940. The largely different geographic distributions and timing of colonization by these species support Kaisila's (1954) hypothesis that T. griseata (Calothysanis amataria griseata in his paper) and T. comae (C. a. brykaria) have different colonization histories in northern Europe: in addition to the differences in their time of arrival, they also came via different routes, T. griseata from the east and T. comae from the west.
The T. griseata clade is divided into two subclades (Fig. 3) that differ in geographic distribution. Most of the specimens from Estonia and Finland belong to subclade II, but three haplotypes constitute subclade I. Three specimens of subclade I were from northern Estonia (i.e. all north Estonian individuals) and one from the Tatra Valley near Tartu in eastern Estonia (Fig. 1). Data on the phylogenetic structure and geographic distribution of the different haplotypes of T. griseata further indicate that this species may have come from two directions. The most likely scenario is that specimens of subclade I spread into Estonia via the northern side of Lake Peipsi, while ancestors of subclade II came via a southern route (Fig. 5). A similar colonization pattern is documented for another lepidopteran, Parnassius mnemosyne estonicus Bryk, 1922, which rapidly expanded its range, both in northern and southeastern Estonia, during the last two decades (Viidalepp, 2000).
The completely different intraspecific genetic structure of T. griseata (highly divergent, ancient radiation) compared to T. comae (limited divergence, rapid radiation) shows that the evolution of these taxa has been different. The complex phylogenetic structure and current geographic distribution of T. griseata in northern Europe indicate that this species colonized northern Europe from several (or at least two) refugia. An alternative possibility, that the population with high intraspecific genetic polymorphism came from a single refuge area, is less likely because of its current distribution pattern. The limited genetic divergence of the population of T. comae suggests that all current populations in Europe came from a single refuge area and descend from a population with low polymorphism at the mtDNA loci. Another explanation, that only part of the genetic diversity reached the current locations (i.e. the population has gone through a genetic bottleneck), deserves less credit, as the geographic range of the specimens studied was wide (from Bulgaria to Scandinavia).
For a sustainable identification of animals, Hebert et al. (2003) proposed a system that employs DNA sequences as taxon "barcodes". They suggested that the mitochondrial COI gene could serve as the core of a global bioidentification system. Our results tend to support their proposal, although with reservations. As two Timandra specimens were placed into the "wrong" clade, we suggest that complete lineage sorting and lack of hybridization are necessary preconditions to be fulfilled before the use of "barcoding" is justified.

CONCLUSIONS
The phylogenetic structure, colonization history, phenology and current distribution of T. griseata and T. comae seem to be an example of the evanescence of geographic barriers between populations, which has resulted in hybridization, albeit with low frequency. As the populations nevertheless appear to retain their integrity, separation of T. griseata and T. comae into two separate species is warranted.