EUROPEAN JOURNAL OF ENTOMOLOGY EUROPEAN JOURNAL OF ENTOMOLOGY Mitochondrial genomes of two wild silkmoths, Samia watsoni and Samia wangi (Lepidoptera: Saturniidae), and their phylogenetic implications

. The wild silkmoth genus Samia Hübner, 1819 (Saturniidae) contains a number of economically important species in industrial silk production. However, the interspeci ﬁ c relationships within the genus remain unclear. We sequence the mitogenomes of Samia watsoni Oberthür, 1914 and Samia wangi Naumann & Peigler, 2001. Both mitogenomes are annotated and found to be cyclized, with 37 genes (13 PCGs, 2 rRNA genes and 22 tRNA genes). Using maximum likelihood and Bayesian inference methods, we analyze these mitogenomes together with a further 68 downloaded from GenBank (65 Bombycoidea and 5 Lasiocampidae as the outgroup) to investigate the phylogenetic relationships both within the genus and those among the three families of the ‘SBS’ group: Bombycidae, Saturniidae and Sphingidae. The results show that within Samia , S. ricini is closely related to S. canningi , and not S. cynthia of which it has previously been considered to be a subspecies. Although arguments have been proposed to treat S. ricini and S. canningi as conspeci ﬁ c, we choose to accept the morphological arguments and continue to treat them as two separate species. Samia watsoni is corroborated as the sister group of all other Samia species, but nevertheless should be included within Samia rather than being placed in its own monobasic genus. Our analysis recovers the following relationship among the three families of the ‘SBS’ group: (Saturniidae + (Bombycidae + Sphingidae)). This agrees with previous studies based on analysis of mitogenomes but continues to contradict the results derived from phylogenomic analysis of nuclear genomes.

a current focus being on the relationship among the three families, Bombycidae, Saturniidae, and Sphingidae (Hamilton et al., 2019).
Early classifi cations of Bombycoidea were based solely on morphology, but convergent evolution has confused our understanding of their evolution, as exemplifi ed by the two genera Rotunda and Arotros (Hamilton et al., 2019). Furthermore, although the monophyly of Bombycoidea is supported by six morphological synapomorphies, only two of these are systematically informative (Zwick, 2008). More recently, phylogenetic research has increasingly used the techniques of molecular sequence analysis. In the fi rst such analysis of Bombycoidea, Regier et al. (2008) sequenced Mitochondrial genomes of two wild silkmoths, Samia watsoni and Samia wangi (Lepidoptera: Saturniidae), and their phylogenetic implications INTRODUCTION The Lepidoptera (butterfl ies and moths) have unique feeding habits, diverse geographical distributions and multi-directional patterns of species evolution, which makes the group an excellent model for the analysis of the diversity of community systems (De Camargo et al., 2016). Within Lepidoptera, the superfamily Bombycoidea currently comprises ten families (Anthelidae, Apatelodidae, Bombycidae, Brahmaeidae, Carthaeidae, Endromidae, Eupterotidae, Phiditiidae, Saturniidae and Sphingidae), 520 genera and 6092 species (Kitching et al., 2018;Hamilton et al., 2019). However, the phylogenetics of the families within Bombycoidea remains controversial, with Another species of Samia, S. watsoni (Oberthür, 1914), was originally described in the monobasic genus Desgodinsia Oberthür, 1914(Naumann et al., 2014. While some lepidopterists accepted this taxonomy, others, including the great Claude Lemaire (see Lemaire & Peigler, 1982), treated Desgodinsia as a synonym of Samia and placed the species in that genus. Peigler & Naumann (2003) considered that although S. watsoni is the sister taxon to all other Samia, this was insuffi cient to warrant the recognition of a separate genus for it and advocated that S. watsoni should be left in Samia. In contrast, based on a study of the shapes of the wing eyespots, body size and male genitalia structure, Brechlin (2007) considered that S. watsoni did not conform to the diagnostic characteristics that united the other Samia species, and described a new genus, Archae osamia Brechlin, 2007, to accommodate the species watsoni (this new genus name was required because Desgodinsia Oberthür, 1914 is a junior primary homonym of Desgodinsia Senna, 1894 in Coleoptera, and is so unavailable for use in the present case). However, Naumann et al. (2014) pointed out that there were probably suffi cient genera in Lepidoptera at present, especially monobasic ones, and it was not necessary to recognize a new genus for a single species just because it exhibits some relatively minor differences from the remaining species in its genus. Therefore, they synonymized Archaeosamia with Samia and returned watsoni to the latter genus.
Mitochondria are important organelles in eukaryotic cells, not only providing power for cells but also participating in apoptosis (Wang et al., 2009;Saita et al., 2017). As semi-autonomous organelles, mitochondria contain their own genetic material, comprising two ribosomal RNA genes (rRNAs), 22 transfer RNA genes (tRNAs), one major non-coding sequence (A + T rich region) and 13 protein coding genes (PCGs) (including genes related to autogenesis), and a unique translation system (Singh et al., 2017;Xin et al., 2017;Kim et al., 2018;Wang et al., 2018). The mitochondrial genome is a double stranded circular molecule with a size range of 14-19 kb and has been widely used in animal evolutionary research, including molecular evolution, evolutionary genomics, phylogenetics and population genetics, due to its small size, maternal inheritance, lack of genetic recombination, rapid evolutionary rate, multiple copies within cells and easy amplifi cation (Liu et al., 2012;Chen et al., 2014;Wu et al., 2016).
In this study, we sequenced the complete mitogenomes of S. watsoni and S. wangi and undertook a phylogenetic analysis based on these new data and sequences from Gen-Bank to explore the internal relationships of Samia and the phylogeny of 'SBS' group.

Sampling and DNA extraction
Adult moths of S. wangi and S. watsoni were collected in Huangshan city, Anhui Province, China. Species identifi cation was confi rmed by examination of the dissected male genitalia. Legs were preserved in absolute ethanol at -20°C before DNA extraction. Total DNA was isolated using a TIANamp Genomic fi ve nuclear genes from 66 species and found that the families Bombycidae, Sphingidae and Saturniidae, which had been placed on different branches in the superfamily, actually comprised a single monophyletic group, with the relationship: (Sphingidae + (Bombycidae + Saturniidae)). In the same year, based on a study of two nuclear genes (CAD and Ef-1a), Zwick (2008) obtained a different pattern of relationships, namely (Saturniidae + (Bombycidae + Sphingidae)). The following year, a new study by Regier et al. (2009) based on fi ve nuclear genes obtained results consistent with those of Zwick but in the same year, in a study that increased the sampling to 20 genes, Zwick et al. (2009) recovered the original pattern of relationships found by Regier et al. (2008), namely (Sphingidae + (Bombycidae + Saturniidae)). They also introduced the concept of the 'SBS' group for the clade comprising these three families. In one of the fi rst phylogenomic studies of Lepidoptera, Kawahara & Breinholt (2014) analyzed a nuclear gene dataset (combining 33 new transcriptomes with 13 available genes, transcriptomes and expressed sequence tags) for 46 species of butterfl ies and moths and recovered the third possible topology for the 'SBS' group: (Bombycidae + (Saturniidae + Sphingidae)). Xin et al. (2017) undertook a phylogenetic analysis on 34 complete mitochondrial genomes with the same result and in the same year, Kim et al. (2017) also published an analysis of mitogenomes, concluding the relationship was (Saturniidae + (Bombycidae + Sphingidae)). Wang et al. (2018) then arrived at the same conclusion as Kim et al. (2017), this time based on 39 complete mitogenomes. Most recently, using a newly developed anchored hybrid enrichment probe set sampling 571 genes across 117 species and all major bombycoid lineages, Hamilton et al. (2019) concluded the relationship among the three 'SBS' group families was again (Bombycidae + (Saturniidae + Sphingidae)). Thus, there is still considerable uncertainty over these relationships, with much perhaps depending on the sampling, of both genes and taxa, and the analytical methods employed (Fig. 1).
Within the Saturniidae subfamily Saturniinae, tribe Attacini, the genus Samia includes several species that are used as both model organisms in scientifi c research and in industrial silk production, particularly S. ricini W. Jones, 1791. Compared with Bombyx mori Linnaeus, 1758, S. ricini has the advantages of higher silk yield, greater disease resistance and easier rearing, and so has been regarded as a new model species to replace Bombyx mori in molecular and cellular experiments (Meier et al., 2000;Lee et al., 2021). Samia ricini has sometimes been treated as a subspecies of S. cynthia (Drury, 1773) but molecular phylogenetic analyses have now shown that S. ricini is instead very closely related to S. canningi (Hutton, 1859) (Lemaire & Peigler, 1982). Indeed, Peigler & Calhoun (2013) confi rmed that S. ricini is a domesticated species, derived of S. canningi, that is not known in the wild and treated the two as conspecifi c. However, because of their obvious morphological differentiation, many taxonomists continue to regard them as two separate species (Huang et al., 2021). DNA Kit according to the manufacturer's instructions. The extracted DNA was then used to amplify the complete mitogenomes by PCR following the protocols given by Tyrrell (1997).

Sequencing and assembly
A whole genome shotgun (WGS) strategy was used for sequencing on an Illumina NovaSeq platform. Data quality was checked using FastQC (Andrews, 2020) and mitogenome assembly was undertaken using NOVOPlasty (Dierckxsens et al., 2016).

Mitochondrial genome annotation
MitoZ was used for gene annotation and the MITOS WebSever was used to identify tRNA genes and predict their secondary structure (Bernt et al., 2013;Meng et al., 2019). The parameters were set with Invertebrate Mito genetic code. Each tRNA gene sequence was checked manually. Protein-coding genes (PCGs) were identifi ed as open reading frames corresponding to the 13 PCGs of Saturniidae mitogenomes.

Sequence analysis
MEGA X was used to analyze base composition and relative synonymous codon usage (RSCU) (Kumar et al., 2018). The calculation of AT-skew and GC-skew was based on the formula proposed by Hassanin et al. (2005): AT-skew = (A -T) / (A + T), GCskew = (G -C) / (G + C). DnaSP was used to compute the numbers of synonymous substitutions per synonymous site (Ks) and nonsynonymous substitutions per nonsynonymous site (Ka) for the 13 PCGs in the mitogenome (Rozas et al., 2003).

Phylogenetic analysis
A total of 68 mitochondrial genomes were downloaded from GenBank (Table S1) and together with the two newly sequenced species were used to construct a phylogenetic tree. Five species of the family Lasiocampidae were used as the outgroup, and the remaining 65 species represent six families of Bombycoidea (Bombycidae, Brahmaeidae Endromidae, Eupterotidae, Saturniidae and Sphingidae). Alignment of PCGs was conducted by MAFFT 7.3.1 using G-INS-I algorithms (Katoh et al., 2016). Two rRNA segments were aligned with MEGA X (Kumar et al., 2018). The alignments were then concatenated into a single matrix using Phylosuite (Zhang et al., 2019). Two data sets were analyzed: (1) PCG, comprising just the 13 protein coding genes; and (2) PCG + rRNA, which comprises the 13 protein coding genes and the two rRNA genes.
To reconstruct the phylogenetic tree, both ML (maximumlikelihood) and BI (Bayesian inference) methods were applied to the concatenated dataset. Maximum likelihood analysis was conducted in W-IQ-TREE (Trifi nopoulos et al., 2016) using the best-fi t substitution model. An ultrafast bootstrap (UFB) of 1000 replications and the SH-aLRT test were used to assess branch supports. Bayesian inference analysis was conducted using Phy-loBayes (Lartillot et al., 2013). The fi rst 25% of samples were discarded as burn-in and the remaining samples used to generate a 50% majority rule consensus tree. FigTree v.1.4.0 was used to view the resulting trees (Rambaut, 2020).

Comparative mitogenomes analyses within Samia
Samia watsoni and S. wangi are the newly obtained cyclized sequences and the mitogenomes of S. canningi, S. cynthia and S. ricini were downloaded from GenBank. The genes of these fi ve species were annotated with MITOS WebServer (Bernt et al., 2013) and the secondary structure of their tRNAs analyzed. These tRNAs were mapped with AI (McLean., 2002), and the structural differences of S. canningi, S. cynthia and S. ricini were then compared. Geneious was used to compare the different sites in the mitogenome sequences of S. canningi, S. cynthia and S. ricini (Kearse et al., 2012). MEGA X was used to calculate the pairwise distances among the fi ve species.
Ka/Ks analysis shows this ratio to be less than one in all fi ve species, indicating that these genes are negatively selected. The 13 protein coding genes of S. canningi and S. ricini are under almost the same selection pressure. The selection pressures of S. cynthia and S. wangi are also close (Fig. S2).

Protein-coding genes (PCGs)
As in other Lepidoptera, the mitogenomes of S. wangi and S. watsoni contain three cytochrome c oxidase subunits, seven NADH dehydrogenase subunits, two ATPase subunits and one cytochrome b gene. The total lengths of the 13 PCGs of S. wangi and S. watsoni are 11227 bp and 11224 bp respectively. Tables S2-S3 list the composition of the mitogenomes of S. wangi and S. watsoni. The initiation codons of COX1 in S. wangi and S. watsoni are CGA, and the initiation codons of COX2 in S. wangi and S. watsoni are GTG (Kim et al., 2009(Kim et al., , 2014Margam et al., 2011;Park et al., 2016). COX2 in S. wangi and S. watsoni has a single t-termination, and the termination codons of the other PCGs are complete. The frequencies of A and T in the protein coding genes are signifi cantly higher than those of C and G (Table 1). To further explore the composition of the protein coding genes, we carried out RSCU (relative synonymous codon usage) analysis (Fig. S3). The comparison shows that UUA is the most frequently used codon, and GCG is the least frequently used. The frequency of NNT and NNA is signifi cantly higher than that of NNG and NNC, indicating that there is a strong A and T bias in the third codon position.

Transfer RNA and ribosomal RNA genes
There are 22 tRNAs in each of the two species (Figs S4-S5), and the total lengths are 1462 bp (S. wangi) and 1472 bp (S. watsoni), accounting for 9.5% and 9.6% of the total mitogenome respectively. A and T are used more frequently than C and G ( Table 1). The AT skew is positive and the GC skew is negative. The total lengths of the rRNA gene fragments of these two species are 2190 bp (S. wangi) and 2191 bp (S. watsoni), accounting for 14.3% (S. wangi) and 14.2% (S. watsoni) of the total mitogenome respectively.

A+T-rich region
The A+T-rich region of the two species is located between rrnS and trnM, with lengths of 328 bp (S. wangi) and 322 bp (S. watsoni). A+T accounts for 91.1% of the whole A+T-rich region in both species. A-T skew and G-C skew analysis showed that S. wangi and S. watsoni have clear T and C usage bias (see Table 1 for further details).

Phylogenetic analysis
The monophyly of Samia is highly supported by both datasets (13 PCGs and 13 PCGs+2 rRNA) and both analytical methods (BI and ML) (Fig. 3). Samia watsoni is the fi rst species to diverge within the genus (Fig. 3). The remaining four species form a clade within which S. ricini + S. canningi and S. cynthia + S. wangi form two reciprocally monophyletic pairs.
The relationships among the families in Bombycoidea are consistent in both the ML and BI analyses based on the two different datasets, with the topology: ((((Bomb ycidae + Sphingidae) + Saturniidae) + Endromidae) + (Eupterotidae + Brahmaeidae)). Thus, our results agree with those of previous studies that found Bombycidae and Sphingidae form a clade to the exclusion of Saturniidae. All families with more than one representative are recovered as mono-phyletic and most nodes are highly supported (every node in Endromidae, Bombycidae and Sphingidae). Although Saturniidae was monophyletic, it is only moderately supported. With the exception of some small differences at lower levels, the topological structures of ML and BI trees are the same.

Comparative mitogenomes analyses within Samia
The tRNA structures of S. canningi and S. ricini were almost identical, the only differences being alternative codons at four sites in the sequences of trnM, trnI and trnK. More signifi cant differences are found between the tRNAs of S. cynthia and S. ricini, in which the TΨC loop of trnR and DHU loop of trnF showed clear structural differences, and trnM, trnI, trnQ, trnY, trnK, trnD, trnA, trnE, trnH, trnT, trnS2 and trnL1 all varied in sequence (Figs S6-S7). Table S4 shows the conserved and variable sites among S. ricini, S. canningi and S. cynthia. CYTB in S. ricini and S. canningi has the most variable sites (55/1149). The number of variable sites of S. canningi and S. ricini are the least among the comparisons. The variation in sites of most genes between S. canningi and S. ricini is either zero or only a single site, indicating a high degree of sequence similarity.
The pairwise distance analysis shows that S. wangi is closest to S. cynthia, which is consistent with the results of the phylogenetic analyses. The distance between S. canningi and S. ricini is only 0.003. Samia watsoni is much more divergent from the other four species (Table S5).

DISCUSSION
Previously, S. ricini was sometimes treated as a subspecies of S. cynthia (Peigler & Calhoun, 2013). However, the results of our study refute this taxonomic treatment for the following reasons. (1) In the Ka/Ks analysis, the selection pressures on S. ricini and S. canningi are similar, as are those of S. cynthia and S. wangi, whereas the selection pressures between S. cynthia and S. ricini are much greater.
(2) 14 of the 22 tRNAs of S. cynthia and S. ricini have a different structure. (3) The number of variable sites in S. cynthia and S. ricini is much higher than between S. canningi and S. ricini, which is particularly evident in the 13 PCGs (Table S4). (4) The results of interspecifi c genetic distance analysis showed that the genetic distance between S. cynthia and S. ricini was 0.11 but that between S. cynthia and S. wangi was only 0.005, but S. cynthia and S. wangi are considered to be two separate species (Table S5). (5) The phylogenetic tree derived from the two data sets and analytical methods showed that S. cynthia and S. wangi are more closely related each other than either is to S. ricini. Therefore, S. ricini cannot be a subspecies of S. cynthia. Instead, S. ricini is recovered as the sister taxon to S. canningi. Samia ricini has been considered to be a domesticated species derived from S. canningi and as a consequence, the two taxa should be treated as conspecifi c (Peigler & Calhoun, 2013). To test this conclusion, we carried out a number of analyses. The tRNAs of S. ricini and S. canningi are very similar, with only four mutations separating them. Evolutionary rates analysis showed that the 13 mitochondrial genes of the two species are under almost the same selection pressure. The codon usage frequencies of S. ricini and S. canningi are identical and the genetic distance analysis of the COI gene between S. ricini and S. canningi is zero. All of this evidence supports S. ricini and S. canningi being one and the same species. Huang et al. (2021) reached similar conclusions using DNA barcoding methods (Huang et al., 2021). They concluded that interspecifi c genetic distance played an important role in determining species delimitation, a position we adopted here. Moreover, Huang et al. (2021) found that multiple COI genes of S. ricini and S. canningi nested among each other on their phylogenetic tree, again providing strong evidence that they are the same species. However, another Fig. 3. ML tree and BI tree based on AA (amino acid sequence) and 13 PCGs (protein coding genes) + 2 rRNA data sets. The order is: AA (ML) / AA (BI) / 13 PCGs + 2 rRNA (ML) / 13 PCGs + 2 rRNA (BI). All nodes that do not display support are 1/100/1/100. All images in this fi gure are provided by Decai Lu. analysis paints a contradictory picture. Peigler & Naumann (2003) considered S. ricini and S. canningi to be two different species based on structural differences, as well as behavioural differences. So, given that the names S. ricini and S. canningi have been used for a long time for two separate species, we consider that, from the perspective of research convenience, it is best to provisionally consider S. ricini and S. canningi as two different species.
Samia wangi was named by Naumann & Peigler (2001) for those populations of Samia in southern Mainland China, Taiwan and northern Vietnam that were once treated as S. walkeri (S. walkeri is now considered to be a junior synonym of S. cynthia). Early scholars clearly failed to distinguish S. wangi and S. cynthia as separate species, which is perhaps not surprising given how close they are on our phylogenetic tree (Fig. 3). Peigler & Naumann (2003) did note, however, that although the two species look very similar, they occur in very different habitats. Samia wangi lives in lowland and lower montane evergreen broad-leaved forest in the south, whereas S. cynthia lives in deciduous forests on the northern plain. Thus, their respective ecologies are clearly distinct.
Although Brechlin (2014) proposed a new genus, Archaeosamia, to accommodate Samia watsoni, this was not supported by Naumann et al. (2014). Our phylogenetic analysis corroborated S. watsoni as the sister group of all the other Samia studied here and that the genetic distance between it and any other Samia is greater than that between any pair of those other four species, However, this genetic distance is still far less than that between S. watsoni and Attacus, the other genus of the tribe Attacini included in our study. So, we concur that S. watsoni should be treated as a member of Samia rather than placed in a monobasic genus of its own.
Regarding the relationships among the three families of the 'SBS' group, our mitogenomic analysis yielded the following pattern: (Saturniidae + (Bombycidae + Sphingidae)). Thus, our study corroborates the conclusions of Kim et al. (2017) and Wang et al. (2017) but is based on increased taxon sampling and with greater support values for many clades. However, results from mitogenome analyses continue to disagree with those derived from phylogenomic analyses of nuclear genomes, in which increased taxon and gene sampling now consistently supports a sister group relationship between Saturniidae and Sphingidae to the exclusion of Bombycidae (Kawahara et al., 2014;Xin et al., 2017;Hamilton et al., 2019). Whether this confl ict can be resolved by yet further taxon sampling or, if not, what is the underlying explanation for this confl ict remains to be determined.

CONCLUSION
The conclusions of our study are as follows: (1) Samia ricini is very closely related to S. canningi, and more distant from a clade comprising S. cynthia and S. wangi.
(2) Our results are consistent with S. ricini being derived from S. canningi by a process of domestication, but we regard them as two species, rather than conspecifi c, based on the morphological evidence provided by previous authors (e.g., Naumann & Peigler, 2014). (3) We concur with Naumann et al. (2014) that S. watsoni should be included within the genus Samia rather than being placed in its own monobasic genus. (4) Our analysis recovered the following relationship for the three families of the 'SBS' group is: (Saturniidae + (Bombycidae + Sphingidae)). This agrees with previous studies based on analysis of mitogenomes but continues to contradict the results derived from phylogenomic analysis of nuclear genomes.
FUNDING. This work was supported by the National Natural Science Foundation of China (Grant No. 32100352, 32100355, 31871964), the Natural Science Fund of Anhui Province (Grant No. 1908085QC93) DATA AVAILABILITY STATEMENT. The raw data and the assemblies were deposited in the National Center for Biotechnology Information, with the BioProject access number PRJNA818861 (Samia watsoni) and PRJNA818466 (Samia wangi), with the Bi-oSample access number SAMN26885815 (Samia watsoni) and SAMN26863503 (Samia wangi), with the SRA access number SRR18441626 (Samia watsoni) and SRR18441011 (Samia wangi), with the GenBank access number ON059173 (Samia watsoni) and ON080860 (Samia wangi).

ACKNOWLEDGMENTS.
We thank Hao Zhang, a graduate student in our laboratory, for collecting and sorting the samples for this study.