Nucleotide diversity based on csd gene of the black giant honey bee , Apis laboriosa ( Hymenoptera : Apidae )

The mechanism of sex determination is common for all honeybee species (Apis spp.) by the complementary sex determination (csd) gene. The csd gene has been studied in the Western honey bee (Apis mellifera L.), the Eastern honey bee (Apis cerana F.) and the giant honey bee (Apis dorsata F.), whereas no studies had been conducted on the high altitude Himalayan or black giant honey bee, Apis laboriosa Smith. In the present study, we cloned the genomic exon 6 to exon 9 region of the A. laboriosa csd gene, and identified 13 csd haplotypes. The data was analyzed and compared with the other aforementioned three honeybee species. The results showed that, as with the other three Apis species, region 3 of the csd gene contains an RS domain at the N terminal, a prolinerich domain at the C terminal, and a hypervariable region in the middle. A phylogenetic tree showed that the csd haplotypes from A. laboriosa fell into one clade with those from A. dorsata, and were separated from those from the other two species, A. mellifera and A. cerana. The network map also showed that the csd haplotypes from A. laboriosa and A. dorsata are well mixed among each other, and do not form two separate branches. Pairwise Fst analysis revealed that the value between A. laboriosa and A. dorsata was very low (0.098), confirming a close relationship to exist between them. 215 * Corresponding author. this species, as it appears to be in the other three Apis species previously tested. In A. mellifera, the csd gene contains nine exons, which form three clusters separated by two large introns (Beye et al., 2003). The genomic region of the third cluster (from exon 6 to 9, region 3) has the highest level of polymorphism compared with the other two regions. Therefore, we chose this region to study in terms of potential polymorphism. MATERIAL AND METHODS


INTRODUCTION
In the honeybee, sex is determined by heterozygosity at a single locus (the Sex Determination Locus, SDL) which carries the complementary sex determiner (csd) gene (Gempe et al., 2009).Bees with a heterozygous csd combination are female, those with a hemizygous copy (haploid, unfertilized eggs) or with a homozygous combination are male.When csd genes are homozygous, a honeybee will develop into a diploid male, which is consumed at the larval stage by workers a few hours after it hatches (Woyke, 1963).Therefore, honeybees are under great selective pressure to become heterozygous (Liu et al., 2011).This complementary sex determination theory remained a hypothesis for a long time.About a decade ago, Beye et al. (2003) cloned the csd gene from Apis mellifera by positional cloning.They found that no transcription differences existed between the two sexes, but suppression of csd in females with double stranded RNA for csd resulted in male phenotypes.The exon 6 to exon 9 region of the csd gene has an R (arginine)-and S (serine)-rich domain in the middle and a P (proline)-rich domain at its C terminus, which is a potential splicing factor.Between these two domains is a hypervariable region that differs significantly between alleles, and has a variable region of short repetitive sequences (Beye et al., 2003;Cho et al., 2006;Wang et al., 2012).
Previous studies have shown that the csd gene has evolved under balancing selection in three honeybee species (A. mellifera L., A. cerana F. and A. dorsata F.), and several parts of the coding region are possible targets of selection (Charlesworth, 2004;Hasselmann & Beye, 2004, 2006;Cho et al., 2006;Hasselmann et al., 2008).In addition, the level of polymorphism is approximately seven times higher in the csd coding region than in the neutral regions (Hasselmann et al., 2006).
More recently, Liu et al. (2011) analyzed the polymorphism of the csd gene in the dwarf honey bee, A. florea F., and found a higher level of polymorphism of the csd gene in this species than in the other Apis species so far examined.Moreover, they found that the A. dorsata csd gene showed no founder effect between geographical population groupings from the Guangxi and Hainan provinces of China (Liu et al., 2012).
Although much evolutionary research has been conducted on the csd gene in A. mellifera, A. cerana, A. dorsata and A. florea, no research had hitherto been performed on the csd gene of the Himalayan or Black Giant honeybee, A. laboriosa Smith, possibly a high altitude subspecies of A. dorsata sensu stricto (s.s.), and with which it may share little or no contemporary gene flow (Engel, 1999).
In the present study, we analyzed the polymorphism of the csd gene of A. laboriosa, which has the biggest body size among the nine Apis species.This species A. laboriosa is widespread, mainly in Nepal, Bhutan, North-East of India, Southern Tibet, Western and Southern of Yunnan province in China.In light of the dearth of molecular biological information on A. laboriosa and its sex determination system, we believed it to be worth investigating whether the csd gene is also polymorphic in this species, as it appears to be in the other three Apis species previously tested.In A. mellifera, the csd gene contains nine exons, which form three clusters separated by two large introns (Beye et al., 2003).The genomic region of the third cluster (from exon 6 to 9, region 3) has the highest level of polymorphism compared with the other two regions.Therefore, we chose this region to study in terms of potential polymorphism.

Sample collection
A. laboriosa samples were collected from Yunnan province, China.A total of 30 workers from five different colonies were sampled, with each colony providing six workers.The samples were initially collected into 95% ethanol and then stored deep frozen at -70°C until molecular DNA analysis.

DNA extraction
Total genomic DNA was extracted from the cephalothorax of each sampled bee according to the protocol of the Animal Genomic DNA Extraction Kit (BEST ALL-HEAL LLC, New York, NY, USA).

PCR and sequencing
The primers used in this study for amplifying region 3 of the A. laboriosa csd gene were the same as reported by Cho et al. (2006).The high fidelity LA Tag DNA polymerase (BEST ALL-HEAL LLC, NY, USA) were used for all PCR reactions.PCR conditions were: denaturation at 94°C for 3 min, followed by 30 cycles of 94°C for 30 s, annealing at 50°C for 30 s and extension at 72°C for 2 min, with a final extension at 72°C for 10 min.PCR products were purified using DNA GEL EXTRACTION kits (Sangon, Shanghai, China) and cloned into the pEASY-T3 vector (Transgene, Beijing, China).To obtain as many csd alleles as possible, the genomic region 3 of the csd gene was cloned from cephalothorax of each sampled worker bee, and 1-3 clones of each cloned fragment were subjected to double-strand sequencing.Single-sequencing reads were assembled using the Seqman program in the DNAstar software (Burland, 2000).

Sequence analysis
After assembly, the vector sequences were removed from the assembled raw sequences, whereafter all the csd sequences obtained were compared with each other by sequence alignment to remove repeated sequences, the remaining sequences being different haplotypes of genomic region 3 of csd gene, which were then used for further analysis.The exons, introns and coding regions of these sequences were determined by consulting those previously determined for the genomic region 3 of the csd gene of A. mellifera, A. cerana and A. dorsata as reported by Cho (Cho et al., 2006) and cDNA sequences for the same three honeybee species as reported by Hasselmann et al. (2008).Since the presently determined sequences for A. laboriosa are genome sequences containing exons and introns, it was easy to identify these regions by aligning our sequences with the cDNA sequences for the csd gene reported by Hasselmann et al. (2008).Coding sequences of A. mellifera, A. cerana and A. dorsata csd genomic region 3 (n = 228 sequences) published by Cho et al. (2006) and Hasselmann et al. (2008) were downloaded from Genbank under accession numbers DQ324946-DQ325026, DQ325038-DQ325133 and EU100885-EU100935.Nucleotide sequence alignments were performed using Clustal X version 1.8 (Thompson et al., 1997), and alignment results were adjusted manually for obvious alignment errors.DAMBE 4.1.19(Xia & Xie, 2001) was used to identify haplotypes.The minimal evolution (ME) method and Kimura's 2-parameter distances were adopted to obtain an unrooted tree with 2,000 bootstrap replications.Nucleotide diversity () was calculated by using the DNAsp 5.0 program (Librado & Rozas, 2009).Pair-wise Fst distances were computed using ARLE-QUIN 3.1 software (Excoffier et al., 2005), whilst phylogenetic trees were constructed using MEGA version 4.0 (Tamura et al., 2007).Two tailed Z test was adopted to detect significant difference between two  values.Kimura's 2-parameter genetic distance was calculated by MEGA 4.0.A Median joining (MJ) network was drawn using the program Network 4.5 (Bandelt et al., 1999); briefly, an "rdf" file based on all the haplotypes was generated using DNAsp 5.0, it was then used for network calculation using Network 4.5 with the median joining method, and the final network map was drawn based on the calculated results.

Polymorphism of A. laboriosa csd haplotypes
Following cloning and sequencing of the genomic region 3 of the csd gene from A. laboriosa, we obtained Data for A. laboriosa was from this study, while data for species other than A. laboriosa were derived from Cho et al. (2006) and Hasselmann et al. (2008). 1 N-the sequence number used for analysis; 2 L-the sequence length excluding alignment gaps; 3 in this column, means followed by different uppercase letters differ significantly at P < 0.01 by two-tailed Z-test; 4  in this column, means followed by different lowercase letters differ significantly at P < 0.05 by two-tailed Z-test.laboriosa also shows a high level of polymorphism.This result further confirmed that the complementary sex determination mechanism is common for all the honeybee species so far examined (Cook, 1993).
Previous studies have shown that the (N)1-4Y repeats and (KHYN) 1-5 repeats are two types of important repeat sequences in the hypervariable region of the csd protein, which most likely plays a key role in determining the specificity of alleles.Interestingly, the (KHYN) 1-5 motifs exist in all the species investigated, except in A. mellifera (Hasselmann et al., 2008).Thus whilst this kind of repeat still exists in the ancient csd alleles, they have apparently been lost in A. mellifera at sometime during its evolution, perhaps during its intense domestication over the past 7000 years or so (Bloch et al., 2010).
A. laboriosa was named by F. Smith in 1871, according to morphological differences observed between A. dorsata s.s. and this putative species.In 1980, Sakagami found that these two species have significant difference in more than 100 morphological features.Arias & Sheppard (2005) have divided ten known honeybee species into three major clusters based on ND2 mitochondrial gene and EF1- intron.These are giant bees A. dorsata, A. laboriosa and A. binghami (author), dwarf bees A. andreniformis and A. florea, and the cavity-nesting bees A. mellifera, A. cerana, A. koschevnikovi (author), A. nuluensis (author), and A. nigrocincta (author) (Arias & Sheppard, 2005).Recent studies of phylogenetic relationships within the genus Apis also suggest that A. dorsata s.s. and A. laboriosa are in fact two separate species (Arias & Sheppard, 2005;Raffiudin & Crozier, 2007;Lo et al., 2010), although other researchers have failed to find obvious differences between the genitalia of drones in these two groups (McEvoy & Underwood, 1988).Thus, many scholars have questioned the classification status of A. laboriosa.In our study, the phylogenetic tree and the network of the A. dorsata s.s. and A. laboriosa csd haplotypes clearly reveal that the latter form is not a separate species, but rather, is probably an ecologicallydivergent (altitude-related) subspecies, more especially because the genetic differentiation between these groups is very low (these results).Probably the split of A. dorsata s.s. and A. laboriosa is a very recent event, while the divergence time of the csd gene is older than the divergence time between these two groups, leading to a transspecific polymorphism of the csd gene.To date, trans-specific polymorphism of alleles has been reported in several other organisms, including the major histocompatibility complex (MHC) of jawed vertebrates (Klein, 1987;Hughes & Nei, 1988;Takahata & Nei, 1990; Taka- hata et al., 1992), the self-incompatibility S locus of plants (Ioerger et al., 1990;Richman et al., 1996;Charlesworth & Awadalla, 1998), and the het-c heterokaryon incompatibility locus of fungi (Wu et al., 1998;May et al., 1999).The MHC and S alleles have been used to analyze population genetic structure and infer relative coalescence times and population history in many organisms (Takahata, 1990;Yuhki & O'Brien, 1990;Richman et al., 1996;Miller & Withler, 1997;Richman & Kohn, 1999;Bos et al., 2008).If trans-specific polymorphism of the csd gene really exists between the A. dorsata s.s. and A. laboriosa, the csd gene can be used in the future to determine the historical population size and study the speciation process under eusocial organization.
In conclusion, we found a high level of polymorphism of the csd gene in the black giant honeybee, A. laboriosa following molecular biological analysis.However, subsequent phylogenetic tree and network analysis of the sequences obtained revealed that A. dorsata s.s. and A. laboriosa are mixed together and are not separable by such means, thereby providing further convincing evidence, here molecular biological, that the latter is indeed a subspecies of the former.

Fig. 2 .
Fig. 2. The gene genealogy of csd haplotypes in region 3 of four species.The minimum evolution method and Kimura's two parameter distances are used to construct the tree.Bootstrap percentages are shown on internal branches.The scale bar represents the number of nucleotide changes per site.Accsd -A.cerana; alcsd -A.laboriosa; adcsd -A.dorsata; amcsd -A.mellifera.

TABLE 1 .
Nucleotide diversity () and nucleotide polymorphism () of the coding regions of csd alleles in four Apis species.