Molecular characterization of the gene pool of Exorista sorbillans (Diptera: Tachinidae), a parasitoid of silkworm, Bombyx mori, in India

Exorista sorbillans, the uzi fly, is a serious tachinid pest of silkworm and is present in all silk-producing areas of Asia. Assuming that E. sorbillans was accidentally transported from West Bengal to the southern states of India, its population genetic structure was studied using 13 ISSR primers, 3 RAPD primers, two sets of universal primers and two sets of primers designed from a lepidopteran repeat sequence. Statistical analyses of DNA markers revealed significant genetic variability between the E. sorbillans populations from 4 different geographic locations (within 400 km of one another) in the southern states and the one from West Bengal (Murshidabad). Multivariate and discriminant function analyses indicate that E. sorbillans from south India has diverged from the original gene pool of West Bengal and is suitable for studying the microevolution of adaptation to the conditions prevailing in the different cocoon-producing areas of India.
Abbreviations used. GP = geographic population; ISSR = Inter Simple Sequence Repeat; PCR = Polymerase Chain Reaction; RAPD = Random Amplified Polymorphic DNA; SPSS = Statistical Package for Social Sciences; UBC = University of British Columbia; UNIV = Universal.

In West Bengal, E. sorbillans is present in all sericulture areas (Mukherjee, 1899) except above 400 m in the hilly regions. The endoparasitoid arrived in south India and became a pest of silkworm there in 1980 (Samson, 1980). This is attributed to its transportation in live cocoons from West Bengal to Hosakote, a town near Bangalore. It is thought to have then spread gradually to nearby sericulture areas in Karnataka, Andhra Pradesh and Tamil Nadu (Samson, 1980; Narayanaswamy & Devaiah, 1998).
This is the first report on the population genetic structure of Exorista sorbillans. Five geographic populations (GPs) were studied: four from locations within an area with a perimeter of 1600 km around Bangalore, Karnataka, and one from Murshidabad, one of the most traditional silk-producing areas of West Bengal.

MATERIALS AND METHODS
Uzi fly populations. Exorista maggots or live pupae were collected from a rearing house of the SeriBiotech Laboratory at Kodathi (GP 5) and from cocoon markets at Channapatna (GP 3) and Ramanagaram (GP 4), all three in Karnataka; from Dharmavaram (GP 2), near Anantapur in Andhra Pradesh; and from the stock maintained at the Central Sericulture Research & Training Institute, Murshidabad, West Bengal (GP 1). GP refers to a geographic population of E. sorbillans collected from a specific area. The two cocoon markets at Channapatna and Ramanagaram in Bangalore are only 10 km apart and cater for the needs of sericulture farmers from near and far. The silk farm at Kodathi is isolated, has a perimeter of 10 km and supports a few cocoon growers; its population can therefore be considered a natural population with very little chance of out-crossing, the flight range of E. sorbillans being less than 3 km (Narayanaswamy et al., 1994).
DNA extractions. DNA was extracted either from individual maggots or pupae, or en masse from a minimum of five maggots or pupae, by phenol-chloroform extraction or with a genomic DNeasy kit (QIAGEN, Hilden, Germany). The DNA extracted en masse and individually is referred to as DNAmass and DNAind, respectively. Extracted DNA was subjected to further purification and dilution as described by Sethuraman et al. (2002). DNA was quantified by electrophoresis on a 0.8% agarose (GibcoBRL, Life Technologies, Paisley, Scotland, UK) gel, with uncut lambda DNA (New England Biolabs Inc., Beverly, MA, USA) as the reference.
Primers. Four groups of primers were used: (1) twelve dinucleotide and one trinucleotide ISSR primers of UBC Kit 9 (coded numerically), selected on the basis of earlier experiments; (2) three UBC RAPD primers (OPH 5, OPI 13 and 15) selected by random testing; (3) two universal primers; and (4) two pairs of primers, SBL534R and SBL578R, designed by the senior author on the basis of specific repeat sequences of Bombyx mori. PCR was done with a minimum of three DNAind and one DNAmass from each GP.
PCR, separation of amplified products and scoring. PCR was done with an MJ Research PTC 200 thermocycler in 25 or 30 µl of reaction mixture with a MgCl2 concentration of 2.5 mM for ISSR and 2.0 mM for RAPD, and 0.3 unit of Taq DNA polymerase per reaction. For other primers, the concentration of MgCl2 was suitably adjusted to get the best amplification. Taq polymerases came from Bangalore Genei (Bangalore, India), Genetaq (Singapore), and MBI (Fermentas, USA). However, for a particular type of primer, Taq polymerase from only one source was used.
Fig. 1. Three dendrograms based on cluster analysis of the DNA profiles of five DNAmass from five geographic populations (GP 1 - GP 5) using Anderberg's (1), Jaccard's (2) and squared Euclidean D2 statistics (3).
The thermal cycles adopted for amplification differed between primers. These were: (a) preheating at 94°C for 2 min followed by 35 cycles of 94°C for 0.5 min, 50°C for 0.5 min, 72°C for 2 min and final extension at 72°C for 10 min for ISSR primers; (b) preheating at 93°C for 2 min followed by 45 cycles of 93°C for 1 min, 36°C for 1 min, 72°C for 2 min and final extension at 72°C for 10 min for UBC RAPD primers; (c) preheating for 4 min at 94°C followed by 36 cycles of 94°C for 1 min, 46°C for 1 min, 72°C for 2 min and final extension at 72°C for 15 min for UNIV1 primers; (d) preheating for 4 min at 94°C followed by 36 cycles of 94°C for 1 min, 55.4°C for 1 min, 72°C for 2 min and final extension at 72°C for 15 min for UNIV2 primers; and (e) preheating for 4 min at 94°C followed by 36 cycles of 94°C for 1 min, 54°C for 1.5 min, 72°C for 1 min and final extension at 72°C for 10 min for SBL534R and SBL578R primers.
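The five cycling programs can be compared more easily when written as data. The following is a minimal sketch, not part of the original protocol: the dictionary layout and the function name `nominal_minutes` are illustrative assumptions, and ramp times between temperatures are ignored, so the totals are lower bounds on actual run time.

```python
# Cycling programs transcribed from the text above.
# "steps" holds (denaturation, annealing, extension) times in minutes.
# Ramp times between temperatures are NOT included (assumption).
PROGRAMS = {
    "ISSR":  {"preheat": 2, "cycles": 35, "steps": (0.5, 0.5, 2), "anneal_c": 50.0, "final": 10},
    "RAPD":  {"preheat": 2, "cycles": 45, "steps": (1, 1, 2),     "anneal_c": 36.0, "final": 10},
    "UNIV1": {"preheat": 4, "cycles": 36, "steps": (1, 1, 2),     "anneal_c": 46.0, "final": 15},
    "UNIV2": {"preheat": 4, "cycles": 36, "steps": (1, 1, 2),     "anneal_c": 55.4, "final": 15},
    "SBL":   {"preheat": 4, "cycles": 36, "steps": (1, 1.5, 1),   "anneal_c": 54.0, "final": 10},
}


def nominal_minutes(program):
    """Return the programmed hold time in minutes, excluding ramping."""
    per_cycle = sum(program["steps"])
    return program["preheat"] + program["cycles"] * per_cycle + program["final"]


if __name__ == "__main__":
    for name, program in PROGRAMS.items():
        print(f"{name}: {nominal_minutes(program):g} min, annealing at {program['anneal_c']} °C")
```

On these figures the RAPD program is the longest (192 min of programmed holds) and the ISSR program the shortest (117 min).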
Fig. 2. Plot of the five GPs (marked 1-5) in a matrix based on a discriminant function analysis. The rectangles with the GP numbers denote the estimated centres for the five GPs. The variability of the respective GPs is indicated by different symbols.
Table 1. Results of the "Tests of Between-Subjects Effects" for the profiles generated with 13 ISSR and 3 RAPD primers using mass genomic DNA of E. sorbillans from the five GPs detailed in the text.
PCR products were separated on a 1.5% agarose gel using Tris-borate-EDTA (ethylenediaminetetraacetic acid) buffer, stained with ethidium bromide and photographed with a Nikon FM2 SLR (Sethuraman et al., 2002).
Statistical analysis. Binary scores were transferred to SPSS format (SPSS/PC+ version 10.0; M.J. Norusis, SPSS Inc., Chicago) for all the analyses. The molecular profiles were analysed for genetic variability by multivariate analysis, hierarchical clustering (on the basis of five different statistics and the algorithm of average linkage between groups) and discriminant function analysis (DFA) based on the Mahalanobis distance (D2) statistic (www.spss.com).
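To illustrate how binary marker scores feed into two of the distance statistics named above, the sketch below computes the Jaccard distance and the squared Euclidean distance between presence/absence profiles. It is a plain-Python illustration, not the SPSS procedure used in the study, and the three eight-marker profiles are hypothetical values invented for the example.

```python
def jaccard_distance(p, q):
    """1 - (markers present in both) / (markers present in at least one)."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return 0.0 if either == 0 else 1.0 - both / either


def squared_euclidean(p, q):
    """For 0/1 data this equals the number of markers scored differently."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


# Hypothetical presence (1) / absence (0) scores for eight markers.
profiles = {
    "GP 1": [1, 0, 1, 1, 0, 0, 1, 0],
    "GP 3": [1, 1, 0, 1, 1, 0, 0, 1],
    "GP 4": [1, 1, 0, 1, 1, 0, 1, 1],
}

if __name__ == "__main__":
    names = list(profiles)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            dj = jaccard_distance(profiles[a], profiles[b])
            de = squared_euclidean(profiles[a], profiles[b])
            print(f"{a} vs {b}: Jaccard = {dj:.3f}, squared Euclidean = {de}")
```

Jaccard's measure ignores markers absent from both profiles, whereas the squared Euclidean distance counts every mismatch equally; this is one reason the two statistics can order the same populations differently.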

RESULTS
A total of 138 markers, ranging from 0.3 kb to 3.0 kb, were scored from the ISSR profiles, of which 61.4% were polymorphic. The three RAPD primers generated an average of 10.6 markers per primer, but with a significantly (t = 2.31; P = 0.016) lower level of polymorphism (43.7%).
The multivariate analysis of "between-subjects effects" for the five DNAmass samples ascertained the significance of the contribution of selected independent variables (DNA markers in this case). Table 1 presents the markers that contributed significantly towards the total variability and the variability between GPs.
However, to determine the degree of association between these five GPs, hierarchical clustering was tried using several distance measures, e.g. Jaccard's, Sokal and Sneath 5, Anderberg's D, squared Euclidean distance and phi 4-point correlation (www.spss.com). For all of these, the algorithm of average linkage between groups was adopted. The four steps of agglomeration are similar for clustering using phi 4-point correlation and the distance matrices generated from the squared Euclidean and Sokal and Sneath statistics. When clustering was done on the basis of Jaccard's distance matrix, the first two stages were reversed, but the resulting cluster diagram was similar (Fig. 1). In contrast, clustering using Anderberg's distance matrix produced a completely different picture, with the closest distance between GP 1 and GP 2. The last agglomeration resulted in a cophenetic correlation of 0.997 compared to 0.95 for the others. The relationship was further investigated by analyzing individual profiles.
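The agglomeration schedule referred to above can be illustrated with a short sketch of average linkage between groups, in which the distance between two clusters is the mean of all pairwise distances between their members. This is plain Python rather than the SPSS routine used in the study, and the distance values are hypothetical numbers chosen only to show the four merge steps for five populations.

```python
from itertools import combinations


def average_linkage(labels, dist):
    """Return the agglomeration schedule as (cluster_a, cluster_b, distance).

    dist maps frozenset({a, b}) to the distance between labels a and b.
    """
    def between(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(label,) for label in labels]
    schedule = []
    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        schedule.append((c1, c2, between(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return schedule


# Hypothetical pairwise distances between five populations.
labels = ["GP 1", "GP 2", "GP 3", "GP 4", "GP 5"]
raw = {
    ("GP 1", "GP 2"): 0.60, ("GP 1", "GP 3"): 0.70, ("GP 1", "GP 4"): 0.72,
    ("GP 1", "GP 5"): 0.74, ("GP 2", "GP 3"): 0.40, ("GP 2", "GP 4"): 0.44,
    ("GP 2", "GP 5"): 0.42, ("GP 3", "GP 4"): 0.10, ("GP 3", "GP 5"): 0.30,
    ("GP 4", "GP 5"): 0.34,
}
dist = {frozenset(pair): d for pair, d in raw.items()}
schedule = average_linkage(labels, dist)

if __name__ == "__main__":
    for step, (a, b, d) in enumerate(schedule, start=1):
        print(f"step {step}: {a} + {b} at distance {d:.3f}")
```

With five populations there are always four merge steps; changing the distance statistic changes the input matrix and can therefore reorder the steps, as reported for Jaccard's and Anderberg's measures.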

Characterization of five uzi fly groups on the basis of replicated profiles
A total of 83 markers were scored on the ISSR profiles of the DNA from 15 individuals belonging to the four GPs from south India and the DNAmass of GP 1. More markers were scored when working with individuals. For example, with DNAmass as the template, primer 810 generated 11 markers, whereas with individual DNA the same primer generated 19 markers.
The estimate of Wilks' lambda from the multivariate analysis revealed variability within populations, and the variability between GPs was evident from the estimates of Pillai's trace and Roy's largest root. Further, using the multivariate approach, 31 markers revealed a significant difference in frequency of occurrence between two groups. Table 2 presents twelve such markers that differed between GP 1 and the other GPs. For example, B810.01 revealed significant differences between GP 1 and the remaining four GPs (Table 2), but not between any two of the latter four GPs (data not shown).

Testing for variability between populations using discriminant function analysis
The eighty-three markers analyzed above were subjected to DFA. Four functions were identified, of which the first two together explain 94.1% of the variance and had canonical correlation estimates of 0.99 and 0.96. The χ2 estimate of Wilks' lambda further supports the significance (P = 0.000 to 0.022) of these functions. The GPs plotted in the matrix of function 1 (X axis) against function 2 (Y axis) are shown in Fig. 2, in which GP 1, GP 2 and GP 5 appear to have distinct distributions whereas those of GP 3 and GP 4 overlap. The divergence of 30 units on the X axis between GP 5 and GP 1 is most striking, and that between GP 2 and either GP 1 or GP 5 is also remarkable as it covers 50% of the Y axis. This analysis also identified seven (Table 3) out of the 83 markers scored in the 16 replicates, which differentially characterize the functions identified, as indicated earlier. Two of the markers are indicated by arrows in Fig. 3. Only four of the seven primers used generated these seven markers, which varied in size from 0.7 kb to 3.0 kb.
Fig. 3. Sixteen profiles generated with ISSR primer 881 using the genomic DNA of individuals from four GPs from south India and the DNAmass of GP 1. Markers 881.07 (arrow, 1) and 881.11 (arrow, 2) were identified by the discriminant function analysis (Table 3).
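The separation between group centres in the plane of functions 1 and 2 can be expressed as a squared Mahalanobis distance (D2), the statistic on which the DFA is based. The sketch below is an illustration only: it works in two dimensions with a pooled within-group covariance, the helper names are assumptions, and the coordinates are hypothetical rather than the scores plotted in Fig. 2.

```python
def centroid(points):
    """Mean position of a group of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def pooled_covariance(groups):
    """Pooled within-group covariance, returned as (sxx, sxy, syy)."""
    sxx = sxy = syy = 0.0
    dof = 0
    for points in groups:
        mx, my = centroid(points)
        for x, y in points:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
        dof += len(points) - 1
    return sxx / dof, sxy / dof, syy / dof


def mahalanobis_sq(a, b, cov):
    """Squared Mahalanobis distance between points a and b for a 2x2 covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = a[0] - b[0], a[1] - b[1]
    # d^T S^-1 d, with the 2x2 inverse written out explicitly.
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


# Hypothetical discriminant scores (function 1, function 2) for two groups.
group_a = [(0, 0), (1, 0), (0, 1), (1, 1)]
group_b = [(4, 0), (5, 0), (4, 1), (5, 1)]

cov = pooled_covariance([group_a, group_b])
d2 = mahalanobis_sq(centroid(group_a), centroid(group_b), cov)

if __name__ == "__main__":
    print(f"D2 between group centres: {d2:.1f}")
```

Because D2 scales the raw separation by the within-group spread, two groups a fixed distance apart count as more divergent when each group is tightly clustered.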
Data from the ISSR and RAPD profiles were scanned for the unique presence or absence of specific DNA markers in a particular GP, and the results are summarized in Table 4. A few of these markers are shown in Figs 3-5. There are more such markers in GP 1. The unique markers in the five GPs were generated by varying numbers of primers (Table 4). The number of primers involved for GP 1 was 8 for unique presence and 6 for unique absence, while for GP 5 it was as few as 1 and 3 for presence and absence, respectively. The absence of RAPD primers (marked with the letters OP) is apparent for GP 3, 4 and 5. The universal primers also revealed certain differences, but as few markers were scored, these results were not included with those of ISSR and RAPD. However, the difference between the profile generated for GP 1 and those of the other GPs is illustrated in Fig. 6. Likewise, profiles generated with the repeat primers also revealed differences between the five GPs (Fig. 7). It is also of interest that with primer SBL534R, a very prominent marker of ~450 bp was identified as a specific marker for GP 1. The presence and absence of specific markers in the different GPs revealed by selected primers can be compared with the inference drawn by Wolfe et al. (1998) from their studies on natural populations of Penstemon (Scrophulariaceae). For example, our results showed that GP 2 and GP 3 are characterized by the presence of six and nine markers generated by five and four primers, respectively, out of the total of sixteen primers used.
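Scanning profiles for markers that are uniquely present in, or uniquely absent from, one population is a simple tabulation. A minimal sketch follows, assuming one presence/absence score per marker per population; the function name and the marker labels (M01 to M05) and scores are hypothetical and do not correspond to the markers listed in Table 4.

```python
def unique_markers(profiles):
    """Find markers present in exactly one population or absent from exactly one.

    profiles maps population name -> {marker name: 0 or 1}.
    """
    markers = list(next(iter(profiles.values())))
    result = {pop: {"present": [], "absent": []} for pop in profiles}
    for marker in markers:
        carriers = [pop for pop, scores in profiles.items() if scores[marker]]
        lacking = [pop for pop, scores in profiles.items() if not scores[marker]]
        if len(carriers) == 1:
            result[carriers[0]]["present"].append(marker)
        if len(lacking) == 1:
            result[lacking[0]]["absent"].append(marker)
    return result


# Hypothetical scores for five markers in five populations.
profiles = {
    "GP 1": {"M01": 1, "M02": 0, "M03": 1, "M04": 0, "M05": 1},
    "GP 2": {"M01": 0, "M02": 1, "M03": 1, "M04": 0, "M05": 1},
    "GP 3": {"M01": 0, "M02": 1, "M03": 1, "M04": 1, "M05": 1},
    "GP 4": {"M01": 0, "M02": 1, "M03": 1, "M04": 0, "M05": 0},
    "GP 5": {"M01": 0, "M02": 1, "M03": 1, "M04": 0, "M05": 1},
}
unique = unique_markers(profiles)

if __name__ == "__main__":
    for pop, found in unique.items():
        print(pop, "unique presence:", found["present"], "unique absence:", found["absent"])
```

A marker scored in every population (M03 in this example) is reported for none of them, so only markers that discriminate a single population are retained.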

DISCUSSION
The use of ISSR primers revealed > 80% polymorphism in the population genetic structure, which substantiates their use by others (Fang and Roose, 1997). More than one tool is used to analyze genetic variability in plant systems (Hess et al., 2000; Davierwala et al., 2000), but rarely in animal genomes (Kumar et al., 2001). The present result indicates the usefulness of such an approach.
Table 4. List of markers unique in their presence or absence in a particular GP, for the five populations (markers generated using DNAmass). "+" and "-" indicate the presence or absence of the marker in that specific GP. The last row is the number of primers used for scoring presence or absence of markers.

Genetic diversity in uzi fly populations from five GPs and its relevance
The results (Fig. 3) clearly reveal genetic variability between individuals in particular GPs, but what do such differences in profile signify? Assuming microevolution through the accumulation of neutral mutations (Charlesworth & Wright, 2001), the differences in the ISSR and RAPD profiles may represent small changes in sequence that do not lead to any apparent differences in morpho-physiological or life-history traits. There is a statistically significant association between a group of specific DNA markers and a particular yield trait in Bombyx mori (Sethuraman et al., 2002). It is expected that profiling a large number of individuals of a particular GP would reveal the presence of subgroups within that GP, as DNA typing has done for Mycobacterium tuberculosis (Nadler, 1995). The statistical analysis used in this study led to the identification of specific marker(s) in particular GPs. Sequencing these markers might reveal single nucleotide polymorphisms in different individuals from a particular GP, which in turn may throw more light on the rate of microevolution in different populations of E. sorbillans.
Fig. 6. Profiles of five GPs generated with universal primer-II. The distinct presence of marker 22ooobP (arrow, 1) in GP 3, distinct absence of marker 6Poobp (arrow, 2) and presence of 87oobp (arrow, 3) in GP 1 is evident.
The statistical analysis confirms that the variation within populations is not significant but that between GPs is significant. The cluster analysis further reveals the greater divergence of E. sorbillans in GP 1 from those of the other four GPs in the south, as is also evident from the dendrograms produced using three different statistics (Fig. 1). Anderberg (1973) states that "algorithm can assemble observations into groups which prior misconception and ignorance would otherwise preclude". This is very relevant to the population genetic structure established for the five Exorista populations. However, no comparable data on tachinids could be found online. The overlapping distribution (Fig. 2) of GP 3 and GP 4 is expected as they are from two cocoon markets that are only 10 km apart. However, as farmers come from near and far to these markets, the variability is greater than that observed in the other GPs.
Comparing the result of the DFA, which was done with DNAind profiles, with the cluster data (derived from DNAmass) indicates that the former has a better resolving capacity. In summary, the Exorista samples from the five GPs differ significantly from one another, but the most striking difference is that between GP 1 and the other four. But what is the relevance of these findings? Do they contradict the earlier report of a single accidental introduction of Exorista to the southern states in 1980 (Samson, 1980) from Murshidabad or Malda?
The distinctness of the GP 1 profile generated using universal primers further strengthens the contention that this population has diverged from those of the southern states. This is because universal primers target more conserved sequences, reveal highly "infrequent difference" between subpopulations of a species, and appear to be highly efficient in revealing genetic differences (Buso et al., 2001). The use of the other two primers designed on the basis of repeat sequences of B. mori also exposed discrete differences between GP 1 and the populations from the southern states.
The alternative possibility that the uzi fly (Exorista spp.) was already present in the south, but that silkworm was not its primary host, cannot be ruled out, as it is well known to parasitize other Lepidoptera (Clausen, 1940). In that case, it may be presumed that the infusion of the gene pool from West Bengal in the 1980s created more genetic variability in the Exorista population in south India and that simultaneously there was a fast rate of divergence in the southern states. In this context it is relevant that silkworm is grown continuously in south India and provides a host on which this pest can complete several generations per year. This would facilitate a faster rate of divergence. All these aspects expand the scope for a further in-depth study of E. sorbillans.

Fig. 4. Profiles of five GPs (lanes 1-5) generated with the ISSR primer 809. Arrows indicate unique absence of markers in GP 1.

Fig. 7. Profiles generated with primers SBL534R (left) and SBL578R (right) for five GPs of E. sorbillans. The distinct divergence of the other profiles from that of GP 1 is evident. The presence of a specific marker of ~450 bp is marked with arrow 2. The other two (arrows 1 and 3) indicate allelic differences between the five GPs.
Only the markers showing significant contributions to variability are shown. *, values in parentheses denote degrees of freedom.

Table 2. Results of pairwise multiple comparisons between the profile of GP 1 and the others, indicating markers that significantly differentiate GP 1 from the other four. *, DNA markers; I and J, the two groups compared.

Table 3. Seven markers identified by discriminant function analysis and the function coefficients of these seven markers for the four functions referred to in the text.
List of markers unique for its presence or absence in different localities