Determining the instar of a weevil larva (Coleoptera: Curculionidae) using a parsimonious method

Tracking the larval development of an insect is important in both applied and basic ecology. Yet, it is difficult to discriminate between the different larval instars of holometabolous insect species, particularly in the field. The methods currently available are of limited use as they rely on an a priori determined size distribution of every immature instar and are irrelevant whenever the distributions of two instars actually overlap. We developed a model that computes, for a given species, the most probable instar of a larva based on its individual head capsule width and the size distribution of the last larval instar. The model presented here is specifically for species of curculionid that develop through four immature instars, but can be adapted for species of weevil with a different number of instars or insects of other taxonomic units. Our method computes the risk error associated with assigning a larva to any of the possible instars and might not assign a larva if its size falls within the overlapping zone of the size distributions of two successive instars. Thus, this parsimonious method might be widely used, notably for wild-caught larvae, and can be readily used thanks to the R package CINID that we developed for that purpose.


IntroductIon
Precise information on the phenology and dynamics of an insect's life cycle is of major importance in several areas of applied ecology and fundamental research.Notably, methods designed for controlling populations of pest insects are most effective when the dynamics of larval growth are known (Logan et al., 1998;Broughton, 1999;Goldson et al., 2001;Hammack et al., 2003;Cave et al., 2006;Johnson & Williamson, 2006;Pantoja et al., 2006;Panzavolta, 2007;Diaz et al., 2008).For example, methods recently designed for controlling insect-borne pathogens that rely on endosymbiotic infection require knowledge of the stage-specific mortality rate of the insects (Yeap et al., 2011); in forensic entomology, unambiguously identifying the larval stage of necrophilous insects is a critical step for determining time of death (Byrd & Castner, 2001;Watson & Carlton, 2005).Moreover, monitoring changes in the phenology of natural population of insects in response to global warming is becoming increasingly important and requires the tracking of various bioindicator species (Peñuelas & Filella, 2001), which might be used for conserving species (Bonebrake et al., 2010).The knowledge of larval development is also important for research on ontogeny (Chatterton et al., 1994), evolution (Cronier et al., 1998), ecology (Cisne, 1973;Retallack & Feakes, 1987), ecophysiology (Jarošík et al., 2011) and paleontology (Retallack & Feakes, 1987;Hunt & Chapman, 2001).
It is often difficult to identify the larval instar of a given species (Fink, 1984;Benton & Pritchard, 1988;Skuhrovec, 2006), particularly in field studies, for species for which the data set is often limited, notably in terms of most of the immature instars , and/or for those populations that display overlapping distributions between successive instars.Thus, the available methods are mostly only suitable for use in the laboratory, which limits the number of species that can be investigated in this way.In this study we propose a method aimed at determining the most probable instar of a larva of a species of curculionid that requires four instars to complete larval development.We argue that this method is more suitable for use in field studies because it only requires an a priori knowledge of the last instar of the species under study.Last instar larvae are easily detected and identified in the field, either because of their large size, specific morphological characteristics, long duration and/ or because mature larvae leave their plant host, such as those that diapause in the soil (Menu & Debouzie, 1995;Paparatti & Speranza, 2005;Skuhrovec, 2006;Corn et al., 2009;Pélisson et al., 2011).
Two distinct methods were previously used to identify the larval instar of a holometabolous insect: the first is based on the morphological characteristics of the exoskeleton of each instar, such as the chaetotaxy (Skuhrovec, 2006) and the second is based on body size distributions (Dyar, 1890;Logan et al., 1998).The first method, despite the risk error associated with such assignment, which is expected to increase as the individual's head capsule width comes closer to the overlapping zone between two successive instars (Hunt & Chapman, 2001).
Here, we propose a method that does not depend on the prerequisites of previous models in that it predicts the most probable instar of a larva of a species of curculionid based on knowledge of the distribution of head capsule widths of the last instar.This is best achieved if the last instar can be unambiguously identified based on ecological and/or morphological features (see above).Our model might therefore be used in various contexts and for wild-caught larvae.The first step in this study was to develop models for predicting the distribution of head capsule width (HCW), mean and standard deviation (SD) of earlier instars from that of the last instar.To that end, we obtained from the literature datasets of species with the HCW for each larval instar.We selected species of curculionid that have four larval instars since we found most datasets for such species, which in addition cover a wide phylogenetic range.Despite this, there are still few suitable datasets, therefore to strengthen the statistical relationships between the fourth and earlier larval instar, we also included datasets for those species of weevil that developed through five instars, thereby ignoring the fifth instar.For every instar, we used both the HCW mean and SD to determine the statistical relationship between the fourth and each of the three previous instars.Then, we selected the six models that best fitted the mean and SD values of each of the first three instars based on the information for the fourth instar.Using these predicted paired parameters in addition to the recorded HCW mean and SD of the fourth instar, we set up four Gaussian curves, one for each instar.Finally, we validated the method using data on HCW of the larvae of different instars of laboratory-reared rice weevil, Sitophilus oryzae (Linnaeus, 1763) (Coleoptera: Curculionidae: Dryophtorinae).Using this species, we compared our predictions with those of Logan's method (Logan et al., 1998) as a reference model and discuss the advantages and limits of our method.
being most precise, relies on a key for determining each instar, which is often unavailable beyond the subfamily level (Bousquet & Goulet, 1984;Grebennikov, 2004;Skuhrovec, 2006).The second is based on highly specific growth models that rely on the morphometry of sclerotized tissues and organs.Models based on this method mainly use head capsule width (Taylor, 1931;Bukzeeva, 1965;Logan et al., 1998;Panzavolta, 2007) since it remains constant during intermoult intervals (Daly, 1985).The Brooks-Dyar rule is widely used to infer the number of larval instars in species of insects belonging to various orders based on the simple assumption of a regular geometric progression in the head capsule width in successive larval instars (Dyar, 1890;Taylor, 1931).Even if Dyar's rule ratio is often considered to be constant (averaged to 1.44 for insects), it might actually differ among arthropods and might even vary within species, with the growth ratio decreasing as larvae grow (Cole, 1980;Klingenberg & Zimmermann, 1992;Corbet, 2002), or depends upon extrinsic factors such as food availability (Kleinteich & Schneider 2011).Because head capsule width is usually assumed to be normally distributed within each larval instar as this variable is influenced by a great number of environmental or endogenous factors (Logan et al., 1998;Hunt & Chapman, 2001), more sophisticated methods were developed, which included a multimodal distribution of head capsule sizes across the entire larval development with as many modes as the number of successive instars (Moré, 1978;McClellan & Logan, 1994;Logan et al., 1998;Goldson et al., 2001).These methods provide partitioning of the head capsule distribution by defining non-overlapping range values for each instar and thus always assign any insect larva to a unique instar according to its head capsule width.Among them is Logan et al.'s (1998) widely used and efficient program (HCap, Matlab software) that infers the instar of an insect larva from the recorded multimodal distribution of the head capsule widths of all instars.However, methods based on the head capsule size distribution are of limited use since they require large samples of larvae from every instar that must be determined a priori.Moreover, these methods inevitably assign larvae to a single instar without information on TAble 1.The datasets obtained from the literature that were used to develop the statistical models (see Table 2, Fig. 1).These eleven species of curculionid, with either four or five immature stages, were selected because they included data on the distributions of the head capsule widths of each of the first four stages.Data for a single population per species is included in the model.

Species
Subfamily

Statistical models
We developed statistical models for predicting either the mean or the standard deviation of the head capsule width (HCW) of an early immature instar from the observed mean (or SD) HCW of larvae in the last, mature instar of a given species of curculionid.For this, we obtained from the literature data sets for species of curculionid that included the mean and SD of the HCW for each immature instar.We found suitable datasets for six species of curculionid that go through four larval instars before reaching adulthood, which is the most represented group.As the sample size was unlikely to be sufficient for fitting the statistical models, we added five more datasets for species of curculionid that go through five instars, which resulted in a total of eleven species in our database (Table 1).Based on this data, we fitted six distinct linear regressions, each predicting the mean or SD values of one of the first three larval instars, using the information on the mean and SD values of the fourth instar.For instance, one model predicted the mean HCW of the third instar from that of the fourth instar, and was fitted to eleven paired mean values, i.e., one per species.The fourth larval instar may correspond to either the final or penultimate instar, depending on whether the species needs four or five instars to develop (see below; Table 2, Fig. 1).As the last instar of a species that has five instars cannot be included in our analyses as there very few such datasets, the method presented is suitable for species of curculionid that have only four instars.

Data selection
Data sets for ten species of curculionid that require either 4 (5 species) or 5 instars (5 species) to complete their larval development, belonging to 7 distinct subfamilies, were obtained from the literature (Table 1).For each species/population, the estimated population mean and SD HCW were determined for the first four larval instars.Whenever data for more than one population were available for a species, we selected the dataset based on the largest sample size.We included an eleventh species in our database, the maize weevil (Sitophilus zeamais), for which we measured the HCW of larvae reared on wheat grain in the laboratory under constant relative humidity and temperature conditions (70% rh, 27.5°C).Based on the timing of the larval growth under such conditions (Laviolette & Nardon, 1963), we dissected grains at fixed numbers of days following egg-laying to obtain 25, 18, 8 and 111 larvae in the first to fourth instar, respectively.Each larva was frozen, decapitated and then its head capsule was photographed using a binocular microscope (Zeiss stemi-C, magnification 32) coupled with a camera (mc-camera1.1),which enabled its maximum width to be measured to the nearest µm (software Motic Image plus 2.0; Motic, Hong Kong, China).

Identifying the instar of a larva: the method
Individual assignment to a larval instar First we computed the absolute probability P i (x) of a larva with a HCW value x of belonging to each of four possible instars i.For that we used four pairs of parameters -one pair comprising the HCW mean µ ̂i and standard deviation σ̂i for instar i, which were either predicted from the statistical models computed in the initial step of this study (for the first three instars), or recorded (the fourth instar).These pairs of parameters were then used to compute four Gaussian distributions, each of which enabled us to compute the probability P i (x): TAble 2. Linear relationships used to predict either the mean or the standard deviation of the head capsule widths of each of the first three instars based on the data for the fourth instar.Each of the six models is based on the mean (or SD) HCW values for the eleven species of curculionid as shown in Fig. 1.The degree to which each model fits the data is indicated by its associated R² value.Whenever P i (x) > 0.05, we could not reject the null hypothesis that the larva belonged to instar i.Because the probability of a larva belonging to any instar was determined independently for each of the four instars, a larva might be assigned to more than one instar, especially when some of the four Gaussian curves overlapped.In the second step, we assessed which of the four instars is the most probable for a given larva x.For that purpose, four relative densities were computed for a larva with a recorded HCW value x.The relative density for instar i (rd i ) is defined as the ratio of the density probability value of instar i under consideration and the summed density probability values inferred for all four Gaussian distributions, which was calculated as follows (Eq.2):

Instar
This ratio implies that for any recorded HCW value x, the summed four rd i (x) values equals 1, so that any rd i (x) can be interpreted as the relative probability of a larva x belonging to instar i.The four density probability values are equally weighted (denominator of Eq. 2) meaning that the relative abundances in the population of the four larval instars are assumed to be equal: this might be modified whenever there is information on the relative proportions of the four instars.To avoid low-quality assignment, we further compared the maximum value of all four computed for an individual larva with a threshold criterion C (set to either 0.95 or 0.999 in subsequent analysis), below which the instar of the larva was left undetermined.

Inferring the relative abundance of the four larval instars in a population
In addition to the individual assignment of larvae from a population of a known species of curculionid, it might be worth estimating the relative abundance of each instar within that population.For that purpose, every recorded HCW value x was associated with a specific quadrinomial distribution defined by four probabilities that corresponded to the four relative densities (rd 1 to rd 4 ) predicted by Eq. 2.Then, each larva in the sample studied was randomly assigned to one of the four instars according to its own quadrinomial distribution.Following that step, we computed the expected relative frequency of instar i in the population, f i , as the ratio of the number of larvae assigned to the i th instar to the total sample size; this was repeated for all four instars.

Application and validation of the method
We determined whether our method accurately assigned individuals to the correct instar and whether it predicted the right frequencies of the four larval instars in a population by comparing the prediction of our method with that of Logan (1998) by independently applying both of them to data for larvae of the rice weevil Sitophilus oryzae, a species not included in our database (Table 3, Fig. 2).S. oryzae larvae were reared under the same, standardized conditions as S. zeamais (see "Data selection").We collected 97 individuals belonging to one of all four larval instars by dissecting wheat grains at various intervals following egg-laying by females.Forty of these 97 larvae could unambiguously be assigned to the fourth and ultimate instar on the basis of the time from egg-laying coupled with specific morphological characteristics of mature larvae (Laviolette & Nardon, 1963) and were used to compute the HCW mean and SD of the last instar, these paired parameters served as a reference for our method.

results And dIscussIon
The method proposed is designed to infer the instar of a larva belonging to a particular species of curculionid based on its head capsule width, provided that the number of instars is known for the population it comes from.Our method depends solely on knowledge of the distribution of head capsule widths of the fourth and last instar measured on mature larvae sampled under the same environmental conditions as the larvae to be assigned.Despite the fact that empirical data is available for only a limited number of species of curculionid, the eleven included in the analysis belong to several sub-families and exhibit a wide range of HCW mean values (averaging from 500 to 2000 µm for the fourth instar).We could not take into account phylogenetic inertia in our analysis as the number of species in the sample used to infer linear relationships was too small; yet, this TAble 3. Predicted assignment of 97 laboratory-reared larvae of S. oryzae to one of four possible larval instars using either our method or that of Logan (1998).Our method either assigned each larva to one of the instars or left it unassigned, which might depend on the threshold criterion C (set to 0.95 and 0.999, respectively).The HCW mean (µ̂i, µm) and SD (σ̂i, µm) were estimated for each of the first three instars and the number of larvae assigned to each instar is given (N C = 0.95 and N C = 0.999 according to the threshold criterion C).In addition, the relative frequency of each instar in the population, f i , is shown, either computed by our method (random sampling in a unique multinomial distribution with four parameters p i , independent of the individual assignment) or that of Logan (associated with individual classification).

Parameter
First should be done in future analyses if data for a greater number of species of a particular taxon are available.The linear regressions obtained reliably predicted the mean HCW of each of the three first instars from the data recorded for the fourth instar (all three models with R² greater than 0.89, Table 2; Fig. 1A).Regarding the standard deviation (SD), we also found significant linear relationships between the fourth instar and each of the three previous instars (Fig. 1B), with that for the SD of the first instar the weakest (R² = 0.391; Table 2).This weaker relationship might be explained by the high SD value for the first instar of Cosmopolites sordidus compared with that of those of the ten other species included in the linear model (see arrow in Fig. 1B).HCW values recorded for early immature stages might also lack precision due to the very small size of their head capsule.This could be compensated for by including in our model additional data for a greater number of species of curculionid, which is achievable given that data are accumulating in the literature.
The head capsule widths of 40 laboratory-reared final instar larvae of S. oryzae averaged 589.00 µm (± 24.43 µm, mean ± SD).Based on these reliable estimates, we used the linear regressions computed above to predict the mean and SD of the HCW for each of the first three larval instars.The mean and standard deviation of the head capsule widths, estimated for all four instars using our's and Logan's method are quite similar (Table 3, Fig. 2).The mean HCW of second instar larvae obtained using our method was slightly underestimated (by ca.6%) and the standard deviations for the second and third instar were greater than those obtained using Logan's method.In order to account for these moderate discrepancies and determine whether they are consistent this experiment needs to be repeated using other laboratory-reared species of curculionid or on wild populations.In addition, the frequencies of the four instars in the population estimated by random sampling of a multinomial distribution are very similar to those obtained using Logan's method.Finally, our method for assigning instars to larvae of S. oryzae gave the same results as that of Logan, which forcibly assigned every larva to one of four instars, when the threshold criterion was set to 0.95 (Table 3, Fig. 2).When the threshold criterion was increased to 0.999, 3 of 97 larvae were undetermined using our method (Table 3, Fig. 2), meaning that their maximum rd i value ranged between 0.95 and 0.999.Why their rdi is below 0.999 mainly stems from the fact that their HCW values fall between two successive instars that overlap slightly (i.e., instars 1-2 and 3-4).Moreover, although our theoretical distribution of the second instar underestimates the recorded one (see Fig. 2), no individuals with a HCW between 340 and 350 µm were left unassigned, which is due to the lack of an overlap between the theoretical distributions of the second and third instar.Based upon this experiment, our method therefore seems to be appropriate since it produces a rather low proportion of unassigned S. oryzae larvae and all the assigned larvae are classified in the same instars as when using Logan's method.
The method proposed in this study is parsimonious in that it does not require the a priori determining of the distribution of HCW for all immature stages and, therefore, should be more generally applicable than previous methods, especially in those cases when it is not easy to sample the early instars.Our method should be affordable for non-specialist users as we have developed a package that is compatible with R freeware [Package CINID, Curculionidae INstar IDentification, R Core Team (2013)], which automatically computes the probability of a larva being assigned to each instar and can be updated and the method refined as additional datasets are published.In addition, our method computes the risk error associated with incorrectly assigning a larva to a given instar.This property allows the choice of a confidence level of the inference below which larvae are left undetermined, and thus ensures that a larva is either safely assigned or left unassigned, whereas Logan's method forcibly, and possibly erroneously, assigns it to an instar (this occurs notably when the head capsule width is in the overlapping zone between two successive instars).
Our method can be used to infer the distribution frequencies for all the larval instars of a species of curculionid that has four instars.This can be achieved after randomly assigning each larva in a sample, irrespective of whether they are individually successfully assigned, to one instar using a quadrinomial distribution specific for each recorded HCW x.
The method described in this study is applicable to other species of weevil with a different number of immature stages, provided that there are sufficient data in the literature to set up predictive models.Three criteria are needed for a robust implementation of our method: (i) a large number of complete datasets for species/populations with the same number of larval instars, (ii) the species should be representative of their taxonomic unit and (iii) the size of the final instar of each of the species should differ.In the same way, even though the family Curculionidae is one of the most widely diversified of insect families and notably includes a great number of pests (Marvaldi et al., 2002;McKenna et al., 2009), our method can easily be adapted to other insect families.Notably, it might easily be applied to other Coleopteran families such as Chrysomelidae, which include numerous agricultural pests for which there is a need to develop biocontrol policies based on a knowledge of the dynamics of the development of their larvae in the field (Broughton, 1999;Hammack et al., 2003;Johnson & Williamson, 2006).
Fig. 1.Linear relationships used to predict the mean (A) and standard deviation (B) of the head capsule widths (HCW, µm) of each of the first three larval instars based on the mean and the SD of those of the fourth instar, respectively.The results for the fourth instar is plotted against the predictions for the first (white), second (grey) and third (black) instars of the corresponding species of weevil, respectively, which have either four (circles) or five (squares) larval instars.The arrow (panel B) indicates an extreme value for the first instar of Cosmopolites sordidus.See the main text andTable 2 for details.

Fig. 2 .
Fig.2.Recorded and theoretical HCW distributions of 97 laboratory-reared larvae of S. oryzae.Theoretical assignments to one of four larval instars were made either using our method (computing four unimodal distributions, solid lines) or that of Logan (computing a single, multimodal distribution, dotted line).Hatched bars show the three larvae that were unassigned using our method and the most selective Threshold criterion (C = 0.999).See the main text for details.