Metrological framework for selecting morphological characters to identify species and estimate developmental maturity of forensically significant insect specimens

Abstract
Accurate age estimates of immature necrophagous insects associated with a human or animal body can provide evidence of how long the body has been dead. These estimates are based on species-specific details of the insects’ aging processes, and therefore require accurate species identification and developmental stage estimation. Many professionals who produce or use identified organisms as forensic evidence have little training in taxonomy or metrology, and appreciate the availability of formalized principles and standards for biological identification. Taxonomic identifications are usually most readily and economically made using categorical and qualitative morphological characters, but it may be necessary to use less convenient and potentially more ambiguous characters that are continuous and quantitative if two candidate species are closely related, or if identifying developmental stages within a species. Characters should be selected by criteria such as taxonomic specificity and metrological repeatability and relative error. We propose such a hierarchical framework, critique various measurements of immature insects, and suggest some standard approaches to determine the reliability of organismal identifications and measurements in estimating postmortem intervals. Relevant criteria for good characters include high repeatability (including low scope for ambiguity or parallax effects), pronounced discreteness, and small relative error in measurements. These same principles apply to individuation of unique objects in general.

Key points
Metrological rigour can increase in forensic entomology by selecting measurements based on their metrological qualities.
Selection of high-quality features for morphological identification of organisms should consider these criteria: (1) pronounced discreteness of features (minimizing group overlap or maximizing interval); (2) high repeatability of assessment (such as symmetrical width rather than asymmetrical length); (3) small relative error in measurement (selecting the physically largest continuous rigid feature for measurement).
These metrological principles also apply to individuation of unique objects in general.


Introduction
Immature insects associated with a corpse can provide evidence of the time since death if their ages can be inferred based on their species and developmental stage [1][2][3][4][5]. However, many forensic professionals who generate such forensic evidence and most legal professionals who cite these identifications and estimates have little explicit training in taxonomy or metrology. In our experience, they appreciate having access to formalized principles and standards for biological identification. The call for such standards is also a hallmark of quality management in these professions [6,7]. This review addresses these needs.
Metrology is the science of making reliable measurements, and is relevant to the use of morphological characters in forensic entomology to produce evidence that meets the quality standards of the courtroom [28]. Metrologically, morphological characters can be qualitative or quantitative, and discrete or continuous (Figure 1, Table 1).
Categorical (or nominal or attribute) traits are qualitative, discrete, and lack both magnitude and rational order or sequence [30], e.g. the presence or absence of features, and are generally easy to decide. In systematics, they are often binary.
Ranked (or ordinal) traits are qualitative and discrete like categorical traits but have relative magnitude and can therefore be placed in a sequence [30], e.g. anterior or posterior positions. They lack absolute magnitude, and therefore cannot be measured. They are usually easy to decide if observations are not close to being tied (as horse racing enthusiasts will attest).
Quantitative (or measurement) characters have absolute magnitude, usually expressed in standard units, and may be discrete (or meristic: counted in integers) or continuous (measured in real numbers) [30]. Some continuous quantitative traits (e.g. colour measured as light wavelengths) may be functionally discrete or even categorical (e.g. colour due to the presence or absence of causative pigment). Continuous traits can be measured on interval scales that have a relative baseline and negative values (e.g. the Celsius, Fahrenheit, pH, and decibel scales) or on ratio scales with an absolute baseline, where a value of zero means that there is nothing to measure (e.g. the Kelvin scale). Only ratio scales can produce arithmetically meaningful ratios [31].
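The difference between interval and ratio scales can be made concrete with a small worked example. The following is a minimal sketch in Python; the two temperatures are arbitrary illustrative values, not data from this study, and the helper name `celsius_to_kelvin` is introduced here only for illustration.

```python
def celsius_to_kelvin(t_c: float) -> float:
    """Convert an interval-scale reading (Celsius) to a ratio-scale one (Kelvin)."""
    return t_c + 273.15

# Arbitrary illustrative temperatures (assumed, not measured).
warm_c, cool_c = 20.0, 10.0

# Dividing interval-scale values gives a number, but it has no physical meaning,
# because zero on the Celsius scale is only a relative baseline.
naive_ratio = warm_c / cool_c

# On the Kelvin scale, zero means "nothing to measure", so the ratio is meaningful.
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(f"Celsius 'ratio': {naive_ratio:.3f}")
print(f"Kelvin ratio:    {ratio_k:.3f}")
```

The same caution applies to any morphological measure whose zero point is a convention rather than a true absence of the quantity.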
Continuous variables may show variation for several reasons, one of which is imprecision (also technically termed error) in the measuring process, which is typically addressed using statistical analysis. Statistical analyses of ratios and angles need to be conducted with particular attention, because ratios, proportions, and angles can have non-Gaussian distributions that do not meet the assumptions for some analyses. Qualitative traits are thus easier to assess than continuous quantitative traits (Figure 1).
Two concepts are germane to assessing imprecision. Repeatability refers to whether multiple independent measurements of one observation will produce the same result (as opposed to individual measurement of multiple observations, which are termed replicates). Relative error refers to what proportion of a measurement is attributable to the resolution limit of the measuring device. If the resolution limit of a measuring device is fixed (e.g. at 1 mm), its relative error is smaller for large measurements (e.g. 1% for 100 mm) than for smaller measurements (e.g. 10% for 10 mm). Both of these concepts affect the choice of diagnostic traits for forensic identification and estimation.
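The arithmetic behind relative error can be sketched in a few lines of Python. This is a minimal illustration of the definition given above, using the same 1 mm resolution and the 100 mm and 10 mm measurements from the text; the function name `relative_error` is an assumption made for this sketch.

```python
def relative_error(resolution_mm: float, measurement_mm: float) -> float:
    """Fraction of a measurement attributable to the instrument's resolution limit."""
    if measurement_mm <= 0:
        raise ValueError("measurement must be positive")
    return resolution_mm / measurement_mm

# Fixed resolution limit of 1 mm, applied to a large and a small measurement.
for size_mm in (100.0, 10.0):
    error = relative_error(1.0, size_mm)
    print(f"{size_mm:g} mm at 1 mm resolution: {error:.0%}")
# prints:
# 100 mm at 1 mm resolution: 1%
# 10 mm at 1 mm resolution: 10%
```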
Qualitative features (such as mouthpart shape [11] or colouration [32]) might be subtly different between some related species and clearly different between others. However, typically, they do not drastically differ between instars within a species, which makes quantitative measures of maturity necessary. The shape or colour of a feature can be quantitatively described by morphometrics [33,34] or spectrometry, respectively. Pupal development does not progress through discrete stages, but the development of certain features (such as seta formation or setal pigmentation) can be treated as discrete developmental landmarks at sufficiently coarse time scales and continuous developmental processes at finer temporal resolutions [3].
Ideal identifying characters are unambiguous and categorical, such as the presence of setae or absence of spinose bands on the cuticle [18,19], or meristic (quantitative and discrete), such as the number of slits or buttons in the posterior spiracles [35] or the number of teeth on the mouth hooks [19]; all of these are used as diagnostic traits of blow fly larvae (Table 1).
Although character selection in modern taxonomic research is generally robust, these studies typically address the adult stages of sister taxa [36,37]. In contrast, the early descriptions on which modern taxonomy is built are often not particularly informative because the authors could not know which traits would become significant in the future. Similarly, many forensic studies that focused on larvae effectively used circular logic by choosing features based on their significance in the studies' own results without assessing the metrological characteristics of the traits [29,38]. Multiple features are often tested, with the most significant retained and the source of unreliability in the other measures not investigated. However, a rejected measure could be inherently poor (e.g. poor repeatability) or poorly applied (e.g. high relative error).
The implications of these ideas are illustrated by a case study that applies this metrological framework to assess the use of different types of morphological characters to differentiate between species and between conspecific instars of carrion beetles of the genus Thanatophilus Leach. We then suggest a general strategy for the selection of identifying characters that is applicable in forensic identification. This strategy can be extended to non-entomological objects that require physical characterization, e.g. for forensic individuation.

Differentiating species
The adults of at least 19 species can be identified by combinations of qualitative characteristics of colouration, the presence or absence of tubercles, and the shapes of the male genitalia, female propygidia, and the sexually dimorphic forewing apices [39]. Using Schawaller's [39] study, shape characters can be assessed without specialized equipment or measurement, let alone skill in deploying them, by making comparisons with diagrams that Schawaller specifically laid out to facilitate them.
The known larvae of Thanatophilus have fewer known qualitative morphological characters [11,14,17,32,40,41], and those characters which have been identified are currently only sufficient for identifying the larvae of closely related Thanatophilus species (Figure 2, Table 2). This provides a potentially interesting model to explore continuous quantitative characters that might be useful for identifying species and instar, using morphometrics.
Accounts of development in Thanatophilus have focused on categorical developmental events rather than on continuous, quantitative indicators of maturity [12][13][14][17]. In captivity, it is easy to determine which instar a larva has reached, because the advent of ecdysis (or larval moulting) is marked by the appearance of an exuvium (the moulted exoskeleton) in the rearing chamber [12]. However, the instars of larvae sampled directly from a corpse can be harder to differentiate, and continuous, quantitative measurements of growth are often required. Because Thanatophilus larvae can be reared individually [12], they produce more accurate experimental data than flies, which must be reared communally [42]; communal rearing does not allow the same individual to be repeatedly measured, or the identification of sick specimens [12].

Materials and methods
To differentiate the species and larval instars, specimens of each instar of Thanatophilus micans and T. capensis (= T. mutilatus) were taken from laboratory colonies held at the Department of Zoology and Entomology, Rhodes University and drowned in ethanol, which is the preferred method for preserving forensic samples of sclerotized beetle larvae [43].
Species identification information was taken from Daniel et al. [11]. Additional heads and urogomphi (posterior horns) were mounted in Euparal on microscope slides to augment the data from Daniel et al. [11]. Head capsule width and urogomphus length of at least 11 specimens of each species-instar combination were measured using a Wild M5A stereomicroscope (Wild Heerbrugg, Switzerland) with a reticle. Body length and width measurements were taken from Ridgeway et al. [12], and were measured with a simple gauge [44,45] precise to 0.1 mm. General size and shape observations were made while collecting data for experiments in both Midgley and Villet [13] and Ridgeway et al. [12].

Results
Larvae increased in length, width, and volume as they grew, and changed shape. Early in an instar, their body cross-section was relatively flattened, but it became more circular as their digestive tracts filled. Cessation of feeding prior to ecdysis resulted in a slight flattening, and sometimes also a telescopic shortening, of the body. Immediately after ecdysis, the body became wider as the exoskeleton became larger. This resulted in a further shortening and flattening of body length to maintain total body volume. To achieve an approximately constant volume despite widening of the body, the abdomen telescoped into itself, which made the abdominal taper sharper and steeper stepped. The length of the abdomen compared with the thorax allometrically increased as the larvae grew, presumably because the fat bodies enlarged. Body length rapidly increased once feeding resumed. The change to a rounder body profile happened more slowly and was most noticeable closer to the following ecdysis. In mature final instar larvae, the body was most noticeably rounded and elongate as the body stretched to prepare for pupation. As the intersegmental membranes became stretched, the abdomen appeared more gradually and evenly tapered.
Qualitative characteristics that differentiate the instars of Afrotropical Thanatophilus species were not noted [11], and instar was most reliably quantitatively determined from head capsule width. In contrast, head capsule length significantly overlapped between instars and was erratic within instars (Figure 3). The capsule tipped up to variable degrees and was therefore difficult to orientate in a standard position; this led to parallax effects [46] and low repeatability for this particular measurement. Maximum urogomphus length (Figure 4) and thoracic width (Figure 5) were more repeatable and moderately discriminatory; however, in several cases, these traits showed overlap between instars and could not be reliably used. The absolute range of measurement for body width was consistent between instars, which indicates that the method shows repeatability.
Total body length showed significant overlap between instars in both species (Figure 5). Because the body is extensively jointed between segments, it showed subtle postural distortions, even if the larvae were killed in a manner that minimized distortion and standardized posture as much as possible [43].
Inevitably, if body length and width are measured with the same instrument at the same magnification, the relative error in the larger measurement will be smaller. Body width is thus effectively measured at a coarser scale of resolution, which gives it the appearance of being less variable [47]. This effect is shown in Figure 5, where the width, with its higher relative error, appeared to have a quantized distribution and formed rows in the plot. The smaller relative error in length gave its distribution a finer resolution, and the plot did not show distinct columns.
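The quantization effect can be illustrated by counting how many distinct readings a fixed-resolution instrument can return across the range of a feature. The following Python sketch assumes the 0.1 mm gauge resolution stated in the methods; the width and length ranges are hypothetical values chosen only for illustration, and `distinct_readings` is a name introduced for this sketch.

```python
def distinct_readings(low_mm: float, high_mm: float, resolution_mm: float) -> int:
    """Count the distinct values an instrument can report from low to high (inclusive)."""
    # Work in whole resolution steps to avoid floating-point drift.
    first_step = round(low_mm / resolution_mm)
    last_step = round(high_mm / resolution_mm)
    return last_step - first_step + 1

RESOLUTION_MM = 0.1  # gauge precision reported in the methods

# HYPOTHETICAL ranges for one instar (not data from this study).
width_range_mm = (2.0, 3.0)
length_range_mm = (10.0, 20.0)

print("possible width readings: ", distinct_readings(*width_range_mm, RESOLUTION_MM))
print("possible length readings:", distinct_readings(*length_range_mm, RESOLUTION_MM))
```

With few possible readings, many specimens share exactly the same recorded value, so the points on a scatter plot line up along that axis.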

Discussion
Recent publications on forensically important insect larvae have addressed aspects of larval preservation and measurement [29,43,48,49]. However, standard measurements, or a metrological framework of standard criteria for selecting features for such measurement, have not been established, despite recent concern [50]. It is often assumed that measurements should be continuous-quantitative because growth is a continuous process. However, meristic and qualitative measurements are also relevant to identification. In some cases, continuous-quantitative measures can even be undesirable because the metrological precision exceeds the biological meaningfulness of the measure. This can be exacerbated by averaging of multiple samples, which creates false arithmetic precision, such as in developmental models based on communally-reared maggots.
Quantitative measures are also not independent of allometric variation, and measurement of rigid structures that are metrically stable between moults is preferable for instar determination. Usually, qualitative traits are useful for determining species, whereas continuous-quantitative measurements are more useful for determining instar. Meristic traits can be useful for both because some features increase in number with growth, and differences between species can also be incremental. Thoughtful selection of measurable characters is needed to sufficiently and reliably determine both species and instar for presentation in court. Relevant criteria for good measurements include high repeatability, pronounced discreteness, and small relative measuring error. This also applies to individuation of single objects.

Natural history of size and shape of larvae
Our observations on the size and shape of Thanatophilus larvae are the most detailed about beetles to date in the forensic entomology literature. Observations of immature insects in published studies were usually superficial [14,17,51,52], and these changes in size and shape were sometimes not even discussed, despite being visible in figures [53]. Additionally, average measurements at a given time point or long measurement intervals can disguise changes in shape or the magnitude of these changes. Because beetle larvae are kept in solitary conditions, repeated measures of length showed changes in shape more clearly (see Figure 2 in Ridgeway et al. [12]).

Species determination
The larvae of Afrotropical Thanatophilus species can be easily separated using categorical (sternite and urogomphus shape), meristic (thoracic setae), or continuous-quantitative (ratio of urogomphus length:sternite 10 length) features [11]. Although the length of the urogomphus showed overlap between instar-species combinations (Figure 4), using a ratio to assess this character reduced the effect of allometric variation to an acceptable level [11,14,17]. Characters used in species determination are usually limited to categorical or meristic features, but it is actually the discreteness of a feature that is important. In Afrotropical Thanatophilus, urogomphus length shows continuous intraspecific variation, but discrete groups for each species, which makes it a useful measure for species determination. Multiple types of characters were also used in Rognes and Paterson's [36] unusually detailed treatment of Chrysomya chloropyga and C. putoria. In this case, it was meant to settle a 50-year-old disagreement over the status of C. putoria; however, traditionally, the use of continuous measures in taxonomy has been limited. Rognes and Paterson [36] even expressed "surprise" (p. 56) at the usefulness of this feature type.

Instar determination
As in previous studies on larvae of European Thanatophilus species [40], the three measures used in this study showed different utility in determining instar. Measurement repeatability was illustrated by the head dimensions. Measuring head capsule width is easily repeated because misaligned specimens appear asymmetrical, which makes it easy to achieve standardized orientation and measurement. Head capsule length is more difficult to measure because symmetry cannot help to assess whether a specimen is correctly orientated, which makes standardized, repeatable measurements less likely because of parallax errors. In this study, this was exemplified by the larger variation found within instars, and the larger degree of overlap between instars in head length compared with width (Figure 3). This disparity in repeatability also has relevance for morphometrics because assessment of head capsule shape is only as repeatable as the least repeatable measure, in this case the head capsule length. The parallax effect has been shown to be significant in morphometrics [46,54]. Additionally, the same problem has been recognized in proturan taxonomy [55]. Concern about repeatability is not limited to continuous variables. Williams and Villet [37] identified the angle (right or obtuse) formed by the vertical and prevertical setae as a useful categorical feature; however, the angle can be difficult to determine if the specimen is not optimally orientated.
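Repeatability of a continuous measure can be summarized from repeated readings of the same specimen. The Python sketch below uses the coefficient of variation as one possible summary; this choice of statistic is an assumption of the sketch, not a method reported in this study, and the readings are hypothetical values invented to show the calculation.

```python
from statistics import mean, stdev

def coefficient_of_variation(repeats: list[float]) -> float:
    """Sample standard deviation of repeated readings divided by their mean."""
    return stdev(repeats) / mean(repeats)

# HYPOTHETICAL repeated readings (mm) of ONE head capsule, re-orientated each time.
# Width can be aligned using symmetry; length cannot, so it is assumed to vary more.
width_repeats = [2.0, 2.0, 2.1, 2.0]
length_repeats = [1.8, 2.2, 1.9, 2.4]

print(f"width CV:  {coefficient_of_variation(width_repeats):.3f}")
print(f"length CV: {coefficient_of_variation(length_repeats):.3f}")
```

A lower value indicates that independent measurements of the same observation agree more closely, which is the sense of repeatability used here.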
Urogomphus length and thoracic width are less reliable measures. In most cases, instar determination is possible using these measures, but some overlap does occur. Thoracic width overlapped less often than urogomphus length, but neither of these measures could be classified as discrete. Thoracic width is more repeatable than urogomphus length and head capsule length because of orientation symmetry.
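Whether a measure is discrete between instars can be checked by comparing the ranges of adjacent groups. The following is a minimal Python sketch of such a check; the function names and all measurements are hypothetical and serve only to illustrate the idea of minimizing overlap or maximizing interval.

```python
def range_gap(lower_group: list[float], upper_group: list[float]) -> float:
    """Interval between the largest value of one group and the smallest of the next.

    A positive result is a clear interval; zero or negative means the ranges overlap.
    """
    return min(upper_group) - max(lower_group)

def is_discrete(groups: list[list[float]]) -> bool:
    """True if every pair of adjacent groups is separated by a positive interval."""
    ordered = sorted(groups, key=min)
    return all(range_gap(a, b) > 0 for a, b in zip(ordered, ordered[1:]))

# HYPOTHETICAL measurements (mm) for three instars; not data from this study.
head_width = [[1.0, 1.1, 1.1], [1.5, 1.6, 1.6], [2.2, 2.3, 2.4]]
body_length = [[6.0, 8.5, 9.0], [8.0, 11.0, 13.0], [12.0, 15.0, 18.0]]

print("head width discrete: ", is_discrete(head_width))
print("body length discrete:", is_discrete(body_length))
```

A range-based check is deliberately strict: a single overlapping specimen is enough to mark a measure as not discrete, which suits a setting where every individual identification may be presented as evidence.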
Unlike the dimensions of rigid exoskeletal structures (like the head) that showed discrete growth, body length changed continuously as larvae grew and showed notable overlap between instars in both species in this study (Figure 5). Part of this overlap is due to the small loss of body size just prior to ecdysis [12], when the larva is not feeding and its gut empties, and also after ecdysis, when the body is wider but has the same total volume. These overlaps are not attributable to measurement error. Additional error was introduced by postural distortion. Although these distortions are measurement errors, they are currently unavoidable because the recommended best practice was used [43]. These effects completely overwhelmed the improvement in relative error that is offered by using the much larger size of the body rather than either thorax or head width. For these reasons, this type of dimension is not recommended for assessing instar or age.
Finally, if the precision limit of a measuring instrument is constant, e.g. to the nearest 0.1 mm, the relative error of a measurement will be smaller for larger measurements, as has been noted in fisheries research [56]. It is important either to use a more precise measuring tool for small features (such as a microscope with an ocular micrometer rather than a measuring gauge) or, if one is not available, to take care when interpreting overlap in measurements. A given measuring technique is likely to have a higher relative error when measuring younger and smaller individuals, as illustrated in Figure 4. The absolute range of the body width measurements was consistent, despite older instars being larger, which indicates higher relative error in smaller measurements.

Conclusion
The selection of features for measurement depends on the aim of the measurement. Measures that proved useful for separating Thanatophilus species were all qualitative and had no use in determining instar, which was most effectively identified using quantitative measures of the rigid parts of the exoskeleton. Quantitative measurement of the telescopic body parts was least useful. It is also not possible to identify a single feature that should be measured for all forensically important taxa. In fact, it is not even possible to suggest a preference for qualitative or quantitative features. Instead, our data show that high-quality feature selection should focus on the following selection criteria:
1. Pronounced discreteness (minimizing overlap or maximizing interval);
2. High repeatability (such as symmetrical width rather than asymmetrical length);
3. Small relative error (selecting the physically largest continuous rigid feature for measurement).
By selecting measurements based on their quality, rather than the resulting data type, metrological rigour will be increased in forensic entomology. These same metrological principles apply to individuation of unique objects in general.