An inter-laboratory study of DNA-based identity, parentage and species testing in animal forensic genetics

Abstract
The probative value of animal forensic genetic evidence relies on laboratory accuracy and reliability. Inter-laboratory comparisons allow laboratories to evaluate their performance on specific tests and analyses and to continue to monitor their output. The International Society for Animal Genetics (ISAG) administered animal forensic comparison tests (AFCTs) in 2016 and 2018 to assess the limitations and capabilities of laboratories offering forensic identification, parentage and species determination services. The AFCTs revealed that analyses of low DNA template concentrations (≤300 pg/µL) constitute a significant challenge that has prevented many laboratories from reporting correct identification and parentage results. Moreover, a lack of familiarity with species testing protocols, interpretation guidelines and representative databases prevented over a quarter of the participating laboratories from submitting correct species determination results. Several laboratories showed improvement in their genotyping accuracy over time. However, the use of forensically validated standards, such as a standard forensic short tandem repeat (STR) kit, preferably with an allelic ladder, and stricter guidelines for STR typing, may have prevented some common issues from occurring, such as genotyping inaccuracies, missing data, elevated stutter products and loading errors. The AFCTs underscore the importance of conducting routine forensic comparison tests to allow laboratories to compare their results with one another. Laboratories should keep improving their scientific and technical capabilities and continuously evaluate their personnel’s proficiency in critical techniques such as low copy number (LCN) analysis and species testing. Although this is the first time that the ISAG has conducted comparison tests for forensic testing, findings from these AFCTs may serve as the foundation for continuous improvements of the overall quality of animal forensic genetic testing.

KEY POINTS
• Comparison tests allow laboratories to evaluate their analyses for accuracy and reliability.
• Two forensic identification, parentage and species determination comparison tests were performed.
• The study showed that the LCN DNA analysis represented a significant challenge to most laboratories.
• Lacking familiarity with species tests curbed most laboratories from reporting accurately.
• A reliance on forensically validated testing standards may have prevented some of the common errors.

Introduction
Forensic genetic testing of animal biomaterials is a firmly established investigative approach. However, laboratories providing these services must continue to take rigorous quality assurance measures to generate reliable and accurate results. The International Society for Animal Genetics (ISAG) implemented two animal forensic comparison tests (AFCTs) in 2016 and 2018 to methodologically and robustly evaluate each participating laboratory's ability to: (1) genotype dog short tandem repeat (STR) markers for identification and parentage resolution and (2) determine the species of test samples. Conducted under the quality assurance standards for DNA analysis recommended by the DNA Advisory Board [1] and recommendations made for animal DNA forensic testing [2], the AFCTs mark the first instance that the ISAG has conducted comparison tests based on forensic testing standards among its member laboratories.
Although standardized genetic tests are critical for sharing information and combining datasets, there are no ISAG-recommended markers or forensic testing protocols [2]. Consequently, each participating laboratory was permitted to use domestic dog genotyping panels of their choice to generate individual identification and parentage outcomes. Similarly, the laboratories were also allowed to choose their preferred method of species identification. Thus, this study aimed to learn if different standard operating protocols used by the participating laboratories would limit their ability to produce acceptable results and affect their conclusions when analyzing the same sample. Supplementary File 1 shows the AFCT survey questions provided to the laboratories in 2016 and 2018.

AFCT
The 2018 AFCT involved an identification test, a parentage test and a species test (Table 1). As with the 2016 AFCT, the 2018 identification test comprised one diluted (300 pg/µL) and two undiluted (50 ng/µL) DNA samples from the same animal. The 2018 AFCT also implemented a parentage test involving undiluted DNA samples, at concentrations ranging from 10 ng/µL to 20 ng/µL, from a parent-offspring trio. In addition, it included two separate species tests comprising undiluted cattle (50 ng/µL) and fish (Atlantic cod; Gadus morhua) (50 ng/µL) DNA. For the 2018 AFCT, a set of eight DNA samples, including the two species test samples, was shipped to 26 laboratories. Ten of these laboratories had also participated in the 2016 AFCT.

Sample preparation, shipment and analysis
DNA samples used for the 2016 and 2018 AFCTs were extracted from dog whole blood, as well as cattle and fish meat, using the Chemagic™ MSM I System [3].
Laboratories that reported STR data from the undiluted dog DNA were able to amplify between 19 (eight laboratories) and 23 markers (one laboratory) in 2016, and 19 (three laboratories) to 40 markers (one laboratory) in 2018 (Supplementary File 2). Analyses of undiluted dog DNA samples permitted all but one laboratory to correctly respond to the individual identification test questions (over 97.30%) because of the high percentage of correct STR allele calls. When results from that one laboratory were ignored, both Panel 1.1 and the ISAG core panel performed comparably on undiluted samples by yielding genotyping results that could be readily interpreted. Approximately 84.21% of the parentage test questions, which were based on undiluted DNA samples and included paternity and maternity assignments, were answered correctly, mostly because these laboratories considered the DNA-based sex information in their parentage analyses. Correctly assigning the sex proved essential for confirming paternity/maternity and the identification of each animal involved in the parentage tests.
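To illustrate how STR genotypes from a parent-offspring trio can be checked for consistency, the following is a minimal sketch of a Mendelian compatibility test. It is not the procedure used by any participating laboratory: the marker names, allele sizes and function names are hypothetical, and the sketch ignores mutation, allele dropout and the sex information that the laboratories relied on.

```python
from typing import Dict, Tuple

# One genotype = the two alleles observed at one STR (sizes in bp here).
Genotype = Tuple[int, int]


def child_compatible(child: Genotype, dam: Genotype, sire: Genotype) -> bool:
    """True if one child allele can come from the dam and the other from the sire."""
    a, b = child
    return (a in dam and b in sire) or (b in dam and a in sire)


def count_exclusions(child: Dict[str, Genotype],
                     dam: Dict[str, Genotype],
                     sire: Dict[str, Genotype]) -> int:
    """Count markers at which the trio is incompatible with Mendelian inheritance."""
    shared = child.keys() & dam.keys() & sire.keys()
    return sum(not child_compatible(child[m], dam[m], sire[m]) for m in shared)


# Hypothetical marker names and allele sizes, for illustration only.
child = {"STR_A": (180, 184), "STR_B": (210, 214)}
dam = {"STR_A": (180, 188), "STR_B": (210, 210)}
sire = {"STR_A": (184, 184), "STR_B": (214, 218)}
other_sire = {"STR_A": (176, 188), "STR_B": (214, 218)}

print(count_exclusions(child, dam, sire))        # compatible trio
print(count_exclusions(child, dam, other_sire))  # one incompatible marker
```

In this toy data the first candidate sire is compatible at both markers, while the second cannot have contributed either STR_A allele of the offspring.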
Because the laboratories employed different STR panels, consensus allele sizes and genotypes were not compared across laboratories. However, the relative genotyping accuracy across STRs ranged from 70% to 100% with an average of 85.41% in 2016. In 2018, this value ranged from 85.71% to 100% with an average of 92.47% (Table 2). Despite the increase in genotyping accuracy from 2016 to 2018, the performance variability across laboratories may be attributed to the lack of an effective allelic ladder, which has yet to be developed and validated for the STR panels used in this study. As allelic ladders present the common alleles of an STR, the size of the alleles of unknown samples can be designated by comparing them to the rungs of the allelic ladder to obtain the most accurate allele assignments possible [2].
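The ladder-based allele designation described above can be sketched as follows. This is an illustrative example only: the ladder rung sizes, the allele names and the ±0.5 bp matching window are assumptions chosen for the example, not values taken from this study or from any validated canine kit.

```python
from typing import Dict, Optional


def call_allele(size_bp: float,
                ladder: Dict[str, float],
                tolerance_bp: float = 0.5) -> Optional[str]:
    """Return the ladder allele whose rung is closest to size_bp.

    Returns None (an off-ladder call) when even the closest rung is
    further away than tolerance_bp.
    """
    name, rung = min(ladder.items(), key=lambda item: abs(item[1] - size_bp))
    return name if abs(rung - size_bp) <= tolerance_bp else None


# Hypothetical ladder for one tetranucleotide STR: allele name -> rung size (bp).
ladder = {"10": 180.1, "11": 184.2, "12": 188.1}

print(call_allele(184.4, ladder))  # falls within the window of rung "11"
print(call_allele(186.2, ladder))  # between rungs: reported as off-ladder (None)
```

Because the unknown peak and the ladder rungs are sized on the same instrument, small run-to-run sizing shifts affect both alike, which is why a ladder tends to give more consistent bins than fixed size ranges.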
Based on their Aga and Rga values, private laboratories performed slightly better than government/academic laboratories in 2016 (Table 3). In 2018, however, the government/academic laboratories performed better than their private counterparts in terms of Rga%. Based on the data presented in Table 3, the estimates of standard deviation within each group were higher than those between groups, suggesting that there were high levels of performance variability within each laboratory group.
The type and rate of genotyping errors that were used to estimate the Aga and Rga values for the 2016 and 2018 AFCTs are presented in Table 4. For both AFCTs, "missing data" due to results being omitted by the laboratories were the most common errors and accounted for 46.00% of all errors detected. For clarity, "missing data" errors were grouped into three categories: STR blanks (no amplification or less efficient amplification of one or more STRs occurred), sample blanks (one or more samples failed to amplify) and genotype blanks (one or more markers failed to amplify during the multiplex reaction). False homozygotes mainly due to stochastic allele dropout arising from diluted DNA samples, incorrect genotype calls due to improper binning of alleles, mistaken alleles, or misidentification of allelic microvariants and false heterozygotes due to stutter products collectively represented 31.23% of the total errors observed. The incidence of elevated stutter peaks from dinucleotide STRs in Panel 1.1 and the ISAG core panel accounted for 5.50% of the total number of errors in both AFCTs. Additionally, typing and nomenclature errors represented approximately 22.77% of the total number of errors observed in this study (Table 4).
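As a quick arithmetic check on the error shares quoted above, the short sketch below sums the three main groups. The grouping labels are paraphrased from the text; the reading that the 5.50% stutter figure sits inside one of these groups, rather than in addition to them, is an inference from the totals and not a statement made in Table 4.

```python
from decimal import Decimal

# Percentages of all detected errors, as quoted in the text.
shares = {
    "missing data": Decimal("46.00"),
    "false homozygotes, incorrect calls, false heterozygotes": Decimal("31.23"),
    "typing and nomenclature errors": Decimal("22.77"),
}
stutter = Decimal("5.50")  # elevated stutter peaks, quoted separately

total = sum(shares.values())
print(total)            # sum of the three main groups
print(total + stutter)  # would exceed 100 if stutter were a fourth group
```

The three groups already account for all errors, so the stutter share cannot be a separate fourth category.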
Forensic genetic analysis regularly depends on trace evidence. Thus, the ability to resolve identity and parentage using low template DNA is particularly critical. The 2016 and 2018 AFCTs, however, exposed the challenges in testing low template DNA samples with concentrations of ≤300 pg/µL. Yet, this concentration is higher than the low copy number (LCN) limit in human forensic DNA analysis, which is 100 pg/µL [4,5]. Most laboratories agreed that the diluted nuclear DNA concentration yielded STR peaks below their ideal detection threshold. Only one laboratory successfully reported genotypes with no errors across all 19 STRs that were employed.
The […] [9–12], only one laboratory opted to use it in 2018. Except for VWF.X, a mostly hexameric marker, and FH3377, a mostly pentameric marker, all other STRs in Panel 2.1 are tetrameric [3]. Therefore, all Panel 2.1 STRs exhibit reduced stutter product formation, which benefits sample mixture interpretation and LCN analysis [2,10]. This panel was put through a series of validation steps to further determine its robustness and reliability in forensic DNA typing [3,10,11]. These validation measures included sensitivity, sizing precision, reproducibility, allele dropout, polymerase chain reaction (PCR) artefact characterization (e.g. dye blobs, stutters, split peaks), intra- and interlocus colour balance, annealing temperature and cycle number studies, peak height ratio determination, mixture analysis (deconvoluting samples from more than one donor), species specificity, concordance, forensic case-type samples (e.g. limited and degraded samples) and population studies [3,10,11]. If any of these rigorous development procedures and validation studies were not performed for Panel 1.1 and the core ISAG panel, these marker panels may have been prevented from meeting the quality standards expected for uniform forensic testing protocols.
Biomaterials from a wide variety of species can be encountered during forensic investigations. Because most animal genetic markers are species specific, species confirmation is typically done before genotyping analysis [13,14]. Furthermore, species determination based on genetic testing tends to be more accurate than documentary, physical or biological evidence for identifying, authenticating or tracing the source of biological products, including food and other artefacts [15]. The 2016 species test included one cattle sample, but the 2018 species test included two test samples: cattle and fish. Cumulatively, the 2016 and 2018 species tests involved 56 genetic analyses (one sample for each of the 18 laboratories in 2016 and two samples for each of the 19 laboratories in 2018). Fifteen of the 18 (83%) participating laboratories identified the cattle sample correctly in the 2016 AFCT versus 16 of the 19 (84%) laboratories that participated in the 2018 AFCT. Unfortunately, the remaining laboratories could not resolve the species test successfully.
Over 89% (17/19) of the laboratories answered the species test questions correctly. Several laboratories failed the species test because they had either eliminated cattle as the correct species or experienced potential cross-reactivity with other species, including bovine-equine mixes and bovine-canine-ovine mixes. In contrast, others initially failed the species test because of low DNA quantity and subsequently needed more DNA to conclude the test correctly or did not submit any result. The laboratories employed a variety of species testing methods, including approaches that were better suited for a wide range of target species, such as sequence-based typing (PCR-SBT) of mitochondrial 16S ribosomal ribonucleic acid (rRNA), cytochrome b (Cytb) or cytochrome c oxidase I (COI) sequences, and others with a narrow range of target species such as allele-specific PCR and species identification by insertion/deletion (SPInDel) assays [16,17]. The limited range of species detection capability is likely why several laboratories could not correctly exclude the different possible species listed in the tests. The 2018 fish species test resulted in 10 laboratories (almost 53%, 10/19) correctly identifying the sample as Gadus morhua. The other laboratories did not report any results because either testing for fish species was outside of their scope of expertise or they lacked access to technology such as DNA sequencing that could have enabled them to identify the fish species. Therefore, the species testing evaluation of the AFCTs indicates that a lack of familiarity with species testing protocols, interpretation guidelines and representative databases for the taxa has restricted several laboratories from submitting correct responses for this evaluation [2,18].

Conclusion
This report describes the results of the 2016 and 2018 AFCTs that were administered for the first time by the ISAG to determine the limitations and capabilities of animal genetic laboratories that provide forensic services worldwide. Data from the 10 laboratories that participated in both the 2016 and 2018 AFCTs showed that six of them improved their performance over time. Therefore, AFCTs can be used periodically to demonstrate the quality performance of an animal forensic genetic laboratory, as well as serve as a mechanism for critical self-evaluation. The results of the AFCTs are also an impetus for the ISAG Animal Forensic Genetics Standing Committee to vigorously and collaboratively develop and validate uniform forensic testing protocols, such as a standard forensic STR kit, preferably with an allelic ladder, and stricter guidelines for STR analysis. These initiatives could help curtail some of the common, but avoidable, issues observed in the AFCTs, such as incorrect genotyping, missing data, and loading errors. Laboratories providing animal forensic genetic testing services should keep improving their scientific and technical capabilities and continuously evaluate their personnel's knowledge, skills and abilities to enhance their competency with important animal forensic techniques and technologies, including LCN analysis and species testing. The use of forensically validated approaches will facilitate uniform techniques and promote data sharing so that laboratories worldwide can develop their skills and abilities and provide quality animal forensic genetics services.