Validation of the Investigator 24plex QS Kit: a 6-dye multiplex PCR assay for forensic application in the Chinese Han population

Abstract The Investigator 24plex QS Kit (QIAGEN, Hilden, Germany) is a 6-dye fluorescent chemistry short tandem repeat (STR) polymerase chain reaction (PCR) amplification system that simultaneously amplifies 20 of the expanded Combined DNA Index System (CODIS) core STR loci, SE33, DYS391, and the standard sex-determining locus, amelogenin, as well as two special internal performance quality sensor controls (QS1 and QS2), which are included in the primer mix to check the PCR performance. This study was designed to be a pilot evaluation of this STR-PCR kit in a Chinese Han population regarding the PCR conditions, sensitivity, precision, accuracy, repeatability, reproducibility, and concordance; tolerance to PCR inhibitors; applicability to real “forensic-type” samples; species specificity; mixture, balance and stutter analyses, and utility in a population investigation. The exhaustive validation studies demonstrated that the Investigator 24plex QS system is accurate, sensitive and robust for STR genotyping. In addition, these genetic markers in the population data in our study indicated that they can also be useful for forensic identification and paternity testing in the Chinese Han population.


Introduction
Since the 1990s, short tandem repeat (STR) analysis, based on capillary electrophoresis (CE) technology, has been recognized as a standard approach for human identification and paternity testing worldwide [1][2][3]. Multiplex amplification of widely used STR markers makes it convenient to establish centralized STR databases in different countries throughout the world, and the genetic information held in these databases in turn makes STR technology more powerful, because STR profiles generated from crime scenes can be searched and matched directly. To increase the power of discrimination for individual identification and to facilitate international compatibility, the Federal Bureau of Investigation (FBI) expanded the Combined DNA Index System (CODIS) core STR loci from the original 13 loci to 20 loci: D1S1656, D2S441, D2S1338, D3S1358, D5S818, D7S820, D8S1179, D10S1248, D12S391, D13S317, D16S539, D18S51, D19S433, D21S11, D22S1045, CSF1PO, FGA, TH01, TPOX and vWA [4,5].
Using the current 6-dye labeling technology, the Investigator 24plex QS Kit (QIAGEN, Hilden, Germany) was developed by combining the 20 expanded CODIS loci, one Y-STR (DYS391), the highly recommended SE33 locus, amelogenin, and two internal performance controls, the Quality Sensors (QS1 and QS2) [6]. These internal controls comprise a 70-bp and a 435-bp PCR fragment, which are included in the Primer Mix and amplified simultaneously with the input DNA to monitor PCR performance, to identify PCR failure resulting from a lack of DNA of sufficient quality, and to differentiate sample DNA from degraded DNA [7]. According to Kraemer et al. [8], this system was developed for typing challenging, low-quality and low-quantity samples in forensic casework. Although validation studies on this STR typing system have been conducted in several forensic laboratories, no such study has been conducted in a Chinese population.
To understand the availability and practicality of the Investigator 24plex QS Kit in Chinese populations, we evaluated the overall performance of the kit by following the Validation Guidelines for DNA Analysis Methods (2016) issued by the Scientific Working Group on DNA Analysis Methods (SWGDAM) [9] and the Chinese National Standard (CNS) Basic Quality Requirements of Forensic Science Human Fluorescent STR Multiplex PCR Testing Reagent (GA/T 815-2009) [10]. Accordingly, the PCR conditions, sensitivity, precision/accuracy, repeatability/reproducibility, concordance, stability, performance with genuine "forensic-type" samples, species specificity, mixture analysis, balance, stutter analysis, and population genetics were investigated in this study. The results illustrated that the Investigator 24plex QS Kit is sensitive and reliable in forensic application and, in particular, that the included Quality Sensors can provide useful information for troubleshooting. Furthermore, these STR markers are informative in the Chinese Han population and have a robust individual identification capability, non-parent exclusion probability, and database comparison compatibility.

Sample preparation and DNA extraction
Peripheral blood samples from 500 healthy, unrelated Chinese Han individuals (250 females and 250 males) were collected after receiving written informed consent. DNA isolation was carried out using the QIAamp DNA Blood Mini Kit (QIAGEN), and the quantity of genomic DNA was measured using a NanoDrop 2000 spectrophotometer (NanoDrop Technologies, Inc., Wilmington, DE, USA) in accordance with the manufacturer's protocol. DNA from the 9947A, 9948, 007 (Thermo Fisher Scientific, Waltham, MA, USA) and 2800M (Promega, Madison, WI, USA) human cell lines was used as positive controls. DNA samples were diluted to 0.5 ng/μL, or the appropriate concentration, using Tris-EDTA (TE) buffer.

PCR amplification and capillary electrophoresis (CE)
Amplification was performed in a single multiplex PCR system using 7.5 μL Fast Reaction Mix 2.0, 2.5 μL Primer Mix, 0.5 ng genomic DNA and an appropriate amount of water for a final reaction volume of 25 μL. PCR was performed on the GeneAmp 9700 PCR system (Thermo Fisher) with a Gold-plated Silver 96-Well Block in "Max Mode", comprising 3 cycles at 98 °C for 30 s, 64 °C for 55 s and 72 °C for 5 s; 27 cycles at 96 °C for 10 s, 61 °C for 55 s and 72 °C for 5 s; a final extension at 68 °C for 2 min; and a hold at 60 °C for 2 min.
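As a rough plausibility check on the cycling programme above, the programmed hold times can be summed. The following is a minimal sketch, not part of the validated protocol: the names `STAGES` and `total_hold_seconds` are hypothetical, and ramp times between temperatures (which depend on the instrument and ramp mode) are deliberately excluded.

```python
# Hold times (seconds) copied from the cycling programme described in the text.
# Ramp times between temperatures are instrument-dependent and NOT included.
STAGES = [
    # (number of cycles, [hold time in seconds for each step])
    (3, [30, 55, 5]),    # denaturation / annealing / extension, first 3 cycles
    (27, [10, 55, 5]),   # denaturation / annealing / extension, next 27 cycles
    (1, [120]),          # final extension, 2 min
    (1, [120]),          # final hold, 2 min
]


def total_hold_seconds(stages):
    """Sum the programmed hold times over all stages and cycles."""
    return sum(cycles * sum(steps) for cycles, steps in stages)


total_s = total_hold_seconds(STAGES)
print(f"Programmed hold time: {total_s} s ({total_s / 60:.1f} min)")
```

The holds alone sum to 2400 s (40 min); the actual run time is longer by the cumulative ramp time.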
Amplified products were prepared for CE detection by adding 1 μL of each PCR product to 12 μL of a 1:24 mixture of BTO_550 size standard (60, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 250, 260, 280, 300, 320, 340, 360, 380, 400, 425, 450, 475, 500, 525, and 550 bp) (QIAGEN) and Hi-Di formamide (Thermo Fisher). The mixture was then denatured by heating to 95 °C for 3 min and cooling to 4 °C for 3 min. Samples were injected at 1.6 kV for 33 s and electrophoresed at 13 kV for 1 550 s using a run temperature of 60 °C, the BT6 filter set and POP4 polymer (Thermo Fisher) on the 3500xL Genetic Analyzer (Thermo Fisher). Genotyping data were collected and analyzed with GeneMapper® ID-X Software v1.5 (Thermo Fisher). The default peak detection threshold of the analysis software, 100 relative fluorescence units (RFU), was applied for analysis.

PCR condition tests
To evaluate the range of PCR conditions that can produce reliable STR genotyping results, different annealing temperatures (57 °C, 59 °C, 61 °C, 63 °C, and 65 °C) and numbers of PCR cycles (25, 27 and 29) were tested in triplicate with 0.5 ng positive control DNA from the 9948 human cell line. These conditions were tested in a series of PCR reactions in which only one parameter was altered at a time and the others remained as suggested in the protocol.

Accuracy study and sizing precision
For the accuracy study, 200 random individual DNA samples were genotyped under the standard conditions on the 3500xL Genetic Analyzer. Based on the genotyping information, each allele size was compared to that of the corresponding allelic ladder to calculate the size difference.
For the precision study, the fragment size was measured via CE for the 24 allelic ladder samples on the 3500xL Genetic Analyzer. Then, the average fragment length and standard deviation of each allele were calculated.
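The two calculations described above can be sketched as follows. This is a hedged illustration only: the function names and the example sizes are hypothetical, the use of the sample standard deviation (n - 1 denominator) is an assumption because the text does not state which estimator was used, and the ±0.5 base window is the tolerance commonly applied to allele sizing.

```python
from statistics import mean, stdev


def sizing_precision(sizes_bp):
    """Return (mean, SD, 3*SD) for repeated size calls of one ladder allele.

    Uses the sample standard deviation (n - 1 denominator); this choice
    is an assumption, not something stated in the text.
    """
    avg = mean(sizes_bp)
    sd = stdev(sizes_bp)
    return avg, sd, 3 * sd


def size_offset(sample_bp, ladder_bp):
    """Signed size difference between a sample allele and its ladder allele."""
    return sample_bp - ladder_bp


def within_window(offset_bp, window_bp=0.5):
    """True if the offset lies inside the +/- window (default 0.5 bases)."""
    return abs(offset_bp) <= window_bp


# Hypothetical size calls (bp) for one ladder allele across four injections.
ladder_runs = [199.60, 199.66, 199.63, 199.67]
avg, sd, three_sd = sizing_precision(ladder_runs)
print(f"mean = {avg:.2f} bp, SD = {sd:.4f}, 3 x SD = {three_sd:.4f}")

# Hypothetical sample allele compared against the ladder mean.
offset = size_offset(199.90, avg)
print(f"offset = {offset:+.2f} bases, within window: {within_window(offset)}")
```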
Repeatability, reproducibility and concordance tests
The genotyping profiles of 50 individual DNA samples were compared to their replicate results to evaluate the repeatability of the kit. These samples were also amplified and genotyped in triplicate in an independent accredited laboratory to test the reproducibility. Since amelogenin and the 22 STR loci in the Investigator 24plex QS Kit are also contained in the GlobalFiler™ Kit (Thermo Fisher), the aforementioned 50 DNA samples were subsequently genotyped using the latter kit for the concordance test of the 24plex system.

"Forensic-type" samples
In forensic casework, it is often necessary to handle various biological tissues, such as hair, nail, bone, and blood. Therefore, a series of different sample types, including saliva, saliva stain, buccal swab, blood stain on an FTA card, 10-year-old blood stain on gauze, FFPEB (formalin-fixed and paraffin-embedded biopsy) sample, hair roots, nail, vaginal secretion, menstrual blood, semen, semen stain, muscle, and bone, were chosen and amplified in triplicate to test the ability of the Investigator 24plex QS Kit to process various sample types.

Sensitivity
To assess the optimal quantity and acceptable range of DNA input for the 24plex panel, a serial dilution of the positive control DNA 9948 was amplified in triplicate with quantities of 5 ng, 2 ng, 1 ng, 500 pg, 250 pg, 125 pg, 62.5 pg, 31.25 pg and 15.625 pg DNA. The average percentage of detected loci and the average peak heights in each of the above tests were measured. For this experiment, control DNA 9948 was quantified with the Qubit® dsDNA HS Assay Kit on a Qubit® 2.0 Fluorometer (Thermo Fisher) according to the manufacturer's protocol.

Species specificity
For the species specificity test, 2 ng DNA samples from 11 species of mammals (monkey, cow, horse, donkey, deer, sheep, pig, rabbit, dog, cat, and rat), four species of non-mammals (fish, snake, chicken and duck) and three microbial species (Escherichia coli, Staphylococcus albus and Staphylococcus aureus) were subjected to PCR amplification using the Investigator 24plex QS Kit in triplicate.

Mixture study
At serial ratios (1:1, 1:3, 3:1, 1:9, 9:1, 1:19, and 19:1), 14 mixed samples were prepared using control DNA from the 9947A human cell line mixed with DNA from the 9948 human cell line and using the DNA from the 9948 cell line mixed with DNA from the 2800M cell line. Using the 24plex system, 1 ng of each mixture was amplified and detected in triplicate.

Balance and stutter analysis
The allele height information of the 200 DNA samples used in Section repeatability, reproducibility and concordance tests was used to evaluate the balance performance and to perform the stutter analysis of the Investigator 24plex QS Kit. The balance of heterozygous alleles per locus was defined as the intra-locus balance; the balance of heterozygotes among loci labelled with the same dye was defined as the intra-colour balance; and the balance of heterozygotes across all analyzed loci among different STR profiles was regarded as the inter-colour balance. Each calculation method has been described in a previous study [11]. For the stutter analysis, only peaks one repeat unit smaller than the corresponding true allele were included in our study. The stutter ratio was calculated by dividing the peak height of the stutter peak by that of the allele.
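The two per-locus quantities can be illustrated with a short sketch. The stutter ratio follows the definition given above (stutter peak height divided by allele peak height). The intra-locus balance is written here as the lower heterozygote peak height divided by the higher one, which is a common convention and an assumption on our part; the exact formulas used in this study are those of reference [11]. Function names and peak heights are hypothetical.

```python
def intra_locus_balance(height_a, height_b):
    """Heterozygote balance at one locus: lower peak / higher peak.

    This min/max convention is assumed for illustration; the study's
    exact calculation is described in its reference [11].
    """
    if height_a <= 0 or height_b <= 0:
        raise ValueError("peak heights must be positive")
    low, high = sorted((height_a, height_b))
    return low / high


def stutter_ratio(stutter_height, allele_height):
    """Stutter peak height divided by the parent allele peak height."""
    if allele_height <= 0:
        raise ValueError("allele peak height must be positive")
    return stutter_height / allele_height


# Hypothetical peak heights in RFU.
print(f"balance = {intra_locus_balance(1700, 2000):.2f}")
print(f"stutter = {stutter_ratio(120, 2000):.1%}")
```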

Population investigation and statistical analysis
The 500 unrelated Han individuals mentioned in Section sample preparation and DNA extraction were amplified and genotyped using the Investigator 24plex QS Kit. Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) in the studied Han population were tested using Arlequin software v3.5.2 [12]. PowerStats V12.xls [13] was used to calculate the allele frequencies and other forensic parameters for the autosomal STR loci. According to the "Specification of parentage testing" (SF/Z JD0105001-2016), the total probability of discrimination power (TDP), the power of exclusion in duos ($PE_D$) and trios ($PE_T$), and the combined power of exclusion in duos ($CPE_D$) and trios ($CPE_T$) were calculated. The allele frequencies of the DYS391 locus were calculated by direct counting. Gene diversity (GD), haplotype match probability (HMP) and discrimination capacity (DC) were calculated using the formulas $GD = N\left(1 - \sum_{i=1}^{k} p_i^2\right)/(N - 1)$, $HMP = \sum_{i=1}^{k} p_i^2$, and $DC = k/\sum_{i=1}^{k}(p_i \times N)$, respectively, in which $p_i$ indicates the frequency of the $i$th haplotype, $k$ indicates the number of haplotypes and $N$ is the total number of individual samples.
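The GD, HMP and DC definitions above translate directly into a few lines of code. The sketch below is illustrative only: `haplotype_stats` and the example allele calls are hypothetical, and frequencies are obtained by direct counting. Because the haplotype frequencies sum to 1, the denominator of DC reduces to N, so DC is simply k/N.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Return (GD, HMP, DC) for a list of observed haplotypes.

    GD  = N * (1 - sum(p_i^2)) / (N - 1)
    HMP = sum(p_i^2)
    DC  = k / sum(p_i * N), which equals k / N because sum(p_i) = 1
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are required")
    counts = Counter(haplotypes)          # direct counting
    k = len(counts)                       # number of distinct haplotypes
    freqs = [c / n for c in counts.values()]
    hmp = sum(p * p for p in freqs)
    gd = n * (1 - hmp) / (n - 1)
    dc = k / sum(p * n for p in freqs)
    return gd, hmp, dc


# Hypothetical single-locus Y-STR calls for ten males.
calls = ["10"] * 6 + ["11"] * 3 + ["9"]
gd, hmp, dc = haplotype_stats(calls)
print(f"GD = {gd:.4f}, HMP = {hmp:.4f}, DC = {dc:.4f}")
```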

PCR condition tests
All alleles were detected at each of the annealing temperatures tested (57 °C, 59 °C, 61 °C, 63 °C and 65 °C). The average peak height of the 23 loci ranged from 7 197.92 RFU (65 °C) to 16 413.73 RFU (61 °C); the latter was therefore taken as the ideal annealing temperature for genotyping (Supplementary Figure S1(A)). Complete profiles were obtained, and no allele dropout occurred, when the PCR cycle number ranged from 25 to 29. However, the average peak height for the 29-cycle reaction was too high (23 446.96 RFU) and produced spectral bleed-through. Therefore, the optimal cycle number for the 24plex system was considered to be 27, with a moderate average peak height of 16 282.96 RFU (Supplementary Figure S1(B)). These results were in accordance with the conditions recommended in the user guide, and PCR was completed in less than 1 h, which suits forensic practice. The general characteristics of the 23 loci included in this system as well as the genotypes of the aforementioned four positive control DNA samples are shown in Supplementary Table S1.

Sizing precision and accuracy study
A sizing precision study is necessary for reliable and accurate genotyping. Based on the sizing data of the 24 allelic ladder samples, the average fragment sizes were plotted against 3 × the standard deviation (3 × SD), and the largest standard deviation was 0.0685 for allele 17.3 at D12S391 (Figure 1). In addition, the 24 allelic ladder samples gave identical sizes (199.64 bp) for allele 20.3 at the D1S1656 locus.
The accuracy was evaluated by calculating the size differences of all 8 461 alleles from the 200 DNA samples compared to the corresponding allelic ladder. All the sample alleles were observed to be within ±0.5 bases of a corresponding allele, most of which were within ±0.3 bases (Figure 2). This result demonstrates that the multiplex system has sufficient sizing accuracy and that there is little risk of error in micro-variant allele genotyping.
Repeatability/reproducibility, "forensic-type" samples and concordance testing
The genotyping profiles of the 50 individual DNA samples from the replicate tests of the 24plex QS Kit in our laboratory and from the tests reproduced in an independent accredited laboratory were consistent, which indicated the repeatability and reproducibility of this kit. Moreover, the genotypes of the 50 DNA samples from the 24plex system were fully concordant with those generated using the GlobalFiler™ kit, and no null alleles were observed, which further illustrated the concordance and accuracy of the 24plex kit.
We further demonstrated that the 24plex kit could successfully genotype a batch of genuine "forensic-type" samples, including hair, nail, and those mentioned in Section repeatability/reproducibility, "forensic-type" samples and concordance testing. The genotype profile of the FFPEB sample is shown in Supplementary Figure S2, in which the allelic peak heights decrease with increasing amplicon length, whereas the QS1 and QS2 peaks were not affected (height ratio greater than 60%). Complete genotype profiles of a blood stain on an FTA card and a 10-year-old blood stain on gauze were also obtained through direct amplification with this kit. Together, these results illustrated that the Investigator 24plex QS Kit was robust and applicable in forensic casework.

Sensitivity
A sensitivity study was performed to determine the range of input DNA quantities that would generate reliable genotyping profiles and to confirm the detection limit of the present system. Triplicate results of serially diluted control DNA from the 9948 human cell line illustrated that the average peak heights decreased proportionally with the decreasing quantity of template DNA. Complete profiles could be recovered with as little as 31.25 pg DNA (Figure 3A). The average peak heights decreased from 23 749.98 RFU to 1 411.36 RFU as the input DNA decreased from 5 ng to 31.25 pg, and a further decrease in the quantity of template DNA (15.625 pg) produced 5.83% allelic dropout with a lower average peak height of 656.25 RFU. In a previous study [8], the Investigator 24plex QS Kit recovered 100% of alleles with at least 125 pg input DNA, while it was demonstrated to be more sensitive (31.25 pg) in our study, which may reflect optimal electrophoresis conditions and instrument status. Moreover, as shown in Figure 3B, the heterozygous balance values of the corresponding markers generally decreased with decreasing quantities of input DNA. When the input DNA ranged from 5 ng to 125 pg, all the heterozygous balance values were greater than 0.6, with only two exceptions at the D10S1248 locus at 5 ng and 125 pg. Spectral bleed-through was observed at 5 ng and 2.5 ng. Therefore, based on our results, we recommend a DNA input between 125 pg and 1 ng.

Stability test
The ability of the Investigator 24plex QS Kit to obtain genotyping profiles from DNA exposed to common PCR inhibitors was assessed. As shown in Figure 4, for the hematin, humic acid, urea and indigotin groups, both the average percentage of loci detected and the average allele peak height decreased as the inhibitor concentrations increased. Full profiles were obtained at 500 μmol/L hematin, 160 ng/μL humic acid, 16 000 ng/μL urea, and 4 000 ng/μL indigotin, which is a significant improvement compared to many other CE-STR kits [14][15][16]. For nigrosine, concentrations of 40 ng/μL to 150 ng/μL produced profiles with slightly altered peak heights (9 070.43 to 10 387.76 RFU) but with several non-product peaks of 85-120 bp in length. This was also observed in our previously established SiFaSTR™ 23-plex system (data not shown) and may be due to the influence of nigrosine on metal ions as well as on some small molecules and enzymes involved in PCR [17]. In terms of the quality sensors, the peak height of QS2 was generally reduced more than that of QS1 in each concentration test of hematin, humic acid, urea, and indigotin. However, the peak heights of the two quality sensors remained similar in the nigrosine group. At concentrations of ≥320 ng/μL humic acid, ≥8 000 ng/μL indigotin, ≥32 000 ng/μL urea, and ≥700 μmol/L hematin, the QS2 peaks dropped out. The QS1 peaks, and PCR amplification as a whole, were completely inhibited at ≥16 000 ng/μL indigotin and ≥48 000 ng/μL urea.

Species specificity
The DNA samples described in Section species specificity were amplified and assessed to evaluate the system's potential cross-reactivity. The QS1 and QS2 peaks were observed to be normal in the profiles of all the non-human samples, which confirmed successful PCR amplification. Apart from the Quality Sensor peaks, no recognizable peaks above 100 RFU were found except in the deer, snake and chicken groups.
As shown in Supplementary Figure S3(A), two "OL" peaks (peak heights 1 012 and 1 069 RFU) at the D5S818 locus in BTP dye, with sizes of 329.54 bp and 331.14 bp, respectively, were detected in chicken DNA. Additionally, two "OL" peaks (peak heights 224 and 223 RFU) were observed at the D10S1248 locus in BTY dye, with sizes of 84.12 bp and 91.66 bp, respectively, in deer DNA (Supplementary Figure S3(B)). Snake DNA generated a 92.49 bp "OL" peak (peak height 419 RFU) at the D16S539 locus in BTP dye (Supplementary Figure S3(C)). Although several peaks located outside of each bin set were detected, the genotyping results were not affected. Overall, the species specificity results indicated that the Investigator 24plex QS Kit is human-specific.

Mixture study
Forensic evidence samples that contain body fluids or secretions from two or more individuals are common in forensic cases [18]. Thus, it is vital to evaluate the ability of STR typing kits to detect a DNA mixture, especially the components from the minor contributor. In our study, female and male DNA (9947A and 9948 human cell lines, respectively) or DNA from two individual males (9948 and 2800M cell lines) was mixed to yield 1 ng DNA input at various ratios, and the mixtures were amplified in triplicate using the 24plex kit. The genotype profiles of 9947A/9948 and 9948/2800M at a mixed ratio of 1:1 are shown in Supplementary Figure S4. As the amount of the minor component DNA was reduced, a corresponding drop in peak height occurred for its alleles. As a whole, full profiles (59 alleles for 9947A/9948 and 66 alleles for 9948/2800M) were obtained at mixed ratios ranging from 1:1 to 1:19 (19:1) for each triplicate test, which was consistent with the sensitivity study. The QS1/QS2 peaks remained unaffected. However, when the mixed ratio reached 1:19 or 19:1, the alleles of the minor contributor were indistinguishable from the stutter peaks of the major contributor's alleles, and in this situation, reference genotype profiles are helpful.
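The link between the mixture ratios and the sensitivity result can be made explicit by converting each ratio into template amounts. The sketch below is a simple worked calculation under the stated 1 ng total input; `contributor_amounts_pg` is a hypothetical helper name.

```python
def contributor_amounts_pg(ratio_a, ratio_b, total_ng=1.0):
    """Split a total DNA input (ng) into two contributor amounts (pg)."""
    total_pg = total_ng * 1000
    parts = ratio_a + ratio_b
    return total_pg * ratio_a / parts, total_pg * ratio_b / parts


# Mixture ratios used in the study (the mirrored ratios give the same amounts).
for a, b in [(1, 1), (1, 3), (1, 9), (1, 19)]:
    minor_pg, major_pg = contributor_amounts_pg(a, b)
    print(f"{a}:{b}  minor = {minor_pg:.1f} pg, major = {major_pg:.1f} pg")
```

At 1:19 the minor contributor supplies 50 pg of the 1 ng input, which is still above the 31.25 pg at which full single-source profiles were recovered in the sensitivity study.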

Balance and stutter determination
For accurate heterozygote genotyping and low-template or degraded sample detection, the intra-locus, intra-colour, and inter-colour balances are recommended to be ≥0.7, ≥0.5, and ≥0.3, respectively [10,19]. As shown in Table 1, based on the allele height data of 200 DNA samples, the intra-locus balance values of the 22 loci in the kit (except for the homozygous DYS391) ranged from 0.8366 (vWA) to 0.8849 (TPOX). The intra-colour balance values of the five different fluorescent dyes ranged from 0.8592 (BTG) to 0.9457 (BTR2), and the value of the inter-colour balance was 0.8967. These values fully satisfy the aforementioned standards and even exceed those of many other STR typing systems [11,15,20].
Table 1 note: The intra-locus, intra-colour and inter-colour balance parameters were calculated based on the genotyping results of 200 individuals (male = 100, female = 100). The balance criteria were as follows: intra-locus balance above 0.7, intra-colour balance above 0.5 and inter-colour balance above 0.3. SD: standard deviation.

Stutter peaks are considered to be additional PCR products that result from strand slippage [21]. A stutter peak usually manifests as one repeat unit smaller or larger than the associated main allele peak (n ± 3 for trinucleotide and n ± 4 for tetranucleotide repeats) [3,22]. In the stutter analysis for the 24plex system, only minus stutter peaks were taken into consideration, as few plus stutters were detected among these samples. The peak heights of the alleles and the corresponding stutters of these 200 DNA samples were used to calculate the stutter percentage and SD. As shown in Table 2, the tetranucleotide-repeat locus SE33 showed the highest average stutter ratio (12.43%), and the lowest was observed at the tetranucleotide-repeat locus TH01 (3.82%); these results agree with reports from Singapore and US population sets [8,23]. The stutter filter was defined by the average percentage of stutter plus or minus 3 × SD, which is also considered to be significant in mixture interpretation.
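A per-locus stutter filter of this kind can be sketched as below. This is an illustration under stated assumptions: the upper bound (mean + 3 × SD) is taken as the working threshold, the sample standard deviation is used, and the function names and stutter ratios are hypothetical rather than values from Table 2.

```python
from statistics import mean, stdev


def stutter_filter(ratios, n_sd=3):
    """Upper stutter threshold for one locus: mean + n_sd * SD.

    Assumes the sample standard deviation and that the upper bound of
    'mean plus or minus 3 x SD' is the one applied as the filter.
    """
    return mean(ratios) + n_sd * stdev(ratios)


def exceeds_filter(peak_ratio, threshold):
    """True if a peak in stutter position is too high to be dismissed as stutter."""
    return peak_ratio > threshold


# Hypothetical stutter ratios observed at one locus.
observed = [0.10, 0.12, 0.14]
threshold = stutter_filter(observed)
print(f"stutter filter = {threshold:.1%}")
print("candidate minor allele:", exceeds_filter(0.25, threshold))
```

In mixture interpretation, a peak in stutter position whose ratio exceeds the locus filter is a candidate minor-contributor allele rather than an artefact.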

Population study
In the present study, a total of 500 unrelated Chinese Han individuals (250 males and 250 females) were successfully genotyped using the 24plex system. A total of 279 alleles at 22 STR loci were observed, with allelic frequencies ranging from 0.001 to 0.500. According to the exact Chi square test, there was a statistically significant deviation from HWE at SE33 (P < 0.05) in the studied population, which could be the result of a bias in the sample set. Although no significant deviation was observed after applying Bonferroni's correction, adding more unrelated individuals might be necessary in further studies. No significant LD was observed between these STR loci in the studied population after Bonferroni's correction. The HET values ranged from 0.6280 (TPOX) to 0.9180 (SE33); correspondingly, the PIC of these two loci were the lowest

Conclusion
In our study, we conducted a pilot validation study of the Investigator 24plex QS Kit in the Chinese Han population. One of the special features of this kit is the inclusion of two innovative internal performance controls, QS1 and QS2, which evaluate the PCR performance and can aid in troubleshooting. This 6-dye DNA profiling system was found to be highly sensitive and reliable for genotyping various forensic-type samples and could tolerate common PCR inhibitors. The population investigation demonstrated its applicability in forensic genetics among the Chinese Han population, which also provided useful population STR data that enriched the STR database.

Compliance with ethical standards
The samples from different species were collected and prepared in an accredited laboratory (ISO 17025) upon approval of the Ethics Committee on Animal Experimentation at the Academy of Forensic Sciences, Ministry of Justice, P.R. China. Written informed consent was received from each participant for sample collection.

Disclosure statement
The authors had no conflicts of interest.