Sequencing of 231 forensic genetic markers using the MiSeq FGx™ forensic genomics system – an evaluation of the assay and software

ABSTRACT The MiSeq FGx™ Forensic Genomics System types 231 genetic markers in one multiplex polymerase chain reaction (PCR) assay. The markers include core forensic short tandem repeats (STRs) as well as identity, ancestry and phenotype informative short nucleotide polymorphisms (SNPs). In this work, the MiSeq FGx™ Forensic Genomics System was evaluated by analysing reproducibility, sensitivity, mixture identification and forensic phenotyping capabilities of the assay. Furthermore, the genotype calling of the ForenSeq™ Universal Analysis Software was verified by analysing fastq.gz files from the MiSeq FGx™ platform using the softwares STRinNGS and GATK. Overall, the performance of the MiSeq FGx™ Forensic Genomics System was high. However, locus and allele drop-outs were relatively frequent at six loci (two STRs and four human identification SNPs) due to low read depth or skewed heterozygote balances, and the stutter ratios were larger than those observed with conventional STR genotyping methods. The risk of locus and allele drop-outs increased dramatically when the amount of DNA in the first PCR was lower than 250 pg. Two-person 50:1 mixtures were identified as mixtures, whereas 100:1 and 1 000:1 mixtures were not. Y-chromosomal short tandem repeats (Y-STRs) alleles were detected in the 100:1 and 1 000:1 female/male mixtures. The ForenSeq™ Universal Analysis Software provided the data analyst with useful alerts that simplified the analysis of the large number of markers. Many of the alerts were due to user-defined, locus-specific criteria. The results shown here indicated that the default settings should be altered for some loci. Also, recommended changes to the assay and software are discussed.


Introduction
For more than 15 years, forensic DNA profiles have been generated by polymerase chain reaction (PCR) amplification of the core forensic short tandem repeats (STRs) and by detection of the PCR products with capillary electrophoresis (CE) [1]. PCR-CE typing proved to be highly efficient and relatively cheap. However, it has limited multiplexing capability, and the STR analyses cannot easily be combined with typing of other types of forensic loci, e.g. short nucleotide polymorphisms (SNPs), indels, mtDNA or RNA, in the same assay. With the introduction of PCR-based next generation sequencing (NGS) methods in forensic genetics [2][3][4][5][6][7][8][9][10][11][12][13], a new method for the analysis of PCR products was made available that may surmount many of the limitations of PCR-CE. The multiplexing capacity of the current PCR-NGS assays far exceeds the number of classical forensic genetic markers, and different types of loci (SNPs, indels, STRs or any combination of SNP-STR-indel haplotypes) may be typed in the same multiplex assay. Furthermore, the amplification products need not to be separated by colour and/ or size as in PCR-CE assays, because the loci can be identified by their DNA sequences. Thus, the PCR amplicons can be designed to be as short as possible, which may improve the probability of typing degraded DNA. Finally, the complete sequence variations of the loci are revealed, which increases the statistical weight of the DNA evidence and may improve the resolution of mixtures [4].
The MiSeq FGx TM Forensic Genomics System includes the ForenSeq TM DNA Signature Prep Kit, the MiSeq FGx TM sequencing instrument and the Fore-nSeq TM Universal Analysis Software (UAS) [11]. The ForenSeq TM DNA Signature Prep Kit is one of the first commercial kits for simultaneous typing of forensically relevant STRs and SNPs. The loci may be amplified with one of two primer mixes: Primer Mix A amplifies loci for human identification (HID), while Primer Mix B amplifies the same HID loci along with ancestry informative markers (AIMs) as well as markers associated with eye and hair colour. The MiSeq FGx TM Forensic Genomics System types 27 autosomal STRs, 7 X-STRs, 24 Y-chromosomal short tandem repeats (Y-STRs), amelogenin and 94 HID SNPs that were specifically selected for HID purposes by the forensic community [14][15][16][17][18][19][20][21]. The STRs include the 13 combined DNA index system (CODIS) core loci [22], the 12 European Set of Standard (ESS) markers [1] and 6 of the 7 Y chromsome STR haplotype reference database (YHRD) core loci in the minimal Y-STR haplotype [23], which ensure comparability with the DNA profiles in already existing DNA databases. The kit also amplifies 56 AIM SNPs [24] as well as 24 eye and hair colour predictive SNPs [25]. In total, 231 markers relevant for forensic genetics are sequenced. The amplicon lengths of the STRs and SNPs are in the ranges 61-467 and 63-231 bp, respectively.
Previous studies have already assessed the performance of the MiSeq FGx TM Forensic Genomics System [6,11,[26][27][28][29][30]. In this study, we evaluated the system by assessing the read depth of the markers, allele balances and stutter and noise ratios using the fastq files from the MiSeq FGx TM platform. We compared these analyses with the results generated by the ForenSeq TM UAS and evaluated the reports generated by the ForenSeq TM UAS. Furthermore, we analysed the sensitivity of the workflow and the abilities of the ForenSeq TM UAS to identify DNA mixtures, and to predict biogeographical ancestry as well as eye and hair colour.

Samples and DNA extraction
Material from buccal swabs was collected on FTA cards from 12 Danes and 1 Argentinian individual. The DNA was extracted from two 3 mm punches using the Trace Tip Dance (TD) procedure of the EZ1 DNA Investigator Kit (Qiagen, Hilden, Germany) and the EZ1 Advanced XL Instrument (Qiagen). The extracted DNA was eluted in 50 mL water.
Blood samples from 10 Italians, 2 Ethiopians and 7 individuals of Eastern European, Moroccan, Iraqi, Indian, Korean, Chinese and Brazilian origin, respectively, were used in this study. The DNA was extracted from 200 mL blood using the spin protocol of the QIAamp Ò DNA Mini Kit (Qiagen). The DNA extractions were quantified using the Qubit 3.0 instrument (Thermo Fisher Scientific, Waltham, MA, USA).

Library building, MiSeq FGx TM sequencing and data analysis
Libraries were built using the ForenSeq TM DNA Signature Prep Kit (Illumina Ò ) following the manufacturer's protocol. Four library pools were made. Pool 1 contained libraries from 10 Danes, 10 Italians and 10 individuals from various countries (Table 1) MiSeqFGx TM 's "cluster passing filter" in Pools 1, 2 and 4, respectively. In Pool 3, the read depth ranged from 983 to 102 329 reads per sample with a median of 18 398 reads. The cluster density was 665 k/mm 2 , and 96.8% of the clusters passed the "cluster passing filter". Overall, the sample read depth decreased as the amount of input DNA decreased. All sequencing runs were analysed with the ForenSeq TM UAS [32]. The "Project Detail Report" offered by the ForenSeq TM UAS was used for further analysis.

STRinNGS STR analysis
Fastq.gz files from the MiSeq FGx TM sequencing were mapped to hg19 (GRCh37) using an in-house script. The resulting BAM files were analysed with the STRinNGS v.1.0 software [5]. In short, STRinNGS sorted the reads from the BAM files according to the barcodes assigned to each individual. For each barcode, the reads for each STR system were subsequently sorted via their specified flanking sequences, and each sequence read was parsed according to the specified nomenclature. Configuration files stated the length of the flanking regions as well as the repeat structure according to information in the STRbase (http://www.cstl.nist. gov/strbase/) and the literature [14][15][16][17][18][19][33][34][35] (Supplementary File 1). The nomenclature described by Gelardi et al. [36] was applied. Forward strand repeat structure was used in accordance with the recommendations of the International Society for Forensic Genetics [35]. The main output file contained a description of all sequences with read counts above 1% of the total number of reads per locus. STRinNGS output was used to determine the read depth, heterozygous balance (Hb) and fractions of stutter reads based on duplicate sequencing of 30 single contributor samples. The Hb for STRs was calculated as the read count for the longest allele divided by the read count for the shortest allele.

GATK SNP analysis
Reads obtained from the fastq.gz files from the MiS-eqFGx TM instrument were trimmed to a Q-score of 30 using AdapterRemoval v.2.1.3 [37]. Reads were mapped to hg19 using BWA-MEM (http://bio-bwa. sourceforge.net/) and aligned reads were extracted using SAMtools [38]. Genotypes were called with GATK 3.6 (https://software.broadinstitute.org/gatk/) using the default genotyping procedure [39]. Alleles with read depth below 20 reads were discarded. Locus read depth, Hb and noise levels were identified and plotted using in-house Python scripts. Hb was calculated as the number of reads of one nucleotide divided by the number of reads of the other nucleotide in the order A, C, G and T. Noise was calculated as the number of reads that was different from the true genotype divided by the total number of reads.

Ancestry and phenotype predictions
The ForenSeq TM UAS ancestry predictions were compared to self-reported ancestry. The ForenSeq TM UAS eye colour predictions were compared to the Pixel Indix of the Eye (PIE)-scores obtained with digital photographs and the DIAT software [40]. The ForenSeq TM UAS hair colour predictions were compared to self-reported hair colours, which were available for all study subjects except for the individuals of Eastern European, Iraqi, Indian and Brazilian origin.

STR typing using the MiSeq FGx TM forensic genomics system
A total of 30 samples were sequenced in duplicate with the ForenSeq TM DNA Signature Prep Kit using Primer Mix B (Pools 1 and 2) and analysed with the Fore-nSeq TM UAS. All 58 and 34 STR markers were typed in all samples from male and female individuals, respectively. However, DXS10103 was not called in one of the duplicate typings of two samples because of low read depth (30 allele read counts).
A total of 1308 STRs (58 STRs typed in 12 male samples and 34 STRs typed in 18 female samples) were sequenced twice, and 1290 genotypes (98.6%) were reproduced. The details of the 18 genotypes that were not reproduced are shown in Supplementary File 2. Besides the two locus drop-outs in DXS10103, discordant genotypes were observed in six loci (D9S1122, D17S1301, D20S482, D21S11, DYS392 and DXS10135). In 13 incidents, the discordance was caused by allele drop-in. The drop-in alleles had markedly lower read depth than the other allele(s), and the lengths and sequences of the drop-in alleles indicated that the alleles were stutter artefacts. Only in one incident, the 10,13 genotype call in DYS392, the drop-in allele had an unusual length. A total of four allele drop-outs, which were all in DXS10103, were observed. In all incidents, sequences of the drop-out allele were present. However, the alleles were not called because the numbers of reads were below the interpretation threshold (IT) defined in the ForenSeq TM UAS.
All 30 samples were also typed with PCR-CE using the AmpFℓSTR Ò NGMSElect Express PCR Amplification Kit. Of the 450 STR genotypes, 448 (99.6%) were concordant with the genotypes obtained using the ForenSeq TM DNA Signature Prep Kit. Two discordances were caused by drop-ins in the PCR-NGS assay (Supplementary File 3).

STR sequence analysis
The locus balance, Hb and fraction of stutter artefacts for each STR locus were analysed using STRinNGS [5] and the fastq files from the duplicate typing of 30 samples (Pools 1 and 2). STRinNGS uses the flanking sequences to sort the reads according to locus. For most loci, the numbers of allowed mismatches in the flanking sequences were set to no more than three. However, more than three mismatches in the flanking sequences were needed to identify reads of D1S1656, D19S433, HPRTB and DYS522 (Supplementary File 1), indicating that some flanking sequences included a relatively high number of sequencing errors or true variants. The read depth of the Y-chromosome STR DYS461 is shown in Figure 1. This locus is not included in the list of markers amplified by the Fore-nSeq TM Signature Prep Kit. However, DYS461 was amplified on the same fragment as DYS460 [6], and was readily analysed by STRinNGS (Supplementary File 1).
The median read depth of all STRs was 3 145 reads. The median read depth varied from 52 to 18 164 reads for the various loci ( Figure 1). Two loci, DYS389I and D3S1358, had median read depth above 10 000 reads, whereas DXS10103 had a median read depth of only 52 reads. The coefficient of variation of the STRs' read depth varied from 30% to 120%, and was highest for the Y-chromosome STR DYS392 (Table 2).
Hb was calculated in all heterozygous genotypes as the read count of the longest allele divided by the read count of the shortest allele ( Figure 2). For most loci, the median Hb was close to or slightly smaller than 1.0, which indicates that the sequencing assay had a tendency to favour the amplification of the shorter allele of a locus. D22S1045 was the only STR locus with a median Hb < 0.5 (based on 44 heterozygous  [TCTA] repeats. These alleles had lower read counts (approximately one-third) than the alleles without the trinucleotide (data not shown). The fraction of stutter reads that were one repeat unit shorter than the parent allele (n ¡ 1 stutter) was calculated for each allele (Table 3, Figure 3, Supplementary File 4). The highest n ¡ 1 stutter ratio was observed for the two three-base-pair Y-chromosome repeats DYS481 and DYS612 that both had an average stutter ratio of 31.5%, and a maximum stutter ratio of 43% and 37%, respectively. In contrast, the four fivebase-pair repeat STR systems Penta D, Penta E, DYS438 and DYS643 had some of the lowest n ¡ 1 stutter ratios, and no stutter was observed in the sixbase-pair repeat system DYS448. At four STR loci, D12S391, D21S11, DYS390 and DXS7132, two stutters of the same length but with different sequences were observed, and at DYS389II, three same-length stutters with different sequences were observed. These STRs have two or more sub-repeats, and slippage during the PCR may occur at any of the sub-repeats and, thus, generate stutter artefacts. For example, the allele D21S11 27 [TCTA] 6  (stutter ratio: 3.0%). In general, the fraction of stutter reads depended on the sequence and the length of the longest uninterrupted repeat [31,[41][42][43]. The stutter ratio was high for alleles with long uninterrupted repeats. This was seen in e.g. D18S51, where the stutter Autosomal loci are marked with white colour, X-chromosomal loci with pale grey colour and Y-chromosomal loci with dark grey colour (two amplicons are generated from each of the Y-STRs DYF387S1 and DYS385a/b). The box-and-whisker plot parameters are described in Figure 1. ratio was linearly correlated with the number of [AGAA] repeats ( Figure 3(A)). STR alleles with microvariants in the longest repeat had lower stutter ratios than alleles of similar length without microvariants, e.g. D1S1656 (compare e.g. alleles 16.3 and 17 in Figure 3 (B)). D1S1656 had a long sequence of [TCTA] repeats. However, in some individuals, a [TGA] trinucleotide broke the longest repeat up into two smaller repeats. These alleles had much lower stutter ratios than alleles without the [TGA] sequence. In contrast, microvariants in D19S433 did not affect the stutter ratio because the variants did not interrupt the longest repeat (compare e.g. alleles 14.2 and 15 in Figure 3(C)). Similarly, variations in the sub-repeat structures caused stutter ratios to be lower than the expected ones from the total number of repeats.  Figure 3(D)).
Only stutter reads that were one repeat unit shorter than the parent allele were considered here. Many of the sequences that were two repeats shorter (n ¡ 2 stutter) or one repeat longer (n + 1 stutter) than the parent allele were not represented in the STRinNGS output file because STRinNGS ignored unique sequences with less than 1% of the total reads for a locus. Nevertheless, n ¡ 2 and n + 1 stutters were observed in loci with frequent n ¡ 1 stutter reads, e.g. DYS481, DYS612, DYS385a/b, DYS392 and DXS7132.

HID-SNP typing using the MiSeq FGx TM forensic genomics system
A total of 94 HID-SNPs were amplified with the Fore-nSeq TM DNA Signature Prep Kit, and analysed with the ForenSeq TM UAS. Eighteen locus drop-outs (0.32%) were observed in the duplicate typing of the 30 samples. Eleven of the 18 locus drop-outs were found at the rs1736442 SNP. The remaining locus drop-outs were detected in rs1031825, rs2920816, rs907100 and rs7041158.
A total of 2820 SNPs were typed twice, and 2780 genotypes (98.6%) were reproduced. The inconsistencies were caused by the 18 locus drop-outs mentioned above, 21 allele drop-outs at 8 loci (rs2920816, rs1493232, rs1031825, rs1294331, rs7041158, rs1736442, rs1454361 and rs338882) and one allele drop-in at rs1454361. Most allele drop-outs were found in rs2920816 and rs1493232 with seven and four incidents, respectively. Allele drop-outs were observed in loci with low read depth (<94£). The locus drop-in occurred in a locus with high read depth (846£) and very skewed Hb (24.6).

Analysis of SNP amplicons
The locus balance, Hb and noise level of each of the 172 SNP loci were analysed using the freeware GATK (https://software.broadinstitute.org/gatk/) and the fastq files from the duplicate typing of 30 samples. The median read depth was 505 reads or around six times lower than that of the STR loci. The median read depth ranged from 60 (rs1736442) to 1 212 (rs917115) reads (Supplementary Figure 1). Seventeen of the 18 locus drop-outs and 13 of the 21 allele drop-outs were observed in the four loci with the lowest overall read depth (rs1736442, rs2920816, rs1031825 and rs7041158). A read depth of 200£ has previously been suggested as a threshold for SNP typing with NGS platforms in relationship testing [8]. Eighty-six of the HID SNPs (91%), 50 of the AIM SNPs (93%) and all phenotypical SNPs had median read depth higher than 200£.
Hb (Supplementary Figure 2) was calculated for SNP loci with six or more heterozygous genotype calls among the duplicate sequencing of 30 individuals (150 loci). For most loci, the median Hb was close to one, and there were only few outliers. However, four SNPs (rs798443, rs338882, rs6955448 and rs279844) had median Hbs below 0.5 or higher than 2.0.
The reads that differed from the SNP genotype call were designated as noise. Only two SNP loci, rs1229984 and rs1979255, had median noise levels >0.1%. Of the 10 302 SNP genotypes, only 46 (0.45%) had noise levels >1%. Almost 50% (20) were at the rs1979255 locus.

Sensitivity
Eight dilutions of DNA from two male and two female individuals were amplified in triplicates using Primer Mix A and genotyped with the MiSeq FGx TM Forensic Genomics System (Pool 3). Overall, the typing efficiency and quality of the results decreased with decreasing amount of DNA in the first PCR. The number of allele and locus drop-outs increased dramatically when the DNA input was <250 pg (Figure 4). Similarly, the number of loci flagged for low read depth by the ForenSeq TM UAS increased dramatically for <250 pg input DNA (Table 4). DXS10103 and DYS389II were the STR markers most prone to locus drop-out with 250 pg input DNA (locus drop-outs occurred in 56% and 39% of the male samples, respectively; see Supplementary  Table 1). DXS10103 had a lower read depth than the other loci (Figure 1). The high susceptibility to locus drop-out of DYS389II may be due to the long alleles of this locus (http://strbase.nist.gov/str_y389.htm). HID SNPs prone to locus drop-outs with 250 pg input DNA included the four SNP loci with the lowest read depth, rs2920816, rs1031825, rs7041158 and rs1736442 ( Supplementary Figure 1), as well as rs354439 and rs1294331.
High proportions of allele drop-out with 250 pg input DNA were observed for DYS385a/b, DXS10103, vWA and D1S1656 (Supplementary Table 2). D1S1656 was the marker with the largest variation in Hb. The DYS385a/b amplicons were long (316 bp) [44] including a 163 bp upstream flanking sequence. DXS10103 and vWA were among the loci with the lowest read depth ( Figure 1 and Table 2). HID SNP loci prone to allele drop-outs with 250 pg input DNA included rs1493232 and rs10488710 (allele drop-out in 28% and 22% of the samples, respectively). Both loci frequently experienced locus drop-outs at <250 pg input DNA.
Sequences with n ¡ 1 repeats (most likely n ¡ 1 stutters) that had high read counts led to allele dropins. At 250 pg input DNA, allele drop-ins were observed in DYS612, D20S482, D9S1122, D2S1338, D3S1358, FGA, DYS505, D6S1043, D8S1179, TH01 and D21S11 (Supplementary Table 3 Table 5. All samples with mixture ratios between 1:1 and 1:50 were  identified as mixtures, whereas 1:100 and 1:1 000 mixtures were called as single-source samples. It is noteworthy that Y-STR alleles were detected in 1:100 (six and seven alleles in the two duplicates) and 1:1000 (two and zero alleles) male/female mixtures, and that this did not influence the single-source indicator of the ForenSeq TM UAS (data not shown). The average numbers of alerts for the 30 singlesource samples typed in duplicate (Pools 1 and 2) were 2.7 and 1.05 for STRs and SNPs, respectively ( Table 5). The "allele count" alerts were seen in D7S820 (eight times), D20S482 (six times) and D21S11 (five times). These loci had a relatively low stutter threshold compared to the observed fraction of stutter reads (Table 3). Sometimes, the number of n ¡ 1 stutter reads exceeded the threshold, which released an "allele count" alert. The largest number of alerts for "imbalance" was observed in D22S1045 (44 times), D1S1656 (26 times), Penta E (12 times), D5S818 (11 times), rs338882 (19 times), rs6955448 (12 times) and rs4530059 (9 times). These loci have either skewed median Hb or large variations of Hb (Figure 2, Supplementary Figure 2), which triggered the "imbalance" alert of some samples.

Ancestry prediction
The ForenSeq TM UAS comprises a tool for prediction of biogeographic ancestry using four reference populations: European, East Asian, African and admixed American. The predictions were compared to selfreported ancestry of 30 individuals. Twenty out of 21 self-reported Europeans were assigned European ancestry. One of the Italian individuals was predicted as admixed American. The two self-reported East Asians (Korean and Chinese) were assigned East Asian ancestry. Two Africans from Ethiopia were predicted as African and admixed American, respectively. The Brazilian individual was predicted to be of admixed American ancestry, and the Argentinian individual to be of European ancestry. The Argentinian individual had blue eyes and blond hair (see the following paragraph) indicating that the individual was of European descent. Finally, a Moroccan, an Iraqi and an Indian individual were all assigned as admixed Americans.

Phenotype prediction
The ForenSeq TM UAS comprises a tool for prediction of eye and hair colours based on the HIrisplex model [25]. The eye colour predictions were compared to the PIE-scores that are an objective measure of the eye colour [40], whereas the hair colour predictions were compared to self-reported hair colours.
The ForenSeq TM UAS reported probability values for each colour category. With a probability threshold of P > 0.7, as suggested in [25], 11 of the 30 individuals were predicted to have blue eyes, 16 individuals to have brown eyes and the eye colours of 3 individuals could not be predicted ( Figure 5). Of the 11 individuals predicted to have blue eyes, 9 had PIE-scores >0.8, and 2 individuals had PIE-scores of 0.63 and 0.64, respectively. Most likely, the eyes of individuals with PIEscores >0.8 would be categorized as blue or light-coloured, whereas PIE-scores around 0.6 most likely are associated with an intermediate eye colour [40].
With a probability threshold of P > 0.7, the hair colour was predicted for only 10 of the 30 individuals ( Figure 6). Nine of the predictions were in concordance with the self-reported hair colour including two, three and four individuals with red, blond and black hair colour, respectively. One individual was predicted to have red hair, but had brown hair colour. Noteworthy, no brown hair colour predictions were made, even though seven individuals had self-reported brown hair colour.

Discussion
The MiSeq FGx TM Forensic Genomics System [11] is capable of typing 153 (Panel A) or 231 (Panel B) loci commonly used in forensic genetics in one multiplex assay. The loci include STRs and SNPs, which make the ForenSeq TM DNA Signature Prep Kit the first commercial assay for typing of both types of HID markers and an excellent example of what PCR-NGS assays may provide to forensic genetic investigations in future.
Except for two drop-ins in D21S11 (Supplementary File 3), complete concordance between the automated ForenSeq TM UAS analysis and the PCR-CE results was observed. Both drop-ins were flagged by the Fore-nSeq TM UAS, and would have been called as stutter artefacts by the data analyst. The reproducibility was >98.5% for both STRs and SNPs in the duplicate typing of 30 samples. Most of the discordances were caused by locus or allele drop-outs, whereas allele drop-ins were rare and mainly caused by stutter artefacts. The analysis of the assay sensitivity indicated that most loci may be successfully typed with an input DNA amount of >250 pg in the first PCR. In the reproducibility and sensitivity studies, DXS10103 and the four HID-SNPs, rs1736442, rs2920816, rs1031825 and rs7041158, accounted for the majority of dropouts. All these loci had relatively low read counts ( Figure 1 and Supplementary Figure 1), which explained why they frequently dropped-out. Analysis of the heterozygote balances revealed that D22S1045 was another poorly performing locus. The variation in allele read depth was so large that allele drop-out was likely, and it was difficult to distinguish stutters from true alleles for some samples. The power of discrimination of the MiSeq FGx TM Forensic Genomics System is so large that it may be considered to eliminate these loci from the assay. However, D22S1045 is one of the ESS markers. Therefore, it would be preferable to keep the locus in the multiplex, but design new, more efficient PCR primers. Similarly, one of the YHRD core loci [23], DYS393, is not amplified with the MiSeq FGx TM Forensic Genomics System, and it should be included in any future design.
The ForenSeq TM UAS is not an expert software that reports genotypes and conclusions in a final report without manual inspection of a case officer. The software assists the data analyst by presenting the analysed results in detailed reports and by raising alerts, but it is the data analyst who makes the conclusions. The software uses two minimal thresholds: the IT and the analytical threshold (AT). If the locus read depth is less than 650 reads, the IT is set to 30 reads, and the AT to 10 reads. When the read depth is higher, the software employs IT and AT settings that are user-defined as percentages of the total reads for each locus. Only sequences with read depth higher than the IT are called. The AT is used exclusively to raise alerts and to inform the data analyst that there are sequences with fewer reads, which may be relevant for the interpretation. The threshold for the Hb is also user-defined. However, it is set commonly for all STRs and all SNPs, and not for each locus, separately. Some loci have skewed Hbs (Figure 2 and Supplementary Figure 2) and frequently trigger "imbalance" alerts. Therefore, it would be convenient if locus-specific Hb thresholds were implemented in future versions of the software. The ForenSeq TM UAS uses the number of "imbalance" and "allele count" alerts to identify possible mixtures. However, it is up to the data analyst to act on this information and draw the necessary conclusions. The locus-specific stutter filters are user-defined and take the possibility of observing n ¡ 1, n ¡ 2 and n + 1 stutters into account. The filter for each locus is a fixed percentage of the number of reads of the parent allele (Table 3). However, as shown in Figure 3 and Supplementary File 4, the stutter ratio varied between alleles of the same locus and depended on the longest uninterrupted repeat [31,41,42]. In the ideal situation, the stutter filter values should be set for each allele, i.e. be sequence-dependent.
In this work, STRinNGS [5] was used to verify the data analysis of the ForenSeq TM UAS and to analyse the flanking regions of the STRs. We found true sequence variations in the flanking regions (all previously reported by Novroski et al. [6]), and what appeared to be sequencing errors or base degeneration in the primer sets of the first PCR. The latter were problematic because of their high frequency at some loci. For example, in D1S1656, four unique sequences with almost equal read counts in all heterozygous individuals were identified: two different STR alleles and for each STR allele, two different SNP-STR alleles. The variant in the flanking region of D1S1656 was detected 6 bp downstream and was most likely caused by a degenerated base in the downstream PCR primer. These positions must be identified to allow proper analysis of the flanking regions and should be made available by the manufacturer (Illumina or Verogen). The ForenSeq TM UAS does not report variants in the flanking regions, except for the SNP rs73801920 positioned 4 bp downstream of D5S818 [5]. Variation in this position is indicated in the report, if an individual has two SNP-STR alleles of the same size. However, it is up to the data analyst to use that information and name the SNP-STR alleles. In the case of an insertion or deletion in the flanking region, the ForenSeq TM UAS reports the allele name based on the length of the fragment, similarly to PCR-CE detection methods, and the sequence of the STR region as a string of bases. Again, it is up to the data analyst to interpret the STR sequence and name the allele. Two loci with deletions in the flanking regions were identified in our study. One individual had a 13 bp deletion immediately upstream of Penta D, and four individuals had a 3 bp deletion in the downstream flank of DXS10135 (Table 6). In the example with a deletion in Penta D, the ForenSeq TM UAS named the allele Penta D [2.2], which is identical to the PCR-CE allele name. However, the allele had five [AAAGA] repeats, and the 13 bp deletion was not reported. Without a more detailed analysis of the reads, it was not possible for the data analyst to give the allele the accurate name, which should be Penta D 2.2 [AAAGA] 5 del:45056073-85 [36]. Similarly, the deletion at DXS10135 was not reported by the ForenSeq TM UAS, and the true name could only be constructed using another software.
Overall, the performance of the MiSeq FGx TM Forensic Genomics System was high and the assay may be an attractive alternative to PCR-CE assays in some cases. In previous studies, other researchers reached similar conclusions on the overall performance of the MiSeq FGx TM Forensic Genomics System, and also identified the same error-prone loci [6,[26][27][28][29]. However, the higher costs of the workflow compared to that of PCR-CE typing may limit its use to special cases. The reagent costs of typing a sample with the MiSeq FGx TM Forensic Genomics System was approximately US$ 60 with Primer Mix A, and US$ 90 with Primer Mix B. This is 3-4 times the price of PCR-CE typing of CODIS and ESS STR loci, which usually provides sufficient discrimination power for non-degraded single contributor samples [1,45]. In the analysis of mixtures and/or samples with low amounts of DNA, the STR sequence information and sheer number of typed loci may be valuable. However, the high stutter ratios of some of the loci may interfere with mixture interpretation (Table 3, Supplementary  File 4). Ancestry and phenotype information may provide investigative leads to the police in crime cases without a suspect and no hit in the crime DNA database, as well as in cases with missing persons [25,40,46]. However, the ancestry and phenotype prediction provided by the MiSeq FGx TM Forensic Genomics System was not based on the likelihood ratio principle, which is the internationally recommended method for reporting forensic genetic evidence [10,47,48]. This should also be addressed in future versions of the ForenSeq TM UAS.

Compliance with ethical standards
All procedures performed in studies involving human participants were in accordance with the ethical standards of the Danish Ethical Committee (H-3