Use of faecal volatile organic compound analysis for ante-mortem discrimination between CWD-positive, -negative exposed, and -known negative white-tailed deer (Odocoileus virginianus)

ABSTRACT Chronic wasting disease (CWD) is a naturally occurring infectious, fatal, transmissible spongiform encephalopathy of cervids. Currently, disease confirmation relies on post-mortem detection of infectious prions in the medial retropharyngeal lymph nodes or obex in the brain via immunohistochemistry (IHC). Detection of CWD in living animals using this method is impractical, and IHC and other experimental assays are not reliable in detecting low concentrations of prion present in biofluids or faeces. Here, we evaluate the capability of faecal volatile organic compound analysis to discriminate between CWD-positive and -exposed white-tailed deer located at two positive cervid farms, and two groups of CWD-negative deer from two separate disease-free farms.

As CWD continues to spread in both captive and wild ungulate populations, detection of infected animals is of great importance. The current 'gold standard' diagnostic assay is post-mortem immunohistochemistry (IHC) of the medial retropharyngeal lymph nodes (MRPLN) and obex in the brain. There is a significant need for live animal (i.e., ante-mortem) assays to identify infected animals for disease management and control purposes. Ante-mortem detection of CWD by conventional means has been challenging, as IHC cannot be used to test easy-to-collect samples such as biofluids or faeces [11] and the low concentration of CWD prion present in such samples falls below the detection limit of Western blot. Biopsy and IHC of lymphoid tissue in the recto-anal mucosa (i.e., RAMALT; rectal biopsy) has been shown to have utility under some conditions; however, it is invasive and has limited sensitivity [26] related to the quality of the sample (i.e., too few or no diagnostic lymphoid follicles), repeated sampling of the tissue, extent of histological lesions, and age of the animal [6,27,28].
The development of prion amplification assays such as serial protein misfolding cyclic amplification (PMCA) [24,29-31] and real-time PrPc [32] quaking-induced conversion assay (RT-QuIC) [21,31] allows for amplification of minute, previously undetectable concentrations of infectious prions or their markers to levels detectable in samples such as faeces, urine, saliva, and blood [4,24,25,33,34], although the sensitivity of detection varies among sample types. Additionally, RT-QuIC has demonstrated promise for increased CWD detection in MRPLN and RAMALT samples [35,36]. Like all diagnostic assays, PMCA and RT-QuIC have aspects that make their implementation challenging, such as the level of technical expertise required for reliable results, and the generation of high-quality assay substrate [25,37].
Detection of CWD, particularly in ante-mortem samples, is significantly impacted by genetic variability in the prion protein gene [11,38]. Codon 96 in WTD influences the propagation of the infectious isoform, with wild-type glycine/glycine (GG) animals having the shortest incubation period, followed by glycine/serine (GS) and serine/serine (SS) individuals. Detection of CWD can be challenging in GS and SS animals; therefore, it is necessary to evaluate the sensitivity of ante-mortem assays in all three genotypes.
Breath and faecal volatile organic compound (VOC) analyses have been explored as non-invasive methods of disease detection. VOCs are organic chemicals that enter a gaseous phase at low temperature, and are produced anthropogenically and biologically by all plants, animals, and microbes. Animals (including humans) produce VOCs via dietary and metabolic pathways, in response to immunologic or inflammatory stimulation, and via host-pathogen interactions. Such VOCs are present in biofluids, breath, and faeces. In humans, a validated breath VOC assay is used to detect Helicobacter pylori infection [60,61], and breath, biofluid, and faecal analyses are being explored for diagnosis of metabolic, neoplastic, and infectious disease; dementia; and organ transplant success [39-43]. In domestic and wild ruminants (e.g., cattle (Bos taurus), goats (Capra aegagrus hircus), WTD, bison (Bison bison)), breath and faecal VOC analyses have been used to detect ketosis; bovine tuberculosis; brucellosis; bovine respiratory disease; and Johne's disease [44-50]. In other species, VOC analysis of serum has been explored for detection of bovine tuberculosis [51].
Development of a method to detect a suite of CWD-specific faecal VOCs would be valuable as a means to detect this disease ante-mortem. Access to portable gas chromatography/mass spectrometry (GCMS) or development of lateral flow assays could feasibly allow on-site testing, and offer lowered cost, increased labour efficiency, and test repeatability. Previously, we have demonstrated that breath and faecal VOCs can be used to discriminate between healthy cattle and cattle experimentally infected with virulent Mycobacterium bovis [44]; healthy non-vaccinated and M. bovis Bacille Calmette-Guérin (BCG)-vaccinated WTD [50] and cattle [49] prior to and after virulent M. bovis challenge; and healthy versus Brucella abortus-infected bison [48]. In this study, we explore VOC analysis of faeces as a means to discriminate between CWD-infected, -exposed, and -negative WTD.

Results
The XCMS Online analysis identified 1994 statistically significant ions from a pool of 5265 total ions. Statistically significant ions were GC column retention time matched to 183 total ion chromatographic (TIC) peaks. After excluding peaks potentially associated with diet (n = 23) or failing to meet the selection criteria (n = 153), a suite of seven candidate peaks remained for statistical analysis. Results of the principal component analysis (PCA) classification performed using all six treatment groups (Figure 1) identified nine animals from confirmed negative Herd 1 (NN1) and four animals from confirmed negative Herd 2 (NN2) within one cluster located close to a second cluster containing five NN2 individuals. One NN1 individual, one CWD-negative exposed individual from Herd 3 (confirmed positive farm; NE3), and two confirmed CWD-positive individuals from Herd 3 (PP3) lie apart from the clusters in the plot and represent outliers. The remaining NE3 and PP3 animals, one NN2 animal, and all Herd 4 CWD-positive (PP4) and negative exposed (NE4) animals are closely associated but form distinct clusters, which is more readily observed when this area of the scatterplot is enlarged and drop-lines are added to better demonstrate the location of the animals within three-dimensional space (Figure 2). In the enlarged figure (Figure 2), one NN2 individual can be visualized in three-dimensional space between other NN2 individuals and the enlarged clusters of NE and PP animals. One PP3 individual is distinct from all other known positive and negative exposed animals. The remaining PP3 animals and four NE3 animals form a cluster. The remaining NE3 animals (n = 4) form a separate distinct cluster. Three confirmed positive animals from Herd 4 (PP4) form a distinct cluster. The remaining PP4 individual is located in the cluster containing all NE4 animals.
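The peak-selection funnel reported above (183 matched peaks reduced to seven candidates) can be illustrated with a minimal Python sketch. It is not the authors' workflow: the `Peak` record, its field names, and the example values are hypothetical, and the fold difference is assumed here to be the ratio of the largest to the smallest non-zero group mean. The two filters shown are the ones named in the Data analysis section (exclusion of peaks exclusive to one sampling location, and a between-group fold difference of at least 3.0); the biological-relevance criterion is not modelled.

```python
from dataclasses import dataclass

FOLD_THRESHOLD = 3.0  # between-group fold-difference criterion quoted in the text


@dataclass
class Peak:
    """Hypothetical record for one TIC peak (field names are illustrative)."""
    retention_time_min: float
    mean_area_by_group: dict   # cohort label -> mean peak area
    herds_detected: frozenset  # herds in which the peak was observed


def max_fold_difference(peak):
    """Ratio of the largest to the smallest non-zero group mean (assumed definition)."""
    areas = [a for a in peak.mean_area_by_group.values() if a > 0]
    return max(areas) / min(areas) if len(areas) >= 2 else 0.0


def select_candidates(peaks):
    """Drop peaks seen at a single location, then apply the fold criterion."""
    return [
        p for p in peaks
        if len(p.herds_detected) > 1 and max_fold_difference(p) >= FOLD_THRESHOLD
    ]


# Hypothetical example values, not data from the study.
example_peaks = [
    Peak(8.42, {"PP": 900.0, "NE": 450.0, "NN": 150.0}, frozenset({1, 2, 3, 4})),
    Peak(12.10, {"PP": 500.0, "NE": 480.0, "NN": 400.0}, frozenset({1, 2, 3, 4})),
    Peak(15.77, {"PP": 800.0, "NE": 100.0, "NN": 0.0}, frozenset({3})),
]

candidates = select_candidates(example_peaks)
print([p.retention_time_min for p in candidates])

# Bookkeeping check of the counts reported in the text.
print(183 - 23 - 153)
```

Only the first hypothetical peak passes both filters; the second fails the fold criterion and the third is exclusive to one herd. The final line reproduces the reported arithmetic: 183 matched peaks minus 23 diet-associated and 153 failing the criteria leaves seven.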
Genetics at codon 96 did not appear to influence the sample distribution, as samples fell into their disease status group regardless of their genotype (data not shown). Six-class linear discriminant analysis (LDA) classification models were developed using four through seven principal components (PCs) based on the individual and accumulated proportional values of each PC score (Table 1). The optimal model, constructed using seven PC scores (100% of data), returned the lowest misclassification rate for the combined data (8%); group misclassification model (Positives = 7%, Negative Exposed = 18%, Known Negatives = 0%); and individual cohort assessment model (PP3 = 10%; PP4 = 0%, NE3 = 10%, NE4 = 20%, NN1 = 0%, NN2 = 0%). In all models, no PP individuals were misclassified as NN animals; misclassifications consisted entirely of NE animals. Negative exposed animals were misclassified in the optimal model as NN, whereas in the models constructed with four or five PCs they were classified as either PP or NN (no NE misclassifications occurred in the model constructed with six PCs). Negative individuals were correctly classified in the optimal model, but were misclassified as PP in the models constructed with four through six PCs. When NE animals are grouped with NN individuals, calculated SN:SP for the classification models constructed with four to seven PCs are 86%:89%; 93%:89%; 93%:89%; and 93%:95%, respectively. When NE animals are grouped with the PP animals, calculated SN:SP are 93%:90%; 97%:90%; 97%:90%; and 97%:100%, respectively.
Figure 1. Three-dimensional PCA scatterplot of CWD-positive, -negative exposed, and -negative deer. All known negative animals from Herd 1 (NN1; green dots) and one from Herd 2 (NN2, black dot) are located in a cluster (green) closely associated with all other NN2 individuals (black cluster). Three Herd 3 CWD-positive (PP3, red dots) and one -negative exposed individual (NE3, blue dot) are not associated with clusters in the plot and represent outliers. Remaining PP3 and NE3 animals and Herd 4 (PP4, pink dots; NE4, aqua dots) animals form closely associated clusters, with the exception of four NE3 animals found within or in close association to the PP3 cluster.
Figure 2. Enlarged view of the area of the PCA scatterplot containing Herd 3 and 4 CWD-positive and -negative exposed animals. CWD-positive animals from Herd 3 (PP3, red dots) form a distinct cluster containing three negative exposed animals (NE3) from that herd. Remaining NE3 animals form a distinct cluster with the exception of one individual found adjacent to the PP3 cluster. Herd 4 CWD-positive (PP4; pink dots) and -negative exposed (NE4; aqua dots) animals form separate clusters. The close proximity of these clusters indicates that there are some distinct similarities between the cohorts, yet differences between the groups do exist. The three NE3 animals located within the PP3 cluster, and the one NE3 individual located near that cluster, may represent animals that were incorrectly classified by our analysis, or may be positive animals infected with a prion burden so low that prion was not detected in the post-mortem IHC analysis performed on the submitted tissues.

Discussion
The 'holy grail' of VOC analysis for disease detection, regardless of the sample used, is identification of a disease-specific biomarker. This has happened occasionally, with most successes occurring relative to metabolic diseases [40,66]; however, there has been little success identifying a unique biomarker paired to a specific infectious disease. Because the sources of infectious disease-related VOCs are difficult to determine, it has been hypothesized that detected VOCs may represent some ubiquitous metabolic or immunological response to disease. Based upon the literature and metabolomics database searches we conducted, many of the compounds we tentatively identified are related to metabolic function and synthesis of endogenous substances. Some appear to have immunological function, serve as biomarkers for oxidative damage, and occur in altered concentrations relative to the presence of neurological, metabolic, and infectious disease. In previous studies, we have successfully identified suites of VOCs that allowed discrimination between healthy and M. bovis-infected cattle [44,49] and WTD [50], and healthy vs. B. abortus-infected American bison (B. bison) [48]. Such was the case in this study. Interestingly, five compounds used for discrimination between cohorts in this study (1-butanol, 2-propyl phenol, 6,6-dimethoxy-2,5,5-trimethyl 2-hexene, hexanoic acid, isohexanol) were not present in the suites of faecal VOCs we used to differentiate between healthy and M. bovis-infected cattle and WTD. This finding suggests that disease-specific suites of VOCs may exist, and represents an area of study that should be further explored before such a relationship can be claimed. This manuscript summarizes the first study exploring faecal VOC analysis as a method to discriminate between CWD-positive, -exposed, and -negative WTD, with inference to use of this modality as an ante-mortem test for CWD surveillance of captive, farmed, and wild ungulate populations.
Data were analysed using two statistical methods (PCA and LDA). The strength of PCA as a statistical tool is its independence in pattern recognition, as the analysis occurs independent of treatment group designation. As such, PCA was utilized first to transform the highly variable chromatographic data into orthogonal, linearly uncorrelated data (i.e., PC scores), and to generate a visual representation of the data (i.e., the PCA scatterplot). The scatterplot then visually presents the WTD as grouped by the PCA without regard for any treatment group designation. The LDA was performed using our treatment group designations, with the intent to model the difference or similarity between those groups. Results are identified as animals correctly classified to their respective treatment group or misclassified into another treatment group.
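To make the PCA step concrete, the sketch below computes PC scores for a small peak-area table and checks that the resulting scores are linearly uncorrelated. It is a minimal illustration under stated assumptions, not the analysis used in the study (which was run in R): the peak-area values are hypothetical, the function names are invented for this example, and the eigenvectors are obtained by simple power iteration with deflation rather than by a library routine. The subsequent LDA step is not shown.

```python
import math


def leading_eigenpair(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    p = len(matrix)
    v = [float(i + 1) for i in range(p)]  # arbitrary deterministic start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    value = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p))
    return value, v


def pca_scores(rows, n_components):
    """Return (scores, explained variance ratios) for a samples x peaks table."""
    n, p = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    total_variance = sum(cov[i][i] for i in range(p))
    components, explained = [], []
    for _ in range(n_components):
        value, vector = leading_eigenpair(cov)
        components.append(vector)
        explained.append(value / total_variance)
        # Deflation: remove the component just found before seeking the next one.
        cov = [[cov[i][j] - value * vector[i] * vector[j] for j in range(p)]
               for i in range(p)]
    scores = [[sum(c * x for c, x in zip(vec, row)) for vec in components]
              for row in centered]
    return scores, explained


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical peak areas: six animals (rows) by three peaks (columns).
peak_areas = [
    [1.0, 2.1, 0.5], [1.2, 2.4, 0.4], [0.9, 1.9, 0.6],
    [3.0, 6.2, 1.5], [3.3, 6.5, 1.4], [2.9, 5.8, 1.7],
]
scores, explained = pca_scores(peak_areas, 2)
pc1 = [s[0] for s in scores]
pc2 = [s[1] for s in scores]
print("explained variance ratios:", explained)
print("covariance of PC1 and PC2 scores:", covariance(pc1, pc2))
```

The covariance between the PC1 and PC2 scores is zero up to rounding error, which is what "orthogonal, linearly uncorrelated" means in practice. Group labels never enter the calculation, which is why the PCA scatterplot can be read as an unsupervised view of the data.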
Misclassification of negative exposed individuals as CWD-positive can be interpreted as beneficial (i.e., it is better to classify animals with a subclinical, highly infectious fatal disease as positive) relative to disease control. These misclassifications could be due to failure of our classification models, or it is possible that these individuals were CWD-positive and were incorrectly classified prior to our study if their infectious prion burden was low and the post-mortem IHC was performed on MRPLN or obex tissue that did not happen to contain detectable infectious prion. Efforts to control for genetic, local environmental, and dietary effects were made when selecting the suite of VOCs used in our final analysis; however, it is likely that some genetic, geographical, environmental, and dietary effects did influence the changes noted in the VOCs used. Because housing large groups of WTD infected with a chronic infectious disease under tightly controlled environmental and dietary conditions for long periods of time is difficult and expensive, reliance on 'real-world' scenarios is often the route required to acquire CWD samples. While this potentially confounds our results, it does represent testing of 'real-world' samples, and our results demonstrate robust potential for our analysis method. To further understand the potential of faecal VOC analysis as a means to detect CWD presence or absence, more blinded studies should be undertaken to increase the number of animals, disease statuses, and geographic localities from which samples are drawn and to increase the database.
While our use of three-group diagnostic classification does not compare strictly with standard estimates of SN and SP, some comparisons with other assays are possible (Table 4). The published SN:SP for ante-mortem IHC on tonsil or RAMALT biopsies are 99%:100% and 68%:99% [11,67-69], indicating that 1-32% of CWD-positive animals would be incorrectly identified as CWD-negative, while 0-1% of CWD-negative animals would be identified as CWD-positive. The SN:SP for the RT-QuIC assay performed on RAMALT is reported as 70%:94% [11,69], indicating that 30% of CWD-infected animals and 6% of disease-free animals would be identified as false negative and false positive, respectively. When RT-QuIC and/or PMCA were performed on nasal brushings, biofluids, or faeces, resulting SN:SP ranges were 16-93%:96-100% [11,69-74] (i.e., 7-84% of CWD-positive individuals tested false negative; 0-4% of CWD-negative animals tested false positive) depending on the assay and sample used. Calculated SN:SP in this study when CWD-negative exposed animals were grouped with CWD-negatives ranged 86-93%:89-95% (i.e., 7-14% of CWD-positive animals and 5-11% of CWD-negative animals were incorrectly identified as false negative and false positive, respectively). When CWD-negative exposed animals were included in the CWD-positive group, the SN:SP ranged 93-97%:90-100%. In this calculation, all CWD-positive animals were correctly identified, and 0-10% of CWD-negative individuals were incorrectly identified as false positive. A possible explanation for the difference in the results may be genetic, geographical, or dietary effects that could not be controlled for and that changed the SN:SP in the second calculation method. An unfortunate limitation to all of these assays is the difficulty in accurately assessing the disease status of negative exposed individuals.
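The false-negative and false-positive percentages quoted above are simply the complements of the SN and SP ranges. The short Python sketch below reproduces that arithmetic for the assays where the text states both figures; the dictionary labels and the function name are illustrative, and the SN:SP values are taken directly from the paragraph above.

```python
def error_rate_range(low_pct, high_pct):
    """Complement of a percentage range, e.g. SN 68-99% -> false negatives 1-32%."""
    return (100 - high_pct, 100 - low_pct)


# (SN range, SP range) in percent, as quoted in the text.
reported = {
    "IHC, tonsil or RAMALT biopsy": ((68, 99), (99, 100)),
    "RT-QuIC, RAMALT": ((70, 70), (94, 94)),
    "RT-QuIC/PMCA, nasal brushings, biofluids, or faeces": ((16, 93), (96, 100)),
    "Faecal VOC, NE grouped with negatives": ((86, 93), (89, 95)),
}

for assay, (sn, sp) in reported.items():
    fn_low, fn_high = error_rate_range(*sn)
    fp_low, fp_high = error_rate_range(*sp)
    print(f"{assay}: false negative {fn_low}-{fn_high}%, "
          f"false positive {fp_low}-{fp_high}%")
```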
An interesting finding in this study was that misclassified CWD-negative exposed animals were identified as CWD-positive or -negative individuals. This finding opens questions of whether VOC analysis is capable of discriminating between subclinically infected and true negative individuals in a positive herd, how this testing strategy might compare to RT-QuIC and PMCA results, and whether the disease status of such individuals might change over time. Unfortunately, following the negative exposed animals over time was not an option for this study, but this could be a potential area to evaluate in the future.
The results of this preliminary study exploring use of faecal VOC analysis as a means to discriminate between CWD-positive, -negative exposed, and -known negative animals are encouraging. The sample size used in this pilot study was small; therefore, additional studies should utilize larger sample sizes in order to test the robustness of this method as a potential diagnostic tool. All of the CWD-positive deer in this study were positive in both the MRPLN and the obex, indicating that the disease was significantly progressed in those individuals. Other ante-mortem diagnostic assays, such as PMCA performed on blood samples, have also been successful in detecting CWD in deer that were positive in both locations. Animals that are CWD-positive only in the MRPLN are in earlier stages of the disease course and are therefore of the most interest for early CWD detection; however, detection of infectious prion by PMCA and other assays is significantly reduced in such animals [34]. Further studies utilizing VOC testing must include animals that are positive in the MRPLN only, as well as animals positive in both the MRPLN and obex, for comparison. Additionally, because CWD has been detected in faecal samples by PMCA and RT-QuIC [73,74], it would be very informative to use faecal VOC analysis in tandem with one or both of these assays for comparison. Should faecal VOC analysis prove robust in discriminating between CWD-positive and -negative animals, and sensitive enough to detect subclinical infection in negative exposed individuals, it would provide a powerful tool for disease detection and management. This assay would also vastly improve the ability of wildlife managers to perform wild cervid CWD surveillance from environmental samples and reduce reliance on hunter-harvested or lethal sampling.

Sample collection
In cooperation with state and federal agencies, faecal samples were opportunistically collected post-mortem from 51 farmed WTD at four different locations (Table 5). Herds 1 and 2 were located at farms confirmed free from CWD. Herds 3 and 4 were located at confirmed CWD-positive farms with prevalence rates of >50% and <20%, respectively. All herds were separated geographically and environmentally from each other, and fed different diets relative to location and owner preference. Animals in the CWD-positive herds were depopulated for disease control purposes, then tested for CWD via IHC of the MRPLN and obex by the United States Department of Agriculture (USDA) National Veterinary Services Laboratory in Ames, Iowa, USA, as previously described [38], and genotyped at codon 96 by GeneCheck™ [75]. All faeces were placed in individual 50 mL conical tubes and stored on ice until transport to the USDA-Animal and Plant Health Inspection Service-Wildlife Services-National Wildlife Research Center, where they were stored at −80°C until analysis.
Table 5. Herd identification, animal classification, CWD status, and the number of animals from each sample group used in the study.
Each sample was warmed to room temperature and an internal standard (0.010 mL, 70 ppm (+)-carvone in water) was added prior to randomized placement into a GC 120 PAL autosampler (Agilent Technologies, Santa Clara, CA, USA). Samples were pre-incubated for 10 min at 37°C with pulsed agitation (250 RPM for 5 s, off for 2 s), followed by extraction of vial headspace VOCs using a solid-phase microextraction (SPME) fibre (StableFlex™ 2 cm divinylbenzene/carboxen/polydimethylsiloxane (DVB/CAR/PDMS), Supelco Inc., Bellefonte, PA, USA). Extraction time was 40 min at 37°C with pulsed agitation. Following VOC extraction, the SPME fibre was inserted into the splitless injection port (270°C with 0.75 mm ID, ultra-inert straight liner) equipped with a 23 ga Merlin Microseal™ septum (Merlin Instrument Company, Half Moon Bay, CA, USA) in an Agilent 7890B GC (Agilent Technologies) for desorption for 1 min. A Stabilwax®-DA 30 m × 0.25 mm ID × 0.25 µm film thickness column (Restek Corporation, Bellefonte, PA, USA) was used with helium carrier gas in constant flow mode (1.0 mL/min). The GC oven temperature was held at 35°C for 2.5 min, increased at 6.0°C/min to 260°C, and then held for 5 min. The GC was coupled through a 280°C transfer line with an Agilent 5977A mass selective detector (MSD) equipped with an extractor electron impact source operated at 230°C. The MSD quadrupole was operated at 150°C and the scan range was 50-500 m/z.
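The oven program described above fixes both the total chromatographic run time and the oven temperature at any retention time. The Python sketch below works through that arithmetic; the function names are illustrative and the parameter values are those stated in the text (35°C for 2.5 min, 6.0°C/min to 260°C, 5 min final hold).

```python
def oven_program_minutes(initial_c, initial_hold_min, ramp_c_per_min,
                         final_c, final_hold_min):
    """Total run time of a single-ramp GC oven program, in minutes."""
    ramp_min = (final_c - initial_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


def oven_temperature_at(t_min, initial_c=35.0, initial_hold_min=2.5,
                        ramp_c_per_min=6.0, final_c=260.0):
    """Oven temperature (degrees C) at a given time after injection."""
    if t_min <= initial_hold_min:
        return initial_c
    ramped = initial_c + ramp_c_per_min * (t_min - initial_hold_min)
    return min(ramped, final_c)


total_run_min = oven_program_minutes(35.0, 2.5, 6.0, 260.0, 5.0)
print(total_run_min)              # 2.5 + 225 / 6.0 + 5 = 45.0 min
print(oven_temperature_at(10.0))  # 35 + 6.0 * 7.5 = 80.0 degrees C
```

Under these settings each chromatographic run lasts 45 min, in addition to the 10 min pre-incubation, 40 min extraction, and 1 min desorption steps.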

Data analysis
Chromatographic data were analysed as described in Ellis et al. [49]. Briefly, baseline-corrected chromatograms were first evaluated using the XCMS Online multi-group comparison feature [49,77,78] (www.xcmsonline.scripps.edu) to identify statistically significant peak ion abundances, which were then retention time matched to TIC peaks present in each sample chromatogram using Agilent ChemStation software (Agilent Technologies). Peaks exclusive to location of sampling were excluded to remove potential dietary sources of compounds, and an optimal suite of peaks was identified using peak selection criteria (e.g., between-group fold differences ≥3.0; biological relevance). A PCA was performed on the optimal suite of peaks to transform the chromatographic data into orthogonal (linearly uncorrelated) PC scores and to generate a visual scatterplot of the data. The PC scores were then utilized in an LDA to evaluate the capability of the selected suite of peaks to discriminate between the study subjects and correctly identify their CWD status and herd designation. All statistical analyses were performed using statistical packages available in 'R' [79,80]. Sensitivity (SN) and specificity (SP) were calculated using the standard formulas [81,82]: SN = TP / (TP + FN) and SP = TN / (TN + FP), where TP, FN, TN, and FP denote the numbers of true-positive, false-negative, true-negative, and false-positive classifications, respectively. Because the 'true' disease status of the negative exposed animals from Herds 3 and 4 (NE3, NE4) could not be confidently assumed, SN:SP calculations were performed first by including them with the Herds 1 and 2 known negative animals (NN1, NN2) as 'true negatives', and second, by including them with known positive animals in Herds 3 and 4 (PP3, PP4) as 'true positives'. Automated Mass Spectral Deconvolution and Identification System software [83,84] (www.amdis.net); a standard chemical database (National Institute of Standards and Technology W8N08 (www.nist.gov)); and two metabolomics databases (Kyoto Encyclopedia of Genes and Genomes (KEGG) [56,85] and the Human Metabolome Database (HMDB; www.hmdb.org)) were used to tentatively identify each peak.
Peaks meeting minimum spectral match probability ≥65% were further evaluated using KEGG, HMDB, and peer-reviewed literature to determine if the tentatively identified compounds were associated with ruminant physiology or prion-associated disease. Chemical standards were not used to definitively identify selected VOCs due to cost and lack of access to a chemical standards library.
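The two SN:SP calculations described above, with negative exposed animals counted first as negatives and then as positives, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' R code: it uses the standard definitions SN = TP / (TP + FN) and SP = TN / (TN + FP), the classification records are hypothetical, and a predicted 'NE' label is assumed to be regrouped in the same way as a true 'NE' label.

```python
POSITIVE, EXPOSED, NEGATIVE = "PP", "NE", "NN"


def sn_sp(records, exposed_as):
    """Sensitivity and specificity after folding NE into PP or NN.

    records: iterable of (true_group, predicted_group) pairs.
    exposed_as: POSITIVE or NEGATIVE, the group that NE animals are merged into.
    """
    def is_positive(group):
        if group == EXPOSED:
            group = exposed_as
        return group == POSITIVE

    tp = fn = tn = fp = 0
    for truth, predicted in records:
        truth_pos, predicted_pos = is_positive(truth), is_positive(predicted)
        if truth_pos and predicted_pos:
            tp += 1
        elif truth_pos:
            fn += 1
        elif predicted_pos:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical classification results, not data from the study.
records = (
    [(POSITIVE, POSITIVE)] * 4 + [(POSITIVE, EXPOSED)] * 1
    + [(EXPOSED, EXPOSED)] * 3 + [(EXPOSED, NEGATIVE)] * 1
    + [(EXPOSED, POSITIVE)] * 1
    + [(NEGATIVE, NEGATIVE)] * 5
)

sn_neg, sp_neg = sn_sp(records, exposed_as=NEGATIVE)
sn_pos, sp_pos = sn_sp(records, exposed_as=POSITIVE)
print(f"NE as negatives: SN={sn_neg:.0%}, SP={sp_neg:.0%}")
print(f"NE as positives: SN={sn_pos:.0%}, SP={sp_pos:.0%}")
```

With these hypothetical records, the same set of classifications yields different SN:SP pairs depending on how the negative exposed animals are grouped, which is why both calculations are reported.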

Disclosure of potential conflicts of interest
No potential conflicts of interest were reported.

Funding
This study was funded internally by USDA-Animal and Plant Health Inspection Service (APHIS). No funding conflicts of interest are present.