The longitudinal cerebrospinal fluid metabolomic profile of amyotrophic lateral sclerosis

Neurochemical biomarkers are urgently sought in ALS. Metabolomic analysis of cerebrospinal fluid (CSF) using proton nuclear magnetic resonance (1H-NMR) spectroscopy is a highly sensitive method capable of revealing nervous system cellular pathology. The 1H-NMR CSF metabolomic signature of ALS was sought in a longitudinal cohort. Six-monthly serial collection was performed in ALS patients across a range of clinical sub-types (n = 41) for up to two years, and in healthy controls at a single time-point (n = 14). A multivariate statistical approach, partial least squares discriminant analysis, was used to determine differences between the NMR spectra from patients and controls. Significantly predictive models were found using those patients with at least one year's interval between recruitment and the second sample. Glucose, lactate, citric acid and, unexpectedly, ethanol were the discriminating metabolites elevated in ALS. It is concluded that 1H-NMR captured the CSF metabolomic signature associated with derangements in cellular energy utilization connected with ALS, and was most prominent in comparisons using patients with longer disease duration. The specific metabolites identified support the concept of a hypercatabolic state, possibly involving mitochondrial dysfunction specifically. Endogenous ethanol in the CSF may be an unrecognized novel marker of neuronal tissue injury in ALS.


Introduction
Amyotrophic lateral sclerosis (ALS) is a heterogeneous, progressive and uniformly fatal neurodegenerative disease characterized by clinically variable loss of upper and lower motor neurons and associated frontotemporal cerebral pathways. Over the past decade, a number of candidate neurochemical biomarkers have emerged from the analysis of serum and cerebrospinal fluid (CSF) (1,2). Nevertheless, there remains a need for biomarkers that are sensitive for diagnosis in those with atypical clinical features, capable of prognostic stratification and of measuring therapeutic response in ALS, and informative about the pathophysiology of the condition.
Characterization of the metabolome offers the potential to develop a disease-specific signature that facilitates sub-group stratification, as well as providing potentially novel insights into deranged biochemical pathways at close proximity to neuropathological tissue. High-throughput metabolomic [...]
ELIZABETH GRAY 1*, JAMES R. LARKIN 2*, TIM D. W. CLARIDGE 3, KEVIN TALBOT 1, NICOLA R. SIBSON 2 & MARTIN R. TURNER 1
[...] sporadic ALS patients. In addition to the potential for biomarker development, determining the identity of the most discriminant molecules, compared with healthy controls and between fast and slow progressing patients, offers potential for the identification of novel pathogenic mechanisms.

Participants
Participants were recruited as part of The Oxford Study for Biomarkers in Motor Neuron Disease (BioMOx). CSF samples were obtained serially every six months from prevalent and incident cases of ALS attending The Oxford Motor Neuron Disease Centre, and diagnosed by two experienced neurologists (MRT, KT) according to standard criteria (12). All participants in this analysis were apparently sporadic ALS patients (i.e. not reporting a family history of ALS or frontotemporal dementia). Patients were excluded if they suffered from any other significant medical disorder such as diabetes. Healthy volunteers without significant past medical history (spouses and friends of patients) were recruited for a single time-point sampling only. All participants were capable of providing informed consent. The study was approved by the South Central Oxford Ethics Committee B.
Patients were examined at each visit (MRT). Disease duration was calculated in months from symptom onset to date of sampling. Disability was assessed using the revised ALS Functional Rating Score (ALSFRS-R, 0–48, lower score reflects greater disability). Progression rate was calculated in ALSFRS-R score decrease per month as: (48 minus ALSFRS-R)/(disease duration).
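The progression-rate formula above can be written as a short function. This is a minimal sketch; the function name and the input checks are illustrative assumptions and are not taken from the study.

```python
def progression_rate(alsfrs_r: float, disease_duration_months: float) -> float:
    """ALSFRS-R points lost per month since symptom onset.

    Implements (48 - ALSFRS-R) / (disease duration in months).
    The range checks below are an assumption added for safety.
    """
    if not 0 <= alsfrs_r <= 48:
        raise ValueError("ALSFRS-R must lie between 0 and 48")
    if disease_duration_months <= 0:
        raise ValueError("disease duration must be positive")
    return (48 - alsfrs_r) / disease_duration_months


# A patient scoring 36 at 24 months from onset has lost 12 points,
# i.e. 0.5 points per month.
rate = progression_rate(36, 24)
```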

NMR spectroscopy
Proton NMR spectroscopy was performed as previously described (13). A 700-MHz NMR system (Bruker Avance III equipped with a 1H TCI cryoprobe) was used to acquire 1H NMR spectra from each sample. A 1D NOESY pre-saturation sequence, with solvent pre-saturation during the relaxation delay (2 s) and mixing time (10 ms), was used for all samples. Automatic baseline correction was performed on all 1D spectra using a 3rd-order polynomial (Topspin 3.2) and all spectra were manually corrected for phase distortion. To assist with metabolite identification, two-dimensional Correlation Spectroscopy (COSY) 1H NMR spectra were acquired from one sample within each group. Acquisition of COSY spectra was performed with 1.5-s solvent pre-saturation, a spectral width of 10 ppm (7002 Hz), and 16 or 32 transients per t1 increment for 256 increments. All NMR spectra were acquired at 293 K.

Data analysis
Prior to data pre-processing, non-linear peak alignment (14) was performed on each 1D 1H spectrum. Subsequently, spectra were sub-divided into 0.02 ppm regions (δ = midpoint of integral region) and integrated between 0.2 and 9.6 ppm using a custom MATLAB (MathWorks, Inc.) script. This reduced the spectra to ∼435 independent variables. The region between 4.30 and 5.00 ppm was highly variable due to imperfect water suppression and was excluded to minimize unwanted deterioration of spectral quality. Initially, a singlet at 3.36 ppm was identified as a significant contributor to a number of models. Spiking and HSQC (heteronuclear single quantum coherence) spectroscopy confirmed this peak to be methanol, a common laboratory contaminant from cleaning glassware. Consequently, the spectral region from 3.34 to 3.38 ppm was excluded from all analyses. PLS-DA was applied to the data following Pareto scaling to suppress noise.
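The bucketing and scaling steps can be illustrated with a short Python sketch. It is not the authors' MATLAB script: the assumption that buckets start on the 0.20 ppm edge, the function names, and the use of the sample standard deviation are all illustrative. Pareto scaling is taken here in its usual sense of mean-centring each variable and dividing by the square root of its standard deviation.

```python
import math
import statistics

# All positions are held in hundredths of a ppm to avoid float rounding.
BUCKET_WIDTH = 2                  # 0.02 ppm
START, END = 20, 960              # 0.2 to 9.6 ppm
WATER = (430, 500)                # 4.30 to 5.00 ppm, imperfect water suppression
METHANOL = (334, 338)             # 3.34 to 3.38 ppm, methanol contaminant


def kept_bucket_midpoints(excluded=(WATER, METHANOL)):
    """Return midpoints (ppm) of the 0.02 ppm buckets outside excluded regions."""
    midpoints = []
    for lower_edge in range(START, END, BUCKET_WIDTH):
        mid = lower_edge + BUCKET_WIDTH // 2
        if any(lo < mid < hi for lo, hi in excluded):
            continue
        midpoints.append(mid / 100)
    return midpoints


def pareto_scale(values):
    """Mean-centre one variable and divide by the square root of its SD."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mean) / math.sqrt(sd) for v in values]


n_after_water = len(kept_bucket_midpoints(excluded=(WATER,)))   # 435
n_after_both = len(kept_bucket_midpoints())                     # 433
```

Under these assumptions, 470 buckets span 0.2 to 9.6 ppm, removing the water region leaves 435 (consistent with the figure quoted above), and removing the methanol region leaves 433.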

Study design
Five models were built from the study CSF cohort (Figure 1). In each case only the most advanced sample, with respect to disease course, from each patient was used. The '≥ 24 months' model included ALS patients whose most advanced longitudinal sample was ≥ 24 months from study enrolment. The '≥ 18 months' model also included patients whose most advanced sample was ≥ 18 months from study enrolment, as did the '≥ 12 months', '≥ 6 months' and '≥ Baseline' models, which contained progressively greater sample numbers owing to follow-up attrition. By constructing the groups in this fashion it was possible to eliminate bias arising from inclusion of multiple samples from the same patient, while maximizing separation from the control cohort. A PLS-DA model was also built separating slow- and fast-progressing ALS patients according to a progression rate below or above 1 point decrease per month, respectively.
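The nested group construction described above can be sketched as follows. The patient identifiers, the sample times and the function names are hypothetical; the sketch only shows how one sample per patient is kept and how each threshold defines a progressively smaller subset.

```python
def most_advanced_sample(samples):
    """Map each patient to their latest sample time (months from enrolment)."""
    latest = {}
    for patient_id, months in samples:
        latest[patient_id] = max(months, latest.get(patient_id, months))
    return latest


def nested_cohorts(samples, thresholds=(0, 6, 12, 18, 24)):
    """For each threshold, list patients whose latest sample is at or beyond it."""
    latest = most_advanced_sample(samples)
    return {t: sorted(p for p, m in latest.items() if m >= t) for t in thresholds}


# Hypothetical follow-up records: (patient, months from enrolment).
samples = [
    ("P1", 0), ("P1", 6), ("P1", 12),
    ("P2", 0),
    ("P3", 0), ("P3", 6), ("P3", 12), ("P3", 18), ("P3", 24),
]
cohorts = nested_cohorts(samples)
# cohorts[0] holds all three patients; cohorts[24] holds only "P3".
```

Each patient contributes exactly one sample to any model, which is what removes the repeated-measures bias mentioned in the text.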

Statistical methods
For each comparison, a partial least squares discriminant analysis (PLS-DA) model was built to best explain differences between the variables for the groups being studied (SIMCA 13.0, Umetrics, Sweden). The q² value was calculated to determine the potential predictive nature of the models. The metric q² is derived from a stepwise cross-validation of the model, whereby a model generated by withholding one-seventh of the samples in seven successive simulations is used to predict group membership of the missing samples. A q² > 0 means the model is predictive, but q² > 0.4 is generally regarded as the threshold for significance in biological modelling (13).
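The sevenfold cross-validation behind q² can be sketched in plain Python. This is an illustration under stated assumptions, not a reproduction of SIMCA: it uses a one-component PLS regression on a 0/1 class label, assigns every seventh sample to the same fold, and takes the common definition q² = 1 − PRESS/TSS, where PRESS is the summed squared error of the held-out predictions and TSS the total sum of squares of the labels.

```python
import math


def fit_pls1(X, y):
    """Fit a one-component PLS model; return a function predicting y for a row."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: covariance of each variable with y, normalised to length 1.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    w = [v / norm for v in w]
    # Scores and the regression coefficient of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    tt = sum(v * v for v in t)
    c = sum(t[i] * yc[i] for i in range(n)) / tt if tt else 0.0

    def predict(row):
        score = sum((row[j] - x_mean[j]) * w[j] for j in range(p))
        return y_mean + c * score

    return predict


def q2(X, y, n_folds=7):
    """Cross-validated q2 = 1 - PRESS / TSS with every n-th sample held out."""
    n = len(y)
    press = 0.0
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        test = [i for i in range(n) if i % n_folds == fold]
        predict = fit_pls1([X[i] for i in train], [y[i] for i in train])
        press += sum((y[i] - predict(X[i])) ** 2 for i in test)
    y_mean = sum(y) / n
    tss = sum((v - y_mean) ** 2 for v in y)
    return 1 - press / tss


# Toy data: one variable that tracks the class label almost perfectly.
y = [0] * 7 + [1] * 7
X = [[label + 0.01 * (i % 3)] for i, label in enumerate(y)]
toy_q2 = q2(X, y)
```

Because every prediction is made for samples the model has not seen, q² falls towards or below zero when the apparent separation is driven by noise.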
Further validation was carried out using a pseudo-Monte Carlo method where 200 models were built using random group assignments. Models were considered significant where the genuine q² was higher than 95% of the randomly generated q² values. Buckets with a VIP (Variable Importance in Projection) score greater than 2 were considered to be the most important for model separation. To identify the metabolites contributing to the spectra in each bucket, the relevant resonances were identified using a combination of literature values, COSY spectroscopy, spiking, HSQC spectroscopy and reference to the human CSF metabolome database (15,16).
To assess differences in gender between ALS patients and control volunteers, Fisher's exact tests were performed. A one-way ANOVA with the appropriate post hoc test or Student's t-test was used to assess differences in age, CSF protein and glucose concentration between ALS patients and control volunteers.
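The random-assignment validation can be sketched generically. In this illustration the `score` argument stands in for the model's q², and the toy score used below (absolute difference in group means) is an assumption chosen only to keep the example self-contained; the function names and the fixed seed are likewise hypothetical.

```python
import random


def permutation_test(X, labels, score, n_permutations=200, seed=0):
    """Compare the genuine score with scores from shuffled group labels.

    Returns the genuine score, the fraction of permuted scores it exceeds,
    and whether that fraction reaches 95%.
    """
    rng = random.Random(seed)
    genuine = score(X, labels)
    shuffled = list(labels)
    exceeded = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)          # random group assignment, same group sizes
        if score(X, shuffled) < genuine:
            exceeded += 1
    fraction = exceeded / n_permutations
    return genuine, fraction, fraction >= 0.95


def mean_gap(X, labels):
    """Toy stand-in for q2: absolute difference between the two group means."""
    a = [x for x, g in zip(X, labels) if g == 0]
    b = [x for x, g in zip(X, labels) if g == 1]
    return abs(sum(a) / len(a) - sum(b) / len(b))


X = list(range(7)) + list(range(10, 17))
labels = [0] * 7 + [1] * 7
genuine, fraction, significant = permutation_test(X, labels, mean_gap)
```

Shuffling the labels keeps the group sizes fixed, so the permuted models differ from the genuine one only in which samples are called patients and which controls.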

Results
An example CSF spectrum with peaks of interest annotated is shown in Figure 2.

Participants and samples
In total, CSF samples were obtained from 41 ALS patients, five patients with the very slowly-progressive variant of primary lateral sclerosis (PLS) and 14 healthy controls (Table I). No overall group difference in the glucose or protein levels in CSF was found between groups. For each model, no significant differences were present with respect to the gender of ALS patients compared to control volunteers. A significant difference in age was present in the model comparing control samples with all baseline ALS patients.

CSF spectra multivariate statistical analysis
Initially, PLS-DA models were constructed separating control volunteers from each of the patient groups, excluding the slowly-progressive PLS patients (Figure 3A–E). The most significantly predictive model was that comparing control samples with the ≥ 12 months subset of ALS patients (q² = 0.51; Figure 3C). The score plot of this one-component PLS-DA model showed good discrimination between the two populations. Examination of the loadings revealed that glucose (a large number of buckets in the range δ = 3.25–3.91), lactate (δ = 1.33 and 4.11–4.13), citric acid (δ = 2.64–2.66), and ethanol (δ = 3.65–3.67 and 1.19) were all increased in ALS patients compared to controls.
Models built separating control volunteers from ALS patient samples taken ≥ 6 months and ≥ 18 months after baseline were also significantly predictive (q² = 0.41 and 0.45, respectively; Figure 3B,D). In each case, examination of the loadings revealed that glucose (a number of buckets in the range δ = 3.25–3.91 ppm), lactate (δ = 1.33 and 4.11–4.13), citric acid (δ = 2.64–2.66), and ethanol (δ = 3.65–3.67 and 1.19) were again increased in ALS patients compared to controls.
Figure 2. Example 1H NMR spectrum of CSF from an ALS patient with key metabolites identified.
A further model separating control volunteers from ALS patient samples taken > 5 years after disease onset was predictive (q² = 0.47). Examination of the loadings revealed that the same metabolites were increased in ALS patients with respect to control volunteers, with the exception of citric acid. Although this latter metabolite was still altered relative to control, its contribution to the positive model was considerably lower than previously.
Two other models, comparing control volunteers to ALS patient samples collected either at baseline or ≥ 24 months from baseline, although positive, failed to achieve significance (q² = 0.22 and 0.32, respectively; Figure 3A,E). Incorporation of samples from the five slowly-progressive PLS patients to each model, however, resulted in a significant separation between control volunteers and the ≥ 24 months ALS patient subset (q² = 0.42). All predictive models were successfully validated using the cross-validation embedded in a Monte-Carlo re-sampling approach.

Selected metabolite analysis
Since PLS-DA models are descriptive and not easily compared to other clinical statistics, individual metabolite information was collated. For each of the predictive models, metabolites with high variable importance scores had their integral areas summed for the buckets of relevance (Table II, Figure 4). For each metabolite a one-way ANOVA with Dunnett's post hoc test was used to determine if the metabolite abundance was different from control values. Lactic acid and citric acid were increased in all ALS groups with respect to control, while glucose was increased in all except the greater than five-year duration model. Ethanol was significantly increased in the ≥ 6 months and ≥ 12 months ALS patient groups but not increased significantly in the ≥ 18 months and duration ≥ 5 years models.

Discussion
This study demonstrated that proton NMR spectroscopy, together with PLS-DA, has the potential to distinguish ALS patient and healthy control CSF metabolite profiles across a longitudinal cohort. The statistical models generated showed significant separations when comparing the more advanced of the longitudinal patient samples. Metabolites identified as discriminating ALS patients from controls (including glucose, lactate, citric acid and ethanol) were common to all models, supporting the view that they reflect consistent and progressive perturbation of metabolic pathways. Validation of the descriptive PLS-DA models using ANOVA generally showed agreement between the methods when assigning significance to the metabolite changes. Differences between the methods can be attributed to the nature of the PLS-DA models, which calculate contribution to a model differently to the way the ANOVA calculates significance. The PLS-DA models show the contribution of the metabolite in the context of the complete set present in the biofluid, while the ANOVA works in isolation.
Increases in metabolites such as glycolytic and citric acid cycle intermediates, together with creatine, ascorbate, acetone, glutamate and β-hydroxybutyrate have been reported in other cross-sectional studies in ALS (3,4,8), together with reductions in the levels of histidine, threonine, and creatinine (4,8,17). Serum glutamate was reported to be linked to disease duration (4), which fits with a long-established concept of excitotoxicity in ALS pathogenesis (18). The CSF increases in glucose, lactate and citric acid identified in the current study suggest significant alterations in energy metabolism. Evidence for dis- [...] (20,21), and has been related to a broader concept of defective cellular energetic pathways (22). Such dysfunction may lead to increased energy requirements in the muscle and brain of ALS patients (23,24). This is predicted to result in increased production of citric acid, increased demand and consumption of carbohydrates such as glucose and increased production of glycolytic products such as lactate (8). A previously published study of the CSF metabolome in a larger number of ALS patients noted increases in pyruvate, also involved in energy metabolism, but not glucose or lactate as we report (3). The detection of ethanol as a discriminating metabolite in the models produced from CSF of subjects affected with ALS was unexpected and has not previously been described. None of the participants had consumed alcoholic beverages on the day of study. Although it is possible that there was systematically greater use of recreational alcohol among our ALS patient cohort compared to controls, this has not been reported in other epidemiological studies of ALS (and reduced alcohol consumption has been identified as a risk factor (25)). We also note that a large proportion of the control samples came from cohabiting spouses of patients.
To be certain that this ethanol was not a contaminant, a number of control experiments were performed (detailed in Supplementary information file to be found online at http://informahealthcare.com/doi/abs/10.3109/21678421.2015.1053490). In summary, no solvents, buffers or tubes were found to have ethanol contamination and any significant possibility of contamination from swabs was also eliminated.
We assume, therefore, that the observed peak reflects endogenous CSF ethanol, which is a well-established concept (26,27). Ethanol has been detected in the blood of people abstaining from alcohol consumption (28), and has been linked to metabolic disturbances, specifically involving mitochondria (29). The identification of endogenous ethanol might have particular relevance to ALS, where mitochondrial dysfunction is an area of intense research (19). Interestingly, endogenous ethanol was also detected in the CSF of patients with cervical myelopathy, an occasional mimic disorder of ALS (30). The authors of that study drew a potentially unifying conclusion that the appearance of CSF ethanol is a result of tissue injury. Despite these observations, it is duly noted that two other NMR studies of the CSF metabolome in ALS, while detecting ethanol in one (3), did not note a significant difference between groups in either (3,17). In a 'comprehensive' CSF metabolome study in seven essentially healthy individuals being screened for meningitis, ethanol was not a reported metabolite, suggesting it is not routinely present (31).
Despite the positive findings reported here, it is important to note that our study does not replicate the previously published study in ALS (3). Further validation will be needed in other cohorts. It is also possible that the relatively low number of healthy control CSF samples limited the power of the modelling approach to detect subtle group-level metabolomic differences, while recruitment of patients at many different stages of disability prevented standardization of the 'stage' of disease at each time-point, leading to greater noise in the data. At the same time, the progressive nature of ALS means that cohort studies tend to enrich for more slowly-progressive phenotypes, who are more likely to survive long enough to provide multiple sample time-points, in contrast to aggressive disease where only a baseline sample may have been provided. Consequently, the models with a greater gap between baseline and final sample contained progressively smaller numbers of patients, with a drop-off in separation from controls beyond the ≥ 12-months model. The subsequent improvement in separation between patients and controls in the ≥ 24-months model with the addition of the slowly-progressive PLS patients supports the view that statistical power was previously lacking. In contrast, the model with separation based on rate of disease progression did not reach significance. If our CSF metabolomic signature reflects a hypercatabolic state in ALS, then this might be evidence that it is inherent to the disease process, and not simply an effect of rapid accumulation of disability per se. Alteration in dietary intake associated with disease progression in ALS (e.g. gastrostomy supplementation) is another important potential confound not addressed in this study, but one that will need to be considered in future cohorts.

Conclusions
The CSF metabolomic findings in a longitudinal ALS cohort support a model of profound changes in cellular energy metabolism during the symptomatic phase of the disease. Whether the apparent metabolic perturbation is a primary or secondary response to the disease, and whether it is specific to ALS (as, for example, might be seen in cancer-related cachexia), is not yet clear. Extension to the study of CSF in pre-symptomatic carriers of high-risk genetic mutations linked to the development of ALS is an important future initiative that will further define the diagnostic potential of the approach (32). Similarly, prospective studies comparing ALS patient cohorts with established disease mimics including undiagnosed cases of acquired neuromuscular weakness will be key extensions to the current study. More broadly, it is possible that neurochemical surrogates of deranged cellular metabolism in ALS, in particular our novel observation of endogenous CSF ethanol, might provide valuable pharmacodynamic biomarkers in future therapeutic trials.
Table II. Data are means ± SD. * = p < 0.05; ** = p < 0.01; *** = p < 0.001, all relative to control values.