Advancing mechanistic understanding and biomarker development in amyotrophic lateral sclerosis

ABSTRACT
Introduction: Proteomic analysis has contributed significantly to the study of the neurodegenerative disease amyotrophic lateral sclerosis (ALS). It has helped to define the pathological change common to nearly all cases, namely intracellular aggregates of phosphorylated TDP-43, shifting the focus of pathogenesis in ALS toward RNA biology. Proteomics has also uniquely underpinned the delineation of disease mechanisms in model systems and has been central to recent advances in human ALS biomarker development.
Areas covered: The contribution of proteomics to understanding the cellular pathological changes, disease mechanisms, and biomarker development in ALS is covered.
Expert opinion: Proteomics has delivered unique insights into the pathogenesis of ALS and advanced the goal of objective measurements of disease activity to improve therapeutic trials. Further developments in sensitivity and quantification are expected, with application to the presymptomatic phase of human disease offering the hope of prevention strategies.


Introduction
Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease that causes progressive weakness due to death of ventral horn motor neurons within the spinal cord and pyramidal cells of the motor cortical areas [1]. ALS is aggressive, leading to death within 3 years of symptom onset in most cases, though its progression is highly variable [1]. Although most cases of ALS are sporadic, around 10% of cases report a family history; most of these, and a small proportion of sporadic cases, are attributable to variants in one of a handful of genes, though variants in over 40 genes have been implicated in ALS [2]. Beyond monogenic causes, ALS shows significant heritability in twin studies, and recent research indicates an oligogenic contribution to ALS susceptibility [3,4]. ALS has clinical and pathological overlap with frontotemporal dementia (FTD) and shares genetic risk, primarily through pleiotropic effects of hexanucleotide repeat expansion in an intronic region of the C9orf72 gene, which constitutes the most common monogenic cause of ALS [1,5].
The plethora of perturbations in intracellular pathways that has been implicated through different monogenic causes of ALS suggests that motor neuron degeneration occurs as the result of the final common pathway of many upstream molecular alterations such as defects in RNA processing, protein homeostatic processes, oxidative stress, and cytoskeletal perturbations [6,7]; epidemiological evidence suggests that there may be a summation of insults that lead to catastrophic neurodegeneration [8].
In common with other neurodegenerative diseases, neuronal loss accompanied by insoluble protein inclusions are core pathological features of ALS, occurring primarily in the motor cortical regions, brainstem motor nuclei, and ventral horn of the spinal cord [9]. Our knowledge of major aggregate components owes much to proteomics: the identification of TDP-43 as the major component of inclusions in over 95% of ALS cases (excepting those with genetic ALS due to SOD1 or FUS mutation) and 50% of FTD cases was achieved using liquid chromatography-tandem mass spectrometry (LC-MS/MS) of urea-soluble brain fractions [9].
The consequences of this landmark finding shifted etiological hypotheses of ALS toward mechanisms in which TDP-43 plays a central role, particularly relating to its functions in transcription, translation, and splicing, the stress response, mitochondrial function, and the inherent aggregation properties of TDP-43 that might contribute to non-cell autonomous mechanisms of neurodegeneration [6]. It also led to the identification of mutations in TARDBP, encoding TDP-43, as a cause of a small proportion of ALS cases, spawned novel ALS disease models, and led to refocusing of biomarker study toward measuring and characterizing full-length, truncated, and phosphorylated TDP-43 forms.
Whether ALS arises through altered protein-protein interactions leading to an aberrant stress response, excitotoxicity, impaired protein degradation leading to aggregation, changes in axonal transport or gene splicing, or non-cell autonomous mechanisms, proteins, as the major biological effector molecules, are implicated in every proposed pathological mechanism in ALS [6]. Proteomics is therefore ideally placed to disentangle these mechanisms by providing a platform to identify individual protein or coordinated protein network changes [10] that can be applied to tissue, disease models, and directly in ALS patients.
This review highlights the contribution of proteomics to the study of ALS, in pathological tissue specimens, in disease models, and in the biomarker field (highlighted in Figure 1). It also discusses the potential future contributions of proteomic techniques at the leading edge.

Proteomics of human tissue in ALS
Proteomic analysis of post mortem tissue from ALS patients has driven progress in our understanding of underlying disease mechanisms. Studies using proteomic techniques to analyze human ALS tissue are summarized in Table 1. Although protein aggregation has long been recognized as a core pathological feature of ALS, it was not until the publication in 2006 of LC-MS/MS analysis of urea-soluble, detergent-insoluble fractions from neuropathological tissue of patients with ALS and FTD that TDP-43 was identified as the major aggregate component in over 95% of ALS cases, including ubiquitinated, abnormally truncated, and hyperphosphorylated TDP-43 species [9,11]. Immunoblotting of urea-soluble fractions and immunohistochemistry indicate a pattern of full-length and truncated species, the latter mostly representing C-terminal fragments at 20-35 kDa, in brain tissue from ALS and FTD patients [12]; interestingly, this is a much less consistent finding in spinal cord [13]. Subsequent work has utilized proteomics to further define the nature of TDP-43 in aggregates.

Article highlights
• Proteomics has contributed across the field of ALS in study of post mortem tissue, disease models and identification of biomarkers
• Identification of TDP-43 as the major aggregate component in 95% of cases of ALS was brought about by proteomics as well as identifying disease-specific truncation and posttranslational modification of TDP-43
• Proteomics has helped delineate the effects of ALS-causing gene mutations on protein interactions and in cellular and animal models
• Major advances in ALS biomarkers have come about through the identification of chitinase protein alterations in cerebrospinal fluid of ALS patients
• Future advances are likely to include spatial proteomic methods, improvements in proteomic depth through data independent acquisition and through the study of 'at-risk' populations to understand early events in ALS

Three studies have used proteomic analysis to identify endogenously truncated TDP-43 peptides by identifying semi-tryptic or semi-chymotryptic N- and C-terminal TDP-43 peptides (i.e. TDP-43 specific peptides with one non-enzymatically digested terminus) in urea-soluble fractions from post mortem tissue of patients with ALS or FTD by in-gel digestion of lower molecular weight bands [19][20][21]. Most endogenous truncation sites were found on the N-terminus of peptides, suggesting the enrichment of C-terminal TDP-43 fragments, but C-terminal truncations were also found, broadening the knowledge of pathological processing of TDP-43 (Table 2).
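The logic of calling a peptide semi-tryptic can be illustrated with a minimal sketch. It assumes a simplified trypsin rule (cleavage after K or R, ignoring the proline exception) and uses a made-up toy sequence, not the TDP-43 sequence; the function name `classify_peptide` and the constant `TOY_PROTEIN` are hypothetical and are not taken from the cited studies.

```python
def classify_peptide(protein: str, peptide: str) -> str:
    """Classify a peptide as 'tryptic', 'semi-tryptic' or 'non-specific'.

    Simplified assumption: trypsin cleaves after K or R; the proline
    exception and missed cleavages are ignored. Only the first occurrence
    of the peptide in the protein is considered.
    """
    start = protein.find(peptide)
    if start == -1:
        raise ValueError("peptide not found in protein")
    end = start + len(peptide)

    # N-terminus is enzymatic if the peptide begins the protein
    # or the residue immediately before it is K or R.
    n_enzymatic = start == 0 or protein[start - 1] in "KR"
    # C-terminus is enzymatic if the peptide ends the protein
    # or its own last residue is K or R.
    c_enzymatic = end == len(protein) or peptide[-1] in "KR"

    if n_enzymatic and c_enzymatic:
        return "tryptic"
    if n_enzymatic or c_enzymatic:
        # One terminus is not explained by the enzyme: a candidate
        # endogenous truncation site.
        return "semi-tryptic"
    return "non-specific"


# Toy sequence for illustration only (NOT the TDP-43 sequence).
TOY_PROTEIN = "MAGSKTLLEARGDSWFNPQKAVTE"

for pep in ("TLLEAR", "LEAR", "GDSWF", "DSWF"):
    print(pep, classify_peptide(TOY_PROTEIN, pep))
```

In this toy case "LEAR" has a non-enzymatic N-terminus, which corresponds to the kind of N-terminal truncation site described above, whereas "GDSWF" has a non-enzymatic C-terminus. Real search engines apply the same idea with full enzyme rules and statistical scoring.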
Measurement of the ratio of C- to N-terminal peptides using targeted proteomics appears relatively specific for ALS compared with other neurodegenerative diseases, though abnormal truncation of TDP-43 is also found in Alzheimer's disease cases accompanied by TDP-43 aggregation [22]. Although alterations in the C:N terminal peptide ratio are not found in ALS spinal cord, concurring with immunoblot findings [20], measurement of C:N terminal ratio appears to be a promising approach for biomarker development. A recent approach using aptamer enrichment prior to quantification to increase the yield of TDP-43 peptides from post mortem tissue might improve the sensitivity of targeted analysis of truncation peptides in biofluids, where such TDP-43 peptides are of much lower abundance [23].
Proteomic analysis of post mortem tissue has also allowed the identification of sites of post-translational modification of proteins involved in ALS, specifically phosphorylation, acetylation, and ubiquitination of TDP-43 [19]. Although disease-specific TDP-43 phosphorylation is a robust finding in tissue samples, it has so far not been possible to reliably recapitulate it in biofluids, limiting its application as a biomarker. Disease-specific phosphorylation sites in neurofilament heavy chain have also been sought in ALS, though phosphorylation appears similar in ALS patients and controls [24].
Looking beyond disease-associated protein inclusions, unbiased analysis of post mortem brain and spinal cord tissue has been employed to explore broader alterations in the protein network occurring in ALS. Several studies have incorporated analysis of brain tissue from patients with ALS and TDP-43 FTD subtypes. A recent example using LC-MS/MS analysis of brain tissue from patients with ALS (as well as FTD subtypes) identified over 50 proteins enriched in the sarkosyl-insoluble fractions of ALS brain, including a subset of 23 co-aggregating proteins that differentiated ALS from FTD subtypes, among them Profilin 1, 26S proteasomal subunit D2, and Tubulin alpha-4A chain [11].
Network analysis of prefrontal cortex of ALS, ALS-FTD, and FTD patients using weighted gene co-expression network analysis (WGCNA) has been used to identify coordinated changes within the protein network in ALS and FTD [25]. This indicated relatively minor changes in pure ALS cases (as would be expected given the lack of involvement of wider frontal lobe areas in pure ALS), though a protein co-expression module associated with immune system functioning was upregulated in ALS. Further work to define disruptions at the proteome level would benefit from the inclusion of additional CNS tissue regions salient to ALS (such as the brainstem, thalamus and spinal cord) in order to broaden the understanding of regional and mechanistic differences in these overlapping conditions.
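The core idea of a weighted co-expression network can be sketched briefly: pairwise correlations between protein abundance profiles are raised to a soft-thresholding power so that strong correlations dominate the network. The sketch below is a simplified illustration in plain Python, assuming an unsigned network and an illustrative power of 6; it is not the pipeline used in the cited study, which would normally rely on the WGCNA R package together with topological overlap and module detection. All names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(abundance, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    names = list(abundance)
    adjacency = {}
    for i in names:
        for j in names:
            if i != j:
                cor = pearson(abundance[i], abundance[j])
                adjacency[(i, j)] = abs(cor) ** beta
    return adjacency


def connectivity(adjacency, name):
    """Sum of a protein's connection strengths to all other proteins."""
    return sum(v for (i, _), v in adjacency.items() if i == name)


# Hypothetical abundance profiles across four samples (illustrative values).
toy = {
    "protein_A": [1.0, 2.0, 3.0, 4.0],
    "protein_B": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with A
    "protein_C": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with A
}

adj = soft_threshold_adjacency(toy)
print(round(adj[("protein_A", "protein_B")], 3))
print(round(adj[("protein_A", "protein_C")], 3))
```

Proteins with high mutual adjacency would subsequently be grouped into modules, and a module summary can then be compared between disease groups, which is the step that yields findings such as an upregulated immune-related module.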
A small number of studies have used proteomics to examine spinal cord tissue. Recent studies have identified dysregulation of mitochondrial proteins, of proteins involved in carbohydrate metabolism and mRNA splicing, and of the neurofilament compartment, as well as altered acetylation of glial fibrillary acidic protein (GFAP) [26][27][28]. Spinal cord lysates from sporadic ALS patients and those carrying SOD1, FUS, or C9orf72 variants have been used to explore TDP-43 interacting proteins, highlighting the role of TDP-43 in RNA processing and translation, as well as suggesting that TDP-43 interactors overlap more within the FUS and C9orf72 pairing and within the SOD1 and sporadic ALS pairing than between these pairings [29].
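One simple way to express overlap between interactor lists is a set-based similarity score. The sketch below uses the Jaccard index as an illustrative choice; the cited study is not stated to have used this measure, and the group names and protein identifiers are placeholders rather than real data.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(interactors):
    """Jaccard index for every pair of groups, keyed by sorted group names."""
    return {
        (g1, g2): jaccard(interactors[g1], interactors[g2])
        for g1, g2 in combinations(sorted(interactors), 2)
    }


# Placeholder interactor sets (identifiers P1..P8 are hypothetical).
toy_interactors = {
    "group_1": {"P1", "P2", "P3", "P4"},
    "group_2": {"P1", "P2", "P3", "P5"},
    "group_3": {"P3", "P6", "P7", "P8"},
}

for pair, score in pairwise_overlap(toy_interactors).items():
    print(pair, round(score, 3))
```

Here group_1 and group_2 share three of five distinct proteins, while each shares only one of seven with group_3, mirroring the kind of within-pairing versus between-pairing contrast described above.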
A small preliminary study employed matrix-assisted laser desorption-ionization (MALDI) imaging to explore spatial alterations in protein expression; due to technical limitations, only a small number of proteins were identified, though decreased levels of a truncated ubiquitin were observed in the ALS group [30]. The feasibility of using laser capture microdissection of motor neurons from post mortem tissue has been explored as a means to study the human motor neuron-specific proteome, though this has not been applied successfully in comparative study [31]. Another relevant tissue with limited ALS proteomic studies is muscle. While gene expression studies have used muscle samples, only a few studies have focused on proteomics in ALS muscle (Table 1). Given the intensified collection of post mortem muscle tissue within ALS biorepository efforts, future studies to explore the proteome in different muscle tissue types are warranted and may provide new mechanistic insights into muscle degeneration that occurs during ALS and potential new blood-based biomarkers released by muscle.

Table 2. Endogenous TDP-43 truncation peptides identified in tissue proteomic studies in ALS. All peptides show the expected trypsin cleavage site after amino acids lysine (K) or arginine (R) or chymotrypsin cleavage site after amino acids phenylalanine (F), tryptophan (W), and tyrosine (Y) and one nonspecific cleavage site suggesting truncation. kDa - kilodaltons.

Proteomic analysis of disease models in ALS
Proteomic techniques have featured in a vast number of studies of cellular and animal models of ALS, contributing to the major hypotheses of ALS pathogenesis.

Alterations in protein-protein interactions in ALS
Defining disease-related alterations in protein-protein interaction networks is an essential aspect of understanding the pathophysiological processes that lead to ALS, which has relied heavily upon proteomics. These studies have focused primarily on TDP-43, extending to other ALS-associated gene mutations, using immunoprecipitation coupled with mass spectrometry.
In addition to the aforementioned post mortem tissue study of the TDP-43 interactome [29], tissue culture models have examined the effect of ALS-causing A315T and M337V TDP-43 mutations using primary neuronal cultures [32] and cell lines [33,34] in physiological conditions as well as following oxidative stress, RNA depletion, and DNA damage [35,36]. In addition to highlighting the multifaceted interactions of TDP-43 with splicing and translation machinery, mitochondrial proteins, and proteins involved in the stress response, this work has indicated that the TDP-43 interactome is condition-dependent and altered in the presence of TDP-43 mutations, with effects on the cellular stress response, translation, and exosome biogenesis pathways [32,33]. Analysis of the FUS interactome has highlighted major overlap in the pathway annotations of FUS and TDP-43 interactors around RNA metabolism and splicing, stress granules, exosomes, and mitochondrial proteins [37,38], as well as its involvement in protein degradation pathways [39].
Proteomic approaches have also been used to study the interactions of other proteins implicated in ALS. Systematic analysis using immunoprecipitation of Ubiquilin 2, FUS, Ataxin 2, C9orf72, TDP-43, and Optineurin in N2a cells demonstrated common interactors and overlapping functional annotations, particularly in relation to DNA and RNA binding, ribosomal proteins, and eukaryotic initiation factors for TDP-43, FUS, and Ataxin 2 interactors and protein homeostatic roles for Ubiquilin 2 and Optineurin interactors [35]. The protein constituents and interactors of SOD1 aggregates have also been probed using proteomic techniques. Cytoskeletal proteins, particularly the intermediate filament protein Vimentin, as well as GFAP and neurofilament proteins, have been consistently identified in native spinal cord detergent-insoluble fractions and whole spinal cord lysate in SOD1 mouse models [40][41][42].
Other proteins co-aggregating in SOD1 inclusions in the SOD1 G93A mouse model include proteins involved in glycolysis and mitochondrial pathways and chaperones [40], pathways overlapping with those of proteins identified as co-aggregating with TDP-43 in ALS brain tissue [11].

Posttranslational modifications
Abnormal ubiquitination and phosphorylation of aggregated proteins is a core pathological feature of ALS. Proteomic analysis has provided a platform to study different posttranslational modifications in disease models, specifically focusing on TDP-43 and FUS in ALS, illustrating interplay between ubiquitylation, phosphorylation, and acetylation, as well as the importance of posttranslational modifications in the physiological behavior of proteins implicated in ALS [43][44][45]. It has also revealed a role for less common modifications such as citrullination in maintaining the physiological function of FUS and TDP-43 and inhibiting aggregation, possibly through decreased binding to proteins relevant for stress granule formation [46].
Posttranslational modification has also been studied using proteomics of SOD1 aggregates from G93A and G37R mouse models, though significant modifications were not identified [41], at odds with earlier and more recent work demonstrating ubiquitination and conjugation of small ubiquitin-like modifier proteins (SUMOylation) to aggregated SOD1 protein [47,48], likely reflecting the sensitivity of the mass spectrometry techniques employed. More recently, non-enzymatic deamidation of asparagine to aspartic acid within a proteasomally cleaved SOD1 peptide has been identified in the CSF of the SOD1 G93A rat model, with a corresponding deamidated peptide detected in the CSF of human carriers (symptomatic and asymptomatic) of ALS-causing SOD1 variants [49]. Such deamidation has been shown to accelerate protein fibrillization [50], providing a potential link between this common posttranslational modification and ALS pathogenesis.

Proteomics in models of C9orf72 ALS
Current leading hypotheses as to how hexanucleotide repeat expansion in an intronic region of C9orf72 leads to neuronal loss and TDP-43 accumulation center on three potentially synergistic mechanisms: loss of function of C9orf72 protein due to haploinsufficiency, toxicity due to sense and antisense repeat RNA transcription products of the GGGGCC repeat region, and toxicity due to dipeptide repeat proteins formed through repeat-associated non-AUG translation [51]. Recent work has explored mechanisms of toxicity of C9orf72 hexanucleotide repeat expansion using proteomics, including the interactome of dipeptide repeat proteins, with notably consistent enrichment of ribosomal proteins across studies, as well as RNA splicing and mitochondrial proteins and proteins involved in autophagy and proteasomal systems detected in the interactomes of the toxic arginine-containing dipeptides [52][53][54][55][56]. Protein interactors of repeat RNA transcripts have also been identified using proteomics, with enrichment (perhaps unsurprisingly) of proteins involved in RNA metabolism and containing RNA recognition motifs [57].
Mass spectrometry has also delineated the interactome of C9orf72 protein, demonstrating enrichment for autophagy proteins and cytoskeletal components, as well as ubiquilin 2 and heterogeneous nuclear ribonucleoproteins A1 and A2/B1, proteins implicated in ALS through rare genetic mutations and roles in proteostasis, RNA processing, and stress granules; separately, enrichment of mitochondrial proteins and chaperones has been shown, providing convergence on TDP-43-associated disease alterations [35,58,59,60].

Subcellular compartment alterations
Proteomic analysis of subcellular fractions has been applied to cellular and ALS disease models based on SOD1 mutation and overexpression, indicating significant alterations in the proteome of cell lines, spinal cord and brain tissue of rodents overexpressing wild-type and mutant SOD1, relating to multiple pathways including mitochondria, metabolism, and protein degradation and overlapping with proteomic evidence from human tissue in sporadic ALS [26,61,62,63,64,65]. Alterations in the nucleocytoplasmic distribution of TDP-43 are a key histopathological feature of ALS, and nuclear pore complex dysfunction has been observed in ALS models, particularly relating to C9orf72 ALS [66]. Comparative proteomic analysis of nuclear and cytoplasmic fractions from a HEK293 C9orf72 hexanucleotide repeat model indicates a shift in the distribution of proteins involved in RNA metabolism and translation toward the cytoplasm [67]. Alterations in the nucleocytoplasmic distribution of RNA processing and translation proteins are also observed following TDP-43 knock-down [68], while RNA transport pathway alterations have been demonstrated with overexpression of mutant SOD1, though these alterations are opposed to those observed in the C9orf72 model [69].

Stress granules
Proteomic analysis of stress granule cores (membraneless organelles comprising RNA and protein that form by liquid-liquid phase separation in response to stress [70]) indicates a major role for ALS-associated RNA-binding proteins including TDP-43, FUS, and other heterogeneous nuclear ribonucleoproteins (hnRNPs) in stress granule physiology [71]. Stress granule cores are proposed to act as a nidus for TDP-43 aggregation [72], and time-series proteomic analysis of stress granule disassembly indicates the importance of altered SUMOylation in delaying stress granule disassembly in a Drosophila C9orf72 ALS model [73]. In keeping with this, dipeptide repeat proteins have separately been found to interact with stress granule proteins [54].
A recent study examining the Caprin-1 proteome in stress granules identified a new stress granule protein, the small nuclear ribonucleoprotein SNRNP200, that was also localized to cytoplasmic aggregates in ALS spinal cord [74].

Proteomics of biofluids and the search for ALS biomarkers
Biofluid-based biomarkers present several potential opportunities in ALS. Although for the most part, the diagnosis of ALS is not difficult to achieve in the specialist clinic, sensitive and specific biomarkers have long held promise as a means to resolve diagnostically challenging ALS cases or enable earlier nonspecialist diagnosis [75]. Identifying useful biomarkers that fulfil this promise, however, has proven a major challenge. The axonal cytoskeletal neurofilament proteins neurofilament light chain (NFL) and phosphorylated neurofilament heavy chain (pNFH) have long led on this front [76,77]. Although showing promising specificity and sensitivity in retrospective and prospective analysis [78][79][80][81], the fact that they are nonspecific markers of axonal degeneration (i.e. are not ALS-specific) and show relatively modest rises in slower-progressing, harder-to-diagnose cases, has hampered their translation into clinical use [82]. ALS has also so far been resistant to combinatorial diagnostic approaches such as CSF Abeta/Tau ratio in Alzheimer's disease [83] or recently developed protein aggregation-based assays such as RT-QuIC in prion diseases and more recently synucleinopathies [84,85].
A more important role for ALS biomarkers lies in the measurement of underlying disease activity and target engagement in ALS, to support development of novel therapeutics. Drug trials in ALS currently rely on clinical outcome measures, primarily functional decline as measured by the revised ALS functional rating scale (ALSFRS-R), a 48-point score that declines through the course of the disease [86]. Clinical staging systems, decline in respiratory function, muscle strength measures, and survival are also frequently employed [86]. All of these measures accrue slowly over time, and most are confounded by subjectivity or effort and consequently require prolonged follow-up periods with large numbers of participants to measure change. Driven by the desire to detect effects while maintaining relatively small sample size and trial duration, these measures also promote a tendency to limit trial eligibility to those in early disease stages or with aggressive disease, in whom change is detectable over a short timescale; consequently, this may limit the generalizability of trial results and ultimately the access of patients with less aggressive disease to effective treatment [87].
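How a rate of functional decline might be derived from the ALSFRS-R can be shown with a short sketch. The first function uses a commonly applied estimate, points lost from the 48-point maximum divided by months since symptom onset; the second computes a slope between two visits. Both are illustrative assumptions rather than definitions taken from the text, and the function names and example values are hypothetical.

```python
ALSFRS_R_MAX = 48  # maximum total score of the 48-point scale


def progression_rate(score, months_since_onset):
    """Estimated points lost per month since symptom onset.

    Assumes the patient scored the maximum (48) at symptom onset.
    """
    if not 0 <= score <= ALSFRS_R_MAX:
        raise ValueError("score must lie between 0 and 48")
    if months_since_onset <= 0:
        raise ValueError("months_since_onset must be positive")
    return (ALSFRS_R_MAX - score) / months_since_onset


def slope_between_visits(score_first, score_second, months_between):
    """Change in score per month between two visits (negative = decline)."""
    if months_between <= 0:
        raise ValueError("months_between must be positive")
    return (score_second - score_first) / months_between


# Illustrative values only, not patient data.
print(progression_rate(36, 24))         # 12 points lost over 24 months
print(slope_between_visits(40, 37, 6))  # 3 points lost over 6 months
```

Because such rates change by only fractions of a point per month, detecting a treatment effect on them requires long follow-up or large cohorts, which is the limitation that motivates more rapidly changing biomarkers.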
Biomarkers represent an opportunity to provide sensitive, objective, rapidly changing measures that could reduce trial duration and sample sizes, hastening the therapeutic development process while reducing costs, broadening trial inclusion, and providing highly valuable information about the reasons underlying the frequent failure of preclinically promising drugs in clinical trials [88]. A number of studies have been performed to evaluate specific proteins as pharmacodynamic or prognostic biomarkers in ALS model systems or patient-derived samples [89][90][91][92][93]. Recent ALS clinical trials have explored the use of protein biomarkers as pharmacodynamic markers of treatment effect, or as inclusion criteria followed by monitoring of treatment effect during the trial [94,95].
The application of proteomic technologies to cerebrospinal fluid (CSF) and blood from ALS patients has been a staple of efforts to identify potential ALS biomarkers over the last two decades. Most proteomic studies have used CSF, due to its proximity to the CNS cells affected by ALS [96]. The relatively low content of highly abundant proteins compared to serum and plasma, reducing the need for depletion methods or separation approaches, and the lower risk of detecting signals of secondary systemic metabolic alterations related to the disease (for example malnutrition due to swallowing difficulties) are additional advantages of studying CSF compared with serum or plasma, though much of the CSF proteome is in fact blood-derived [97]. The obvious disadvantage is, of course, the relatively invasive approach to CSF sampling when compared to blood.
Most frequently, bottom-up shotgun proteomic approaches have been employed, though over this time much of the breadth of proteomic technology has been applied at some point to the study of ALS. Reproducibility has been an issue; despite alterations in over 500 proteins detected over the course of CSF proteomic experiments, only a handful have been demonstrated in two or more studies and even fewer have survived external orthogonal validation techniques [98]. Proteomic studies of human biofluid samples in ALS are summarized in Table 3.
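The reproducibility check described above amounts to tallying how many independent studies report each candidate protein. A minimal sketch follows, using placeholder study and protein names rather than real results; it deliberately ignores the direction of change, which a real comparison would also need to match.

```python
from collections import Counter


def replicated_candidates(studies, min_studies=2):
    """Return proteins reported in at least `min_studies` studies.

    `studies` maps a study label to the list of proteins it reported as
    altered. Each protein is counted at most once per study.
    """
    counts = Counter()
    for proteins in studies.values():
        counts.update(set(proteins))
    return {p: n for p, n in counts.items() if n >= min_studies}


# Placeholder data for illustration only.
toy_studies = {
    "study_1": ["candidate_A", "candidate_B", "candidate_C"],
    "study_2": ["candidate_A", "candidate_D"],
    "study_3": ["candidate_A", "candidate_C", "candidate_C"],
}

print(replicated_candidates(toy_studies))
```

In this toy example only two of the four candidates recur, illustrating how a long list of altered proteins can shrink to a handful once replication across studies is required.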
The first mass spectrometric study of CSF in ALS used Fourier transform ion cyclotron resonance (FT-ICR) of tryptically digested CSF samples from a small cohort of ALS patients and healthy controls to produce a classifier based on the resulting mass spectra [107]. Although this early foray into proteomics in ALS identified no individual biomarker candidates, it foretold the later use of multiple proteomic features as the basis for classification algorithms; similar machine learning approaches have been utilized in more recent studies [108].
Subsequent early proteomic biomarker studies in ALS moved toward surface-enhanced laser desorption-ionization TOF (SELDI-TOF) mass spectrometry analysis for top-down proteomics of CSF from ALS patients [109][110][111]. Between these three studies, some overlap was observed with lower levels of Cystatin-C detected in all three (and validated using CSF immunoblot) as well as decreases in Transthyretin in two studies. As the major constituent of lower motor neuron Bunina body inclusions specific to ALS, Cystatin-C was of particular interest as a biomarker; external validation however has subsequently proved contradictory [112,113]. Additionally, in the most recent SELDI-TOF study [111], incorporating samples from 100 ALS patients and 141 controls, levels of the acute phase protein C-reactive protein (CRP) were found to be elevated in ALS with confirmatory enzyme-linked immunosorbent assay (ELISA); although elevated serum CRP has been associated with worse prognosis in ALS patients, a recent ELISA validation of CSF CRP levels did not confirm this finding [114][115][116].
Two CSF proteomic studies in ALS incorporated 2D gel electrophoresis to identify differentially abundant proteins in CSF pools from ALS patients and controls [117,118], with subsequent matrix-assisted laser desorption-ionization mass spectrometry or tandem MS identification of differentially abundant protein spots including upregulation of Alpha-1-antitrypsin precursor and Zn-alpha-2-glycoprotein, both demonstrating sometimes contradictory alterations in other proteomic and immunoassay studies [119][120][121].
More recent studies have employed LC-MS/MS of individual or pooled CSF samples, with preanalytical abundant protein depletion or prefractionation techniques to drive additional proteomic depth and in some cases isobaric labeling to enhance quantitative accuracy [28,108].
A major and consistent feature of recent LC-MS/MS proteomic datasets has been the upregulation of a set of related glial proteins involved in innate immunity, the chitinase proteins, in ALS. The first recognition of coherent alterations in chitinase proteins used LC-MS/MS of pooled CSF samples from ALS patients and controls, identifying a striking upregulation of the active chitinase Chitotriosidase 1 (CHIT1) alongside elevation of the two related inactive chitinase proteins Chitinase 3-like protein 1 (CHI3L1 or YKL-40) and Chitinase 3-like protein 2 (CHI3L2 or YKL-39) [122].
Emerging literature in ALS indicates that CHIT1 is primarily produced by microglia [98]; intrathecal injection of CHIT1 leads to microglial activation, astrogliosis, and loss of motor neurons in rodents [129]. CHIT1 levels correlate with the rate of functional decline in ALS (a proxy for the aggressiveness of disease) and with neurofilament levels [114,126,127,128]. CHI3L1, on the other hand, is produced by a subset of activated astrocytes and correlates more closely with the burden of upper motor neuron pathology and cognitive impairment in ALS [114,128]; correspondingly, while CHIT1 levels are markedly increased in ALS but more modestly so in FTD, CHI3L1 levels show more modest elevation in ALS and more pronounced elevation in FTD [123]. Compared with CHIT1, CHI3L1 is less closely associated with disease progression but is similarly correlated with neurofilament levels; both CHIT1 and CHI3L1 have shown inconsistent associations with survival as well as inconsistent small longitudinal increases [114,124,127,128].
Overall, CHIT1 and CHI3L1 represent a recent major success for proteomic biomarker discovery in ALS. Although they do not outperform neurofilaments in terms of prognostic or classifier performance, they capture different dimensions of the underlying disease process, specifically microglial and astrocytic activity, which are pathways potentially amenable to disease-modifying treatments [114]. Chitinase proteins are therefore well-placed to measure treatment response in these areas, though it should be noted that common CHIT1 and CHI3L1 polymorphisms leading to alterations in expression are recognized (though do not appear to slow the progression of ALS) [123,130].

Table 3. Proteomic studies of human biofluid samples in ALS. ALS - amyotrophic lateral sclerosis; CSF - cerebrospinal fluid; EV - extracellular vesicle; FT-ICR - Fourier transform ion cyclotron resonance; iTRAQ - isobaric tags for relative and absolute quantitation; LC-MS/MS - liquid chromatography tandem mass spectrometry; MALDI - matrix-assisted laser desorption-ionization; PBMC - peripheral blood mononuclear cells; SELDI - surface-enhanced laser desorption-ionization; SWATH - sequential window acquisition of all theoretical spectra; TMT - tandem mass tag.

More recent analyses have used isobaric labeling with prefractionation to improve proteomic depth in CSF, quantifying almost 2000 proteins [28,131] and identifying upregulation of the proteins Ubiquitin C-terminal hydrolase L1, Microtubule-associated protein 2, and Glycoprotein NMB in ALS patients in addition to neurofilament and chitinase proteins. These findings were validated within-cohort using targeted proteomics and subsequently using single molecule array (SIMOA), and CSF findings were compared with post mortem tissue [28,131]. The comparison of protein level changes in CSF and tissue contributes to the understanding of the origin of alterations of the CSF proteome. The upregulation of neurofilaments in CSF but lower levels in spinal cord tissue is in agreement with the release of neurofilaments into the extracellular space by degenerating axons. In contrast, neuroinflammatory proteins such as chitinases, Glycoprotein NMB, and Macrophage-capping protein are increased in both, indicating elevated tissue expression during disease [28].

Extracellular vesicle proteomics -a window on intracellular processes in ALS
Extracellular vesicles (EVs) are 50-200 nm structures, including exosomes and microvesicles, released by virtually all cells, including neurons and glia of the central nervous system [132]. Alterations in EV biogenesis pathways have been identified in cellular models of genetic ALS and implicated as a potential vector for the intercellular spread of toxic oligomers of TDP-43 [32,133]. EVs are also an attractive target for biomarker discovery efforts due to their intracellular origin, potentially opening a window on mechanisms of disease [134].
However, the low number of EVs in CSF, combined with the contribution of multiple CNS cell types and the use of MS-incompatible isolation techniques (such as those involving polyethylene glycol-based precipitation), poses major challenges to the application of proteomic approaches [134]. Research applying mass spectrometry to CSF EVs for biomarker discovery in ALS has so far been very limited. One targeted proteomic study measured relative exosomal TDP-43 levels, which did not differ between ALS patients and controls [135]. Two small shotgun proteomic studies identified, respectively, decreased proteasomal and proteasome-like proteins in ALS [136], a pathway previously implicated through post mortem analysis of spinal cord tissue, and increased levels of the nucleolar protein Novel INHAT repressor [137]; both findings await external verification.
An alluring means to simplify access to CNS biomarkers has emerged through the analysis of CNS-derived EVs extracted from serum by immunoprecipitation of EVs carrying the neuronal lineage marker L1CAM [138]. This method has been used in targeted biomarker development approaches in Alzheimer's disease and Parkinson's disease [139,140]. Whether EVs isolated in this way truly represent a pool of CNS origin is hotly debated, in part due to expression of L1CAM beyond the CNS and some evidence indicating that most serum L1CAM is a cleaved ectodomain [141]. Further work to delineate their origin and identify more robust means of extracting a relevant EV population using proteomics would be highly valuable. Ultimately, a combination of proteomics and transcriptomics of EVs may provide the optimal ALS-specific biomarker.

CSF proteomics in the presymptomatic period
The identification of highly penetrant ALS-causing genetic variants, particularly C9orf72 hexanucleotide repeat expansion and mutations in SOD1, in upwards of 10% of ALS patients has given rise to a cohort of first-degree relatives of ALS gene carriers with a known high risk of developing ALS [5]. Evidence from neurophysiological studies and measurement of neurofilament and chitinase proteins in asymptomatic gene carriers suggests that significant neurodegeneration and microglial activation are detectable only months before symptom onset [125,142-144]. Studying gene carriers during the period before onset of neurodegeneration therefore offers an opportunity to define early events preceding neurodegeneration, identify biomarkers that might predict the onset of symptoms, or measure presymptomatic therapeutic response, thereby enabling treatment prior to the onset of symptoms [145].
To date, only one proteomic study has addressed this, comparing 14 asymptomatic carriers of SOD1 and C9orf72 mutations with controls and ALS patients using an isobaric tag-labeled, prefractionated LC-MS/MS approach [28]. Despite quantifying 1929 proteins, the study identified no proteins with significantly differing levels between gene carriers and non-carriers. This is perhaps attributable to the relatively small asymptomatic carrier group and the mixture of underlying gene mutations, which reflects multiple pathway alterations upstream of motor neuron degeneration [28].

Blood proteomics
Relatively few studies have examined the serum, plasma, or peripheral blood mononuclear cell (PBMC) proteome in ALS. Recent studies have used isobaric labeling of brain tissue alongside plasma samples in order to improve the relevance of the identified proteome and circumvent the problem of highly abundant proteins suppressing signal from lower-abundance, potentially more interesting, proteins [146,147]. Consistent themes indicate alterations in complement proteins and apolipoproteins, though reproducibility of individual findings has been lacking [146-152].

Proteomics transcending biofluid, pathology and disease model boundaries
Modern bioinformatic and proteomic techniques offer the capability to link alterations in the tissue, model, and biofluid proteomes. Proteomic biofluid studies in ALS have generally detected changes presumed to reflect downstream consequences of neurodegeneration, such as the leakage of neurofilament proteins from damaged neurons, activation of glial cells, the effects of synaptic loss, or altered extracellular matrix regulation [28,108,124]. A handful of studies have attempted to bridge the gap between these downstream changes and upstream disease mechanisms using pathway analysis [108], network analysis [153], or direct comparison of post mortem tissue and biofluid proteomic changes [28,154]. These have identified a degree of overlap between network-level changes in the CSF proteome and the RNA processing, cellular stress, and metabolic pathways identified in ALS models. Clear perturbations of these pathways in the biofluid proteome that could find clinical use have not yet been detected.

Expert opinion
Proteomic analysis has cut across the field of ALS research. It has redefined our understanding of the molecular histopathological hallmarks of ALS and diverted the course of scientific study accordingly [9]. As outlined herein, proteomics has also highlighted pathophysiological mechanisms of ALS, including alterations in protein-protein interaction networks brought about by ALS-associated genetic variants, the importance of proteins implicated in ALS in the cellular stress response, and widespread changes in nuclear and cytoplasmic proteomes. Recent proteomic studies have identified major biomarkers capable of quantifying different dimensions of the disease and linked findings from disease models and post mortem tissue with alterations in the protein network in patient CSF.
Many techniques have been utilized, including a range of preanalytical methods, ionization and separation methods, mass spectrometers, and bioinformatic approaches [155]. Within the field of mechanistic study, proteomics has provided highly valuable insights into the consequences of ALS gene mutations and pathways involved in ALS. Interpretation is necessarily tempered by the types of model used, particularly overexpression models and SOD1 mutation-based models; given the pathological differences between SOD1 and other familial and sporadic ALS forms, the latter may not faithfully reflect the upstream biological differences leading to sporadic ALS [156]. Given the alterations in gene transcription, translation, and metabolic pathways in ALS, the disease would be well suited to multi-omic analysis integrating proteomic, transcriptomic and metabolomic datasets, an approach that has thus far been limited in ALS [157,158].
Proteomics of pathological tissue and disease models stands to benefit from spatial proteomic techniques such as MALDI imaging, which have so far found limited use in ALS research [30,159,160], in order to resolve compartmentalized aspects of ALS pathology. Techniques to separate tissues and enhance the purity of in vitro models, such as laser capture micro-dissection and fluorescence-activated cell sorting (FACS), offer additional means to decipher changes occurring within individual cell types and their relative contribution to the disease process [161], which could in future be further enhanced by nascent single cell proteomics [162]; newer techniques such as MALDI-2 mass spectrometry promise subcellular compartment resolution [159]. Newer technologies that use multiplex immunofluorescence microscopy data from up to 40 different proteins could also enable spatial resolution of many proteins within the same tissue sample [163].
Within the biomarker field, reproducibility of proteomic discoveries has been a major problem, driven in part by inconsistent preanalytical sample handling, the stochastic nature of data-dependent acquisition (DDA) proteomic pipelines, and the heterogeneity of the disease [164,165]. In the last decade, however, consistent signals in the chitinase proteins have been demonstrated, initially in proteomic and subsequently in immunoassay studies [114,122-124,126]. Elevated levels of neurofilament proteins, initially identified in candidate-driven immunoassay studies, have also been identified with increasing consistency in recent proteomic studies, particularly Neurofilament medium polypeptide, which has so far been neglected by target-driven studies [28,108,131]. Chitinase proteins, particularly Chitotriosidase 1, represent a major success for proteomic biomarker development and are now front-running ALS biomarkers. Though they have not improved upon the classifier or prognostic performance of neurofilament proteins, they encapsulate alternative dimensions of the disease process and so might find use in drug trials targeting glial mechanisms or through the eventual advent of personalized treatment of ALS. Identifying ALS-specific biomarkers, such as those based on disease-associated truncated or posttranslationally modified forms of TDP-43, remains a major challenge to which targeted proteomic methodologies could offer solutions in future [12].
The multiplex nature and absolute quantitative capabilities of targeted mass spectrometry, protein arrays, and aptamer-based proteomics would also be highly suitable for a panel approach to biochemical diagnosis of ALS, though a suitable set of proteins remains elusive [23,166,167].
Defining the biochemical landscape in the preclinical period in ALS gene carriers is a major challenge that looms large [145]. Antisense oligonucleotide therapies targeting the common gene mutations in ALS have reached the clinical trial arena in symptomatic patients [94]. Asymptomatic gene carriers probably have the most potential to benefit from these treatments, but the unpredictable age of onset, even in genetic cases, the high costs, and the invasiveness of intrathecal treatment are major barriers to use in this group [145]. Proteomics is ideally placed to identify biomarkers that measure treatment response or predict symptom onset, which could help to remove these barriers by enabling better timing and monitoring of treatment. Detecting subtle proteomic changes in this group, though, will require major improvements in proteomic depth and quantitative accuracy, as well as large longitudinal cohorts. Some of this may occur by improving the relevance of the proteome of study, for example using analysis of CSF or neuronal-derived serum EVs, or through advances in proteomic approaches such as data-independent acquisition methodologies, which have been seldom used in the ALS field to date.
The goals of ALS biomarker studies are to provide insights into disease mechanisms and biomarkers that are useful in drug development and clinical trials. Continued studies that incorporate biomarkers in ALS drug development programs and clinical trials will generate the data necessary for regulatory agencies to accept biomarkers in their decision-making processes regarding new treatments for ALS.