Telomere length is greater in ALS than in controls: a whole genome sequencing study

Abstract Background: Amyotrophic lateral sclerosis is a neurodegenerative disease of motor neurons resulting in progressive paralysis and death, typically within 3–5 years. Although the heritability of ALS is about 60%, only about 11% is explained by common gene variants, suggesting that other forms of genetic variation are important. Telomeres maintain DNA integrity during cellular replication and shorten naturally with age. Gender and age are risk factors for ALS and are also associated with telomere length. We therefore investigated telomere length in ALS. Methods: We estimated telomere length by applying a bioinformatics analysis to whole genome sequence data of leukocyte-derived DNA from people with ALS and age- and gender-matched controls in a UK population. We tested the association of telomere length with ALS and ALS survival. Results: There were 1241 people with ALS and 335 controls. The median age for ALS was 62.5 years and for controls, 60.1 years, with a male–female ratio of 62:38. Accounting for age and sex, there was a 9% increase of telomere length in ALS compared to matched controls. Those with longer telomeres had a 16% increase in median survival. Of nine SNPs associated with telomere length, two were also associated with ALS: rs8105767 near the ZNF208 gene (p = 1.29 × 10^−4) and rs6772228 (p = 0.001), which is in an intron for the PXK gene. Conclusions: Longer telomeres in leukocyte-derived DNA are associated with ALS, and with increased survival in those with ALS.


Introduction
Amyotrophic lateral sclerosis is a neurodegenerative disease of motor neurons leading to progressive muscle weakness and death through neuromuscular respiratory failure (1). Although the heritability of ALS is about 60% (2), the heritability explained by common gene variants is only about 11% (3), suggesting that other forms of genetic variation play an important role.
Telomeres are repeated DNA sequences located at the ends of chromosomes and exist to maintain DNA integrity during cellular replication; chromosome ends tend to shorten with replication, and the repeat region protects against the loss of important gene sequences because loss of repeats can be tolerated (4). As such, telomeres shorten naturally with age as repeats are lost during replication cycles (5). Natural variation in telomere length exists in the population, with women on average having longer telomeres than men (6); shorter telomeres are associated with an increased risk of cancer (7).
A major risk factor for ALS is age (8) and ALS is also more common in men than women (9): both age and sex are related to telomere length. Furthermore, there are some similarities between ALS and cancer (10), such as evidence for a multistep process in pathogenesis (11,12). We therefore investigated telomere length in ALS.

Whole-genome sequencing
Samples were from multiple centers across the UK contributing to the international Project MinE whole genome sequencing initiative (13).
DNA was isolated from venous blood using standard methods. The DNA concentrations were set at 100 ng/µL as measured by a fluorimeter with the PicoGreen® dsDNA (Thermo Scientific, Waltham, MA) quantitation assay. DNA integrity was assessed using gel electrophoresis. All samples were sequenced using Illumina's FastTrack services (Illumina, San Diego, CA) on the Illumina HiSeq 2000 platform (14). Sequencing was 100 bp paired-end, performed using polymerase chain reaction (PCR)-free library preparations, and yielded ~40x coverage across each sample. Binary sequence alignment/map (BAM) files were generated for each individual.

Determination of telomere length
TelSeq (15) was used to quantify telomere length using data from whole genome sequences. Telomere lengths were estimated from telomeric reads, defined as reads containing more than seven TTAGGG repeats.
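As an illustration of how reads can be classified by their TTAGGG content, the following minimal sketch counts motif repeats in a read and flags reads above a threshold. It is not TelSeq's implementation: the function names, the use of non-overlapping counts, the check of the reverse-complement motif, and the strict "more than seven" comparison (taken from the wording above) are all assumptions made for this example. TelSeq itself converts such counts into a length estimate using additional normalization that is not reproduced here.

```python
MOTIF = "TTAGGG"
MOTIF_REVCOMP = "CCCTAA"  # reverse complement of TTAGGG


def count_motif_repeats(read):
    """Return the number of non-overlapping telomeric motifs in a read.

    Reads may come from either strand, so the larger of the forward and
    reverse-complement counts is used (an assumption of this sketch).
    """
    return max(read.count(MOTIF), read.count(MOTIF_REVCOMP))


def is_telomeric(read, threshold=7):
    """Flag a read as telomeric if it has more than `threshold` repeats."""
    return count_motif_repeats(read) > threshold


def telomeric_fraction(reads, threshold=7):
    """Fraction of reads flagged as telomeric (hypothetical helper)."""
    flagged = sum(is_telomeric(r, threshold) for r in reads)
    return flagged / len(reads)


# Synthetic reads for illustration only; these are not study data.
example_reads = [
    "TTAGGG" * 8 + "ACGT",   # 8 forward repeats
    "CCCTAA" * 10,           # 10 reverse-complement repeats
    "TTAGGG" * 7,            # exactly 7 repeats, not "more than seven"
    "ACGT" * 25,             # no telomeric motif
]
print(telomeric_fraction(example_reads))
```

The threshold is kept as a parameter because the cut-off determines which reads are counted, and any estimate derived from the count depends on it.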

Statistical analysis
The effects of telomere length on ALS were tested using a generalized linear regression model, which included total telomere length, age and sex, to predict disease affected status. To assess the model, Pearson's chi-squared test was used.
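A model of this kind can be sketched as a logistic regression of case status on telomere length, age and sex. The example below is a hedged illustration only: the fitting routine (plain gradient ascent on the log-likelihood), the function names, the learning rate, and the ten synthetic individuals are assumptions made for this sketch, and do not reproduce the software, settings, or data used in the study.

```python
import math


def standardize(column):
    """Rescale a numeric column to mean 0 and standard deviation 1."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / sd for v in column]


def linear_predictor(beta, x):
    """Intercept plus the weighted sum of the covariates."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def log_likelihood(beta, rows, labels):
    """Bernoulli log-likelihood of 0/1 labels under a logistic model."""
    total = 0.0
    for x, y in zip(rows, labels):
        p = 1.0 / (1.0 + math.exp(-linear_predictor(beta, x)))
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def fit_logistic(rows, labels, lr=0.1, n_iter=5000):
    """Fit [intercept, coefficients...] by batch gradient ascent."""
    n, p = len(rows), len(rows[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for x, y in zip(rows, labels):
            resid = y - 1.0 / (1.0 + math.exp(-linear_predictor(beta, x)))
            grad[0] += resid
            for j, v in enumerate(x):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# Synthetic individuals for illustration only; these are not study data.
telomere_kb = [4.1, 3.6, 4.0, 3.9, 3.7, 4.2, 3.8, 3.5, 3.9, 3.9]
age_years = [61, 58, 66, 55, 63, 59, 64, 60, 60, 60]
is_male = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
is_case = [1, 0, 1, 0, 1, 1, 0, 0, 1, 0]  # 1 = ALS, 0 = control

rows = list(zip(standardize(telomere_kb),
                standardize(age_years),
                standardize(is_male)))
beta = fit_logistic(rows, is_case)
print("intercept, telomere, age, sex coefficients:", beta)
```

Covariates are standardized so that a single learning rate suits all of them; in practice an established statistics package would be used and would also report standard errors and test statistics, which this sketch omits.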
Because telomere length correlates with age, we performed an additional test to examine the possibility that survival bias could affect the results. To do this, we also performed the analysis restricted to the subgroup of people with ALS onset below the median cohort age (62 years). Although such an analysis would halve our sample and therefore greatly reduce statistical power, the direction of effect should be observable.
To evaluate SNP effects on telomere length we calculated Nagelkerke's R² from the results of a generalized linear model using the value of telomere length, age, gender, and nine SNPs selected for having been previously shown to associate with telomere length.
To assess the effect of telomere length on survival, we used Cox regression, controlling for age, gender, and site of disease onset (bulbar or spinal).
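For reference, Nagelkerke's statistic rescales the Cox and Snell pseudo-R² so that its maximum is 1. The sketch below computes it from the log-likelihoods of an intercept-only model and a fitted model; the function name and argument names are assumptions of this example, and the paper does not state which software was used for the calculation.

```python
import math


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke's R^2 from two log-likelihoods.

    ll_null  : log-likelihood of the intercept-only model
    ll_model : log-likelihood of the model with predictors
    n        : number of observations
    """
    # Cox and Snell: 1 - (L_null / L_model)^(2/n), written with log-likelihoods.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Largest value Cox and Snell's statistic can take for this null model.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical log-likelihoods, chosen only to show the call.
print(nagelkerke_r2(ll_null=-100.0, ll_model=-90.0, n=200))
```

A model that fits no better than the null gives 0, and a model with log-likelihood 0 (perfect prediction) gives 1.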
To assess the association of genes with ALS we used the SNP-set sequence kernel association test (SKAT) (19), which is a test for association between a set of rare and common variants and continuous/dichotomous phenotypes using kernel machine methods.

Ethical approval
Informed consent was obtained from all volunteers included in this project. Generation of whole genome sequences was approved by the Trent Research Ethics Committee 08/H0405/60.

Results
There were 1241 people with apparently sporadic ALS and 335 controls. The median age for people with ALS was 62.5 years and for controls, 60.1 years, with a male–female ratio of 62:38 (Table 1). The mean telomere length in people with ALS was 3.95 kb, and in controls, 3.80 kb, not taking into account gender or age (Figure 1). Generalized linear regression accounting for these covariates showed a mean 9% (95% CI 3%, 15%) increase of telomere length in people with ALS compared to age- and gender-matched controls (p = 0.008). In the analysis exploring survival bias as an explanation for our results, in which we restricted testing to those younger than the median age, the same direction of effect was observed, although as expected, because of the greatly reduced sample size, this did not reach statistical significance (p = 0.08). Covariate analysis showed that females (p = 0.03) and younger people (p = 2 × 10^−16) had on average longer telomeres (Table 2), confirming the results of earlier studies that telomere length reduces with age and females have on average longer telomeres.
There was no association between telomere length and site of disease onset (p = 0.7), or with C9orf72 expansion status (p = 0.24).
Cox regression analysis showed that in the ALS group, those with longer telomeres had a 16% increase in median survival (hazard ratio 0.81). The generalized linear regression model showed that of the nine SNPs associated with telomere length, two were also associated with ALS: rs8105767 near the ZNF208 gene (p = 1.29 × 10^−4, MAF = 0.03) and rs6772228, which is in an intron for the PXK gene (p = 0.001, MAF = 0.03; Table 3), but the SKAT test did not show an association of overall variant burden in these genes with ALS after correction for multiple testing (ZNF208, p = 0.81 and PXK, p = 0.03). Nagelkerke's R² showed that the nine selected SNPs contributed 3% to the variance in total telomere length.

Discussion
We have shown that longer telomeres are associated with ALS and with longer survival in ALS. In keeping with previous studies, we found that mean telomere length was longer in females and shortened with increasing age. Of a panel of nine SNPs known to be associated with telomere length, two showed association with ALS, one in ZNF208, and the other in PXK.
Although both these SNPs, rs6772228 and rs8105767, are known to be associated with telomere length, no association with ALS was seen in a previous large genome-wide association study (22), suggesting that either there is a population-specific effect, or that the telomere length itself is driving the association, and other factors that influence it have a larger effect than these SNPs. Another possibility is that the difference in results is because the analysis performed was different, as we have tested genotypic association, whereas the genome-wide association study used linear mixed modeling of alleles.
Telomeres have largely been investigated for their roles in cancer and aging, shorter telomeres being associated with disease pathology and death. Surprisingly, telomere elongation is also seen in about 15% of cancers, such as adenocarcinoma of the lung and pancreas (23), and in general, cancers with long telomeres are resistant to therapy and carry a poor prognosis (24). Telomere elongation phenomena are well documented but far less well understood than telomere shortening phenomena (24–28).
A study of telomere length in ALS brains found a trend toward longer telomeres in glial cells (29), consistent with our results but in contrast to an earlier small study of 50 people with ALS and 50 controls, which found that shorter telomeres are associated with ALS (30).
Our study has some strengths and weaknesses. Although ALS is a disease of the central nervous system, our telomere data are derived from leukocyte DNA, since our DNA source was whole blood. The relationship between leukocyte telomeres, which can be expected to shorten with age as leukocytes undergo mitosis, and telomeres in neurons, which are post-mitotic, is not clear (31), but glial and other cells that do undergo mitosis are probably involved in ALS pathogenesis, and provide a possible mechanism. Furthermore, we did not directly measure telomere length using Southern blotting, but estimated it using whole genome sequence data. However, our findings have the advantage of a large sample size of more than 1200 cases, compared with previous reports of 50 or fewer. Furthermore, our examined cohort is more homogeneous in genetic background, and the sequencing technology used was the same across the entire cohort. However, one limitation of our method is that we cannot draw firm conclusions about the exact length of a telomere. The method we have used, TelSeq, correlates with results from Southern blotting (32), and Q-PCR (33) and is in widespread use (31,34). Nevertheless, different sequencing technologies will generate different telomere length estimates because of differences in library preparation and platform (35,36). To overcome this potential weakness, we have used the same industry-leading sequencing platform for all samples, as well as designing the study to minimize batch effects by having cases and controls sharing the same sequencing plate.
We found that longer telomeres were associated both with ALS and with increased survival in ALS. It is possible that telomere length does not associate with ALS risk but only with survival, and that our cohort was biased in such a way that those with longer survival were more likely to be genotyped. In that case, we would also observe an apparent association with risk, but the driver would be the actual association with increased survival. While this possibility cannot be completely excluded, the cohort tested was an incident cohort, collected from a population rather than a specialist clinic, reducing the likelihood of this explanation. Furthermore, we assessed survival bias by testing the relationship between telomere length and ALS in the younger half of the sample. We found the direction of association of longer telomeres with ALS was still present, although as expected, the statistical power was reduced due to the smaller number of young controls (<175). Replicating these findings in a bigger cohort such as the entire Project MinE sample is an important future step.
There are multiple methods available for telomere length analysis, including terminal restriction fragment analysis, quantitative fluorescence in situ hybridization (Q-FISH) (37), PCR-based techniques and Southern blotting. These techniques have the disadvantage of lengthy protocols and limitations, such as the requirement that DNA is extracted from fresh blood, or that chromosomes are individually stained, which is a time-consuming process (35,38–41). Differences in applying these techniques between laboratories can create measurement differences (41). Thus, for large scale analyses, whole genome sequence data that can be processed using a standard bioinformatics pipeline can standardize measurements and overcome many of these issues (42). We have shown that measuring telomere length in a UK cohort is feasible using a bioinformatics tool, such as TelSeq, and that this is fast and cost-effective. Estimating the telomere length with TelSeq on a single 40x whole genome sequence takes about 90 min using four threads on a midrange computer, which would translate to about 100 days for our entire dataset if samples were processed one after another. Since high-performance computing access is now straightforward, and multiple computers are able to run the same analysis in parallel, the analysis time can easily be shortened significantly.
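The runtime estimate can be checked with simple arithmetic from the figures given in the text (90 min per genome; 1241 people with ALS plus 335 controls). The sketch below does this and also shows how the wall-clock time shrinks when samples are spread over several machines; the function names and the choice of 100 workers are assumptions made for this example, and scheduling overhead is ignored.

```python
import math

MINUTES_PER_SAMPLE = 90      # reported TelSeq runtime for one 40x genome
N_SAMPLES = 1241 + 335       # people with ALS plus controls
MINUTES_PER_DAY = 60 * 24


def serial_days(n_samples, minutes_per_sample):
    """Days needed if genomes are processed one after another."""
    return n_samples * minutes_per_sample / MINUTES_PER_DAY


def parallel_days(n_samples, minutes_per_sample, n_workers):
    """Days needed if n_workers machines each process one genome at a time."""
    batches = math.ceil(n_samples / n_workers)
    return batches * minutes_per_sample / MINUTES_PER_DAY


print(serial_days(N_SAMPLES, MINUTES_PER_SAMPLE))         # 98.5
print(parallel_days(N_SAMPLES, MINUTES_PER_SAMPLE, 100))  # 1.0
```

Processed serially, the 1576 genomes take 98.5 days, in line with the figure of about 100 days; with a hypothetical 100 workers the same work completes in 16 batches of 90 min, or one day.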
In this large study of telomere length and ALS, we have shown that longer telomeres in leukocytes are associated with ALS, and with increased survival in those with ALS.

Acknowledgments
Samples used in this research were in part obtained from the UK National DNA Bank for MND Research, funded by the MND Association and the Wellcome Trust. We thank people with MND and their families for their participation in this project. We acknowledge sample management undertaken by Biobanking Solutions funded by the Medical Research Council at the Center for Integrated Genomic Medical Research, University of Manchester.

Declaration of interest
The authors declare no conflicts of interest.

Funding
This project was funded by the MND Association and the Wellcome Trust. This is an EU Joint Program-Neurodegenerative Disease Research (JPND) project. The project is supported through the following funding organizations under the aegis of JPND-www.