Assessment of risk of ALS conferred by the GGGGCC hexanucleotide repeat expansion in C9orf72 among first-degree relatives of patients with ALS carrying the repeat expansion

Abstract
Objectives: We aimed to estimate the age-related risk of ALS in first-degree relatives of patients with ALS carrying the C9orf72 repeat expansion.
Methods: We included all patients with ALS carrying a C9orf72 repeat expansion in The Netherlands. Using structured questionnaires, we determined the number of first-degree relatives, their age at death due to ALS or another cause, or age at time of questionnaire. The cumulative incidence of ALS among first-degree relatives was estimated, while accounting for death from other causes. Variability in ALS risk between families was evaluated using a random effects hazards model. We used a second, distinct approach to estimate the risk of ALS and FTD in the general population, using previously published data.
Results: In total, 214 of the 2,316 screened patients with ALS (9.2%) carried the C9orf72 repeat expansion. The mean risk of ALS at age 80 for first-degree relatives carrying the repeat expansion was 24.1%, but ranged between individual families from 16.0% to 60.6%. Using the second approach, we found the risk of ALS and FTD combined was 28.7% (95% CI 17.8% to 54.3%) for carriers in the general population.
Conclusions: On average, our estimated risk of ALS for carriers of the C9orf72 repeat expansion was lower than historical estimates. We showed, however, that the risk of ALS likely varies between families and one overall penetrance estimate may not be sufficient to describe ALS risk. This warrants a tailor-made, patient-specific approach in testing. Further studies are needed to assess the risk of FTD in carriers of the C9orf72 repeat expansion.


Introduction
Genetic discoveries play a crucial role in unraveling the complex nature of amyotrophic lateral sclerosis (ALS) and have led to promising gene-based therapies now being assessed in clinical trials (1, 2). The pathophysiological mechanism has not yet been elucidated, but is thought to involve an interaction between genetic and environmental factors over time (3, 4). The GGGGCC hexanucleotide repeat expansion in C9orf72 (C9orf72 repeat expansion) has been identified as the most common genetic cause of ALS in Europe and North America (5). The C9orf72 repeat expansion is genetically pleiotropic and has most frequently been associated with a phenotypic spectrum ranging from ALS to frontotemporal dementia (FTD) (3). Testing for the C9orf72 repeat expansion as part of routine clinical care is becoming increasingly important (6). However, considering the autosomal dominant inheritance pattern, identification of the C9orf72 repeat expansion in patients with ALS also has implications for their relatives (6).
An important question that emerges when patients carry a C9orf72 repeat expansion is the risk of relatives developing an associated disorder (6). For asymptomatic relatives, the risk of developing an associated disorder, especially in the absence of an effective therapy, is an important factor in the decision to undergo genetic testing (7). Thus far, studies have reported near-complete penetrance of the C9orf72 repeat expansion for ALS at older ages (eTable 1) (5, 8-11). These estimates were, however, based on populations in which the outcome of ALS was already known, leading to an overestimation of penetrance. Also, 11/7,579 (0.15%) subjects in the UK 1958 birth cohort appeared to carry a C9orf72 repeat expansion and in Project MinE, a large international collaboration, we have reported a C9orf72 repeat expansion in 2 out of 1,051 controls (0.19%) (12, 13). Both estimates are in agreement, but much higher than expected when assuming high risks of developing ALS and FTD, given the lifetime risks of these conditions.
In this study, we used first-degree family history data of patients with ALS, drawn from a nationwide registry, and known to carry the C9orf72 repeat expansion, to estimate the age-related risk of ALS, thereby minimizing the risk of selection bias. We aimed to establish the risk of ALS associated with the C9orf72 repeat expansion in first-degree relatives and explore the variability in baseline hazards between families.

Design and participants
The Prospective ALS study the Netherlands (PAN) was initiated in 2006 and is an ongoing population-based registry (14). For this study, we included patients carrying a C9orf72 repeat expansion, diagnosed with ALS between January 2006 and January 2020, hereafter referred to as index patients. To prevent including duplicate nuclear families, no more than one sibling from each nuclear family was included as an index patient. This was verified by creating a kinship matrix using data available from genome-wide association studies (data were available for 90.7% of the index patients) (15, 16). Within one nuclear family the most recent data were used in the analysis, thus taking into account all known ALS cases in estimating ALS risk.

Data collection
At time of inclusion in the registry, the index patients completed a structured family history questionnaire including information on the number of first-degree relatives (parents and siblings) and their history of ALS. The questionnaire was self-administered and, if necessary, patients were contacted by telephone to complete missing data. From September 2011 onwards, questions about relatives' age at death, cause of death, and age at time of questionnaire were added. The questionnaire data were verified and updated using more recent information collected from several sources. First, we used data from the familial ALS study, in which patients with (suspected) familial ALS and their relatives were invited to participate (17). Second, we used family history questionnaires completed by first-degree relatives of index patients included in the registry, and last, we used patients' medical records. Disease characteristics were collected from index patients' medical records on the day of diagnosis, and follow-up of survival was done until June 30, 2022. Additionally, data on carriership of the C9orf72 repeat expansion were collected for individuals included in the familial ALS study, of whom all siblings were tested for the repeat expansion, and with at least one known carrier in their nuclear family (17).

Genetic analysis
DNA was extracted from venous blood samples of the index patient according to standard protocols and procedures. Presence of the C9orf72 repeat expansion was determined using one of two previously validated methods: either by a repeat-primed PCR for the C9orf72 hexanucleotide repeat, defining alleles with 30 or more repeats as mutated, or by using ExpansionHunter to detect the presence of the C9orf72 repeat expansion in whole-genome sequencing (WGS) data (12, 18). Although repeat-primed PCR is considered the gold standard, ExpansionHunter showed an expansion detection accuracy of more than 99% (12).

Statistical analysis
The primary aim of the analysis was to estimate the age-related risk of ALS in carriers of the C9orf72 repeat expansion. Information available for each index patient comprised the number of first-degree relatives, the relative's event, defined as death from ALS, death from other causes, or censoring when alive at time of questionnaire, and the relative's age at event. A sample of the data is presented in Figure 1 for a single index patient. We assumed that the C9orf72 repeat expansion was transmitted with a Mendelian probability of 0.5 (19). In addition, we assumed that de novo mutations do not occur. As such, 50% of first-degree relatives were assumed to carry the C9orf72 repeat expansion at the population level (19). To support this assumption, we calculated the percentage of carriers of the C9orf72 repeat expansion amongst all siblings included in the familial ALS study, excluding index patients (17). To avoid underestimating ALS risk, we assumed that all ALS cases in our population were attributable to carriership; we thus aimed to estimate the "worst-case" scenario (upper bound) of the age-related risk of ALS in C9orf72. We estimated the risk by multiplying the cumulative incidence of ALS among all first-degree relatives by two, thereby answering the question: what is my risk of ALS if I carry the C9orf72 repeat expansion and I have a family member with ALS who carries the repeat expansion?
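The check on the 50% carrier assumption amounts to estimating a binomial proportion with a confidence interval. The sketch below is illustrative only: it uses the sibling counts reported in the Results (76 carriers among 145 tested siblings) and a Wald (normal approximation) interval, which is an assumption, since the text does not state which interval method was used. The function name `wald_interval` is hypothetical.

```python
import math

def wald_interval(successes, total, z=1.96):
    """Proportion with a Wald (normal approximation) 95% interval.

    Assumption: the interval method is not stated in the text;
    Wald is used here purely for illustration.
    """
    p = successes / total
    se = math.sqrt(p * (1 - p) / total)
    return p, p - z * se, p + z * se

# Counts as reported in the Results: 76 carriers among 145 siblings.
p, lower, upper = wald_interval(76, 145)
print(f"{p:.1%} ({lower:.1%} to {upper:.1%})")
```

With these counts the interval spans 0.5, which is what the assumption of Mendelian transmission requires.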
We used a cumulative incidence curve to model the occurrence of ALS among first-degree relatives while accounting for the competing event of death due to other causes (20). Missing data on age at death, age at censoring, or age at ALS death (missing in 59 of the 933 cases (6.3%)) were imputed by creating multiple imputed datasets (n = 100), disregarding the first 100 iterations (burn-in). The imputation model contained the following variables: age at event, event (censor, death, ALS), sex of the relative, sibling or parent, age and sex of the index patient, year of survey and family size (since family size could serve as an important confounder for disease risk) (21). Imputed values were generated using a curtailed non-linear regression model using bootstrapping. Cumulative incidence curves across imputations were pooled using Rubin's rules based on a complementary log-log transformation (22). Due to the clustering of observations within families, the variance around the cumulative incidences was estimated by means of bootstrapping (n = 10,000) and averaged across imputations ("MI Boot") (23).
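A cumulative incidence curve under competing risks can be sketched with a nonparametric (Aalen-Johansen type) estimator. The code below is a minimal illustration on made-up toy data, not the authors' implementation; it omits imputation, pooling and bootstrapping, and the event coding (0 = censored, 1 = ALS, 2 = death from other causes) and the function name `cumulative_incidence` are assumptions introduced here.

```python
def cumulative_incidence(ages, events, cause=1):
    """Nonparametric cumulative incidence of `cause` with competing risks.

    events: 0 = censored (alive at questionnaire), 1 = ALS,
            2 = death from other causes (hypothetical coding).
    Returns a list of (age, cumulative incidence) at each event age.
    """
    event_ages = sorted({a for a, e in zip(ages, events) if e != 0})
    event_free = 1.0   # probability of no event of any kind before this age
    incidence = 0.0
    curve = []
    for t in event_ages:
        at_risk = sum(1 for a in ages if a >= t)
        n_cause = sum(1 for a, e in zip(ages, events) if a == t and e == cause)
        n_any = sum(1 for a, e in zip(ages, events) if a == t and e != 0)
        # Only those still event-free can contribute new cases of `cause`.
        incidence += event_free * n_cause / at_risk
        event_free *= 1 - n_any / at_risk
        curve.append((t, incidence))
    return curve

# Toy data (invented for illustration): six relatives.
toy_ages = [40, 50, 60, 70, 80, 85]
toy_events = [0, 1, 2, 0, 2, 0]
toy_curve = cumulative_incidence(toy_ages, toy_events)
final_incidence = toy_curve[-1][1]
```

Treating deaths from other causes as a competing event, rather than as censoring, prevents relatives who died of something else from being counted as still at risk of ALS.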
We conducted the following sensitivity analyses: (1) complete case analysis, excluding all first-degree relatives with missing age or event, (2) fully imputed analysis, including all first-degree relatives, (3) excluding data from parents to minimize under-ascertainment of cases in older generations, (4) excluding questionnaires completed prior to 2011 to exclude data prior to the change in questionnaire, (5) excluding all data originating from outside the registry to exclude a potential effect of nonrandom missingness, (6) assuming Mendelian inheritance probabilities ranging from 0.4 to 0.6, to rule out that results were driven by random fluctuations of mutant allele transmission, and (7) excluding subsets of relatives based on the age of the index patient and family size (e.g. excluding relatives of index patients with small families), to exclude potential under-ascertainment of cases.
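Sensitivity analysis (6) can be illustrated by rescaling the cumulative incidence among all relatives by the assumed carrier fraction; multiplying by two is the special case of a fraction of 0.5. In the sketch below, the function name `carrier_risk` is hypothetical, and the input of 12.05% is back-calculated from the reported mean carrier risk of 24.1% under the 0.5 assumption, so the resulting figures are arithmetic illustrations rather than the published sensitivity results.

```python
def carrier_risk(incidence_all_relatives, carrier_fraction=0.5):
    """Risk among carriers, assuming all ALS cases occur in carriers."""
    if not 0 < carrier_fraction <= 1:
        raise ValueError("carrier_fraction must be in (0, 1]")
    return min(1.0, incidence_all_relatives / carrier_fraction)

# Back-calculated input (assumption): 24.1% carrier risk at a fraction of 0.5.
incidence_all = 0.241 / 2

for fraction in (0.4, 0.5, 0.6):
    print(f"carrier fraction {fraction:.0%}: "
          f"risk {carrier_risk(incidence_all, fraction):.1%}")
```

A lower assumed carrier fraction concentrates the same number of cases in fewer carriers and therefore raises the estimated carrier risk.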
Furthermore, to test the robustness of our primary analysis, we used a second distinct approach to estimate the disease risk using estimates regarding the prevalence of the C9orf72 repeat expansion in the general population, based on the 1958 UK birth cohort (n = 7,579), and lifetime risks of dying from C9orf72-related ALS or FTD. We combined the uncertainty around published estimates of C9orf72 prevalence in the general population, number of ALS and FTD patients with a C9orf72 repeat expansion and the lifetime risk of dying from ALS or FTD using the Delta method to estimate a confidence interval.
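The logic of this second approach is a ratio: the lifetime risk of dying from C9orf72-related disease divided by the carrier prevalence in the general population. The sketch below is a simplified illustration, not the authors' calculation. It uses the ALS inputs quoted in the Results (1 in 322 deaths, 8.5% carriers among ALS deaths, 11 carriers among 7,579 cohort members) and applies the Delta method on the log scale. As a loudly labeled simplification, only the sampling uncertainty of the carrier count is propagated, because the sample sizes behind the other inputs are not given in the text; the interval is therefore narrower than a full analysis would give, and the rounded inputs need not reproduce the published estimate exactly. All function names are hypothetical.

```python
import math

def risk_in_carriers(lifetime_risk, mutation_fraction, carrier_prevalence):
    """Lifetime risk of mutation-related disease divided by carrier prevalence."""
    return lifetime_risk * mutation_fraction / carrier_prevalence

def log_variance_of_proportion(k, n):
    """Delta-method variance of log(k/n) for a binomial proportion."""
    p = k / n
    return (1 - p) / (n * p)

def delta_interval(estimate, log_variances, z=1.96):
    """Interval for a product or ratio of independent estimates, log scale."""
    se = math.sqrt(sum(log_variances))
    return estimate * math.exp(-z * se), estimate * math.exp(z * se)

# Inputs quoted in the Results (ALS only).
als_risk = risk_in_carriers(1 / 322, 0.085, 11 / 7579)

# Simplification: only the carrier-count uncertainty is included here.
als_lower, als_upper = delta_interval(
    als_risk, [log_variance_of_proportion(11, 7579)]
)
print(f"{als_risk:.1%} ({als_lower:.1%} to {als_upper:.1%})")
```

Working on the log scale keeps the interval bounds positive and makes the variances of the numerator and denominator additive, which is why each further source of uncertainty would simply be appended to the list of log-variances.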
In addition, we used a mixed effects Cox proportional hazards model with a random intercept term for the baseline hazard to evaluate the variability in ALS risk between families and extracted the relative risk for each family. A relative risk of 2.0, for example, corresponds to a 100% higher hazard of ALS for members in that family compared to the average risk in families. The absolute family-specific risk of ALS could be calculated as: $1 - (1 - \text{overall risk of ALS})^{\text{family-specific HR}}$. We used a likelihood ratio test to evaluate the significance of the random effects and the variability in baseline hazard between families. In addition, we used a permutation test, randomly allocating family identifiers, to understand the variability in results if family was not associated with the baseline hazard.
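The conversion from a family-specific hazard ratio to an absolute risk follows from the proportional hazards assumption, under which the family-specific probability of remaining ALS-free is the overall probability raised to the power of the hazard ratio. A minimal sketch of that conversion, using the overall risk and the extreme hazard ratios reported in the Results as inputs (the function name `family_specific_risk` is hypothetical):

```python
def family_specific_risk(overall_risk, hazard_ratio):
    """Absolute risk for a family under proportional hazards:
    1 - (1 - overall risk) ** family-specific HR."""
    return 1 - (1 - overall_risk) ** hazard_ratio

overall = 0.241  # mean risk of ALS at age 80, as reported in the Results

low_risk = family_specific_risk(overall, 0.63)    # lowest reported relative risk
high_risk = family_specific_risk(overall, 3.37)   # highest reported relative risk
print(f"{low_risk:.1%} to {high_risk:.1%}")
```

The result is close to the reported range of 16.0% to 60.6%; small differences in the last digit would be expected from rounding of the inputs. Note that a hazard ratio of 1.0 returns the overall risk unchanged, and that the absolute risk does not scale linearly with the hazard ratio.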

Standard protocol approvals, registrations, and patient consents
The medical ethics committee and institutional review board of the University Medical Center Utrecht (METC NedMec) approved this study (Study Registration Number: METC . Written consent was obtained from all study participants before this study.

Data availability
All protocols, analyses, and anonymized data will be shared on request. We take full responsibility for data, analyses and interpretation, and conduct of the research.

Results
Between 1 January 2006 and 1 January 2020, 2,734 patients diagnosed with ALS provided informed consent, and 2,486 (90.9%) participated in the study. Of these, 2,316 were screened for the presence of the C9orf72 repeat expansion and 214 (9.2%) patients carried the pathogenic mutation, in line with previous reports (eTable 2) (5, 18, 24, 25). For ALS patients carrying the repeat expansion, there was no relationship between having familial ALS and participation in our nationwide population-based ALS study (eTable 3). Including one sibling from each nuclear family resulted in 207 index patients with a total of 1,160 first-degree relatives (Table 1). For the primary analysis, relatives with an unknown event (censor, death, ALS) were excluded. Of the remaining 933 relatives from 172 index patients (Table 2), 345 were deceased, 506 were alive at the time of the questionnaire, and there were 82 ALS cases.
The estimated risk of ALS at age 80 for first-degree relatives carrying the repeat expansion was 24.1% on average, ranging from 18.3% to 30.3% (Figure 2). We found that the risk of ALS was low prior to the age of 30 and reached a plateau at age 80 years or older, similar to known ALS incidence rates in the Netherlands (26). The results of the sensitivity analyses are shown in Figure 3. All sensitivity analyses showed a risk of ALS in line with the primary analysis. Using data from the familial ALS study, we found 76 of the 145 (52.4%; 95% CI 44.3% to 60.5%) siblings carried the repeat expansion (eTable 4).
As a second distinct approach, we used independent, published data to estimate risk of ALS or FTD in the general population: in 2017, 1 in 322 deaths in the Netherlands was due to ALS, and of the individuals who died from ALS, 8.5% carried the repeat expansion (26). Hence, 1 in 3,803 individuals is expected to die of C9orf72-related ALS. A study conducted in the United Kingdom showed the lifetime risk of FTD is 1 in 742 and a European study showed mutation frequency in FTD is 10.0% (27, 28). Based on a study conducted in the United Kingdom, 1 in 689 of the general population is a carrier of the C9orf72 repeat expansion (13). This resulted in an estimated risk of ALS for C9orf72 repeat expansion carriers of 18.4% (95% CI 9.9% to 41.7%) and risk of FTD of 9.4% (95% CI 5.1% to 21.2%), and a combined estimate of 28.7% (95% CI 17.8% to 54.3%).
Finally, we determined interfamilial differences in risk of ALS by allowing the baseline risk to vary between families (Figure 4A). The baseline hazard of ALS differed between families (p = 0.023), where relative familial risks ranged from a minimum of 0.63 to a maximum of 3.37 times the average familial risk. This translates to risk of ALS ranging from 16.0% to 60.6% at age 80 in single families. We found familial ALS risk was independent of family size (Pearson 0.08, p = 0.29), and the variation in family-specific ALS risk was larger than what would be expected if the risk was identical for each family (Figure 4B). Figure 5 and eFigure 1 illustrate pedigrees of similar-sized families with a family-specific relative ALS risk ranging from 0.63 to 3.37.

Figure 4. Family-specific risk of developing ALS. (A) Variability in baseline risk across families. Relative risks were calculated using a mixed effects Cox proportional hazards model with a random intercept term for the baseline hazard. A value of 2.0 indicates that the hazard in that family is twice the hazard of the entire study. (B) Overall, there was no clear correlation between family-specific hazard ratios and family size (Pearson 0.08, p = 0.29). The shaded grey area indicates the expected random variability in baseline hazards for each family size if the baseline hazard were identical across families, expressed as 95% confidence interval. The majority of the family-specific hazard ratios fall outside the 95% confidence interval, indicating that the variation in family-specific hazard ratios is larger than expected under random chance.

Discussion
We estimated the risk of ALS associated with the C9orf72 repeat expansion among first-degree relatives using family history data of patients known to carry the expansion. We found the mean risk of ALS at age 80 was 24.1%. We showed, however, that the risk of ALS ranged between individual families from 16.0% to 60.6%, and one overall penetrance estimate may not be sufficient to describe ALS risk. In addition, using a second distinct approach based on previously published data, we found an estimate consistent with the results of our primary analysis (mean 18.4%; 95% CI 9.9% to 41.7%).
Combined with published data on FTD, we found an overall risk of ALS or FTD in the general population of 28.7% (95% CI 17.8% to 54.3%). Our findings have important implications when counseling patients who carry the C9orf72 repeat expansion and their relatives, and for decisions related to investigating preventive gene therapy in asymptomatic carriers of the repeat expansion.
Although previous studies reported the penetrance of the C9orf72 repeat expansion to be age-related and incomplete, their estimated ALS risks were considerably higher, showing nearly full penetrance at older ages (5, 8-11). This discrepancy could be due to previous studies including solely study populations in which ALS as outcome was already known, leading to an overestimation of overall penetrance. The range in risk of ALS found in this study is in line with the expected risk based on the frequency of the C9orf72 repeat expansion in the general population (12, 13, 26-28). These findings were recently confirmed by another study, with penetrance estimates falling within the range found in our study (29).
Another important finding of our study is the interfamilial differences in relative risk of developing ALS. We found the variation in family-specific ALS risk was larger than what would be expected if ALS risk was identical for each family. This indicates that the risk may be family-specific, resulting in families with a higher or lower prevalence of ALS. The variability in ALS risk aligns with the multistep hypothesis of ALS, suggesting the C9orf72 repeat expansion, in isolation, is insufficient to cause ALS, but other genetic, epigenetic, or environmental factors are needed for the disease to manifest (30). The exact estimate of family-specific risk of ALS is, however, limited by family size (21). In this study, only first-degree relatives were included due to limited data. Larger familial studies, including second- and third-degree relatives, may provide more accurate family-specific estimates and help to refine counseling.
Our study has several limitations that need to be considered. First, we assumed a 50% carrier rate based on an autosomal dominant inheritance pattern, which could be incorrect (31, 32). However, in the familial ALS study, we found that a similar proportion of individuals carried the repeat expansion (52.4%; 95% CI 44.3% to 60.5%). In addition, we chose to estimate the "worst-case" scenario, attributing all ALS cases to carrying the repeat expansion, which might be incorrect, since even in highly penetrant conditions, an independent polygenic or environmental component might be at play (phenocopies). Nevertheless, if that is the case, true penetrance estimates would be even lower than presented here. Similarly, we found a higher risk of ALS when including only siblings of the index patients, suggesting potential under-ascertainment of ALS amongst parents. The use of questionnaires could lead to under-ascertainment of patients with cognitive or behavioral symptoms, or patients of older age, leading to an underestimation of the risk of ALS. Nevertheless, across a variety of sensitivity analyses based on these limitations, we obtained estimates all falling within range of the primary analysis. Importantly, besides ALS, the repeat expansion has been associated with a range of other diseases, most notably FTD (3, 33), which was not included in our questionnaire. Further studies are needed to estimate the risk of developing any of the diseases associated with the C9orf72 repeat expansion.
In conclusion, we showed that the risk of ALS associated with the C9orf72 repeat expansion may not be one fixed number but varies between individual families and on average is lower than previously reported. Our findings have important clinical implications. The range in risk of ALS challenges the practice of routinely offering C9orf72 testing to all patients for genetic counseling purposes (6). Alternatively, taking an extensive family history in patients prior to genetic testing can help to distinguish familial ALS from truly sporadic ALS, and thus families with a potentially high and low disease risk, should they carry the repeat expansion. The increased uncertainty in individual ALS risk should be considered in genetic counseling, as this could affect the decision to undergo genetic testing. Genetic testing could routinely be offered but then with the explicit intent to include patients in new clinical trials targeted at C9orf72 mutation carriers. Moreover, our findings also have consequences for gene-based clinical trials, and potentially future therapies, in asymptomatic carriers. The increased uncertainty in individual ALS risk highlights the need for preclinical markers, including neurofilaments, neurophysiology or imaging markers, to predict who is at increased short-term risk of developing ALS and needs to be treated in the pre-symptomatic phase (34).

Figure 1. Sample of available data of first-degree relatives of index patients. Age indicates age at onset for index patient, age at time of questionnaire (censor) for living relatives, or age at death for other relatives.

Figure 2. Age-related risk of ALS in the C9orf72 repeat expansion. Cumulative incidence curve of ALS in 933 first-degree relatives, adjusted for death from other causes. Censored indicates relative was alive at time of questionnaire. Shaded area represents bootstrapped 95% percentile interval.

Figure 3. Sensitivity analyses of risk of ALS in the C9orf72 repeat expansion. Primary analysis included relatives with a known event, imputing missing age, by creating multiple imputed datasets (n = 100) disregarding the first 100 iterations (burn-in), and assuming 50% of relatives are mutation carriers. Complete cases only included relatives with known age and event. Fully imputed included all relatives, imputing both age and event. Data since 2011 indicates questionnaires completed prior to 2011 were excluded. Registry data indicates data collected through the registry, excluding data originating from other sources. 40% carrier and 60% carrier indicate the assumed proportion of relatives carrying the repeat expansion was changed to 40% and 60%, respectively, reflecting the uncertainty in the Mendelian inheritance probability. Age at onset ≥60 years included only index patients with an age at disease onset ≥60 years. Family size ≥6 only included index patients with a family size ≥6.

Figure 5. Pedigree charts of two index patients. (A) Pedigree of index patient classified as sporadic ALS. (B) Pedigree of index patient classified as familial ALS. Given age indicates age at onset for index patient, age at time of family history questionnaire for living relatives and age at death for other relatives. RR = family-specific relative ALS risk; y = years.

Table 1. Clinical characteristics of index patients. ALSFRS-R slope, and vital capacity. Excluded from the primary analysis indicates that the event (censor, death, ALS) was missing for all first-degree relatives of the index patient. ALSFRS-R = revised version of the ALS functional rating scale. * Dysarthria or dysphagia is the presenting symptom in patients with bulbar onset. ** Familial ALS is defined as having one or more relatives diagnosed with ALS and/or FTD. *** According to the revised El Escorial criteria.

Table 2. Characteristics of included first-degree relatives.