Does the association between cognition and education differ between older adults with gradual or rapid trajectories of cognitive decline?

ABSTRACT
Education is associated with improved baseline cognitive performance in older adults, but the association with maintenance of cognitive function is less clear. Education may be associated with different types of active cognitive reserve in those following different cognitive trajectories. We used data on n = 5642 adults aged >60 from the English Longitudinal Study of Aging (ELSA) over 5 waves (8 years). We used growth mixture models to test if the association between educational attainment and rate of change in verbal fluency or immediate recall varied by latent class trajectory. For recall, 91.5% (n = 5164) of participants were in a gradual decline class and 8.5% (n = 478) in a rapid decline class. For fluency, 90.0% (n = 4907) were in a gradual decline class and 10.0% (n = 561) were in a rapid decline class. Educational attainment was associated with improved baseline performance for both verbal fluency and recall. In the rapidly declining classes, educational attainment was not associated with rate of change for either outcome. In the verbal fluency gradual decline class, higher (an additional 0.05–0.38 words per 2 years) or degree level education (an additional 0.04–0.42 words per 2 years) was associated with a slower rate of decline when compared to those with no formal qualifications. We identified no evidence of a protective effect of education against rapid cognitive decline. There was some evidence of active cognitive reserve for verbal fluency but not recall, which may reflect a small degree of domain-specific protection against age-related cognitive decline.


Introduction
Education in childhood and early adulthood is thought to be one of the most important sources of cognitive reserve, defined here as the degree of disease or age-related change that can be tolerated by the brain before impairment becomes apparent. (Barulli & Stern, 2013) This is demonstrated by the finding that educational attainment is associated with a reduced risk of a clinical diagnosis of dementia and higher cognitive performance on a range of measures. (Beydoun et al., 2014;Lipnicki et al., 2019;Meng & D'Arcy, 2012) However, the relationship between education and maintenance of cognition over time has been more contested. (Foverskov et al., 2018;Greenfield & Moorman, 2019;Lenehan et al., 2015) Several theoretical concepts of reserve have been developed, with contrasting hypotheses about the relationship between education and cognitive maintenance. One important question for understanding this relationship, which has not received a great deal of attention in the epidemiological literature, is whether the association between education and cognitive decline depends upon the underlying trajectory of cognitive function.

Theories of reserve
In 2018 different groups proposed alternative theoretical frameworks for understanding reserve. (E. M. Arenaza-Urquijo & Vemuri, 2018;Cabeza et al., 2018;Stern et al., 2018) In this paper we draw upon the framework of Stern et al. (2018). We will also refer to cognitive maintenance, defined not as any specific neural or disease process, but as the common end result of these processes observed by longitudinal measurements of cognitive function. We use this term to capture all active or dynamic processes as distinct from brain (passive) reserve.
Brain reserve can be defined as "neurobiological capital (numbers of neurons, synapses, etc.)". (Stern et al., 2018) Brain reserve is passive in the sense that, whilst it may be increased over the life-course, it is fixed at a given point in time. It may increase the time to the clinical expression of cognitive or functional impairment but does not directly affect any disease or aging processes. In a theoretical longitudinal study of cognitive function, greater levels of brain reserve would increase baseline performance, but would not affect the rate of decline. Examples of brain reserve would include the higher gray matter volumes or improved white matter tract integrity observed in healthy individuals with higher levels of education. (E. Arenaza-Urquijo et al., 2013;Boller et al., 2017;Chen et al., 2019;Teipel et al., 2009)
Cognitive reserve is defined as "the adaptability (i.e., efficiency, capacity, flexibility) of cognitive processes that helps to explain differential susceptibility of cognitive abilities". (Stern et al., 2018) This is a theoretical construct for the sum of additive and emergent effects resulting from "networks of brain regions associated with performing a task and the pattern of interactions between these networks". (Stern et al., 2018) This was previously associated with the term active reserve and emphasizes the dynamic functional capacity to respond to pathological or age-related changes. Unlike brain reserve, cognitive reserve is predicted to affect the rate of cognitive decline as the brain responds to advancing pathological changes. However, there are different theories of cognitive reserve that lead to differing hypotheses about the trajectory of cognitive change over time. (Lenehan et al., 2015) The two main theories of cognitive reserve are neural cognitive reserve and neural compensation reserve.
Neural cognitive reserve relates to the efficiency, capacity, and flexibility in the selection of primary networks responsible for performing a cognitive task. (Barulli & Stern, 2013) Neural compensation reserve is the recruitment of secondary networks to perform tasks after failure in the primary networks. If education contributes to neural cognitive reserve then there should be greater redundancy in the primary network to compensate for aging or pathological change. (Weiler et al., 2018) This greater efficiency in the primary networks should slow cognitive decline. If education instead contributes to neural compensation, then it enables the recruitment of secondary networks to compensate for damaged primary networks. (Colangeli et al., 2016;Serra et al., 2017) This theory predicts slow initial decline, which then accelerates rapidly as the secondary networks are overcome by the disease process. (Lenehan et al., 2015;Serra et al., 2017)

Education and cognitive function
Higher levels of education have been identified as a key modifiable protective factor against dementia. (Livingston et al., 2017) There is no debate that higher levels of education lead to later presentation of dementia. However, there is some ongoing debate about the nature of that protective effect. Earlier studies and the systematic reviews based on those studies largely found evidence that education improved cognitive maintenance in support of the neural cognitive reserve hypothesis. (Valenzuela & Sachdev, 2006) More recent reviews questioned the findings of these studies on the basis of methodological limitations. (Lenehan et al., 2015) Later cohort studies, especially those with three or more measurement occasions, have typically found no association between education and rate of decline, and therefore little evidence that education contributes to cognitive maintenance. (Gottesman et al., 2014;Helmes & Van Gerven, 2017;Lenehan et al., 2015;Lipnicki et al., 2019;Piccinin et al., 2013;Zahodne et al., 2011) Nonetheless, some authors continue to find that education appears to have at least some role in protecting against cognitive decline. (Foverskov et al., 2018;Greenfield & Moorman, 2019;Zahodne et al., 2015) Not all of these studies are limited by two or fewer measurement occasions. Additionally, some analyses, such as Greenfield and Moorman (2019), examine specific cognitive domains, whilst others, such as Zahodne et al. (2015), use a measure of global cognitive function. Thus, neither the outcome used nor the number of measurement occasions appears to explain the continuing mixed results. Of those studies which continue to find an association between education and cognitive function in population samples, the effect of education has typically been small in comparison to the effect on baseline performance. It is possible that some measurement instruments lack the sensitivity to detect small differences in change over time, even with the often large samples used. (Lipnicki et al., 2019;Zahodne et al., 2015) Different educational and social systems could result in varying effects on cognition; however, there is no obvious pattern of national characteristics that predicts whether a study will find an association or not.
Some studies have observed that cognitive decline may be faster in individuals with dementia who have higher levels of education. (Barulli & Stern, 2013;Meng & D'Arcy, 2012;Yu et al., 2012) This raises the possibility that one contributor to the conflicting findings regarding the association of education with cognitive change is that different mechanisms of reserve may be utilized in health and disease. A difference between healthy old age and dementia found in epidemiological or clinical studies is supported by evidence from functional magnetic resonance imaging studies showing that different mechanisms of compensation are utilized depending on disease status. (Colangeli et al., 2016) It is not known whether this effect would also be seen in those with pre-clinical dementia pathology and observed in longitudinal population studies.

Education, reserve, and population heterogeneity
Many of the major analyses of the association between education and cognition implicitly assume that all older adults in the analysis are from the same population, and therefore share the same underlying trajectory (or random effects around this). (Gottesman et al., 2014;Tucker-Drob et al., 2009;Zahodne et al., 2011) It is likely that the samples for these studies were drawn from at least two latent sub-populations: those with a pre-clinical dementia pathology (a high burden of tau, amyloid, TDP-43, and/or vascular pathology in the absence of functional impairment) and those without. (Braak & Del Tredici, 2015;Nelson et al., 2019;Riley et al., 2011) In analyses that assume a homogeneous population when there are two or more sub-populations, the estimated longitudinal change will be biased away from both true trajectories. In particular, it is possible that an association between longitudinal change and education in the minority of participants may be obscured. This is important to consider given the evidence of a more rapid decline in cognitive function amongst more highly educated patients with dementia. (Meng & D'Arcy, 2012) An association between education and cognition amongst those with a declining trajectory suggestive of pre-clinical dementia pathology may be lost if that subpopulation is not identified. Ideally, one would have a direct measurement of brain status such as measurements of pathological burden or gray matter volume. (Stern et al., 2018) Cognitive reserve can then be tested as an interaction between brain status and education in their association with cognitive performance. In many longitudinal aging studies, measurements of brain status are not available or available only from a single occasion. However, one can identify a latent sub-population with a more rapid decline in cognitive function from a population sample.
One common method used to identify latent subclasses with different rates of change over time is growth mixture modeling (GMM) or related longitudinal mixture models. (Muthen, 2004) These give the opportunity to test whether the more rapid rate of decline seen with greater education in clinical samples is also seen in those with probable pre-clinical dementia, in the absence of more direct measurements. Hayden et al. and Pietrzak et al. combined genotypic and clinicopathological data with GMM and found that membership of a rapidly declining latent class was strongly associated with a higher relative risk of amyloid beta pathology and apolipoprotein ε4 carrier status. (Hayden et al., 2011;Pietrzak et al., 2014) This suggests GMM is able to identify those in a pre-clinical disease state. (Riley et al., 2011) Several other studies have also used GMMs to address the issue of rates of change in cognitive function in population samples with latent sub-populations. (Hayden et al., 2011;Marioni et al., 2014;Muniz-Terrera et al., 2010;Olaya et al., 2017;Pietrzak et al., 2014;Royall et al., 2014;Small & Bäckman, 2007) Of the studies which have used GMM or closely related longitudinal mixture models to analyze cognitive trajectories and education, most have used education as a predictor of class membership. (Ding et al., 2019;Hayden et al., 2011;Lee et al., 2018;Marioni et al., 2014;Min, 2018;Olaya et al., 2017;Pietrzak et al., 2014;Royall et al., 2014;Small & Bäckman, 2007;Tampubolon et al., 2017) The results of these studies have been conflicting, with some finding a strong association between the class of cognitive trajectory and education and others finding none. The often-implicit assumption underlying models where education predicts class membership is that education's effect on cognition is mediated via the process underlying the latent classes.
Whilst there may be sub-classes of cognitive function within healthy aging, the single greatest determinant of class is likely to be disease. This will include a range of pathologies, but in a population study by far the most substantial and frequently co-morbid are tau/amyloid pathology, vascular disease, and the more recently described limbic-predominant age-related TDP-43 encephalopathy. (Nelson et al., 2019;Santos et al., 2017) If we assume that the latent class structure is driven principally by disease state, then using education to predict class implies that education affects disease status, which is unobserved in most population studies. However, clinicopathological studies have generally found that education is not associated with the quantity of tau and amyloid observed postmortem. (Koepsell et al., 2008;Roe et al., 2007;Serrano-Pozo et al., 2013) In the theoretical model used in this analysis (see Figure 1), the mechanism underlying the latent classes in cognitive function is driven by the presence or absence of pathology. The effect of education is allowed to vary by latent class. This tests whether there is evidence for differing mechanisms of cognitive reserve dependent on an underlying trajectory, which is assumed to be closely related to underlying pathology. Muniz-Terrera et al. have previously utilized a similar theoretical model to examine the association between education and decline within class using the mini-mental state exam (MMSE). (Folstein et al., 1975;Muniz-Terrera et al., 2010) They found 2 sharply declining classes and 1 high-performance group with a very slight decline over time. A lower level of education was associated with a more rapid decline in the high-performance class, but not in either of the two sharp decline classes. However, the MMSE is known to have a strong ceiling effect which can conceal change in high-performance groups.
We sought to develop previous research by testing whether the association of education with decline in semantic fluency and immediate recall is moderated by latent class of change over time. Using latent class membership as an estimate of underlying disease status, this will test the hypothesis that different mechanisms of cognitive reserve are utilized in different disease states. To do this, we utilized data from the English Longitudinal Study of Aging (ELSA), a large multidisciplinary study of aging.
Figure 1 footnote: C = latent class of change over time; X = all time invariant covariates; Y1-Y5 = the outcome at waves 1 through 5; D1-D4 = whether participants died or dropped out at each wave 2-5; I = latent intercept; S = latent linear rate of change; Q = latent quadratic rate of change.

Materials and methods
Participants and procedure
ELSA has been described in detail previously. (Steptoe et al., 2013) The study sample was drawn from participants in Health Survey for England (HSE) years 1998, 1999, and 2001 who were born before 1 March 1952 and living in a private household or those in their households who were new partners or ≤50 years old. This initial sample was nationally representative of the age-specific English population. Data were collected in biennial sweeps by an interview in the participants' homes. Core sample data from waves 1 (2002) to 5 (2010) were utilized because the core cognitive battery was kept consistent through this time. Individuals born after 1941, who were therefore aged 60 or less at the first wave, were excluded. This eliminates at least one source of cohort effect (prenatal exposure to World War 2 rationing) and restricts the analysis to those more likely to show a greater degree of cognitive decline.
Of the full sample eligible for analysis of n = 5643 at wave 1, 103 were excluded due to missing data on gender, ethnicity, education, or baseline cognitive function. Dropout or death accounted for 1256 participants between waves 1 and 2, 765 between waves 2 and 3, 634 between waves 3 and 4, and 533 between waves 4 and 5. For verbal fluency, the latent trajectory class structure was initially driven by small numbers of extreme outlying observations. Outliers were identified by regressing each measurement occasion on the previous one. Observations with standardized residuals >2.9 or <-2.9 were checked individually. They were coded as missing from the analysis if they were inconsistent with the other results for those individuals (for example, a 0 despite normal performance on other tests or results far higher or lower than for the same individual both before and after that occasion). This removed 81 observations at wave 2, 111 observations from wave 3, 99 observations from wave 4, and 91 observations from wave 5.
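The screening rule described above can be sketched in code. The following is a minimal illustration only, assuming an ordinary least squares regression of one wave's score on the previous wave's score, with residuals standardized by the residual standard deviation. The function name, the example data, and the exact form of standardization are assumptions made for illustration and are not taken from the analysis itself.

```python
from math import sqrt
from statistics import mean


def flag_outliers(prev, curr, cutoff=2.9):
    """Return indices whose standardized residual exceeds +/- cutoff.

    prev, curr: scores for the same individuals at consecutive waves.
    Assumption: residuals are divided by the residual standard deviation
    (n - 2 degrees of freedom); no leverage adjustment is applied.
    """
    mx, my = mean(prev), mean(curr)
    sxx = sum((x - mx) ** 2 for x in prev)
    sxy = sum((x - mx) * (y - my) for x, y in zip(prev, curr))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(prev, curr)]
    sd = sqrt(sum(r * r for r in resid) / (len(resid) - 2))
    if sd == 0:  # perfectly linear data: nothing to flag
        return []
    return [i for i, r in enumerate(resid) if abs(r / sd) > cutoff]


# Hypothetical fluency scores: 30 people, one implausible 0 at the later wave.
prev_wave = list(range(10, 40))
curr_wave = [x - 1 for x in prev_wave]
curr_wave[15] = 0
print(flag_outliers(prev_wave, curr_wave))
```

The function only flags candidates. In keeping with the procedure described above, each flagged observation would still be checked individually against that person's other results before being coded as missing.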

Cognitive measures
The cognitive tests were performed by computer-assisted interview. Of the cognitive measures in ELSA, orientation to time, delayed recall, and the prospective memory task were not utilized due to strong ceiling or floor effects. (Marmot et al., 2014) Immediate recall and verbal (semantic) fluency were utilized because the floor effects were much weaker (supplementary figures 1 and 2). To assess immediate recall, 10 common words were played to participants, which they were asked to repeat immediately after the presentation. The word lists used were randomly assigned and a standardized recording was used for all participants. Semantic (category) fluency was assessed by asking participants to name as many animals as they could in 1 minute.

Education and covariates
Educational attainment was recorded as no formal qualifications (reference category in all analyses, school leaving at age 14 or later with no examinations completed), high school completion (O-levels or equivalent, school leaving at age 16 with qualifications), 6th form completion (A-levels or equivalent, leaving at age 18 with qualifications), non-degree level higher education (any education above A-level not leading to a degree) and undergraduate degree or above. Age at baseline was centered for the analysis and wave of the study was used as the metric of time for all analyses. Gender and ethnicity (white and nonwhite) were treated as binary.

Statistical analysis
The models used in this analysis were composed of a growth mixture model and a simultaneously estimated informative missingness model. Separate models were estimated for immediate recall and verbal fluency. For each GMM, linear growth was initially specified, and quadratic or cubic curves were tested for improvement in model fit. Class-specific intercepts and slopes were specified. The latent intercept and slope were regressed on all covariates. The effect of each covariate was allowed to vary by class. This tests the primary hypothesis that the association between education and cognitive maintenance will vary by latent class. Pairwise interactions between gender, age, and educational attainment were tested for fluency and recall. Interactions meeting statistical significance were found between education and age for fluency and between education and gender for recall. These interactions were small in magnitude, minimally improved model fit, and added a large number of parameters to the model, whilst not altering the conclusions regarding our substantive question. They were not included in the final model. No interaction significantly predicted dropout. Other available potential pre-education confounders (parental smoking, family structure in childhood, and parental occupation) were tested, but were not significantly associated with cognitive function; the results are not presented.
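As a rough illustration of the model structure described above, the sketch below computes a class-specific expected score at a given wave from a latent intercept and polynomial growth factors, with a single education indicator shifting both the intercept and the linear slope. It assumes time is coded as wave index 0 to 4. All parameter names and values are hypothetical and are not estimates from this analysis.

```python
def expected_score(t, intercept, linear, quadratic=0.0, cubic=0.0,
                   edu=0, edu_on_intercept=0.0, edu_on_slope=0.0):
    """Expected outcome at time t for one latent class.

    edu is a 0/1 indicator for one education category versus the
    reference category; its effects are class-specific inputs.
    """
    i = intercept + edu_on_intercept * edu  # covariate-adjusted intercept
    s = linear + edu_on_slope * edu         # covariate-adjusted linear slope
    return i + s * t + quadratic * t ** 2 + cubic * t ** 3


# Hypothetical class-specific parameters (illustration only).
CLASSES = {
    "gradual": dict(intercept=20.0, linear=-0.5,
                    edu_on_intercept=2.0, edu_on_slope=0.2),
    "rapid": dict(intercept=18.0, linear=-3.0,
                  edu_on_intercept=2.0, edu_on_slope=0.0),
}

for name, params in CLASSES.items():
    for edu in (0, 1):
        curve = [round(expected_score(t, edu=edu, **params), 2)
                 for t in range(5)]
        print(name, "edu =", edu, curve)
```

Letting `edu_on_slope` differ between the entries of `CLASSES` is the code analogue of allowing the effect of education on the rate of change to vary by latent class.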
Missing data were handled using a not missing at random (NMAR) Beunckens' model. This was jointly modeled with the GMM. With NMAR data there is a latent process driving loss to follow-up and resulting in nonrandom attrition. This is likely to result in inflated observed cognitive scores and upwardly biased estimates of latent cognitive function. Missingness is therefore incorporated into the estimation of the latent classes and a dependency is introduced between latent cognitive function and missingness. In practice, this is achieved by regressing missingness at each wave on observed covariates, the latent variables for the outcome (intercept and growth factors), and latent class. (Beunckens et al., 2008) We initially wished to model non-response and death separately, as they are determined by separate, if correlated, processes. However, attempts to model death and dropout separately led to model under-identification. They were therefore modeled jointly using a single variable. After assessing model fit, the immediate recall model utilized only the intercept to predict missingness. In the verbal fluency model, both intercept and slope independently predicted missingness and were retained. The effect of each variable on missingness was fixed to be equal across all waves. Allowing the regression of missingness on covariates to vary by class did not improve fit for either model. See Figure 1 for the generalized representation of the structural equation model used.
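The dependency between latent cognitive function and missingness can be sketched as a regression of a binary missingness indicator on the latent growth factors. The example below is illustrative only: it assumes a probit link, and the threshold and coefficient values are hypothetical rather than estimates from this analysis. Consistent with the description above, the same coefficients would be applied at every wave.

```python
from statistics import NormalDist


def missing_probability(latent_intercept, latent_slope, threshold,
                        b_intercept, b_slope=0.0):
    """Probability of being missing at a wave under an assumed probit link.

    latent_intercept, latent_slope: an individual's latent growth factors,
    expressed here as deviations from their class means (an assumption).
    threshold: hypothetical missingness threshold, which could differ by class.
    """
    linear_predictor = (-threshold
                        + b_intercept * latent_intercept
                        + b_slope * latent_slope)
    return NormalDist().cdf(linear_predictor)


# Hypothetical coefficients: lower latent ability raises dropout risk.
for ability in (-2.0, 0.0, 2.0):
    p = missing_probability(ability, 0.0, threshold=1.0, b_intercept=-0.3)
    print(ability, round(p, 3))
```

With a negative coefficient on the latent intercept, individuals with lower latent cognitive function are more likely to be missing, which is the pattern of nonrandom attrition the joint model is intended to absorb.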
We used a one step approach to determining the number of latent classes. In other words, the number of latent classes was estimated with covariates included. To determine the number of latent classes we used Rousseau and Mengersen's over-fitting method. (Nasserinejad et al., 2017;Rousseau & Mengersen, 2011) This method requires setting a cutoff for the proportion of participants which makes up a substantively important latent class, for example, 5% of the participants (a posterior mode of 0.05). One then estimates an overfit model with a number of classes much larger than expected (up to 10 classes in the original paper). (Rousseau & Mengersen, 2011) The number of classes for the substantive analysis is then chosen by the number of classes in the overfit model which exceed the cutoff. A new model with the number of classes exceeding the cutoff can then be run and checked for global fit to the data, entropy, and substantive coherency. For example, if 4 out of 10 latent classes contained greater than 5% of the participants in the overfit model, a 4-class model would be chosen for the substantive analysis. That model would then be checked for global fit. If the model is a poor fit to the data, poorly differentiates between classes, or is substantively incoherent, then either the choice of posterior mode can be revisited or the model specification can be reviewed.
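The final counting step of this procedure is simple enough to sketch. The snippet below is a minimal illustration, assuming the overfit model has already been estimated and its posterior class proportions extracted; the function name and the example proportions are hypothetical.

```python
def choose_n_classes(class_proportions, cutoff=0.05):
    """Count classes in an overfit model whose proportion meets the cutoff."""
    if abs(sum(class_proportions) - 1.0) > 1e-6:
        raise ValueError("class proportions should sum to 1")
    return sum(1 for p in class_proportions if p >= cutoff)


# Hypothetical 10-class overfit model in which 4 classes are non-negligible.
overfit = [0.40, 0.30, 0.15, 0.10, 0.02, 0.01, 0.01, 0.005, 0.003, 0.002]
print(choose_n_classes(overfit))
```

The count only suggests how many classes to carry forward. The model with that number of classes would still need to be re-estimated and checked for global fit, entropy, and substantive coherence, as described above.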
In our analysis, six classes were specified for our overfit model. Because of the large number of parameters differing between classes, models with higher numbers of classes failed to converge. Our pre-specified cutoff for the posterior mode was ≥0.05, with a Dirichlet prior for the class proportion of (5,3) for fluency and (4,3) for recall (half the number of free parameters between classes). For both verbal fluency and immediate recall, this method identified two classes meeting the prespecified cutoff. Model fit for the two-class models was then assessed using the Bayesian posterior predictive p value (PPPV), entropy, and whether the classes were substantively coherent. Other measures of global fit, such as the Bayesian information criterion, are not calculated for GMMs in MPlus 7.0. Weakly informative priors were used for all regression coefficients, missingness thresholds, and class-specific latent intercepts and means. Direct comparison of the composition of the classes for fluency and recall was not done due to the inability to export class membership from Bayesian mixture models in MPlus 7.0. The data were edited using Stata version 12 and the structural equation

Results
The participant demographics can be seen in (Table 1), which compares participants in the first wave with those remaining at wave 5. As the study progressed the remaining participants were younger, more likely to be female, more likely to be white, and less likely to have no formal educational qualifications.
For a 1-class fluency model the PPPV was 0.078 (−12.8 to 78.5 credible interval for a difference between the observed and replicated chi-squared values), for the 2 class  (Table  2). It can be seen that the proportions of each gender and educational category are similar across classes for recall and fluency. The mean age is somewhat higher in the rapidly declining classes.

Verbal fluency
The coefficients from the GMM for verbal fluency are shown in (Table 3) and estimated mean curves in (Figure 2). In the first verbal fluency latent class (gradual decline or probable healthy cognitive aging), the latent intercept in a number of animals named   The association of education with latent intercept in the gradual decline fluency class showed essentially a dose-response relationship, with greater education associated with higher baseline fluency scores. In this class lower levels of educational attainment were not associated with change over time, but higher levels of educational attainment were associated with a modest decrease in the rate of decline.
In the rapid decline fluency class, level of education was significantly associated with intercept only for high school education. Sixth form, non-degree higher or degree level education was not associated with the intercept in this class. Although mostly nonsignificant, the point estimates showed a similar dose-response pattern to that seen in the gradual decline class. In the rapid decline fluency class, no level of educational attainment was associated with the rate of decline.

Immediate recall
The coefficients from the GMM for immediate recall are shown in (Table 4) and estimated mean curves in (Figure 2). The model for immediate recall in the gradual decline latent class estimated a latent intercept in number of words correctly recalled of 4.27 (95% CI  19-4.36). The linear rate of change was 0.16 (95% CI 0.03-0.29), the quadratic rate of change was −0.19 (95% CI −0.28 to −0.11) and the cubic rate of change was 0.03 (95% CI 0.02-0.05). In the rapid decline class, the latent intercept was 4.09 (95% CI 3.67-4.50) words recalled correctly. The linear rate of change was 0.35 (95% CI 0.08-0.60), the quadratic rate of change was −0.13 (95% CI −0.05 to 0.33) and the cubic rate of change was −0.10 (95% CI −0.14 to −0.06).
The association with education for the latent intercept of recall in the gradual decline class showed a dose-response relationship, similar to fluency, with greater education associated with higher baseline recall scores. In the gradual decline class, no level of educational attainment was associated with change over time. In the rapid decline class, level of education was associated with intercept for sixth form and degree level attainment but not high school or non-degree higher education. In the rapid decline class, no level of educational attainment was associated with the rate of decline.

Discussion
The aim of this analysis was to test whether the latent class of change over time moderates the association between educational attainment and decline in semantic fluency or immediate recall. By making the effect of education on cognition direct, and allowing that association to vary by class, this study provides a different perspective to that provided by prior research which has used education to predict class. We used a flexible modeling approach to test predictions produced by different theories of cognitive reserve whilst incorporating a novel method to account for non-random attrition associated with cognitive trajectory.
For both verbal fluency and immediate recall, we identified classes of gradual decline and rapid decline in cognition. The verbal fluency rapid decline class was estimated to decline at an initial rate around 10 times as fast as the probable healthy aging class, and this decline also accelerated more rapidly. For immediate recall, neither class showed much initial decline, but performance in the rapid decline class fell very sharply in later waves. For neither cognitive measure was educational attainment associated with the rate of decline in cognitive function amongst the rapidly declining classes. There was a suggestion that those with the highest levels of education did have a slightly slower rate of decline for verbal fluency, but not immediate recall. The difference in the rate of decline in the verbal fluency gradual decline class was approximately equivalent to being 4 years younger for both higher education and degree education.
Contingent upon the assumption that the latent classes did approximate underlying disease status, our results are consistent with brain reserve being the predominant form of reserve, with the suggestion of a small degree of neural compensation reserve for cognitively healthy older adults with the highest levels of educational attainment. (Lenehan et al., 2015) This may suggest that higher levels of education provide a small degree of protection against age-related or very early pathological decline in cognition. (Kim et al., 2019) However, our findings are largely consistent with the majority of analyses from a large number of aging studies across industrialized nations which find that education has either small or no association with change in cognition over time. (Gottesman et al., 2014;Lenehan et al., 2015;Lipnicki et al., 2019;Piccinin et al., 2013;Zahodne et al., 2011) This analysis extends this previous work by demonstrating that this largely remains true even after underlying population heterogeneity and informative dropout have been taken into account. This does not necessarily constitute evidence against education contributing to neural compensation or neural cognitive reserve in individuals with rapid decline. Given that both models have empirical support but generate opposing predictions, it is possible both mechanisms are operating with a net result of minimal differences in cognitive maintenance by education. (Oh et al., 2018) Our findings both agree and contrast with those of Muniz-Terrera et al., who used a very similar statistical methodology. (Muniz-Terrera et al., 2010) They found that lower levels of education predicted a faster decline in their high-performance class. Due to the strong ceiling effect of the MMSE, it can appear that those with lower levels of education decline faster.
However, in our analysis using verbal fluency, with little to no ceiling effect, we similarly observed that those with higher or degree level education showed slightly better cognitive maintenance than those with no formal qualifications or secondary school level educational attainment. This is what would be expected if education provides a small degree of neural reserve in older adults showing only gradual decline. Unlike Muniz-Terrera et al., we did not observe the same effect for immediate recall of a 10-word list. This difference may stem from our use of specific cognitive tests rather than a measure of global cognitive function. Our analysis supports the suggestion that the association between education and cognitive maintenance is likely to be domain specific. (Lavrencic et al., 2018;Ritchie et al., 2015;Rodriguez et al., 2019) It is plausible that educational attainment would be more closely associated with verbal skills than short-term memory alone. (McDaniel & Einstein, 2011) Simple span and immediate free recall short-term memory tasks have previously been found to be less strongly associated with education than other cognitive tasks. (Chen et al., 2019;O'Shea et al., 2018;Ritchie et al., 2015) There is little role for strategizing in immediate free recall tasks, with participants typically starting at the first or last word depending on the length of the word list presented. (Tan et al., 2010) This means that there is comparatively limited scope to employ learned strategies as compared to more complex tasks. Additionally, as they do not require memory consolidation, these tasks tend not to rely on hippocampal structures, which may be larger and show greater connectivity in older adults with higher levels of education. (E. Arenaza-Urquijo et al., 2013;Kramer et al., 2007;Noble et al., 2012;O'Shea et al., 2018) As the hippocampus is a crucial site for the development of Alzheimer's disease or limbic-predominant age-related TDP-43 pathological changes, this may explain why the decline in immediate recall occurred somewhat later than that of semantic fluency. (Braskie & Thompson, 2013;Nelson et al., 2019) Semantic fluency itself relies upon a combination of predominantly frontal executive functions and predominantly temporal semantic memories of objects, and is thus more likely to benefit from the greater connectivity associated with higher educational attainment. (E. M. Arenaza-Urquijo et al., 2013;Hirni et al., 2013;Rascovsky et al., 2007;Reverberi et al., 2014;Sheldon & Moscovitch, 2012)
Several studies, including previous analysis of ELSA data, have found 3 or 4 latent classes of cognitive function. (Hayden et al., 2011;Olaya et al., 2017) Of those studies with 3 or 4 classes, the pattern is frequently of 2-3 ordinal classes and 1 qualitatively different class (for example, the 3 stable classes with differing baseline performance and 1 declining class seen in Olaya et al.). (Olaya et al., 2017;Royall et al., 2014) Allowing the effects of education and age to vary within the class, rather than predict class, would be likely to result in the loss of the ordinal classes (whose differences in baseline performance are instead modeled as a function of education within the class) and the preservation of the qualitatively different trajectories.
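The distinction between education predicting change within a class and education predicting class membership can be written schematically. The following is a simplified sketch and not the fitted model: all symbols ($E_i$ for educational attainment, $c_i$ for latent class, and the coefficients $\alpha_k$, $\beta_k$, $\gamma_k$, $\delta_k$, $\pi_k$, $\eta_k$) are notation assumed here for illustration, and covariates such as age, as well as the dropout model, are omitted.

```latex
% Within-class specification (schematic): education E_i shifts the intercept
% and the slope of individual i inside latent class k.
y_{it} \mid (c_i = k) = (\alpha_k + \gamma_k E_i + u_{0i})
                      + (\beta_k + \delta_k E_i + u_{1i})\, t + \varepsilon_{it}

% Alternative specification (schematic): education predicts class membership.
\Pr(c_i = k \mid E_i) = \frac{\exp(\pi_k + \eta_k E_i)}{\sum_{j} \exp(\pi_j + \eta_j E_i)}
```

Under the first form, a class-specific $\delta_k$ is what allows the association between education and rate of change to differ between the gradual and rapid decline classes, while baseline differences by education are absorbed by $\gamma_k$ rather than by additional ordinal classes.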

Strengths and limitations
Though a limited range of measures was available, the relative lack of ceiling and floor effects in the measures used is an important strength of this analysis. Another important strength of this study is that education is used to predict change within the class and not class itself. For the reasons described in the introduction, we consider that the GMM used in this analysis more accurately represents the results of clinicopathological studies in a population setting. Additionally, there are many strengths of the ELSA dataset in general including, but far from limited to, the large sample size, a representative general population sample, and good duration of follow-up. (Steptoe et al., 2013) The large number of individuals with lower levels of educational attainment is of special relevance to this study as it provided a range of exposure to education. This not only improves statistical power but also reduces the chances of our results being due to sampling bias. Although we adjusted for the effect of nonwhite ethnicity, the ELSA sample reflects the older adult English population at the time of recruitment and thus is not highly diverse in terms of language or ethnicity. Nonetheless, the results of this analysis are likely to be generalizable to similar populations of older adults in industrialized western nations.
The inclusion of an informative missingness model is an important strength of the analysis, as it relaxes the missing at random assumption for at least one missingness process. Over the follow-up period, the regression estimates including the dropout model for the gradual decline classes showed an approximately 0.3-word decline in immediate recall and a 2-word decline for verbal fluency. This compares to the 0.3-word increase in immediate recall and 0.6-word increase in verbal fluency in the raw data. This suggests that the model was working as intended to counter the effect of nonrandom higher rates of attrition in the cognitively disadvantaged.
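The direction of this attrition effect can be illustrated with a small simulation. This is a hypothetical sketch, not the ELSA data or the dropout model used in the analysis: the function name, the score distribution, the dropout probabilities, and the threshold are all invented for illustration. It compares the true mean change in a cohort with the change seen in the raw means of those who remain in the study, when lower scorers are more likely to drop out.

```python
import random


def simulate_attrition(n=20000, waves=5, seed=0):
    """Return (true_means, observed_means) per wave under informative dropout.

    All parameter values are illustrative assumptions, not ELSA estimates.
    """
    rng = random.Random(seed)
    true_sum = [0.0] * waves
    obs_sum = [0.0] * waves
    obs_n = [0] * waves
    for _ in range(n):
        intercept = rng.gauss(20.0, 5.0)   # baseline score (words)
        slope = rng.gauss(-0.5, 0.5)       # change per wave (words)
        in_study = True
        for w in range(waves):
            score = intercept + slope * w + rng.gauss(0.0, 2.0)
            true_sum[w] += score           # everyone contributes to the truth
            if in_study:
                obs_sum[w] += score        # only retained participants are observed
                obs_n[w] += 1
                # Dropout before the next wave is more likely after a low score.
                p_drop = 0.40 if score < 17.0 else 0.05
                if rng.random() < p_drop:
                    in_study = False
    true_means = [s / n for s in true_sum]
    obs_means = [s / c for s, c in zip(obs_sum, obs_n)]
    return true_means, obs_means


if __name__ == "__main__":
    true_means, obs_means = simulate_attrition()
    print("true change:    ", round(true_means[-1] - true_means[0], 2))
    print("observed change:", round(obs_means[-1] - obs_means[0], 2))
```

In this toy setting the raw means understate the true decline because the retained sample is progressively selected for higher scores, which is the pattern a dropout model is intended to correct.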
One weakness of this analytic approach is that the classes identified are classes of both cognitive decline and missingness pattern. (Muthen et al., 2011) Whilst these processes are closely linked, it would be preferable to model them separately. Unfortunately, Bayesian estimation using multiple-membership latent classes is not yet implementable within available software. The use of Bayesian estimation could be seen as a weakness, as Bayesian mixture modeling can be sensitive to prior specification. However, Bayesian estimates tend to converge with frequentist estimates when weakly informative priors like those used here are specified, and Bayesian methods provide an efficient means of estimation for complex mixture models. (Depaoli et al., 2017;Helm et al., 2017)
A substantial limitation of the current analysis is the unavailability of a measure of brain status. (Stern et al., 2018) As explained in the 2018 whitepaper from the Reserve, Resilience and Protective Factors PIA Empirical Definitions and Conceptual Frameworks Workgroup, studies of cognitive reserve should ideally have a sociobehavioural proxy for reserve (education in our case), cognitive performance outcomes, and a measure of brain status. ELSA does not contain measures of brain status. We were able to follow this recommendation insofar as we utilized mixture modeling to infer from the available data the subsample most likely to have substantial pathology, and then tested for an interaction between latent class and education. Whilst this makes use of the available data to approximate the approach advocated by Stern et al., it does not replace direct functional or volumetric measurements. Further research may wish to extend the current work by studying the relationship between education and observed cognitive function while incorporating an additional latent class of change in a measurement of brain status.
In addition to measures of brain status, other important variables that are not present in the ELSA data are childhood intelligence and confirmation of self-reported educational attainment. Intelligence at age 11 may attenuate the associations between education and observed cognitive status and gray matter volumes. (Cox et al., 2016;Gow et al., 2011) Measurement error in self-reported educational attainment has previously been found to lead to underestimation of the effect of education on performance in recall and fluency tasks. (Foverskov et al., 2018) With these important caveats, it seems relatively unlikely that the results have been unduly influenced by unmeasured confounding. The early life measures preceding education that were available to us did not alter the principal finding of no or minimal association between educational attainment and rate of decline for most participants. Other unmeasured confounders would be anticipated to bias results away from, rather than toward, the null hypothesis.
Our analysis does not account for the various post-education pathways to later life cognition. This being the case, our results cannot show how much of the observed lack of association is caused by mediating pathways rather than being the direct effect of education itself. We considered the inclusion of a range of post-education mediators such as adult social status, cardiometabolic risk factors, or cognitively stimulating activities. Many researchers do condition upon these covariates with the aim of estimating a direct effect of education, and the life-course approach has much to commend it. However, their inclusion introduces a large number of additional modeling assumptions, such as no interactions between mediators and no time-varying confounding affected by prior exposure, which are not necessarily tenable. Ultimately, our research question was about estimating whether there is a total effect, not the many possible pathways this might take.
It was also not possible to elucidate cohort effects because both time and age were included in the model. (Bell & Jones, 2013) In addition to study designs that address the limitations above, future work could broaden the outcomes considered: our study, like the majority of studies using GMMs, focused on episodic memory, fluency, or composite cognitive scores. Future research may wish to extend the use of these methods to explore the association of education, or other sources of cognitive reserve, with different cognitive or functional domains. In particular, further work may wish to test whether education protects against a decline in social and occupational function, either independently of or partially mediated by cognitive scores. (Jokinen et al., 2016)

Conclusions
We identified two latent classes of verbal fluency and immediate recall in a representative sample of the English older adult population. One class showed minimal decline and the other showed rapid decline, which is likely to represent a population with or at risk of preclinical dementia. We extended previous analyses by relaxing the assumption of population homogeneity, allowing the association between education and cognition to be moderated by latent class, and explicitly modeling data that were not missing at random (NMAR). There was no evidence that educational attainment was associated with the rate of cognitive decline in the rapidly declining groups for either outcome. In older adults with gradual decline, there was evidence of a small association with reduced rates of decline in verbal fluency for the highest levels of education, but no association was seen for recall.