Psychometric properties and confirmatory factor analysis of the CASP-19, a measure of quality of life in early old age: the HAPIEE study

Objectives: The aim was to assess the reliability and validity of the quality of life (QoL) instrument CASP-19, and of three shorter 12-item versions (CASP-12), in a large population sample of older adults from the HAPIEE (Health, Alcohol, and Psychosocial factors In Eastern Europe) study. Methods: From the Czech Republic, Russia, and Poland, 13,210 HAPIEE participants aged 50 or older completed the retirement questionnaire including CASP-19 at baseline. Three shorter 12-item versions were also derived from the original 19-item instrument. Psychometric validation used confirmatory factor analysis, Cronbach's alpha, Pearson's correlation, and construct validity. Results: The second-order four-factor model of CASP-19 did not provide a good fit to the data. The two-factor CASP-12v.3, including residual covariances for negative items to account for their method effect, had the best fit to the data in all countries (CFI = 0.98, TLI = 0.97, RMSEA = 0.05, and WRMR = 1.65 in the Czech Republic; 0.96, 0.94, 0.07, and 2.70 in Poland; and 0.93, 0.90, 0.08, and 3.04 in Russia). Goodness-of-fit indices for the two-factor structure were substantially better than those for the second-order models. Conclusions: This large population-based study is the first validation study of the CASP scale in Central and Eastern Europe (CEE), and includes general population samples in Russia, Poland, and the Czech Republic. The results of this study demonstrate that the CASP-12v.3 is a valid and reliable tool for assessing QoL among adults aged 50 years or older. This version of CASP is recommended for use in future studies investigating QoL in CEE populations.


Background
With declining mortality rates and increased longevity, there has been a substantial increase in the proportion of adults reaching older age. In Europe, the number of older people aged 65 and over has risen significantly over the past two decades. The percentage of Europeans aged 65 or older is projected to rise from 17.4% at present to 27% in 2058 (Eurostat, 2012). With such an increase, governments worldwide are concerned with how to promote healthy ageing and assist older people to maintain their independence and active participation in society, in effect, to enhance quality of life (QoL) at older ages.
There is no general consensus regarding how QoL should be defined or measured. However, most researchers consider QoL to be a multi-dimensional concept that encompasses life satisfaction and covers physical, emotional, and mental health, as well as social and behavioural components of well-being (Bergner, 1989; Janse et al., 2004). The WHO Quality of Life Group defines QoL as: 'An individuals' perception of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards, and concerns' (Kuyken et al., 1995). QoL can be assessed by both objective and subjective measures: an individual's financial situation and general expectations from life as well as other factors such as education, housing, social support, and health. Within the gerontological literature, ageing has traditionally been conceived in terms of physiological, mental, and social decline and deterioration in health. As such, QoL at older ages has been predominantly measured by health-related quality of life measures, for example, the Short Form-36 (SF-36), to assess the effects of poor health on mental and physical functioning (Lima et al., 2011; Walters, Munro, & Brazier, 2001; Ware & Sherbourne, 1992). As older people live longer and healthier lives, it is no longer appropriate to reduce QoL in older people to their experience of physical and mental health alone, and there is a need for a measure of QoL that explores positive experiences of ageing.
In the present study, we focus on the 19-item Control, Autonomy, Self-realisation, Pleasure scale. CASP-19 is a theoretically based measure of broader QoL, which was specially designed to measure QoL in the 'Third Age'. Third Age has been characterised as a period of life after retirement, in which one is free from many social roles, and able to explore areas of personal fulfilment. CASP-19 is underpinned by social theories: specifically, Maslow, Doyal, and Gough's theory on human need (Doyal & Gough, 1991; Maslow, 1968); Laslett's theory of the Third Age (Laslett, 1989); Giddens on reflexivity (Giddens, 1991); and Gilleard and Higgs on cultures of ageing (Gilleard & Higgs, 2000). QoL is defined in terms of satisfaction of human needs in four life domains, namely control, autonomy, self-realisation, and pleasure. CASP-19 aims to capture positive aspects of life at older ages, whilst being independent of factors, including financial circumstance and health, that may influence it. The control domain represents people's ability to actively control their environments, whilst autonomy is defined as the freedom from unwanted interference of others. Self-realisation and pleasure domains capture the active and self-reflexive aspects of living that bring reward and happiness to people in later life.
There have been a number of studies which have validated the factor structure of CASP-19 (Bowling & Stenner, 2011; Sexton, King-Kallimanis, Conroy, & Hickey, 2013; Sim, Bartlam, & Bernard, 2011; Vanhoutte, 2012; Vanhoutte, 2014; Wiggins et al., 2008; Wu et al., 2013). Most of the studies on the psychometric properties of CASP-19 have been conducted in West European countries, primarily in the United Kingdom (Sim et al., 2011; Wiggins et al., 2008) and Ireland, but also in Taiwan (Wu et al., 2013). In earlier studies of CASP, a four-factor solution was suggested for CASP-19 and a two- or three-factor structure for the 12-item CASP scales. In a UK study, Wiggins et al. (2008) showed that a four-factor measurement model of CASP-19 had a good fit to the data in the BHPS and ELSA samples, using the confirmatory factor analysis (CFA) approach. Also, the shortened three-factor CASP-12v.2 proved slightly superior to the original 19-item scale. These findings were confirmed by Sim et al. (2011), who assessed the psychometric properties of CASP-19 and CASP-12v.2 in a sample of 120 British adults living in a retirement community. More recently, Sexton et al. (2013) undertook a detailed psychometric assessment of CASP-19 using The Irish Longitudinal Study of Ageing (TILDA) (Table 1). Their findings did not support the validity of the established measurement models. The control and autonomy, self-realisation, and pleasure factors were not sufficiently distinctive either empirically or conceptually. Instead, they recommended the use of a revised 12-item scale with either a single-factor or two-factor model (CASP-12v.3) when assessing overall QoL. Also, studies have found a method effect in the CASP scale, and allowing error correlations between negatively worded items led to a significant improvement in model fit (Vanhoutte, 2014).
To date, the psychometric properties of the proposed single- or two-factor CASP-12v.3 have not been further investigated in other studies. The aim of this study is to establish the reliability and validity of the original 19-item and 12-item CASP scales in a sample of older adults living in Central and Eastern Europe (CEE). In addition, the factor structure of the newly suggested CASP-12v.3 instrument will be further investigated using CFA.
In the past 20 years, the countries of CEE have experienced a remarkable social and economic transition from closed, totalitarian, and centrally planned economies towards open, democratic, and market-based economies. This transition has had a devastating impact on health. In the early 1990s, many countries in CEE experienced a dramatic increase in mortality rates. The largest increase was concentrated in the Former Soviet Union (FSU). In Russia, for example, life expectancy at birth decreased from 63.8 to 57.7 years for men and from 74.4 to 71.2 years for women between 1990 and 1994 (Notzon et al., 1998). This natural experiment offers a unique opportunity to study health outcomes and well-being in these countries as they undergo significant socio-economic restructuring. Validation of the CASP instrument in CEE will be useful for determining its potential for use in future research and for comparing the QoL of older adults with other international studies which have incorporated the CASP.

Study population
The study subjects come from wave 1 of the HAPIEE (Health, Alcohol, and Psychosocial factors In Eastern Europe) project. The sampling procedure and design of HAPIEE are described in detail elsewhere (Peasey et al., 2006), and are briefly summarised below. HAPIEE comprises random population samples of men and women aged 45–70 years in Novosibirsk (Russia), Krakow (Poland), and seven Czech towns: Jihlava, Havirov, Hradec Kralove, Karvina, Kromeriz, Liberec, and Usti nad Labem. The data for wave 1 of the study were collected in 2002–2005. At baseline, a total of 28,945 individuals were recruited (response rate 59%). In this study, we analysed data from participants who were administered the retirement questionnaire including the QoL (CASP-19) questions. Participants also completed an extensive questionnaire on their medical history, health status, lifestyle, diet, and socio-economic and psychosocial factors, and underwent a short clinical examination for measurement of anthropometric parameters. All questions were translated from English into each language and back translated into English to check for accuracy.

Quality of life
QoL was assessed in all participants who were retired using CASP-19. CASP-19 is a self-completed 19-item questionnaire originally developed and validated in a representative sample of 263 adults aged 65–75 years from the UK Boyd Orr Study (Gunnell, Frankel, Nanchahal, Braddon, & Smith, 1996). Each scale item was rated on a 4-point Likert scale, with responses of 0 (never), 1 (not very often), 2 (sometimes), and 3 (often). The scale includes both positively and negatively worded items (see Appendix 1). All negatively worded questions were reverse coded so that all item responses are in the same direction. The total scores ranged from 0 to 57 for CASP-19. Higher scores indicate better QoL. The original CASP scale is composed of 19 items, but revised forms of 12 items (Borsch-Supan et al., 2005; Sexton et al., 2013; Wiggins et al., 2008) have been proposed for use. In the original study that tested the qualities of the CASP scale, a four-dimensional structure was proposed for the 19-item scale, and single- or two-factor and three-factor structures have been proposed for the different 12-item versions of CASP (Borsch-Supan et al., 2013; Wiggins et al., 2008). CASP-12v.1 (Borsch-Supan et al., 2013; Borsch-Supan et al., 2005) and CASP-12v.2 (Wiggins et al., 2008) were derived from CASP-19 by removing items which correlated most weakly with other items in their dimensions. CASP-12v.1 consists of 12 items: C1, C2, C4, A5, A6, A9, P10, P11, P14, S15, S18, and S19.
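The scoring rule described above (0 to 3 responses, reverse coding of negatively worded items, and a summed total) can be sketched as follows. This is a minimal illustration and not the authors' code: the full list of negatively worded items is given in Appendix 1 and is not reproduced here, so the `negative_items` argument, the item labels in the example, and the function names are all assumptions.

```python
MAX_RESPONSE = 3  # responses run from 0 (never) to 3 (often)


def reverse_code(response):
    """Flip a 0-3 response so that a higher value always means better QoL."""
    if response not in range(MAX_RESPONSE + 1):
        raise ValueError(f"response must be 0..{MAX_RESPONSE}, got {response}")
    return MAX_RESPONSE - response


def casp_total(responses, negative_items):
    """Sum item scores after reverse coding the negatively worded items.

    responses: dict mapping an item label to a 0-3 response (complete case).
    negative_items: set of labels to reverse code (hypothetical here;
    the real list is in the questionnaire appendix).
    """
    total = 0
    for item, value in responses.items():
        if value not in range(MAX_RESPONSE + 1):
            raise ValueError(f"{item}: response must be 0..{MAX_RESPONSE}")
        total += reverse_code(value) if item in negative_items else value
    return total


# Hypothetical 19-item respondent answering "often" (3) to every item.
example = {f"item{k}": 3 for k in range(1, 20)}
best_possible = casp_total(example, negative_items=set())
with_two_negative = casp_total(example, negative_items={"item1", "item9"})
```

With no reverse-coded items the example respondent reaches the maximum of 19 × 3 = 57, which matches the 0 to 57 range stated for CASP-19; marking two items as negatively worded lowers the total by 6.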

Socio-demographic variables
In addition to CASP items, the following variables were used for the purpose of obtaining the descriptive statistics. Marital status was categorised into four groups (married or cohabiting, single (never married), divorced, and widowed). Educational level was categorised into four groups (primary or less, vocational, secondary, and university). Health status was divided into three groups (very good and good, average, and poor and very poor). The level of material deprivation was assessed by three questions about the frequency of difficulties in (1) paying bills, (2) buying food, and (3) buying clothes necessary for themselves and/or their family. The answers were recorded on a 5-point scale (coded from 0 to 4); the total deprivation score was calculated as the sum of the three questions, and categorised into three groups: low (0), medium (1–6), and high (7–12). Self-reported economic activity was classified into the following categories: working pensioners and non-working pensioners. Participants in Poland and the Czech Republic completed the questionnaire during a nurse visit to their home (85% of them subsequently attended an examination in a clinic); all Russian participants completed the questionnaire during a visit to a clinic.
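The material deprivation score described above is a simple sum followed by a three-way cut. A minimal sketch, with hypothetical function and argument names, might look like this:

```python
def deprivation_group(paying_bills, buying_food, buying_clothes):
    """Return (score, group) for three answers each coded 0-4.

    Groups follow the cut-offs in the text: low (0), medium (1-6),
    high (7-12). Names are illustrative, not taken from the study.
    """
    answers = (paying_bills, buying_food, buying_clothes)
    if any(a not in range(5) for a in answers):
        raise ValueError("each answer must be an integer from 0 to 4")
    score = sum(answers)
    if score == 0:
        group = "low"
    elif score <= 6:
        group = "medium"
    else:
        group = "high"
    return score, group
```

For example, answers of 3, 2, and 2 give a score of 7, which falls in the high deprivation group.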

Statistical analyses
Sample description and initial steps of the analysis
Data were analysed in STATA version 12 for descriptive analyses and Mplus version 6.11 for CFAs (Muthén & Muthén, 1998–2010). The analytical strategy was as follows. First, descriptive statistics, chi-squared tests, and analysis of variance (ANOVA) were employed to describe the data. Second, frequency distributions were examined to evaluate the normality of scale items. Missing data and floor and ceiling effects (percentages of participants indicating minimum and maximum scores) of the CASP-19 were investigated in order to verify the validity and reliability of scale content (Ware & Gandek, 1998). Such effects were considered to be present if more than 15% of the sample reported the lowest or highest score (McHorney & Tarlov, 1995; Terwee et al., 2007). If floor or ceiling effects are present, it is likely that extreme items are missing in the lower or upper end of the scale. In such cases, participants with the lowest or highest possible scores cannot be distinguished from each other, and the reliability of the questionnaire is reduced. Missing item responses up to 10% have been considered acceptable (af Sandeberg, Johansson, Hagell, & Wettergren, 2010).
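The floor and ceiling check and the missing-data check described above can be illustrated with a short sketch. The 15% and 10% thresholds come from the text; the function name, the use of `None` for a missing score, and the example data are assumptions made for illustration only.

```python
def floor_ceiling(scores, min_score, max_score,
                  effect_threshold=0.15, missing_threshold=0.10):
    """Summarise floor/ceiling effects and missingness for one scale.

    scores: list of numeric scores, with None marking a missing value.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    valid = [s for s in scores if s is not None]
    if not valid:
        raise ValueError("no valid scores")
    floor_share = sum(s == min_score for s in valid) / len(valid)
    ceiling_share = sum(s == max_score for s in valid) / len(valid)
    missing_share = (len(scores) - len(valid)) / len(scores)
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "missing_share": missing_share,
        # An effect is flagged when MORE than 15% sit at an extreme.
        "floor_effect": floor_share > effect_threshold,
        "ceiling_effect": ceiling_share > effect_threshold,
        # Missingness of up to 10% is treated as acceptable.
        "missing_acceptable": missing_share <= missing_threshold,
    }


# Hypothetical sub-scale scored 0-12: seven respondents at the maximum,
# three mid-range, none missing.
summary = floor_ceiling([12] * 7 + [5] * 3, min_score=0, max_score=12)
```

In this made-up example 70% of respondents sit at the maximum, so a ceiling effect is flagged, while no floor effect is present.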
Third, internal consistency reliability of CASP-19 was determined using Cronbach's alpha (α). It evaluates the extent to which items within a scale are inter-correlated with one another and measure the same concept. Cronbach's alpha typically ranges from 0 to 1. Internal consistency reliability is suggested to be acceptable when Cronbach's alpha is ≥0.70 (DeVellis, 1991). Item-total correlations were calculated to examine the dimensionality of the scale items. Items within each dimension should represent the same latent variable and correlate more strongly with their own domain than with others. This is considered satisfactory if item-total correlations are ≥0.40 (Ware & Gandek, 1998). Construct validity was further examined by analysing the correlation between CASP-19 dimensions and other previously validated measures (Cohen, 2005). Spearman's correlation coefficients were used and were interpreted as follows: >0.90: excellent relationship, 0.71–0.90: good, 0.51–0.70: fair, 0.31–0.50: weak, and ≤0.30: none.
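For readers unfamiliar with the statistic, Cronbach's alpha for a k-item scale is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below implements that textbook formula on complete cases using only the Python standard library; it is an illustration with made-up data, not the STATA routine used in the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete-case data.

    rows: list of respondents, each a list of k item scores.
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("need at least two items and equal-length rows")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total score has zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up data: two perfectly consistent items, then two unrelated items.
alpha_consistent = cronbach_alpha([[0, 0], [1, 1], [2, 2], [3, 3]])
alpha_unrelated = cronbach_alpha([[0, 0], [0, 1], [1, 0], [1, 1]])
```

The first toy data set gives an alpha of 1 and the second an alpha of 0, which brackets the 0.70 threshold used in the text.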
Previous QoL studies have sought to find evidence of construct validity by correlating with other established measures such as the SF-36, self-rated health status, and satisfaction with life scales (Bowling, 2009; Sim et al., 2011). Two measures that have been incorporated in the HAPIEE questionnaire were used for this purpose: physical functioning and self-rated health status. Physical functioning was measured by the 10 questions on activities of daily living from the SF-36 questionnaire (McHorney, Ware, & Raczek, 1993; Ware & Sherbourne, 1992).
Respondents were asked to rate their health over the last 12 months (1 = good/very good; 2 = average; 3 = poor/very poor). Higher self-rated health scores indicate poorer health, and a negative correlation with the CASP-19 would be hypothesised. Conversely, physical functioning was rated on a 0–100 scale, with a higher score indicating better physical functioning. CASP-19 should correlate positively with physical functioning. Correlation coefficients between 0.1 and 0.3 are considered low, between 0.3 and 0.5 moderate, and over 0.5 high.
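The interpretation bands in the last sentence above can be written as a small helper. The text does not say which band a coefficient of exactly 0.3 or 0.5 belongs to, and gives no label for values below 0.1, so the boundary handling and the "negligible" label below are assumptions.

```python
def correlation_strength(r):
    """Label the absolute size of a correlation coefficient.

    Bands from the text: 0.1-0.3 low, 0.3-0.5 moderate, over 0.5 high.
    Assumptions: 0.3 counts as moderate, 0.5 counts as moderate,
    and anything below 0.1 is labelled "negligible".
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    size = abs(r)  # the sign carries direction, not strength
    if size > 0.5:
        return "high"
    if size >= 0.3:
        return "moderate"
    if size >= 0.1:
        return "low"
    return "negligible"
```

The sign is ignored when labelling strength, so a hypothesised negative correlation with self-rated health and a positive one with physical functioning are judged on the same scale.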
A single-factor model where all 19 or 12 items load directly onto an unobserved variable called QoL was tested (Figure 1), followed by a first-order model in which the four domains were included (Figure 2). In the second-order measurement model, the CASP domains are allowed to be dependent upon a single underlying factor, QoL (Figures 3 and 4). The second-order model is applicable when (1) the lower order factors are highly correlated with each other, and (2) there is a higher order factor which is hypothesised to account for the relations among the lower order factors. A second-order factor solution with four domains was proposed for the 19-item scale, and a similar factor structure based on three domains was proposed for CASP-12v.1 and CASP-12v.2. In addition, we examined the single- and two-factor structures of CASP-12v.3 as proposed by Sexton et al. (2013) (see Figure 5). The two-factor model is composed of control/autonomy and self-realisation/pleasure factors, and includes residual covariances for negative items to take account of the method effect that arises from the direction of wording in the scale items (Marsh, 1996).
CFA was computed using the weighted least square estimator with a mean- and variance-adjusted chi-squared method to handle ordered categorical items as dependent variables in Mplus. Missing data across CASP-19 were handled using full information maximum likelihood estimation. This method computes parameter estimates on the basis of all available data, including the incomplete cases. The procedure works under the assumption that the data are missing at random.

Assessing the degree of model fit
To evaluate overall model fit, four goodness-of-fit indices were calculated: the comparative fit index (CFI), the Tucker–Lewis index (TLI), the root mean square error of approximation (RMSEA), and the weighted root mean square residual (WRMR). According to Hu and Bentler (1999), a CFI value of greater than 0.90 can be expected for a psychometrically acceptable fit to the data. RMSEA is another quantitative index which describes how well the model fits the observed data.
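As a rough guide to what three of these indices measure, the sketch below computes CFI, TLI, and RMSEA from the chi-squared statistics of the fitted model and of the baseline (independence) model, using the common textbook formulas. It is an assumption-laden illustration: software packages differ in small details (for example, N versus N − 1 in the RMSEA denominator), Mplus derives these indices from its adjusted chi-squared under the weighted least squares estimator, and all input values below are invented. WRMR is omitted because it cannot be obtained from chi-squared values alone.

```python
from math import sqrt


def fit_indices(chi2_model, df_model, chi2_base, df_base, n):
    """Textbook CFI, TLI, and RMSEA from chi-squared statistics.

    chi2_base/df_base refer to the baseline (independence) model;
    n is the sample size. Uses the N - 1 convention for RMSEA.
    """
    if df_model <= 0 or df_base <= 0 or n < 2:
        raise ValueError("degrees of freedom must be positive and n >= 2")
    # Non-centrality estimates, truncated at zero.
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    largest = max(d_model, d_base)
    cfi = 1.0 if largest == 0 else 1.0 - d_model / largest
    ratio_model = chi2_model / df_model
    ratio_base = chi2_base / df_base
    if ratio_base == 1.0:
        raise ValueError("TLI is undefined when chi2_base equals df_base")
    tli = (ratio_base - ratio_model) / (ratio_base - 1.0)
    rmsea = sqrt(d_model / (df_model * (n - 1)))
    return {"CFI": cfi, "TLI": tli, "RMSEA": rmsea}


# Invented values for illustration only (not taken from the study).
indices = fit_indices(chi2_model=153.0, df_model=53,
                      chi2_base=2066.0, df_base=66, n=1001)
```

With these invented inputs the CFI is 0.95, which would clear the 0.90 threshold cited above.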

Ethical approval and consent
The study received ethical approval from the local ethical committees in each participating country and at University College London, and all participants gave written consent.

Sample characteristics
Of all the 14,059 retirees (approximately 50%; 5906 males and 8153 females), we restricted inclusion in the study to those aged 50–70 years who answered at least one of the CASP-19 items. There were 449 retirees who did not give any responses to the CASP-19 questionnaire and were, therefore, excluded from the analyses. Also, there were participants younger than 50 who answered the module for retired people. These respondents are most likely to have retired for health reasons. There were also a few respondents aged 70 or older, but their number was low and they would not represent this age group well. Consequently, 400 subjects who were outside the 50–70 years range were excluded. Thus, the analytical sample consisted of 13,210 individuals (Czech Republic: n = 3782; Russia: n = 3802; Poland: n = 5626).
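The sample flow reported above can be checked with simple arithmetic. The figures below are taken directly from the paragraph; only the variable names are introduced here.

```python
# Counts as reported in the text.
retired_men, retired_women = 5906, 8153
no_casp_responses = 449      # gave no answer to any CASP-19 item
outside_age_range = 400      # not aged 50-70 years
by_country = {"Czech Republic": 3782, "Russia": 3802, "Poland": 5626}

all_retirees = retired_men + retired_women
analytical_sample = all_retirees - no_casp_responses - outside_age_range

# The exclusions and the country breakdown should agree.
consistent = analytical_sample == sum(by_country.values())
```

Both routes arrive at 13,210 participants, so the exclusions and the per-country counts are mutually consistent.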
The baseline descriptive characteristics of the 13,210 individuals with valid data are shown in Table 2. The mean age of participants in all countries ranged between 62 and 64 years for men and 61 and 63 years for women. A large majority of participants were married and had completed vocational or secondary level of education.
There were differences in socio-demographic characteristics between the samples. Notably, Russians reported higher levels of poor/very poor health than Czechs (women 35.0% vs. 14.2%), while also presenting lower rates of very good/good health (women 3.0% vs. 30.3%; p < 0.001). Also, the Russians reported higher levels of self-reported material deprivation. The proportion of participants with university education was higher in Russia, and in all countries it was higher in men than in women. Compared to Czech and Polish women, there was a larger proportion of widows and a lower proportion of married or cohabiting women in Russia. Four times as many Russian women (29.3%) as men (7.3%) were widowed. With regard to economic status, the Russian sample consisted only of non-working pensioners, while the proportion of Czechs and Poles still working was 7.7% and 13.5%, respectively.
There were significant differences in mean values of CASP between countries (p < 0.001). Men scored significantly higher on the CASP-19 than women in all countries. Polish men and women reported the highest CASP-19 scores (mean score: 38.0 (95% CI = 37.7, 38.4) for men and 36.8 (95% CI = 36.5, 37.1) for women). The lowest CASP-19 scores were reported by Russian men (mean score: 34.5 (95% CI = 32.8, 33.4)). The distribution of CASP-19 was skewed in all countries (skewness: −0.96, −0.09, and −0.28 for the Czech Republic, Russia, and Poland, respectively). In the Czech sample, the median CASP-19 score was 36. Median scores for each CASP-19 sub-scale were as follows: control = 6 (inter-quartile range [IQR] = 4–8), autonomy = 9 (IQR = 7–11), self-realisation = 8
The distributions of responses to each of the items in the CASP are shown in Appendix 2. Most of the participants completed all 19 items (n = 12,692; 93.3%). Missing data were relatively small, with between 0.5% and 6.7% of respondents not providing a response to an item. A marked ceiling effect was found in the pleasure domain, with the highest ceiling effect of 67.3% (Czech Republic), 70.4% (Russia), and 76.2% (Poland).

CASP reliability
Table 3 shows the Cronbach's alpha coefficients for the four CASP scales. The CASP-19 scale presented acceptable to good internal consistency coefficients. Cronbach's alpha of the CASP-19 total score was 0.84 (Czech Republic), 0.83 (Russia), and 0.86 (Poland). Nearly all CASP subscales had high internal consistency. The self-realisation domain had respectable reliability, with coefficient alpha ranging from 0.73 to 0.75. The pleasure subscale was found to be highly reliable (α = 0.78, α = 0.74, and α = 0.75 for the Czech Republic, Russia, and Poland, respectively). However, autonomy domains had particularly low reliability coefficients, which suggests unacceptable reliability (DeVellis, 1991).
When the control and autonomy domains were combined together to form the 12-item scale, alpha

Correlations between CASP-19 and physical functioning, self-rated health, CESD-20: evidence for construct validity
The associations of CASP-19 dimensions with physical functioning scales, self-rated health, and CESD-20 are shown in Table 4. Physical functioning (SF-10) and self-rated health scores were moderately correlated with total CASP-19 score in each country. These findings indicate that as the level of physical functioning increases, QoL increases. Conversely, the level of QoL decreases with increasing levels of depressive symptoms and poor self-rated health. All correlations were significant at p < 0.001.

Confirmatory factor analysis (CFA)
Table 5 presents the goodness-of-fit indices for the three measurement models in each country.
The four-factor solutions for CASP-19 had relatively poor model fit, as illustrated by the goodness-of-fit indices. RMSEA values were all above or equal to 0.10; CFI and TLI values were below 0.90. Although the three-factor second-order model suggested the best fit of all the three models, . Goodness-of-fit indices for the two-factor structure were substantially better than the second-order models (Table 6). Similarly, the single-factor measurement model provided a good fit to the data, suggesting that either single-factor or two-factor models fit the data equally well.

Table 4. Correlation coefficients of the dimensions of CASP-19 with physical functioning (SF-10), self-rated health, and CESD-20 depression scale (columns: SF-10, self-rated health, and CESD-20 for each of the Czech Republic, Russia, and Poland).

For two-factor measurement models of CASP-12v.3, all item-factor loadings were significant (p < 0.001). Items on the self-realisation/pleasure factor exhibited strong factor loadings (>0.40) for all the three samples. Four items were below the 0.4 level in the Czech sample (items C1, C2, A8, and A9), whereas two items did not reach the recommended 0.4 threshold in the Russian and Polish samples (items C4 and A9 in Russia, and items C1 and A9 in Poland) (see Figures 5–7). Item C1 ('my age prevents me from doing the things I would like to do') and item A9 ('shortage of money stops me from doing the things I want to do') exhibited lower factor loadings than other items among all samples (Czech Republic: 0.28, 0.31; Russia: 0.41, 0.34; Poland: 0.37, 0.38 for C1 and A9, respectively). Moreover, the correlation between the control/autonomy and self-realisation/pleasure factors was significant and very high (Czech Republic: r = 0.89; Russia: r = 0.74; Poland: r = 0.85). This indicates that there may be only one factor underlying the 12 items of CASP.
Consistent with the existing literature, there was little evidence of good fit for the second-order model using CASP-19; RMSEA values were all above or equal to 0.10, and CFI and TLI values were below 0.90, which indicated unsatisfactory model fit. Our results of CFA suggest that the 'second-order model' has adequate fit to the data for the Czech and Polish samples. CFI and TLI values are greater than 0.9, which is above Hu & Bentler's (1999) cut-off criteria for fit indices. For Russia, the 'second-order' model had a marginal model fit to the data. These results suggested that the CASP scales could be revised further to achieve better model fit.
It is difficult to compare our results to other CEE/FSU data, due to the lack of similar local studies. However, our results of CFA are in agreement with the evidence from UK studies. For CASP-12v.2, the goodness-of-fit indices of the latter two models are of a similar magnitude to those found by Wiggins et al. (2008) (BHPS wave 11: CFI = 0.91; TLI = 0.96; RMSEA = 0.07). Also, our CFI and TLI values for CASP-12v.1 are comparable to Vanhoutte's work on CASP using ELSA wave 1 participants (CFI = 0.94, TLI = 0.93, RMSEA = 0.09) (Vanhoutte, 2012). With regard to CASP-12v.3, our findings are in accordance with the study by Sexton et al. (2013) (two-factor model: CFI = 0.99, TLI = 0.99, RMSEA = 0.03, WRMR = 1.76).
The Russian data were somewhat less well fit by the proposed measurement models than Czech and Polish data. This discrepancy in results across HAPIEE populations may be attributed to issues surrounding translation artefact, cultural relevance of certain CASP items, and variation in the interpretation of items across respondents of different cultures (Ramirez, Ford, Stewart, & Teresi, 2005). Certain CASP questions may have slightly different connotations in one language than another.
Although the countries of CEE/FSU share some socio-economic and political characteristics, the analysed group of countries is still somewhat heterogeneous in terms of geography, natural resources, democratic structure, and developmental trajectories. Historically, governments in these countries followed different overall socio-economic transformation policies after the collapse of communism in 1989: shock therapy in Russia and a more social–liberal approach in the Czech Republic and Poland. There is also divergence in a range of health indicators, such as life expectancy or cardiovascular disease (CVD) trends, socio-economic trajectories, and alcohol consumption patterns in the region. For example, in 2011, the life expectancies at age 45 years in Russia, Poland, the Czech Republic, and the European Union were 28.6, 33.1, 34.0, and 36.4, respectively (WHO, 2011). In general, CEE countries have better health outcomes than FSU countries. Due to this heterogeneity, the operationalisation of CASP and some items are likely to have different cultural meaning or value for those from CEE and FSU.
The study has a number of limitations. First, the CASP-19 is a self-completed questionnaire. A methodological problem commonly associated with the use of self-report measures, which may have been present in our study, is the inability to determine the extent to which responses accurately reflect the respondents' experiences due to inaccurate recall; respondents may, for various reasons, under- or overestimate their QoL. Second, since the Russian data only comprised non-working subjects, working pensioners in Russia are excluded from the analysis. Consequently, respondents included in the study may not be representative of the whole population and the generalisation of our results may be limited. Thus, future studies with a more heterogeneous group of participants are needed to examine the psychometric properties in more detail. Finally, the data used in this study were collected in 2002–2005, and the results reflect conditions in these countries at the time of data collection, which might be a little different from the conditions in these countries now. We believe, however, that the 15 years between the start of political and social changes and data collection were long enough to make these societies more stable, and that the results are still applicable to current societies in the region.

Conclusion
Despite the above-mentioned limitations, this is one of the first, and the largest, studies so far on the levels and psychometric properties of CASP in CEE. In conclusion, CASP-12v.3 is a valid and reliable tool for assessing QoL among adults aged 50 years or older. This version of CASP is recommended for use in future studies investigating QoL in CEE populations.
