The Assessment of Burden of COPD (ABC) Scale: A Reliable and Valid Questionnaire.

Abstract
The newly developed Assessment of Burden of COPD (ABC) scale is a 14-item self-administered questionnaire that measures the physical, psychological, emotional and/or social burden as experienced by patients with chronic obstructive pulmonary disease (COPD). The ABC scale is part of the ABC tool, which visualises the outcomes of the questionnaire. The aim of this study was to assess the reliability and construct validity of the ABC scale. This multi-centre survey study was conducted in the practices of 19 general practitioners and 9 pulmonologists throughout the Netherlands. In addition to the ABC scale, patients with COPD completed the Saint George's Respiratory Questionnaire (SGRQ). Reliability analyses were performed with data from 162 cases. Cronbach's alpha was 0.91 for the total scale. Test-retest reliability, measured at a two-week interval (n = 137), had an intra-class correlation coefficient of 0.92. Analyses for convergent validity were performed with data from 133 cases. Discriminant and known-groups validity were analysed with data from 162 cases. The ABC scale total score had a strong correlation with the total score of the SGRQ (r = 0.72, p < 0.001) but a weak correlation with the forced expiratory volume in 1 second as a percentage of predicted (r = -0.28, p < 0.001). Subgroups with more severe disease, defined by GOLD stage, frequency of exacerbations, activity level and depression, scored statistically significantly (p < 0.05) worse on almost all domains of the ABC scale than the less severe subgroups. The ABC scale seems a valid and reliable tool with good discriminative properties.


Introduction
Chronic obstructive pulmonary disease (COPD) is a common preventable and treatable lung disease, characterised by airflow limitation that is not completely reversible (1,2). COPD is a major health concern worldwide and is projected to become the fourth leading cause of death and the seventh leading cause of disability-adjusted life years (DALYs) worldwide by 2030, up from ranks 5 and 11 in 2002, respectively (3).
The Dutch care guidelines for the management of COPD have embraced the concept 'burden of disease' to stimulate a holistic assessment of the burden of COPD as experienced by the patient, followed by more personalized care management (4). A Dutch national expert research team, commissioned by the Dutch Lung Alliance, has defined burden of disease as the physical, psychological, emotional and/or social burden as experienced by the patient (5), and this definition encompasses considerably more than airway obstruction alone. To measure the burden of COPD this definition needed to be operationalised into a measurement scale. A questionnaire was deemed to be the most suitable method for this purpose.
The Dutch national expert research team also formulated a number of conditions this questionnaire had to meet to operationalise the burden of disease concept (e.g., it should provide insight into the (pathophysiologic) impairments, disabilities and complaints due to COPD, it should be based on input by patients and it should be easily manageable). A systematic review of the literature was performed for existing COPD quality of life/health status questionnaires, but none of these encompassed all of the aspects of the definition of the burden of COPD and fulfilled all the conditions formulated by the Dutch national expert research team (5). However, the Clinical COPD Questionnaire (CCQ), a 10-item self-administered questionnaire, encompassed most of these aspects and conditions, and was adapted into the new Assessment of Burden of COPD (ABC) scale. This new questionnaire was created by combining the 10 items of the CCQ with the 3 items of the Distress-screener (a short screening tool for early identification of distress) (6), and by adding an item to measure fatigue (5). This resulted in a 14-item scale comprising 5 domains (Appendix A). Each item is answered on a Likert scale ranging from 0 (asymptomatic/no limitation) to 6 (extremely symptomatic/total limitation). The sub-score of each of the domains and the total score can be calculated by adding the values of the answers and dividing the sum by the number of items. A low score on the ABC scale indicates a low experienced burden of COPD.
The ABC scale is the core part of the Assessment of Burden of COPD (ABC) tool. This tool visualises the questionnaire scores and adds more objective indicators of the burden of COPD (e.g., smoking status, lung function, exacerbation history) to provide an overview of the individual patient's integrated health status. The questionnaire scores and objective indicators are visualised using balloons. When a patient is doing well on a certain domain, the balloon is shown at the top end in green. When a patient is not doing well, the balloon is shown at the bottom end in red. The scores in between are visualised with orange balloons in different shades. For example, when a patient scores high on the emotion domain, the balloon is shown at the bottom end in red (see Figure 1).
The tool has been developed to guide patients with COPD and healthcare providers in making a personal treatment plan. It facilitates shared decision making between patients and healthcare providers and the formulation of a personal goal. The tool is also intended to be used to monitor a patient's experienced burden of disease, since the scores of the previous visit can also be displayed (in grey balloons; see Figure 1, showing a grey balloon at the symptoms domain). This gives the healthcare provider and the patient the possibility to discuss improvement or deterioration.
Because the scores on the questionnaire determine the position of the balloons, and because four questions were added to the original CCQ, the reliability and validity of the new ABC scale needed to be assessed. The researchers aimed to evaluate the ABC scale by assessing the internal consistency of the total ABC scale and its individual domains, the test-retest reliability of the ABC scale and the construct validity of the ABC scale.

Study design
This research was part of a large two-armed randomised controlled trial assessing the effectiveness of the Assessment of Burden of Chronic Obstructive Pulmonary Disease (ABC) tool; the study protocol of this cluster randomised trial has been published elsewhere (7). The study was approved by the Medical Ethics committee of Atrium-Orbis-Zuyd hospital, The Netherlands. This research was performed with data collected in the practices of 19 general practitioners and 9 pulmonologists in the Netherlands, who were included in the intervention group of the study.

Setting and subjects
The study population consisted of patients recruited by primary and secondary care providers spread across the Netherlands between 01-02-2013 and 01-10-2013. Inclusion criteria were a spirometrically confirmed diagnosis of COPD (post-bronchodilator forced expiratory volume in 1 second (FEV1) / forced vital capacity (FVC) < 0.7), age > 40 years, and the ability to understand and read Dutch. Exclusion criteria were: exacerbation of COPD less than 6 weeks ago, hard drug addiction, life-threatening co-morbid condition, and pregnancy.
All patients provided written informed consent prior to participation in the study.

Questionnaires
In addition to the ABC scale, all patients completed the Saint George's Respiratory Questionnaire (SGRQ) (8) and the Hospital Anxiety and Depression Scale (HADS) (9).

ABC scale
The ABC scale is a 14-item self-administered questionnaire which has recently been developed to measure the burden of COPD (5). It has a total score and 5 domain scores: symptoms (4 items), functional status (4 items), mental status (2 items), emotions (3 items) and fatigue (1 item). All scores range from 0 to 6 (0 = no burden of disease, 6 = highest possible burden of disease).
The ABC scale was completed on paper in the waiting room of the practices of the health care providers, on two occasions with an interval of 2 weeks, without supervision. Treatment did not change between these two measurements. With regard to handling missing data, the original rules of the CCQ (10), of which the ABC scale is largely composed, were followed and similarly extended to the additional two domains. No missing data were accepted for the mental status and fatigue domains; for the other domains one missing value was tolerated. The score for a single missing item from a subscale was imputed as the mean of the remaining items in that domain for that patient. If the tolerated amount of missing data was exceeded, the domain was judged invalid and the total score could not be calculated.
The total score of the ABC scale was calculated by adding the sum scores of the five domains and dividing them by 5. The domain scores were calculated by adding the items of the domain and dividing them by the number of items of that domain.
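The scoring and missing-data rules described above can be sketched as follows. This is a minimal illustration, not the authors' scoring software: the function names, the dictionary layout and the use of None for an unanswered item are assumptions made for this sketch. The total is taken as the mean of the five domain scores, as described directly above; the Introduction describes it as the item mean, so the exact rule should be checked against the ABC scale documentation.

```python
from statistics import mean

# Number of items per domain, as given in the text. The key names are
# hypothetical labels chosen for this sketch.
DOMAINS = {
    "symptoms": 4,
    "functional_status": 4,
    "mental_status": 2,
    "emotions": 3,
    "fatigue": 1,
}
# Domains in which no missing item is accepted; the others tolerate one.
NO_MISSING_ALLOWED = {"mental_status", "fatigue"}


def score_domain(name, items):
    """Return the domain score (0-6), or None if the domain is invalid."""
    if len(items) != DOMAINS[name]:
        raise ValueError(f"{name} expects {DOMAINS[name]} items")
    answered = [v for v in items if v is not None]
    missing = len(items) - len(answered)
    tolerated = 0 if name in NO_MISSING_ALLOWED else 1
    if missing > tolerated:
        return None
    # Imputing one missing item with the mean of the remaining items gives
    # a domain mean equal to the mean of the answered items.
    return mean(answered)


def score_abc(responses):
    """Score all domains; the total is None if any domain is invalid."""
    scores = {name: score_domain(name, responses[name]) for name in DOMAINS}
    if any(s is None for s in scores.values()):
        scores["total"] = None
    else:
        # Assumption: total = mean of the five domain scores.
        scores["total"] = mean(scores[name] for name in DOMAINS)
    return scores
```

For example, a patient who skips one symptoms item still receives a symptoms score and a total score, whereas a skipped mental status item invalidates that domain and therefore the total.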

SGRQ
The SGRQ (8), a 50-item disease-specific health status questionnaire, consists of three different subscales: symptoms (8 items), activity (16 items) and impact (26 items), and includes specific item-weights which can be summated and divided by the maximum score to obtain a total score. Each of the sub-scores and the total score range from 0 to 100 (0 = no impairment). Missing data were handled as described in the SGRQ manual (11).

HADS
The HADS, a 14-item screening scale for anxiety and depression, comprises two subscales: anxiety (7 items) and depression (7 items). This questionnaire was originally developed for hospital outpatient settings (9), but has also proved valid for use in general practice patients in the Netherlands (12,13). All item scores range from 0 to 3 (0 = no signs of depression/anxiety). The depression subscale score is calculated as the sum of the depression item scores, and a cut-off score of 8 or higher was used to discriminate between patients with depression or borderline depression and patients with no depression (13). Missing data were handled as described in the HADS manual (14). The SGRQ and HADS were completed at home, on paper or online (as chosen by the patient).

Other measurements
Lung function parameters (FEV1 (in mL and as a percentage (%) of predicted) and FVC) were measured by the health care providers using a spirometer according to the appropriate clinical guidelines (1). Data concerning the number of COPD exacerbations in the past year were recorded by the health care providers using the electronic registration program of this study (7).

Reliability
Reliability can be divided into internal consistency and test-retest reliability (15). To determine internal consistency, Cronbach's alpha (16) was calculated for the total ABC scale and for each of the five subscales. Since Cronbach's alpha increases with the number of items of a scale, α ≥ 0.90 for the 14-item ABC scale was deemed acceptable, as compared to α ≥ 0.70 for each of the subscales (17).
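As an illustration of the internal consistency measure, Cronbach's alpha for k items can be written as α = k/(k − 1) × (1 − Σ item variances / variance of the sum score). The sketch below computes it with the Python standard library; the function name and the row-per-respondent data layout are assumptions for this example, and it is not the SPSS procedure used in the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete cases.

    rows: one sequence of item scores per respondent, all of equal length.
    """
    if len(rows) < 2:
        raise ValueError("at least two respondents are required")
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != k for row in rows):
        raise ValueError("all respondents must have the same number of items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the sum score across respondents.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Applied to the 14 ABC items of the complete cases this would give the alpha of the total scale; applied to the items of a single domain it would give the alpha of that subscale. The fatigue domain has one item, so alpha is not defined for it in this formulation.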
The 2-week test-retest reliability was expressed in terms of the intraclass correlation coefficient (ICC). The first measurement was 2 weeks before baseline (T0). No changes in therapy were made between these two moments. A good reliability (ICC ≥ 0.9) (18,19) between both assessments (t = 0 and t = −2 weeks) was hypothesized for the total score of the ABC scale.
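The text does not state which ICC model was used. As a sketch under that caveat, the code below assumes a two-way random-effects, absolute-agreement, single-measurement ICC (often written ICC(2,1)), computed from the two-way ANOVA mean squares. The function name and input layout are hypothetical.

```python
def icc_absolute_agreement(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: one sequence per patient holding the repeated measurements,
    e.g. (total score at t = -2 weeks, total score at t = 0).
    """
    n = len(scores)          # number of patients
    k = len(scores[0])       # number of measurement occasions
    if n < 2 or k < 2:
        raise ValueError("need at least two patients and two occasions")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Sums of squares for patients, occasions and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Because this form measures absolute agreement, a systematic shift between the two occasions lowers the coefficient even when the ranking of patients is unchanged; a consistency-type ICC would not penalise such a shift.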

Construct validity
Construct validity is defined as 'the degree to which the scores of an instrument are consistent with hypotheses based on the assumption that the health-related questionnaire validly measures the construct to be measured' (15). The assessment of construct validity was subdivided into convergent validity, discriminant validity and known-groups validity.
For the assessment of convergent validity, it was hypothesized that the ABC scale would show a strong correlation with an established COPD quality of life questionnaire. It would be ideal to compare the ABC scale with a gold standard (20). However, since no gold standard for measuring burden of COPD currently exists, the correlation between the total score of the ABC scale and the total score of the SGRQ (8) was calculated, as is common practice when assessing the validity of a COPD quality of life questionnaire (21-23). As both the ABC scale and the SGRQ can be regarded as health status instruments, it was hypothesized that these would show a strong correlation (r ≥ 0.70) (24) when evaluated by Pearson's correlation coefficient. Since the two questionnaires (SGRQ and ABC scale) were administered at different locations and at different time points, patients were only included in the analyses if the questionnaires were completed within an interval of one month.
For the assessment of discriminant validity, a weak correlation was presumed with lung function (i.e., FEV1 % predicted), similar to findings of earlier studies (25-29). It was hypothesized that Pearson's correlation coefficient would not exceed 0.35 in absolute value (24). In the case of non-normally distributed data, Spearman's rank correlation coefficient was calculated and an equally weak correlation was assumed.
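The convergent and discriminant hypotheses both reduce to computing a correlation coefficient and comparing it with a threshold. The sketch below computes Pearson's r, an approximate 95% confidence interval using the Fisher z-transformation, and a label based on the thresholds stated above (|r| ≥ 0.70 strong, |r| ≤ 0.35 weak). The function names and the "moderate" label for the range in between are assumptions for this example, and the text does not say how its confidence intervals were obtained, so this interval need not match the reported one.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples of at least 2 values")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for r via Fisher's z (needs n > 3, |r| < 1)."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


def strength(r):
    """Label a correlation using the thresholds given in the text."""
    a = abs(r)
    if a >= 0.70:
        return "strong"
    if a <= 0.35:
        return "weak"
    return "moderate"  # hypothetical label for values between the thresholds
```

With these thresholds, a coefficient of 0.72 would be labelled strong and a coefficient of −0.28 weak, because the sign is ignored when judging strength.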
To test known-groups validity, four pairs of groups were created based on (1) the number of exacerbations in the past year (≥ 2 vs < 2 per year), (2) the disease stage, as defined by the GOLD classification (stages 3 and 4 vs stages 1 and 2), (3) the activity level (inactive vs active) and (4) the presence of depression or borderline depression (HADS depression score ≥ 8 vs < 8). It was hypothesized that patients with a history of frequent exacerbations would show a statistically significantly higher (i.e., worse) score on the symptoms domain of the ABC scale, that patients with a higher GOLD classification would show a higher score on the functional status and fatigue domains, that less active patients would score higher on the functional status domain, and that depressed or borderline depressed patients would show a higher score on the mental status and emotions domains than their respective counterparts. Differences between the groups in the known-groups analysis were evaluated with independent-samples t-tests. In the case of non-normally distributed data, the Mann-Whitney U-test was used. A p-value < 0.05 was considered statistically significant. The software used for statistical analysis was IBM SPSS Statistics version 21.0.
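A known-groups comparison for non-normally distributed scores can be sketched with a Mann-Whitney U-test. The code below is a standard-library illustration, not the SPSS procedure used in the study: it uses mid-ranks for ties and a tie-corrected normal approximation without continuity correction, so its p-values are approximate and may differ from exact or software-reported values. The function name is hypothetical.

```python
import math


def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, approximate two-sided p-value)."""
    n1, n2 = len(group_a), len(group_b)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    n = n1 + n2
    # Pool the values, remembering group membership (0 = a, 1 = b).
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    rank_sum_a = 0.0
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average rank.
        midrank = (i + 1 + j) / 2
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_a += midrank * sum(1 for _, g in pooled[i:j] if g == 0)
        i = j
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return u_a, 1.0  # all values tied: no evidence of a difference
    z = (u_a - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u_a, p
```

For a domain score, group_a could hold the scores of the patients with frequent exacerbations and group_b those of the remaining patients; a p-value below 0.05 would then count as a statistically significant difference under the criterion stated above.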

Results
A total of 173 patients from 19 primary care practices and 9 pulmonology outpatient hospital practices contributed data for the study. Since eleven cases had missing data in one or more variables, analyses for the internal consistency were performed on 162 complete cases. Characteristics of the study population are presented in Table 1. The number of patients who completed both measurements of the ABC scale was 137. Thus, the test-retest reliability analysis was performed with these 137 cases. Analyses for the convergent validity were performed with the 133 complete cases who completed the ABC scale and the SGRQ within an interval of one month. Discriminant validity and known-groups validity were analysed with the 162 complete cases.
At baseline, patients' total score on the ABC scale ranged from 0.0 to 4.6. The scores of the subdomains ranged from 0.0 to 5.0 in the functional status domain, and 0.0 to 6.0 in the other domains.

Test-retest reliability
The ICC was 0.92 (95% CI: 0.88,0.94) for the total score of the two consecutive ABC scales taken two weeks apart.

Convergent and discriminant validity
The ABC scale total score and the total score of the SGRQ showed a significant correlation of r = 0.72 (95% CI: 0.61, 0.86; p < 0.001). The ABC scale score and the FEV1 (% predicted) showed a significant, but weak, correlation (r = −0.28; 95% CI: −0.43, −0.13; p < 0.001). A table including correlations between sub-scores of the domains is available for reference in Appendix B.

Known-groups validity
Results of the known-groups analyses are presented in Table 2. Patients with a history of frequent exacerbations showed a significantly higher score on the symptoms domain of the ABC scale. Patients with a higher GOLD classification had a significantly higher score on the functional status domain, but not on the fatigue domain. Inactive patients had a significantly higher score on the functional status domain. Patients with depression or borderline depression had a significantly higher score on both the mental status and emotions domains.

Discussion
This study shows a good internal consistency for the total ABC scale and its individual domains. Furthermore, the test-retest reliability showed an excellent correlation (ICC = 0.92) between the measurements with a 2-week interval. As hypothesized, the ABC scale correlated strongly with the SGRQ (r = 0.72) and weakly with the FEV1 % predicted (r = -0.28) (27,30). Even though the different domains of the scale had good discriminative properties (compared by median and mean scores), in accordance with our hypothesis, this was not the case for the fatigue domain when patients were grouped according to their GOLD stage. Patients in GOLD stages 3 and 4 did not have a significantly higher score on the fatigue domain than the patients in GOLD stages 1 and 2. Additional analyses, however, showed a significantly higher score on the fatigue domain in the inactive patients, in the frequent exacerbators and also in depressed and borderline depressed patients, consistent with other studies (31-34) (see Table 2). The ABC scale is the first questionnaire that attempts to fully comply with the definition of burden of disease as the physical, psychological, emotional and/or social burden as experienced by the patient. It is an adaptation of the CCQ, with the addition of the domains emotions and fatigue (5). The addition of the two domains did not change the outcomes of the validity and reliability analyses, as the reported values and correlation coefficients for reliability and validity of the ABC scale were comparable with the values reported in articles assessing the psychometrics of the CCQ (21,23).
The ABC scale was not developed to replace the CCQ in the measurement of health status in daily clinical practice. The ABC scale was developed to be an important part of the ABC tool. This tool encompasses a visualisation of the scores in terms of a balloon diagram and an algorithm that gives personal treatment advice based on the integrated health status of the patient, defined as the ABC scale score and the following additional items: smoking status, exacerbations, dyspnoea [as evaluated by the MRC scale (35)], body mass index (BMI), lung function parameters, and physical activity. Use of the ABC tool facilitates more personalized treatment of COPD. The ABC tool and its relation to the ABC scale are described elsewhere (5).

Strengths and limitations
One could argue about the generalizability of the study population. There were relatively few patients with GOLD stage 1 and relatively many GOLD 3 patients as compared to the general COPD population of the Netherlands (8.6% vs. 28% and 30.9% vs. 15%, respectively) (36). The most obvious reason for this imbalance is that almost half of the patients were recruited from pulmonologists' practices and therefore had worse lung function parameters, while the majority of the general COPD population of the Netherlands is treated by a primary care physician. This is unlikely to have influenced the results, since there was an adequate number of patients from each of the GOLD stages 1-3 and, as shown, FEV1, on which the GOLD stages are based, does not correlate well with burden of COPD.
A possible limitation of this study is the different locations at which the questionnaires were filled out. Due to the design of the study, the SGRQ and the HADS were filled out at home, while the ABC scale was filled out in the waiting room of the healthcare provider's practice (but without supervision), which also resulted in a different time of administration. Since health status changes over time, it would have been ideal to complete both the SGRQ and the ABC scale at the same time point for assessing their correlation. Because of these different time points of administration, we only included patients in the analyses with a difference in administration time of no longer than one month. Analyses with the full sample (n = 162) yielded results similar to those of the sub-sample analyses. The different locations could also have resulted in different completion behaviour. However, patients came to the practices for their regular monitoring appointment and not specifically for the study. Furthermore, this applied to all patients participating in the study.
Another possible limitation of this study is that the patients' clinical status was not monitored in the 2 weeks between the two measurements used for the test-retest analysis. We therefore did not have information about whether patients were clinically stable during those 2 weeks.
A major strength is the small number of missing data in the SGRQ and HADS. There were 173 patients who filled out questionnaires, and because only 11 cases missed one or more baseline variables, analyses could be performed with 162 cases. This can partly be explained by the fact that approximately one third of the patients completed the SGRQ and the HADS in an online program, which ensured that patients were unable to continue the questionnaire if questions were left unanswered.
There was a larger amount of missing data for the first ABC scale, which was specifically added to allow the 2-week test-retest analysis (36 cases). This can partly be explained by the two visits to the practice required to fill out both ABC scales. Not every patient filled out this first ABC scale. Another 10 of the first ABC scales were lost while being mailed to the researcher. However, there is no reason to believe that missing data introduced a systematic error in any of the analyses.
Another important strength of this study is its sample size (n = 162); this is large as compared to the validity testing of the CCQ (n = 119) (23), and especially compared to the test-retest reliability testing (ABC scale: n = 137; CCQ: n = 20). Furthermore, the sample is highly representative of the Dutch COPD population, because the data originated from both primary and secondary care patients, distributed all over the Netherlands, which decreases any possible regional influence.

Implications for clinical practice/research
This study shows that the ABC scale has good reliability and validity. It can be deployed within the ABC tool to measure the health status of the patient, to provide insight into the personal burden of COPD and to support the development of a personalized treatment plan to decrease this burden. Together with the ABC tool it can be used to facilitate the dialogue about burden of COPD and to evaluate the effectiveness of the interventions.
The responsiveness of the ABC scale needs to be determined to see how the ABC scale score reflects change over time and the effect of treatment interventions. This information is essential for usability in daily practice. A study evaluating these characteristics is currently ongoing. Additional research will be performed to determine the importance of the items and domains of the ABC scale from a patient's perspective, by means of a Discrete Choice Experiment.

Conclusion
The ABC scale is a 14-item self-administered questionnaire that measures the burden of disease in patients with COPD. Data presented in this study support the reliability and the convergent and discriminant validity of the ABC scale for use in patients with COPD in both primary and secondary care. Known-groups analyses showed the ability of the ABC scale to differentiate among different groups of patients. Additional studies regarding the responsiveness and the minimal clinically important difference are needed before it can be implemented in daily care as a method towards more personalized COPD care.