Disability weights from a household survey in a low socio-economic setting: how does it compare to the global burden of disease 2010 study?

Background The global burden of disease (GBD) 2010 study used a universal set of disability weights to estimate disability adjusted life years (DALYs) by country. However, it is not clear whether these weights can be applied universally in calculating DALYs to inform local decision-making. This study derived disability weights for a resource-constrained community in Cape Town, South Africa, and interrogated whether the GBD 2010 disability weights necessarily represent the preferences of economically disadvantaged communities. Design A household survey was conducted in Lavender Hill, Cape Town, to assess the health state preferences of the general public. The responses from a paired comparison valuation method were assessed using a probit regression. The probit coefficients were anchored onto the 0 to 1 disability weight scale by running a lowess regression on the GBD 2010 disability weights and interpolating the coefficients between the upper and lower limit of the smoothed disability weights. Results Heroin and opioid dependence had the highest disability weight of 0.630, whereas intellectual disability had the lowest (0.040). Untreated injuries ranked higher than severe mental disorders. There were some counterintuitive results, such as moderate (15th) and severe vision impairment (16th) ranking higher than blindness (20th). A moderate correlation between the disability weights of the local study and those of the GBD 2010 study was observed (R2=0.440, p<0.05). This indicates that there was a relationship, although some conditions, such as untreated fracture of the radius or ulna, showed large variability in disability weights (0.488 in local study and 0.043 in GBD 2010). Conclusions Respondents seemed to value physical mobility higher than cognitive functioning, which is in contrast to the GBD 2010 study. This study shows that not all health state preferences are universal. 
Studies estimating DALYs need to derive local disability weights using methods that are less cognitively demanding for respondents.


Introduction
Disability weights are used to quantify time of healthy life lost due to disability and represent the severity of health loss on a scale of 0 (full health) to 1 (death) (1). They are used to calculate the morbidity component of disability adjusted life years (DALYs), which allows mortality and morbidity outcomes to be combined into a single metric, allowing for a comprehensive assessment of health conditions. The DALY was first introduced in the 1990s and has been used by the global burden of disease (GBD) group to estimate global population health ever since (1). Disability weights are based on valuations of health states, which are short, lay descriptions of health with no accompanying disease label; they can encompass the effects of several diseases, for instance, blindness can be the effect of diseases such as diabetes and stroke.
In order to derive disability weights, it is necessary to define the group whose responses should be elicited, the method of health state valuation measurement, and the health state descriptions (2). Study participants consist of  either health experts, patients experiencing the health effects, caregivers, or the general public (3). There are various methods used to elicit health state responses. Some of these require study participants to make trade-offs either in time (TTO) or person-years (PTO). Another method is the standard gamble approach, which requires specifying a risk of death against improvement in health, and other options are the pairwise trade-off and willingness to pay valuation methods, which involve ranking health states against each other (2,4). A health state description can describe a disease condition in generic or disease-specific terms (2). The various methodological options used to derive disability weights could result in differing estimates for the same health states and thus influence the overall burden of disease estimates for specific disease conditions (2).
In 1996, the GBD study constructed disability weights based on the valuation of different health states by health experts (1). These were used as a universal set of disability weights in the 1996 GBD DALY and also in subsequent GBD estimates (5,6).
The 1996 study as well as subsequent GBD studies have been criticised for excluding the valuation of health states by the broader community and therefore not reflecting a global understanding of health (7). Studies have indicated that the weights are not universal. For instance, Jelsma et al. (8) compared the ranking of the 22 indicator conditions used in the 1996 GBD between health professionals and non-professionals. They reported that the rankings of the indicator conditions of Zimbabwean health professionals were very similar to those of the GBD (Spearman's r=0.912) but that the correlation between health professionals and the lay public was much lower (r=0.341). They concluded that the weights represented the values of a small group of educated elite, rather than those of society as a whole. Üstün et al. (9) compared the ranking of 17 health conditions from 241 informants (health professionals, policy makers, people with disabilities, and their carers) across 14 countries. They found significant differences (p<0.05) in rankings between countries for 13 of the 17 health conditions, whereas 5 health conditions were ranked significantly differently (p<0.05) between the different informant groups. The findings of these studies indicate that the disability weights derived by the 1996 GBD study may not be universally applicable.
The GBD 2010 study sought to address this critique by undertaking a large-scale re-estimation of disability weights. The researchers assessed health state preferences of 220 unique health states of the general public through population-based surveys (10). A household survey was undertaken in five countries (Bangladesh, Indonesia, Peru, Tanzania, and the United States) as well as a web-based survey to elicit responses from populations diverse in language, culture, and socio-economic status (10).
Respondents in the household surveys were given paired comparison questions of health states whereas the respondents in the web survey were assigned to one of four different survey versions, which included paired comparison and population health equivalence questions. To assess country differences in health state valuations, paired comparison health state valuations were combined from all the pooled analyses and compared to each household survey. A high degree of consistency in ranking was found between sites (r=0.90 or higher) for all except Bangladesh (r=0.75). Based on this finding, the authors disputed the hypothesis that health valuations vary widely across cultural, educational, and environmental circumstances.
Although the GBD 2010 study was intended to assess a diversity of respondents, 57% had tertiary education and 17% had only primary education or less. In addition, more than 50% of the respondents participated in the web survey of which 93% had tertiary education and might have had prior experience with the GBD approach because of the type of recruitment used to attract the web participants. This suggests that the study participants may not have been as diverse as intended and that the five selected sites were perhaps not sufficient to represent the global population.
The South African team approached the GBD researchers because they wished to replicate the GBD study in the small suburb of Lavender Hill in Cape Town. Local researchers were trained by the GBD team and were allowed to utilise the GBD methodology and data collection instruments. However, as the South African sample was not part of the original global sample and the sample size was much smaller, data from the Cape Town sample were not included in the overall GBD 2010 analysis.
The GBD improved its methods in deriving disability weights for its 2010 study, but has not been able to refute all criticisms related to the universality of the weights (11,12). There is still a need to interrogate to what extent the weights represent the preferences of disadvantaged and less-educated communities in particular. The existence of data collected using almost identical methodology to the main study allowed us to compare the global health state preferences with those representing the resource-deprived community of Lavender Hill in Cape Town.
The GBD study assumes that health preferences are generalisable across different populations despite differences in socio-economic status, and cultural and political beliefs (10). This hypothesis was explored by comparing the disability weights of this study to those in the GBD 2010 study.

Methods
As explained here, the methodological design and tools used in this study were conceptualised by researchers of the GBD 2010 study, led by Joshua Solomon (10). The South African data collection team was trained by the GBD researchers, and the software developed by the GBD team was used to gather the local data. The method was similar to that used in the GBD 2010 study with the following important exceptions. Firstly the local study used the TTO valuation method to assess the indicator conditions whereas the GBD 2010 study used a population health equivalence method. The local study assessed 51 health states compared with 220 health states in the GBD 2010 study. In addition, the local study used household personal interviews to elicit responses, whereas the GBD 2010 study used household personal interviews as well as telephonic interviews and a web survey method.

Study design and research setting
A household survey was conducted in the resource-deprived community of Lavender Hill, Cape Town, whose residents are bilingual in English and Afrikaans. Census data indicate that this suburb has an approximate population of 32,000, with 19% of people aged 20 years and older having completed Grade 12 or higher, 58% of the labour force being employed, and 59% of households having a monthly income of R3,200 (approximately US$228) or less (13). The dwellings consist of small apartments, houses, and informal settlements, each representing about one-third of all dwellings in the area.

Sampling
An aerial map was used to divide each dwelling type into approximately equal numbers of clusters from which two clusters were selected for each type. Four streets were randomly selected from each cluster and every third dwelling in the street was visited until 20 eligible adults were found. An adult aged 18 years and older was randomly chosen from each household using a statistical package for the social sciences (SPSS) computer algorithm, designed by the GBD team, after obtaining information on the sex and age of each household member.

Sample size calculation
A sample of 700 respondents and 51 health states was assessed to have a margin of error no higher than 0.7 at the 95% confidence level, by simulating the mean relative error against a benchmark of 2,500 respondents and 100 health states.

Data collection procedures and instruments
Data collection occurred between September 2009 and March 2010. Face-to-face interviews were conducted in English by trained interviewers and facilitated by a computer-assisted personal interview programme (CAPI) created by the GBD researchers, with survey questions and response options displayed one at a time in the appropriate order on a laptop screen. Participants did not receive any remuneration for their participation. Interviews lasted an average of 19 min and 24 sec, whereas it took 8 min and 24 sec on average for all the pairwise comparison questions, and 1 min and 12 sec for the TTO valuation.
The CAPI programme was used to randomly choose the different health state valuation questions. It also contained questions on demographics, individual and household assets, marital status, and education.
The health states were presented to respondents as brief lay descriptions, using non-clinical vocabulary that highlighted the symptoms and functional consequences of each health state. These descriptions were the same as those used in the GBD 2010 study (10). Two health state valuation techniques were used to assess health state preferences. Firstly, in a pairwise health state trade-off, respondents were presented with two descriptions of hypothetical people, each with a different health state, and asked which they thought was the healthier. Each respondent completed 15 pairwise comparisons randomly selected from a possible 51 health states, which were extracted from a list of 107. Secondly, a TTO health valuation was used to assess the 10 health states used as indicator conditions, with each respondent required to valuate one indicator condition. In a TTO, respondents are asked to choose between living 10 years with a health state with some mental or physical limitation, and living a shorter period without any limitation. Because hypothetical scenarios were used in both the pairwise trade-off and TTO valuation techniques, no sensitive information was collected. The fieldworkers recorded all answers on the SPSS computer program, and all data were consolidated on a central server.

Pilot study
The questionnaire was piloted in the study population prior to the commencement of data collection to assess whether respondents were able to understand the questions. In addition, a test was done to assess the cognitive ability of the study participants. It was found that the methodology was feasible.

Ethical considerations
The study was approved by the Health Research Ethics Committee of the University of Cape Town. Informed consent was obtained from all participants.

Analysis
Analysis was performed using Stata/IC 12.0 and Microsoft Excel 2010. Descriptive analysis was undertaken for the variables sex, age, marital status, and education. Marital status distinguished between those currently married, those who had been married and were divorced, widowed or separated, and those who had never been married. Divorced implied a legal separation, whereas separated implied living apart without a legal process having been followed. Living together implied living as man and wife without having gone through the process of legal marriage. Education distinguished between the different levels of schooling, that is, primary, secondary, higher education, and no schooling. Some primary or some secondary refers to respondents who had attended primary or secondary school without completing the highest grade.
Disability weights were derived for the 10 health states used as indicator conditions. This was done by deriving the disutility (1 - years/10) of the upper and lower limit of each health state. The disutilities were then logit transformed, as was done in the GBD 2010 study, because this allows for normally distributed error (10). The logit-transformed disutilities were then used to conduct an interval regression. Respondents were given the option of choosing between living 3, 5, or 7 years in perfect health instead of 10 years with a particular health state. However, as there was a possibility of respondents choosing values in between those presented to them, an interval regression was used. The resulting coefficients were then back transformed onto the 0 to 1 disability weight scale.
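The transformation described above can be sketched as follows. This is a minimal illustration rather than the Stata code used in the study: the function names are hypothetical, the interval regression itself is omitted, and the disutility is assumed to be 1 - years/10 for a trade-off of a given number of healthy years against 10 years in the health state.

```python
import math

def tto_disutility(healthy_years, horizon=10.0):
    """Disutility implied by accepting `healthy_years` in full health
    instead of `horizon` years in the health state (1 - years/horizon)."""
    return 1.0 - healthy_years / horizon

def logit(p):
    """Map a value in (0, 1) onto the real line."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Back-transform a logit value onto the 0 to 1 scale."""
    return 1.0 / (1.0 + math.exp(-x))

# The survey offered 3, 5, or 7 healthy years against 10 years in the state.
for years in (3, 5, 7):
    d = tto_disutility(years)
    # Columns: years offered, disutility, logit value, back-transformed value.
    print(years, round(d, 2), round(logit(d), 3), round(inv_logit(logit(d)), 2))
```

A respondent whose indifference point lies between two offered options would contribute the two corresponding logit values as the lower and upper bounds of an interval, which is what an interval regression takes as input.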
To analyse the paired comparison valuations, a probit regression was used to assess the relative difference in severity of health states, following the GBD 2010 approach (10).
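One common way to write a probit model for paired comparisons is sketched below. The exact specification used in the study is described in reference 10 and is not reproduced here. The sketch assumes that each health state has a latent severity coefficient and that the probability of the first state being judged healthier is the standard normal CDF of the difference in severity. The severity values are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_first_healthier(severity_first, severity_second):
    """Probability that the first state is chosen as the healthier one,
    assuming the choice depends only on the difference in latent severity."""
    return normal_cdf(severity_second - severity_first)

# Hypothetical severity coefficients for two health states.
mild, severe = 0.2, 1.1
print(round(prob_first_healthier(mild, severe), 3))   # mild state shown first
print(round(prob_first_healthier(severe, mild), 3))   # severe state shown first
```

Under this assumption only differences between coefficients are identified, which is why the coefficients need to be anchored to the 0 to 1 scale in a separate step.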
Rescaling of the probit coefficients was assessed using two methods. Firstly, a linear regression against the logit-transformed disability weights of the indicator conditions was run, with the resulting slope and intercept used to transform the coefficients onto the disability weight scale. Secondly, a lowess regression of the probit regression coefficients against the logit-transformed disability weights of the GBD 2010 study (10) was run, following the method used in the study by Haagsma et al. (14). The predicted smooth coefficients were then back transformed to yield disability weights on the 0 to 1 scale. The probit coefficients were then linearly interpolated between the upper and lower limit of the disability weights.
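The final interpolation step can be illustrated with a short sketch. It covers only the linear interpolation between a lower and an upper limit, not the lowess smoothing, and it is not the study's own code. The coefficient values and health state labels are hypothetical, and the limits of 0.04 and 0.63 are used purely for illustration.

```python
def rescale_linear(coefs, lower, upper):
    """Linearly map probit coefficients onto the range [lower, upper].

    `coefs` is a dict of health state -> probit coefficient (hypothetical)."""
    lo, hi = min(coefs.values()), max(coefs.values())
    span = hi - lo
    if span == 0:
        raise ValueError("coefficients must not all be equal")
    return {
        state: lower + (value - lo) / span * (upper - lower)
        for state, value in coefs.items()
    }

# Hypothetical probit coefficients for three health states.
probit_coefs = {"state_a": -0.8, "state_b": 0.1, "state_c": 1.0}
weights = rescale_linear(probit_coefs, lower=0.04, upper=0.63)
for state, weight in weights.items():
    print(state, round(weight, 3))
```

The mapping preserves the ordering and relative spacing of the coefficients, so the least and most severe states take the lower and upper limits respectively.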
These disability weights were compared with the disability weights of the GBD 2010 study, using Pearson's correlation coefficient with significance set at p<0.05.
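For reference, Pearson's correlation coefficient can be computed as in the sketch below. The two lists are hypothetical values chosen for illustration, not the study data, and the significance test is omitted.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical disability weights for the same five health states
# in a local study and in a reference study.
local_weights = [0.10, 0.25, 0.30, 0.45, 0.60]
reference_weights = [0.05, 0.30, 0.20, 0.35, 0.70]
print(round(pearson_r(local_weights, reference_weights), 3))
```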
A 95% confidence interval for each disability weight was derived by lowess regression of the upper and lower limit of

Results
Of the 741 people selected for interviews, 62 refused to answer any questions and 2 partially completed the health valuation questions. The results are therefore based on the answers of the 677 respondents who completed all the questions. The age range of the respondents was between 18 and 81 years, with an average age of 46 years; 59% were female; 94% were between the economically active ages of 18 and 65 years; 35% were married; 21% had completed Grade 12; and, of these, 3% had studied further (Table 1). On average, respondents chose spinal injury at the neck level as the most severe condition with a disability weight of 0.700 using the TTO valuation method, whereas moderate angina pectoris had the lowest disability weight of 0.029 (Table 2). The indicator conditions were not used for rescaling because they showed a negative correlation with the probit coefficients, which results in disability weights that are inversely related to the severity of health states as indicated by the probit coefficients. The lowess approach was therefore used as an alternative rescaling procedure. Table 3 displays the disability weights of the health states used in the pairwise comparison valuation method by health state domains. Heroin and opioid dependence had the highest disability weight of 0.630, followed by severe brain injury (disability weight 0.536), whereas severe intellectual disability ranked lowest (disability weight 0.040). Untreated injuries such as amputation of one leg (disability weight 0.504) and fracture of the radius or ulna (disability weight 0.488) ranked high at third and fourth, respectively. The health states with the highest disability weights seem plausible, as they have higher severity. Similarly, health states with milder severity such as primary (disability weight 0.047) and secondary infertility (disability weight 0.108) and periodontitis (disability weight 0.175) ranked low at 50th, 48th, and 46th, respectively.
However, some of the results seem counterintuitive such as severe and moderate vision impairment ranking higher than blindness. Respondents chose the first health state mentioned in a paired comparison as the healthier option 53% of the time, while the first health state was also selected 53% of the time when respondents were asked to decide between health states with similar severity.
The Pearson's correlation coefficient between the disability weights of the local study and those of the GBD study was 0.44 (p=0.0015; Fig. 1). The correlations between the disability weights within each health state domain were poor except for the mental, behavioural, and substance-use domain, which had a Pearson's correlation coefficient of 0.66 (p>0.05), and the musculoskeletal disorders domain (r=0.79, p>0.05). The point estimates of the local study were within the 95% confidence interval of the GBD 2010 study for 25.5% of all health states, whereas 58.8% were higher than the upper bound and 15.7% were below the lower bound of the GBD 2010 uncertainty interval.
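As a quick arithmetic check, the three percentages above are consistent with 13, 30, and 8 of the 51 health states. These counts are inferred from the reported percentages and are not stated in the text.

```python
n_states = 51  # health states assessed in the local study

# Counts inferred from the reported percentages (an assumption, not reported data).
counts = {"within": 13, "above": 30, "below": 8}
assert sum(counts.values()) == n_states

shares = {group: round(100 * count / n_states, 1) for group, count in counts.items()}
print(shares)  # {'within': 25.5, 'above': 58.8, 'below': 15.7}
```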

Discussion
This is the first study to our knowledge that has attempted to derive disability weights for a range of health states from the general public living in a resource-deprived community in South Africa. The health states with the highest and lowest disability weights seem plausible; however, there were also some counterintuitive results. The overall correlation of the disability weights between the local and GBD 2010 study was moderate but statistically significant, indicating a relationship between the disability weights of the two studies. However, there was considerable variability in the disability weights for selected conditions, such as ear pain (0.486 in local study and 0.013 in GBD 2010), untreated fracture of the radius or ulna (0.488 in local study and 0.043 in GBD 2010), and amputation of one leg, untreated (0.504 in local study and 0.173 in GBD 2010). The correlation coefficient would have been stronger if the ranking of certain health states was not as counterintuitive. For instance, moderate dementia ranked lower than mild dementia, severe alcohol-use disorder ranked lower than moderate-use disorder, and blindness ranked lower than both severe and moderate vision impairment. The GBD assertion of universality of health state preferences is not entirely supported by the results of this study. This finding therefore raises questions about circumstantial factors that may influence perceptions of health states.
Untreated injuries ranked particularly high, whereas severe intellectual disability ranked lowest, which might indicate that local respondents value physical mobility higher than cognitive functioning. In the GBD 2010 study, mental disorders such as acute schizophrenia (disability weight 0.776) and severe major depression (disability weight 0.658) had among the highest disability weights, whereas injuries such as amputation of one leg without treatment ranked much lower with a disability weight of 0.173. These differences might be contextual because the South African research site is an impoverished community whereas most respondents in the GBD 2010 were from the United States and Australia, and had tertiary education and high living standards. There have been other studies which suggest that contextual factors influence differing health state preferences. In a review of disability weight studies, Haagsma et al.   (17). A health state valuation may depend on the social stigma or disruption to social life for one person, whereas another might valuate the same health state on the basis of loss of working ability or time loss (17). The wording of the health state descriptions might be important in the understanding of the severity of health states (12). The descriptions for health states within the mental, behavioural, and substance-use disorders domain included the cause of the health state, for example, 'drinking of alcohol', whereas this was excluded from most other health state descriptions. This could have made it easier for participants to relate to the health state, which might explain the good correlation of health states within this domain.
The overall approach used to valuate health states may also not be effective in producing reliable results in all settings. The techniques used in valuating health states can be cognitively demanding (18) with respondents usually not familiar with the method, and the design often does not allow enough time to reflect on the health state choices made (19). In addition, the task, which involves making multiple complex choices, is quite strenuous (17). The counterintuitive results observed in the local study, such as severe and moderate vision impairment ranking higher than blindness, might suggest that the health state descriptions were not well understood. There was no repetition of the first and last paired comparison questions, which could have given an indication of whether the methodology was well understood by the respondents.
The GBD study group recently published new disability weights for their 2013 analysis of the global disease burden (20). Changes have been made to some health state descriptions to add consistency in wording and additional health states have been added. The study pooled the GBD 2010 disability weights with disability weights from a study conducted in five European countries (14). The health states that were the same in the GBD 2010 and GBD 2013 study showed a high degree of correlation with a Pearson's correlation coefficient of 0.992. The GBD 2013 study involved double the number of respondents to the 2010 study but they were mostly from high-income countries. Hence, more studies from low-resource settings are needed to test the GBD assertion regarding the universality of their disability weights.

Limitations
A possible limitation of this study is that the survey questionnaire was presented in English, whereas the home language of some participants was Afrikaans. However, during pilot testing respondents showed good understanding of the questions and the ability to reason rationally. The health state rankings may also have been influenced by the sample size, which may not have allowed for all possible pairs of health states to be sufficiently compared in the pairwise comparison health state valuation. Another limitation is that only one TTO exercise was assigned to each respondent. The cognitive difficulty of a TTO might require more than one exercise before an understanding of the concept is developed.

Conclusion
In conclusion, this study is unable to refute the claim that health state preferences are universal, although it does show differences in the preference of health states between the local and GBD study. A universal set of disability weights might be preferable for comparing DALYs between countries; however, the counter argument is that empirical disability weights are needed to better represent DALY estimates for each country. Although country-specific disability weights would be ideal, the current methods used to assess health state preferences by the GBD group might not be feasible in all settings. To derive empirical disability weights in low socio-economic settings might require methods that are less cognitively demanding for respondents. A visual analogue scale might be the easiest health valuation method to explain to respondents. However, validation studies that test for the best methods in such settings are advisable. The DALY is a valuable tool as it gives a comprehensive picture of morbidity and mortality for different diseases using a single metric, allowing easier decisions regarding resource allocation towards diseases with high burden (21,22). However, accurate estimates of mortality and morbidity are also needed in addition to disability weights, especially in resource-constrained countries such as South Africa.