Comparison between Swedish EORTC QLQ-C30 general population norm data published in 2000 and 2019

Abstract Background Normative health-related quality of life (HRQoL) data from the general population are regularly used to facilitate the interpretation of HRQoL as reported by cancer patients participating in cancer clinical trials, especially when conducting long-term follow-up studies after treatment. The aim of the present study is to compare two Swedish normative data sets, published in 2000 and 2019 respectively, and explore whether HRQoL as reported by the Swedish general population has changed over time. Material and Methods ‘Sample 2000’ was comprised of normative data from the Swedish general population who responded to the EORTC QLQ-C30 in a Swedish mail survey in 1999 (n = 3069). ‘Sample 2019’ consisted of data from the Swedish general population collected as part of a European norm data study using online panels, published in 2019 (n = 1027). Data were analyzed stratified by sex and age (40–49; 50–59; 60–69; 70–79 years). Results For most of the subscales and single items, no age group differences between the two samples were found, with the exception of the oldest age group (70–79 years), where Sample 2019 generally showed better HRQoL as compared to Sample 2000. Lower (worse) levels of Global quality of life and higher (worse) levels of Dyspnoea were found in Sample 2019 for most age groups. Conclusion There were no differences found between the samples for most EORTC QLQ-C30 subscales and single items, with the exception of the oldest age group of both sexes in Sample 2019 who reported better HRQoL on many variables. When deciding which normative dataset to use, the mode of data collection and age group have to be considered.


Background
Normative health-related quality of life (HRQoL) data from the general population are regularly used to facilitate the interpretation of HRQoL as reported by cancer patients participating in cancer clinical trials. The normative data are particularly interesting when conducting long-term follow-up after treatment in order to evaluate recovery back to a level regarded as 'normal'. Normative data for the European Organization for Research and Treatment of Cancer core Quality of Life Questionnaire (EORTC QLQ-C30) from the Swedish general population have been collected and published at three time points between 2000 and 2019 [1][2][3]. In the first publication, Michelson et al. (2000) [1] presented data according to sex and six age groups. In the second publication by Derogar et al. (2012), four age groups were presented, corresponding to four of the groups in the first publication [2]. In 2019, Swedish normative data were published in connection with a large-scale study that was aimed at defining the European Norm for the EORTC QLQ-C30 based on 11 European countries. The latter data set presented the normative data according to the same age groups as used in the publication from 2000 [3].
It is uncertain whether the HRQoL of the general population changes over time and, therefore, previously published norm values [1,2] may or may not mirror the norm values of the Swedish general population today. Hence, there is a need to compare the previously published normative data with the latest publication in order to establish values that can be used for comparisons with cancer patients in current and future studies.
The aim of the present study is to compare two Swedish EORTC QLQ-C30 general population normative data sets, published in 2000 and 2019 respectively, and explore whether the HRQoL as reported by the Swedish general population has changed over time.

"Sample 2000", the Swedish general population normative data from 2000 [1]
Questionnaires were sent via mail from Karolinska Institutet in 1999 to a random sample of 4008 adults of the Swedish general population, stratified by age (six age groups: 18-29, 30-39, 40-49, 50-59, 60-69, 70-79 years). The sample was drawn from a population-based registry (SEMA), including all Swedish inhabitants born between 1918 and 1979. The postal package also contained an information letter and a return envelope. One reminder was sent after two weeks, and a second reminder was sent, together with a new set of questionnaires, after one month. No reimbursement was offered. A total of 3069 (78%) persons responded. The distribution of sex, marital status, education, income, and employment in Sample 2000 has been described in detail in the original publication [1]. The distribution of co-morbidities in this sample according to age has also been presented earlier [4].

"Sample 2019", the Swedish general population normative data embedded in the EORTC QLQ-C30 European norm data study from 2019 (study number: 1519) [3]
In 2017, the EORTC Quality of Life Group (QLG) collected norm data from 11 countries of the European Union (EU), including Sweden, to define the European Norm for the EORTC QLQ-C30 [3]. The samples were stratified by sex and age (five age groups: 18-39, 40-49, 50-59, 60-69 and ≥70 years). The target sample size was 1000 individuals in each country. Data collection was subcontracted to the panel research company GfK (www.gfk.com), which used internet panels representative of the general population with access to the internet. For the present paper, the Swedish norm data sample (n = 1027) was requested from the EORTC Quality of Life Department. According to the original publication, GfK estimated the response rate to be between 79% and 90%, as the panel members were registered voluntarily and generally willing to participate [3].
For the comparison of the two samples, four age categories were chosen (i.e., 40-49, 50-59, 60-69, and 70-79 years) for analysis in the present paper.

The instrument
The EORTC QLQ-C30 questionnaire was developed by the EORTC QLG 30 years ago [5] and has since been used in a large number of cancer clinical trials [6]. It consists of 30 items comprising five functioning scales, three symptom scales, six single-item symptom scales, and one Global health status/quality of life (QoL) scale. Most of the items are rated on a four-point response scale (1 = 'Not at all' to 4 = 'Very much'), with a recall period of one week. The Global health status/QoL scale consists of two items rated on a seven-point scale (1 = 'Very poor' to 7 = 'Excellent').

Statistical Methods
Data were analyzed according to the EORTC QLQ-C30 scoring manual [7]. Normative mean values and standard deviations (SD) for each scale, stratified by sex and age group for Sample 2019, were calculated. Mean age-specific scale scores are graphically presented as mean profiles. Data from Derogar et al. (2012) [2] are included in the profiles for visual comparison of the two samples.
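As an illustration of the scoring step, the sketch below follows the linear transformation to a 0-100 scale as we understand it from the EORTC QLQ-C30 scoring manual [7]: the raw score is the mean of the answered items, functioning scales are reversed so that higher values are better, and a scale is scored only if at least half of its items were answered. The function names and the example responses are hypothetical and are not taken from the study data.

```python
from statistics import mean


def raw_score(items):
    """Mean of the answered items, or None if fewer than half were answered."""
    answered = [x for x in items if x is not None]
    if len(answered) * 2 < len(items):
        return None
    return mean(answered)


def scale_score(items, item_range, functional):
    """Transform item responses to a 0-100 scale score.

    item_range is the maximum minus the minimum response value:
    3 for the four-point items, 6 for the two seven-point global items.
    functional=True reverses the scale so that higher means better functioning.
    """
    rs = raw_score(items)
    if rs is None:
        return None
    fraction = (rs - 1) / item_range
    return (1 - fraction) * 100 if functional else fraction * 100


# Hypothetical responses, for illustration only.
physical = scale_score([1, 1, 1, 1, 1], item_range=3, functional=True)   # best possible functioning
dyspnoea = scale_score([2], item_range=3, functional=False)              # single item, 'A little'
global_qol = scale_score([6, 7], item_range=6, functional=False)         # seven-point items
```

With this convention, higher values on functioning scales and on Global health status/QoL indicate better levels, whereas higher values on symptom scales and single items indicate worse levels.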
Mean age-specific differences (with 95% confidence intervals [CI]) between Sample 2019 and Sample 2000 are presented. If the 95% CI for an age-specific difference excluded the value 0, the difference was regarded as statistically significant. Once statistical significance was established, differences were further judged for their clinical relevance, i.e., mean age-specific differences of 5-10 points were considered as small, 10-20 points as moderate, and differences of >20 points were interpreted as large [8].
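The two-step judgement described above can be sketched as follows. The exact method used to compute the confidence intervals is not stated in the text, so the sketch assumes a normal-approximation interval with an unpooled standard error computed from summary statistics; the handling of the boundaries at exactly 10 and 20 points is likewise an assumption. All numbers in the example are invented for illustration.

```python
import math


def mean_difference_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Difference (b minus a) with an approximate 95% CI from summary statistics.

    Assumption: normal approximation with an unpooled standard error.
    """
    diff = mean_b - mean_a
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return diff, diff - z * se, diff + z * se


def classify(diff, lower, upper):
    """Apply the significance rule first, then the thresholds from [8]."""
    if lower <= 0 <= upper:
        return "not statistically significant"
    size = abs(diff)
    if size < 5:
        return "significant, not clinically relevant"
    if size < 10:
        return "small"
    if size <= 20:
        return "moderate"
    return "large"


# Invented summary statistics for one age group (not study data).
diff, lower, upper = mean_difference_ci(70.0, 20.0, 300, 78.0, 18.0, 120)
label = classify(diff, lower, upper)  # CI is about 4.1 to 11.9, so "small"
```

Note that the sign of the difference carries a different meaning for functioning scales (higher is better) and symptom scales (higher is worse), which is why the classification uses the absolute size only.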

Results
Normative values (mean scores, SD) for the various EORTC QLQ-C30 scales for Sample 2019 are presented in Table 1. Age-specific mean differences by sex (with 95% CI) between the two samples are shown in Table 2. In Figure 1, mean scores for the two samples are shown graphically, also including data from the article published in 2012 for comparative purposes [2].
Global health status/QoL: Statistically significant and clinically relevant differences were found for females across the three youngest age groups between Sample 2000 and Sample 2019, with higher (i.e., better) levels in Sample 2000 than in Sample 2019. Each of the three age groups showed a small clinically relevant difference. For males, statistically significant and clinically relevant differences were found for the two youngest age groups, i.e., small for age group 40-49 years and moderate for the age group 50-59 years, favouring Sample 2000. No differences were found for the two oldest age groups.
Physical functioning and Role functioning: No differences were found for either females or males across age groups, with the exception of the oldest age group where a statistically significant difference was found. For both females and males, a small clinically relevant difference was seen in favour of Sample 2019 compared to Sample 2000.
Cognitive functioning: Females in age groups 40-49 and 50-59 years of Sample 2019 showed lower mean values than the same age groups in Sample 2000. Observed differences were statistically significant, corresponding to a small clinically relevant difference. No difference was found for the age group 60-69 years. In the age group 70-79 years, a small clinically relevant difference was found, with higher mean values observed in Sample 2019. For males, no differences between the two samples were found, except for the age group 70-79 years where a statistically significant difference was found. This difference was, however, not clinically relevant.
Emotional functioning: No subgroup differences were observed between Sample 2000 and Sample 2019 with respect to Emotional functioning.
Dyspnoea: Females in age groups 40-49, 50-59 and 60-69 years in Sample 2019 reported higher levels of dyspnoea as compared to females in the corresponding age groups in Sample 2000. Observed differences were statistically significant, corresponding to moderate clinically relevant differences. Statistically significant differences in the same direction were found for males in age groups 40-49 and 50-59 years in Sample 2019, also corresponding to moderate clinically relevant differences. There were no differences in the older age groups.
Insomnia: A statistically significant difference was found between females in the age group 50-59 years in Sample 2000 and Sample 2019, with higher levels of insomnia in Sample 2019, corresponding to a small clinically relevant difference. For males, no differences were observed with respect to insomnia.
Appetite loss: For females, no differences were observed with respect to Appetite loss. For males, no differences between the two samples were found, except for the age group 60-69 years where a statistically significant difference was found. The difference was, however, not clinically relevant.
Table notes: a Higher values indicate higher (i.e., better) levels of functioning and global quality of life. b Higher values indicate higher (i.e., worse) levels of symptoms and problems.
Constipation: No subgroup differences were observed between Sample 2000 and Sample 2019 with respect to Constipation.
Diarrhoea: No subgroup differences were observed between Sample 2000 and Sample 2019 with respect to Diarrhoea.
Financial difficulties: Females in the oldest age group in Sample 2019 reported lower levels of financial problems as compared to the same age group in Sample 2000. This difference was statistically significant, corresponding to a small clinically relevant difference. For males, a statistically significant difference in the same direction was found in the oldest age group; however, the difference was not clinically relevant.

Discussion
Updated Swedish general population normative data, published in 2019 [3], for all EORTC QLQ-C30 subscales stratified by sex and age groups, were compared to Swedish normative data published in 2000 [1] in order to explore whether there were subgroup differences in HRQoL over time.
Four age groups were analyzed in the present paper. Males in the oldest age group scored higher on Physical functioning, Role functioning, and Social functioning, and they showed lower levels of Pain. One explanation for these differences might be the positive development of older age groups in Sweden over time. For example, one Swedish study showed improvements in activities of daily living in 85-year-old birth cohorts, with later born cohorts facing less disability compared to earlier born cohorts [9]. In another study by the same group, it was found that cognitive performance at ages 70-79 years in the general population improved in later born cohorts compared to earlier born cohorts [10]. Another possible explanation for the differences is the different modes of administration (MOA) used to collect the EORTC QLQ-C30 general population norm data in the samples. In Sample 2000, and also in Sample 2012, paper questionnaires were used, whereas Sample 2019 was collected via a web-based questionnaire.
There was also a difference in the recruitment of participants between the samples. Participants in Sample 2000 were drawn from a population-based registry in Sweden, whereas individuals in Sample 2019 had consented to participate in an internet panel, i.e., the latter sample is only representative of the general population with internet access [3]. A study comparing responses from cancer patients who filled out the EORTC QLQ-C30 on paper questionnaires with responses obtained by computer touch screen questionnaires found that the responses between the two modes were similar, but there was a tendency for more positive responses for the touch screen on Emotional functioning, Fatigue, Nausea/vomiting and Appetite loss [11]. It is unlikely that the MOA (paper versus computer) explains the differences found between the samples, as the differences were found for subscales not affected by MOA. Instead, there might be a difference between the participants in the two samples in terms of access to computers, which in turn might be related to socioeconomic status and health. Therefore, when normative data are to be used for the interpretation of cancer patients' HRQoL data, it is important to consider whether the data from patients were collected by paper questionnaires or by web-based questionnaire, especially in the oldest group.
Global health status/QoL was scored lower in all but one of the female age groups in Sample 2019 compared to Sample 2000, and also by the two youngest male age groups. In contrast to our results, a Swedish study by Waller et al. (2021) that monitored trends in well-being and perceived mental stress in the populations of 38- and 50-year-old women found that ratings of well-being improved in generations of 50-year-old women between 1980 and 2016 [12]. In that study, well-being was assessed by an item asking: "How do you experience your health situation (well-being)?". The respondents were instructed that "health situation/well-being" reflected mental and physical health, but no specific examples were given. In the EORTC QLQ-C30, two items are used to assess Global health status/QoL, including one item asking about 'overall health' and one item assessing 'overall quality of life'. Although seemingly similar, 'well-being' and 'quality of life' might be two distinct concepts, which might explain the differences between our study and the study by Waller et al. (2021) [12]. Another possible explanation for the difference might be the way the data were analysed. In Waller et al. (2021) [12], responses to the seven-category scale were categorized into two groups, 'good' versus 'poor' well-being. Thus, a small change in the middle of the scale might result in a cohort difference over time. Data for the EORTC QLQ-C30 subscales, in contrast, are transformed to 100-point scales and treated as continuous variables. The Swedish study also reported an increase in mental stress over the years in women in both age groups. Therefore, increased mental stress might be one explanation for the decrease in overall quality of life found in the present study, with more demands on the younger age groups.
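The point about dichotomization can be made concrete with a small constructed example. The sketch below compares the share of respondents classified as 'good' on a seven-category scale with the mean of the same responses after transformation to a 0-100 scale. The cut-off of 5 and the response counts are hypothetical; they are not taken from Waller et al. (2021) [12] or from the present samples.

```python
def proportion_good(counts, cutoff):
    """Share of respondents at or above the cut-off on a categorical scale."""
    total = sum(counts.values())
    good = sum(n for score, n in counts.items() if score >= cutoff)
    return good / total


def mean_0_100(counts, lowest=1, highest=7):
    """Mean response, linearly transformed to a 0-100 scale."""
    total = sum(counts.values())
    raw_mean = sum(score * n for score, n in counts.items()) / total
    return (raw_mean - lowest) / (highest - lowest) * 100


# Hypothetical cohorts of 100 respondents each: {response category: count}.
earlier = {4: 50, 5: 50}
later = {4: 40, 5: 60}

# Ten respondents moving one category across the cut-off...
share_change = proportion_good(later, 5) - proportion_good(earlier, 5)
# ...changes the 'good' share by 10 percentage points,
mean_change = mean_0_100(later) - mean_0_100(earlier)
# while the mean on the 0-100 scale moves by less than 2 points.
```

In this constructed case the dichotomized outcome suggests a clear cohort difference, whereas the continuous score changes by an amount well below the 5-point threshold used in the present paper.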
Surprisingly, there were a number of age-group differences for both females and males for the one-item subscale Dyspnoea, except for the older age groups. The item reads: 'During the last week, were you short of breath?'. One suggested explanation for the differences is that physical exercise has become more popular in the population, and that the authorities encourage people to engage in activities that make them 'short of breath'. Another possible explanation might be that the prevalence of overweight and obesity has increased in the Swedish general population over time, leading to higher levels of dyspnoea. The Public Health Agency of Sweden reports that the proportion of adults (16-84 years) with overweight or obesity increased from 46% to 51% during the period 2006 to 2022 [13]. The most plausible explanation for the high levels of dyspnoea, however, relates to translation issues. This particular item was answered systematically differently in the Swedish general population compared to other countries included in the European norm data study [3] and also in another recent publication [14]. The chosen word for 'breathlessness' might be easier in Swedish, i.e., easier to endorse, than in other languages, explaining the relatively high levels of dyspnoea. This does not, however, explain the observed differences between Sample 2000 and Sample 2019, since the translation of the item has not changed between 2000 and 2019. We have not found any recent publication concerning the validation of the Swedish version of the EORTC QLQ-C30. An old validation study of the Swedish version of the EORTC QLQ-C36 did not include the item on dyspnoea [15]. To our knowledge, no validation study of the Swedish version of the EORTC QLQ-C30, including the dyspnoea item, has been published. Hence, future research should explore whether there are translation issues, which can be investigated using differential item functioning analyses.
The levels of clinical significance were chosen from the work by Osoba et al. (1998) [8]. A recent publication, aiming at identifying minimal important difference (MID) thresholds for the interpretation of group differences, concluded that no single MID can be applied to all EORTC QLQ-C30 scales and disease settings [16]. Differences in MID were mostly within a 2-point range in that study, indicating that the levels used in the present paper underestimate rather than overestimate the clinical differences.
The strength of the present study is the unique opportunity to investigate differences in Swedish EORTC QLQ-C30 general population norm data assessed almost 20 years apart. Sample 2000 included a larger number of subjects than Sample 2019. One weakness is that the data in the two samples were collected using two different MOAs, i.e., via a web-based questionnaire (Sample 2019) versus paper and pencil (Sample 2000). Another weakness is the difference in participant recruitment. An internet panel was used in Sample 2019, whilst Sample 2000 was drawn from a population-based registry. A further drawback is that the oldest group is defined as 70+ years, with only a few participants being 80 years and older. Given that life expectancy is getting higher, with more cancer patients reaching old age, it is important for future research to sample a separate age group of 80-89 years (or ≥80 years) in addition to the age group of 70-79 years.

Conclusions
For most EORTC QLQ-C30 subscales and single items, there were no differences between the two general population norm data samples, with the exception of the oldest age group where generally better HRQoL was observed in Sample 2019 as compared to Sample 2000. In addition, differences were also found for Global health status/QoL and Dyspnoea across most age groups. When deciding which normative data set to use, the mode of administration used for data collection as well as the age group have to be considered.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
The collection of the data published in 2019 was funded by the European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group (QLG), awarded to SN, grant number 001 2015. The EORTC QLG business model involves license fees for the commercial use of their instruments. Academic use of EORTC instruments is free of charge. If you are interested in using any of the EORTC quality of life instruments, please see: https://qol.eortc.org/questionnaires/.

Figure 1. Mean scores for Sample 2000, the publication from 2012 and Sample 2019 according to sex and age group.
Social functioning: No differences were found between Sample 2000 and Sample 2019 among females. Males in Sample 2019 in the age group 70-79 years reported higher mean values than the corresponding age group in Sample 2000. This difference was statistically significant, corresponding to a small clinically relevant difference. No other differences were found among males.
Fatigue: No differences were found between females in Sample 2000 and Sample 2019 in three of the four age groups. Females in the oldest age group in Sample 2019 reported, however, a lower (i.e., better) mean value for fatigue compared to the same age group in Sample 2000.
Pain: Females in the age group 60-69 years in Sample 2019 reported higher levels of pain compared to the same age group in Sample 2000. This difference was statistically significant and corresponded to a small clinically relevant difference. No differences were found for females in the other age groups. Males in the oldest age group in Sample 2019 reported lower levels of pain as compared to men in the same age group in Sample 2000. The difference was statistically significant, corresponding to a small clinically relevant difference. No statistically significant differences were found in the other age groups.

Table 1. EORTC QLQ-C30 mean values and standard deviations (SD) for Sample 2019 according to sex and age group.
The main reason is that the data from Sample 2019 included relatively few participants in the younger age groups as HRQoL of the normative sample with data from cancer populations. Since cancer is relatively uncommon in the age groups under 40 years of age, we decided to focus on ages 40 years and above. For most of the EORTC QLQ-C30 subscales and single items, no age-group differences between Sample 2000 and Sample 2019 were found. When adding the general population sample from 2012 by Derogar et al. (2012) [2] to the comparison, the values of Sample 2012 appear to be similar to Sample 2000, which was also concluded by the authors. As was the case for Sample 2000 [1], Derogar et al. (2012) also collected data by paper and pencil. The most striking result of the comparison is the finding of 'better' HRQoL for a number of subscales in the oldest age group (70-79 years) in Sample 2019 as compared to Sample 2000. Females in this age group reported higher levels of Physical functioning, Role functioning, and Cognitive functioning, as well as lower levels of Fatigue and Financial difficulties compared to Sample 2000. Males in the oldest

Table 2. EORTC QLQ-C30 mean differences, 95% confidence intervals and levels of clinical significance between Sample 2000 and Sample 2019 according to sex and age group.