Test-retest reliability and concurrent validity of the Adapted Short QUestionnaire to ASsess Health-enhancing physical activity (Adapted-SQUASH) in adults with disabilities

ABSTRACT The current study determined the test–retest reliability and concurrent validity of the Adapted Short QUestionnaire to ASsess Health-enhancing physical activity (Adapted-SQUASH) in adults with disabilities. Before filling in the Adapted-SQUASH twice with a recall period of 2 weeks, participants wore the Actiheart activity monitor up to 1 week. For the test–retest reliability (N = 68), Intraclass correlation coefficients (ICCs) were 0.67 (p < 0.001) for the total activity score (min x intensity/week) and 0.76 (p < 0.001) for the total minutes of activity (min/week). For the concurrent validity (N = 58), the Spearman correlation coefficient was 0.40 (p = 0.002) between the total activity score of the first administration of the Adapted-SQUASH and activity energy expenditure from the Actiheart (kcals kg−1 min−1). The ICC was 0.22 (p = 0.027) between the total minutes of activity assessed with the first administration of the Adapted-SQUASH and Actiheart. The Adapted-SQUASH is an acceptable measure to assess self-reported physical activity in large populations of adults with disabilities but is not applicable at the individual level due to wide limits of agreement. Self-reported physical activity assessed with the Adapted-SQUASH does not accurately represent physical activity assessed with the Actiheart in adults with disabilities, as indicated with a systematic bias between both instruments in the Bland–Altman analysis.


Introduction
Measuring patients' physical activity behaviour is important for evaluating effectivity of physical activity promotion interventions and, ideally, individually tailoring rehabilitation programmes among adults suffering from a physical disability and/or chronic disease that impairs mobility (further: adults with disabilities) (Ploeg et al., 2004). Therefore, an accurate and efficient measurement instrument for assessing (self-reported) physical activity in people with physical disabilities is essential. Although multiple measures of physical activity (e.g., accelerometer-derived in combination with self-report) might be preferred (Cervantes & Porretta, 2010), mostly it is not practically feasible, and it is too expensive among large-scale populations in interventions and/ or observational cohort studies (Nigg et al., 2020). Self-reports are frequently used measurement tools to assess physical activity in disabled populations, both in rehabilitation practice and in research (Booth, 2000;Cervantes & Porretta, 2010). Also, questionnaires are easy to fill in (Nigg et al., 2020;Rennie & Wareham, 1998;Wendel-Vos et al., 2003). However, self-reported physical activity depends on the persons' recall and mostly is not sensitive for light physical activities at home (e.g., walking from the bedroom to the toilet and from the kitchen to the dining table) or outside (e.g., walking to the mailbox to post a letter), as was found in adults with spinal cord injury (Ma et al., 2020). (Nigg et al., 2020), and to clarify the dose-response relationship between physical activity and the received counselling during the intervention (Prince et al., 2008), frequency, intensity, duration and type of the activity should be measured. The PASIPD assesses duration and type of physical activities but does not specifically assess the frequency and intensity of physical activities, whereby it was considered not applicable for the ReSpAct cohort. The Short Questionnaire to Assess Health-enhancing physical activity (SQUASH) developed for healthy adults does measure frequency, intensity, duration and type of physical activities (Wendel-Vos et al., 2003). The SQUASH is widely used, for example, by governmental agencies to monitor largescale physical activity behaviour among the Dutch population and to monitor whether physical activity guidelines are achieved. Studies on the psychometric properties of the SQUASH have supported the appropriateness of the SQUASH to measure the level of weekly physical activity in a healthy adult population (Wendel-Vos et al., 2003), in patients after a total hip arthroplasty (Wagenmakers et al., 2008) and in outpatients with ankylosing spondylitis (Arends et al., 2013).
When assessing physical activity in adults with disabilities, it needs to be taken into account that this target population may have a different perceived intensity of activities compared to a healthy population (Dawes et al., 2005). It is expected that adults with a disability experience activities as more intense, because activities often cost (absolutely and relatively) more energy compared to healthy adults (Waters & Mulroy, 1999;Wezenberg et al., 2013). Therefore, the ReSpAct research team converted the original SQUASH (Wendel-Vos et al., 2003) into a measurement tool (mentioned from here: the Adapted-SQUASH) that was expected to better meet the perceived intensity of activities among adults with disabilities compared with the original SQUASH, by using appropriate metabolic equivalent of task (MET) values for this target population (Alingh et al., 2015). Also, the SQUASH was adapted to better match the activity pattern of wheelchair users by including common physical activity behaviours: wheelchair sports (e.g., wheelchair basketball) and questions concerning wheelchair propulsion and handcycling (Alingh et al., 2015). The SQUASH has two main outcome measures: the activity score, measuring a combination of intensity and duration of physical activity per week, and total minutes of activity per week (duration).
It is relevant for (rehabilitation) practice and research in (adapted) physical activity to determine the psychometric properties of the Adapted-SQUASH among a sample of adults with disabilities. Apart from test-retest reliability, concurrent validity is deemed an important asset. The Actiheart (Cambridge Neurotechnology™ UK), an uniaxial activity monitor, was identified by the research team as a suitable criterion measure to compare with the outcomes of the Adapted-SQUASH. Since the Actiheart is a medical device, it is suitable for ambulant people with disabilities. The Actiheart is accurate in measuring physical activity energy expenditure (AEE) in freeliving conditions, and ideally, it combines its measured heart rate and movement sensor information improving the prediction of AEE in daily physical activities (Rennie & Wareham, 1998;Strath et al., 2001).
The current study aims to determine the test-retest reliability and concurrent validity of the Adapted-SQUASH among adults with disabilities. We had focussed on the two main outcome measures of the Adapted-SQUASH, the total activity score and the total minutes of activity per week (Wendel-Vos et al., 2003), which were derived from the test and retest of the Adapted-SQUASH as well as from the Actiheart activity monitor among a convenience sample of adults with disabilities.

Study population
Participants were recruited through patient activity groups in hospitals, rehabilitation centres, sport clubs and patient associations in the northern and eastern provinces in the Netherlands. We aimed at a sample size between 50 and 100 participants, as recommended in literature for validation studies and reliability studies, to provide an appropriate number of dots with estimated limits of agreement in the Bland-Altman plot and to obtain an acceptable confidence interval around the reliability parameter (de Vet et al., 2011). Inclusion criteria were being at least 18 years of age, having a physical disability and/or chronic disease (e.g., stroke, heart failure, Parkinson's disease) and being able to read and write the Dutch language. Participants were excluded when they were still receiving inpatient or outpatient rehabilitation care, were participating in the ReSpAct study (Alingh et al., 2015;Hoekstra et al., 2014), were completely wheelchair dependent (because of the use of the Actiheart), or were not able to complete the questionnaires even with help. The data collection took place from November 2014 till June 2016.

Study procedures
This study consisted of a test-retest reliability study and a validity study. For the test-retest reliability study, the participants filled out the first Adapted-SQUASH twice, with approximately 2 weeks between the measurement occasions.
For the validity study, the participant was asked to wear an Actiheart activity monitor (Cambridge Neurotechnology™ UK) to objectively measure physical activity levels during the week prior to administration of the first Adapted-SQUASH. Two researchers visited the participants in their free-living home situation twice, to instal and attach the Actiheart to the participants' chest, and to collect the Actiheart after 1 week. The Actiheart measurement started at 0.00am and continued for the next seven consecutive days, both day and night. The participant was instructed to remove the Actiheart during showering, bathing, or swimming. In addition, the participant filled out a diary in which noncompliance to the Actiheart was noted. Measurements were included in the validity study when a minimum registration of the Actiheart of at least 4-days valid acceleration data (at least 75% activity data registration of 24 hours) for each participant was present (Klaren et al., 2016).
Participants' general characteristics were obtained by using a questionnaire. Participant's body weight (kg) and height (m) were measured by researchers by using a personal scale and measuring tape, respectively. The study was approved by the Ethics Committee of the Center of Human Movement Sciences (ECB/2014.06.30_1) at the University of Groningen, University Medical Center Groningen. All participants voluntarily signed an informed consent.

The Adapted Short QUestionnaire to ASssess Health-enhancing physical activity (Adapted-SQUASH)
The 19-item Adapted-SQUASH (see supplemental file) is a selfreported recall questionnaire to assess physical activity among adults with disabilities based on an average week in the past month as reference period. Equal to the original SQUASH (Wendel-Vos et al., 2003), the Adapted-SQUASH is prestructured in four main domains outlining types and settings of activity: "commuting traffic", "activities at work and school", "household activities" and "leisure time activities" including "sports activities". The frequency in days per week, the duration in average hours and minutes per day and the perceived intensity were asked.
Several adjustments have been made to make the original SQUASH applicable for people with disabilities, as described in the study protocol of the ReSpAct study (Alingh et al., 2015). First, the items "wheelchair riding" and "handcycling" were added in the domains "commuting activities and leisure-time" and "sports activities". Second, the self-reported intensity of the activity was categorised into "light", "moderate" and "vigorous", instead of "slow", "moderate" and "fast". Third, the syntax to determine the outcome measures of the Adapted-SQUASH includes a large range of Adapted sports (e.g., wheelchair basketball/rugby/tennis) for the item "sports activities". The MET-values in the syntax were updated based on the most recent version of the Ainsworth' compendium of physical activities (Ainsworth et al., 2011) and MET-values for wheelchair riding, handcycling and adapted sports were added based on a compendium of energy costs of physical activities for wheelchair-dependent individuals (Conger & Bassett, 2011). Lastly, in the examples of different sports "tennis" was replaced by "(wheelchair) tennis".

The total activity score per week (Adapted-SQUASH).
For practical use of the questionnaire all outcome measures of the Adapted-SQUASH were calculated by using a syntax. The total activity score and the total minutes of activity per week are the main outcomes of the Adapted-SQUASH. The total activity score (min x intensity/week) was calculated following the procedure described by Wendel-Vos et al. (2003). First, all the questions in the Adapted-SQUASH were assigned to a METvalue representing the intensity of this task, based on the Ainsworth' compendium of physical activities (Ainsworth et al., 2011) and based on a compendium of energy costs of physical activities for wheelchair-dependent individuals (Conger & Bassett, 2011). Second, an activity score was calculated for each domain by multiplying the total minutes of activity with a self-reported intensity score, which is based on age and MET-values (Wendel-Vos et al., 2003). Lastly, the total activity score was calculated by summing up the activity scores of the four domains. In accordance with the original SQUASH, data were excluded if the total minutes of activity a day exceeded 960 minutes or if values were missing (Wendel-Vos et al., 2003).

The total minutes of activity per week (Adapted-SQUASH).
The total minutes of activity per week (min/week) assessed with the Adapted-SQUASH were calculated by summing up the total minutes of physical activity per week reported in the Adapted-SQUASH. Also, the total minutes of light, moderate and vigorous-intensity activities per week (min/week) were calculated, using MET-value cut-off points based on the Dutch physical activity guidelines (Hildebrandt et al., 1999).

The actiheart activity monitor
The Actiheart (Cambridge Neurotechnology™ UK) activity monitor is a combined uniaxial accelerometer and heart rate monitor, which was used to measure accelerometer-derived physical activity. Acceptable reliability and validity (when compared with Electrocardiography [ECG] (Brage et al., 2005) and indirect calorimetry (Crouter et al., 2008)) were found for the Actiheart among adults, and was deemed appropriate for our target population, because the combination of accelerometer data with heart rate data would be better able to determine the intensity of physical activities (Brage et al., 2005). The Actiheart was attached to the participation's chest by using two ECG electrodes. The Actiheart is a lightweight (8gram) and compact (7x33mm) device, connected to the two ECG electrodes and capable of storing time-sequenced data. Acceleration (1D, vertical axis) was measured with a 15-second epoch by a piezoelectric element within the unit with a frequency range of 1-7 Hz. The Actiheart output provides activity counts and heart rate data per minute, simultaneously.

Activity energy expenditure (Actiheart). Based on the
Actiheart data, AEE estimates in kcals kg −1 min −1 were calculated for each minute by combining activity counts and heart rate in a branched equations model as described in literature (Brage et al., 2005;Crouter et al., 2008) and as proposed by the Actiheart software for AEE (see supplemental file). A branched equation model allows the Actiheart to accurately assess AEE even when there is low body movement, but high heart rate during an activity. The combined activity and heart rate algorithm to calculate AEE needs the individual's sleeping heart rate. The sleeping heart rate was calculated by averaging the minute-to-minute heart rate between 2.00 and 5.00am on the first day the Actiheart was worn.
When heart rate was missing, AEE was calculated based on the activity algorithm only for the specific missing minute. The total AEE per week was calculated by summing up the AEE minute-to-minute data, divided by the number of valid days the Actiheart was worn and multiplied by seven (assuming that the average amount of physical activity a day is representative for all weekdays and weekend days).
In addition, MET values were calculated for each minute based on the AEE minute-to-minute data, following the Ainsworth' compendium of physical activities (Ainsworth et al., 2011). In the next step, MET values per minute were categorised in the following MET categories: sedentary behaviour (1.0-1.5 METs), light intensity (1.6-2.9 METs), moderate intensity (3.0-5.9 METs) and vigorous intensity (≥6 METs). Sum scores of all minutes in each MET category were calculated. Also, a sum score for al minutes of physical activity was calculated (≥1.6 METs). Sum scores were divided by the number of valid days the Actiheart was worn and multiplied by seven for week scores.

Statistical analysis
Descriptive statistics were used to describe the demographic characteristics of the study population. Test-retest reliability of the Adapted-SQUASH was determined by calculating Intraclass correlation coefficients (ICCs) (two-way random, absolute agreement, single measures) for the total activity score (total, four main domains separately and all individual item separately), as well for the total minutes of activity (total and separately per intensity category) between the first and second measurement occasion. The ICC quantifies the degree to which the two measurements are absolutely related (Cicchetti, 1994). Since there is no widely accepted criterion for defining the strength of a correlation, we used a general guideline for clinical research: a correlation below 0.25 indicates little or no agreement, a correlation between 0.25 and 0.50 indicates fair agreement, a correlation between 0.50 and 0.75 indicates moderate to good agreement and a correlation higher than 0.75 indicates good to excellent agreement (Portney & Watkins, 2009). Also, confidence intervals were calculated for the ICCs. Additionally, Bland-Altman analyses were performed to illustrate the agreement between the first and second measurement of the Adapted-SQUASH (Bland & Altman, 1999;Giavarina, 2015). Subsequently, a one-sample t-test was performed to determine any systematic bias.
Concurrent validity of the Adapted-SQUASH was determined by calculating a Spearman correlation coefficient between the total activity score (min x intensity/week) based on the baseline administration of the Adapted-SQUASH and the total AEE (kcals kg −1 /week) based on the Actiheart data. Non-parametric Spearman correlation coefficients were chosen because assumptions of normality were not met for the outcomes of the Adapted-SQUASH and the two continuous outcome variables do not have the same measurement unit. In addition, concurrent validity of the Adapted-SQUASH was determined by calculating an ICC between the total minutes of activity (min/week) based on the baseline administration of the Adapted-SQUASH and the total minutes of activity (min/ week) based on the Actiheart data, and by performing a Bland-Altman analysis. Although ICCs are preferred if the two measurement instruments are expressed in the same units (min/ week) (de Vet et al., 2011), a Spearman correlation coefficient was also calculated between the total minutes of activity assessed with the Adapted-SQUASH and Actiheart to compare our correlation with previous literature (Arends et al., 2013;Ploeg et al., 2007;Wagenmakers et al., 2008;Wendel-Vos et al., 2003). There is no consensus on how high correlations should be to demonstrate acceptable validity of a physical activity questionnaire (Terwee et al., 2010). The same interpretation of correlations as mentioned above is used for the validity of the Adapted-SQUASH. The level of significance was set at 0.05. Data were analysed using the Statistical Package for the Social Science (IBM SPSS Statistics, version 24).

Results
A convenience sample of adults with disabilities (N = 80) was approached. Finally, 68 participants were included in the testretest reliability study and 58 in the validity study (see supplemental file for a flow diagram of the included and excluded participants). Twelve out of 80 participants were excluded from the test-retest reliability study because they did not fill out the second questionnaire due to illness (N = 1), surgery (N = 1) or unknown reasons (N = 10). Based on the characteristics, the included and excluded sample for the test-rest reliability study only statistically significantly differed in average body weight (see supplemental file). Body weight was on average 89.8 ± 3.2 kg in the excluded sample (N = 12) and 79.1 ± 14.7 kg in the included sample (N = 68). However, body height and Body Mass Index were not significantly different between the included and excluded sample. Based on the criterion of a minimum of 4-days valid Actiheart accelerometer data, 22 out of 80 participants were excluded from the validity study. We included participants with 4 days (N = 5), 5 days (N = 1), 6 days (N = 6), and 7 days (N = 46) of valid Actiheart accelerometer data, whereof all participants had at least three weekdays and one weekend day available. Based on the characteristics, the included and excluded sample only significantly differed in the use of mobility aid (see supplemental file). In the excluded sample, more people used a mobility aid (32%) compared to the included sample (17%).
The characteristics of the participants for the test-retest reliability (N = 68) and the validity (N = 58) studies are presented in table 1. The Adapted-SQUASH was completed for a second time after a mean period of 17 ± 4 days.

Test-retest reliability
The ICC for the repeated Adapted-SQUASH measurements was 0.67 (p < 0.001) for the total activity score, and 0.76 (p < 0.001) for the total minutes of activity per week, which, respectively, indicated a moderate to good and good to excellent agreement (Cicchetti, 1994) (table 2). Test-retest reliability within the light,  (    retest reliability of the new-added items for handcycling activities during commuting and leisure time and wheelchair riding during commuting could not be determined because too few participants reported this activity. Test-retest reliability for wheelchair riding in leisure time was 0.27 (p = 0.011). Bland-Altman analyses showed that the mean difference between the first and second measurement was not significantly different from zero for both the total activity score (t 67 = −0.03, p = 0.98) and for the total minutes of activity (t 67 = 0.11, p = 0.92), indicating no systematic bias between the two measurements. We found wide Limits of Agreement (LOA) with 95% of the measurements of the total activity score within the boundaries of 4072 activity score above and below the mean difference (figure 1), and with 95% of the measurements of the total minutes of activity within the boundaries of 945 min activity above and below the mean difference ( figure 2). Besides, based on the Bland-Altman plots the absolute amount of time spent on physical activity and the total activity score were higher at the second measurement occasion than at the first measurement occasion, while the total activity score was lower at the second measurement than at the first measurement occasion. Also, a Spearman correlation coefficient of −0.08 (p =.526) between the x and y axis of the Bland-Altman analysis for the total activity score derived from the Adapted-SQUASH was found (figure 1), and a Spearman correlation coefficient of −0.10 (p = .431) between the x and y axis of the Bland-Altman analysis for the total minutes of activity per week derived from the Adapted-SQUASH (figure 2) was found, which indicated homoscedasticity of the data.

Concurrent validity
Correlation coefficients for the concurrent validity are presented in table 3. A significant Spearman correlation coefficient was found between the total activity score from the Adapted-SQUASH and the AEE from the Actiheart (ρ = 0.40, p = 0.002). A significant ICC of 0.22 was found between the total minutes of activity per week from the Adapted-SQUASH and the total  Values are in total minutes per week, only the total activity score of the Adapted-SQUASH is compared with activity energy expenditure in kcals kg −1 min −1 .
minutes of activity per week from the Actiheart (p = 0.027). The correlation coefficients indicated fair and little agreement, respectively. No significant ICCs were found between the total minutes of light and moderate activity per week calculated with the Adapted-SQUASH and Actiheart. Only a significant ICC of 0.21 (p = 0.046) was found between the total minutes of vigorous activity per week from the Adapted-SQUASH and Actiheart, indicating little agreement between the two measurement tools.
Bland-Altman analysis showed that the mean difference between the total minutes of activity calculated with the Adapted-SQUASH and Actiheart was significantly different from zero (t 57 = 3.48, p = 0.001), indicating systematic bias between the two. We found wide LOA with 95% of the measurements of the total minutes of activity within the boundaries of 1485 minutes above and below the mean difference ( figure 3). Besides, based on the Bland-Altman plot the absolute amount of time spent on physical activity was higher reported in the Adapted-SQUASH questionnaire compared to physical activity assessed with the Actiheart. Also, a Spearman correlation coefficient of 0.42 (p = .001) between the x and y axis of the Bland-Altman analysis for the total minutes of activity per week derived from the Actiheart and Adapted-SQUASH was found (figure 3), which indicated heteroscedasticity of the data.

Discussion
The current study showed good reproducibility of the Adapted SQUASH to assess self-reported physical activity in populations of people with disabilities, but not at the individual level since the Bland-Altman analyses found wide LOA. In addition, the current study showed fair validity of the Adapted SQUASH and the Bland-Altman analysis showed wide LOA, which indicates that self-reported physical activity individually assessed with the Adapted-SQUASH does not accurately represent individually accelerometer-derived physical activity assessed with the Actiheart in this sample of people with disabilities.

Test-retest reliability
The test-retest reliability of the total activity score per week (ICC = 0.67, p < .001) of the Adapted-SQUASH is slightly higher compared to the Spearman correlation coefficients found in studies of the original SQUASH among 50 healthy adults (ρ = 0.58) (Wendel-Vos et al., 2003), among 44 patients after a total hip arthroplasty (ρ = 0.57) (Wagenmakers et al., 2008), but slightly lower compared to a study among 52 patients with ankylosing spondylitis (ρ = 0.89) (Arends et al., 2013). Also, our result of the test-retest reliability of the total minutes per week (ICC = 0.76, p < .001) of the Adapted-SQUASH is comparable to the Spearman correlation coefficient for the test-retest reliability of the PASIPD in similar populations with a disability (ρ = 0.77) (Ploeg et al., 2007). A special note when comparing the test-retest reliability of our study to others is that we examined the test-retest reliability by using ICCs, while others used Spearman correlation coefficients (Arends et al., 2013;Wagenmakers et al., 2008;Wendel-Vos et al., 2003). ICCs give lower correlation coefficients compared to Spearman correlation coefficients, because an ICC is the absolute agreement between the first and second measurement, which does not correct for systematic differences. In accordance with previous studies (Arends et al., 2013;Wagenmakers et al., 2008), the Bland-Altman analysis showed no systematic bias on total activity scores between test and retest. Although the Adapted-SQUASH has good test-retest reliability and the mean Figure 3. The differences between the total minutes of activity calculated with the Adapted-SQUASH and Actiheart, plotted against their mean for each participant, together with the 95% confidence interval (CI) and the 95% Limits of Agreement (LOA). (N = 58), with the diagonal line representing the correlation between the x and y axis (ρ = 0.42, p =.001), indicating heteroscedasticity. differences between the first and second measurement are close to zero, relatively wide LOA are found for the total activity score and the total minutes of activity, which indicated that the degree of repeatability is insufficient at the individual level and/ or that levels of physical activity fluctuate over time. Therefore, the Adapted-SQUASH can be used to assess self-reported physical activity behaviour in large (patient) populations but is not acceptable to monitor individual physical activity levels. Also, it indicates that large changes in the outcomes of the Adapted-SQUASH should be found when interested in the course of selfreported physical activity over time (e.g., before and after an intervention or treatment).
The Adapted-SQUASH also calculated the total minutes of light, moderate and vigorous activity per week. In previous literature, Wendel-Vos et al. (2003) and Wagenmakers et al. (2008) found the highest Spearman correlation coefficient for the total minutes of vigorous activity per week, respectively 0.92 (Wendel-Vos et al., 2003) and 0.85 (Wagenmakers et al., 2008), while we found the lowest correlation for the total minutes of vigorous activity per week (ICC = 0.32, p = .004). Explanation of this outcome is that vigorous-intensity activities, such as weekly scheduled sports activities, are the easiest to recall for healthy adults (Jacobs et al., 1993), while intermittent light-intensity activities (e.g., walking) are more difficult to recall (Herbolsheimer et al., 2018;Skender et al., 2016). However, adults with disabilities might experience activities as more intense, since activities often cost more energy compared to healthy adults (Waters & Mulroy, 1999;Wezenberg et al., 2013) and may be more variable over the day due to fatigue and lack of appropriate pacing behaviour (Abonie et al., 2020;Martin Ginis et al., 2016;Murphy & Kratz, 2014;Rimmer et al., 2011). Therefore, temporal fluctuation in light intensity activities in healthy adults, may be similar to temporal fluctuation in moderate or vigorous intensity activities in our target population. Furthermore, our sample reported less minutes of vigorous-intensity activities (so a lower between subjects' variance) compared to light intensity activities, which might give a lower ICC.
The Adapted-SQUASH provides information of different settings of physical activity (commuting activities, activities at work/school, household activities and leisure time activities including different sports). We found low test-retest reliability for leisure-time activities, which might be explained by the non-regular frequency of this type of activities per week, due to barriers to physical activity such as the amount of leisuretime, tiredness, or bad weather conditions (King, 1997). The quite low correlation for intense activities at work could be due to a small percentage of the population who can perform intense activities at work and the high variability in vigorous activities. The two newly added items "wheelchair riding" and "handcycling" in the Adapted-SQUASH had low response, because our study excluded people who were completely wheelchair dependent. However, our study population did mention adapted sports in the category "sports activities" (e.g., wheelchair basketball).
Another interesting variable is the sport outcome measure indicating good test-retest reliability (ICC = 0.76, p < 0.001), probably because sports activities are often easy to recall, and sports participation is a stable behaviour with scheduled regular practice. This variable is often used in clinical settings, as well as in policy making and governmental guidelines worldwide. Insight in sports activities can be used for a tailored advice regarding an active lifestyle during or after rehabilitation, which has health-influencing effects, is crucial for quality of life, mobility and participation in everyday life and is strongly recommended for adults with disabilities (Haskell et al., 2007).

Concurrent validity
The concurrent validity of the total activity score per week of the Adapted-SQUASH (ρ = 0.40, p = .002), when compared with the total AEE per week assessed with the Actiheart, is lower compared to the Spearman correlation coefficients found in studies of the original SQUASH among 50 healthy adults (ρ = 0.45, physical activity was assessed with the computer science and applications activity monitor) (Wendel-Vos et al., 2003), among 44 patients after a total hip arthroplasty (ρ = 0.67, physical activity was assessed with an Actigraph accelerometer) (Wagenmakers et al., 2008), but higher compared to a study among 52 patients with ankylosing spondylitis (ρ = 0.35, physical activity was assessed with an Actigraph accelerometer) (Arends et al., 2013). Also, the concurrent validity of the total minutes of activity per week of the Adapted-SQUASH (ICC = 0.22, p = .027 and ρ = 0.36, p = .006), when compared with the total minutes of activity assessed with the Actiheart, is lower compared to the Spearman correlation coefficient found in the study of the original SQUASH among 50 healthy adults (ρ = 0.56) (Booth, 2000), but higher compared to the Spearman correlation coefficient for the validity of the PASIPD among people with disabilities (ρ = 0.30, physical activity was assessed with an Actigraph accelerometer) (Ploeg et al., 2007). The lower concurrent validity of physical activity questionnaires in people with disabilities compared to healthy adults might be due to variation of the questionnaire and variation of the standard. Also, cognitive function, which is sometimes affected in people with disabilities, might influence the recall of activities and thereby might explain the differences between self-reported and accelerometer-derived physical activity (Herbolsheimer et al., 2018).
In addition, the Bland-Altman analysis showed systematic bias between the total minutes of activity per week assessed with the Adapted-SQUASH and Actiheart and the LOA were wide. This indicated that the Adapted-SQUASH does not accurately represent accelerometer-derived physical activity assessed with the Actiheart in individuals with disabilities. Previous literature also found that individual self-reported physical activity compared to physical activity assessed with an accelerometer was not accurate in people after joint arthroplasty (Vaughn et al., 2019) and in people with spinal cord injury (Ma et al., 2020). Besides, the mean difference between the Adapted-SQUASH and Actiheart was 346 minutes per week, which indicates that people with disabilities seem to overestimate their self-reported physical activity assessed with the Adapted-SQUASH compared to accelerometer-derived physical activity assessed with the Actiheart. Also, based on the Bland-Altman analysis for the total minutes of activity per week assessed with the Adapted-SQUASH and Actiheart, heteroscedasticity in the data was found, which indicates a tendency to overestimate self-reported physical activity when higher mean levels of physical activity were measured of the Adapted-SQUASH and Actiheart. This is in agreement with previous literature (Chinapaw et al., 2009;Feuering et al., 2014;Sebastiao et al., 2012). This overestimation of actual time spent being physically active is probably attributable to recall bias, such as the difficulty in recalling short breaks during physical activity (e.g., socializing or refreshment during the reported time doing sports, or taking rest during the reported time doing gardening or household activities) (Ma et al., 2020), while the Actiheart does measure all sorts of short breaks during physical activity and over the day. Another potential bias between self-reported and accelerometer-derived physical activity outcomes may reside in the appreciation and perception of physical activities and their intensities, which notions may be quite different in our population in the context of their often low physical work capacity (Carroll et al., 2014) and phenomena of fatigue during the day (Martin Ginis et al., 2016;Rimmer et al., 2011). This introduces a difference in what one does and what one perceives.
Consequently, for the total minutes of vigorous activities per week low or little agreement was found between the Adapted-SQUASH and Actiheart (ICC = 0.21, p = .046), while no agreement was found for the total minutes of light (ICC = 0.05, p = .346) and moderate (ICC = 0.03, p = .401) activities per week. This suggests that the perceived intensity of activities in people with disabilities is not in agreement with the accelerometer-derived intensities of activities assessed with the Actiheart. Therefore, we suggest to use the total minutes of physical activity per week assessed with the Adapted-SQUASH when interested in dose-response relationships among for instance, physical activity and health outcomes, or between physical activity and the received intervention/treatment in people with disabilities.

Limitations
A few limitations need to be considered. First, the Adapted-SQUASH used MET values from the Ainsworth compendium of physical activities, which were derived from and intended for use in able-bodied adults (Ainsworth et al., 2011). This limitation could have overestimated the total activity score for each intensity category (Ainsworth et al., 2011;Rikli, 2000), because our target population probably experiences activities as more intense compared to healthy adults (Waters & Mulroy, 1999;Wezenberg et al., 2013), as well as less consistent during the day. Also, the Adapted-SQUASH is sensitive to overestimation of frequency and/or duration of the activities, due to recall bias. A more or less similar limitation is however true for the Actiheart device, where the used sensor algorithms are not specific to people with disabilities, but have been derived from the general healthy population (Brage et al., 2005;Crouter et al., 2008). This stresses the need for more population-specific validation studies also of objective physical activity measurement tools in the future (Lankhorst et al., 2020).
Thirdly, the test-retest period was on average 17 days. This duration could be too short to prevent participants from copying the Adapted-SQUASH from memory. However, following the recommendations of Matthews et al. (2012), we have consciously chosen for this short recall period to decrease the reporting error of activities, since physical activity levels tend to fluctuate between days and weeks due to weather conditions (Matthews et al., 2012) and/or due to fluctuating experienced health or fatigue conditions among this population of persons with a disability (Abonie et.al., 2020). Furthermore, we did not check at the participant if the week the Actiheart was worn was a representative week of their physical activity behaviour.
Lastly, the Actiheart is a device capable of measuring heart rate and acceleration, and combines these variables in a branched equation model to calculate AEE (Brage et al., 2005;Crouter et al., 2008). However, we found a large amount of missing heart rate data in our sample, while calculating AEE based on the heart rate and combined algorithm is preferred (Crouter et al., 2008). The median percentage of missing heart rate was 22% (inter quartile range: 10-42%). The unsuccessful measurement of heart rate may have happened due to malfunction of the battery or the electrodes. However, if during the week participants felt that the electrodes loosened or if the electrodes had not been replaced by the fourth day, the instruction was given to replace the electrodes. As stated above another limitation is that the algorithm from the Actiheart to calculate AEE has not been validated among adults with deviating movement patterns and adults using drugs against high blood pressure, who are included in our target population. This is however the case for most of the activity monitor devices currently available (Van Remoortel et al., 2012). In addition, the Actiheart was validated among healthy adults within the age range of 26-50 years (Brage et al., 2005) and the algorithm of the Actiheart was validated among adults within the age range of 21-55 years (Crouter et al., 2008), while the current study population had an age range of 19-85 years.

Practical implications and further research
The Adapted-SQUASH provides information on various dimensions (frequency, duration and intensity) and settings (e.g., household, leisure time), is inexpensive, and has low burden for participants to fill in. Together this turns the Adapted-SQUASH into a useful tool to assess self-reported physical activity among adults with disabilities in large population studies. Firstly, the Adapted-SQUASH can be used in community and health-care settings, like rehabilitation centres, to monitor physical activity levels in large heterogeneous populations with disabilities. For this practical use, the Adapted-SQUASH is distinctive compared with other physical activity questionnaires (e.g., PASIPD), because even though the questionnaire specifically assesses type, frequency and intensity of activities, it is short and quick to fill in and it includes physical activities for wheelchair users and adapted sports. Secondly, the Adapted-SQUASH can be used for large longitudinal cohort studies or intervention studies to evaluate self-reported physical activity. For example, the Adapted-SQUASH has already been used in the longitudinal cohort study ReSpAct, which aimed to evaluate physical activity in people with disabilities during and after a physical activity stimulation programme (Alingh et al., 2015;Hoekstra et al., 2014). When accurate and complete measures of physical activity are preferred in further research among large populations with disabilities, we suggest using both the Adapted-SQUASH (in the total sample) and an activity monitor (in a sub-sample). The Adapted-SQUASH provides information on the setting of the activity, while an activity monitor provides information on intermittent activities (e.g., walking at home and taking rest during activities) (Ma et al., 2020;Skender et al., 2016). So, selection of the best measurement to assess physical activity depends on the purpose, construct, measurement unit, population, setting etc. (Nigg et al., 2020).
For practical implications, we recommend using the total minutes of activity per week or the total activity score, which were the two main outcome measures of the Adapted-SQUASH, to assess self-reported physical activity in people with disabilities. The test-retest reliability of the total minutes of activity per week was good but systematic bias with the Actiheart was found. The test-retest reliability of the total activity score per week was lower and the perceived intensity of activities (light, moderate and vigorous) was not in agreement with the Actiheart. However, outcomes should be interpreted with caution since our sample of people with disabilities overestimated their physical activity. Also, for future research it is recommended to assess the validity and test-retest reliability of the Adapted-SQUASH among people who are completely wheelchair dependent.

Conclusion
The Adapted-SQUASH is an acceptable measure to assess self-reported physical activity in large populations of people with disabilities but is not applicable at the individual level due to the wide LOA. Self-reported physical activity assessed with the Adapted-SQUASH does not accurately represent accelerometer-derived physical activity assessed with the Actiheart in individuals with disabilities. They seem to overestimate their physical activity and find it difficult to recall the perceived intensity of the activity. The test-retest reliability and concurrent validity of the Adapted-SQUASH are comparable to other physical activity questionnaires among people with disabilities. We recommend using the total minutes of activity per week and/or total activity score, derived from the Adapted-SQUASH,preferably in combination with measurements with an activity monitor in a sub-sample -to evaluate physical activity in large populations of people with disabilities in rehabilitation practice and beyond as well as in (cohort) research.