Calibration and Cross-validation of Accelerometry in Children and Adolescents with Cystic Fibrosis

ABSTRACT Commonly used cut-points may misclassify physical activity (PA) in people with cystic fibrosis (CF). The aim of this study was to develop and cross-validate condition-specific cut-points in children and adolescents with CF. Thirty-five children and adolescents with CF (15 girls; 11.6 ± 2.8 years) and 28 controls (16 girls; 12.2 ± 2.7 years) had their energy expenditure and triaxial acceleration measured during six daily activities of varying intensities. Euclidean Norm Minus One (ENMO) and Mean Amplitude Deviation (MAD) were extracted from both GENEActiv (both wrists) and ActiGraph GT9X (both wrists and right waist) accelerometers. ROC curves were used to determine healthy and CF-specific raw acceleration cut-points for sedentary time (SED), moderate physical activity (MPA) and vigorous physical activity (VPA). The PA cut-points were generally lower in CF compared to controls for both ENMO (60.2–73.1 vs. 63.5–86.8 mg) and MAD (58.9–85.2 vs. 75.9–93.7 mg). These substantial inter-cut-point differences support the need for disease-specific cut-points.


Introduction
Physical activity (PA) is associated with many benefits in those with cystic fibrosis (CF), such as the reduction of exacerbations and the improvement of both life expectancy and quality of life (Hebestreit et al., 2014). Accelerometers measure acceleration over time, which can subsequently be translated into PA intensities by using prediction equations, cut-points, and, more recently, machine learning models (Arvidsson et al., 2019). However, cut-points or prediction equations that were developed for the general population may lead to inaccurate PA estimations when applied to those with a chronic condition such as CF (Mackintosh et al., 2018). For example, non-disease-specific moderate-to-vigorous PA (MVPA) cut-points were found to underestimate PA in youth with CF (Stephens et al., 2016). Such discrepancies may be related to the pathophysiology of the disease, as children and adolescents with CF have a higher resting metabolic rate (RMR) and energy expenditure (EE) for a given task compared to healthy peers (Moudiou et al., 2007). Consequently, the cut-points used to assess PA in children with CF may need to be tailored to estimate relative PA levels.
To develop CF-specific cut-points, accelerometers need to be calibrated using a protocol comprising a range of activities that span the intensity spectrum and are representative of daily life (Welk, 2005). In healthy populations, calibration studies have typically utilized either laboratory-based or free-living protocols (Welk, 2005). Whilst highly structured activities, such as walking or running on a treadmill, included in laboratory-based protocols are generally associated with superior predictive accuracy, they lack ecological validity (Welk, 2005). Consequently, free-living protocols are widely recommended to generate cut-points reflecting the unique, sporadic nature of children's PA patterns (Mackintosh et al., 2012). However, despite the associated advantages, a free-living protocol precludes the measurement of a biological reference criterion, such as EE, which is pivotal when calibrating accelerometry, especially in clinical populations (Mackintosh et al., 2012). Indeed, a recent systematic review found that while the type of protocol greatly impacts PA classification, studies calibrating accelerometry in pediatric clinical cohorts should account for the pathophysiology of the disease and integrate EE measurements in the protocol (Bianchim et al., 2020).
Earlier studies developing cut-points in healthy children have utilized waist-worn accelerometers due to the proximity of this location to the body's center of gravity (Welk, 2005). Although the waist is known to provide accurate estimations of whole-body movement, this placement is associated with poor compliance, which can lead to misclassification of PA intensities and bias (Fairclough et al., 2016). Similarly, there is no consensus regarding the optimal accelerometer brand to derive PA levels in children, though recent studies have relied on brands that provide raw acceleration data, such as the GENEActiv and ActiGraph (GT3X+ and GT9X; Hildebrand et al., 2014; Hurter et al., 2018).
The aim of this study was to develop disease-specific cut-points for children and adolescents with CF, in comparison to a healthy comparison group, and to compare different accelerometer placements (wrist and waist) and brands (GENEActiv and ActiGraph).

Participants
Thirty-five children and adolescents with CF (15 girls) and 28 healthy controls (16 girls), aged 7-17 years, participated in the study. Participants with CF were recruited from pediatric CF clinics in South Wales and had been diagnosed as having CF according to a newborn screening test, and/or presenting with CF-typical symptoms and either two pathological sweat tests or the identification of two CF-relevant mutations. Those with multi-resistant bacteria, an acute exacerbation at the time of the assessments, co-morbidities such as cardiovascular or musculoskeletal issues that compromise exercise performance, or who were less than two weeks post antibiotic treatment for an exacerbation or awaiting a transplant, were excluded from the study. Healthy participants were recruited through a university in Wales and from the friends and families of the CF participants. The health status of the healthy control group was confirmed by a short clinical anamnesis. Written informed consent was obtained from parents/guardians and assent from the participants prior to the study commencement. Ethics approval was obtained from the National Health Service (NHS) Research Ethics Committee (18/WS/0032).

Protocol
Participants were asked to attend the laboratory on three occasions, with the first two visits separated by seven days. The first visit involved baseline measures of anthropometry, RMR, and lung function. The second and third visits consisted of the daily-life activity protocol and a treadmill-based exercise test, respectively. Participants were asked to arrive at least two hours postprandial and to have avoided caffeine and vigorous exercise for the preceding 24 hours. For participants with CF, information regarding medication, any associated comorbidities, and the frequency of exacerbations was extracted from their medical records.

Anthropometry
Body mass, stature and sitting height were measured to the nearest .1 kg, .1 cm and .1 cm, respectively. Body mass index (BMI) and age- and sex-specific z-scores were determined according to the World Health Organization reference data (De Onis et al., 2004). Pubertal stage was estimated according to time pre- or post-peak height velocity (PHV; Mirwald et al., 2002), with pre-PHV considered as <−1 years from PHV, circa-PHV as −1 to +1 years and post-PHV as >+1 years from PHV.

Resting metabolic rate
The assessment of RMR was performed for 20 minutes via indirect calorimetry, using a facemask and a calibrated metabolic cart (MetaMax Cortex 3B, CORTEX Biophysik GmbH, Germany), following at least 10 minutes of supine rest. This measure was performed in a quiet room and all participants were instructed to remain supine and to avoid talking and/or sleeping. To calculate RMR, the first five minutes and the last two and a half minutes were removed. Subsequently, RMR was calculated according to the Weir equation (Weir, 1949).
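The trimming and Weir calculation described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: it assumes the abbreviated Weir equation (without the urinary nitrogen term), gas exchange expressed in L/min, and a hypothetical input format of (time, V̇O2, V̇CO2) samples; the function names are illustrative.

```python
def weir_kcal_per_day(vo2_l_min, vco2_l_min):
    """Abbreviated Weir equation (no urinary nitrogen term); inputs in L/min."""
    return (3.941 * vo2_l_min + 1.106 * vco2_l_min) * 1440.0


def resting_metabolic_rate(samples, drop_start_s=300.0, drop_end_s=150.0):
    """Estimate RMR (kcal/day) from a resting recording.

    samples: list of (time_s, vo2_l_min, vco2_l_min) tuples (hypothetical format).
    The first 5 min and last 2.5 min are discarded, as described in the text.
    """
    t_end = max(t for t, _, _ in samples)
    kept = [(vo2, vco2) for t, vo2, vco2 in samples
            if drop_start_s <= t <= t_end - drop_end_s]
    if not kept:
        raise ValueError("recording too short after trimming")
    mean_vo2 = sum(v for v, _ in kept) / len(kept)
    mean_vco2 = sum(c for _, c in kept) / len(kept)
    return weir_kcal_per_day(mean_vo2, mean_vco2)
```

For a 20-minute recording, only the samples between minute 5 and minute 17.5 contribute to the averaged gas exchange values.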

Aerobic capacity
A standard Bruce protocol was used to assess peak oxygen uptake (V̇O2peak). Gas exchange variables were measured on a breath-by-breath basis (Metamax 3B, Cortex Biophysik GmbH, Germany). Oxygen saturation and heart rate were measured throughout the exercise protocol using a pulse oximeter (Nonin® WristOx® Model 3150, Nonin® Medical Inc., USA) and a three-lead electrocardiogram (ECG; Custo Guard ECG, custo med GmbH, Germany), respectively. Peak oxygen uptake was defined as the highest 10-s moving average during the exercise test.
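The "highest 10-s moving average" definition can be illustrated with a short sketch. It assumes that the breath-by-breath data have first been interpolated to one value per second, which the text does not state; the function name and input format are hypothetical.

```python
def peak_moving_average(values_1hz, window=10):
    """Highest mean over any run of `window` consecutive 1-s values.

    Assumes breath-by-breath data were already resampled to 1 Hz (an
    assumption made for this sketch, not a detail given in the paper).
    """
    if len(values_1hz) < window:
        raise ValueError("need at least one full window of data")
    running = sum(values_1hz[:window])  # sum of the first window
    best = running
    for i in range(window, len(values_1hz)):
        # Slide the window forward by one sample.
        running += values_1hz[i] - values_1hz[i - window]
        best = max(best, running)
    return best / window
```

A sliding sum keeps the calculation linear in the number of samples, which matters little here but avoids recomputing each window from scratch.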

Lung function
All participants were asked to complete a spirometry assessment at the start of the first session using a forced vital capacity maneuver to determine forced expiratory volume in one second (FEV1) in accordance with the European Respiratory Society standards (Moore, 2012). The FEV1 % predicted was estimated using a reference equation (Quanjer et al., 2012) for age, sex and height, and used to categorize disease severity in those with CF as mild (>70%), moderate (40-69%) or severe (<40%; Davies et al., 2009).
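The severity bands above can be expressed as a small helper. This is a sketch only: the predicted FEV1 is taken as a given input rather than computed from the reference equation, and because the stated bands leave values between 69% and 70% unassigned, the code assumes that anything from 70% upward is mild and anything from 40% up to (but not including) 70% is moderate.

```python
def fev1_percent_predicted(measured_l, predicted_l):
    """FEV1 as a percentage of the reference (predicted) value, both in litres."""
    return 100.0 * measured_l / predicted_l


def cf_severity(fev1_pct):
    """Map FEV1 % predicted to a severity band.

    Boundary handling at 70% is an assumption of this sketch; the text
    lists mild (>70%), moderate (40-69%) and severe (<40%).
    """
    if fev1_pct < 40:
        return "severe"
    if fev1_pct < 70:
        return "moderate"
    return "mild"
```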

Accelerometry
The ActiGraph GT9X Link (ActiGraph, Pensacola, FL) and GENEActiv (ActivInsights Ltd., Cambridge, UK) were used to measure raw acceleration at 100 Hz. Both accelerometers have a dynamic range of ±8 g. ActiGraph GT9X Link monitors were placed on each wrist and the right waist, with GENEActiv monitors placed on each wrist. The wrist monitors were placed beside one another in a randomized order across participants. The ActiGraph low frequency extension was activated.

Daily-life calibration protocol
The daily-life calibration protocol consisted of activities mimicking the participants' daily lives. During their first laboratory visit, participants were given a spreadsheet of common activities from the compendium of physical activities (Ainsworth et al., 2011) and asked to select any that they would typically do at least once a day. Suggestions of additional activities were also integrated, with the six most commonly selected activities, stratified by behavior type (e.g., sedentary), chosen to be integrated into the daily-life protocol (Table 1). Over 50 minutes, participants performed the same six activities for three to ten minutes each, in a randomized order, interspersed with three minutes of rest, whilst wearing the accelerometers, metabolic system and pulse oximeter.

Data reduction
The raw acceleration data were extracted as .gt3x files and .bin files using ActiLife V 6.10.2 and GENEActiv PC software V2.2, respectively. All .gt3x files were converted to time-stamp free .csv files and subsequently imported along with the .bin files into R statistical software (V3.1.2; R Foundation for Statistical Computing, Vienna, Austria) for the extraction of raw acceleration data. Specifically, the GGIR package (V1.2-0) was used to auto-calibrate and extract the Euclidean Norm Minus One (ENMO) and Mean Amplitude Deviation (MAD) metrics in 5-s epochs.
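To make the two metrics concrete, the sketch below computes ENMO and MAD per 5-s epoch from triaxial samples expressed in g. It is an illustration in Python rather than the GGIR implementation used in the study: auto-calibration is omitted, negative ENMO values are truncated to zero (GGIR's default behaviour), and all function names are hypothetical.

```python
import math


def vector_magnitudes(samples):
    """Euclidean norm of each (x, y, z) sample, in g."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]


def enmo_mg(samples):
    """Mean of max(norm - 1 g, 0) over the epoch, in milli-g."""
    r = vector_magnitudes(samples)
    return 1000.0 * sum(max(v - 1.0, 0.0) for v in r) / len(r)


def mad_mg(samples):
    """Mean absolute deviation of the norm around its epoch mean, in milli-g."""
    r = vector_magnitudes(samples)
    mean_r = sum(r) / len(r)
    return 1000.0 * sum(abs(v - mean_r) for v in r) / len(r)


def epoch_metrics(samples, sample_rate_hz=100, epoch_s=5):
    """Return a list of (ENMO, MAD) pairs, one per complete epoch."""
    n = sample_rate_hz * epoch_s
    return [(enmo_mg(samples[s:s + n]), mad_mg(samples[s:s + n]))
            for s in range(0, len(samples) - n + 1, n)]
```

A device at rest reads a norm of about 1 g, so both metrics are close to zero; ENMO removes gravity by subtracting 1 g, whereas MAD removes it by subtracting the epoch mean.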
Metabolic equivalent of task (MET) values for each activity were calculated by dividing the V̇O2 (ml·min⁻¹·kg⁻¹) for each activity by the measured resting V̇O2. Data from the first and last minute of each activity were excluded to avoid transitional movements. Subsequently, MET values were aligned with the acceleration metrics for each activity. MET values were then used to code the corresponding 5-s raw accelerometer data as sedentary (≤1.5 METs), moderate (4.0-6.9 METs) or vigorous (≥7.0 METs; Troiano et al., 2008).
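The MET calculation and intensity coding can be sketched as below. The function names are illustrative, and two details are assumptions of the sketch rather than statements from the text: values between 6.9 and 7.0 METs are treated as moderate, and values between 1.5 and 4.0 METs are left uncoded because no category is given for them.

```python
def met_value(activity_vo2, resting_vo2):
    """METs as activity oxygen uptake relative to measured resting uptake.

    Both inputs must share the same units (e.g. ml per min per kg).
    """
    return activity_vo2 / resting_vo2


def intensity_code(met):
    """Code an epoch as SED, MPA or VPA from its MET value."""
    if met <= 1.5:
        return "SED"
    if met >= 7.0:
        return "VPA"
    if met >= 4.0:
        return "MPA"
    return None  # 1.5 < MET < 4.0: not coded in the thresholds given
```

Using each participant's measured resting value as the denominator, rather than a fixed 3.5 ml·min⁻¹·kg⁻¹, makes the resulting intensities relative to the individual.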

Statistical analyses
Descriptive analyses were performed using SPSS Statistics, version 23.0 (IBM Corp., USA) with data presented as mean ± standard deviation (SD) or frequencies for continuous and categorical variables, respectively. Participants with missing data were not included in the analysis. A two-way ANOVA and Kruskal-Wallis test were utilized for parametric and non-parametric data, respectively, to investigate inter-group comparisons of participant demographics, accelerometer outputs and EE data. A three-factorial, repeated-measures ANOVA test with Bonferroni correction was used to investigate the effect of activity type, accelerometer placement and accelerometer brand, and their interaction according to disease status.
Disease-specific cut-points for SED, MPA and VPA were generated using receiver operating characteristic (ROC) curves, with the respective area under the curve (AUC), in MedCalc (version 19.2.1; Ostend, Belgium). The ROC analyses were interpreted according to the sensitivity (the proportion of true positives correctly identified) and specificity (the proportion of true negatives correctly identified). For the ROC-AUC, values of ≥.90 were considered excellent, .80-.90 good, .70-.80 fair, and <.70 poor. The code generated from the MET values, for each intensity, was used as the dependent variable for the ROC curve, with the cut-points selected to optimize both sensitivity and specificity. An iterative leave-one-out approach was conducted in R to cross-validate the cut-points. Bland-Altman plots were used to assess the mean bias and limits of agreement between monitors for each placement in both groups (Bland & Altman, 1986).
[Table 1 fragment: Walking: walking continuously at a self-selected comfortable pace for five minutes. Stairs: climbing and descending stairs continuously at a self-selected comfortable pace for three minutes.]
Significance was accepted at p ≤ .05.
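The cut-point derivation and cross-validation can be illustrated with a minimal sketch. It is not the MedCalc or R procedure used in the study: the Youden index is assumed as the criterion for balancing sensitivity and specificity, the leave-one-out loop is assumed to hold out one participant at a time, and all names and the input format are hypothetical.

```python
def sens_spec(values, labels, cut, higher_is_positive=True):
    """Sensitivity and specificity of a threshold on one acceleration metric."""
    tp = fn = tn = fp = 0
    for v, y in zip(values, labels):
        pred = (v >= cut) if higher_is_positive else (v <= cut)
        if y and pred:
            tp += 1
        elif y:
            fn += 1
        elif pred:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


def youden_cut_point(values, labels, higher_is_positive=True):
    """Threshold maximising sensitivity + specificity - 1 (assumed criterion)."""
    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):
        sens, spec = sens_spec(values, labels, cut, higher_is_positive)
        j = sens + spec - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut


def auc(values, labels):
    """AUC as the share of positive/negative pairs ranked correctly (ties 0.5)."""
    pos = [v for v, y in zip(values, labels) if y]
    neg = [v for v, y in zip(values, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leave_one_out(by_participant, higher_is_positive=True):
    """Derive the cut-point without each participant, then test it on them.

    by_participant: dict mapping id -> (values, labels) for that participant.
    Returns dict mapping id -> (cut, sensitivity, specificity).
    """
    results = {}
    for held_out, (test_v, test_y) in by_participant.items():
        train_v, train_y = [], []
        for pid, (v, y) in by_participant.items():
            if pid != held_out:
                train_v.extend(v)
                train_y.extend(y)
        cut = youden_cut_point(train_v, train_y, higher_is_positive)
        results[held_out] = (cut,) + sens_spec(test_v, test_y, cut,
                                               higher_is_positive)
    return results
```

For SED, where lower acceleration indicates the target class, `higher_is_positive=False` reverses the direction of the threshold.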

Results
From the initial 64 participants that were screened, one participant was unable to attend the second visit and was therefore excluded from further analyses. In total, the study included 63 children, 35 with CF and 28 healthy controls (Table 2). Most CF participants were homozygous (55%) for the ΔF508 mutation. Fifty-five children were right-handed (30 CF) and 8 were left-handed (5 CF). The two-way ANOVA revealed that those with CF had significantly lower body mass (p = .02) and zBMI (p = .006) than the healthy participants, but there were no significant differences in age, stature, RMR, lung function or V̇O2peak. During walking at a comfortable pace, the CF group had significantly higher accelerometer outputs in comparison to the healthy group for both ENMO (p = .04) and MAD (p = .05), but there were no differences in METs for any activity (Table 3). Tables 4 and 5 present the cut-points derived from ENMO and MAD, respectively. The cross-validation data for each cut-point are presented as Supplementary Table 1.
The Bland-Altman plots (Supplementary Figure 1) indicated high agreement between the dominant wrist and waist for MAD in the CF group. In the healthy children, ENMO from the non-dominant wrist and waist displayed the best agreement. A visual inspection of the Bland-Altman plots showed heteroscedasticity for ENMO and MAD metrics, irrespective of placement and condition, and for most physical activity intensities. A negative correlation was found between the average ENMO values and the difference in values for waist-worn devices (−.24, p = .003) in the CF group. Moreover, a negative correlation was found between the average raw acceleration values and the difference in values for both ENMO and MAD from the dominant wrist in the healthy group (−.42, −.45, respectively, both p = .0001).
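The agreement statistics reported above can be sketched as follows: the mean bias with 95% limits of agreement, and the correlation between pair means and pair differences that indicates proportional bias. The use of a Pearson coefficient is an assumption of this sketch (the type of correlation is not stated), and the function names are illustrative.

```python
import math


def bland_altman(a, b):
    """Return (mean bias, lower limit, upper limit) for paired measurements."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def proportional_bias_r(a, b):
    """Pearson correlation between pair means and pair differences."""
    means = [(x + y) / 2.0 for x, y in zip(a, b)]
    diffs = [x - y for x, y in zip(a, b)]
    m_bar = sum(means) / len(means)
    d_bar = sum(diffs) / len(diffs)
    cov = sum((m - m_bar) * (d - d_bar) for m, d in zip(means, diffs))
    var_m = sum((m - m_bar) ** 2 for m in means)
    var_d = sum((d - d_bar) ** 2 for d in diffs)
    return cov / math.sqrt(var_m * var_d)
```

A negative coefficient means that the difference between the two devices or placements shrinks, or reverses, as the average acceleration increases.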

Discussion
This original study developed raw acceleration SED, MPA and VPA cut-points from ActiGraph GT9X and GENEActiv accelerometers placed at the waist and wrist in children and adolescents with and without CF. All cut-points demonstrated fair to excellent accuracy, sensitivity and specificity, and low error. Overall, the GENEActiv provided significantly higher outputs, in comparison with ActiGraph, irrespective of placement and population.
Overall, the disparity in the cut-points developed for youth with CF, relative to an apparently healthy population, both in the present study and in previous literature (Stephens et al., 2016), further supports the contention that cut-points developed in healthy children and adolescents are not suitable to estimate PA levels in youth with CF (Bianchim et al., 2020). Most importantly, earlier studies estimating PA levels in pediatric CF cohorts from nonspecific cut-points have potentially underestimated total MVPA in these populations. Indeed, previous research reported that those with CF spent less time in MVPA and more in light PA (LPA) when compared with healthy participants (Aznar et al., 2014; Selvadurai et al., 2004); the present research shows that such findings may have been due to the cut-points applied, and subsequently a misclassification of PA. Therefore, the potential implications of utilizing these CF-specific cut-points, not least in terms of health outcomes or prognostic indicators, cannot be overstated.
Among other factors, the use of count-based cut-points is reported to be responsible for misclassification rates of 33-68% for PA intensities (Trost et al., 2010). Consequently, more recent calibration studies have utilized raw acceleration metrics, such as ENMO and MAD (De Almeida Mendes et al., 2018). However, the majority of studies developing cut-points from raw acceleration metrics have focused on adults, with few generating thresholds for children (Hildebrand et al., 2014; Migueles, Cadenas-Sanchez et al., 2019). Importantly, whilst our study found that the MPA and VPA cut-points developed for healthy children were largely comparable to these earlier studies, the MPA and VPA CF-specific cut-points were substantially lower. Such discrepancies may be due to the ventilatory and muscular impairments associated with the pathophysiology of CF (Stephens et al., 2016). Specifically, given the enhanced cost of breathing and the pathologic exercise intolerance associated with the condition, daily-life activities are likely to be more energetically costly for those with CF in comparison with their healthy counterparts (Matel & Milla, 2009). As such, it is expected that those with CF will have higher EE at lower raw accelerations than healthy participants during given activities. In accord with this, those in the CF group expended more energy during all the sedentary tasks and had significantly higher accelerometer outputs and EE during walking. Differences during walking may reflect altered muscle and metabolic function even in youth with mild CF (Erickson et al., 2015), reinforcing the crucial need for condition-specific cut-points. It is important to highlight, however, that there were no differences in METs or RMR between CF and healthy controls.
A unique aspect of this study was the comparison of outputs from two different accelerometer brands, across wrist and waist placements. Overall, GENEActiv generated higher values for both ENMO and MAD across all placements and participants, in accord with previous studies (Hildebrand et al., 2014; Hurter et al., 2018). Such inter-brand differences could be attributed to multiple factors including, but not limited to, a difference in the magnitude of acceleration signals, proprietary filters, signal-to-noise ratios, and data resolution (Hildebrand et al., 2014). It is well known that ActiGraph has an inbuilt low-pass filter of an unknown cutoff frequency, which is considered proprietary information. Previous studies have shown that outputs from different accelerometer brands are not interchangeable, although this difference does not seem to compromise the accuracy of the PA measurement (John et al., 2013; Montoye et al., 2016). However, it is crucial that future studies use brand-specific cut-points to avoid misclassification.
Intra-brand comparisons showed that the dominant wrist-worn GENEActiv yielded higher ENMO and MAD values, particularly during free-games. This is particularly important considering that this activity was designed to replicate children's normal daily-life routines. Indeed, congruent with previous research in healthy children (Hildebrand et al., 2014), the agreement between placements across brands varied greatly by activity, with more pronounced differences shown for coloring, playing on a handheld device, free-games and stairs. Interestingly, Hildebrand et al. (2014) did not find such differences in adults. Of importance, MPA cut-points from the dominant wrist performed slightly better amongst all placements in CF, whilst the non-dominant wrist and waist performed better in healthy participants. Indeed, research in healthy children recommended that the non-dominant wrist should be the placement of choice to prioritize compliance (Hildebrand et al., 2014). However, as both wrist placements performed well, and to allow for greater standardization across studies, the non-dominant wrist placement is recommended in youth with CF.
The cross-validation found that the cut-points developed in children and adolescents accurately classified SED, MPA and VPA most of the time, irrespective of condition, brand and placement. In particular, cut-points derived from ENMO achieved superior accuracy for all placements and brands, in comparison with MAD, independent of health status. Nonetheless, further validation of the CF-specific cut-points is warranted to ensure validity in free-living conditions (Mackintosh et al., 2012).
Despite numerous strengths, this study is not without its limitations. Our sample consisted of children and adolescents with mild CF and the developed cut-points may therefore not be applicable to those with a more severe condition. However, it is important to highlight that with the advent of the triple-combination therapy Kaftrio (tezacaftor/elexacaftor/ivacaftor; Jaques et al., 2020) it could be argued that mild CF is going to be more common in the near future, and therefore these cut-points may be applicable to a wider proportion of the population with CF. Additionally, due to the COVID-19 lockdown, it was not possible to recruit as many age- and sex-matched healthy participants. Finally, it is paramount to recognize that the statistical approach used to develop the cut-points can greatly impact the prediction accuracy (Welk, 2005). Specifically, whilst ROC curve analyses present advantages in relation to linear models (Bianchim et al., 2020), they may not be optimal for clinical populations, where adjusting for health-related confounding factors may enhance accuracy and account for inter-patient variability. Given the complexity and nonlinear nature of PA, future research should consider using machine learning to further enhance the accuracy of PA classification in those with CF (Bianchim et al., 2020).
This is the first study to calibrate and cross-validate cut-points from raw accelerometer data for children and adolescents with CF. The newly developed CF-specific cut-points demonstrated high sensitivity and specificity, fair to excellent accuracy, and a low error. Most importantly, the majority of the CF-specific cut-points, particularly MPA and VPA, were lower than those developed for healthy controls and previously reported cut-points. Therefore, the newly developed CF-specific cut-points highlight the need to reevaluate PA levels and associated health outcomes in children and adolescents with CF.