Comparability of Postural and Physical Activity Metrics from Different Accelerometer Brands Worn on the Thigh: Data Harmonization Possibilities

ABSTRACT The aim was to establish which postural and physical activity outcomes are comparable across different accelerometer brands worn on the thigh when processed using open-source methods. Twenty participants wore four accelerometers (Axivity, ActiGraph, activPAL, GENEActiv) for three free-living days. Postural and physical activity outputs (average acceleration, intensity gradient, intensity of the most active 30 min, 60 min, and 8 h) were generated. Postural outputs: Mean absolute percent errors (MAPEs) were low, reliability excellent, and equivalency within the 5% zone across all monitor pairings for sitting/lying and upright times, but not specific lying postures. Physical activity outputs: MAPEs were higher and reliability lower than for sitting/lying and upright time. However, the majority of the outcomes were within the 10% equivalency zone for Axivity/GENEActiv and Axivity/ActiGraph pairings. Total sitting/lying and upright times show strong potential for harmonization across studies utilizing different thigh-worn accelerometers. The majority of acceleration outcomes compare well for Axivity, GENEActiv, and ActiGraph.


Introduction
Accelerometer-based devices worn on the thigh are increasingly being used in large research studies to assess physical behaviors (Stamatakis et al., 2020). The thigh has become a wear location of interest due to its accuracy for estimating the postural component of sedentary behavior (i.e., lying, sitting, and reclining postures) as well as physical activity (Montoye et al., 2016b; Lyden et al., 2017; Sellers et al., 2016). The activPAL device (PAL Technologies Ltd, Glasgow, UK) is specifically designed to be worn on the thigh, but more recently other brands of accelerometer, such as ActiGraph, Axivity, and GENEActiv, are being used in this wear location (Daugaard et al., 2018; Hartley et al., 2018; Jørgensen et al., 2019; Ryan et al., 2019). These devices all measure raw acceleration across three axes, so there is potential for data harmonization across devices. This is particularly pertinent given the recently established Prospective Physical Activity, Sitting, and Sleep Consortium (ProPASS), which aims to combine existing and future observational studies utilizing thigh-worn accelerometry to "produce evidence on the associations of physical activity, sitting and sleep and health outcomes" (Stamatakis et al., 2020).
In a controlled laboratory setting, it has previously been demonstrated that accelerometers worn on the thigh produced similar accuracy for identifying lying, sitting, and upright postures irrespective of whether proprietary postural allocation algorithms or open-source algorithms were used to process the data. However, not all accelerometer brands were processed using a single method, and the structured activities completed within the laboratory may not reflect free-living behaviors. More recently, by using the Acti4 software package to process all data, Crowley et al. (2019) compared time spent on physical behavior types (e.g., sitting, standing, walking, running) from three different accelerometer brands (activPAL, ActiGraph, and Axivity) worn on the thigh. They concluded that during a 7-day free-living protocol, the difference in the average time spent on the physical behavior types was negligible between the accelerometer brands. When averaged over 24 h, the absolute differences between accelerometers ranged from as little as 0.1 min for running to 3.5 min for moving. These results show promise for the harmonization of data collected from different brands of thigh-worn accelerometers. That study focused on physical behavior types classified using Acti4 software; this is not open source, although it is freely available upon agreement with the National Research Center for the Working Environment, Copenhagen (Crowley et al., 2019). In addition to classification of behavior type, it is important to assess whether metrics that directly reflect measured acceleration from the accelerometers are comparable. These include the magnitude of the acceleration, the orientation of the monitor relative to gravity, and the pattern of the acceleration. Algorithms to classify behavior type will continue to evolve but will be based on direct measurements from the accelerometer (e.g., Montoye et al., 2016a; Tjurin et al., 2019). Further, direct measures may be used alongside behavior-type algorithms; for example, the intensity (acceleration magnitude) when someone is walking. Therefore, knowing which of the directly measured acceleration metrics compare well is crucial if considering harmonizing data across brands.
Direct measures of acceleration have been used to describe the physical activity profile and its association with health (e.g., Buchan et al., 2019; Fairclough et al., 2019; Rowlands et al., 2018a). For example, the average acceleration reflects the overall level of activity, the intensity gradient (Rowlands et al., 2018a) describes the distribution of acceleration intensity across the 24 h profile, and the MX metrics describe the intensity of the most active periods of the day (e.g., M30 for the most active 30 min) (Rowlands et al., 2019a, 2019c). We have previously shown the average acceleration and intensity gradient were equivalent between accelerometer brands worn on the non-dominant wrist, with only the intensity gradient being equivalent between brands on the dominant wrist (Rowlands et al., 2019b). Given comparability can differ between accelerometer brands for the dominant and non-dominant wrist (Rowlands et al., 2019b) and that different movement patterns occur at the wrist and thigh, we cannot assume results from studies comparing the metrics at the wrist will generalize to the thigh. Furthermore, the equivalency of these metrics has yet to be investigated between the activPAL (the most widely used thigh-worn accelerometer) and different accelerometer brands, and between all brands when worn on the thigh.
The aim of this study is to compare postural and physical activity outcomes across the GENEActiv, activPAL, ActiGraph, and Axivity accelerometers worn on the thigh when processed using the same open-source methods. We hypothesize that outcomes created using information on the orientation of the device (i.e., postural outcomes) will not differ between accelerometer brands, but there will be differences in acceleration metrics between accelerometer brands when processed using the open-source accelerometer processing and analyzing software GGIR, particularly for the activPAL due to its lower dynamic range (± 2 g) and sampling frequency (20 Hz) relative to the other accelerometer brands. The results will inform the potential for data harmonization across studies for different postural and physical activity outcomes across different accelerometer brands when worn on the thigh.

Participants and procedure
A convenience sample of 23 adults (≥18 years of age) was recruited via word of mouth from the University of Leicester (staff and students). Ethical approval was requested in May 2019 and received from the University of Leicester's College of Life Sciences ethical representatives in July 2019.
Following the provision of informed consent, participants' height, weight, and body fat percentage (Tanita BC-418 MA, Tanita Europe BV) were measured and basic demographic variables (age, sex, and ethnicity) were collected. Participants were fitted with four accelerometers (activPAL micro, GENEActiv, Axivity, ActiGraph Link) on the upper mid-line of the right thigh. The devices were attached to the skin with medical dressing (Hypafix; BSN Medical, UK). The Axivity was taped on top of the ActiGraph and the activPAL was taped on top of the GENEActiv, with the order in which these pairs were worn on the thigh randomized for each participant. The two randomization options were as follows: 1) GENEActiv/activPAL closest to the knee and ActiGraph/Axivity closest to the hip; 2) GENEActiv/activPAL closest to the hip and ActiGraph/Axivity closest to the knee. There was a gap of approximately 1.5 cm between the pairs of devices when placed on the thigh. Participants were instructed to wear the monitors continuously for three nights and two complete waking days and to complete a log of the time they went to bed, fell asleep, woke up, and got out of bed, as well as any monitor removal times.

Accelerometers
The activPAL3™ micro (PAL Technologies Ltd, Glasgow, UK) is a triaxial accelerometer with a dynamic range of ± 2 g, where g is equal to the Earth's gravity. Default settings (20 Hz, 10 s minimum sitting and upright period) were used during initialization. activPAL monitors were initialized and data downloaded using PAL Connect version 8.10.6.63, and data were saved in raw format as .csv through PAL Batch.
The ActiGraph GT9X Link (ActiGraph LLC, Pensacola, FL, USA), GENEActiv Original (Activinsights Ltd., Cambridgeshire, UK), and Axivity AX3 (Axivity Ltd, Newcastle, UK) are triaxial accelerometers with a dynamic range of ± 8 g (ActiGraph GT9X Link and GENEActiv Original) and ± 16 g (Axivity AX3). These monitors can be worn on various body locations including wrist, waist, ankle, upper arm, and thigh. All accelerometers were configured on the same PC with a dynamic range of ± 8 g and a sampling frequency of 100 Hz to reflect common practice (e.g., Doherty et al., 2017; Fairclough et al., 2019; Rowlands et al., 2018b). The ActiGraph GT9X Link monitors were initialized and data downloaded using ActiLife version 6.13.4. Data were saved in raw format as .gt3x and subsequently converted to .csv format for data processing. GENEActiv monitors were initialized and data downloaded using GENEActiv PC software version 3.2, and data were saved in raw format as .bin files. Axivity devices were set up and data downloaded with OmGui open-source software (OmGui Version 1.0.0.37, Open Movement, Newcastle, UK) and saved in .cwa format.

Postural outcomes
GENEActiv: The raw .bin files were converted into 15 s epoch .csv files using GENEActiv PC software v3.2. The 15 s epoch files were imported into a custom-built spreadsheet template in Excel originally designed for GENEActiv 15 s epoch data (ActivInsights postural allocation algorithm (Rowlands et al., 2014) available at: https://www.researchgate.net/project/AMBer-Assessment-of-Movement-Behaviors) that computes the most likely posture based on the relative values of the x, y, z vectors measured at the thigh.
ActiGraph, Axivity and activPAL: The raw .csv files were converted with a custom-built program (GT9X-to-SedSphere, Liverpool John Moores University, UK) written in MATLAB (R2017b, The MathWorks Inc., Natick, MA, USA). It calculates the vector magnitude with gravity subtracted, summed over 15 s epochs (as in GENEActiv PC software), and matches the orientation of each axis to those of the GENEActiv, as required for the ActivInsights postural allocation algorithm (Hurter et al., 2019). Data were then imported into the custom-built Excel template and processed as described for the GENEActiv. The following outcomes were generated and averaged across the 2 days: sitting/lying on the back, lying on the side, lying on the front, total time sitting/lying (sum of the previous three variables), and total upright time.

Physical activity outcomes
All accelerometer files were processed and analyzed with the R-package GGIR version 1.9-4 (http://cran.r-project.org) using default settings to maximize generalizability. Signal processing in GGIR included auto-calibration using local gravity as a reference (Van Hees et al., 2014); calculation of two acceleration signal aggregation metrics, ENMO (Euclidean Norm Minus One g) and MAD (Mean Amplitude Deviation), averaged over 5 s epochs; detection of non-wear; and detection of abnormally sustained high values. Both metrics aim to remove the gravitational component of acceleration, thus reflecting body movement (Van Hees et al., 2013). ENMO is the average magnitude of dynamic acceleration with one g subtracted and negative values rounded up to zero (Van Hees et al., 2014). MAD is the variability in acceleration around the mean, i.e., the Euclidean norm of each raw acceleration datapoint minus the mean of its corresponding 5 s epoch (Aittasalo et al., 2015; Migueles, Cadenas-Sanchez et al., 2019; Vähä-Ypyä et al., 2015a, 2015b). Both were expressed in milli-gravitational units (mg). Non-wear was imputed using the default setting in GGIR, i.e., invalid data were imputed by the average at similar time-points on different days of the week.
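The two epoch-level metrics described above can be illustrated with a minimal Python sketch. It is not the GGIR implementation: the function name `epoch_metrics` and its parameters are hypothetical, the input is assumed to be already calibrated (x, y, z) samples in g, and MAD is computed as the mean absolute deviation of the vector magnitude around its epoch mean.

```python
import math


def epoch_metrics(samples, sample_rate_hz, epoch_s=5):
    """Return a list of (ENMO, MAD) pairs in mg, one per complete epoch.

    samples: sequence of (x, y, z) tuples in g (assumed calibrated).
    """
    n = int(sample_rate_hz * epoch_s)  # samples per epoch
    results = []
    for start in range(0, len(samples) - n + 1, n):
        # Euclidean norm (vector magnitude) of every sample in the epoch
        norms = [math.sqrt(x * x + y * y + z * z)
                 for x, y, z in samples[start:start + n]]
        # ENMO: subtract 1 g, set negative values to zero, then average
        enmo = sum(max(r - 1.0, 0.0) for r in norms) / n
        # MAD: mean absolute deviation of the norm around the epoch mean
        mean_norm = sum(norms) / n
        mad = sum(abs(r - mean_norm) for r in norms) / n
        results.append((enmo * 1000.0, mad * 1000.0))  # convert g to mg
    return results
```

For a stationary device measuring only gravity, both metrics are zero; a signal whose magnitude alternates between 0.8 g and 1.2 g gives an ENMO of about 100 mg (the negative half is clipped) and a MAD of about 200 mg.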
Participants were excluded if their accelerometer files showed post-calibration errors greater than 0.01 g (10 mg), or if wear data were not present for each 15 min period of the 24 h cycle. Detection of non-wear has been described in detail previously (see "Procedure for nonwear detection" in the supplementary document to Van Hees et al., 2013). Briefly, non-wear is estimated based on the standard deviation and value range of each axis, calculated for 60 min windows with a 15 min sliding window. The window is classified as non-wear if, for at least two out of the three axes, the SD (standard deviation) is less than 3 mg or the value range is less than 50 mg.
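The non-wear rule summarized above can be sketched as follows. This is an illustrative reading of the description, not the GGIR code: the function name `nonwear_windows` and its parameters are hypothetical, thresholds are taken from the text (3 mg and 50 mg, expressed in g), and the use of the population standard deviation is an assumption.

```python
import statistics


def nonwear_windows(samples, sample_rate_hz, window_min=60, step_min=15,
                    sd_threshold_g=0.003, range_threshold_g=0.050):
    """Return one boolean per window start; True means likely non-wear.

    samples: sequence of (x, y, z) tuples in g.
    """
    win = int(window_min * 60 * sample_rate_hz)   # samples per window
    step = int(step_min * 60 * sample_rate_hz)    # samples between window starts
    flags = []
    for start in range(0, len(samples) - win + 1, step):
        window = samples[start:start + win]
        quiet_axes = 0
        for axis in range(3):
            values = [s[axis] for s in window]
            sd = statistics.pstdev(values)            # assumption: population SD
            value_range = max(values) - min(values)
            # An axis counts as "quiet" if either criterion is met
            if sd < sd_threshold_g or value_range < range_threshold_g:
                quiet_axes += 1
        # Non-wear when at least two of the three axes are quiet
        flags.append(quiet_axes >= 2)
    return flags
```

With 100 Hz data each window holds 360,000 samples per axis, so a production version would normally use vectorized operations; the loop form is kept here only for readability.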
The following outcomes were generated and averaged across the 2 days: average acceleration (mg); intensity gradient; and the acceleration above which a person's most active 8 hours, 60 minutes, and 30 minutes (M8h, M60, and M30, mg) are accumulated (within the GGIR package, these metrics are obtained in part 2 using the qlevels argument over the 0-24 h window: 960/1440, 1380/1440, 1410/1440). The M8h, M60, and M30 rank the accelerations for each epoch during the day in descending order to obtain the acceleration above which the person's most active 8 hours, 60 minutes, and 30 minutes, respectively, are accumulated (Rowlands et al., 2019a). The M60 and M30 illustrate the more active periods of the day, while the M8h refers to the most active 8 h of the day. If sleep approximates 8 h, the M8h is the acceleration that discriminates between the least and most active half of the waking day (Rowlands et al., 2019a).
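The ranking logic behind the MX metrics, and how the quantile levels listed above map onto durations, can be sketched in a few lines of Python. The helper names `qlevel` and `mx_metric` are hypothetical. GGIR obtains these values as quantiles of the epoch-level data, so interpolation details may differ slightly from this simple rank-based version.

```python
def qlevel(active_minutes, day_minutes=1440):
    """Quantile level that leaves `active_minutes` of the day above it."""
    return (day_minutes - active_minutes) / day_minutes


def mx_metric(epoch_values_mg, active_minutes, epoch_s=5):
    """Acceleration (mg) above which the most active `active_minutes` accumulate.

    epoch_values_mg: epoch-level accelerations for one 24 h day.
    """
    n_epochs = int(active_minutes * 60 / epoch_s)  # epochs in the target duration
    if not 0 < n_epochs <= len(epoch_values_mg):
        raise ValueError("not enough epochs for the requested duration")
    ranked = sorted(epoch_values_mg, reverse=True)  # most active epoch first
    return ranked[n_epochs - 1]                     # lowest value inside the top block
```

For example, `qlevel(30)` equals 1410/1440, `qlevel(60)` equals 1380/1440, and `qlevel(480)` equals 960/1440, matching the levels used for M30, M60, and M8h.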
The average acceleration reflects the overall level of physical activity. The intensity gradient reflects the distribution of acceleration intensity across the 24 h day and has been described elsewhere (Rowlands et al., 2018a); in brief, it describes the negative curvilinear relationship between physical activity intensity and the time accumulated at that intensity during the 24 h day. The intensity gradient is always negative, reflecting the drop in time accumulated as intensity increases; a more negative (lower) gradient reflects a steeper drop with little time accumulated at mid-range and higher intensities, while a less negative (higher) gradient reflects a shallower drop with more time spread across the intensity range. It was calculated as previously described (Rowlands et al., 2018a) and generated in GGIR (argument iglevels = TRUE).
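As a rough illustration of the intensity gradient, the sketch below regresses the natural log of time accumulated in each intensity bin on the natural log of the bin mid-point and returns the slope. The 25 mg bin width and the 0-4000 mg range are assumptions made for this example and should be checked against the cited method; the function name `intensity_gradient` is hypothetical, and empty bins are skipped because their log is undefined.

```python
import math


def intensity_gradient(epoch_values_mg, epoch_s=5, bin_width_mg=25, max_mg=4000):
    """Return (slope, intercept) of ln(minutes) against ln(bin mid-point in mg)."""
    minutes_per_epoch = epoch_s / 60.0
    n_bins = max_mg // bin_width_mg
    minutes = [0.0] * n_bins
    for value in epoch_values_mg:
        # Values above the top bin edge are counted in the last bin
        idx = min(int(value // bin_width_mg), n_bins - 1)
        minutes[idx] += minutes_per_epoch
    xs, ys = [], []
    for i, t in enumerate(minutes):
        if t > 0:  # skip empty bins
            xs.append(math.log((i + 0.5) * bin_width_mg))
            ys.append(math.log(t))
    if len(xs) < 2:
        raise ValueError("need at least two non-empty intensity bins")
    # Ordinary least-squares fit of ys on xs
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

Because time accumulated falls as intensity rises, the slope returned for realistic data is negative, in line with the description above.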

Statistical analysis
Only those who provided a full set of data (at least 20 hours of non-imputed data on each day across all four accelerometers, with the amount and timing of non-wear time the same between accelerometers, confirmed numerically and by visual inspection of the accelerometer traces) were included in analyses. This was determined for the physical activity outcomes and matched for the postural outcomes. Sleep data were not excluded as the purpose was to compare the posture and acceleration metrics as generated from the open-source algorithms. Descriptive statistics (mean ± SD, or median (interquartile range) where data were not normally distributed) were calculated for all outputs. Pairwise 95% equivalence tests were used to determine whether the 95% CI for the mean of one accelerometer fell within a 10% equivalence zone of the second accelerometer (Wellek, 2003). The 10% equivalence zone might not be strict enough when the values are very high or the range is low, so a 5% equivalence zone was also used. Log transformation was applied to data that were not normally distributed. No accelerometer is considered a gold standard, so equivalency tests were conducted with each accelerometer as the reference monitor for each accelerometer pairing. Postural and physical activity outcomes from accelerometer pairings were considered equivalent if equivalency was achieved irrespective of which accelerometer was the reference monitor. Equivalence results are shown with the ActiGraph as the reference when paired with the GENEActiv, Axivity, and activPAL; the Axivity as the reference when paired with the activPAL and GENEActiv; and the GENEActiv as the reference when paired with the activPAL. Our previous work (Rowlands et al., 2018b) suggests that the ratio between the mean average acceleration outputs from the two accelerometers would be within 1 ± 0.05 and the standard deviation of the differences in the log transformed outputs would be less than 0.08. A sample size of 20 gives 80% power (alpha 0.05) to detect statistical equivalence of the anticipated effect size (Minitab v20.1; Minitab LLC, State College, PA).
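The equivalence logic described above, including the requirement that equivalence holds whichever monitor is the reference, can be sketched as follows. This is a simplified illustration and not the Minitab procedure: it assumes a paired design, builds a t-based confidence interval for the mean paired difference, and checks it against a zone of ± 10% (or 5%) of the reference mean. The function names are hypothetical, and the default critical value of 2.093 is the two-sided 95% t value for 19 degrees of freedom (n = 20), supplied as a parameter because the Python standard library has no t distribution.

```python
import math
import statistics


def within_zone(test_values, reference_values, zone=0.10, t_crit=2.093):
    """True if the CI of the mean paired difference lies inside the zone."""
    diffs = [t - r for t, r in zip(test_values, reference_values)]
    mean_diff = statistics.mean(diffs)
    half_width = t_crit * statistics.stdev(diffs) / math.sqrt(len(diffs))
    # Zone is a fraction of the reference mean; abs() handles negative
    # outcomes such as the intensity gradient
    margin = zone * abs(statistics.mean(reference_values))
    return -margin <= mean_diff - half_width and mean_diff + half_width <= margin


def equivalent(values_a, values_b, zone=0.10, t_crit=2.093):
    """Equivalent only if the check passes with each monitor as reference."""
    return (within_zone(values_a, values_b, zone, t_crit)
            and within_zone(values_b, values_a, zone, t_crit))
```

Calling `equivalent(a, b, zone=0.05)` applies the stricter 5% zone; log-transformed inputs could be passed in where data are not normally distributed.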
The level of agreement between outcome variables was examined using mean absolute percent error (MAPE), pairwise Intra-class Correlation Coefficients (ICC) with 95% confidence intervals (CI), and mean bias and limits of agreement (LoA) (Bland & Altman, 1986). The level of reliability from ICCs was based on the lower bound of the 95% CI of the ICC values. If the lower bound of the 95% CI was <0.5, reliability was classified as "poor," 0.5-0.75 as "moderate," >0.75-0.9 as "good" and >0.9 as "excellent" (Koo & Li, 2016). Percent agreement and kappa were used to determine epoch-by-epoch classification accuracy for upright vs sitting/lying (sitting/lying on the back, lying on the side or on the front).
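The agreement statistics listed above are straightforward to compute; the Python sketch below shows one possible formulation. The function names are hypothetical, the limits of agreement use the conventional bias ± 1.96 SD of the differences, and kappa is Cohen's kappa for two sets of epoch labels. ICC estimation is omitted, and only the classification of its lower bound is shown.

```python
import statistics


def mape(test_values, reference_values):
    """Mean absolute percent error of test against reference values."""
    errors = [abs(t - r) / abs(r) for t, r in zip(test_values, reference_values)]
    return 100.0 * statistics.mean(errors)


def bland_altman(test_values, reference_values):
    """Return (mean bias, lower LoA, upper LoA)."""
    diffs = [t - r for t, r in zip(test_values, reference_values)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def classify_reliability(icc_lower_bound):
    """Label reliability from the lower bound of the ICC 95% CI."""
    if icc_lower_bound < 0.5:
        return "poor"
    if icc_lower_bound <= 0.75:
        return "moderate"
    if icc_lower_bound <= 0.9:
        return "good"
    return "excellent"


def percent_agreement_and_kappa(labels_a, labels_b):
    """Epoch-by-epoch percent agreement and Cohen's kappa."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    # Chance agreement from the marginal proportions of each category
    expected = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
                   for c in categories)
    # Kappa is undefined when chance agreement is already perfect
    kappa = (observed - expected) / (1 - expected) if expected < 1 else float("nan")
    return 100.0 * observed, kappa
```

For instance, two monitors that agree on three of four epochs labelled "up" or "sit", with marginals of 2/2 and 1/3, give 75% agreement and a kappa of 0.5.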
Descriptive and agreement statistics (MAPE, ICCs, LoA, and classification accuracy) were calculated using IBM SPSS Statistics 25, and equivalency tests were conducted in Minitab (v19). Alpha was set at 0.05.

Results
Valid data across all accelerometers were provided by 20 of the 23 participants, consisting of seven males and 13 females (mean age: 28.7 ± 6.3 years [range 20-45 years]; mean BMI: 24.0 ± 3.8 kg/m² [range 19.8-31.3 kg/m²]). The data were invalid for two participants due to malfunction of the activPAL accelerometer and for one participant due to <20 hours wear-time on each day. Descriptive data (mean (SD), or median (25th-75th percentile) where data were not normally distributed) for postural and physical activity outcomes are presented in Table 1. Monitors were worn for 23.9 ± 0.4 h per 24 h day, with 4.9 ± 1.9 minutes of data imputed per 24 h day. Participants took 9825 ± 3948 steps per day (generated from the activPAL (Winkler et al., 2016)) and spent 10.7 ± 2.1 hours per day sedentary (excluding time reported sleeping).

Postural Outputs
Descriptive statistics were similar for sitting/lying and upright time across all monitors (Table 1). However, mean values suggested the activPAL tended to give higher estimates of sitting/lying on the back and lower estimates of lying on the side. MAPEs were lowest (0.3-0.8% for sitting/lying time and 1.5-2.5% for upright time) and reliability excellent (lower bound ICCs ≥0.99) for all monitor pairings for sitting/lying time and upright time. However, MAPEs were higher for the specific lying positions (back: 1.1-8.5%; side: 3.1-21.0%), with the monitor pairings that included the activPAL having consistently higher MAPEs. Reliability was good for sitting/lying on the back (lower bound ICCs 0.87-0.99), lying on the side (lower bound ICCs 0.85-0.98), and lying on the front (lower bound ICCs 0.81-0.95). All monitor pairings were within the 5% equivalency zone for the average time spent sitting/lying and upright. Results for the specific lying positions were less consistent, but for sitting/lying on the back and lying on the side all monitor pairings that did not include the activPAL were within the 10% equivalency zone. Epoch-by-epoch classification accuracy for upright vs sitting/lying was generally high irrespective of monitor pairing (mean percent accuracy 96.7% (6.3); Table 2).

Physical activity outputs
Mean values for the activPAL were lower across all physical activity outputs in comparison to the other monitors, as were values from the ActiGraph compared to the GENEActiv and Axivity across the majority of outputs. The MAPEs were 8% or less (3.6-8.0%) for all monitor pairings for intensity gradients generated from ENMO and MAD. For all other outputs, there was greater variation in MAPE across monitor pairings, with the ActiGraph/Axivity pairing generally having lower MAPEs (ENMO: 3.6-11.3%; MAD: 3.0-15.5%) than other monitor pairings across most physical activity outputs. The M8h followed a different pattern, with monitor pairings including the Axivity having the highest MAPE for ENMO outputs (38.2-44.2%), but monitor pairings including the activPAL having the highest MAPE for MAD outputs (63.2-177.0%). The monitor pairings that included the activPAL consistently had higher MAPEs (GENEActiv/activPAL pairing: ENMO 6.9-29.3%, MAD 6.0-35.5%; ActiGraph/activPAL pairing: ENMO 6.5-25.5%, MAD 6.8-25.2%; Axivity/activPAL pairing: ENMO 7.9-32.8%, MAD 5.5-32.5%) across all other physical activity outputs compared to the other monitor pairings.
Reliability was good to excellent for average acceleration [...] MAD (ENMO: ICC 0.95-0.99). For M8h ENMO, reliability was moderate for the GENEActiv/activPAL, ActiGraph/GENEActiv, and ActiGraph/Axivity pairings (ICC: 0.52-0.65), but otherwise poor. All monitor pairings were within the 10% equivalency zone for the intensity gradient for ENMO and MAD. The M30 and M60 were within the 10% equivalence zone for ENMO and MAD for the Axivity/GENEActiv and ActiGraph/Axivity pairings. The average acceleration was within the 10% zone for the Axivity/GENEActiv pairing (ENMO) and for the ActiGraph/Axivity pairing (MAD). An inconsistent pattern of results was observed for the other outcomes and monitor pairings.

Discussion
Accelerometer-based devices worn on the thigh are now being used in large observational studies (Stamatakis et al., 2020). To aid comparisons across studies and to combine data from studies that have employed different accelerometer brands, it is important to understand which postural and physical activity metrics are comparable at the individual level. When information on the angle of the thigh is used for classification of posture, total sitting/lying time and upright time compare well across all monitor pairings. For directly measured acceleration, the majority of outcomes compared well for the Axivity, GENEActiv, and ActiGraph. Overall, the upright and sitting/lying postural outputs show the highest potential for data pooling across all four brands.
Recently, Crowley et al. (2019) compared time spent in types of physical behaviors (i.e., sitting/lying, standing, moving, walking, stair climbing, running, and cycling) across three different accelerometer brands (activPAL, Axivity and ActiGraph) worn simultaneously on the thigh during a semi-structured protocol of activities and a 7-day free-living measurement period. All data were processed using a custom developed software package called Acti4 and results showed only small absolute standard deviation values for lying/sitting (1.2 minutes/day), standing (3.4 minutes/day), moving (3.5 minutes/day), walking (1.9 minutes/day), running (0.1 minutes/day), stair climbing (1.2 minutes/day), and cycling (1.9 minutes/day). Considering these results alongside the present study, data harmonization using data collected from different accelerometer brands worn on the thigh appears to be most appropriate when outcomes of interest are broad postures (sit/lie or upright) or physical behavior types rather than acceleration magnitude-based metrics, even when the same processing methods are used. This is consistent with previous research that has demonstrated greater consistency between monitor brands for the orientation of the monitor (Rowlands et al., 2016) and acceleration features from the frequency domain (John et al., 2013; Rowlands et al., 2015) than for the magnitude of acceleration. For acceleration magnitude, the Axivity/GENEActiv (ENMO) and the Axivity/ActiGraph (MAD) pairings show the greatest potential for data harmonization as they could be considered equivalent for all ENMO outputs except for the intensity of the most active 8 h (M8h).
The intensity gradient (calculated from either ENMO or MAD) showed comparability across all monitor pairings, likely because this metric describes the distribution (or pattern) of acceleration intensity across the 24 h profile, rather than the magnitude of acceleration. This is in line with previous research where the intensity gradient was equivalent between all accelerometer brands worn on either wrist (Rowlands et al., 2019b). However, the intensity gradient has a narrow range of values, and thus the 10% equivalence zone encompasses values that differ by over half a standard deviation, which limits its physiological and clinical relevance. The range of average acceleration encompassed by the 10% zone is 0.2 standard deviations, which equates to a 3% equivalency zone for the intensity gradient. Our results suggest that the intensity gradient falls into this 3% equivalency zone for the Axivity/GENEActiv and ActiGraph/Axivity pairings, which is consistent with equivalency results for the other acceleration outcomes. The results for the intensity of the most active 8 h suggest that the equivalence between monitors is lowest and MAPE highest for very low acceleration magnitudes, particularly for the MAD metric. This is important as it may have implications for low-active populations where accelerations throughout the day are lower; however, reliability was moderate (ENMO) to excellent (MAD), and mean bias was relatively low.
The activPAL tended to record lower acceleration values for the physical activity outputs calculated from direct acceleration relative to the other devices. The lower acceleration values observed with the activPAL device were anticipated due to the lower dynamic range (± 2 g vs ± 8 g) and frequency (20 Hz vs 100 Hz) of the device settings for data collection in comparison to other brands (John et al., 2019). The ActiGraph device also recorded lower accelerations for the physical activity outputs relative to the GENEActiv and Axivity accelerometers. This has been reported previously when accelerometer brands were worn concurrently on the wrist (Rowlands et al., 2018b, 2019b) and during shaker testing (John et al., 2013). It has been attributed to the data being passed through a wide bandwidth low-pass filter that is used when downsampling to the frequency requested by the researcher (Rowlands et al., 2019b). As there was no criterion measure in this study, it is not possible to know which monitor performed best, nor is it possible to know the most appropriate dynamic range and sampling frequency for the capture of physical behaviors.
The main strengths of this study were the use of open-source methods to process the data from all monitors, the free-living assessment, and simultaneous wearing of four different accelerometer brands. However, a number of limitations should be noted. First, in order for the monitors to be worn on a similar part of the thigh, we made the decision to tape monitors together. The activPAL was taped on top of the GENEActiv and the Axivity on top of the ActiGraph, which may have affected the results; specifically, this could have affected the classification of the different lying positions. Second, the epoch duration differed for the posture and physical activity analyses; to ensure the results were relevant to people accessing the open-source methods, the default epoch settings of 15 s for the posture spreadsheet and 5 s for GGIR outcomes were used. This may have affected the findings. Finally, our sample size was small and did not include any participants over the age of 45 years. Our sample did, however, include a range of activity levels; the average acceleration ranged from 10 to 70 mg and the steps per day ranged from 3475 to 20,640. In view of the poorer results for the lowest acceleration outcomes (intensity of the most active 8 h), comparability of acceleration outcomes in a lower active population should be investigated.

[Figure: panels MAD Average Acceleration, MAD Intensity Gradient, MAD M30, MAD M60, MAD M8h, ENMO Average Acceleration, ENMO Intensity Gradient, ENMO M30; x-axis: ratio between the means]
In conclusion, the results suggest that average time spent in total sitting/lying and upright time derived from any of the monitors tested has strong potential for harmonization. Overall, most acceleration outcomes compared well for the Axivity, GENEActiv and ActiGraph for both the ENMO and MAD acceleration metrics, but care is recommended when comparing acceleration magnitude outcomes between brands. For the widely used ENMO acceleration metric (Migueles, Cadenas-Sanchez et al., 2019; Rowlands et al., 2019a), the Axivity and the GENEActiv appear most suited to harmonization; however, the magnitude of acceleration from the activPAL is lower than from the other monitor brands. These findings will inform studies looking to harmonize individual-level data from studies deploying accelerometer data, e.g., ProPASS.