Intra-rater repeatability of gait parameters in healthy adults during self-paced treadmill-based virtual reality walking

Abstract Self-paced treadmill walking is becoming increasingly popular for gait assessment and re-education in both research and clinical settings, but its day-to-day repeatability is yet to be established. This study examined the test-retest repeatability of key gait parameters obtained from the Gait Real-time Analysis Interactive Lab (GRAIL) system. Twenty-three able-bodied male adults (age: 34.56 ± 5.12 years) completed two separate gait assessments on the GRAIL system, separated by 5 ± 3 days. Key kinematic, kinetic, and spatial-temporal gait parameters were analysed. The Intraclass Correlation Coefficient (ICC), Standard Error of Measurement (SEM), Minimum Detectable Change (MDC), and the 95% limits of agreement were calculated to evaluate the repeatability of these gait parameters. Day-to-day agreement was excellent (ICCs > 0.87) for spatial-temporal parameters, with low MDC and SEM values (<0.153 and <0.055, respectively). The repeatability was higher for joint kinetic than kinematic parameters, as reflected in small values of SEM (<0.13 Nm/kg and <3.4°) and MDC (<0.335 Nm/kg and <9.44°). The obtained values of all parameters fell within the 95% limits of agreement. Our findings demonstrate the repeatability of the GRAIL system available in our laboratory. The SEM and MDC values can be used to help researchers and clinicians distinguish 'real' changes in gait performance over time.


Introduction
Gait analysis using instrumented treadmills has become increasingly popular for gait assessment and training. These treadmills offer potential advantages for advancing gait analysis in both clinical and research settings, by recording multiple consecutive strides in a small space (Belli et al. 2001; Goldberg et al. 2008; Reed et al. 2013). However, walking on a treadmill at a fixed speed in the absence of visual flow raises the concern that gait may be altered by the need to maintain that fixed speed (Sheik-Nainar and Kaber 2007; Terrier and Dériaz 2011; Sloot et al. 2014b). This could be overcome by introducing a self-paced technique: a novel feedback-controlled paradigm that allows subjects to continually control and intrinsically select the treadmill's speed, in combination with a speed-matched virtual reality that generates visual flow to restore, to some extent, conditions comparable to over-ground walking (Souman et al. 2008; Geijtenbeek et al. 2011; Sloot et al. 2014a, 2014b). The Gait Real-time Analysis Interactive Lab (GRAIL, Motekforce Link, Amsterdam, the Netherlands) system uses the novel approach of a virtual reality-based self-paced treadmill, which integrates a self-paced algorithm to regulate the treadmill's speed, as described by Sloot et al. (2014b). Recent literature (Sloot et al. 2014a, 2014b; Liu et al. 2016) shows potential applications for the GRAIL system in gait assessment. However, to the best of our knowledge, no published research has established the day-to-day repeatability of the GRAIL system, despite its growing use in both clinical and research settings.
Repeatability of gait analysis is needed for both researchers and clinicians to better understand the results (Baker 2006). Establishing gait repeatability is vital to interpret whether the difference between repeated assessments of gait parameters represents a real change or merely a change within the boundaries of the standard error of measurement (Atkinson and Nevill 1998; Hopkins 2000; Schwartz et al. 2004; Baker 2006; McGinley et al. 2009).
The repeatability of the GRAIL system for healthy individuals needs to be established before it can be used to examine the effects of pathology on gait in certain patient populations. Therefore, the aim of this study was to determine the intra-rater repeatability of the GRAIL system measurements during self-paced treadmill walking, in repeated gait testing of healthy male participants. We hypothesised that the spatial-temporal gait parameters, joint ranges of motion, and peak joint moments recorded by the GRAIL system would be within the acceptable repeatability range (McGinley et al. 2009).
KEYWORDS reliability; instrumented treadmill; gait analysis; GRAIL; self-paced walking
ARTICLE HISTORY Received 26 May 2017; accepted 10 November 2017

Research participants and setting
The sample size for this study was chosen based on the recommendations reported by Walter et al. (1998). With the probabilities of type I and type II errors set at α = 0.05 and β = 0.2, and based on Table 2 in Walter et al. (1998), a sample size (k) of 15 was deemed suitable for our study. To allow for attrition, we increased the sample size by 50%, resulting in 23 subjects. Therefore, 23 male participants (age: 34.56 ± 5.12 years; height: 171 ± 6 cm; weight: 77.22 ± 9.76 kg; and BMI: 27.82 ± 6.93 kg/m²) underwent 2 gait analysis sessions, separated by an average of 5 ± 3 days. During this time, all participants remained active with regular gym attendance. Inclusion criteria were: age between 30 and 50 years; no evidence of photosensitive epilepsy and normal or corrected-to-normal vision; and healthy without any known neurological, cardiovascular, or musculoskeletal conditions. Written informed consent was obtained prior to participation. The study was carried out in the Research Centre for Clinical Kinesiology at Cardiff University, with ethical approval from the Research Ethics Committee at the Cardiff University School of Healthcare Science.
The study used the Cardiff GRAIL system (Figure 1; Gait Real-time Analysis Interactive Lab, Motekforce Link, the Netherlands), consisting of an instrumented split-belt treadmill with synchronised virtual environments projected onto a 180° semi-cylindrical projection screen and with a 12-camera Vicon MX optical infrared tracking system (Oxford Metrics, UK).

Measurement procedure
Each participant underwent two gait analysis sessions, approximately one week apart. They were asked to wear shorts and Aqua Shoes, which allowed consistent placement of the foot markers and avoided barefoot walking on the treadmill for safety reasons. Participants first walked continuously back and forth for a minute in the laboratory to ensure that they were comfortable in the Aqua Shoes. One assessor, a post-graduate musculoskeletal physiotherapist with nine years of practical experience, placed 25 reflective markers using the Human Body Model (HBM) lower-body marker set (van den Bogert et al. 2013) on each participant, as detailed in Supplementary Appendix 1. The assessor was blinded to the results of the first session when undertaking the second session. Knee and ankle widths, required for the HBM model, were measured by another post-graduate neuro physiotherapist during the first session. Kinematic marker data were collected and synchronised at 200 Hz using the passive-marker Vicon MX motion analysis system, with ground reaction force data sampled at 2000 Hz. Spatial-temporal gait parameters and joint kinematics and kinetics were calculated using the HBM gait model (van den Bogert et al. 2013) that is implemented in the D-flow software package (version 3.20.1, Motekforce Link, Amsterdam, the Netherlands).
In each trial, participants were positioned in the middle of the treadmill and wore a non-body-weight-supporting safety harness over their shoulders, loosely hanging from the ceiling. At the beginning of each session, participants were asked to perform at least 6 minutes of trial walking at their own comfortable walking speed to accustom themselves to self-paced treadmill walking (Liu et al. 2016) by means of the self-paced speed algorithm (Sloot et al. 2014b), whilst the pace of the visual flow was set by the treadmill's pace. The self-paced mode was chosen in order to allow participants to walk with freedom in stride variability. Following the acclimatisation trial, and as part of a larger study protocol, participants performed four 6-minute walking trials (with and without a cognitive task) in a random order. The trial order for the first session was replicated for the second. Participants were given a 2-minute rest between trials. Before the measurements of the second session began, participants were given 3 minutes to warm up on the treadmill.

Measurement of outcomes
Marker data were low-pass filtered with a second-order Butterworth filter with a cut-off frequency of 10 Hz. Gait events were detected from the foot markers (Zeni et al. 2008). Walking speed derived from the GRAIL treadmill output (Sloot et al. 2014b), key clinically important gait kinematic and kinetic parameters, and spatial-temporal gait parameters were processed and analysed in Matlab R2015b (The Mathworks Inc., USA). Vertical ground reaction forces (VGRF) were normalised by body weight and the values of their first and second peaks were calculated. To negate the effects of gait initiation and termination, average values of each parameter were computed across 100 strides taken from a window of about 4 minutes (from 50 to 310 s) of the full walking period (about 360 s). The number of strides included in this study was adequate for gait repeatability studies according to Monaghan et al. (2007) and Diss (2001), who demonstrated that five gait strides are sufficient for good intra-rater repeatability. Included strides were visually inspected for accuracy.
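The event detection step can be illustrated with a minimal sketch. It assumes the coordinate-based variant of the Zeni et al. (2008) approach, in which heel strike is taken at the maxima of the heel marker's anterior-posterior position relative to a pelvis marker and toe-off at the minima of the toe marker's relative position; the text does not state which variant or which pelvis marker was used. The function names, the synthetic sinusoidal trajectories, and the 200-sample stride period are hypothetical.

```python
import math

def local_maxima(signal):
    """Indices of samples that exceed the previous sample and are not exceeded by the next."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] < signal[i] >= signal[i + 1]]

def detect_gait_events(heel_ap, toe_ap, pelvis_ap):
    """Return (heel_strike_indices, toe_off_indices) from anterior-posterior positions.

    Assumption: heel strike = maximum of (heel - pelvis),
                toe-off     = minimum of (toe - pelvis).
    """
    heel_rel = [h - p for h, p in zip(heel_ap, pelvis_ap)]
    toe_rel = [t - p for t, p in zip(toe_ap, pelvis_ap)]
    heel_strikes = local_maxima(heel_rel)
    toe_offs = local_maxima([-v for v in toe_rel])  # minima of toe_rel
    return heel_strikes, toe_offs

# Synthetic example: 3 s of data at 200 Hz with a 1 s stride (hypothetical values).
FS = 200          # sampling rate in Hz, as reported for the marker data
PERIOD = 200      # samples per stride in this synthetic signal
n = 3 * PERIOD
pelvis = [0.0] * n
heel = [0.30 * math.sin(2 * math.pi * i / PERIOD) for i in range(n)]
toe = [0.15 + v for v in heel]

heel_strikes, toe_offs = detect_gait_events(heel, toe, pelvis)
stride_times = [(b - a) / FS for a, b in zip(heel_strikes, heel_strikes[1:])]
print(heel_strikes, toe_offs, stride_times)
```

On measured data the trajectories would first need low-pass filtering, and a minimum spacing between events would normally be enforced so that noise does not produce spurious peaks.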
Following the literature (Gorton III et al. 2009; McGinley et al. 2009; Bridenbaugh and Kressig 2011), the repeatability of the most clinically important spatial-temporal, kinematic, and kinetic gait parameters for both limbs is reported in this paper. These include: walking speed; stride time; stance and swing times; step length and width; Range of Motion (ROM) of hip flexion/extension, adduction/abduction, and rotation; ROM of knee flexion/extension; ROM of ankle dorsi-plantar flexion; foot progression angle; peaks of hip extension and adduction moment; peak knee extension moment; the first peak of knee adduction moment; peak of ankle dorsiflexion moment; and the first and second peaks of the VGRF. The ROM was calculated throughout the gait cycle as the difference between the maximum and minimum joint angle. The foot progression angle is defined as the angle of the foot in the horizontal plane relative to the direction of walking.
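As a small illustration of the ROM definition, the sketch below computes the difference between the maximum and minimum joint angle within each gait cycle and averages it across strides. Whether ROM was computed per stride and then averaged, or taken from an averaged curve, is not stated, so the per-stride approach is an assumption; the function names and the angle values are hypothetical.

```python
def range_of_motion(angles):
    """ROM of one gait cycle: maximum joint angle minus minimum joint angle."""
    return max(angles) - min(angles)

def mean_rom(angle_series, heel_strikes):
    """Average ROM over strides delimited by consecutive heel-strike indices."""
    roms = [range_of_motion(angle_series[start:end])
            for start, end in zip(heel_strikes, heel_strikes[1:])]
    return sum(roms) / len(roms)

# Hypothetical knee flexion angles (degrees) covering two short strides.
knee_angle = [5, 20, 60, 35, 10, 3, 22, 62, 30, 8]
strike_indices = [0, 5, 10]  # stride 1 = samples 0-4, stride 2 = samples 5-9

rom = mean_rom(knee_angle, strike_indices)
print(rom)
```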

Data analysis
All statistical analyses were carried out in the Statistical Package for the Social Sciences (SPSS) software Version 23 for Windows (IBM SPSS, Chicago, USA). The normality of the data was assessed with the Shapiro-Wilk test. Between-session differences in gait parameters were identified using a paired t-test with the significance level set at 0.05. 'Bland-Altman' analysis was used to assess systematic variations between the measurement values of both sessions (Bunce 2009). This includes the calculation of the 95% limits of agreement (LOA), the mean difference (Diff) between the first and second sessions, and the standard deviation of Diff. The following formulas were used:
Upper 95% LOA = Diff + 1.96 × SD Diff (1)
Lower 95% LOA = Diff − 1.96 × SD Diff (2)
where Diff refers to the mean difference between the two sessions and SD Diff refers to the standard deviation of the differences.
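A minimal sketch of the Bland-Altman calculation follows, assuming the conventional limits Diff ± 1.96 × SD Diff, differences taken as session 1 minus session 2, and the sample standard deviation. The function name and the walking-speed values are hypothetical and are not data from this study.

```python
from statistics import mean, stdev

def bland_altman(session1, session2, z=1.96):
    """Return (bias, lower_loa, upper_loa, indices_outside_loa).

    Assumptions: difference = session 1 - session 2, sample SD (n - 1),
    and 95% limits of agreement computed as bias +/- 1.96 * SD.
    """
    diffs = [a - b for a, b in zip(session1, session2)]
    bias = mean(diffs)
    sd_diff = stdev(diffs)
    lower = bias - z * sd_diff
    upper = bias + z * sd_diff
    outside = [i for i, d in enumerate(diffs) if d < lower or d > upper]
    return bias, lower, upper, outside

# Hypothetical walking speeds (m/s) for five participants on two days.
day1 = [1.20, 1.35, 1.10, 1.28, 1.42]
day2 = [1.22, 1.30, 1.12, 1.25, 1.45]

bias, lower, upper, outside = bland_altman(day1, day2)
print(f"bias={bias:.3f}, LOA=({lower:.3f}, {upper:.3f}), outside={outside}")
```

A bias close to zero with narrow limits suggests little systematic change between sessions; participants listed in `outside` correspond to points plotted beyond the LOA lines.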
Agreement between measurements was analysed using the Intraclass Correlation Coefficient (ICC). For this calculation, the one-way random effects model was chosen, with a confidence interval of 95%. To interpret the ICC results, values of 0.8 or above were considered to indicate excellent repeatability, values between 0.6 and 0.79 high repeatability, values between 0.4 and 0.59 fair repeatability, and values of 0.39 or lower poor repeatability (Bruton et al. 2000). The ICC is a relative measure and does not capture absolute repeatability. Therefore, the Standard Error of Measurement (SEM) was used to estimate absolute repeatability and to delineate intra-individual variability over repeated measurements (Atkinson and Nevill 1998). The SEM expresses measurement error in the same units as the original measurement and was calculated using (3) (Bruton et al. 2000):
SEM = SD1 × √(1 − ICC) (3)
where SD1 refers to the standard deviation of the first session.
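The ICC and SEM steps can be sketched as follows. The sketch assumes the single-measure form of the one-way random effects ICC, often written ICC(1,1), computed from the between-subject and within-subject mean squares, and the standard relation SEM = SD × √(1 − ICC). The text does not say whether single or average measures were reported, so this choice is an assumption, and the function names and data are hypothetical.

```python
import math
from statistics import mean, stdev

def icc_oneway(scores):
    """Single-measure one-way random effects ICC.

    scores: one list per subject, each holding k repeated measurements.
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW)
    """
    n = len(scores)
    k = len(scores[0])
    grand_mean = mean([v for row in scores for v in row])
    subject_means = [mean(row) for row in scores]
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum((v - m) ** 2
                    for row, m in zip(scores, subject_means) for v in row)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

def standard_error_of_measurement(sd_first_session, icc):
    """SEM in the units of the measurement, from the first-session SD and the ICC."""
    return sd_first_session * math.sqrt(1 - icc)

# Hypothetical walking speeds (m/s): [session 1, session 2] per participant.
speeds = [[1.20, 1.22], [1.35, 1.30], [1.10, 1.12], [1.28, 1.25], [1.42, 1.45]]

icc = icc_oneway(speeds)
sd1 = stdev([row[0] for row in speeds])
sem = standard_error_of_measurement(sd1, icc)
print(f"ICC={icc:.3f}, SEM={sem:.3f} m/s")
```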
To facilitate clinical interpretation, the Minimum Detectable Change (MDC), which indicates whether a change observed between tests is a 'real' alteration rather than a 'random' variation in measurements (Haley and Fragala-Pinkham 2006; Wilken et al. 2012), was calculated at the 95% confidence level using (4) (Haley and Fragala-Pinkham 2006):
MDC = 1.96 × √2 × SEM (4)

Results
The mean, standard deviations, and mean difference for test-retest results of key spatial-temporal, kinematic, and kinetic gait parameters are presented in Table 1. The paired t-tests showed non-significant differences between the means of the two sessions across the gait parameters, except for left swing time (p = 0.006), right peak hip abduction moment (p = 0.001), and right peak plantar-flexion moment (p = 0.048). The 'Bland-Altman' graphs in Figure 2 and Supplementary Appendix 2 show that the plotted differences of all gait parameters lie within the recommended 95% LOA, apart from two outliers that fall slightly outside the LOA. This indicates between-session agreement for the key gait parameters. For the spatial-temporal parameters, the difference between the means of these parameters in both sessions was less than 0.02 measurement units, whilst it was less than 1° for the kinematic parameters, apart from the left ROM of ankle dorsi-plantar flexion and foot progression (1.075° and 2.218°, respectively).
Table 2 reports the repeatability assessment values of all gait parameters by the ICC, SEM, and MDC. ICC values for the spatial-temporal gait parameters showed excellent repeatability, ranging between 0.87 and 0.942. SEM values were between 0.005 and 0.055 measurement units, whilst the MDC values were less than 0.154 measurement units.
The ICC values for kinematic parameters were generally higher than those for the kinetic parameters. Despite a tendency for some small differences in ROM between right and left limbs, the ROM of hip and knee flexion/extension and foot progression for both limbs had ICC values greater than 0.885. The ICCs for ROM of hip rotation and ankle dorsi-plantar flexion ranged between 0.654 and 0.794, whilst the ICC values of the ROM for hip abduction/adduction were less than 0.587. The SEM and MDC values were less than 3.41° and 9.45° for all kinematic parameters, respectively.
For key kinetic parameters, peak knee adduction moment, the second peak of VGRF, and the first peak of the right VGRF had ICC values greater than 0.83. The ICC values ranged between 0.625 and 0.765 for the other key kinetic parameters. The SEM was below 2.546 measurement units for all kinetic parameters, apart from the first peak of the VGRF, which had the greatest MDC and the widest 95% LOA.
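The relation between the reported SEM and MDC bounds can be checked with a short sketch, assuming the commonly used form MDC95 = 1.96 × √2 × SEM. The function name is hypothetical; the SEM inputs are the upper bounds quoted in the text for the spatial-temporal and kinematic parameters.

```python
import math

def mdc95(sem, z=1.96):
    """Minimum detectable change at 95% confidence, assuming MDC = z * sqrt(2) * SEM."""
    return z * math.sqrt(2) * sem

# SEM upper bounds quoted in the text (units follow the parameter concerned).
spatial_temporal_mdc = mdc95(0.055)  # compare with the reported bound of about 0.153
kinematic_mdc = mdc95(3.41)          # compare with the reported bound of about 9.45 degrees

print(f"{spatial_temporal_mdc:.3f}, {kinematic_mdc:.2f}")
```

Under this assumption the MDC is about 2.77 times the SEM, which agrees with the reported bounds to within rounding. A between-session change smaller than the MDC cannot be distinguished from measurement error at this confidence level.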

Discussion
The aim of this study was to provide an assessment of the day-to-day repeatability of the most common clinical gait parameters obtained from the GRAIL system during self-paced treadmill walking. The test-retest repeatability of gait analysis is vital to both clinical and research work because patients and research participants are often assessed over multiple sessions. The current study indicates that the key spatial-temporal gait parameters had excellent ICC values (>0.88) and relatively small SEM and MDC values (<0.055 and <0.153 measurement units, respectively), which were also inspected visually using the 'Bland-Altman' graphs. These test-retest repeatability findings are comparable with previously reported reliability findings for spatial-temporal gait parameters during both over-ground walking (Stolze et al. 1997; Paterson et al. 2008; Meldrum et al. 2014) and treadmill walking at self-selected speed (Owings and Grabiner 2004; Faude et al. 2012; Reed et al. 2013; Liu et al. 2016). The MDC values for spatial-temporal gait parameters revealed relatively small amounts of variability in the GRAIL system measurements, which makes them sufficient to detect real changes over time.
The test-retest repeatability of the kinematic data (Table 2) in this study was excellent (ICC > 0.885) for both hip and knee in the sagittal plane and for foot progression, high for hip rotation and ankle dorsi-plantar flexion, and poor for hip abduction/adduction. The ICC values for left and right limbs for these kinematic parameters were within similar bounds (Table 2), except for the values for hip abduction/adduction (ICC values for right and left limbs were 0.253 and 0.587, respectively). The left hip abduction/adduction ICC value in this study is, however, comparable to that noted during over-ground gait (0.57) in healthy adults (Monaghan et al. 2007). The ICC cannot be used alone to establish whether a measurement is reliable (Atkinson and Nevill 1998), so the 'Bland-Altman' bias plots for the ROM of hip abduction/adduction were examined in particular. The graph (Figure 2) showed that there were small differences between sessions (0.614° for the right limb and -0.153° for the left limb). However, an outlier was noticed in the right hip abduction/adduction, lying 4° above the 95% LOA. This between-session difference for a single participant affects the calculation of the overall ICC value of the right hip abduction/adduction. The absolute repeatability (SEM and MDC) was calculated to provide further information on the consistency of test-retest measurements. The SEM values for the right and left hip abduction/adduction were within an acceptable repeatability range (2.049° and 1.917°, respectively) (McGinley et al. 2009), which indicates that measurements of hip abduction/adduction made by the GRAIL system are stable over time (see Supplementary Appendix 3, which shows the mean kinematic and kinetic curves throughout the gait cycle for the 23 subjects). We cannot compare the repeatability of left and right limbs reported in this study with other published work because, to the best of our knowledge, no published work has reported the repeatability of both left and right limbs during treadmill walking.
The differences in the contribution of each lower limb during gait in healthy subjects are perhaps not surprising; functional differences between right and left lower limbs during walking have been discussed by Sadeghi et al. (1997). This is, however, beyond the scope of this work.
The SEM and MDC values for the other reported kinematic parameters (Table 2) in this paper were below 3.4° and 5.681°, respectively, apart from the MDC values of the ankle dorsi-plantar flexion (8.534° and 8.378° for left and right, respectively) and foot progression (9.44° and 8.603° for left and right, respectively). In agreement with these results, the 'Bland-Altman' graphs (Supplementary Appendix 2) revealed acceptably low within-subject variability for kinematic gait parameters obtained from the GRAIL system. Overall, these findings are in line with previous research in healthy people who performed self-paced over-ground walking in a conventional gait laboratory on two separate days (Meldrum et al. 2014). To recap, these findings illustrate that the GRAIL system provides reliable kinematic gait parameters.
Except for the left peak knee extensor moment, all kinetic gait parameters showed high to excellent repeatability (ICC > 0.622) during these day-to-day tests. The SEM and MDC values for all kinetic joint parameters were below 0.129 Nm/kg and 0.36 Nm/kg, respectively. Although the SEM and MDC values (Table 2) differed noticeably between the first and second VGRF peaks, these differences fall within the 95% LOA and were comparable to those reported by Reed et al. (2013). We consider the SEM and MDC values for the first VGRF peak high, which suggests that self-paced treadmill walking provides less reliable VGRF peaks. Ground reaction forces are expected to contribute to the modulation of walking speed (Marasovic et al. 2009; Peterson et al. 2011) on a treadmill, during acceleration and deceleration. Therefore, an increase in the VGRF was expected during the loading response (first peak) compared with the second peak. This is in line with the characteristics of VGRF during over-ground gait (Marasovic et al. 2009). However, more fundamental research focusing on the underlying mechanisms of self-paced treadmill walking is essential to clarify its relationship with ground reaction forces. Despite the concerns about the reliability of the first VGRF peak, the overall findings suggest that the GRAIL system provided reliable kinetic joint measurements.
Our study had a few limitations. Firstly, this was an intra-rater repeatability study that involved a group of able-bodied adults who attended a single laboratory. The results of this study should therefore be interpreted with caution, and further research is needed to assess the intra- and inter-rater repeatability of the GRAIL system in pathological populations. Future research should also overcome the bias of the current results by including females, although Menz et al. (2004) suggested that there are no gender effects on spatial-temporal gait. The inclusion of both males and females, however, would enable future work to be in line with other research that has shown differences in gait between genders (Kerrigan et al. 1998). The second limitation of this study is that we could not explore whether learning of self-paced treadmill walking over time and between days had an impact on the results. Participants in this study were required to walk for at least 6 min on day 1 and 3 min on day 2 to accustom themselves to self-paced treadmill walking. Further research would be necessary to determine what specific effects factors such as learning and adaptation may have on the repeatability of gait data obtained from the self-paced treadmill system.
In conclusion, this study established the between-session repeatability of the most common clinical gait parameters obtained from the GRAIL system in self-paced mode. The spatial-temporal gait parameters showed the best reliability measures, with the lowest SEM and MDC values. Key joint kinematic and kinetic gait parameters, except the first VGRF peak, are reliable and to some extent sensitive to the detection of relevant changes in these parameters. The SEM and MDC values for healthy individuals reported in this study can be used as baseline references for interpreting self-paced GRAIL assessments in clinical and research populations. Further studies are needed to determine the inter-centre repeatability of the GRAIL system.