The influence of season phase on multivariate load relationships in professional youth soccer

ABSTRACT The purpose of this research was to assess relationships between subjective and external measures of training load in professional youth footballers, whilst accounting for the effect of the stage of the season. Data for ratings of perceived exertion (RPE) and seven global positioning system (GPS) derived measures were collected from 20 players (age = 17.4 ± 1.3 yrs, height = 178.0 ± 8.1 cm, mass = 71.8 ± 7.2 kg) across a 47-week season. The season was categorised into a pre-season phase and two competitive phases (Comp1, Comp2). The structure of the data was investigated using principal component analysis. An extraction criterion of components with eigenvalues ≥1.0 was used. Two components were retained for the pre-season period, explaining a cumulative variance of 77.1%. Single components were retained for both Comp1 and Comp2, explaining 73.3% and 74.3% of variance, respectively. Identification of single components may suggest that measures are related and can be used interchangeably; however, these interpretations should be considered with caution. The identification of multiple components in the pre-season phase suggests that univariate measures may not be sufficient when considering load experienced. These results suggest that factoring load based on measures of volume and intensity should be considered.


Introduction
Soccer match play is characterised by frequent high-intensity accelerations, decelerations, and running (Whitehead et al., 2018). As such, soccer training aims to prepare players for the physical demands of match play, alongside developing technical, tactical and psychological understanding. Due to the high physical demands involved, match play and the training that prepares players for it can also present a substantive risk of injury (Peterson et al., 2000). With the aim of improving performance and reducing the risk of injury, practitioners supporting professional soccer players routinely monitor the physical load experienced by players (Drew & Finch, 2016). Whilst this route of investigation is common, it has been suggested that current practices relating load monitoring to injury lack substantial evidence, possibly due to the shortcomings of available univariate load metrics (Kalkhoven et al., 2021).
Load and the subsequent adaptations generated can be characterised as being either physiological or biomechanical (Vanrenterghem et al., 2017). Features of training load describing the magnitude and amount of the physical work are considered the external load, whereas features describing the resultant physiological and biomechanical response are characterised as the internal load (Impellizzeri et al., 2019; Vanrenterghem et al., 2017). Generally, practitioners monitor prescribed physical work, which is represented by external load, alongside the players' response, which is characterised as the internal load (Impellizzeri et al., 2019; Vanrenterghem et al., 2017). A central aim of research is to accurately model relationships between external and internal load to create more effective and responsive training stimuli to enhance physical performance and its expression during match play (Halson, 2014).
A range of technologies, variables, data processing and analysis techniques are used when monitoring internal and external load. Common approaches to monitoring internal load include subjective measurements such as the rating of perceived exertion (RPE) and objective measurements including heart-rate (HR) based assessments in the form of training impulse (TRIMP) and time spent in specific HR zones (Impellizzeri et al., 2004). Development of technologies such as global positioning system (GPS) devices and accelerometers has increased the availability of external load variables that are now common in professional soccer (Akenhead & Nassis, 2016). Whilst advances in technology and greater dissemination of research-based practices have made continuous load monitoring an essential component of elite athlete support, the lack of criterion measures of load has led practitioners to collect a range of variables, posing a challenge to clear interpretation of the data (Weaving et al., 2014). Initial attempts to assess validity of outcomes or identify underlying structures to reduce the dimensionality of data have compared all measures against each other using correlation or principal component approaches, respectively (Weaving et al., 2014). Research investigating underlying structure has generally found that measures representing either the internal or external load are strongly related to each other (Weaving et al., 2014). However, research has also established that relationships between load monitoring variables may be influenced by different training modes [10−13]. Comparing research findings across different sports suggests that potential changes in underlying structure across different training modes may also be sport specific (Weaving et al., 2017). Previous research in rugby league showed significant effects of training mode on relationships between internal and external load measures (Weaving et al., 2014).
Similar findings were reported in a follow-up study in rugby league comparing relationships between load measures during skills and conditioning focused training sessions (Weaving et al., 2017). In contrast, a recent analysis in professional youth soccer found no changes in underlying structure when categorising training sessions based on their proximity to match day (e.g., MD-1, MD-2). In accordance with previous research, the structure of load measures aligned along measures of volume and intensity (PC Maughan et al., 2021). It is plausible that the contrasting results are influenced by the specificity of the training sessions: in rugby league, the mode of training is more clearly defined and sessions can be categorised, for example, as "skills" or "conditioning" (Weaving et al., 2017). Conversely, in soccer training there is often less specificity and sessions are generally categorised based on their proximity to match day, creating greater within-session variability and potentially masking more subtle changes in relationships.
Whilst preliminary evidence suggests that load relationships remain consistent across different training contexts in professional soccer, less is known about the effect of stage of season. Previous research investigating training load in professional soccer has compared internal and external load in the English Premier League (Malone et al., 2015). Malone et al. (2015) reported no significant differences across the pre-season and in-season phases of training; however, it is worth noting that match play data were not included, which may influence the overall load experienced, particularly during the in-season phase. The aims of the different phases of the season generally differ, with development of fitness a primary goal of pre-season and maintenance of previously developed physical qualities often the aim during in-season, to enable focus on technical and tactical development (Malone et al., 2015). Given the contrasting aims of different stages of the season, the underlying structure described by the multivariate relationships between load measures may also change. As it is routine for practitioners to collect many load variables without a criterion measure, greater understanding of underlying structure and the factors that can alter it will provide practitioners with better context to monitor players throughout the season. Therefore, the aim of the current study was to quantify and describe the relationship between internal and external load variables across phases of the season. Specifically, we aimed to assess the relationship between sRPE and various external load measures collected via GPS technology. To do this, the study used analysis methods previously used to assess the underlying structure of relationships [10−13].

Subjects
Data were collected from 20 male professional youth soccer players (age 17.4 ± 1.3 yrs, height 178.0 ± 8.1 cm, mass 71.8 ± 7.2 kg). All data were collected during the 2018/19 season. Data comprised players from multiple positions, but data from goalkeepers were removed. In accordance with previous research (Malone et al., 2015), data recorded from a small selection of non-representative training sessions were removed to limit the influence of outliers. Post-match top-ups, rehabilitation sessions, and non-pitch-based sessions such as resistance training were also excluded from the analysis. As the aim of this study was to compare different phases of the season, the winter break period was not included in the analyses.

Design
The present study employed a prospective design with data collection across a 47-week season with Scottish professional youth soccer players. The data collection periods comprised a 6-week pre-season and two competitive phases lasting 20 weeks (Comp1) and 19 weeks (Comp2), respectively. The competitive phases were split by a 2-week winter break. Subjective measures of training load were collected via RPE. Objective measures of training load were collected via commercially available GPS units. Data were collected for all training sessions and matches. Data collected and the retrospective nature of the data analysis conformed to the University of Glasgow research policies and were in accordance with the Declaration of Helsinki.

Methodology
RPE was collected, in isolation, approximately 30 minutes after each training session using a commonly utilised modified BORG-CR10 scale (Foster et al., 1995; Impellizzeri et al., 2004) that had been used extensively with players prior to the study. Each RPE score was multiplied by session duration to obtain subjective training load (sRPE) (Foster et al., 2001). Alongside this measurement of subjective training load, objective external training load was also collected. Players wore commercially available GPS units (Optimeye X4, Firmware version 7.27; Catapult Sports, Melbourne, Australia) previously used in research conducted in team sports (Jones et al., 2019; Weaving et al., 2017). The units include a GPS receiver and a triaxial accelerometer collecting data at 10 Hz and 100 Hz, respectively. Velocity and acceleration dwell times were set at 0.6 and 0.4 s, respectively. As per previous recommendations, each player wore the same device for each session (Scott et al., 2016). Following training or matches, data were downloaded and analysed via the Openfield software package (Software version 1.19, Catapult Sports). Average satellite count was 10.6 ± 1.7. The average horizontal dilution of precision (HDOP) was 0.8 ± 0.2. Variables selected to quantify external load were total distance (m), PlayerLoad (au), low-intensity running distance (m), high-speed running distance (m), sprinting distance (m), accelerations (>2 m·s⁻², count) and decelerations (> −2 m·s⁻², count).
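As a brief illustration of the session-RPE calculation described above, the sketch below multiplies a CR10 rating by session duration. The function name, the use of minutes as the duration unit, and the example values are assumptions made for illustration and are not taken from the study data.

```python
def session_rpe(rpe, duration_min):
    """Return session-RPE load (arbitrary units): CR10 rating x duration.

    Assumes duration is expressed in minutes (hypothetical helper).
    """
    if not 0 <= rpe <= 10:
        raise ValueError("CR10 ratings are expected to lie between 0 and 10")
    if duration_min < 0:
        raise ValueError("duration must be non-negative")
    return rpe * duration_min


# Hypothetical session: a rating of 6 given after a 75-minute session.
load = session_rpe(6, 75)
print(load)  # 450
```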

Statistical analysis
Following previously described procedures, we carried out a correlation analysis before performing principal component analysis (PCA) on each stage of season. Where data were missing, they were treated as missing at random and imputed using the MICE package in the R statistical environment (version 4.0.3; R Foundation for Statistical Computing, Vienna, Austria) (Buuren & Groothuis-Oudshoorn, 2010). Relationships between all load variables were quantified during each stage of season using Pearson's product moment correlation. Following this, data were prepared for PCA by first visually inspecting the correlation matrix to assess the factorability of the dataset (Tabachnick et al., 2007). The suitability of data was then assessed using the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and the Bartlett test of sphericity (Bartlett, 1954). KMO values (with the approximate chi-square from the Bartlett test) were 0.76 (5187.241), 0.84 (16,931.8), and 0.83 (16,078.5) for pre-season, Comp1 and Comp2, respectively. All tests of sphericity were significant (p < 0.001). A KMO value of 0.5 or above has previously been identified as suitable for performing PCA (Hair et al., 2006; Kaiser, 1960) and has been used in similar research (Weaving et al., 2017). PCA was carried out using the "prcomp" function of the R stats package (v3.6.2) (R Core Team, 2013) and the "principal" function of the psych package (v2.0.12) (Revelle & Revelle, 2015). Principal components with an eigenvalue ≥1.0 were retained for extraction (Kaiser, 1960). When two or more principal components were retained based on their eigenvalue, varimax rotation was performed. For each retained principal component, only the original load variables with a principal component loading of >0.7 were retained (Hair et al., 2006).
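The retention and rotation steps described above can be sketched as follows. This is a minimal NumPy re-expression run on synthetic data, not the R code used in the study: the function names are hypothetical, the data are simulated, and the varimax routine is a common SVD-based iteration without Kaiser normalisation, so its output may differ in detail from the R routines named above.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a (variables x components) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion; the SVD gives the nearest rotation.
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if criterion != 0.0 and new_criterion / criterion < 1 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation


def pca_kaiser(data, eigen_threshold=1.0, loading_cutoff=0.7):
    """PCA on the correlation matrix, keeping components with eigenvalue >= threshold."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

    n_keep = int((eigenvalues >= eigen_threshold).sum())
    # Component loadings: eigenvectors scaled by the square root of the eigenvalue.
    loadings = eigenvectors[:, :n_keep] * np.sqrt(eigenvalues[:n_keep])
    if n_keep >= 2:
        loadings = varimax(loadings)
    explained = eigenvalues[:n_keep].sum() / eigenvalues.sum()
    salient = np.abs(loadings) > loading_cutoff  # variables kept per component
    return eigenvalues, loadings, explained, salient


# Synthetic data only: five "volume-like" and three "intensity-like" variables.
rng = np.random.default_rng(0)
n_obs = 300
volume, intensity = rng.normal(size=n_obs), rng.normal(size=n_obs)
latents = [volume] * 5 + [intensity] * 3
data = np.column_stack([z + 0.3 * rng.normal(size=n_obs) for z in latents])

eigenvalues, loadings, explained, salient = pca_kaiser(data)
print("components retained:", loadings.shape[1])
print("cumulative variance explained:", round(float(explained), 3))
print("variables with |loading| > 0.7 per component:", salient.sum(axis=0))
```

Working from the correlation matrix means every variable contributes one unit of variance, which is why an eigenvalue of 1.0 is the natural reference point for the retention rule.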

Results
There were 3207 individual recordings included in the analysis, comprising 695 individual match-day (MD) recordings and 2512 individual training session recordings. Distributions of the mean loads during each phase of the season are presented in Table 1. Correlations, including 95% confidence intervals, for each phase of season are presented in Figure 1. Total distance, PlayerLoad and low-intensity running showed very-large correlations (r ≥ 0.77) across all phases of the season. High-speed running distance showed moderate to very-large correlations (0.39 ≤ r ≤ 0.70), whilst sprinting distance showed moderate correlations across the season (0.32 ≤ r ≤ 0.45). Finally, accelerations showed large correlations across all phases (r ≥ 0.52), whilst decelerations showed large to very-large correlations (0.54 ≤ r ≤ 0.75).
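Results of the PCA are presented in Tables 2 and 3. Two principal components were identified for pre-season whilst one component was identified for each competitive phase. Variance explained and loadings are presented for the pre-season phase following varimax rotation. The components explained 77.1% of the cumulative variance in pre-season, whilst the single components retained for Comp1 and Comp2 explained 73.3% and 74.3% of variance, respectively.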

Discussion
The primary finding of this study was the identification of multiple components during the pre-season period and, conversely, the identification of a single component within both competitive phases. This finding suggests that, in the pre-season phase, univariate assessments of load may be insufficient when characterising the load experienced by players (Weaving et al., 2017). Conversely, the identification of a single component with relatively similar loadings across all variables during both competitive phases suggests that load measures may be used interchangeably.
Previous research in professional rugby league (Weaving et al., 2014, 2017) and in professional soccer has reported that multiple measures are required to capture the variance across different training themes when expressed as training mode, or relative to match day. In each of these studies, two or more components were identified following PCA. To our knowledge, this is the first assessment of this relationship when considering the phase of the season. In the present study, PCA carried out on pre-season data produced two principal components that represented 77.11% of the cumulative variance and, following varimax rotation, the component loadings could be described as representative of either training volume or intensity. The highest rotated component loadings for component one were sRPE (0.85), total distance (0.9), PlayerLoad (0.91) and low-intensity running (0.94). For rotated component two, the highest loadings were high-speed running (0.79), sprinting (0.87) and acceleration (0.57). Studies in rugby league have shown that variables generally align based on categories of internal or external training load (Weaving et al., 2014, 2017). In the present study, we only included sRPE as a measure of subjective internal load. This may have influenced our findings; however, there does still seem to be some relationship between measures which may provide similar information regarding either volume or intensity of training or match play. Whilst our analysis produced multiple principal components when investigating the pre-season phase, we only identified one component when analysing both competitive phases. This would suggest that all load variables fit into one theoretical factor and could, theoretically, be used interchangeably (Weaving et al., 2014).
It is worth noting that this may be due to the method we selected for defining how many components would be retained for rotation. A recent review concerning the use of PCA in sport found that 62.2% of the studies analysed retained factors for rotation if they had an eigenvalue >1 (Rojas-Valverde et al., 2020). Other methods, such as visual analysis of an eigenvalue scree plot whereby the "elbow" of the data would be identified (Tabachnick et al., 2007), may have led to retention of two principal components for competitive phase data. Had we included a second factor in both analyses, the results would have been comparable to our presented pre-season data (Table 2). Retention of two factors for Comp1 would have resulted in two principal components that would have explained 84.6% of the variance. Rotated component loadings would also have corresponded with our pre-season findings. Factor loadings for the first rotated component would have been 0.88, 0.9, 0.88 and 0.94 for sRPE, total distance, PlayerLoad and low-intensity running, respectively. The second rotated component would again have been best represented by high-speed running (0.77), sprinting (0.93), accelerations (0.63) and additionally decelerations (0.61). Similarly, for Comp2, retention of two factors would have resulted in a cumulative variance explained of 84.4%. Rotated component loadings would also have been similar to pre-season findings. Component 1 would have been best represented by sRPE (0.88), total distance (0.91), PlayerLoad (0.92), and low-intensity running (0.94). Component 2 would again have been best represented by high-speed running (0.68) and sprinting (0.94). Interestingly, loadings for accelerations and decelerations were slightly lower than those for Comp1, with values of 0.47 and 0.58, respectively.
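The sensitivity to the retention threshold can be checked with simple arithmetic. In a correlation-matrix PCA each component explains its eigenvalue divided by the number of variables, and an orthogonal rotation leaves the cumulative variance of the retained components unchanged, so the percentages reported above imply the approximate size of the second eigenvalue. The sketch below assumes that eight variables entered each PCA (sRPE plus the seven GPS-derived measures) and works from rounded percentages, so the values are approximate; the function name is hypothetical.

```python
N_VARIABLES = 8  # assumption: sRPE plus seven GPS-derived measures

# Percent of variance explained as reported in the text:
# (one component retained, two components retained).
reported = {"Comp1": (73.3, 84.6), "Comp2": (74.3, 84.4)}


def implied_second_eigenvalue(one_pct, two_pct, n_variables=N_VARIABLES):
    """Eigenvalue implied for the second component by the cumulative variance figures."""
    return (two_pct - one_pct) / 100 * n_variables


implied = {phase: implied_second_eigenvalue(*pcts) for phase, pcts in reported.items()}
for phase, value in implied.items():
    print(f"{phase}: second eigenvalue of roughly {value:.2f}")
```

Under these assumptions both implied values fall a little below 1.0 (about 0.90 for Comp1 and 0.81 for Comp2), which is consistent with a second component being excluded by the eigenvalue ≥1.0 rule yet plausibly retained after inspection of a scree plot.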
Clearly the method selected by practitioners for retaining factors will affect results, with the most popular method currently used in practice being the Kaiser criterion (eigenvalue >1) (Tabachnick et al., 2007). The findings from the present study, alongside previous work, demonstrate that sRPE is representative of a measure of volume. Previous research has shown that both RPE and sRPE are significantly related to several external load and intensity measures (Gaudino et al., 2015; Marynowicz et al., 2020). When analysing youth soccer players, the strongest within-individual correlations between sRPE and various external load measures were found for duration (r = 0.767), distance (r = 0.699) and distance in acceleration (r = 0.696) (Marynowicz et al., 2020). Using generalized estimating equation (GEE) models, it was found that PlayerLoad, high-speed distance and distance in acceleration were the strongest contributory variables when estimating sRPE (Marynowicz et al., 2020). However, in our present study it is worth noting the strong component loadings of acceleration and deceleration within the first rotated component of each analysis, which may suggest that subjective perception of effort may also be strongly related to measures of acceleration and deceleration, but not high-speed running or sprinting.
The findings of the present study provide further evidence that sRPE appears to provide information regarding load volume, rather than intensity. Practitioners should consider this when using this measure to represent the load experienced by athletes. Whilst our analysis shows that this relationship is not consistent across stages of the season, this is likely due to the retention criterion applied. Therefore, practitioners should consider the stage of the season, and the physical goals of that phase, when assessing load measurements.
The findings of the present study should be interpreted in light of the following limitations. The categorisation method used in the present study comprised three levels for analysis and a logical comparison between a pre-season phase and two competitive phases. However, future analyses may wish to investigate shorter mesocycle periods within the competitive period, for example 6-week blocks, to provide a more in-depth comparison across the season. Additionally, the present study did not attempt to differentiate the structure of load variables across different categories of players. Further differentiation in terms of partitioning within- and between-player variance in structure, or potential differences across, for example, starters, non-starters, or fringe players, may also provide additional insight into the proposed relationships. Finally, the present study only included one subjective measure of internal load due to issues with player adherence to objective methods, such as heart-rate-based measures. Objective measures of internal load may provide useful insight regarding previously observed relationships between internal and external measures of load (Weaving et al., 2014).
This study provides further evidence that univariate measures may not be sufficient when measuring the load experienced by players and that this limitation may be influenced by factors such as the stage of the season. These results, alongside previous results, would suggest that factoring load based on measures of volume and intensity would be appropriate. Whilst analyses of both competitive phases of the season identified only one principal component, which would suggest that variables may be used interchangeably during this period, it is worth noting that the criteria selected for retaining factors play a key role in this process. As previously suggested, the dose-response relationship with changes in fitness, or injury occurrence, for these combined load measures should be a future aim of analyses.