Differentiating patients with radiculopathy from chronic low back pain patients by single surface EMG parameter

ABSTRACT The classification potential of surface electromyographic (EMG) parameters needs to be explored beyond the classification of subjects into low back pain subjects and control subjects. In this paper, a classification model based on a single surface EMG parameter is introduced to differentiate low back pain patients with radiculopathy from chronic low back pain (CLBP) patients and control subjects. A variant of the Roman chair was used to perform static contractions, in which the subject's own upper body weight was used to induce muscle fatigue in the low back muscles. Surface EMG signals were recorded over the paraspinal muscles at the L1–L2 and L4–L5 interspace levels. As a descriptor of spectral changes, the median frequency of the power spectrum (MDF) was estimated using the Hilbert–Huang transform. Student's t-test showed that the regression line slope of the median frequency differs significantly (p < 0.05) only between low back pain patients with radiculopathy and the other two groups. There was no significant difference between CLBP patients and control subjects. The best overall accuracy achieved by the implemented decision tree classification model was 86.8%. The results suggest that low back pain patients can be differentiated into subgroups depending on clinical symptoms.


Introduction
Differences in spectral variables of surface EMG (sEMG) signals recorded over low back muscles of subjects with low back pain (LBP) and those without (NLBP) have been thoroughly explored in the past. Only a smaller part of the research was directed to the development of sEMG-based classification models enabling differentiation between LBP and NLBP groups [1–4,6–8,13,19,26], and none of these studies investigated whether LBP patients could be further classified into homogeneous groups, such as low back pain with radiculopathy (LBPR). The trend of classifying only between LBP and NLBP groups continues in recent investigations exploring the classification potential of large-array surface electromyography [27][28][29].
The dominant classification methods were different types of discriminant analysis [1–4,6,7,19,26]. There is no apparent explanation for this dominance or for the occasional use of linear regression [8,13]. As noticed by Peach and McGill [8], the drawback of discriminant analysis is inconsistent selection of input parameters, which they attributed to overfitting of the data or to not using a holdout group. Overfitting leads to a classification model that performs well on the training data but generalizes poorly, and omitting a holdout group when evaluating the classification model does not provide objective insight into classification accuracy. There are other possible choices of classification method from a vast range of machine learning techniques [30][31][32]. Among them, decision trees have been used in medicine and health care applications for several decades [33], and they seem to represent the prevalent algorithm for classification in healthcare analytics [34]. Since decision trees have not been used for sEMG-based classification between LBP and NLBP, they were selected for implementation in this research.
The number of input variables used for classification varied from just a few [1,6] to a rather large number [4,8]. The power spectrum median frequency was used either in a simple form, such as the slope of the regression line of the median frequency over time (MDF Slope) [1,6], or as part of more complex variables, such as the proportion of recovery from the end of fatigue to the start of the repeat contraction [8]. In studies that used spectral parameters derived from the EMG signal, the power spectrum median frequency was inherently included in the classification models. To explore the possibility that a single sEMG variable could be employed for classification, MDF Slope was chosen as the classification variable in this work.
In the analysed studies, data sets were small and typically imbalanced [2–4,6–8,13], with the imbalance degree of the class distribution not exceeding 2.83. It is possible that rebalancing the data sets might improve the classification accuracy of the applied discriminant analysis, but only when large training data sets can be provided [35]. The imbalance degree also influences the performance of decision tree classification. The exact value of the imbalance degree at which the classification performance of a decision tree begins to deteriorate is not known, since other factors also influence classification performance [36,37]. A positive property of decision trees is their ability to perform well even when the data sets used for learning are imbalanced [36].
The LBP-NLBP classification studies have not kept pace with developments in signal processing. The fast Fourier transform (FFT) or even analogue signal processing methods were used for estimation of the median frequency [1–4,6–8,13,19,26]. These are all well-known classic signal processing methods. In 2009, Cifrek et al. [38] reviewed classic and modern signal processing methods with regard to their applicability to sEMG signals. Among the modern methods, the Hilbert-Huang transform (HHT) seems to be entering the biomedical engineering field slowly. One of the reasons is surely its computationally demanding algorithm. Following the initial research of our group, which indicated that HHT provides statistically more significant results than STFT-based analysis [39] of sEMG signals recorded over low back muscles during static contractions, we decided to apply HHT.
Based on previous studies, we hypothesized that low back pain patients with radiculopathy could be differentiated from chronic low back pain (CLBP) patients. It can be assumed that patients having radiculopathy with radicular pain develop asymmetrical functioning and a different fatigability pattern of the lower back muscles. In our opinion, such a phenomenon would be reflected in the myoelectrical signals and would enable differentiation from CLBP patients. For completeness, a set of subjects without low back pain (healthy subjects) was also included in the study. The specific goal was to explore whether differentiation could be achieved with decision tree classification based on only one surface EMG parameter. Such a result would simplify the classification methodology and thus improve the likelihood of its use in clinical applications.

Subjects
Seventy-six male volunteers were included in the study. Half of them formed a control group of healthy men without any history of low back pain in the past 5 years. The other half had a history of low back pain and was further divided into two groups: the first group consisted of 25 CLBP patients and the second group consisted of 13 patients with LBPR. The presence of CLBP was defined as daily or almost daily pain that had lasted at least 6 months prior to the measurements. Low back pain patients with radiculopathy had clinical symptoms of radiculopathy with radicular pain lasting at least 14 days. The exclusion criteria were spinal deformation, spinal injuries, spinal surgery, spondylolisthesis, spinal stenosis, osteoporosis, and any accompanying systemic disease.
The whole experiment was approved by the Ethics Committee of the University of Zagreb, Faculty of Electrical Engineering and Computing, and informed consent was received from each subject.

Data collection and signal conditioning
The system used to acquire surface EMG data over the paraspinal muscles was the FREEEMG system (BTS, Milano, Italy). It is a system with wireless EMG probes that allows free movement of subjects during measurement. Each probe has a pair of pre-gelled Ag-AgCl surface EMG electrodes of 10 mm diameter (Ambu Blue Sensor, Ballerup, Denmark).
The placement of the electrodes over the paraspinal muscles is illustrated in Figure 1. The first electrode pair is placed parallel to the direction of the muscle fibres of the m. erector spinae, 30 mm lateral from the spinous process at the L1-L2 interspace. The second electrode pair is placed on the m. erector spinae at the L4-L5 interspace and aligned parallel to the line between the posterior superior iliac spine and the L1-L2 interspace.
Prior to measurement, the electrode-skin impedance was measured to ensure that its value was less than 5 kΩ. The raw surface EMG signal was differentially amplified and bandpass filtered at 20-400 Hz. The differential amplifier input impedance was > 100 MΩ, and the common mode rejection ratio was > 100 dB at 65 Hz. The signal was sampled at 1200 Hz using a 12-bit A/D converter. The high-pass cutoff frequency of 20 Hz was chosen because it is recommended as the optimal choice for general use [40].

Testing procedure
Before the testing procedure began, each subject was familiarized with the procedure, the tilting device presented in Figure 2, and the instrumentation. The tilting device was a variant of a Roman chair.
The skin over the lower back muscles was shaved and cleaned with abrasive paste and alcohol. The subject was then asked to stand upright until the electrodes were positioned. While the subject was standing upright without footwear, the distance between the floor and the anterior superior iliac spine was measured. Depending on the measured distance, the standing pad of the tilting device was adjusted so that the toes, the back of the lower leg (above the Achilles tendon) and the pelvis (together with the upper thigh) were the only body parts in contact with the tilting device, thus forming the supporting points.
The subject was instructed to stand on the tilting device with hands crossed and palms placed on the chest. Upon the subject's verbal confirmation, medical staff gradually tilted the device until the horizontal position was reached. To ensure static contractions of the lower back muscles, each subject was asked to hold the horizontal position as steadily as possible. The weight of the subject's upper body was used to induce muscle fatigue. Upon the subject's verbal request, medical staff returned the subject to the upright position. Only subjects able to maintain the horizontal position for at least 45 s were included in the study.
As part of the testing procedure, all raw surface EMG signals were visually inspected for motion artifacts and improper amplifier gain. If any of the four signals was corrupted, the measurement was discarded entirely and the subject repeated the test after a 20 min rest interval.

Data analysis
The starting point of the contraction was defined as the moment when the smoothed and rectified surface EMG signal reached 90% of its maximum value during the contraction. If the duration of the signal exceeded 60 s, the signal was trimmed to 60 s to mitigate the potential influence of the length of the analysed interval [41].
MATLAB was used for all signal processing tasks following recording.
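The onset rule and the 60 s trimming described above can be sketched as follows. This is an illustration in Python, not the authors' MATLAB code. The moving-average smoother, its 0.5 s window and all function names are assumptions, since the text does not specify how the signal was smoothed.

```python
def rectify_and_smooth(signal, window):
    """Full-wave rectify, then apply a trailing moving average of `window` samples.

    The moving average is an assumed smoother; the paper does not name one.
    """
    rectified = [abs(x) for x in signal]
    smoothed = []
    running = 0.0
    for i, value in enumerate(rectified):
        running += value
        if i >= window:
            running -= rectified[i - window]  # drop the sample leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


def contraction_segment(signal, fs, window_s=0.5, onset_fraction=0.9,
                        max_duration_s=60.0):
    """Return (start, end) sample indices of the analysed part of a contraction.

    start: first sample where the envelope reaches `onset_fraction` of its maximum.
    end:   start plus at most `max_duration_s` seconds (signals longer than that are trimmed).
    """
    window = max(1, int(round(window_s * fs)))
    envelope = rectify_and_smooth(signal, window)
    threshold = onset_fraction * max(envelope)
    start = next(i for i, v in enumerate(envelope) if v >= threshold)
    end = min(len(signal), start + int(round(max_duration_s * fs)))
    return start, end


# Synthetic example: 2 s of rest followed by 100 s of activity, sampled at 10 Hz.
fs_demo = 10
demo_signal = [0.0] * 20 + [1.0 if i % 2 == 0 else -1.0 for i in range(1000)]
demo_start, demo_end = contraction_segment(demo_signal, fs_demo)
```

In the synthetic example the analysed segment spans exactly 60 s, because the activity lasts longer than the trimming limit.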

Spectral parameters
Time-frequency signal processing was performed with the Hilbert-Huang transform [42][43][44][45], which does not require the analysed signal to be stationary or quasi-stationary, unlike, for example, analysis based on the fast Fourier transform [8,38].
The median frequency was selected as the descriptor used to track spectral changes of the surface EMG. Linear regression was applied to the MDF time series in order to calculate MDF Slope, a known muscle fatigue index [38,46].
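A minimal sketch of the two steps named above, the median frequency of a power spectrum and the regression slope of the MDF time series, is given here in Python for illustration. It assumes that a spectrum estimate is already available for each time window. The Hilbert-Huang step itself is not reproduced, and the function names and sample values are hypothetical.

```python
def median_frequency(freqs, power):
    """Frequency at which the cumulative power first reaches half of the total power.

    `freqs` must be sorted in ascending order and aligned with `power`.
    """
    total = sum(power)
    cumulative = 0.0
    for f, p in zip(freqs, power):
        cumulative += p
        if cumulative >= total / 2.0:
            return f
    raise ValueError("empty spectrum")


def regression_slope(t, y):
    """Ordinary least-squares slope of y against t."""
    n = len(t)
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    sxy = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
    sxx = sum((ti - mean_t) ** 2 for ti in t)
    return sxy / sxx


# Hypothetical spectrum (Hz, arbitrary power units) for one time window.
demo_mdf = median_frequency([20, 40, 60, 80, 100], [1, 2, 4, 2, 1])

# Hypothetical MDF time series (s, Hz); a falling MDF gives a negative slope.
demo_slope = regression_slope([0, 10, 20, 30], [60, 58, 56, 54])
```

With the time axis in seconds and MDF in Hz, the slope is expressed in Hz/s. Whether the authors normalized the slope, for example to the initial MDF, is not stated in the text.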

Classification feature
To check whether each of the four muscle sites carries additional classification information, correlation coefficients for MDF Slope between muscle sites were calculated within each group (control group, CLBP and LBPR). Based on the correlation results, a classification feature containing the MDF Slope of all four muscle sites was constructed. Subsequently, Student's t-test was used to test for a significant difference between the groups.
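The two statistics used in this step can be sketched as follows, in Python for illustration only. The unpaired, pooled-variance form of Student's t-test is an assumption, as the text does not state which variant was applied, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes independent groups with equal variances.
    """
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((v - ma) ** 2 for v in a)
    ssb = sum((v - mb) ** 2 for v in b)
    df = na + nb - 2
    pooled_variance = (ssa + ssb) / df
    t = (ma - mb) / math.sqrt(pooled_variance * (1.0 / na + 1.0 / nb))
    return t, df
```

The p-value follows from the t distribution with the returned degrees of freedom. A statistics package would normally supply that step, which is omitted here to keep the sketch free of dependencies.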

Classification model
A decision tree was used for binary classification between groups, where the MDF Slope from all four muscle sites was used to segment the predictor space. To protect against overfitting, 10-fold cross validation was used, with the data set partitioned into ten folds. For each fold, a model was trained on the out-of-fold observations and its performance was then assessed on the in-fold data. Finally, the average test error over all folds was calculated.
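The cross-validation loop described above can be sketched as follows, in Python for illustration only. To keep the example short and free of dependencies, a single-split tree (a decision stump chosen by training error) stands in for the full decision tree. This is a simplification, not the authors' model, and all function names are hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_stump(X, y):
    """Stand-in for a tree: best single (feature, threshold) split for labels 0/1."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            for below in (0, 1):  # label assigned to rows with feature <= thr
                errors = sum((below if row[j] <= thr else 1 - below) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, thr, below)
    if best is None:
        raise ValueError("no feature offers a possible split")
    return best[1:]


def predict_stump(model, row):
    j, thr, below = model
    return below if row[j] <= thr else 1 - below


def cross_validated_error(X, y, k=10, seed=0):
    """Average test error over k folds; each model sees only out-of-fold data.

    Requires at least k observations so that no fold is empty.
    """
    fold_errors = []
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        model = fit_stump([X[i] for i in train], [y[i] for i in train])
        wrong = sum(predict_stump(model, X[i]) != y[i] for i in fold)
        fold_errors.append(wrong / len(fold))
    return sum(fold_errors) / len(fold_errors)
```

Each observation is used for testing exactly once and never by a model that was trained on it, which is the property that guards the accuracy estimate against overfitting.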
Two split criteria were used: the first was cross entropy (also known as maximum deviance reduction) and the second was the Gini diversity index [32,47]. Both criteria can be expected to generate similar results. In a theoretical comparison between the Gini diversity index and information gain, Raileanu and Stoffel [48] reported that these two split criteria disagree in only 2% of all cases. Information gain is a criterion that uses cross entropy as the impurity measure [49].
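For reference, the two impurity measures behind these split criteria can be written out as a short Python sketch. The base-2 logarithm is an assumption that only rescales the entropy, and the function names are hypothetical.

```python
import math


def gini(proportions):
    """Gini diversity index of a node: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in proportions)


def cross_entropy(proportions):
    """Cross entropy (deviance) of a node: -sum(p_i * log2(p_i)), skipping empty classes."""
    return -sum(p * math.log2(p) for p in proportions if p > 0)


def impurity_reduction(impurity, parent_counts, left_counts, right_counts):
    """Parent impurity minus the size-weighted impurity of the two child nodes."""
    def node(counts):
        n = sum(counts)
        return impurity([c / n for c in counts])

    n = sum(parent_counts)
    weighted = (sum(left_counts) * node(left_counts)
                + sum(right_counts) * node(right_counts)) / n
    return node(parent_counts) - weighted


# Root node holding the two patient groups of this study: 25 CLBP and 13 LBPR.
root_gini = gini([25 / 38, 13 / 38])
root_entropy = cross_entropy([25 / 38, 13 / 38])
```

A tree grown with either criterion picks, at each node, the split with the largest impurity reduction. The two measures rank candidate splits similarly, which is why close results are expected.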

Spectral parameters
The mean value and standard deviation of the MDF Slope for all three groups are presented in Table 1. The mean MDF Slope for LBPR patients is lower than that of CLBP patients and the control group at all measurement sites. This result indicates that differentiation between LBPR patients and the other two subject groups may be possible. Since the mean MDF Slope values for CLBP patients and the control group are rather close or even equal, differentiation between them based on the MDF Slope parameter is not expected to be possible.

Statistical significance
The absolute values of the correlation coefficients for all combinations of the MDF Slope parameter were below 0.9. Student's t-test showed that the MDF Slope parameter differs significantly (p < 0.05) between LBPR and the other two subject groups at all measurement sites, as shown in Table 2. This confirms that MDF Slope can be used to separate LBPR patients from the other two subject groups. For all muscle sites, there was no significant difference between CLBP and the control group, confirming that MDF Slope could not act as a classification feature between them in this experiment. Therefore, binary classification was performed only between LBPR and CLBP patients, and between LBPR patients and the control group.

Classification
The overall accuracy of decision tree classification between CLBP and LBPR is slightly better if the Gini diversity index is used as the split criterion: 86.8%. The situation is the opposite for classification between LBPR and the control group, where maximum deviance reduction gives a slightly better result of 82.4%. The overall accuracies (%) of classification between LBPR and CLBP patients, and between LBPR patients and the control group, are presented in Table 3 for both split criteria. For classification between LBPR and CLBP patients, the difference in overall accuracy arising from the choice of split criterion is 2.6%, and for classification between LBPR patients and the control group it is 2%. The classification results also indicate somewhat better overall accuracy for classification between CLBP and LBPR than for classification between LBPR and control subjects. The evaluation of the decision tree classification is given in the form of confusion matrices in Tables 4 and 5, which present absolute numbers and accuracy (%) for each combination of true class and predicted class. The results show that the differences in accuracy arising from the split criterion are not significant given the small number of samples: a small change in the absolute number of correctly classified subjects changes the accuracy by several percentage points.
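The sensitivity of the accuracy to single subjects can be checked with simple arithmetic, sketched here in Python. The group sizes (13 LBPR, 25 CLBP, 38 controls) come from the Subjects section. The counts of correctly classified subjects (33 and 42) are inferred from the reported percentages under the assumption that every subject was classified exactly once; they are not taken from Tables 4 and 5.

```python
def overall_accuracy(correct, total):
    """Overall accuracy in percent."""
    return 100.0 * correct / total


n_lbpr, n_clbp, n_control = 13, 25, 38

# Change in overall accuracy caused by one subject switching between correct and wrong.
step_lbpr_clbp = overall_accuracy(1, n_lbpr + n_clbp)        # 38 subjects in total
step_lbpr_control = overall_accuracy(1, n_lbpr + n_control)  # 51 subjects in total

# Inferred counts that reproduce the best reported accuracies (an assumption, see lead-in).
best_lbpr_clbp = overall_accuracy(33, n_lbpr + n_clbp)
best_lbpr_control = overall_accuracy(42, n_lbpr + n_control)
```

Under these assumptions, one subject corresponds to about 2.6 percentage points in the LBPR versus CLBP comparison and about 2.0 in the LBPR versus control comparison. The reported differences between split criteria are therefore consistent with a single subject being classified differently.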

Classifying LBPR from CLBP patients
In the present study, we found that low back pain patients with radiculopathy could be differentiated from control subjects and from CLBP patients. Compared with the results of others who addressed classification into LBP and NLBP groups, the achieved accuracy of decision tree classification between LBPR and CLBP, and between LBPR and control subjects, is within the range reported by others or even lower [8,26,50,51].

Split criteria
The difference in overall classification accuracy between the two split criteria in Table 3 is close to the 2% disagreement predicted theoretically by Raileanu and Stoffel [48]. For classification between LBPR and CLBP patients, the difference is higher than for classification between LBPR patients and the control group.

CLBP and control subjects do not differentiate
Unlike previously reported results on classification between low back pain patients and control subjects, this single-parameter study showed that it was not possible to distinguish between CLBP and control subjects. There are several possible reasons for this.
Previous works used classification methods such as logistic regression [13] or different types of discriminant analysis [6][7][8]26,[50][51][52], where a group of preselected surface EMG parameters was prescreened for multicollinearity by computing the correlation matrix in order to eliminate highly correlated parameters. Such a selection of parameters had to be done on a study-by-study basis, even though some authors reused a classification model without any changes [10]. Nevertheless, it is possible that an approach with a multiparameter feature space enables classification between CLBP and control subjects while a single parameter does not. It remains open whether decision tree classification with a multiparameter feature space might provide better classification results and make it possible to distinguish CLBP from control subjects.
The second reason for the inability to differentiate between the CLBP and control groups may be that the subject's own upper body weight was used to produce local muscle fatigue, instead of the most commonly used controlled percentage of MVC. Yoshitake and Moritani [53] conducted an experiment in which the subject was strapped in the prone position to a rigid table, creating conditions somewhat similar to those achieved with the tilting device (Figure 2). They reported that keeping the upper body in the prone position requires the paraspinal muscles to contract at 45% of MVC on average. The exact % of MVC at which the paraspinal muscles contract depends on the ratio of the upper body weight to the total back muscle strength (citation [7] according to Yoshitake and Moritani [53]). Several years later, a group of authors [21] found that the subject's upper body weight, with the body positioned for the Sorensen test, does not have an exact relationship to the percentage of MVC. For the majority of the subjects, it was between 40% and 60% MVC. In their study, subjects who fitted within these boundaries were included. None of the subjects were above the interval, and those below the interval were excluded. This leads us to the conclusion that the load of the upper body during the Sorensen test might be lower than 60% MVC. The relationship between MDF Slope and % MVC is nonlinear, and MDF Slope differentiates controls from low back pain patients only for specific electrode sites and at force levels at and above 60% MVC [1,13]. It is therefore possible that the failure to differentiate between the CLBP and control groups in this study partially originates from this fact.
The third possible reason is related to the selection of subjects. Zarakowska, as cited in Roy et al. [26], categorized subjects with low back pain into "avoider" and "confronter" groups based on their behavioural response to pain. The analysis showed that only the "avoider" group with low back pain could be accurately discriminated from the non-low back pain group. It was postulated that the "avoider" group tends to refrain from physical activity and as a result develops distinct muscle fatigability. In contrast, the "confronter" group shows no evidence of impairment, and its members were classified as indistinguishable from the controls. In our study, only the subjects able to maintain a static contraction for at least 45 s were kept in the study. Such a contraction duration could have resulted in selecting predominantly "confronter" type subjects. It remains to be checked whether a shorter duration could improve the ability to differentiate between CLBP and control subjects.
The fourth possible reason is physiological in nature. As presented by De Luca [54], the rate of blood flow in the muscle can affect the surface EMG spectral variables. During isometric contractions at high force levels, the internal pressure of the muscle remains reasonably constant and does not alter the rate of blood flow in the muscle as it does in dynamic contractions. Sustained contractions are isometric, which leaves the possibility that the dynamics of blood flow in the muscle act similarly in the CLBP and control groups and thus mask the effects in the surface EMG that would otherwise allow differentiation between these groups. Similarly, Peach and McGill [13] presented results in which a difference in MDF Slope between CLBP patients and control subjects also exists at 60% MVC for data recorded from the m. erector spinae at levels L3 and L5. Nevertheless, they reported that control subjects have a higher MDF Slope.

LBPR tend to fatigue slower
These opposite findings on how fast control subjects fatigue in comparison to chronic low back pain patients have one thing in common: a significant difference in MDF Slope between the groups. Our results, as shown in Table 1, do not support such findings. Instead, it is notable that LBPR patients tend to fatigue more slowly than CLBP patients and control subjects.

Time-frequency analysis for nonstationary surface EMG
In this study, we chose the Hilbert-Huang transform (HHT) to estimate the power spectrum of the surface EMG signal. It is a novel approach, still rarely used for surface EMG analysis, which has not previously been used for analysis of surface EMG of the lower back muscles. Since HHT enables analysis of nonstationary signals, there is no requirement for the surface EMG to be either stationary or quasi-stationary. This removes previous limitations on the analysis of dynamic contractions [26] and opens the possibility of improving classification accuracy. There are also other signal processing methods suitable for muscle fatigue evaluation in biomechanical applications that allow analysis of nonstationary signals [38].

Objectivity
Ten-fold cross validation was used in this work to protect against overfitting and to eliminate any subjective selection of subjects, which is a potential risk in classification methods that rely on holdout groups for accuracy validation [6,55].

Conclusion
We measured the surface EMG signal over the lower back muscles during static contractions and analysed the potential of the corresponding MDF Slope parameter to discriminate between three groups of subjects: CLBP patients, low back pain patients with radiculopathy and control subjects without low back pain. The typical classification in the scientific literature is only between low back pain patients and healthy subjects.
A significant difference in the MDF Slope parameter was present for all muscle sites, but only between low back pain patients with radiculopathy and the other two groups of subjects. We exploited this finding to design a decision tree-based classification model using MDF Slope as the classification feature. To protect against overfitting and ensure the objectivity of the model, ten-fold cross validation was used to partition the data set into folds and to train and test the model.
The overall accuracy of classification between low back pain patients with radiculopathy and CLBP patients was at best 86.8% (Gini diversity index as split criterion), which is higher than for classification against control subjects, where the overall accuracy was at best 82.4% (maximum deviance reduction as split criterion). The classification results show that MDF Slope-based decision tree classification can be explored further and used to contribute to differential electromyographic diagnostics of CLBP and LBPR.

Disclosure statement
No potential conflict of interest was reported by the authors.