An SVM fall recognition algorithm based on a gravity acceleration sensor

ABSTRACT To address the increasing health care needs of an ageing population, this paper proposes a method of detecting human movements using smartphones to decrease the risk of accidents in the elderly. The method uses a mobile phone with an embedded acceleration sensor to record human motion information, which is divided into daily activities (walking, running, going up stairs, going down stairs, and standing still) and falling down. The acquired motion data contain noise and interference, and thus a median filter is employed to de-noise and smooth them. Moreover, we extract representative multi-group features and analyse them by principal component analysis and singular value decomposition to reduce their dimensions. Through experimental comparisons with various classifiers, the support vector machine classifier is selected to classify the extracted features. The accuracy of fall detection reached 96.072%, which demonstrates the effectiveness of the proposed method.


Introduction
With the development of our society and the improvement of our living standards, fall detection, as a fundamental research topic in activity sensing, has attracted a great deal of attention from researchers in the past few years. One out of every three people over the age of 65 has fallen (Salva, Bolibar, Pera, & Arias, 2004), which seriously affects the physical and mental health of the elderly and their ability to care for themselves. If the elderly cannot get prompt help after falling, they will have to lie on the ground for a long time (Nouty, Fleury, & Rumeau, 2007). Falling is a serious threat to the health and safety of the elderly, and timely medical assistance will help reduce morbidity and mortality (Chen, Zhang, Feng, & Li, 2012).
In recent years, researchers have made important progress in fall detection. However, due to the complexity of human movements and the influence of other uncertain factors, the human body falls in different ways, which leads to false positives in detection. A fall generally results from interactions between many factors, and a study by Skelton et al. identified more than 400 factors that cause falls (Chaccour, Darazi, Hassani, & Andres, 2017). Tao et al. reviewed gait analysis using wearable systems and briefly studied the types and working principles of the sensors used in such systems (Tao, Liu, Zheng, & Feng, 2012). The principles and methods of fall detection were surveyed by Mubashir, Shao, and Seed (2013), who point out that the existing fall detection techniques can be divided into three categories. The first type of method is based on machine vision (Panahi & Ghods, 2018) (Khawandi, Ballit, & Daya, 2013), in which images are captured with the Microsoft Kinect® camera and processed by a detection algorithm to extract features. An SVM classifier is then used to distinguish falls from normal motion. Rougier et al. proposed a new method for detecting falls by analysing the deformation of the human body in a video sequence (Rougier, Meunier, St-Arnaud, & Rousseau, 2011). Agrawal et al. used real-time video surveillance to detect human fall events at home. They matched the human contours extracted from the video against a human template to determine whether the fallen object in the video was human (Agrawal, Tripathi, & Jalal, 2017). However, vision-based systems have many limitations, such as demanding environmental requirements, elevated system costs from complicated algorithm processing, and the potential to expose a user's personal privacy.
CONTACT Guilin Zhang zhangguilin@sdust.edu.cn
The second type of method is based on acoustic fall detection systems, which use principles that are similar to those of a stethoscope. The motion state of the human body is classified by capturing the sound waves reflected from the floor (Principi, Droghini, Squartini, Olivetti, & Piazza, 2016). However, the sound signal in such systems contains many interference signals, which decreases the recognition rate. The third type of method is based on wearable devices, which generally collect acceleration sensor signals from the human body and identify falls with a dedicated algorithm. The advantage of using a wearable sensor is that no additional equipment needs to be installed, so the area of operation is not limited by space. Ailisto and Makela (2005) first proposed a method using acceleration sensors to measure the acceleration data of the human body for gait recognition. Lee et al. proposed a vertical-velocity-based pre-collision fall detection method using wearable inertial sensors (Lee, Robinovitch, & Park, 2015). Harris et al. used wearable technology and machine learning algorithms to study fall recognition, including fall detection and fall direction recognition (Harris, True, Zhen, & Jin, 2017). Wu et al. developed a new fall detection system based on wearable devices. The fall is identified by an effective quaternion-based algorithm and a help request is automatically sent to the patient's location (Wu, Zhao, Zhao, & Zhong, 2015). However, this system has some shortcomings: when the user lies down or sits down suddenly, a false alarm may be raised. Wearable detection systems are convenient to carry and are not restricted by the environment, but because of the complexity of human behaviours, their recognition results can be inconsistent.
With the development of information technology, various mobile devices have rapidly emerged, and their performance and embedded sensors have been enhanced as well. These sensors can measure the real-time motion information of users, and this information can be used not only for predicting users' locations, but also for identifying users' behaviours (Li, Xie, Zhou, Gou, & Bie, 2016). Sensors embedded in smartphones are used to acquire data, which are analysed and used to design an algorithm for fall detection (Hakim, Huq, Shanta, & Ibrahim, 2017). Paul et al. focused on activity recognition using accelerometers embedded in smartphones (Paul & George, 2015). Rakhman et al. developed a fall detection system that detects the fall state by setting a threshold. However, this method is only suited to forward falls (Rakhman, Nugroho, Widyawan, & Kurnianingsih, 2014). Tolkiehn et al. used a 3D accelerometer and an air pressure sensor to detect the fall state and fall direction (Tolkiehn, Atallah, Lo, & Yang, 2011).
By using embedded sensors with computational ability, personal devices are able to detect physical activities. The advantage of this solution is that no additional devices need to be deployed. Thus, the designed system is simple and easy to use (Sun, Zhang, Li, Guo, & Li, 2010). In terms of data analysis, the main approaches are thresholding and machine learning. Harris et al. compared four algorithms, namely the support vector machine, random forest, logistic regression and k-nearest neighbours (K-NN), and they ultimately demonstrated the accuracy of their proposed fall recognition system. However, few features were selected in their experiments and the recognition accuracy was not high (Harris et al., 2017). Paul used a clustering K-NN classification algorithm, which is superior to K-NN in accuracy. However, this algorithm is susceptible to abnormal values, which ultimately leads to misjudgments (Paul & George, 2015). Khawandi et al. used a decision tree approach to classify each feature, but it could over-fit due to its multiple scans and data set types (Khawandi et al., 2013). In this paper, we process the data collected from the sensor and adopt an SVM to detect the fall state.

Data collection and feature extraction
The acceleration signal changes relatively smoothly during normal movement, whereas a sharp impact force occurs during a fall. Because the acceleration signal deviates depending on the position of the mobile phone, the phone is placed at the volunteer's waist, which is the centre of gravity of the human body, to better reflect the state of the body. Figure 1 presents photos of a fall taken every 0.2 s. It can be seen in the figure that the accelerations on the X, Y, and Z axes change considerably when the human body falls. Therefore, we use the gravity acceleration sensor signal to detect falls.

Data collection and processing
The collected sensor data include not only the human motion acceleration signals but also gravity acceleration signals. Both signals can be disturbed by noise during the motion. Therefore, it is necessary to de-noise and smooth the collected data. The median filter is a nonlinear smoothing technique whose basic principle is to replace the value of each point in a digital sequence with the median of all the points in its neighbourhood. The median filter suppresses impulse noise well and preserves the edges of the signal during filtering. Therefore, this paper uses a median filter to process the signal.
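The neighbourhood-median principle can be illustrated with a minimal Python sketch. The window length of 3, the edge handling (the window is simply shortened at the ends) and the sample values are assumptions made for illustration; the paper does not state the settings it used.

```python
from statistics import median


def median_filter(signal, window=3):
    """Replace each sample with the median of its neighbourhood."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    filtered = []
    for i in range(len(signal)):
        # Assumption: shorten the window at the edges instead of padding.
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        filtered.append(median(signal[lo:hi]))
    return filtered


# Hypothetical single-axis readings: one impulse (9.0) followed by a step edge.
raw = [1.0, 1.0, 9.0, 1.0, 1.0, 5.0, 5.0, 5.0]
smoothed = median_filter(raw, window=3)
print(smoothed)
```

In this toy signal the isolated impulse is removed while the step from 1.0 to 5.0 stays sharp, which is the edge-preserving behaviour described above.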

Feature extraction and feature selection
For each sample collected, a number of factors related to human motion recognition must be identified, each of which becomes a feature in this study. The quality of feature selection has a great impact on the classification results.
Feature extraction obtains representative features of a pattern by transforming the measured values, and it therefore plays an important role in pattern recognition. In a human motion pattern recognition system, the feature vector is extracted and selected from the time-domain acceleration signal, and the extraction process is relatively simple. A fall is a short and violent movement that occurs involuntarily. The resultant acceleration, defined in equation (1), is analysed as one of the features, and the remaining features are extracted from the uniaxial accelerations:
$$a = \sqrt{a_x^2 + a_y^2 + a_z^2} \qquad (1)$$
where $a_x$, $a_y$ and $a_z$ are the accelerations in the X, Y and Z directions, respectively. The resultant acceleration reflects the severity of the body movement and avoids the error caused by analysing human motion from a single axis.
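As a small worked example, the resultant acceleration can be computed as the Euclidean norm of the three axis readings, which is the reading of equation (1) assumed here. The value 9.8 m/s² for gravity and the sample readings are illustrative assumptions, not measurements from the experiment.

```python
import math


def resultant_acceleration(ax, ay, az):
    """Magnitude of the acceleration vector (assumed form of equation (1))."""
    return math.sqrt(ax * ax + ay * ay + az * az)


G = 9.8  # assumed gravity acceleration in m/s^2

# Hypothetical readings: phone at rest with Y vertical, then a sharp impact.
at_rest = resultant_acceleration(0.0, G, 0.0)
impact = resultant_acceleration(12.0, 16.0, 15.0)
print(at_rest, impact)
```

At rest the magnitude stays near G regardless of how the phone is oriented, which is why this feature is less sensitive to orientation than any single axis.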
The changes in time-domain signals during daily movements are obvious, while the frequency-domain signal changes are small. Therefore, only the time-domain features of the acceleration are extracted for the experiment. The most commonly used time-domain features are the mean (Wang, Yang, Chen, & Chen, 2005) (Ling & Intille, 2004) (Ravi, Dandekar, Mysore, & Littman, 2005), the variance (2005) (2004), the correlation between axes (2005) (2004) (2005), the skewness and the kurtosis. The acceleration signal is almost constant when a person stays still. Skewness measures the direction and degree of asymmetry of the acceleration distribution, and it can effectively distinguish a downward movement from other states of motion. Kurtosis measures how sharply the acceleration distribution peaks around its mean value, and it can distinguish running from other states. During a fall, the angle of inclination of the human body towards the ground changes greatly. Figure 1 shows that the Y axis of the mobile phone changes greatly in the vertical direction. In order to distinguish falls from daily activities, the rotation angle ∂ between the Y axis and the gravity acceleration is taken into consideration in this paper, and is defined as follows:
$$\partial = \arccos\left(\frac{a_y}{G}\right)$$
where $a_y$ is the acceleration on the Y axis and G is the gravity acceleration.
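The time-domain features listed above can be sketched in plain Python as follows. The population (divide-by-N) forms of the variance, skewness and kurtosis, the value 0.0 returned for a constant window, and the arc-cosine form of the rotation angle are assumptions made for illustration; the function names are hypothetical.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)  # population variance


def standardised_moment(xs, order):
    """Order 3 gives the skewness, order 4 gives the kurtosis."""
    m, var = mean(xs), variance(xs)
    if var == 0.0:
        return 0.0  # assumption: a constant window carries no shape information
    return sum((x - m) ** order for x in xs) / len(xs) / var ** (order / 2)


def correlation(xs, ys):
    """Pearson correlation between two axes of the same window."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    denom = math.sqrt(variance(xs) * variance(ys))
    return cov / denom if denom else 0.0


def rotation_angle(ay, g=9.8):
    """Angle in degrees between the Y axis and gravity (assumed arccos form)."""
    ratio = max(-1.0, min(1.0, ay / g))  # clamp against sensor overshoot
    return math.degrees(math.acos(ratio))


# Hypothetical window of single-axis readings.
window = [1.0, 2.0, 3.0, 4.0, 5.0]
skewness = standardised_moment(window, 3)
kurtosis = standardised_moment(window, 4)
print(mean(window), variance(window), skewness, kurtosis)
```

With the phone upright the angle is close to 0 degrees, and it approaches 90 degrees when the body lies horizontally, which is what makes it useful for separating falls from upright activities.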
As the number of features increases, the dimension of the feature space also expands, and irrelevant or redundant features may decrease the recognition rate. Principal component analysis (PCA) is a widely used statistical method for identifying patterns in high-dimensional datasets (Mastylo, 2016). To eliminate feature correlation and information redundancy, we used the PCA method to reduce the dimensions of the extracted features.
In the experiment, 21-dimensional time-domain features were extracted: the inclination angle, the resultant acceleration, and the mean, variance, correlation between axes, kurtosis and skewness of the X, Y and Z axes. The samples are denoted $X^{(1)}, X^{(2)}, \cdots, X^{(m)}$, and each sample is described by an $n$-dimensional feature vector $X^{(i)} = [x_1^{(i)}, x_2^{(i)}, \cdots, x_n^{(i)}]^T$. The process is as follows:
Step 1: Normalize the selected training sample features to obtain the processed sample data.
Step 2: Calculate the covariance matrix Cov of the sample features.
Step 3: Use the singular value decomposition algorithm to calculate the eigenvalues and eigenvectors of the covariance matrix (refer to the corresponding function in MATLAB):
[eigenvectors, eigenvalues] = eig(cov)
Step 4: Arrange the eigenvalues in descending order and calculate the cumulative contribution rate for dimensionality reduction (the threshold is set to 0.90).
The reduced feature matrix is denoted $X_k$. After PCA dimensionality reduction, the original 21-dimensional feature matrix is reduced to a 7-dimensional matrix. An effective classification method can substantially improve the system's ultimate recognition performance.
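Steps 1, 2 and 4 can be sketched in plain Python as below. The z-score form of the normalization, the divide-by-m covariance and the function names are assumptions made for illustration; the eigenvalues passed to `components_to_keep` are taken to come from any symmetric eigen-solver (Step 3), and the sample values are made up.

```python
def standardise(samples):
    """Step 1 (assumed z-score form): zero mean, unit variance per feature."""
    m, n = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / m for j in range(n)]
    stds = []
    for j in range(n):
        var = sum((row[j] - means[j]) ** 2 for row in samples) / m
        stds.append(var ** 0.5 if var > 0.0 else 1.0)  # guard constant features
    return [[(row[j] - means[j]) / stds[j] for j in range(n)] for row in samples]


def covariance_matrix(z):
    """Step 2: covariance of already centred data, as an n x n nested list."""
    m, n = len(z), len(z[0])
    return [[sum(row[a] * row[b] for row in z) / m for b in range(n)]
            for a in range(n)]


def components_to_keep(eigenvalues, threshold=0.90):
    """Step 4: smallest k whose cumulative contribution rate reaches threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)


# Hypothetical data: two perfectly correlated features.
cov = covariance_matrix(standardise([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]))
# Hypothetical eigenvalues: the first two explain 95% of the variance.
k = components_to_keep([6.0, 3.5, 0.3, 0.2], threshold=0.90)
print(cov, k)
```

The reduced matrix is then obtained by projecting the normalized samples onto the eigenvectors that belong to the k largest eigenvalues.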

Classification and recognition
Artificial neural networks and the SVM classifier have been widely used in the field of pattern recognition (Talele, Shirsat, Uplenchwar, & Tuckley, 2016) (Li, Pang, Liu, & Wang, 2017). The SVM classifier is generally adopted to solve classification and regression problems. Class labels 1-6 represent going up stairs, walking, going down stairs, running, standing and falling, respectively. Figure 2 is the flowchart of the fall detection system. The training set is modelled with neural networks (generalized regression neural networks and probabilistic neural networks) and with support vector machines.

Neural network
A generalized regression neural network (GRNN) has a summation layer, which removes the weight connections between the hidden layer and the output layer. There are two types of neurons in the summation layer. The first type arithmetically sums the outputs of all neurons in the pattern layer, with the connection weight between the pattern layer and this neuron set to 1. The second type computes a weighted sum of the outputs of all neurons in the pattern layer. The output Y of the network is obtained with the following calculation:
$$Y(X) = \frac{\sum_{i=1}^{n} Y_i \exp\left(-\frac{(X - X_i)^T (X - X_i)}{2\sigma^2}\right)}{\sum_{i=1}^{n} \exp\left(-\frac{(X - X_i)^T (X - X_i)}{2\sigma^2}\right)}$$
where X is the network input variable, $X_i$ is the learning sample corresponding to the ith neuron, $Y_i$ is the sample observation of the random variable y, n is the sample capacity, and σ is the smoothing factor.
The GRNN training process does not need to be iterated, and it is much faster than the back propagation (BP) neural network. In addition, the GRNN learning algorithm does not need to adjust the connection weights among neurons in the training process. Instead, the smoothing factors are changed to adjust the transfer function among units.
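Because a GRNN needs no iterative training, its prediction can be written directly from the stored samples. The sketch below assumes the standard Gaussian-weighted form of the GRNN output; the function name, the fallback for numerical underflow and the toy samples are illustrative assumptions.

```python
import math


def grnn_predict(x, train_x, train_y, sigma=0.5):
    """Kernel-weighted average of the training targets (standard GRNN form)."""
    weights = []
    for xi in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    denominator = sum(weights)  # arithmetic summation neuron
    numerator = sum(w * y for w, y in zip(weights, train_y))  # weighted summation neuron
    if denominator == 0.0:
        # Assumption: fall back to the plain mean if every weight underflows.
        return sum(train_y) / len(train_y)
    return numerator / denominator


# Hypothetical one-dimensional training samples.
train_x = [[0.0], [1.0]]
train_y = [0.0, 1.0]
midpoint = grnn_predict([0.5], train_x, train_y, sigma=0.5)
near_first = grnn_predict([0.0], train_x, train_y, sigma=0.1)
print(midpoint, near_first)
```

The only quantity to tune is the smoothing factor: a small value makes the prediction follow the nearest stored sample, while a large value pulls it towards the overall mean of the targets.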
Probabilistic neural networks (PNNs) have many advantages, such as a simple learning process, fast training, accurate classification and good fault tolerance. Essentially, a PNN is a supervised network classifier based on the Bayesian minimum risk criterion. The structures of probabilistic neural networks and generalized regression neural networks are similar. The equation for estimating the probability density function in the PNN model is formulated as follows:
$$f_{w_i}(\vec{x}) = \frac{1}{(2\pi)^{l/2} \sigma^{l} N_i} \sum_{k=1}^{N_i} \exp\left(-\frac{\lVert \vec{x} - \vec{x}_{ik} \rVert^2}{2\sigma^2}\right)$$
where $w_i$ is the class of the sample, $\vec{x}_{ik}$ is the kth training sample belonging to class $w_i$, l is the dimension of the sample vector, σ is the smoothing parameter and $N_i$ is the total number of training samples of class $w_i$. The only parameter that needs to be adjusted in the PNN model is σ, which can be set to half of the average distance between feature vectors in the same group. By performing several experiments, we found that it is not difficult to find the optimal value of σ in practice. There is no significant change in the misclassification ratio with a slight change in the value of σ.
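A PNN decision can be sketched as a per-class Gaussian (Parzen window) density estimate followed by picking the class with the largest density. The sketch assumes equal priors and equal misclassification costs, so the Bayesian minimum risk rule reduces to a maximum-density rule; the function names and sample vectors are hypothetical.

```python
import math


def class_density(x, class_samples, sigma):
    """Gaussian kernel density estimate of one class evaluated at x."""
    l = len(x)  # dimension of the sample vector
    norm = (2.0 * math.pi) ** (l / 2.0) * sigma ** l * len(class_samples)
    total = 0.0
    for sample in class_samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, sample))
        total += math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total / norm


def pnn_classify(x, training, sigma=0.5):
    """training maps a class label to its list of training vectors."""
    # Assumption: equal priors and costs, so the largest density wins.
    return max(training, key=lambda label: class_density(x, training[label], sigma))


# Hypothetical two-dimensional feature vectors for two classes.
training = {
    "daily": [[0.0, 0.0], [0.2, 0.1]],
    "fall": [[3.0, 3.0], [2.8, 3.1]],
}
first = pnn_classify([0.1, 0.0], training)
second = pnn_classify([2.9, 3.0], training)
print(first, second)
```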

Support vector machine
The SVM classifier was originally applied to solve two-class problems. Multi-class problems are generally divided into several two-class problems (He & Jin, 2008). SVMs provide good results in various pattern recognition areas and seem to be a good choice for human motion recognition (Ma, Zhang, Yang, Liu, & Chen, 2016). The given data contain m indicators ($x \in R^m$) and l training points. The two classes are separated by the optimal hyperplane shown as follows:
$$\omega \cdot x + b = 0$$
where ω is the normal vector of the hyperplane, and b is the constant term of the hyperplane. If the training set is inseparable in the linear space, the relaxation variable $\xi_i$ ($\xi_i \ge 0$) and the penalty parameter C (C > 0) are introduced for each training point $(x_i, y_i)$. Then the optimal objective function and the constraint conditions of the classification problem in the linearly non-separable case are formulated as follows:
$$\min_{\omega, b, \xi} \; \frac{1}{2}\lVert\omega\rVert^2 + C \sum_{i=1}^{l} \xi_i \quad \text{s.t.} \quad y_i(\omega \cdot x_i + b) \ge 1 - \xi_i, \; \xi_i \ge 0, \; i = 1, \cdots, l$$
For non-linear cases, the approach of SVM is to choose a kernel function that solves the problem of linear inseparability in the original space by mapping the data to a high-dimensional space. In this paper, the radial basis kernel function is used, which is formulated as follows:
$$K(x_i, x_j) = \exp\left(-g \lVert x_i - x_j \rVert^2\right)$$
where g is the kernel parameter.
We adopted cross-validation (CV) and a grid search to find the optimal parameters. Cross-validation is a statistical method for assessing the performance of a classifier. The basic idea of CV is to divide the raw data into two groups: one is the training set, and the other is the validation set. The training set is used to train the classifier, from which we obtain the optimal parameters of the model. Then, the model is validated with the validation set, and the classification accuracy is taken as the performance index for evaluating the classifier.
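The two formulas that carry the tunable parameters can be evaluated by hand in a few lines. The sketch assumes the radial basis kernel in the form exp(-g * squared distance), as used by common SVM libraries, and the standard soft-margin objective in which each slack is the amount by which a point violates the margin; the function names and numbers are illustrative.

```python
import math


def rbf_kernel(x, z, g):
    """Radial basis kernel, assumed form exp(-g * squared Euclidean distance)."""
    return math.exp(-g * sum((a - b) ** 2 for a, b in zip(x, z)))


def soft_margin_objective(w, b, points, labels, c):
    """Return the objective 0.5*||w||^2 + c*sum(slack) and the slack values."""
    slack = []
    for x, y in zip(points, labels):
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        slack.append(max(0.0, 1.0 - margin))  # zero when the margin is respected
    objective = 0.5 * sum(wi * wi for wi in w) + c * sum(slack)
    return objective, slack


# Hypothetical linear example: the second point lies inside the margin.
objective, slack = soft_margin_objective(
    w=[1.0], b=0.0, points=[[2.0], [0.5], [-1.0]], labels=[1, 1, -1], c=4.0
)
same_point = rbf_kernel([1.0, 2.0], [1.0, 2.0], g=8.0)
print(objective, slack, same_point)
```

A larger penalty parameter makes margin violations more expensive, and a larger g makes the kernel more local, which is why the two are tuned together.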
CV can effectively avoid over-fitting and under-fitting. Experiments show that the model obtained by training the SVM classifier with the parameters selected by CV is more effective than the model obtained by training the SVM classifier with randomly selected parameters. The grid search method is used to find the globally optimal parameters (c, g) to improve classification accuracy. If several (c, g) pairs achieve the highest accuracy, the pair with the smallest parameter c is chosen. If several values of g correspond to this c, the first (c, g) pair found in the search is selected.
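The selection rule can be written as a short loop. In this sketch `cv_accuracy` is a hypothetical callable that returns the cross-validation accuracy for one (c, g) pair, and the score table is invented purely to exercise the tie-breaking; neither comes from the experiments.

```python
def grid_search(cv_accuracy, c_values, g_values):
    """Return (accuracy, c, g) for the best pair under the tie-breaking rule."""
    best = None
    for c in sorted(c_values):  # ascending c, so ties keep the smaller c
        for g in g_values:
            acc = cv_accuracy(c, g)
            # Strict '>' keeps the first pair found when accuracies are equal.
            if best is None or acc > best[0]:
                best = (acc, c, g)
    return best


# Hypothetical cross-validation accuracies for a 3 x 2 grid.
hypothetical_scores = {
    (0.5, 0.25): 0.80, (0.5, 1.0): 0.85,
    (2.0, 0.25): 0.90, (2.0, 1.0): 0.88,
    (8.0, 0.25): 0.90, (8.0, 1.0): 0.89,
}
best = grid_search(lambda c, g: hypothetical_scores[(c, g)],
                   c_values=[8.0, 0.5, 2.0], g_values=[0.25, 1.0])
print(best)
```

Preferring the smallest c among equally accurate pairs favours a softer margin, which helps limit over-fitting.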

Experiment results
Based on many experiments, the best model prediction results for GRNN and PNN are shown in Figure 3. The best model prediction results for SVM are shown in Figure 4. From the comparisons, it can be seen that SVM is the best choice for classifying and identifying human motion states.
For predicting the daily movements of the human body, the category labels 1-5 represent going up stairs, walking, going down stairs, running and standing, respectively. Figure 5 shows the final prediction results, with an accuracy rate of 92%. The CV method yielded a value of 64 for the penalty parameter C and 128 for the parameter g. Figure 5 also shows a tendency to confuse going up stairs with going down stairs, because the variation trends of the resultant accelerations are similar in both cases. For fall detection, the daily movements of the human body are grouped into one category, and the state of falling forms the other category. The prediction results are shown in Figure 6. In this experiment, the best parameters are c = 4 and g = 8, and the classification accuracy is 96.072%.
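The regrouping of the six motion labels into two categories and the accuracy calculation can be sketched as follows. The label sequences are invented to show the effect of merging the daily activities and do not reproduce the experimental data; the helper names are hypothetical.

```python
FALL_LABEL = 6  # class labels 1-5 are daily activities, 6 is falling


def to_binary(labels):
    """Map the six motion labels to 1 (fall) or 0 (daily activity)."""
    return [1 if label == FALL_LABEL else 0 for label in labels]


def accuracy(true_labels, predicted_labels):
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labels: two stair confusions (1 vs 3) and one missed fall.
true_labels = [1, 2, 3, 4, 5, 6, 6, 3]
predicted = [3, 2, 1, 4, 5, 6, 2, 3]

six_class = accuracy(true_labels, predicted)
two_class = accuracy(to_binary(true_labels), to_binary(predicted))
print(six_class, two_class)
```

Confusions between going up and going down stairs no longer count as errors once both are merged into the daily-activity category, so only missed falls and false alarms lower the two-class accuracy.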
One of the challenges in fall detection is to distinguish falls in daily life from similar activities such as lying down, sitting down and squatting, which often lead to misjudgment.

Conclusion
In this paper, a fall detection algorithm based on the SVM classifier is proposed. The median filter is used to reduce the noise of the sensor signal. The representative features are extracted based on the acceleration of the sensor signal. The experimental results show that the accuracy of the SVM classifier is higher than that of neural networks and its prediction accuracy can reach 96.072%. Our future work will use multiple vector machines to reduce misjudgments and missed cases.
Talele, K., Shirsat, A., Uplenchwar, T., & Tuckley, K. (2016).