Research on fault diagnosis of time-domain vibration signal based on convolutional neural networks

ABSTRACT To keep equipment operating safely, the health of its main components should be monitored in real time, and the demand for intelligent fault diagnosis algorithms has increased sharply. However, traditional intelligent diagnosis algorithms rely on manual signal feature extraction, which demands considerable expert experience and generalizes poorly. Convolutional neural networks, driven by big data, are among the most effective pattern classification algorithms at present. In this paper, a convolutional neural network is applied to fault diagnosis of time-domain vibration signals, taking the bearing as an example, and an intelligent bearing diagnosis method based on a convolutional neural network is proposed. The proposed method needs no manual feature extraction: it completes feature extraction and fault recognition automatically. The network has three convolutional layers. We apply data enhancement to the raw input signal and convert the one-dimensional time-domain vibration signal into a two-dimensional signal. On the CWRU bearing dataset, the recognition accuracy of the model is more than 96%.


Introduction
As modern electromechanical equipment becomes more widespread, its monitoring sensors produce massive amounts of data. With the growth in the number of sensors and in their sampling frequency, mechanical health monitoring has entered the era of big data. In this era, new technologies and methods are needed to mine features automatically from massive data, gradually replace expert feature extraction, monitor important parts such as bearings in real time, and ensure the accuracy and efficiency of fault diagnosis and prediction.
Bearing fault diagnosis is an active research direction in mechanical fault diagnosis. The core of a diagnosis algorithm lies in signal feature extraction and pattern classification. In the field of bearing fault diagnosis, common feature extraction algorithms include Fast Fourier Transform (Rai & Mohanty, 2007), Wavelet Transform (Lou & Loparo, 2004), Empirical Mode Decomposition (Yu, YuDejie, & Junsheng, 2006), statistical characteristics of signals (Samanta & Al-Balushi, 2003) and so on. Common pattern classification algorithms include Support Vector Machine (Konar & Chattopadhyay, 2011), BP neural network (Li, Chow, Tipsuwan, & Hung, 2000), Bayesian Classifier (Muralidharan & Sugumaran, 2012), Nearest Neighbour Classifier (Pandya, Upadhyay, & Harsha, 2013) and so on. Current research on bearing fault diagnosis falls into three categories: seeking better feature representations, searching for the most suitable combination of feature representations and classifiers, and inventing new sensors. In 2015, Jinane from the 11th French University proposed a new feature extraction method, Global Spectrum Analysis (Harmouche, Delpha, & Diallo, 2015), to analyse bearing vibration signals. In 2016, Christelle from Aachen University of Technology in Germany proposed using the current signal to diagnose bearing faults (Mbo'o & Hameyer, 2016). In the same year, Georgoulas proposed a vibration signal feature fusion method for bearing fault diagnosis (Georgoulas & Nikolakopoulos, 2016).
CONTACT Hongya Wang hywang@dhu.edu.cn
In recent years, convolutional neural networks have achieved great success in the field of pattern recognition (LeCun, Bengio, & Hinton, 2015). They can automatically mine features from signals and images, replacing the cumbersome feature engineering of traditional algorithms, which has made them stand out in the era of big data. In 2015, Xueqian Wang from Tsinghua University constructed a deep neural network using a stacked sparse autoencoder to diagnose the faults of rolling bearings (Junbo et al., 2015). In 2016, Janssens from Ghent University in Belgium used a convolutional neural network for the first time to diagnose the faults of bearings and gears in a gearbox (Janssens et al., 2016). In the same year, Ruqiang Yan from Southeast University used a deep neural network based on denoising sparse autoencoders to diagnose faults in bearings and motors (Sun et al., 2016). Chang-an Zhu of the University of Science and Technology of China built a Hierarchical Diagnosis Network (HDN) based on deep belief networks to diagnose bearing faults (Gan, Wang, & Zhu, 2016). Yaguo Lei from Xi'an Jiaotong University proposed a four-layer deep neural network for bearing fault diagnosis (Jia, Lei, Lin, Zhou, & Lu, 2016).
Convolutional neural networks have made major breakthroughs in two-dimensional image recognition and diagnosis. A convolutional neural network is an 'end-to-end' structure: a single network completes the whole process of feature extraction, feature dimension reduction and classification. This property makes up for the shortcomings of current fault diagnosis methods, and can provide new ideas for fault diagnosis algorithms and improve their diagnostic performance.
As a powerful feature extractor and classifier, a CNN (convolutional neural network) can learn the features most suitable for a classification task through training. In this paper, the bearing, one of the most important parts in machinery, is taken as the research object, and a convolutional neural network is applied to bearing fault diagnosis. The model completes feature extraction and fault identification automatically. The main work of this paper is as follows:
(1) The proposed method uses the raw time-domain vibration signal directly as the input of the CNN, which suggests that it can also be applied to intelligent diagnosis problems of other machine systems.
(2) Fault diagnosis algorithms require signal feature extraction. The proposed method is based on a convolutional neural network and is end-to-end: it needs no hand-crafted feature extraction and performs feature extraction and fault recognition automatically.
(3) The experimental results show that the proposed method can adaptively extract appropriate fault features from the raw bearing vibration data and classify the faults with high accuracy and stability. The recognition accuracy of the proposed method on the CWRU bearing database is more than 96%.

Convolutional neural network structure
A convolutional neural network is a multilayer neural network composed of convolutional layers, pooling layers, activation layers and fully connected layers. The filter stage, which consists of the convolutional and pooling layers, extracts features from the input. The fully connected layers then classify the learned features. The function of each type of layer is described below.

Convolution layer
The convolution layer uses convolutional kernels to perform a convolution operation on local regions of the input signal, and then generates the output features through the activation unit. Each filter applies the same kernel to every local region of the input, which is known as weight sharing, to extract local features. A filter outputs one frame on the next layer, and the number of frames is called the depth of this layer. We use $K_i^l$ to denote the weights of the i-th filter kernel in layer l, and $X^{l(r_j)}$ to denote the j-th local region in convolutional layer l. The convolution process is therefore described as follows:

$$y^{l(i,j)} = K_i^{l} \cdot X^{l(r_j)} = \sum_{j'=1}^{W} K_i^{l(j')} \, X^{l(r_j)(j')} \tag{1}$$

where $y^{l(i,j)}$ is the output of the i-th kernel for the j-th local region, $K_i^{l(j')}$ is the j'-th weight of the i-th convolution kernel in layer l, $X^{l(r_j)}$ is the j-th local region convolved in layer l, with $X^{l(r_j)(j')}$ its j'-th element, and W is the width of the convolution kernel.
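The weight sharing described above can be illustrated with a minimal one-dimensional sketch in plain Python. It assumes a stride of 1, no padding and no bias term, none of which is stated in the text, and the names `conv1d_valid` and `conv_layer` are hypothetical. The network in this paper works on two-dimensional inputs, so this only mirrors the one-dimensional notation of this section.

```python
def conv1d_valid(signal, kernel):
    """Apply one shared kernel to every local region of the signal.

    Assumptions for illustration: stride 1, no padding, no bias.
    """
    width = len(kernel)
    n_regions = len(signal) - width + 1
    return [
        # Dot product of the kernel with the j-th local region.
        sum(kernel[k] * signal[j + k] for k in range(width))
        for j in range(n_regions)
    ]


def conv_layer(signal, kernels):
    """Each kernel yields one output frame; len(kernels) is the layer depth."""
    return [conv1d_valid(signal, kernel) for kernel in kernels]


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernels = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
frames = conv_layer(signal, kernels)
print(frames)
```

With two kernels of width 3 and a signal of length 5, the layer produces two frames of three values each.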

Pooling layer
It is common to add a pooling layer after a convolutional layer in a CNN. The pooling layer acts as a downsampling operation that reduces the spatial size of the features and the number of parameters of the network; it also makes the features robust to small shifts and deformations. The commonly used pooling functions are average-pooling and max-pooling. The most common choice is max-pooling, which takes the maximum value in the pooling region as the output, to reduce the parameters and obtain location-invariant features. The max-pooling transformation is described as follows:

$$p^{l(i,j)} = \max_{(j-1)W+1 \le t \le jW} \left\{ a^{l(i,t)} \right\} \tag{2}$$

where $a^{l(i,t)}$ is the activation value of the t-th neuron in the i-th frame of layer l, W is the width of the pooling region, and $p^{l(i,j)}$ denotes the corresponding output value of the pooling operation in layer l.
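As a small illustration of max-pooling and its tolerance to small shifts, the following sketch pools one frame with non-overlapping regions. The function name `max_pool1d` is hypothetical, and dropping trailing values that do not fill a complete region is an assumption made here for simplicity.

```python
def max_pool1d(frame, width):
    """Non-overlapping max-pooling over regions of the given width.

    Assumption: values that do not fill a complete region are dropped.
    """
    n_out = len(frame) // width
    return [max(frame[j * width:(j + 1) * width]) for j in range(n_out)]


frame = [0.1, 0.9, 0.3, 0.4, 0.8, 0.2]
# Same values, but swapped inside each pooling region.
shifted = [0.9, 0.1, 0.4, 0.3, 0.2, 0.8]

pooled = max_pool1d(frame, 2)
pooled_shifted = max_pool1d(shifted, 2)
print(pooled, pooled_shifted)
```

Both inputs give the same pooled output, because a value that moves within its pooling region does not change the maximum of that region.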

Activation layer
The activation function is applied after the convolution operation. Its purpose is to obtain a nonlinear expression of the input signal, which enhances the representation ability and makes the learned features easier to separate. Activation functions commonly used in neural networks include Sigmoid, Tanh and ReLU (Rectified Linear Unit). In recent years, ReLU has been widely used as the activation unit to accelerate the convergence of CNNs. ReLU makes the weights in the shallow layers more trainable when back-propagation is used to adjust the parameters. The formula of ReLU is described as follows:

$$a^{l(i,j)} = f\left(y^{l(i,j)}\right) = \max\left\{0,\; y^{l(i,j)}\right\} \tag{3}$$

where $a^{l(i,j)}$ is the activation value of the output $y^{l(i,j)}$ of the convolution layer.

Fully connected layer
The fully connected (FC) layer classifies the features extracted by the convolution layers. Specifically, the output of the last pooling layer is first flattened into a one-dimensional feature vector, which serves as the input of the FC layer. The input and output are then fully connected. The activation function used by the hidden layer is ReLU, and the activation function used by the output layer is Softmax. The purpose of the Softmax function is to convert the input neurons into a probability distribution that sums to 1, which is convenient for building the subsequent multi-class objective function. The forward propagation formula of the FC layer is shown as follows:

$$z^{l+1(j)} = \sum_{i} W_{ij}^{l} \, a^{l(i)} + b_j^{l} \tag{4}$$

where $W_{ij}^{l}$ is the weight between the i-th neuron in layer l and the j-th neuron in layer l + 1, $a^{l(i)}$ is the activation value of the i-th neuron in layer l, $z^{l+1(j)}$ is the logits value of the j-th output neuron in layer l + 1, and $b_j^{l}$ is the bias from the neurons in layer l to the j-th neuron in layer l + 1.
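The forward pass through one ReLU hidden layer and a Softmax output layer can be sketched as follows. The layer sizes and weight values are arbitrary illustrative numbers, not the parameters of the network in this paper, and the helper names are hypothetical. Subtracting the largest logit before exponentiation is a common numerical-stability step that does not change the result.

```python
import math


def relu(values):
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """z_j = sum_i W[i][j] * a_i + b_j for every output neuron j."""
    return [
        sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + biases[j]
        for j in range(len(biases))
    ]


def softmax(logits):
    shift = max(logits)  # stability shift; the probabilities are unchanged
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative sizes: 3 flattened features -> 2 hidden neurons -> 3 classes.
features = [1.0, 2.0, 0.5]
w_hidden = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.6]]
b_hidden = [0.0, 0.1]
w_out = [[1.0, -1.0, 0.5], [0.3, 0.8, -0.2]]
b_out = [0.0, 0.0, 0.0]

hidden = relu(dense(features, w_hidden, b_hidden))
probs = softmax(dense(hidden, w_out, b_out))
print(hidden, probs)
```

The Softmax outputs are positive and sum to 1, so they can be read as class probabilities.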

Objective function
In a convolutional neural network, the output obtained by forward propagation needs to be compared with the true label to measure the gap between them. The size of this gap is defined by the objective function, also known as the loss function. Common objective functions are the squared loss function and the cross-entropy loss function. In machine learning, the cross-entropy loss is often regarded as the negative log-likelihood of the Softmax distribution, which matches the Softmax output layer used here. Therefore, the cross-entropy loss function is adopted as the objective function in this paper, as shown in Equation (5):

$$L = -\frac{1}{m} \sum_{k=1}^{m} \sum_{j} p_k^{j} \log q_k^{j} \tag{5}$$

where m is the size of the input mini-batch, $p_k^{j}$ is the target (label) probability of class j for the k-th sample, and $q_k^{j}$ is the corresponding Softmax output.
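A minimal sketch of the mini-batch cross-entropy loss follows, assuming one-hot target vectors and predictions that are already Softmax probabilities. The function name `cross_entropy` and the small constant `eps`, which guards against taking the logarithm of zero, are illustrative choices rather than part of the method described here.

```python
import math


def cross_entropy(targets, predictions, eps=1e-12):
    """Mean cross-entropy over a mini-batch of m samples.

    targets: one-hot label vectors (assumed); predictions: Softmax outputs.
    eps is an illustrative guard against log(0).
    """
    m = len(targets)
    total = 0.0
    for p, q in zip(targets, predictions):
        total -= sum(pj * math.log(qj + eps) for pj, qj in zip(p, q))
    return total / m


targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
confident = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]]
uncertain = [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3]]

loss_confident = cross_entropy(targets, confident)
loss_uncertain = cross_entropy(targets, uncertain)
print(loss_confident, loss_uncertain)
```

Predictions that put more probability on the correct class give a smaller loss, which is the behaviour the optimizer exploits.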

CNN back propagation
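Loss back propagation is the key step of weight optimization for the neural network. The main approach is to first solve the derivative of the objective function with respect to the last layer of neurons, and then calculate the derivative of the objective function with respect to all the weights layer by layer, from back to front, through the chain rule. It mainly includes fully connected (FC) layer back propagation and convolution layer back propagation.
Back propagation through the FC layer. First, calculate the derivative of the objective function L with respect to the last logits value $z^{l+1(j)}$, as shown in Equation (6):

$$\frac{\partial L}{\partial z^{l+1(j)}} = q^{j} - p^{j} \tag{6}$$

where $q^{j}$ is the Softmax output for class j and $p^{j}$ is the corresponding one-hot target value, written here for a single sample; over a mini-batch the result is averaged over the m samples. Then calculate the derivatives of the objective function L with respect to the weight $W_{ij}^{l}$ and the bias $b_j^{l}$ of the FC layer, as shown in Equations (7) and (8):

$$\frac{\partial L}{\partial W_{ij}^{l}} = \frac{\partial L}{\partial z^{l+1(j)}} \, a^{l(i)} \tag{7}$$

$$\frac{\partial L}{\partial b_j^{l}} = \frac{\partial L}{\partial z^{l+1(j)}} \tag{8}$$

Finally, calculate the derivatives of the objective function L with respect to the activation value $a^{l(i)}$ and the logits value $z^{l(i)}$ of the FC hidden layer with the activation function ReLU, as shown in Equations (9) and (10):

$$\frac{\partial L}{\partial a^{l(i)}} = \sum_{j} \frac{\partial L}{\partial z^{l+1(j)}} \, W_{ij}^{l} \tag{9}$$

$$\frac{\partial L}{\partial z^{l(i)}} = \begin{cases} \dfrac{\partial L}{\partial a^{l(i)}}, & z^{l(i)} > 0 \\ 0, & z^{l(i)} \le 0 \end{cases} \tag{10}$$

After the value of $\partial L / \partial z^{l(i)}$ is calculated by Equation (10), the derivatives of the objective function L with respect to the weight $W_{ij}^{l-1}$ and the bias $b_j^{l-1}$ of the FC hidden layer can be solved according to Equations (7) and (8). Back propagation through the convolution layer is similar to that of the FC layer, so it is not described here.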
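The first step of this derivation, the derivative of the cross-entropy loss with respect to the last logits, can be checked numerically. The sketch below assumes a single sample with a one-hot target, for which the analytic derivative is the Softmax output minus the target, and compares it with a central finite difference. All names and values are illustrative.

```python
import math


def softmax(logits):
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def loss(logits, target):
    """Cross-entropy of one sample with a one-hot target."""
    q = softmax(logits)
    return -sum(pj * math.log(qj) for pj, qj in zip(target, q))


def analytic_grad(logits, target):
    """dL/dz_j = q_j - p_j for Softmax followed by cross-entropy."""
    return [qj - pj for qj, pj in zip(softmax(logits), target)]


def numeric_grad(logits, target, h=1e-6):
    """Central finite differences, used only as an independent check."""
    grads = []
    for j in range(len(logits)):
        up, down = list(logits), list(logits)
        up[j] += h
        down[j] -= h
        grads.append((loss(up, target) - loss(down, target)) / (2 * h))
    return grads


logits = [2.0, -1.0, 0.5]
target = [0.0, 0.0, 1.0]
analytic = analytic_grad(logits, target)
numeric = numeric_grad(logits, target)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
print(analytic, max_diff)
```

The two gradients agree to within the accuracy of the finite-difference approximation, which supports using the simple difference form as the starting point of the layer-by-layer chain rule.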

The bearing fault diagnosis model based on CNN
There are three steps in our CNN diagnosis model: data preparation, model training, and model testing, as shown in Figure 1.
The CWRU data preprocessing process is illustrated with the following data as an example: 48k Drive End Bearing Fault Data (48k_Drive_End_B007_2). The specific process is shown in Figure 2. The converted two-dimensional data is suitable as the input sample of the convolutional neural network.
In the field of computer vision, a dataset can be enhanced by image mirroring, rotation, translation, cropping, and contrast adjustment. In the field of fault diagnosis, however, there is no established dataset enhancement technique, and the training sets of some diagnosis algorithms are very small, which easily causes overfitting. Because a one-dimensional fault diagnosis signal has its own temporal order and periodicity, image dataset enhancement techniques are not fully applicable to it. The data enhancement method proposed in this paper is overlapping sampling.
For a given raw accelerometer signal, we use a sliding window of size 512 with a shift step of 200 to scan the signal and generate the raw data samples. Any two consecutive samples therefore overlap by 512 - 200 = 312 data points.
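Overlapping sampling can be sketched in a few lines. The window size and step are the values given above; the function name `overlapping_windows` and the 2000-point toy signal are illustrative, and discarding a final incomplete window is an assumption.

```python
def overlapping_windows(signal, window=512, step=200):
    """Cut a 1-D signal into overlapping samples with a sliding window.

    Consecutive samples share (window - step) points. A trailing part
    shorter than one window is discarded (an assumption for illustration).
    """
    last_start = len(signal) - window
    return [signal[s:s + window] for s in range(0, last_start + 1, step)]


# Toy signal: the index stands in for the vibration amplitude.
signal = list(range(2000))
samples = overlapping_windows(signal)
non_overlapping = overlapping_windows(signal, window=512, step=512)
print(len(samples), len(non_overlapping))
```

On this toy signal the overlapping scheme yields more samples than cutting the same signal into disjoint windows, which is the point of the enhancement.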

The CNN structure for bearing fault diagnosis
At present, commonly used convolutional neural networks, such as VGGnet (Simonyan & Zisserman, 2014), ResNet, and Google's Inception V4 (Szegedy et al., 2016), all contain stacked 3 × 3 convolution kernels. Stacking small kernels not only deepens the network but also achieves a larger receptive field with fewer parameters, thus suppressing overfitting. However, network structures designed for the vision field are not directly suitable for bearing fault diagnosis.
The structure of the convolutional neural network proposed in this paper is shown in Figure 3. The network consists of three convolutional layers (conv1, conv2, conv3), two pooling layers (pool1, pool3), two fully connected hidden layers (hidden4, hidden5), and one Softmax layer. The diagnostic signal is transformed into a set of feature maps by the first convolution layer and the ReLU activation layer, and max-pooling is then used for down-sampling. These operations are repeated; the feature map of the last pooling layer is then connected to the FC hidden layers and, after ReLU activation, passed to the final Softmax layer.
The CWRU dataset is one-dimensional time-domain data, which is converted into 30 × 40 two-dimensional data for easy processing in the CNN. The structural parameters of the convolutional neural network are shown in Table 1. The model has three convolutional layers and two pooling layers: the convolution kernel of the first layer is 3 × 3, the kernels of the second and third layers are 2 × 2, and both pooling layers are 2 × 2. Each of the two fully connected layers has 500 neurons. The Softmax layer has 10 outputs for the 10 bearing states.
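The conversion of a one-dimensional segment into a 30 × 40 array can be sketched as below. The text does not state how the points are arranged, so the row-by-row (row-major) filling used here is an assumption, and the function name `to_2d` is hypothetical.

```python
def to_2d(sample, rows=30, cols=40):
    """Reshape a 1-D segment of rows * cols points into a 2-D grid.

    Assumption: points are filled row by row (row-major order).
    """
    if len(sample) != rows * cols:
        raise ValueError("expected %d points, got %d" % (rows * cols, len(sample)))
    return [sample[r * cols:(r + 1) * cols] for r in range(rows)]


# Toy segment: the index stands in for the vibration amplitude.
segment = list(range(1200))
grid = to_2d(segment)
print(len(grid), len(grid[0]))
```

Under this arrangement each row holds 40 consecutive time-domain points, so neighbouring columns are adjacent in time and neighbouring rows are 40 points apart.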
Adam optimization algorithm: For a shallow neural network, the SGD used in BP neural networks is often sufficient to reach a good optimum. However, for the deep convolutional neural network proposed in this paper, which has a large number of parameters and hyperparameters, SGD training often falls into a local optimum if the hyperparameters are poorly chosen. Therefore, the Adam (adaptive moments) algorithm is adopted in this paper (Kingma & Ba, 2014). Adam is an adaptive learning rate optimization algorithm that dynamically adjusts the learning rate of each parameter by using first-moment and second-moment estimates of the gradient.
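The role of the two moment estimates can be seen in a single-parameter sketch of the Adam update. The default values of `beta1`, `beta2` and `eps` are those suggested by Kingma & Ba; the learning rates, the function name `adam_step` and the toy objective f(x) = (x - 3)^2 are illustrative and are not the settings used in the experiments of this paper.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter at step t (t >= 1)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Toy problem: minimise f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x, m, v = 0.0, 0.0, 0.0
first_x = None
for t in range(1, 2001):
    grad = 2.0 * (x - 3.0)
    x, m, v = adam_step(x, grad, m, v, t, lr=0.01)
    if t == 1:
        first_x = x
print(first_x, x)
```

Because the update divides the first moment by the square root of the second moment, the size of the first step is close to the learning rate regardless of the gradient magnitude.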
Dropout algorithm: Dropout was proposed by Srivastava of the University of Toronto (Srivastava et al., 2014) to suppress overfitting of neural networks and improve their generalization performance. During training, Dropout sets the output of each neuron to zero with probability p, so as to prevent co-adaptation between neurons in the same layer. Dropout therefore makes the features expressed by individual neurons more independent and can improve the expressive ability of the network. In general, Dropout is applied to the fully connected layers, because they hold the largest proportion of the parameters in the whole convolutional neural network.
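A minimal sketch of Dropout on a vector of activations follows. It uses the "inverted" variant, which rescales the surviving activations by 1 / (1 - p) during training so that nothing needs to change at test time; this variant is an assumption made for illustration, since the original formulation instead scales the weights at test time. The function name `dropout` and the value p = 0.5 are also illustrative.

```python
import random


def dropout(activations, p, rng, training=True):
    """Inverted dropout: zero each value with probability p (0 <= p < 1).

    Surviving values are divided by (1 - p) so the expected activation
    is unchanged. At test time the input is returned as is.
    """
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 10000
dropped = dropout(activations, p=0.5, rng=rng)

zero_fraction = dropped.count(0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)
print(zero_fraction, mean_after)
```

About half of the activations are set to zero, while the mean activation stays close to its original value because of the rescaling.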

Experiment settings
The experimental data in this paper are from the Rolling Bearing Data Center of Case Western Reserve University (CWRU) (Csegroups). The CWRU dataset is a standard bearing fault diagnosis dataset that is recognized worldwide. The bearing data acquisition system is shown in Figure 4. The experimental platform consists of a 2 hp motor (left), a torque sensor (middle), a dynamometer (right) and an electronic control unit. Single-point faults were introduced into the bearings by electro-discharge machining (EDM). Three defect positions were diagnosed, namely rolling element damage, outer ring damage and inner ring damage, with fault diameters of 0.007, 0.014, and 0.021 in. The bearings are SKF bearings. Vibration signals of the drive end bearings were collected by accelerometers. The digital signals were sampled at 12,000 Hz, and the drive end bearing fault data were also collected at a sampling rate of 48,000 Hz.
Figure 5 shows drive end signals sampled at 48 kHz: from top to bottom, the rolling element fault, inner ring fault, outer ring fault and normal signals.

Experimental implementation
The test object is the drive end bearing. The bearing to be diagnosed is the deep groove ball bearing SKF6205, and the sampling frequency is 48 kHz. There are three defect positions, namely ball, inner race and outer race, and three fault diameters, 0.007, 0.014, and 0.021 in., giving nine damage states in total. In the experiment, 2000 data points were used for each sample.
A total of four datasets were prepared for the experiment, as shown in Table 2. Dataset 1, Dataset 2, and Dataset 3 were collected under loads of 1, 2, and 3 hp, respectively. Each dataset contains 2400 training samples and 480 test samples, and there is no overlap between the training samples and the test samples. Dataset 4 is the union of Dataset 1, Dataset 2, and Dataset 3; it covers all three load states, with a total of 7200 training samples and 1440 test samples.
Figure 6. Test accuracy of convolution network on four datasets.

Experimental results
The deep-learning framework used in the experiment was Lasagne. Dropout is applied to the third and fourth layers of the network to suppress overfitting. The specific parameters are set as follows:
The test accuracy on the four datasets is shown in Figure 6. It can be seen that the recognition rate of the convolutional neural network reaches 98% and is no lower than 96% on any dataset. The experimental results show that the convolutional neural network is feasible and achieves a high recognition rate when applied to time-domain vibration signals.

Conclusion
Against the background of intelligent manufacturing and machine learning, this paper applies a convolutional neural network to bearing fault diagnosis: it proposes a convolutional neural network model for time-domain vibration signals and uses it to diagnose bearing faults. The proposed method uses the raw time-domain vibration signal directly as the input of the CNN, which suggests that it can also be applied to intelligent diagnosis problems of other machine systems. It is an end-to-end diagnosis method that needs no hand-crafted feature extraction and performs feature extraction and fault recognition automatically. The experimental results show that the proposed method can adaptively extract appropriate fault features from the raw bearing vibration data and classify the faults with high accuracy and stability. The recognition accuracy of the proposed method on the CWRU bearing database is more than 96%. This paper takes the bearing as an example; in future work, the proposed method can be applied to fault diagnosis of other rotating parts such as gears, and neural networks other than CNN, such as the Long Short-Term Memory (LSTM) network, can be explored.