Surface microseismic data denoising based on sparse autoencoder and Kalman filter

Microseismic technology is widely used in unconventional oil and gas production. Microseismic noise reduction is of great significance for the identification of microseismic events, the location of seismic sources and the improvement of unconventional oil and gas production. In this paper, a denoising filter is proposed based on a sparse autoencoder and Kalman filtering. First, a sparse autoencoder is pre-trained to learn the features of the microseismic data. The sparse autoencoder is a back-propagation neural network algorithm based on unsupervised learning, with three layers: the input layer, the hidden layer and the output layer. The hidden layer is sparse, which allows the algorithm to learn features better, represent samples acquired in harsh environments and reduce dimensionality effectively. In addition, a Kalman filter is used to deal with uncertainty. Using a dataset of 600 synthetic surface microseismic traces with simulated noise, the sparse autoencoder and the Kalman filter are trained to suppress noise. The denoising filter based on the sparse autoencoder and Kalman filter model obtains a higher signal-to-noise ratio than the conventional model. The experimental results for the filtering of surface microseismic signals show the feasibility and effectiveness of the proposed method.


Introduction
With the rapid development of the world economy, oil and gas consumption has been increasing year by year (AlKhars et al., 2020; Tang et al., 2016; Tsani, 2010; Umurzakov et al., 2020; Wei et al., 2020; Zamani, 2007). Oil and gas consumption is not only related to the economy, but also imperative for industrial development, national energy security and transport efficiency (Fatima et al., 2019; X. Pan et al., 2020). However, the production of many oilfields has been declining recently (S. Pan et al., 2020, November 20-22; H. Wang et al., 2019). The degree of exploration and development of most oil and gas fields is already relatively high, resulting in the continuous reduction of the exploitation potential of conventional oil and gas reservoirs, and subsequent exploration and development has become increasingly difficult and arduous (AlKhars et al., 2020; Höök et al., 2009). The exploration and development of unconventional oil and gas reservoirs, such as tight gas, shale oil, and coalbed methane (Orangi et al., 2011, January 24-26), is a new hot spot in the global oil and gas resources field (Milad et al., 2021; Santos et al., 2021). Shale oil, a kind of unconventional oil and gas, is a new key exploration target. Shale oil represents an unconventional oil and gas resource with extremely large reserves; the global resources amount to 411 × 10⁹ t (W. Q. Wang & Li, 2019). In shale gas exploration and development, hydraulic fracturing is an imperative approach to increase production (Zhou et al., 2016). Microseismic monitoring mainly studies the microseismic events induced by hydraulic fracturing; it establishes a spatial image of fracturing cracks to determine the orientation and shape of the cracks, extracts rock mechanics parameters, and provides technical support for further reservoir reconstruction.
Then the oil and gas fields can ensure an efficient yield increase during the development process (Staněk & Eisner, 2017; Wu et al., 2017). Microseismic monitoring is a geophysical technique for crack imaging. Compared with well monitoring, surface monitoring has a larger detection range. However, due to the influence of the ground and the surrounding environment, the obtained microseismic data has poor quality, bad reliability and a low signal-to-noise ratio (SNR), which greatly affects the accuracy of first-arrival picking. The microseismic data collected on the surface contains various types of noise (Li et al., 2021; Mousavi & Langston, 2016) with different shapes and strong energy. Severe noise interference makes it almost impossible to identify the effective signal, which seriously affects the quality of surface microseismic data. Therefore, noise suppression is regarded as the key step of microseismic data processing. How to suppress noise and improve the SNR of microseismic signals is a question worthy of investigation. With the wide application of microseismic monitoring technology, various filtering methods have been proposed for microseismic signals, including Kalman filtering (Baziw & Weir-Jones, 2002; Mohammadali et al., 2015), multichannel Wiener filtering (J. Wang et al., 2009), adaptive FK filtering (C. Liang et al., 2009, January), and others (Eisner et al., 2008, January). These methods are based on the fact that microseismic signals and noise present different characteristics under different conditions, which enables signal-noise separation. Nowadays, distinct approaches have been proposed for denoising microseismic data. In V. E. Oropeza (2010) and V. Oropeza and Sacchi (2011), a rank-reduction strategy has been proposed, in which a Hankel matrix is constructed in the f-x domain and singular value decomposition is then used to reduce its rank.
A Fourier transform-based denoising method has been discussed in Sacchi (2010, 2012) to increase the SNR and protect the main harmonic components. A transform-based microseismic data denoising technique has been developed in Forghani-Arani et al. (2012), which effectively suppresses noise and protects the waveform of the original signal. The denoising of microseismic data has been realized in Jiang et al. (2012) based on the Curvelet transform. In Sabbione et al. (2013, September), the apex-shifted hyperbolic Radon transform has been employed to denoise microseismic data and improve its SNR. A way of combining empirical mode decomposition and adaptive thresholding has been analyzed in Han and van der Baan (2015) to suppress the noise of microseismic data. However, the aforementioned methods may have some limitations in practice. To enhance the automatic identification of effective events, analyzing the environmental noise becomes necessary, and many researchers have begun solving the noise suppression problem of microseismic signals.
Deep learning theory was introduced by Professor Hinton of the University of Toronto, a leading figure in machine learning, in 2006. The past decades have witnessed successful applications of deep learning in diverse fields such as image classification (Cireşan et al., 2011, July 16-22; Krizhevsky et al., 2012; J.-E. Liu & An, 2020; Öztürk & Erçelebi, 2021), video classification (Mansour et al., 2021), traffic sign recognition (S. Jin et al., 2014) and human action recognition (Ji et al., 2013; Luvizon et al., 2021). Notably, the computer Go program AlphaGo defeated the 9-dan professional Lee Sedol in a five-game match in 2016, the first such victory, which forms a major milestone in artificial intelligence research (Lu et al., 2017; Silver et al., 2016; F. Wang, 2016). The deep autoencoder is a typical deep learning model, which is applicable to high-dimensional complex data processing and plays an important role in unsupervised learning and nonlinear feature extraction. The concept of the autoencoder was put forward for processing high-dimensional complex data (Rumelhart et al., 1986). The prototype structure of the autoencoder was improved and a deep autoencoder model was built in Hinton et al. (2006). An unsupervised layer-by-layer greedy algorithm is first used to pre-train the hidden layers of the deep autoencoder, and then the BP algorithm is utilized to fine-tune the parameters of the whole neural network, significantly improving the learning and generalization abilities of the network. Further deepening the research on deep autoencoders, the sparse autoencoder (SAE) was devised in Schölkopf et al. (2007), which has good signal dimension-reduction properties. In Amaral et al. (2013), the performance of deep autoencoders trained with different cost functions has been analyzed, pointing out a direction for the development of cost-function optimization strategies.
A new method has been introduced in Qu et al. (2017) for gear pitting fault detection based on a deep sparse autoencoder in combination with dictionary learning. The deep autoencoder can hierarchically present the learned features, which lays the foundation for constructing deep structures.
Recently, several studies on signal denoising have been conducted using deep learning techniques (Isogawa et al., 2017; Keshavarzi et al., 2018; Seo et al., 2018; Shi et al., 2017; Vickers et al., 2021). A deep recurrent neural network has been adopted in Keshavarzi et al. (2018) to reduce wind noise, which shows that processing with the recurrent neural network is significantly preferred over no processing for both subjective intelligibility and sound quality across two subject groups. In Seo et al. (2018), a regression-based integrated acoustic echo and background noise suppression algorithm has been developed through the use of a deep neural network with a multi-layer deep architecture. A deep convolutional neural network has been involved in Isogawa et al. (2017), using soft shrinkage for the activation functions, which can adapt immediately to the noise level of the input image. A speech enhancement method has been indicated in Shi et al. (2017) adopting noise classification and a deep neural network, where the deep neural network model enhances the noisy speech. So far, little research has explored deep autoencoder neural networks for surface microseismic noise suppression.
This paper mainly addresses the suppression of noise in surface microseismic monitoring data. The goal is to construct a deep autoencoder filtering model to improve the SNR of microseismic data. In particular, we present a novel filtering method for surface microseismic signals based on a sparse autoencoder neural network. The developed SAE-based model achieves a higher SNR. Via analysis of the structure and the parameters of the SAE, the greedy layer-wise pre-training algorithm is applied to train it. A total of 600 synthetic surface microseismic traces are first generated and combined with simulated noise; the processed microseismic data are then used to train the SAE. This SAE model improves the convergence speed while training the parameters and obtains a higher SNR than the conventional model. The main contributions of this paper are threefold: (1) a sparse autoencoder neural network is applied for the first time to the filtering of surface microseismic signals; (2) using the Kalman filter for signal filtering after deep autoencoder filtering, a microseismic noise suppression framework is proposed combining a deep sparse autoencoder and a Kalman filter; and (3) the experimental results exhibit that the SAE method has a better noise reduction effect and preserves the event axes of the microseismic signal, which is clearly superior to the wavelet threshold denoising method.
The remainder of this paper is organized as follows. In Section 2, we describe microseismic monitoring technology and the surface microseismic denoising problem. Section 3 introduces the surface microseismic signal filtering method. The application of the developed SAE model to the problem of surface microseismic filtering is presented in Section 4. Conclusions are given in Section 5.

Surface microseismic data denoising filter
In this section, a brief discussion of microseismic monitoring technology is presented. The basics of surface microseismic signal filtering are then illustrated, followed by the proposed algorithms and analysis.

Surface microseismic monitoring
Microseismic monitoring has become a common technique to image the fracture-network deformation that accompanies oil and gas operations. The most extensive application of microseismic monitoring is to image hydraulic-fracture operations, although the technique is also used to monitor microseismic events induced by inelastic deformation associated with the injection of steam, water or gas for secondary recovery and production (Maxwell et al., 2010). The basic approach of microseismic monitoring is to deploy geophone arrays in boreholes, on the surface or near the surface to receive the tiny seismic events induced by production activities; parameters such as the location of the microseismic source are then calculated through inversion of these events. Finally, the production activities are monitored or guided through these parameters. With the progress of microseismic observation systems and technology, and in-depth study of the inversion and visualization of microseismic source mechanisms, the development prospects of microseismic monitoring become broader. In Maxwell et al. (2010), the two main microseismic monitoring techniques currently in use have been indicated, i.e. surface and downhole monitoring. In downhole monitoring, high-sensitivity sensors are deployed in boreholes close to the seismic source to minimize signal attenuation and background noise. This detects small-magnitude microseismicity with an SNR sufficient to determine the source location from a sparse receiver array. In surface monitoring, detection bias is considered, such that more small microseismic events are recorded close to the array and the ultimate detection range is limited to a region around the monitoring location, as reflected in Figure 1.

Surface microseismic data denoising filter
The microseismic event has weak energy, high frequency, and short duration, so it is easily affected or covered by ambient noise. Because of these characteristics, it is necessary to carry out a series of treatments of microseismic data to accurately conduct first-arrival picking and source localization. First, through preprocessing and appropriate filtering, the background noise is removed so that the microseismic signal becomes consistent. Then, polarization analysis is performed and favourable seismic events are selected for first-arrival picking, yielding the angle relative to the source and the arrival-time difference between the P-wave and S-wave. According to this time difference, a P-wave velocity model is established to achieve accurate source positioning. The main frequency of surface microseismic data is significantly lower than that of downhole monitoring data, and the background interference in surface microseismic data is also more complicated; such interference shows no regularity in the time or space domain. Low-frequency and high-frequency interference can be suppressed by simple band-pass filtering, but random noise that overlaps the real signal in the frequency domain is not easy to suppress. To improve the automatic identification of effective events, it is necessary to suppress environmental noise. The environment of the microseismic monitoring detectors is very complex, and conventional filters do not process such microseismic signals well. The denoising quality of surface microseismic data directly affects fracturing effect interpretation, fracturing program adjustment, source location and other tasks. Denoising is the key to the whole microseismic processing flow, as expressed in Figure 2.
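As a concrete illustration of the simple band-pass filtering mentioned above, the following sketch (using SciPy; the sampling rate, corner frequencies and synthetic trace are illustrative assumptions, not values from this paper) suppresses low- and high-frequency interference on a single trace:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(trace, fs, low_hz, high_hz, order=4):
    """Zero-phase Butterworth band-pass filter applied to one trace."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, trace)

# A 40 Hz sinusoid buried in white noise, sampled at 1 kHz.
fs = 1000.0
rng = np.random.default_rng(0)
t = np.arange(0, 1.0, 1.0 / fs)
trace = np.sin(2 * np.pi * 40.0 * t) + 0.5 * rng.standard_normal(t.size)
filtered = bandpass(trace, fs, 10.0, 120.0)  # keep the 10-120 Hz band
```

Out-of-band noise energy is removed, but noise inside the 10-120 Hz band remains, which is exactly the component the proposed SAE-Kalman approach targets.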

The surface microseismic deep learning filter
Inspired by the classical and successful deep autoencoder neural network, the sparse autoencoder and its improved variants (Hinton et al., 2006; Lyons et al., 2014; Schölkopf et al., 2007), we describe a multi-SAE model for the denoising filter. With the goal of estimating velocity models using seismic data directly as inputs, the network needs to project seismic data from the data domain (x; t) to the model domain (x; z), as given in Figure 3. Figure 3 also expresses the basic concept of the proposed method, which is to establish the mapping between inputs and outputs. Since the model to be developed concerns surface microseismic filtering, its illustration is presented in this section.

Auto-encoder
The deep autoencoder is an unsupervised neural network model for learning from high-dimensional data and is widely used for unsupervised feature learning in deep learning. The autoencoder was first employed in Rumelhart et al. (1986), and Hinton et al. (2006) improved its structure and depth. The deep autoencoder first uses a greedy algorithm to complete unsupervised training layer by layer, and then uses the BP algorithm to tune the parameters of the entire network. This improves the BP algorithm effectively by avoiding the tendency to fall into local minima. In Schölkopf et al. (2007), a sparse autoencoder was brought forward, which further deepened the study of deep autoencoders. The denoising autoencoder, an autoencoder designed for noise reduction, has also been investigated; it adds noise vectors to the input data to improve the robustness of the network and achieves better results. The autoencoder is a symmetric three-layer neural network composed of the input layer, the hidden layer, and the output layer, as clarified in Figure 4. First, the input data is encoded by the hidden layer, then reconstructed from the hidden-layer representation, and the reconstruction error is minimized to obtain the best expression and feature extraction in the hidden layer.
The goal of autoencoder learning is to make the output of the network as close as possible to the input, and the training process consists of an encoding process and a decoding process. In the encoding process, the input samples are linearly mapped and nonlinearly transformed into the hidden-layer representation. Supposing that X = {x_i}_{i=1}^{N} is a sample set, the expression of the input sample x_i in the hidden layer is written as

h_i = Sigmoid(W_1 x_i + b_1), (1)

where W_1 and b_1 represent the weight and the bias between the input layer and the hidden layer, respectively, and Sigmoid(·) indicates the activation function of the hidden layer. The decoding process re-projects the encoded data to the original signal space and obtains the decoded signal x̂_i, which is described as

x̂_i = Sigmoid(W_2 h_i + b_2), (2)

where W_2 and b_2 represent the weight and the bias between the hidden layer and the output layer, respectively, and Sigmoid(·) denotes the activation function of the output layer. The goal of autoencoder training is to make the decoded output as close as possible to the input before encoding. The network parameters are optimized by minimizing the reconstruction error, and the cost function is defined as

J(W, b) = (1/N) Σ_{i=1}^{N} (1/2) ||x̂_i − x_i||². (3)

After the training of one autoencoder, the activation values of its hidden layer are used as the input of the next autoencoder. In this way, multiple autoencoders are stacked up to form a stacked autoencoder.
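The encoding, decoding and reconstruction-cost computations described above can be sketched in plain NumPy as follows (weights here are random and untrained, and the layer sizes are illustrative assumptions; training would tune W and b by back-propagation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hidden, N = 80, 32, 600                 # illustrative sizes
X = rng.standard_normal((N, n_in))              # sample set {x_i}

# Randomly initialised parameters; training would tune these by BP.
W1 = 0.1 * rng.standard_normal((n_in, n_hidden)); b1 = np.zeros(n_hidden)
W2 = 0.1 * rng.standard_normal((n_hidden, n_in)); b2 = np.zeros(n_in)

H = sigmoid(X @ W1 + b1)                        # encoding (hidden layer)
X_hat = sigmoid(H @ W2 + b2)                    # decoding (reconstruction)
J = 0.5 * np.mean(np.sum((X_hat - X) ** 2, axis=1))  # reconstruction cost
```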

Sparse autoencoder
The sparse autoencoder uses the idea of sparse coding, introducing a sparse penalty term into the autoencoder. It can obtain sparser and more concise data features through learning under sparsity constraints. One of the goals of the sparse autoencoder is to make the hidden-layer neurons 'inactive' most of the time. Assume that a_j(x) represents the activation of the jth unit of the hidden layer. For N training samples, in the forward propagation process, the average activation of the jth unit of the hidden layer is

ρ̂_j = (1/N) Σ_{i=1}^{N} a_j(x_i). (4)

Since most of the neurons should be 'inactive', the average activation ρ̂_j should approximate a constant ρ close to zero; ρ is the sparsity parameter. To achieve the sparsity objective, a penalty term is added to the cost function of the encoder to penalize ρ̂_j so that it does not deviate from ρ. The Kullback-Leibler (KL) divergence (Hinton et al., 2006) is used to define the penalty term PN:

PN = Σ_{j=1}^{S_2} KL(ρ ‖ ρ̂_j), (5)

where S_2 is the number of neurons in the hidden layer. KL(ρ ‖ ρ̂_j) is the KL divergence, which is mathematically expressed as

KL(ρ ‖ ρ̂_j) = ρ log(ρ/ρ̂_j) + (1 − ρ) log((1 − ρ)/(1 − ρ̂_j)). (6)

The penalty term is determined according to the nature of the KL divergence. That is, if ρ̂_j = ρ, then KL(ρ ‖ ρ̂_j) = 0; otherwise, the KL divergence gradually increases as ρ̂_j deviates from ρ. Therefore, when the sparse penalty term is added, the network cost function of the sparse autoencoder can be defined as

J_sparse(W, b) = J(W, b) + β · PN, (7)

where β is the weight of the sparse penalty term.
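The average activation and the KL-divergence penalty PN can be computed as in the following sketch (the activation matrix here is random, standing in for the hidden-layer outputs a_j(x_i); the sparsity target is an illustrative assumption):

```python
import numpy as np

def kl_divergence(rho, rho_hat):
    """Bernoulli KL(rho || rho_hat), element-wise."""
    return (rho * np.log(rho / rho_hat)
            + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))

rho = 0.05                                   # sparsity target, near zero
rng = np.random.default_rng(1)
A = rng.uniform(0.01, 0.99, size=(600, 32))  # stand-in for a_j(x_i)
rho_hat = A.mean(axis=0)                     # average activation per unit
penalty = kl_divergence(rho, rho_hat).sum()  # PN, added to the cost with weight beta
```

Each KL term is zero when the unit's average activation equals the target and grows as it deviates, so minimizing the penalized cost drives most hidden units toward inactivity.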

Sparse autoencoder filter
Based on a sparse autoencoder neural network, a generalized h-layer deep neural network filtering model is obtained, composed of the input layer, the output layer and the SAE deep structure units, as plotted in Figure 5. In Figure 5, there are n input nodes, n output nodes, and h SAE units in the sparse autoencoder filter model. The part from the input layer to the hidden layers is the information processing unit of the SAEs. It completes the input of the n-dimensional seismic signals x_1, x_2, . . . , x_n to the network and the spatial and temporal weighted aggregation of the input signals; its hidden-layer outputs are X^h_n, and the final outputs are the n-dimensional denoised signals y_1, y_2, . . . , y_n.
According to Figure 5, the input/output relationship of the SAEs is written as follows. (1) The input/output relationship of the first layer is

z_k^(1) = Σ_{j=1}^{n} w_{kj}^(1) x_j + θ_k^(1), (8)

h_k^(1) = f_1(z_k^(1)), (9)

where f_1 and θ_k^(1) are the activation function of the SAE and the threshold, respectively, and w_{kj}^(1) is the weight from the jth input to the kth node of the first layer.
In summary, the input/output relationship of the SAEs is written as

y_k = SAE_k(x_1, x_2, . . . , x_n), k = 1, 2, . . . , n, (10)

where SAE_k(·) is the output of the kth node of the SAE output layer. In Equation (10), the learning ability of the deep learning framework is adopted: an abstract high-level representation is formed by combining the lower-level characteristics of the signals, and the SNR of the signal samples is improved.
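A minimal forward pass through h stacked SAE units, mapping an n-dimensional noisy trace to an n-dimensional output, might look like the following (weights are random and untrained; the layer sizes are illustrative assumptions, not the paper's configuration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sae_forward(x, layers):
    """Propagate one trace through the stacked SAE units."""
    a = x
    for W, b in layers:
        a = sigmoid(a @ W + b)
    return a

rng = np.random.default_rng(2)
n = 80                              # number of input/output nodes
sizes = [n, 64, 32, 64, n]          # h = 4 stacked SAE units
layers = [(0.1 * rng.standard_normal((s_in, s_out)), np.zeros(s_out))
          for s_in, s_out in zip(sizes[:-1], sizes[1:])]

x = rng.standard_normal(n)          # noisy input x_1 ... x_n
y = sae_forward(x, layers)          # output y_1 ... y_n
```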

Kalman filter
The Kalman filter is a Bayes-optimal estimator that has been used in a variety of disciplines. In essence, Kalman estimation is a data fusion method: the Kalman filter estimates the value at the current moment from the known estimate of the previous moment and the measurement of the current moment. The Kalman filter, which finds the posterior probability from the prior probability and the likelihood, can also be viewed as a probability-distribution problem. It takes the estimate of the last moment as the current prior, revises the prior with the current observation, and then calculates the posterior result. Kalman filtering was proposed long ago and has been applied in many fields (Lin et al., 2021; Mao et al., 2021; Subchan et al., 2021; Tan et al., 2020; Y. Zhang et al., 2019). Therefore, we would like to test whether the experimental results improve after using Kalman filtering to process the microseismic data.
Kalman filter describes the dynamic characteristics of the system by using the method of state space. The state space model and the measurement equation are described as follows.
State space model:

X_{k+1} = A X_k + B u_k + ω_k, (11)

where X_k is the true system state, A is the state transition matrix, B is the control matrix, u_k is the control variable, and ω_k is the process noise or disturbance.
Measurement equation:

Z_k = H X_k + υ_k, (12)

where Z_k is the measurement vector, H is the observation matrix, and υ_k is a random noise vector. After describing the dynamic characteristics of the system through the state space, it is seen that the Kalman filter operates via a predict equation and a correct equation.

Subsidiary equation 1 (estimate error):

e_{k+1} = X_{k+1} − X̂_{k+1|k+1}, (13)

where e_{k+1} is the estimation error.

Subsidiary equation 2 (estimate uncertainty):

P_{k|k} = E[e_k e_k^T]. (14)

State extrapolation:

X̂_{k+1|k} = A X̂_{k|k} + B u_k, (15)

where X̂_{k+1|k} is the predicted system state vector at time step k + 1 given the measurements up to time step k.

Subsidiary equation 3 (process noise uncertainty):

Q_{k+1} = E[ω_{k+1} ω_{k+1}^T], (16)

where Q_{k+1} reflects the covariance matrix of the process noise, and ω_{k+1} stands for the process noise.

Covariance extrapolation:

P_{k+1|k} = A P_{k|k} A^T + Q, (17)

where P_{k|k} is the estimate uncertainty (covariance) matrix of the current state, P_{k+1|k} is the predicted estimate uncertainty (covariance) matrix for the next state, and Q is the process noise matrix.

Covariance update equation:

P_{k+1|k+1} = (I − K_{k+1} H) P_{k+1|k}. (18)

Subsidiary equation 4 (measurement uncertainty):

R_{k+1} = E[ν_{k+1} ν_{k+1}^T], (19)

where R_{k+1} is the covariance matrix of the measurement, and ν_{k+1} is the measurement error.

Kalman gain equation:

K_{k+1} = P_{k+1|k} H^T (H P_{k+1|k} H^T + R_{k+1})^{−1}. (20)

State update equation:

X̂_{k+1|k+1} = X̂_{k+1|k} + K_{k+1} (Z_{k+1} − H X̂_{k+1|k}), (21)

where X̂_{k+1|k+1} is the estimated state vector of the system at time step k + 1, X̂_{k+1|k} is the predicted state vector of the system at time step k + 1 given measurements up to step k, and Z_{k+1} is the measurement. Equations (15) and (17) are for predicting. Equations (18), (20) and (21) are for correcting. The Kalman filter operates in a 'predict-correct' loop, as specified in Figure 6.
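The predict-correct loop formed by Equations (15), (17), (18), (20) and (21) can be sketched as follows (a toy scalar tracking problem without a control input; the matrices A, H, Q, R and the noise level are illustrative assumptions):

```python
import numpy as np

def kalman_step(x_est, P, z, A, H, Q, R):
    """One predict-correct cycle of the Kalman filter."""
    # Predict: state and covariance extrapolation
    x_pred = A @ x_est
    P_pred = A @ P @ A.T + Q
    # Correct: Kalman gain, state update, covariance update
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(P.shape[0]) - K @ H) @ P_pred
    return x_new, P_new

# Toy example: estimate a constant scalar from noisy measurements.
rng = np.random.default_rng(3)
A = np.array([[1.0]]); H = np.array([[1.0]])
Q = np.array([[1e-4]]); R = np.array([[0.25]])
x_est, P = np.zeros(1), np.eye(1)
for _ in range(200):
    z = np.array([1.0]) + 0.5 * rng.standard_normal(1)
    x_est, P = kalman_step(x_est, P, z, A, H, Q, R)
```

After enough cycles the estimate converges toward the true value while the estimate uncertainty P shrinks well below the measurement variance R.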
The state-space model of microseismic signals (Baziw & Weir-Jones, 2002) is modified as

X_{k+1} = [ cos(ωΔt)    (1/ω) sin(ωΔt) ; −ω sin(ωΔt)    cos(ωΔt) ] X_k + ω_k,

with Δt being the sampling interval and ω being the dominant angular frequency (ω = 2πf).
We hope to make further corrections to the microseismic signal beyond the autoencoder. The noise-reduced signal from the autoencoder is utilized as the observed variable, and the state variables and covariance matrix are updated through the prediction and correction equations.

Signal filtering simulation examples
In this paper, the effective signal in the seismic record is simulated using the Ricker wavelet (Chang et al., 2018). The formula is as follows:

r(t) = (1 − 2π²f_m²t²) exp(−π²f_m²t²), (22)

where f_m is the dominant frequency of the Ricker wavelet and t is the time. The proposed SAE model is applied to the filtering problem of surface microseismic signals. We use a database of 600 microseismic signal traces, and analog signal data with a size of 400 by 80 are generated. One of the microseismic signal traces is depicted in Figure 7. Gaussian noise is used to simulate the noise data. The simulation results are presented in Figures 8-10. This experiment compares the wavelet threshold method against the proposed model. Figure 9 is the result of noise reduction through the Kalman filter alone. Figure 10 describes the noise reduction effect of the model in this paper. As seen from the figures, the denoising effect of the proposed method is very obvious: the noise is suppressed more completely, and the effective signal is well restored.
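A short sketch of generating a Ricker wavelet and contaminating it with Gaussian noise, in the spirit of the simulation described above (the dominant frequency, sampling interval and noise level are illustrative assumptions, not the paper's exact settings):

```python
import numpy as np

def ricker(t, fm):
    """Ricker wavelet with dominant frequency fm (Hz)."""
    arg = (np.pi * fm * t) ** 2
    return (1.0 - 2.0 * arg) * np.exp(-arg)

dt = 0.001                               # 1 ms sampling interval
t = np.arange(-0.1, 0.1, dt)             # time axis centred on the wavelet
clean = ricker(t, fm=30.0)               # synthetic effective signal
rng = np.random.default_rng(4)
noisy = clean + 0.2 * rng.standard_normal(t.size)  # add Gaussian noise
```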

Conclusions
In this paper, a sparse autoencoder and a Kalman filter have been adopted for microseismic denoising. The Ricker wavelet has been used to simulate the microseismic signal for the pre-training of the sparse autoencoder. Then the well-trained SAE has been used to process the noisy microseismic signal. A database has been used for testing, which contains 600 microseismic signal traces with analog signal data of size 400 × 80. The noisy data has been obtained by adding Gaussian noise to the microseismic signal. The noisy microseismic signals have been the input of the SAE, and the output of the SAE has been used in the Kalman filter. The SAE has been adopted to extract the features of the microseismic data, and the Kalman filter has been implemented to handle the uncertainty and improve the estimation. Through experiments, we have found that the SAE-KF denoising method has a better denoising effect than the Kalman filter alone. However, time delay may exist in the actual network as discussed in Bauer et

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported in part by the National Natural Sci-