Short-term load forecasting based on EEMD-Adaboost-BP

ABSTRACT In order to realize short-term load forecasting, an Adaboost-BP method with a weight update mechanism is proposed based on ensemble learning theory. First, the original historical load power is decomposed into a set of sub-series with diverse characteristics using ensemble empirical mode decomposition. Then, a BP neural network is employed as a weak learner to predict the load power of the test samples. The prediction results are used to update the weights of the weak learners and test samples, and a strong learner is then constructed to obtain the final prediction. Based on the analysis of the characteristics of each sub-series, the load forecasting model is established. The case study shows that the proposed prediction model outperforms the compared algorithms in accuracy and has high engineering application value.


Introduction
Accurate load forecasting is crucial for the reliable operation of power systems and for economic growth. It is an important basis for power supply enterprises to arrange power production scheduling and to improve the level of power system automation, which is conducive to energy conservation and emission reduction (Amjady, 2001; Hinojosa & Hoese, 2010). Therefore, research on power system load forecasting has attracted extensive attention from scholars and produced a series of results (Alfuhaibdc et al., 1997; Dedinec et al., 2016; Hernández et al., 2014; Ma et al., 2017; Xiaoyu et al., 2015).
In the past few years, a great number of methodologies have been used to forecast power load. Generally speaking, short-term load forecasting methods fall into two major categories: statistical methods (Charytoniuk et al., 1998; Papalexopoulos & Hesterberg, 1990; Song et al., 2005) and artificial intelligence methods (Aly, 2020; Ge et al., 2020; Li et al., 2017). Statistical methods include autoregression, multiple linear regression, logistic regression and so on. The structure of this kind of forecasting model is simple, and it can obtain fast and accurate forecasts for load data with small fluctuations and weak temporal dependence, but it is not suitable for complex data. For instance, the authors of Charytoniuk et al. (1998) proposed a nonparametric regression algorithm, which estimated the load from historical data by applying the cross-validation technique. In addition, most of these models are limited to linear ones, while most load series are nonlinear. In 2006, Hinton and Salakhutdinov successfully trained multilayer neural networks through layer-by-layer unsupervised greedy training for the first time and achieved very good results on multiple public data sets (Hinton & Salakhutdinov, 2006). Neural networks have since become a research hotspot again and produced a series of excellent results (Basin et al., 2018; Cheng et al., 2021; Li et al., 2021; Liang et al., 2021; Sheng et al., 2021). Artificial intelligence methods include the support vector machine (SVM), the back propagation (BP) neural network and other artificial neural networks (ANN). SVM is superior to traditional algorithms in addressing nonlinearity, high dimensionality and local minima; however, its convergence speed is slow, and its prediction accuracy deteriorates when facing a large amount of data (Feng et al., 2015; Lin & Chou, 2013).
The BP neural network possesses strong self-learning capability and nonlinear function fitting ability, but its initial parameters are chosen randomly, so its generalization ability is poor and it is prone to fall into local optima. The Adaboost algorithm, proposed by Freund and Schapire in 1995, trains several weak learners and combines them into a strong learner according to the prediction results of each weak learner; this not only significantly improves prediction accuracy but also has great advantages in the selection of model parameters, so it has been applied successfully (Gao et al., 2013; Li & Yan, 2019; Liu et al., 2015). The authors of Liu et al. (2015) applied the Adaboost algorithm to four neural network models and achieved good results, indicating that the Adaboost algorithm can enhance forecasting performance effectively. It is worth noting that the above methods improve the model itself rather than the original data. Because the power demand side comprises thousands of households, it is ever-changing; therefore, load power exhibits strong random variation and significant nonlinearity. This urges us to seek methods that convert these non-stationary signals into relatively stationary signals before prediction. As reported in Kiplangat et al. (2016), Liu et al. (2018), Ren et al. (2016), Guo et al. (2012) and Zhang et al. (2016), mode decomposition algorithms are an efficient way to deal with non-stationary original data and improve prediction accuracy. For example, in Liu et al. (2018) and Guo et al. (2012), wavelet decomposition (WD) and empirical mode decomposition (EMD) are utilized to decompose the original load time series (TS) into an ensemble of subsequences, respectively. The experimental results showed that the prediction accuracy was improved.
Ensemble empirical mode decomposition (EEMD) is an effective signal analysis technique proposed by Huang to suppress the end effect and mode aliasing that occur in EMD, and it does not require tuning parameters such as the mother wavelet type and the number of decomposition layers as WD does (Huang et al., 1998). Because of its great superiority in processing nonlinear and non-stationary signals, it is widely used in forecasting models (Ali & Prasad, 2019; Santhosh et al., 2019; Wang et al., 2017). The authors in Santhosh et al. (2019) decomposed the original wind TS into finite intrinsic mode functions (IMFs) and a residue through EEMD; each subsequence was then predicted by a Deep Boltzmann Machine model, and finally the results were superimposed to obtain the final forecast.
Based on the above discussions, a new composite model consisting of Adaboost-BP neural network algorithm based on EEMD is established. The key contributions of this paper are outlined as follows: (1) EEMD technique is utilized to decompose the original load TS, then an ensemble model of Adaboost-BP algorithm is implemented as the predicting model for every subsequence decomposed by EEMD.
(2) Compared to BP and Adaboost-BP, the experimental results show that the hybrid model of EEMD-Adaboost-BP can effectively improve the prediction accuracy.
The organizational structure of this article is as follows. The overall framework of this paper and the principle of EEMD and BP neural network algorithm based on Adaboost algorithm are described in Section 2. Section 3 and Section 4 present the simulation results and conclude the paper, respectively.

The architecture of EEMD-Adaboost-BP
The prediction model proposed in this paper is shown in Figure 1, from which it can be observed that the forecasting model is mainly composed of EEMD and Adaboost-BP. The main process is to decompose the original load into a series of IMFs with different characteristics using EEMD and then use the BP model to capture the internal nonlinear relationship among the sample data. Finally, the outputs of multiple weak learners are combined by the Adaboost algorithm to obtain the final results.
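The decompose-predict-sum flow described above can be sketched abstractly. This is a minimal structural sketch, not the paper's implementation: `decompose` and `fit_predict` are illustrative placeholders for the EEMD step and the per-sub-series Adaboost-BP forecaster, respectively.

```python
import numpy as np

def forecast_pipeline(series, decompose, fit_predict):
    """Decomposition-based forecasting: split the series into components,
    forecast each component separately, and sum the component forecasts.

    `decompose(series)` must return component arrays that sum to `series`;
    `fit_predict(component)` returns that component's next-step forecast.
    """
    components = decompose(series)
    return sum(fit_predict(c) for c in components)
```

Because the components sum to the original series, summing their individual forecasts yields the forecast of the whole series; each sub-model only has to learn one (more stationary) component.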

Ensemble empirical mode decomposition
EEMD is developed on the basis of EMD. The principle of this technique is to locally stabilize the TS data and then use the Hilbert transform to obtain the time spectrum and physically meaningful frequencies. The main goal of EEMD is to decompose complex signals into a series of IMFs and a residual. The decomposition process can be expressed as follows:

(1) Input the original load data χ(t) and add a white noise sequence ε(t) to form a new signal ϒ(t) = χ(t) + ε(t).

(2) Find all the local maxima and minima of the new signal ϒ(t), and connect the local maxima and minima by cubic spline fitting curves to form the upper envelope ϕ1(t) and the lower envelope ϕ2(t) of ϒ(t).

(3) From the upper envelope ϕ1(t) and the lower envelope ϕ2(t), the local mean of the envelopes is calculated as m1(t) = (ϕ1(t) + ϕ2(t))/2.

(4) A new sequence h1(t) is obtained by subtracting the local mean m1(t) from the signal ϒ(t): h1(t) = ϒ(t) − m1(t).

(5) Check whether h1(t) meets the two conditions of an IMF: (a) the numbers of extreme points and zero-crossing points are equal or differ by at most one; (b) at any time, the local mean of the envelopes defined by the local maxima and local minima is zero (Huang et al., 1998). If so, F_IMF1(t) = h1(t); otherwise, take h1(t) as ϒ(t) and repeat steps (2)-(4) until h1k(t) satisfies the IMF conditions.

(6) The residual is calculated as r1(t) = ϒ(t) − F_IMF1(t).

(7) Take r1(t) as the new decomposition object and repeat the sifting steps (2)-(5) until the residual sequence r(t) has a monotonic trend or only one local extreme point. After the decomposition, the signal is reconstructed as ϒ(t) = Σ_{i=1}^{N} F_IMFi(t) + r_N(t), where N denotes the number of IMFs, and F_IMFi(t) and r_N(t) represent the IMF component of each layer and the final residual, respectively.

(8) Repeat steps (1)-(7) with different white noise realizations and take the ensemble average of the corresponding IMFs as the final EEMD result; the averaging cancels the added noise and suppresses mode aliasing.

The original load data decomposed by EEMD is shown in Figure 2.
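The sifting and ensemble-averaging steps above can be sketched in code. This is a minimal illustration under simplifying assumptions (a fixed number of sifting iterations instead of the IMF stopping criterion, and a crude endpoint treatment for the splines), not a production EEMD implementation; all function names are ours.

```python
import numpy as np
from scipy.signal import argrelextrema
from scipy.interpolate import CubicSpline

def sift_once(y):
    """One sifting step: subtract the mean of the cubic-spline envelopes."""
    t = np.arange(len(y))
    maxima = argrelextrema(y, np.greater)[0]
    minima = argrelextrema(y, np.less)[0]
    if len(maxima) < 2 or len(minima) < 2:
        return None  # too few extrema: signal is (close to) a residual trend
    # Envelopes through the extrema; pinning the endpoints is a simple
    # (not rigorous) way to limit the end effect.
    up = CubicSpline(np.r_[0, maxima, len(y) - 1],
                     np.r_[y[0], y[maxima], y[-1]])(t)
    lo = CubicSpline(np.r_[0, minima, len(y) - 1],
                     np.r_[y[0], y[minima], y[-1]])(t)
    return y - (up + lo) / 2.0  # h(t) = y(t) - m(t)

def extract_imf(y, n_sifts=10):
    """Fixed-count sifting (a common shortcut for the IMF test)."""
    h = y
    for _ in range(n_sifts):
        h_new = sift_once(h)
        if h_new is None:
            return None
        h = h_new
    return h

def eemd(x, n_imf=4, ensemble=50, noise_std=0.2, seed=0):
    """EEMD: decompose noise-perturbed copies of x and average the IMFs.
    `noise_std` is the white-noise amplitude as a fraction of std(x)."""
    rng = np.random.default_rng(seed)
    imfs = np.zeros((n_imf, len(x)))
    for _ in range(ensemble):
        r = x + noise_std * np.std(x) * rng.standard_normal(len(x))
        for k in range(n_imf):
            imf = extract_imf(r)
            if imf is None:
                break
            imfs[k] += imf
            r = r - imf  # residual becomes the next decomposition object
    return imfs / ensemble
```

Successive IMFs carry progressively lower-frequency content, which is what makes each sub-series easier to forecast than the raw load.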

Adaboost algorithm
The AdaBoost algorithm was originally a feature classification algorithm in machine learning; as research has deepened, it has also been applied to regression problems. At present, the algorithm is widely used in load forecasting and short-term wind speed forecasting and has achieved good forecasting results. The main idea of the algorithm is to train several weak learners in the same sample space and then adjust the weights of these weak learners, according to the prediction results of each weak learner, to form a strong learner (Freund & Schapire, 1997).
The algorithm flow chart is demonstrated in Figure 3. The specific process of the Adaboost algorithm is outlined as follows:

(1) Basic learner and data selection. First, determine the weak learning algorithm and the sample space (x_i, y_i), i = 1, 2, ..., M, where x_i ∈ R^n, and normalize the sample data to zero mean and unit variance.

(2) Network initialization. Assume that the sample distribution is uniform, so the initial distribution weight of each sample is D_1(i) = 1/M. The neural network structure is set according to the characteristics of the sample data, the weights and thresholds of the neural network are initialized, and the number of iterations is set.
(3) Weak predictor prediction. Train the t-th weak predictor: use the training data to train the BP neural network and obtain its predictions f_t(x_i). The error e_i of the weak learner at each sample and its weighted average error e_t are calculated as e_i = |f_t(x_i) − y_i| and e_t = Σ_{i=1}^{M} D_t(i) e_i.

(4) Calculate the weight of the weak learner. According to the average error e_t of the prediction sequence f_t(x), the learner weight is a_t = (1/2) ln((1 − e_t)/e_t).

(5) Sample weight update. Update the weights of the next round of training samples on the basis of the weight a_t: D_{t+1}(i) = D_t(i) exp(a_t e_i)/B_t, where B_t is the normalization factor that makes the distribution weights sum to 1 while keeping their proportions unchanged, and f_t(x_i) denotes the prediction of the weak predictor obtained after training.

(6) Strong predictor function. After T rounds of training, T weak predictor functions are obtained, and the strong predictor is their weighted combination H(x) = Σ_{t=1}^{T} a_t f_t(x) / Σ_{t=1}^{T} a_t, where T represents the number of weak learners.
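The boosting loop above can be sketched as follows. This is a hedged illustration, not the authors' code: scikit-learn's `MLPRegressor` stands in for the BP network, and because its `fit` method does not accept per-sample weights, weighted bootstrap resampling substitutes for direct weighted training (a common workaround).

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def adaboost_bp(X, y, T=5, seed=0):
    """Adaboost-BP sketch: each round trains a small MLP on a weighted
    bootstrap of the data, then up-weights the poorly fitted samples."""
    rng = np.random.default_rng(seed)
    M = len(X)
    D = np.full(M, 1.0 / M)                 # step (2): uniform weights 1/M
    learners, alphas = [], []
    for t in range(T):
        idx = rng.choice(M, size=M, p=D)    # weighted resampling ~ D_t
        net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                           random_state=t).fit(X[idx], y[idx])
        learners.append(net)
        err = np.abs(net.predict(X) - y)    # e_i = |f_t(x_i) - y_i|
        e = err / (err.max() + 1e-12)       # normalize errors to [0, 1]
        e_t = np.clip(np.sum(D * e), 1e-6, 1 - 1e-6)  # weighted avg error
        a_t = 0.5 * np.log((1 - e_t) / e_t)           # learner weight
        alphas.append(a_t)
        D = D * np.exp(a_t * e)             # step (5): up-weight hard samples
        D /= D.sum()                        # B_t: renormalize to sum to 1
    alphas = np.array(alphas)

    def strong(Xq):                         # step (6): weighted combination
        preds = np.array([m.predict(Xq) for m in learners])
        return alphas @ preds / alphas.sum()
    return strong
```

Samples that a weak learner fits badly receive larger weights and are resampled more often in the next round, so later learners concentrate on the hard cases; the final prediction averages the learners in proportion to their accuracy.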

Data description
The test sample used in this paper is the local load data provided by the Guangdong power grid, China, and the time interval is 5 min. The data of the first 130 days is selected as the training sample, and the rolling forecasting method is used to forecast the load data of the next day. Considering the strong correlation between weather variables and load demand, the model input data not only includes the load data of the previous day but also contains the maximum temperature, minimum temperature, rainfall, day type (mainly holiday and workday) of the forecast day and the previous day.
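A feature matrix of this kind can be assembled as below. This is a hypothetical sketch of the described input layout (previous day's load curve plus weather and day-type descriptors of both days); the function and parameter names are ours, and 288 points per day follows from the 5-minute sampling interval.

```python
import numpy as np

def build_samples(load, tmax, tmin, rain, daytype, horizon=288):
    """Build (X, y) pairs: predict day d's load curve from day d-1's curve
    plus weather/day-type features of days d-1 and d.
    `load` is a 1-D array sampled every 5 min (288 points per day);
    the other arguments hold one value per day."""
    days = len(load) // horizon
    L = load[:days * horizon].reshape(days, horizon)
    X, y = [], []
    for d in range(1, days):
        feats = np.r_[L[d - 1],                 # previous day's full curve
                      tmax[d - 1], tmin[d - 1], rain[d - 1], daytype[d - 1],
                      tmax[d], tmin[d], rain[d], daytype[d]]
        X.append(feats)
        y.append(L[d])                          # target: next day's curve
    return np.array(X), np.array(y)
```

Under rolling forecasting, each newly observed day is appended to the history and the model input window slides forward by one day before the next prediction.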

Evaluating indicator
In order to accurately and intuitively evaluate the prediction performance of the three models in this study, three evaluation indexes are used: mean absolute percentage error (MAPE), mean absolute error (MAE) and root mean square error (RMSE). The calculation formulas are MAPE = (100%/M) Σ_{i=1}^{M} |y_i − ŷ_i|/y_i, MAE = (1/M) Σ_{i=1}^{M} |y_i − ŷ_i| and RMSE = ((1/M) Σ_{i=1}^{M} (y_i − ŷ_i)²)^{1/2}, where M represents the number of prediction points, y_i represents the real load value, and ŷ_i represents the predicted value of the corresponding model.
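The three indexes translate directly into code:

```python
import numpy as np

def mape(y, yhat):
    """Mean absolute percentage error, in percent (requires y != 0)."""
    return np.mean(np.abs((y - yhat) / y)) * 100.0

def mae(y, yhat):
    """Mean absolute error, in the units of the load."""
    return np.mean(np.abs(y - yhat))

def rmse(y, yhat):
    """Root mean square error; penalizes large errors more than MAE."""
    return np.sqrt(np.mean((y - yhat) ** 2))
```

MAPE is scale-free and easy to compare across datasets but is undefined at zero load, while MAE and RMSE keep the load's physical units; RMSE exceeds MAE whenever the errors are unevenly distributed.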

Result analysis
To accurately measure the performance of the forecasting model proposed in this paper, BP and Adaboost-BP neural network models are established as baselines to predict the load. In addition, loads from two day types, a workday and a holiday, are used to verify the superiority of the proposed model on the test set. The numerical comparison results of all prediction models are displayed in Figures 4-9. The absolute prediction error trajectories of the models are shown in Figures 4 and 7.
Although the absolute error of the proposed method is greater than that of the comparison models at individual data points, overall the absolute prediction error of the proposed method is lower than that of the other models on both the workday and the holiday, indicating that Adaboost-BP generates a better prediction effect after EEMD of the original load data. Figures 5 and 8 compare the real values with the predicted values of the three models for the two day types; compared with the other two forecasting models, the EEMD-Adaboost-BP forecasts are closer to the real values. Figures 6 and 9 depict the error distributions of the three models on the workday and holiday, which show the advantages of the proposed method more intuitively and clearly. The prediction results of the models are evaluated in Table 1.
As can be seen from Table 1, the following conclusions can be drawn: (1) Integrating the Adaboost algorithm with the BP model effectively improves the prediction effect. For instance, compared with BP, the MAE and RMSE of Adaboost-BP are reduced by 36.96% and 34.46% on the workday and by 40.50% and 34.46% on the holiday.
(2) Compared with the undecomposed model, the prediction effect of the decomposed model is significantly improved, indicating that the decomposition algorithm can effectively reduce the instability of the data. For example, compared with Adaboost-BP, the MAE and RMSE of EEMD-Adaboost-BP are reduced by 33.43% and 10.46% on the workday and by 9.52% and 8.53% on the holiday. To sum up, the proposed EEMD-based Adaboost-BP hybrid prediction model has higher prediction accuracy than the compared algorithms.
Remark 3.1: It can be seen from Figures 6 and 9 that the error distribution of the proposed method lies mainly between −15 and 15, indicating that it is more stable than the other models. In addition, the components F_IMFi(t) obtained by the EEMD algorithm capture the main random components of the original load sequence. According to the experimental results of BP, Adaboost-BP and EEMD-Adaboost-BP, decomposing the load into modal components F_IMFi(t) effectively reduces the prediction difficulty and improves the prediction accuracy.

Conclusion
In view of the ever-changing power demand, which is easily affected by the weather and exhibits strong random variability and significant nonlinearity, this paper has proposed a power load forecasting method based on EEMD-Adaboost-BP. Based on the analysis of the prediction results, the following conclusions are obtained: (1) Ensembling weak learners through the AdaBoost algorithm can effectively enhance the prediction performance of the model.
(2) EEMD is effective in reducing the non-stationarity of the load data, improving its quality and making the subsequent prediction more accurate.
(3) Compared with BP and Adaboost-BP, EEMD-Adaboost-BP is superior to the other two models and is more accurate for forecasting the overall trend of power load. However, some parameters of the model, such as the number of weak learners and the number of neurons, need to be tuned manually, which is time-consuming. Future work will focus on finding suitable optimization algorithms to adjust these key parameters.

Funding
This work was partially supported by the National

Notes on contributors
Wenshuai Lin received the M.S. degree in control engineering from the Guangdong University of Technology, Guangzhou, China, in 2020, where he is currently pursuing the Ph.D. degree in control science and engineering. His research interests include artificial intelligence algorithm and its application in model prediction.