Monthly and seasonal hydrological drought forecasting using multiple extreme learning machine models

ABSTRACT Hydrological drought forecasting is a key component in water resources modeling as it relates directly to water availability. It is crucial in managing and operating dams, which are constructed in rivers. In this study, multiple extreme learning machines (ELMs) are utilized to forecast hydrological drought. For this purpose, the standardized hydrological drought index (SHDI) and standardized precipitation index (SPI) are computed for 1 and 3 aggregated months. Two scenarios are considered, namely, using SHDI in previous months as the input, and using SHDI and SPI in previous months as the input. Considering these scenarios and two timescales (1 and 3 months), 12 input–output combinations are generated. Then, five different ELMs and support vector machine models are used to predict the SHDI on both timescales. For preprocessing of the data, the wavelet is hybridized with the models, leading to 144 different models. The results indicate that ELMs are capable of forecasting SHDI with high precision. The self-adaptive differential evolution ELM outperforms the other models and the wavelet has a highly positive effect on the model performance, especially in error reduction. In general, using ELMs in hydrological drought forecasting is promising and this model can feasibly be used for this purpose.


Introduction
Drought is defined as a period with below-normal availability of water (Tallaksen & Van Lanen, 2004). Drought has widespread effects on the water resources, society, economy and ecology of areas that are exposed to it. As an example, the prolonged drought in the Middle East during 1999-2001 caused extensive livestock deaths, land degradation and diseases (Barlow et al., 2016). It especially affects reservoir operation in dammed rivers. Based on the classification of Wilhite and Glantz (1985), drought can be categorized into meteorological, hydrological, agricultural and socio-economic types. Drought starts with a considerable shortage of precipitation and extends to hydrological drought, in which levels of water resources, including lakes, rivers and reservoirs, decline to below-normal conditions. Hydrological drought threatens water availability for humans and the environment. Climate change can intensify the impacts of drought and cause more frequent drought events. As such, drought prediction plays a substantial role in water resources management, especially in arid and semi-arid regions of the world, which suffer from limited freshwater availability. For drought monitoring and forecasting, it is essential to quantify the drought at the first stage. For this purpose, several drought indices have been proposed since the 1960s. Among these, the standardized precipitation index (SPI), for meteorological drought, gained a high level of popularity. Thereafter, most of the drought indices developed for each drought type have followed the SPI computational procedure. For hydrological drought, the surface water supply index (SWSI), standardized streamflow index (SSI) and standardized hydrological drought index (SHDI) are widely used.
CONTACT Qian Zhang 20200420@wzu.edu.cn; Shahab S. Band shamshirbands@yuntech.edu.tw; Amir Mosavi amir.mosavi@kvk.uni-obuda.hu
The next challenge is the selection of an appropriate model. As physical and conceptual models are usually data intensive, most research has been performed based on data-driven models (Belayneh et al., 2014). In recent years, different data-driven models have been employed, including autoregressive models (Faruk, 2010; Han et al., 2010; Tian et al., 2016; Zhang et al., 2020), support vector regression (SVR) (Deo et al., 2017; Ganguli & Reddy, 2014; Shamshirband et al., 2020; Tian et al., 2016; Xu et al., 2018), adaptive neuro-fuzzy inference systems (ANFIS) (Ali et al., 2018; Başakın et al., 2020; Rahmati et al., 2020) and artificial neural network (ANN) models (Khan et al., 2020; Nabipour et al., 2020; Singh et al., 2020). All of these models have their own advantages and disadvantages. The advantages of autoregressive models are their fast computational time and simple structure, but they depend on their own history and are not capable of forecasting values that have not already happened. SVR models have suitable generalization capability but do not perform well when encountering noisy data. ANFIS combines linguistic and numerical knowledge, and hence is extensively used as a prediction model. However, its application is limited in cases with a large number of inputs (Salleh et al., 2017). ANN models can work with incomplete knowledge and have parallel processing ability. On the other hand, the architecture of an ANN is not fixed, and the best result for each problem depends on an optimal architecture, which must be found through a trial-and-error procedure. Jehanzaib et al. (2021) address several issues and challenges in modeling drought with machine learning. They present a comprehensive evaluation of machine learning techniques for hydrological drought forecasting.
Choubin and colleagues (Choubin et al., 2014; Choubin, Khalighi-Sigaroodi, et al., 2016) model drought and further investigate the SPI using several advanced data-driven methods, including multiple linear regression (MLR), a multi-layer perceptron network and ANFIS, while also considering large-scale climate signals. Dikshit et al. (2021), alternatively, propose a model for drought forecasting with a longer lead-time using lagged climate variables and a stacked long short-term memory model. Most recently, Shahdad and Saber (2022) show how ensemble-based models of a reduced error pruning tree can be even more effective. Nevertheless, the application of extreme learning machines (ELMs) and wavelets for drought forecasting has not been fully explored.
Drought forecasting involves a complex modeling system owing to the inherent nature of this natural phenomenon. The physical process of drought and the spatiotemporal variability of its characteristics, associated with non-stationary, nonlinear and complex behaviors, render it a complicated case for forecasting. Understanding the propagation of drought from one type to another can help to improve drought prediction. Basically, drought starts with a considerable shortage of precipitation, which means that a meteorological drought happens. As a result of a lasting meteorological drought, a water deficit will occur, which is termed a hydrological drought. Thus, it is possible to use the meteorological drought as a driver of hydrological drought. Based on previous research, as mentioned above, different models have been utilized for drought forecasting, but there is no unique model that can be fully relied upon for this purpose. However, accurate and reliable drought forecasting is vital in water resources planning and management. In the twenty-first century, a new approach using artificial intelligence, the ELM, was developed by Huang et al. (2006). It needs less computational time compared to ANN, SVR and ANFIS, and can automatically generate the weights and biases of the hidden layer based on a probability distribution function (PDF), while retaining appropriate generalization skills (Deo et al., 2017; Yaseen et al., 2019). ELMs have been used extensively in different fields, including the environment, energy, climatology and health (Mohammadi et al., 2015; Shamshirband et al., 2020; Zhu et al., 2019). However, few studies have used ELMs for drought forecasting. The first attempt at using an ELM for drought forecasting was performed by Deo and Şahin (2016). They used an ELM to forecast the effective drought index (EDI) and compared the results with the conventional ANN, verifying the superiority of the ELM. Deo et al. (2017) used a wavelet ELM for prediction of a 1 month lead-time EDI. Ali et al. (2018) compared ELM, ANFIS and MLR models in the prediction of the SPI in Pakistan. Finally, Mouatadid et al. (2018) evaluated SVR, ANN, MLR and ELM in forecasting the standardized precipitation evapotranspiration index (SPEI).
From this literature review, it is apparent that ELMs have rarely been used for drought forecasting and that their application in this field is still emerging. To the best of the authors' knowledge, ELMs have not previously been used for hydrological drought forecasting. This motivated the authors to investigate the capability of the ELM in hydrological drought forecasting. As different ELM models have been proposed during the past decade, five different ELM models were selected for this purpose. Besides, the wavelet as a preprocessing tool has been tested in previous research, coupled with ANNs, ANFIS, etc., resulting in different performance and mostly improving the applications (e.g. Choubin et al., 2014; Choubin, Khalighi-Sigaroodi, et al., 2016). Therefore, in this study, the wavelet is coupled with different ELM models to examine its capability in drought forecasting. For this purpose, the Dez basin in the south-west of Iran is selected and the SHDI is computed on 1, 3 and 6 month timescales based on the measured stream flows upstream of Dez dam. Twelve different input-output combinations are considered for modeling. The aims of this study are four-fold: (1) to evaluate the capability of ELM models in hydrological drought forecasting and to select the best model; (2) to use meteorological drought as a driver of hydrological drought and evaluate the impact of the SPI as a meteorological drought index on SHDI forecasting; (3) to evaluate the impact of the wavelet as a preprocessing procedure on the capability of different ELM models for hydrological drought forecasting; and (4) to compare five different ELM models in drought forecasting.
The rest of the paper is organized as follows. Section 2 presents the study area and methodology, including the SPI and SHDI calculation procedure, ELM architecture and evaluation criteria. Section 3 includes the results of the study. Finally, Section 4 presents the conclusions of this study.

Study area
Dez dam was constructed in 1963 on the Dez River in the south-west of Iran. It is a double-curvature arch dam with a height of 203 m. The initial reservoir volume was 3340 × 10⁶ m³, which was reduced to 2600 × 10⁶ m³ during the operation period as a result of sedimentation (Boroujeni, 2012). This is a multi-purpose dam, which supplies water for domestic and industrial purposes and for the irrigation of 125,000 ha of farmland downstream of the dam. It also controls floods, which regularly occur in the Dez basin. A 520 MW hydropower plant has been installed downstream of the dam. As it is the only reservoir on this river, the drought and wet periods, which lead to a respective decrease and increase in precipitation, and ultimately in water storage, must be given special attention in this basin.
Dez basin ranges from 32°35′ to 34°07′ N and from 48°20′ to 50°20′ E (Figure 1). It is bordered by Karun basin to the east and south, Ghareh Chay to the north and Karkheh to the west (Valipour et al., 2013). The inflow to the dam is measured at Tale-Zang station, upstream of the dam. There are four meteorological stations for measuring precipitation in the basin and 10 others outside and around the basin. The data measured at these 14 stations are used to estimate the average precipitation over the basin using an inverse distance weighting method. Both precipitation and discharge time series range from 1963 to 2017. The data are gathered from the Iran Meteorological Organization (IMO) and Iran's Water Resources Management Company.
McKee et al. (1993) proposed the SPI as a standardized drought index, and it is widely used in several countries across the world. The SPI computation procedure includes two main steps. First, a PDF is fitted to the precipitation records in each timescale, in which gamma is used as the default PDF, although it is not limited to this PDF. Second, the probabilities of the best fit are transformed to the standardized normal probability with an equi-probability transformation and, finally, the z-scores relative to these probabilities are found. These z-scores are the SPI values. Negative and positive values represent the drought and wet periods, respectively. Dehghani et al. (2014) used this procedure to develop the SHDI as a hydrological drought index, by replacing the precipitation with discharge. As the computation procedures for these two drought indices are the same, it is possible to use them jointly for meteorological and hydrological drought analysis. The classification of the SPI and SHDI is presented in Table 1. For detailed descriptions of SPI and SHDI computation, one may refer to Hayes et al. (1999), Lana et al. (2001) and Wu et al. (2005, 2007).
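The two-step standardization described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: to stay dependency-free, it replaces the fitted PDF (gamma by default in the text) with an empirical plotting position, which is a simpler, non-parametric way of performing the equi-probability transformation. The function names `aggregate` and `standardized_index` are hypothetical, and the sketch treats the whole series as one sample rather than standardizing each calendar month separately.

```python
from statistics import NormalDist


def aggregate(series, k):
    """k-month moving total; the first k-1 months have no full window and are dropped."""
    return [sum(series[i - k + 1:i + 1]) for i in range(k - 1, len(series))]


def standardized_index(values):
    """Map each value to a standard normal z-score via its empirical probability.

    Assumption: the cumulative probability is estimated with the plotting
    position rank / (n + 1) instead of a fitted gamma (or other) PDF.
    Tied values receive distinct ranks in order of appearance.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices from smallest to largest
    z = [0.0] * n
    for rank, i in enumerate(order, start=1):
        p = rank / (n + 1)                  # strictly inside (0, 1)
        z[i] = NormalDist().inv_cdf(p)      # equi-probability transformation
    return z


# Usage: a 3-month aggregated series turned into index values.
monthly = [12.0, 40.0, 5.0, 60.0, 33.0, 8.0, 21.0]
index_3 = standardized_index(aggregate(monthly, 3))
```

The same function applies to precipitation (SPI-like values) and to discharge (SHDI-like values), which mirrors the point made in the text that both indices share one computational procedure. Low totals map to low z-scores and high totals to high z-scores.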

Extreme learning machine
The ELM, which was developed by Huang et al. (2006), is widely used in different aspects of environmental modeling (Yaseen et al., 2019). It is a novel approach using a single-layer feedforward neural network (SLFN). The main idea behind this model is that there is no need to tune the internal parameters of the model, including the hidden neurons. In fact, this is an improved ANN model, which reduces the execution time. In this way, the weights and biases are randomly generated and the output weights have a unique least squares solution (Yaseen et al., 2019). Based on the randomly initiated hidden neurons, the ELM is capable of attaining a global optimum solution (Huang & Chen, 2007). The general architecture of the ELM is presented in Figure 2.

Table 1. Different input-output combinations used for modeling. Columns: Model, Input, Output. Note: SPI = standardized precipitation index; SHDI = standardized hydrological drought index.
In recent years, different alternative versions of the ELM have been proposed. Qin et al. (2009) proposed the self-adaptive differential evolution extreme learning machine (SaDE-ELM). Differential evolution was proposed by Storn and Price (1997) as an efficient stochastic search algorithm. However, the vector generation strategies and the associated parameters were determined through a trial-and-error procedure, which was computationally time consuming (Qin et al., 2009). Thus, Qin et al. (2009) developed the SaDE algorithm, in which the trial vector generation and associated parameters were determined through a gradual self-adaptation. Zong et al. (2013) suggested the weighted extreme learning machine (W-ELM) for regression or classification tasks with imbalanced class distribution. This new alternative was developed for easy and fast implementation. The online sequential extreme learning machine (OS-ELM) was proposed by Liang et al. (2006). The bidirectional extreme learning machine (B-ELM) was proposed by Yang et al. (2012) for regression problems. These five ELM models, namely, ordinary ELM, W-ELM, B-ELM, SaDE-ELM and OS-ELM, are utilized in the present research. Detailed descriptions of the computational procedure and mathematical formulation of ELMs have been presented elsewhere (Liang et al., 2006; Mohammadi et al., 2015; Sajjadi et al., 2016; Yang et al., 2012; Yaseen et al., 2019; Zong et al., 2013).
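To make the training idea of the ordinary ELM concrete, the following is a minimal sketch written with NumPy. It covers only the basic ELM, not the W-ELM, B-ELM, SaDE-ELM or OS-ELM variants, and it is not the implementation used in this study. The tanh activation, the uniform range of the random weights, the default number of hidden neurons and the names `elm_fit` and `elm_predict` are all assumptions made for illustration.

```python
import numpy as np


def elm_fit(X, y, n_hidden=20, seed=0):
    """Train a basic ELM: random hidden layer, least squares output weights."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are drawn once at random and never tuned.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = np.tanh(X @ W + b)              # hidden-layer output matrix (assumed tanh activation)
    # Output weights: minimum-norm least squares solution via the pseudo-inverse.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta


def elm_predict(X, model):
    """Apply the fixed random hidden layer and the fitted output weights."""
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta
```

Because only the output weights are solved for, and in closed form, training amounts to one matrix pseudo-inverse, which is the source of the short execution time mentioned in the text.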

Input-output combinations
One of the main challenges in any prediction is the determination of input-output combinations. It is necessary to find the drivers of each phenomenon and arrange the input(s) for it. Drought is a complicated phenomenon and accurate forecasting of drought needs a high precision in input selection. As different types of drought are successive, and meteorological, hydrological, agricultural and socio-economic droughts occur successively, it is feasible to use meteorological drought as a predictor of hydrological drought. Besides, the status of hydrological drought in previous months can be a suitable predictor. Therefore, in this research, for monthly and seasonal hydrological drought (depicted by the SHDI) forecasting, the SHDI in previous months and meteorological drought (depicted by the SPI) are selected as predictors. Based on this, 12 input-output combinations are considered for modeling, as presented in Table 1.
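As an illustration of how such input-output combinations can be assembled from the index series, the sketch below builds lagged samples in plain Python. The function name `make_samples` and the lag sets in the usage lines are hypothetical; the actual lag sets of M1-M12 are those listed in Table 1 and are not reproduced here.

```python
def make_samples(shdi, shdi_lags, spi=None, spi_lags=()):
    """Build (inputs, outputs) for forecasting shdi[t] from lagged predictors.

    Each input row holds shdi[t - l] for every l in shdi_lags, followed by
    spi[t - l] for every l in spi_lags when an SPI series is supplied.
    """
    start = max(list(shdi_lags) + list(spi_lags))  # first t with all lags available
    inputs, outputs = [], []
    for t in range(start, len(shdi)):
        row = [shdi[t - l] for l in shdi_lags]
        if spi is not None:
            row += [spi[t - l] for l in spi_lags]
        inputs.append(row)
        outputs.append(shdi[t])
    return inputs, outputs


# Usage with hypothetical lag sets:
shdi = [0.1, 0.2, 0.3, 0.4, 0.5]
spi = [1.0, 2.0, 3.0, 4.0, 5.0]
X_a, y_a = make_samples(shdi, shdi_lags=(1, 2))                         # SHDI-only scenario
X_b, y_b = make_samples(shdi, shdi_lags=(1,), spi=spi, spi_lags=(1,))   # SHDI + SPI scenario
```

The two calls correspond to the two scenarios considered in this study: previous SHDI values alone, and previous SHDI values together with previous SPI values.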

Evaluation metrics
The performance of the proposed framework is evaluated using statistical criteria, namely the root mean squared error (RMSE), coefficient of determination (R²) and mean absolute error (MAE). Several visualization approaches are also used to assess the model predictions.
RMSE is a common error indicator that shows how close the data points are to a best fit line (Equation 1) (Amr et al., 2011; El-Shafie et al., 2012). R² is computed as the squared value of the Bravais-Pearson correlation coefficient (Artusi et al., 2002) (Equation 2). It describes the proportion of the observed dispersion that is explained by the estimation. The range of R² is from 0 to 1. A value of 0 indicates no correlation between the observed and predicted data, whereas an R² equal to 1 shows that the scattering of the estimation data is the same as that of the observation (Krause et al., 2005). MAE is a measure of errors between paired observations expressing the same phenomenon (Fan et al., 2013) (Equation 3).
These parameters are computed using the following formulations:
\[ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(O_i - P_i\right)^2} \tag{1} \]
\[ R^2 = \left[\frac{\sum_{i=1}^{N}\left(O_i - \bar{O}\right)\left(P_i - \bar{P}\right)}{\sqrt{\sum_{i=1}^{N}\left(O_i - \bar{O}\right)^2}\,\sqrt{\sum_{i=1}^{N}\left(P_i - \bar{P}\right)^2}}\right]^2 \tag{2} \]
\[ \mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|O_i - P_i\right| \tag{3} \]
in which $O_i$ is the observation value, $P_i$ is the predicted model output, $\bar{O}$ is the average of observations, $\bar{P}$ is the average of model outputs, and $N$ is the number of data. Besides, the scatterplot, Sina plot and the plot of error in each model are used for the evaluation of model results.
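The three criteria defined in this section can be computed directly from paired observed and predicted values. The sketch below uses only the Python standard library; the function names are hypothetical, and R² is taken as the squared Bravais-Pearson correlation coefficient, as stated in the text.

```python
from math import sqrt


def rmse(obs, pred):
    """Root mean squared error between observations and predictions."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error between observations and predictions."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination as the squared Pearson correlation."""
    o_mean = sum(obs) / len(obs)
    p_mean = sum(pred) / len(pred)
    cov = sum((o - o_mean) * (p - p_mean) for o, p in zip(obs, pred))
    var_o = sum((o - o_mean) ** 2 for o in obs)
    var_p = sum((p - p_mean) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


# Usage: predictions that are offset from the observations by a constant 0.5.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.5, 3.5, 4.5]
scores = (rmse(obs, pred), r2(obs, pred), mae(obs, pred))
```

In this usage example the predictions carry a constant bias, which RMSE and MAE register but the squared correlation does not. This is one reason for reporting the three criteria together rather than R² alone.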

Results
In this study, for SPI and SHDI computations, different PDFs, including gamma, log-normal (3P), log-logistic, Johnson SB, Weibull, Pearson type III and Burr, are fitted to the precipitation and streamflow time series to find the best fit. Based on Vicente-Serrano et al. (2012), it is recommended to use the best fit for each month rather than to use one PDF for all months. Thus, the authors used the PDF with the best fit for each month to compute the SPI and SHDI on different timescales. After SPI and SHDI computation, the models presented in Table 1 were used for drought forecasting. The parameters of different ELM and support vector machine (SVM) models were tuned through a trial-and-error procedure. The results of 10 ELM and two SVM models using the combinations in Table 1 are presented in Table 2.
Note: SaDE-ELM = self-adaptive differential evolution extreme learning machine; W-ELM = weighted extreme learning machine; OS-ELM = online sequential extreme learning machine; B-ELM = bidirectional extreme learning machine; RMSE = root mean squared error; R² = coefficient of determination; MAE = mean absolute error.
Figure 3. Scatterplots of M1-M12 in the training and testing phases using the wavelet self-adaptive differential evolution extreme learning machine (wavelet SaDE-ELM). SHDI = standardized hydrological drought index.
A quick overview shows that the wavelet SaDE-ELM outperforms the other models. The wavelet also has a positive effect on the performance of the ELM and SVM models: in almost all of the models and input-output combinations, the results of the models with wavelets are better than those of the standalone ELM and SVM models. Models M1-M3 consider previous SHDI values for SHDI1 forecasting. Among them, M2 and M3 perform better than M1 in all 12 models. While wavelet SaDE-ELM has the best performance, B-ELM is the poorest model. As the results of M2 and M3 are almost the same, it is possible to conclude that SHDI1 (t) in each month is related to SHDI1 (t − 1) and SHDI1 (t − 2), whereas SHDI1 (t − 3) is redundant and has negligible or no effect on SHDI1 (t) forecasting.
M4-M6 are used to forecast SHDI3 (t). RMSE and R² in the best model are 0.60, 0.37, 0.20 and 0.61, 0.83, 0.97 for M4, M5 and M6, respectively. This shows that the model performs well in all scenarios, although by increasing the input variables, the results improve considerably. Again, the worst results belong to B-ELM. In M7-M9, SPI1 with different lags is used with SHDI1 in previous months for SHDI1 (t) forecasting. As in the previous cases, wavelet SaDE-ELM has the best performance. However, using the SPI associated with the SHDI has no or negligible effect on improving the results. In some scenarios, such as M8 and M9, the results are even poorer compared to M2 and M3. This means that the SHDI values in previous months are the most suitable input variables for SHDI1 forecasting, and using SPI just results in a complicated ELM network, which leads to poorer results. This can also be observed in M10-M12. In these models, using the SPI cannot improve the results. Moreover, an overall comparison between the ELM and SVM models revealed that wavelet SaDE-ELM, as the most accurate ELM model, provided more suitable predictions than the SVM and wavelet SVM models.
In the next step, the scatterplots of M1-M12 for both training and testing phases using wavelet SaDE-ELM are plotted and presented in Figure 3. There is a suitable agreement between the observed and predicted SHDI values, and no considerable overestimation or underestimation is visible in these plots. Besides, there is a close agreement between the training and testing phases, which indicates robust modeling.
Time series of M1-M12 are plotted for the testing phase using wavelet SaDE-ELM in Figure 4. The error of the models in forecasting the SHDI is also presented in these plots. In M1-M3, a suitable agreement can be seen between the observed and forecast SHDI. The high/low values are forecast with an acceptable error. Just two values in the time series are predicted with more than 1 unit error. Among them, M3 has the lowest error compared to M1 and M2. Although the statistical metrics show that M2 and M3 have nearly the same performance, considering SHDI1 (t − 3) as an input leads to the lowest error. For M4-M6, while it is possible to find a general agreement between the observed and forecast values, the errors are high, especially in M4 and M5. The error reduces considerably from M4 to M6, which indicates the effect of inputs on the model performance. The model performs nearly perfectly in M6, with low errors. All SHDI values are predicted with less than 1 unit error. There is no obvious difference between M8 and M9, while the results of the model using these two combinations are better than those of M7. However, it can be observed that in M8, there is a better agreement between the observed and forecast values even in extreme values. M10-M12 have nearly the same performance. However, as M12 has just one error value above 1, it can be considered as the best scenario.
Figure 5. Frequencies of errors in M1-M12 using the wavelet self-adaptive differential evolution extreme learning machine (W-SaDE-ELM) and the self-adaptive differential evolution extreme learning machine (SaDE-ELM).
Finally, the effect of the wavelet as a preprocessing procedure in modeling is evaluated. The frequencies of errors in modeling using wavelet SaDE-ELM and SaDE-ELM are presented in Figure 5. Based on this figure, the model error reduces considerably in M1-M12 when the wavelet is used as a preprocessing tool. In M1, 51% of forecast values have an error less than 0.3 using the wavelet, while 35% have this range of error without the wavelet. Besides, 2% and 15% of the forecast values have an error more than 1 with and without the wavelet, respectively. In all scenarios, 50% of the wavelet SaDE-ELM forecast values have an error less than 0.3, whereas only in M12 is the standalone SaDE-ELM capable of predicting values with an error less than 0.3 in more than 50% of cases. Moreover, an error greater than 1 is seen in less than 5% of values in all wavelet SaDE-ELM models, except in M10, while in the SaDE-ELM models, an error greater than 1 happens in more than 10% of values in all scenarios except for M6 and M12.
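The error-frequency comparison above reduces to counting how many forecast errors fall below or above a threshold. A minimal sketch is given below; the function name `error_shares` is hypothetical, and the use of the absolute error is an assumption, since the text does not state whether signed or absolute errors are counted. The thresholds 0.3 and 1 are the ones quoted in the text.

```python
def error_shares(obs, pred, low=0.3, high=1.0):
    """Percentage of forecasts with |error| below `low` and above `high`."""
    errors = [abs(o - p) for o, p in zip(obs, pred)]
    n = len(errors)
    share_low = 100.0 * sum(e < low for e in errors) / n
    share_high = 100.0 * sum(e > high for e in errors) / n
    return share_low, share_high


# Usage with made-up values (not data from this study):
observed = [0.0, 0.0, 0.0, 0.0]
forecast = [0.1, 0.2, 0.5, 1.5]
shares = error_shares(observed, forecast)
```

Applying the same function to the forecasts of a model with and without wavelet preprocessing gives the pair of percentages that is compared for each input-output combination.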

Conclusion
In this study, multiple ELM and SVM models are used to forecast hydrological drought captured by the SHDI. For this purpose, the SPI and SHDI are computed on 1 and 3 month timescales. Five different ELM models and one SVM model are used for modeling, and 12 input-output combinations are considered. For preprocessing, the wavelet is coupled with the different ELM and SVM models. Based on the results, SaDE-ELM is the best model for SHDI modeling on both timescales and with different input-output combinations, while B-ELM is the worst. Besides, the wavelet has a positive effect on the precision of the models: the models that are coupled with the wavelet have less error in the predicted values. The error distribution shows that a lag of at least 3 months is needed for precise forecasting, while the statistical criteria show that a lag of 2 months is satisfactory. Moreover, using the SPI as input for SHDI prediction does not improve the model performance considerably. It can be concluded that using ELM models for hydrological drought forecasting is promising, and this method can be used as an effective tool for water resources management.