Prediction of daily water level using new hybridized GS-GMDH and ANFIS-FCM models

Accurate prediction of water level (WL) is essential for the optimal management of different water resource projects. The development of a reliable model for WL prediction remains a challenging task in water resources management. In this study, novel hybrid models, namely, Generalized Structure-Group Method of Data Handling (GS-GMDH) and Adaptive Neuro-Fuzzy Inference System with Fuzzy C-Means (ANFIS-FCM), were proposed to predict the daily WL at Telom and Bertam stations located in the Cameron Highlands of Malaysia. Three data division ratios, i.e. 50%–50% (scenario-1), 60%–40% (scenario-2), and 70%–30% (scenario-3), were adopted for training and testing these models. To show the efficiency of the proposed hybrid models, their results were compared with those of the standalone models, namely Gene Expression Programming (GEP) and Group Method of Data Handling (GMDH). The results revealed that the hybrid GS-GMDH and ANFIS-FCM models outperformed the standalone GEP and GMDH models for the prediction of daily WL at both study sites. In addition, the best performance for WL prediction was obtained in scenario-3 (70%–30%). In summary, the results highlight the better suitability of the proposed hybrid GS-GMDH and ANFIS-FCM models for daily WL prediction, and these models can serve as robust and reliable predictive tools for the study region.


Introduction
Prediction of river water level is a critical process in river discharge estimation and is required for better water resources management (Dingman & Bjerklie, 2005; Tsujikura et al., 2016; Vachtman & Laronne, 2014). Accurate prediction of river water level improves flood prediction systems and can act as a warning alarm for early decision-making and planning to reduce the effects of flood events, which are considered among the most damaging natural hazards to life and property (Hettiarachchi & Thilakumara, 2014; Morales-Pinzón et al., 2015; Tsujikura et al., 2016; Xu et al., 2019). In Malaysia, floods and flash floods often occur due to prolonged heavy rainfall, and the possibility of floods may increase as a result of climate change and global warming (Arbain & Wibowo, 2012; Buslima et al., 2018; Suri et al., 2014). To deal with the flood phenomena, three categories of critical river water levels have been introduced by the Department of Irrigation and Drainage (DID) Malaysia, namely, normal, alert, and danger levels (Gasim et al., 2007). The three categories have been identified by analyzing the characteristics of floods in Malaysia over many years, such as water level, peak discharge, inundated area, volume of flow, and flood duration.
CONTACT Saad Sh. Sammen Saad123engineer@yahoo.com
Water level prediction in rivers is usually conducted using empirical models. These empirical models are developed from long time series of data accumulated with in situ sensors that are expensive, hard to maintain, and available only in specific areas (Rigos et al., 2020). However, according to Hettiarachchi and Thilakumara (2014), predictions of river water level from non-linear models that include many environmental parameters (e.g. catchment area and flow rates) agreed only imperfectly with observed data; this may be due to the complex nature of dynamic and rapid water level fluctuations or to the omission of some important parameters in the theory (See & Openshaw, 1999). Moreover, modeling these complex processes with differential equations has little use in practice, as it results in complex, time-consuming, and mathematically intractable non-linear models.
Moreover, a comparison among artificial neural network (ANN), Autoregressive Moving Average (ARMA), and support vector machine (SVM) models conducted by Lin et al. (2006) revealed that the SVM model can give a more accurate prediction of long-term flow discharges than the others. A comprehensive review of the applications of genetic programming (GP) in the analysis of water resources systems was conducted by Mohammad-Azari et al. (2020). The review indicates the capability and superiority of GP for solving a wide variety of water-related problems such as modeling rainfall-runoff, streamflow, sedimentation, flood, evaporation, water quality, water demand, and water distribution systems. Moosavi et al. (2017) evaluated the performance of GMDH and wavelet-GMDH models for daily runoff forecasting in the Darian-Chay, Ghale-Chay, and Lilan-Chay Rivers in East Azerbaijan (Iran). The evaluation indicates that the performance of the GMDH model was effectively enhanced when the wavelet-analyzed data were added to the model to deal with the non-stationarities in the data.
In this study, two hybrid models, namely, Generalized Structure-Group Method of Data Handling (GS-GMDH) and Adaptive Neuro-Fuzzy Inference System with Fuzzy C-Means (ANFIS-FCM), were developed using data obtained from two water level stations located on the Perak River, Malaysia. The study also compared the efficiency and performance of the hybrid models (i.e. GS-GMDH and ANFIS-FCM) with two standalone models, namely, Gene Expression Programming (GEP) and Group Method of Data Handling (GMDH), through statistical indicators and graphical interpretation. The results of this study indicate better accuracy of the hybrid GS-GMDH and ANFIS-FCM models in river water level prediction.

Study area
Cameron Highlands is the smallest district in the state of Pahang Darul Makmur and shares its borders with the states of Kelantan and Perak in the north and west, respectively. It is situated in the Main Range (Banjaran Titiwangsa) between 4°27′53″N–4°32′39″N and 101°23′10″E–101°25′25″E. The district, with an estimated area of 71,218 hectares, is hilly, with elevations ranging from 300 m in the river valleys on the eastern boundary to 2110 m (Gunung Irau) on the western boundary. The highest point accessible by road in Peninsular Malaysia, Gunung Brinchang (2031 m), is one of the major peaks in Cameron Highlands, aside from Gunung Swettenham (1961 m), Gunung Siku (1916 m), Gunung Berembun (1840 m), Gunung Cantik (1802 m) and Gunung Jasar (1704 m). About 75% of the area of the district is situated above 1000 m elevation. The study area falls within Cameron Highlands District in Pahang Darul Makmur, whose area is estimated to be 712 km². Its temperature does not exceed 25°C, and it is widely known as a hilly region with horticultural practices (Eisakhani & Malakahmad, 2009). Cameron Highlands comprises three major catchments, Bertam, Telom, and Lemoi, as shown in Figure 1. Bertam comprises five main sub-catchments, which are Habu, Ringlet, Lembah Bertam, Tanah Rata, and Brinchang, while Tringkap, Kampung Raja, and Kuala Terla are the sub-catchments in Telom. Cameron Highlands receives an average annual precipitation of 2800 mm, and it rains on about 2 out of 3 days (Tan & Beh, 2015). Therefore, precipitation could be felt almost every day in Cameron Highlands (Nasidi et al., 2021). The details of the stations are given in Table 1. Table 2 summarizes the statistical parameters of WL at both stations, i.e. maximum (Max.), minimum (Min.), standard deviation (SD), skewness (Skew), and the first, second, and third quartiles (Q1, Q2, and Q3).

Gene expression programming (GEP)
GEP was initially introduced by Ferreira (2002). It is an evolutionary technique based on genetic algorithms (GA) and has been broadly adopted in recent investigations (Ebtehaj et al., 2015a; Ferreira, 2002). The computer program of GEP is encoded in linear chromosomes, which are then expressed as trees (Shabani et al., 2018). A schematic diagram of GEP appears in Figure 2. The initial step is to create the initial population through the random generation of chromosomes. The chromosomes are then converted to expression trees (ETs), which are evaluated with performance measures to show the fitness of the resulting ETs. If the outcomes satisfy the performance criteria, the generation of populations stops; if the outcomes are not satisfactory, the system reproduces with some modification to create a new generation with improved fitness, and this procedure continues until the best outcomes are achieved. For additional details about GEP, readers are referred to Ferreira (2006) and Kiafar et al. (2017). Because there is no definite method for finding the optimum values of the GEP parameters, the optimum values for each station were found through a trial-and-error process (Azimi et al., 2017). The optimum values of the GEP parameters for each station are provided in Table 3.

Group method of data handling (GMDH)
Ivakhnenko (1971) initially proposed the GMDH method. It is used in areas such as deep learning and knowledge discovery and is applied in several fields such as forecasting, pattern recognition, and optimization. GMDH algorithms make it possible to discover interrelations in data automatically, to obtain the best structure of a model or network, and to enhance the accuracy of existing algorithms. GMDH contains numerous algorithms for the solution of various types of problems, including clustering, parametric, and probabilistic algorithms.
This method relies on sorting out gradually more complex models and choosing the best solution by means of the minimum of an external criterion. Generally, the model has numerous inputs and one output, which is a subset of elements of the base function (Madala & Ivakhnenko, 2019). To obtain the best solution, the method considers a variety of subsets of elements of the base function (Madala & Ivakhnenko, 2019), known as partial models. Least-squares techniques are used to find the coefficients of these models. GMDH algorithms gradually increase the number of partial model elements and find a model structure with optimal complexity, indicated by the minimum value of an external criterion. This method is known as the self-organization of models (Schmidhuber, 2015).
In the corresponding model equation, $Y(x_1, \ldots, x_n)$ is written in terms of the input variables, and $n$ is the number of input variables. The coefficients $\alpha(\alpha_1, \ldots, \alpha_n)$ are obtained via regression techniques for each pair of input variables $x_i$ and $x_j$ (Farlow, 1981). Hence, the GMDH algorithm uses various second-degree polynomials.
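The polynomial itself is not reproduced in the text above. As a minimal sketch of a single GMDH partial model, assuming the commonly used quadratic form y = a0 + a1·x_i + a2·x_j + a3·x_i·x_j + a4·x_i² + a5·x_j² for one pair of inputs, the coefficients can be estimated by least squares as shown below. The function names and the synthetic data are illustrative assumptions and are not taken from the study.

```python
import numpy as np

def quad_design(xi, xj):
    """Design matrix of the assumed quadratic partial model for one input pair."""
    xi = np.asarray(xi, dtype=float)
    xj = np.asarray(xj, dtype=float)
    # Columns: 1, xi, xj, xi*xj, xi^2, xj^2
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])

def fit_partial_model(xi, xj, y):
    """Estimate the six coefficients by ordinary least squares."""
    coeffs, *_ = np.linalg.lstsq(quad_design(xi, xj), np.asarray(y, dtype=float), rcond=None)
    return coeffs

def predict_partial_model(coeffs, xi, xj):
    """Evaluate the fitted partial model on new inputs."""
    return quad_design(xi, xj) @ coeffs

# Synthetic, noise-free data generated from known coefficients (illustrative only).
rng = np.random.default_rng(0)
x1 = rng.uniform(-1.0, 1.0, 200)
x2 = rng.uniform(-1.0, 1.0, 200)
true_coeffs = np.array([1.0, 2.0, -0.5, 0.3, 0.1, -0.2])
y = quad_design(x1, x2) @ true_coeffs

a = fit_partial_model(x1, x2, y)
print(np.round(a, 3))
```

In a full GMDH network, neurons of this kind would be fitted for many input pairs and ranked with an external criterion evaluated on data not used for the fit.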

Generalized structure-group method of data handling (GS-GMDH)
The standard GMDH model has a few drawbacks, notably its low performance on complex and nonlinear problems. In the present investigation, a novel encoding of GMDH was developed to increase the accuracy of the standard GMDH model. The main drawback of the standard model is the use of only two inputs for every neuron. In addition, in standard GMDH models the input variables of every neuron are chosen from neighboring neurons. In the present study, a new generalized structure of the GMDH algorithm (GS-GMDH) was developed to reduce these limitations. In GS-GMDH, a neuron can have two or three input variables, and the polynomials can be of second or third order. Besides, the inputs of every neuron can be chosen from both neighboring and non-neighboring layers. The most favorable structure of GS-GMDH (Figure 3) is selected based on the Akaike Information Criterion (AIC) (Ebtehaj et al., 2015b), in which N is the number of neurons in the model, n is the number of samples, and MSE is the mean square error.
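The AIC expression itself is not reproduced in the text above. As a hedged illustration, assuming the commonly used form AIC = n·ln(MSE) + 2N (the study may use a variant with additional constant terms), candidate structures could be ranked as sketched below. The candidate names and their MSE values are invented for the example.

```python
import math

def aic(mse, n_samples, n_neurons):
    """Assumed AIC form: n * ln(MSE) + 2 * N. Lower values are preferred."""
    if mse <= 0 or n_samples <= 0:
        raise ValueError("mse and n_samples must be positive")
    return n_samples * math.log(mse) + 2 * n_neurons

# Hypothetical candidates: name -> (validation MSE, number of neurons).
candidates = {"compact": (0.040, 3), "large": (0.038, 9)}
n = 100  # hypothetical number of samples

scores = {name: aic(mse, n, k) for name, (mse, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

In this made-up case the larger structure has a slightly lower MSE, but the complexity penalty outweighs the gain, so the compact structure is selected.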

Adaptive neuro-fuzzy inference system (ANFIS)
The adaptive neuro-fuzzy inference system (ANFIS) is a combination of ANN and fuzzy logic (FL) for solving non-linear problems, and it was initially developed by Jang et al. IF-THEN fuzzy rules are utilized in model development by ANFIS (Sobhani et al., 2010; Yuan et al., 2014). It draws on the advantages of both ANN and FL and can be used successfully where traditional techniques fail or are too cumbersome (Vakhshouri & Nejadi, 2018). The shape and number of membership functions (MFs) are significant parameters in ANFIS for generating a model with the least error. Figure 4 displays the structure of an ANFIS model with two input variables. For simplicity of illustration, only two inputs, p and q, and a single target, y, are considered in this figure. The membership grades of the inputs specified in the first stage are:

$$S_{1,M_a} = \mu_{M_a}(p), \qquad S_{1,N_a} = \mu_{N_a}(q), \qquad a = 1, 2$$

where p and q are crisp inputs, and $M_a$ and $N_a$ are fuzzy sets (e.g. low, medium, high) to which membership functions are applied; these functions can have any shape, such as triangular, trapezoidal, bell-shaped, or Gaussian.

Fuzzy C-means method (FCM)
The k-means algorithm is one of the most widely used clustering algorithms. As an unsupervised algorithm, it faces limitations when applied to large data sets. To deal with this shortcoming, different clustering algorithms have been proposed, and fuzzy C-means clustering is used as an alternative technique. Fuzzy C-means (FCM) was presented by Bezdek et al. (1981) and improved by Cai et al. (2007). In fuzzy clustering, patterns with common features are grouped into clusters, and a pattern can belong to multiple clusters with different degrees of membership. In the FCM algorithm, patterns are partitioned into C clusters; the number of clusters (C) is specified in advance, but the cluster centers are initially chosen at random. The degree of membership of each pattern is determined, according to the membership function, by its distance from the center of each cluster. The goal of the FCM algorithm is to find a grouping in which the similarity between patterns in different clusters is minimized. One of the primary benefits of the FCM technique is that every data point can be associated with two or more clusters. The FCM cluster centers are obtained by minimizing an objective function, defined as the squared distance between each cluster center and each data point, weighted by the membership of that point.
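The FCM procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration of the textbook algorithm with randomly chosen initial centers and a fuzzifier m = 2, not the implementation used in the study; the function names, the default parameters, and the synthetic two-cluster data are assumptions.

```python
import numpy as np

def _memberships(X, centers, m):
    """Membership of each point in each cluster from inverse distances."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)  # avoid division by zero when a point sits on a center
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fuzzy_c_means(X, c, m=2.0, max_iter=300, tol=1e-8, seed=0):
    """Alternate membership and center updates until the centers stop moving."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initial centers: c distinct data points picked at random.
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(max_iter):
        U = _memberships(X, centers, m)
        W = U ** m
        # New centers are membership-weighted means of the data.
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return centers, _memberships(X, centers, m)

# Two well-separated synthetic clusters around 0 and 5 (illustrative only).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.1, (50, 1)), rng.normal(5.0, 0.1, (50, 1))])
centers, U = fuzzy_c_means(X, c=2)
print(np.sort(centers.ravel()))
```

Each row of `U` sums to one, which reflects the point made above that a data point belongs to every cluster to some degree rather than to exactly one.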

Performance indicators
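The accuracy of the hybrid (i.e. GS-GMDH and ANFIS-FCM) and standalone (i.e. GMDH and GEP) models developed for water level prediction at both study stations was evaluated using four statistical performance indicators, i.e. Root Mean Square Error (RMSE) (Malik et al., 2019b; Pham et al., 2021; Sammen et al., 2020), Nash-Sutcliffe efficiency (NSE) (Nash & Sutcliffe, 1970), Pearson Correlation Coefficient (PCC) (Adnan et al., 2019; Malik & Kumar, 2020), and Willmott Index (WI) (Willmott, 1981), and through graphical inspection (time-variation plot, scatter plot, box-whisker plot, and Taylor diagram). The RMSE, NSE, PCC, and WI are stated as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(WL_{obs,i} - WL_{pre,i}\right)^{2}}$$

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{N}\left(WL_{obs,i} - WL_{pre,i}\right)^{2}}{\sum_{i=1}^{N}\left(WL_{obs,i} - \overline{WL}_{obs}\right)^{2}}$$

$$\mathrm{PCC} = \frac{\sum_{i=1}^{N}\left(WL_{obs,i} - \overline{WL}_{obs}\right)\left(WL_{pre,i} - \overline{WL}_{pre}\right)}{\sqrt{\sum_{i=1}^{N}\left(WL_{obs,i} - \overline{WL}_{obs}\right)^{2}\sum_{i=1}^{N}\left(WL_{pre,i} - \overline{WL}_{pre}\right)^{2}}}$$

$$\mathrm{WI} = 1 - \frac{\sum_{i=1}^{N}\left(WL_{obs,i} - WL_{pre,i}\right)^{2}}{\sum_{i=1}^{N}\left(\left|WL_{pre,i} - \overline{WL}_{obs}\right| + \left|WL_{obs,i} - \overline{WL}_{obs}\right|\right)^{2}}$$

where $N$ is the number of data points, $WL_{obs,i}$ and $WL_{pre,i}$ are the observed and predicted water level (WL) values for the ith observation, and $\overline{WL}_{obs}$ and $\overline{WL}_{pre}$ are the means of the observed and predicted WL values, respectively. In general, higher values of NSE, PCC, and WI and a lower value of RMSE designate a relatively better model for WL prediction at the study stations. These four statistical indicators are commonly used in assessing model performance and have proven their value in previous studies. They are used together in this study because each of them has both advantages and disadvantages. The use of all four indicators ensures that an all-around assessment of model performance can be made.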
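The four indicators can be computed directly from paired observed and predicted series. The sketch below assumes the standard definitions of RMSE, NSE, PCC, and WI (Willmott's index of agreement); the function names and the four-point example series are illustrative and are not data from the study.

```python
import numpy as np

def rmse(obs, pre):
    """Root mean square error."""
    obs, pre = np.asarray(obs, float), np.asarray(pre, float)
    return float(np.sqrt(np.mean((obs - pre) ** 2)))

def nse(obs, pre):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    obs, pre = np.asarray(obs, float), np.asarray(pre, float)
    return float(1.0 - np.sum((obs - pre) ** 2) / np.sum((obs - obs.mean()) ** 2))

def pcc(obs, pre):
    """Pearson correlation coefficient."""
    obs, pre = np.asarray(obs, float), np.asarray(pre, float)
    do, dp = obs - obs.mean(), pre - pre.mean()
    return float(np.sum(do * dp) / np.sqrt(np.sum(do ** 2) * np.sum(dp ** 2)))

def willmott_index(obs, pre):
    """Willmott index of agreement."""
    obs, pre = np.asarray(obs, float), np.asarray(pre, float)
    denom = np.sum((np.abs(pre - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    return float(1.0 - np.sum((obs - pre) ** 2) / denom)

# Small made-up example series.
obs = np.array([1.0, 2.0, 3.0, 4.0])
pre = np.array([1.1, 1.9, 3.2, 3.8])
print(rmse(obs, pre), nse(obs, pre), pcc(obs, pre), willmott_index(obs, pre))
```

A perfect prediction gives RMSE = 0 and NSE = PCC = WI = 1, which is why lower RMSE and higher values of the other three indicate a better model.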

Performance assessment using statistical metrics
Four different machine learning techniques, namely, GEP, GMDH, GS-GMDH, and ANFIS-FCM, were employed to predict the daily WL at two stations in the Cameron Highlands of Malaysia. Table 5 summarizes the results of the GEP, GMDH, GS-GMDH, and ANFIS-FCM models at Bertam station during the validation phase. It was noted from Table 5 that the values [...] Furthermore, with respect to RMSE, the prediction accuracy of the GS-GMDH model at Telom station improved by 0.81%, 1.37%, and 1.00% in scenario-1; 0.44%, 0.23%, and 0.59% in scenario-2; and 0.17%, 0.97%, and 0.75% in scenario-3 over the GMDH, ANFIS-FCM, and GEP models, respectively. Likewise, with respect to RMSE, the prediction accuracy of the GS-GMDH model at Bertam station improved by 1.43%, 0.47%, and 1.17% in scenario-1; 1.44%, 0.30%, and 1.26% in scenario-2; and 1.83%, 0.49%, and 0.73% in scenario-3 over the GMDH, ANFIS-FCM, and GEP models, respectively. Therefore, for the Telom and Bertam stations, the obtained results indicate that the best performance was attained under scenario-3, where 70% of the data was used for calibration and the remaining 30% for validating the models.
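The three data division scenarios can be illustrated with a short sketch. It assumes that the split is chronological (earlier records for calibration, later ones for validation) and that the model inputs are lagged water levels WL(t − 1) to WL(t − 6); neither detail is stated explicitly here, so both are assumptions, as are the function names and the placeholder series.

```python
import numpy as np

def make_lagged(series, n_lags):
    """Build inputs [WL(t-1), ..., WL(t-n_lags)] and target WL(t)."""
    series = np.asarray(series, dtype=float)
    cols = [series[n_lags - k : len(series) - k] for k in range(1, n_lags + 1)]
    return np.column_stack(cols), series[n_lags:]

def chronological_split(X, y, train_fraction):
    """Use the first part of the record for training and the rest for testing."""
    n_train = int(round(train_fraction * len(y)))
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])

SCENARIOS = {"scenario-1": 0.5, "scenario-2": 0.6, "scenario-3": 0.7}

# Placeholder daily series; real water level records would be used instead.
wl = np.arange(106, dtype=float)
X, y = make_lagged(wl, n_lags=6)  # 100 usable samples

sizes = {}
for name, frac in SCENARIOS.items():
    (X_tr, y_tr), (X_te, y_te) = chronological_split(X, y, frac)
    sizes[name] = (len(y_tr), len(y_te))
print(sizes)
```

Keeping the split chronological avoids using future observations to predict past ones, which matters for lag-based water level models.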

Performance assessment using graphical interpretation
Besides the statistical assessment of the results, graphical methods have been widely used for model assessment. Accordingly, three different graphical methods, namely temporal and scatter plots, the Box-Whisker plot, and the Taylor diagram, were adopted in the study to assess the model performance graphically. Figures 5 and 6 illustrate the temporal and scatter plots of the GS-GMDH, GMDH, ANFIS-FCM, and GEP models under scenario-1, scenario-2, and scenario-3 during the validation period at Telom and Bertam stations, respectively. As can be seen from these two figures, the GS-GMDH model had a higher value of the coefficient of determination: $R^2$ = 0.7619, 0.7724, and 0.7679 for scenario-1, scenario-2, and scenario-3, respectively, at Telom station. Similarly, high values of $R^2$ = 0.6562, 0.6494, and 0.6916 for scenario-1, scenario-2, and scenario-3, respectively, were obtained when the GS-GMDH model was applied at Bertam station. Furthermore, the performance of the GS-GMDH, GMDH, ANFIS-FCM, and GEP models was evaluated using the Box-Whisker plot. This diagram makes it easy to see whether there is any skew in the distribution of the data or whether there are any outliers. Figures 7 and 8 display the Box-Whisker plots for Telom and Bertam, respectively. These figures show the distribution of the predicted values against the observed values during the validation period. It was seen from the figures that the distributional variation between predicted and observed water level values was relatively minor. Therefore, the assessment based on performance measures (RMSE, NSE, PCC, and WI) and graphical inspection (coefficient of determination of the regression line in scatter plots) showed better water level prediction accuracy for the hybrid GS-GMDH model than for the GMDH, ANFIS-FCM, and GEP models.
Likewise, the Taylor diagram (Taylor, 2001), which combines the standard deviation, RMSE, and correlation coefficient, was employed to display the variation of the water level predicted by all four models in the three scenarios against the observed one in a single plot. Figures 9 and 10 show the Taylor diagrams for the relative performance at the Telom and Bertam sites, respectively. These diagrams clearly show the better performance of the GS-GMDH model for both stations. It is clear from Figures 9 and 10 that the results obtained by the GS-GMDH models are closer to the observed water level values and that the model has superior performance, as discussed in the previous section.
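The three statistics summarized in a Taylor diagram are tied together by the relation E'² = σ_obs² + σ_pre² − 2·σ_obs·σ_pre·R (Taylor, 2001), where E' is the centered (bias-removed) RMS difference, which is the form of RMSE that the diagram uses. The sketch below computes these quantities and checks the relation numerically; the function name and the synthetic series are illustrative assumptions.

```python
import numpy as np

def taylor_stats(obs, pre):
    """Return (sd_obs, sd_pre, correlation, centered RMS difference)."""
    obs, pre = np.asarray(obs, float), np.asarray(pre, float)
    sd_obs, sd_pre = obs.std(), pre.std()  # population standard deviations
    r = np.corrcoef(obs, pre)[0, 1]
    # Centered RMS difference: means are removed before differencing.
    crmsd = np.sqrt(np.mean(((pre - pre.mean()) - (obs - obs.mean())) ** 2))
    return sd_obs, sd_pre, r, crmsd

# Synthetic "observed" series and a noisy "prediction" (illustrative only).
rng = np.random.default_rng(1)
obs = rng.normal(2.0, 0.5, 365)
pre = obs + rng.normal(0.0, 0.2, 365)

sd_obs, sd_pre, r, crmsd = taylor_stats(obs, pre)
identity_rhs = sd_obs**2 + sd_pre**2 - 2.0 * sd_obs * sd_pre * r
print(crmsd**2, identity_rhs)
```

Because of this relation, a model whose point lies closer to the observed reference point on the diagram has both a higher correlation and a smaller centered RMS difference.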
The final equations of the GEP for both stations are provided as follows:

[...] − WL(t − 1))/(WL(t − 3) + 9.44)))) + ((((((WL(t − 2) + WL(t − 6)) * −5.32)/WL(t − 4)) * WL(t − 2))/WL(t − 5)) − 5.32)

The results of the current research were compared with existing studies on water level prediction that employed machine learning techniques. Altunkaynak (2019) predicted monthly WL in Lake Van, Turkey, by employing the multilayer perceptron (MLP), wavelet-MLP (W-MLP), and MLP-ASA (additive season algorithm). The prediction performance was evaluated using RMSE and NSE criteria. It was found that the MLP-ASA model (RMSE = 3.550 cm, NSE = 0.992) outperformed the other models. Alizamir et al. (2020) employed a deep echo state network (DESN) to predict the monthly WL of Lake Van (Turkey), and its outcomes were compared against the ANN, extreme learning machine (ELM), and regression tree (RT) based on RMSE, $R^2$, and NSE performance indicators. The investigation shows better performance of the DESN model, with RMSE = 0.025 m, NSE = 0.998, and $R^2$ = 0.998, than the ANN, ELM, and RT models. Nhu et al. (2020) applied four decision tree-based algorithms, i.e. M5 pruned (M5P), random forest (RF), RT, and reduced error pruning tree (REPT), for predicting the daily WL in Zrebar Lake, Iran, during 2011-2017. These models were optimized with 70% of the data for training and 30% for testing. Their performance was evaluated using RMSE, MAE (mean absolute error), $R^2$, PBIAS (percent bias), and RSR [...] models in predicting the daily/monthly lake water levels.

Conclusions
The present study has presented two new hybrid machine learning models, namely, the generalized structure GMDH algorithm (GS-GMDH) and the adaptive neuro-fuzzy inference system with fuzzy C-means (ANFIS-FCM), for daily water level prediction at Telom and Bertam stations on the Perak River in Malaysia. The performance of the hybrid models was compared with that of the standalone models, i.e. GEP and GMDH. To meet the objectives, the daily water level data of two stations in the Cameron Highlands in Malaysia were used. In addition, three different ratios were used to divide the data into calibration and validation sets, namely 50%-50%, 60%-40%, and 70%-30%. The results of the analysis reveal that the performance of the GMDH model could be enhanced with the new generalized structure (GS-GMDH) model. According to their performance, the models were ranked as GS-GMDH > ANFIS-FCM > GEP > GMDH for both study locations. In addition, the results of the hybrid GS-GMDH model can be utilized to formulate a smart and reliable intelligent system for managing water-related operations at the study sites.