Novel genetic-based negative correlation learning for estimating soil temperature

A genetic-based neural network ensemble (GNNE) is applied for estimation of daily soil temperatures (DST) at distinct depths. A sequential genetic-based negative correlation learning algorithm (SGNCL) is adopted to train the GNNE parameters. A CLMS algorithm is used to achieve the optimum weights of the components. Recorded data for two different stations located in Iran are used for the development of the GNNE models. Furthermore, the GNNE predictions are compared with existing machine-learning models. The results demonstrate that GNNE outperforms other methods for the prediction of DSTs.


Introduction
Accurate estimation of soil temperature plays a vital role in many scientific fields such as agriculture, hydrology, geotechnics, solar and geothermal energy. Soil temperature depends on several factors such as meteorological conditions, physical soil parameters, topographical parameters, and further hydrological parameters of time and depth.
The variation of soil temperature at different depths provides useful information about the land-surface ecosystem processes, environmental and climatic conditions, climate change, and production of crops. Furthermore, soil temperature controls the interactive processes between ground and atmosphere. Although many meteorological variables including relative humidity, air temperature and atmospheric pressure are measured in meteorological sites, measurements of soil temperature data or its spatial variability are not usually available. Consequently, developing theoretical methods to estimate soil temperature from the existing meteorological data is vital.
Soil temperature modeling may be conducted using different approaches: (1) physical methods based on the heat transfer mechanism in soil, (2) empirical methods that are commonly regression-based equations between soil temperature and other available measured meteorological variables, and (3) hybrid methods combining physical and empirical methods. However, the complex interrelationship between soil temperature and other climatic parameters makes accurate prediction of soil temperature a difficult task. Artificial Intelligence (AI) techniques are considered well-known tools for dealing with such complex problems (Rumelhart & McClelland, 1986). In the last two decades, soft-computing methods have been successfully used in soil time series predictions, such as modeling and prediction of soil salinity (Shahabi, Jafarzadeh, Neyshabouri, Ghorbani, & Valizadeh Kamran, 2017) and estimation of field capacity and permanent wilting point. Mihalakakou (2002) used two models to estimate soil temperature in Dublin and Athens, reporting the Artificial Neural Network (ANN) as an efficient tool, although the analytical model was shown to deliver more precision. Regression and ANN methods were employed by Bilgili (2010) to calculate monthly soil temperature at five different depths for Adana, Turkey, where the efficiency of the ANN models was reported to be higher than that of the regression models. In Ozturk, Salman, and Koc (2011), feed-forward ANN models were developed for estimating soil temperature at different depths at several stations in Turkey; their outcomes provide favorable precision for soil temperature estimation. Tabari, Sabziparvar, and Ahmadi (2011) estimated Daily Soil Temperature (DST) at six depths in Iran using ANN and multiple linear regression (MLR), and demonstrated that ANN is superior to MLR.
In another attempt, the Feedforward Neural Network (FNN) and nonlinear autoregressive neural network with exogenous input (NARX) methods were implemented to forecast weekly soil temperature in Sri Lanka (Napagoda & Tilakaratne, 2013); the NARX model was found to be more precise. Hosseinzadeh Talaee (2014) applied the coactive neuro-fuzzy inference system to estimate DST at stations in two different regions of Iran, and demonstrated that this method provides accurate estimation of soil temperature. Recently, Gill and Singh (2015) proposed a precise ANN-genetic algorithm to forecast DST. Chau (2007) developed a high-accuracy PSO-based method to train a Multilayer Perceptron (MLP) for real-time water level estimation in the Shing Mun River of Hong Kong. Three evolutionary algorithms, DE, ABC and ACO, were proposed by Chen, Chau, and Busari (2015) for downstream river flow forecasting. A groundwater level prediction model using ANN for the period from 1912 to 2013 was proposed by Gholami, Chau, Fadaee, Torkaman, and Ghaffari (2015). A binary-coded swarm optimization was applied to different gauging stations in the northern United States to determine the actual baseflow component expected for most of the considered gauges (Taormina, Chau, & Sivakumar, 2015). Wang, Xu, Chau, and Lei (2014) proposed a fuzzy-based approach to evaluate the quality of river water. A genetic algorithm (GA)-based ANN was employed for flood forecasting in the Yangtze River, China (Wu & Chau, 2006). Uyumaz, Danandeh Mehr, Kahya, and Erdem (2014) investigated the linear genetic programming (LGP) technique to model rectangular side weirs in a circular channel. Furthermore, Bonakdari, Baghalian, Nazari, and Fazli (2011) conducted a numerical analysis for prediction of the flow field in a 90° bend.
A survey of the articles reveals the appealing application of AI methods in soil temperature prediction. An efficient approach to solving such complex real-world problems is to create a group of predictors by combining AI prediction models in order to enhance the overall performance of the prediction system (Masoudnia, Ebrahimpour, & Arani, 2012). Joint learning methods mainly aim at simplifying a prediction task through converging the prediction outcomes for the base data (García-Pedrajas, Maudes-Raedo, García-Osorio, & Rodríguez-Díez, 2012). In this context, the neural network ensemble (NNE) is one of the dominant ensemble methods (Hansen & Salamon, 1990). An NNE involves a certain number of neural networks (Tian, Li, Chen, & Kou, 2012; Zhai, Xu, & Wang, 2012). Usually, these methods have two separate phases: creating the NN components and combining those components. Experimental and theoretical research has revealed that an NNE model is most useful when the estimations of its components are negatively correlated; when there is no relation between the components, the NNE performs less effectively, and the ensemble gains no efficiency at all when the components are positively correlated (Brown, Wyatt, Harris, & Yao, 2005). Common methods of creating accurate, negatively correlated components appropriate for the NNE model are penalty methods and manipulation of the training dataset. Bootstrap aggregating and boosting are common methods of building ensembles; these methods manipulate the training examples to train individual NN components independently or sequentially (Breiman, 1996; Freund & Schapire, 1996).
Negative correlation learning (NCL) (Liu & Yao, 1999) is a common method for developing diverse components; it adds a correlation penalty term to each component's cost function to reduce both the mean square error (MSE) of every component and the error correlation of the ensemble (Masoudnia et al., 2012). The regularization term in NCL helps balance the bias-variance-covariance trade-off and enhances the ability to generalize. Nevertheless, the typical NCL has certain limitations. A study showed that NCL only drives the ensemble components away from the mean, and not necessarily away from one another (McKay & Abbass, 2001); consequently, the low diversity of NCL is not unexpected. Moreover, the back-propagation (BP) algorithm is typically employed to train the neural network components in the NCL method, and BP often becomes stuck in a local minimum and is unable to find the global minimum (Asadi, Hadavandi, Mehmanpazir, & Nakhostin, 2012).
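The NCL penalty can be made concrete with a short sketch. The following is a minimal, illustrative implementation of the standard Liu and Yao (1999) cost function for a single training instance, not the paper's code; all names are our own:

```python
def ncl_errors(outputs, target, lam):
    """Per-component NCL error for one training instance.

    outputs: list of component predictions f_t(x)
    target:  measured value y(x)
    lam:     penalty coefficient (lambda); lam = 0 reduces to plain MSE
    """
    T = len(outputs)
    f_ens = sum(outputs) / T  # simple-average ensemble output
    errors = []
    for t, f_t in enumerate(outputs):
        # correlation penalty: p_t = (f_t - f_ens) * sum_{s != t} (f_s - f_ens)
        p_t = (f_t - f_ens) * sum(f_s - f_ens
                                  for s, f_s in enumerate(outputs) if s != t)
        errors.append(0.5 * (f_t - target) ** 2 + lam * p_t)
    return errors
```

With lam = 0 each component is trained on its own squared error; a positive lam rewards components whose errors are anti-correlated with the rest of the ensemble.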
The classic NCL static combiner approaches are not able to model the local fitness of the ensemble components. Lee, Kim, and Pedrycz (2012) introduced a novel selective ensemble based on negative correlation of neural networks, in which a series of component networks is utilized to minimize the generalization error and increase the negative correlation. Another NCL model was proposed by Alhamdoosh and Wang (2014) to create ensembles capable of sound generalization by regulating the disagreement between outputs of the base learners; they used random vector functional link (RVFL) networks to find a better solution.
In this study, a new ANN ensemble method, termed genetic-based NNE (GNNE), is proposed by combining an NCL model (Hadavandi, Shahrabi, & Hayashi, 2015) with an evolutionary algorithm to calculate daily soil temperature at 5, 10, 20, 30, 50 and 100 cm depths. A sequential genetic-based NCL (SGNCL) algorithm (Kazemi, Hadavandi, Shamshirband, & Asadi, 2016) is introduced to train the components of the GNNE. SGNCL uses a genetic algorithm (GA) as a global search method instead of the BP algorithm to tune the network weights. A CLMS algorithm is adopted to find the best component weights in the combination module of the GNNE. To build the soil temperature prediction models, daily weather datasets of two Iranian stations with different climate characteristics are utilized. The performance of the GNNE model is compared with our previous work, which used an extreme learning machine (ELM) and a hybrid self-adaptive evolutionary ELM (SaE-ELM) (Nahvi, Habibi, Mohammadi, Shamshirband, & Al Razgan, 2016).

Methodology
In this section, the implemented technique and related formulation process for estimating soil temperatures are described.

The suggested genetic-based neural network ensemble
To analyze the generalization ability of the ensemble, Theorem 1 from (Ueda & Nakano, 1996) is recalled and expressed below: Theorem 2.1. The subsequent bias-variance-covariance decomposition is employed to express an ensemble's generalization error averaged over all potential training datasets D N of size N, where the same dataset trains all components. Here, D v represents the validation dataset. Eqs.
(2)-(4) are used to determine Cov(X_n), Bias(X_n) and Var(X_n). Here, N represents the number of training cases, the ensemble size is denoted by T, f_t(X_n) is the output of component t, and f_ens(X_n) = (1/T) Σ_{t=1}^{T} f_t(X_n) is the ensemble output for X_n. (Geman, Bienenstock, & Doursat, 1992) provided the proof for Theorem 1. Based on Eq. (1), the disagreement between outputs of the ensemble components is controlled by explicitly handling the covariance term Cov(X_n), thus developing an ensemble model that generalizes better. Eqs. (5) and (6) show NCL's cost functions for component t (1 ≤ t ≤ T): λ regulates the trade-off between the objective and the penalty function, and P_t is the error correlation penalty term. The ensemble components can be trained individually using the BP algorithm when λ = 0. We therefore propose the SGNCL algorithm to train the GNNE model components. Its key characteristics are as follows: 1) In the error function of component t, the GNNE uses a moving average for the correlation penalty term that measures the error correlation between component t and the previously trained components (1, 2, . . . , t − 1), and thus successively positions every component relative to the preceding ones, so that GNNE components trained by SGNCL move away from each other, maximizing their diversity. Eqs. (7) and (8) express the cost functions suggested in SGNCL for training component t: in these equations, y_k(X_n) and f_tk(X_n) are the actual and predicted values of target k of instance n, respectively (1 ≤ k ≤ P), and N is the number of training instances.
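The numbered display equations did not survive extraction. As a hedged reconstruction from the standard NCL literature (Ueda & Nakano, 1996; Liu & Yao, 1999) — the paper's exact numbering and indices may differ — the key expressions take the following form:

```latex
% Eq. (1): bias-variance-covariance decomposition of the ensemble error
E\big[(f_{ens}(X_n) - y(X_n))^2\big]
  = \overline{\mathrm{Bias}}^2
  + \frac{1}{T}\,\overline{\mathrm{Var}}
  + \Big(1 - \frac{1}{T}\Big)\,\overline{\mathrm{Cov}}

% Eqs. (5)-(6): NCL cost function and correlation penalty for component t
e_t = \frac{1}{2}\sum_{n=1}^{N} \big(f_t(X_n) - y(X_n)\big)^2 + \lambda\, P_t,
\qquad
P_t = \sum_{n=1}^{N} \big(f_t(X_n) - f_{ens}(X_n)\big)
      \sum_{s \neq t} \big(f_s(X_n) - f_{ens}(X_n)\big)
```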
2) The GNNE employs a GA to tune the weights of the NN experts (Asadi et al., 2012). Thus, the suggested SGNCL algorithm uses a GA to obtain the best set of network weights through the following phases: Phase 1 - Encoding. Every gene signifies the weight between two neurons in adjacent layers. As shown in Figure 1, a chromosome is a series of genes. For a typical feed-forward neural network with three neurons in the input layer, two neurons in the hidden layer, and one neuron in the output layer, the 1st gene is the weight between neurons 1 and 4, namely W_14; the 2nd gene is the weight between neurons 1 and 5, namely W_15; and so on. Real-number encoding is used for the connection weights, so each gene in the chromosome encodes one weight of the feed-forward neural network.
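For the three-two-one topology described above, the encoding can be sketched as follows. This is an illustrative sketch only; the gene ordering follows the description (all genes for input neuron 1, then input neuron 2, and so on), and all names are our own:

```python
def encode_weights(w_ih, w_ho):
    """Flatten the weights of a 3-2-1 feed-forward network into a chromosome.

    w_ih[i] : weights from input neuron i+1 to the two hidden neurons (4, 5)
    w_ho    : weights from the two hidden neurons to the output neuron
    """
    chromosome = []
    for row in w_ih:            # genes W_14, W_15, W_24, W_25, W_34, W_35
        chromosome.extend(row)
    chromosome.extend(w_ho)     # genes from hidden to output neuron
    return chromosome
```

Decoding simply reverses this flattening, so the GA can operate on a plain list of real numbers.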
Phase 2 - Creating the primary population. The primary population (N_pop) is created randomly; every initial weight is drawn from a uniform distribution between −1 and 1.
Phase 3 - Obtaining the fitness values. The fitness values are calculated based on E_t.
Phase 4 - Selection mechanism. The truncation selection scheme is used first: individuals are sorted by fitness, and the best individuals are kept as the parent pool; the fraction of the population designated as parents is called the truncation threshold. Next, a binary tournament selection scheme is adopted to choose parents for creating new offspring through the genetic operators. In binary tournament selection, two members of the population are chosen randomly and their fitness values, which depend on E_t, are compared; the fitter one is selected as a parent. The other parent is chosen through the same procedure.
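The two selection schemes above can be sketched in a few lines. This is an illustrative sketch, not the paper's implementation; lower fitness (error E_t) is treated as better, and all names are our own:

```python
import random

def truncation_select(population, fitness, threshold=0.5):
    """Keep the top `threshold` fraction of individuals as the parent pool
    (lower fitness value = better)."""
    order = sorted(range(len(population)), key=lambda i: fitness[i])
    k = max(1, int(threshold * len(population)))
    return [population[i] for i in order[:k]]

def binary_tournament(population, fitness, rng=random):
    """Pick one parent: draw two distinct individuals at random and keep
    the one with the better (lower) fitness."""
    a, b = rng.sample(range(len(population)), 2)
    return population[a] if fitness[a] < fitness[b] else population[b]
```

Truncation first narrows the pool to strong candidates, and the tournament then picks parents from it with some randomness, preserving diversity.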
Phase 5 - Genetic operators. Binary tournament selection is applied to the top-half best-performing individuals to choose parents for crossover. One-point crossover is performed on the selected parents to create offspring, and the newly produced offspring replace individuals with poor performance. Likewise, uniform mutation is employed, which substitutes a randomly chosen gene with a value drawn from a uniform random distribution between the lower and upper domain limits for that gene. The mutation probability must therefore be small enough to avoid degrading the performance of the GA.
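The two operators can be sketched directly; this is an illustrative, hedged sketch with our own function names, not the paper's code:

```python
import random

def one_point_crossover(parent1, parent2, rng=random):
    """Cut both chromosomes at one random point and swap the tails."""
    point = rng.randrange(1, len(parent1))
    return (parent1[:point] + parent2[point:],
            parent2[:point] + parent1[point:])

def uniform_mutation(chromosome, p_mut, low=-1.0, high=1.0, rng=random):
    """With small probability p_mut per gene, replace the gene with a
    uniform draw from [low, high] (the gene's domain limits)."""
    return [rng.uniform(low, high) if rng.random() < p_mut else g
            for g in chromosome]
```

Because the weights are real-valued, crossover recombines existing gene values while mutation is the only source of genuinely new values, which is why p_mut must stay small but nonzero.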
Phase 6 - Replacement. The newly generated offspring replace the current population to create the succeeding generation. This replacement continues, producing new generations, until the stopping criterion is reached.
Phase 7 - Stopping criterion. If the number of generations reaches the maximum number of generations, stop; otherwise, return to Phase 3.
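The seven phases can be condensed into one compact loop. The following is an illustrative, stdlib-only sketch under simplifying assumptions (fixed population size, truncation to the top half, simplified parent sampling); `train_component_ga`, `fitness_fn` and all parameters are our own names, not the paper's:

```python
import random

def train_component_ga(fitness_fn, n_genes, n_pop=30, max_gen=50,
                       p_mut=0.05, rng=None):
    """Minimal GA loop following Phases 1-7. fitness_fn maps a chromosome
    (list of weights) to the component error E_t; lower is better."""
    rng = rng or random.Random(0)
    # Phases 1-2: random initial population, weights uniform in [-1, 1]
    pop = [[rng.uniform(-1, 1) for _ in range(n_genes)]
           for _ in range(n_pop)]
    for _ in range(max_gen):                         # Phase 7: budget
        fit = [fitness_fn(c) for c in pop]           # Phase 3: evaluate E_t
        order = sorted(range(n_pop), key=lambda i: fit[i])
        parents = [pop[i] for i in order[:n_pop // 2]]  # Phase 4: truncation
        children = []
        while len(children) < n_pop:                 # Phases 5-6: variation
            p1, p2 = rng.sample(parents, 2)          # parent sampling
            cut = rng.randrange(1, n_genes)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [rng.uniform(-1, 1) if rng.random() < p_mut else g
                     for g in child]                 # uniform mutation
            children.append(child)
        pop = children                               # Phase 6: replacement
    fit = [fitness_fn(c) for c in pop]
    return min(zip(fit, pop))[1]                     # best chromosome found
```

In the GNNE this loop would be run once per component, with E_t taken from the SGNCL cost function so that each new component is pushed away from its predecessors.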

Combination method
After training the MLP components, a combination method is needed to combine their outputs. This phase aims to fully exploit the positive effects of diversity and avoid its negative outcomes (Yang, Zeng, Zhong, & Wu, 2013). For an ensemble of MLP components, weighted and simple averaging are the most popular combination methods for regression problems. Nevertheless, it is worth exploring methods of finding suitable weights for the components of an NNE (Tian et al., 2012; Yang et al., 2013). Eq. (9) computes the final prediction of the GNNE model with T components for target k (1 ≤ k ≤ P): here, f_ens,k(X_n) is the ensemble prediction and f_tk(X_n) is the prediction of component t for target k of instance n, and g_tk represents the weight of component t for target k. In an NNE model, there are several ways to determine the component weight parameters, falling into two main categories. The first includes methods that use the CLMS within a mathematical framework to find optimal weights; the combination weights are obtained through an optimization technique, such as Lagrange multipliers (Ueda, 2000). In the second category, an evolutionary algorithm is used to discover optimal combination weights based on the outcomes of every single component on the training dataset (Nabavi-Kerizi, Abadi, & Kabir, 2010). The CLMS was used in this work to obtain the optimal component weights (Hadavandi, Shahrabi, & Shamshirband, 2015). The g_tk weights are calculated so as to minimize the expected quadratic deviation of the error over the given training set; the calculation of the weight parameters for target feature k (1 ≤ k ≤ P) is expressed in Eq. (10). Figure 2 shows the suggested GNNE model with its components.
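Eq. (10) itself is not reproduced in this text, so the following is a hedged, illustrative sketch of one plausible reading of the CLMS step: the classical Lagrange-multiplier solution for optimal linear combination weights (Ueda, 2000), minimizing the ensemble squared error subject to the weights summing to one. All names are our own:

```python
def combination_weights(preds, targets):
    """Combination weights for T components via Lagrange multipliers:
    solve C w = 1 and normalise, where C is the error correlation matrix.

    preds:   preds[t][n] -- prediction of component t on instance n
    targets: targets[n]  -- measured values
    """
    T, N = len(preds), len(targets)
    # error correlation matrix C[t][s] = (1/N) sum_n e_t(n) * e_s(n)
    err = [[preds[t][n] - targets[n] for n in range(N)] for t in range(T)]
    C = [[sum(err[t][n] * err[s][n] for n in range(N)) / N
          for s in range(T)] for t in range(T)]
    # solve C w = 1 (vector of ones) by Gauss-Jordan elimination
    A = [row[:] + [1.0] for row in C]
    for i in range(T):
        piv = max(range(i, T), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        for r in range(T):
            if r != i:
                factor = A[r][i] / A[i][i]
                A[r] = [a - factor * b for a, b in zip(A[r], A[i])]
    w = [A[i][T] / A[i][i] for i in range(T)]
    total = sum(w)
    return [wi / total for wi in w]   # weights sum to one
```

Intuitively, a component with smaller (and less correlated) errors on the training set receives a larger weight in the final ensemble prediction.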

Dataset
In this study, datasets from the cities of Bandar Abbas and Kerman in Iran were utilized. Bandar Abbas is the capital of Hormozgan province, situated on the northern coast of the Persian Gulf at 27°13′N and 56°22′E, 10 m above sea level. In summer, the highest air temperature reaches 49°C, whereas in winter the minimum air temperature can drop to 5°C. The long-term yearly averaged air temperature and relative humidity are 27°C and 65%, respectively, and the annual rainfall is approximately 170 mm. In summer, Bandar Abbas has one of the highest average dew points of any location in the world. It has a hot desert climate, categorized as BWh in the Köppen climate classification (Kottek, Grieser, Beck, Rudolf, & Rubel, 2006). The city of Kerman is situated in the southeast of the country at 30°29′N and 57°06′E, at an elevation of 1754 m above sea level. Its climate is generally moderate and dry, with hot summers owing to its proximity to the Lut Desert. The long-term yearly averaged air temperature and relative humidity in Kerman are 15.8°C and 32%, respectively, and the annual average rainfall is around 140 mm. Kerman has a cold desert climate, classified as BWk in the Köppen climate classification.
Figure 3 shows the geographic locations of the tested cities. For Bandar Abbas, 10 years of recorded data from Jan 1996 to Dec 2005, and for Kerman, 7 years of recorded data from Jan 1998 to 2004 were used. The daily data collections consist of daily soil temperature (ST) at several depths as well as maximum, minimum and average air temperatures. We divided the data collections into two subsets, for training (50% of the total data) and for testing (50% of the total data). The datasets used in this study are part of the datasets utilized in a previous study on estimating daily soil temperatures at Bandar Abbas and Kerman using the SaE-ELM and ELM models (Nahvi et al., 2016), which enables us to validate the performance of the GNNE against other robust AI methods.
To determine how significant air temperatures are for predicting ST, the correlation coefficients between air temperatures and ST at all depths were calculated. For both stations, air temperatures showed favorable correlations with ST at all depths. The correlation coefficient for both stations decreases slightly as the soil depth increases from 5 to 100 cm. For Bandar Abbas, the correlation coefficients of T_min, T_max and T_avg with ST at different depths were between 0.8602 and 0.9574, 0.8352 and 0.9307, and 0.8793 and 0.9793, respectively. In addition, for Kerman the correlation coefficients of T_min, T_max and T_avg with ST at different depths were between 0.8280 and 0.9038, 0.8492 and 0.9559, and 0.8366 and 0.9722, respectively. These high correlations of air temperatures with ST at different depths indicate that air temperatures are highly influential and suitable for predicting soil temperature. This can be attributed to the fact that air and soil temperatures are both determined by the energy balance at the ground surface.
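The correlation screening above uses the standard Pearson coefficient, which can be computed directly; a minimal stdlib sketch with our own function name:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series,
    e.g. daily air temperature vs. soil temperature at one depth."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

A value near 1 (as reported here for all depths) indicates a strong linear relationship and supports using air temperatures as model inputs.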

Results and discussion
A series of analyses was carried out by developing the GNNE models and comparing their performance with other models. A multilayer perceptron is adopted as the NN component of the suggested model. The number of nodes in the hidden layer was set to 8. Each model has three inputs in the input layer, with the outputs shown in Figure 4. The hyperbolic tangent function was employed as the activation function in the hidden and output nodes and in the gating network. Figure 4 depicts the architecture of the GNNE model's MLP components.
The performance of the proposed GNNE model is assessed by mean absolute bias error (MABE), root mean square error (RMSE) and the coefficient of determination (R²). Smaller RMSE and MABE values indicate a more accurate model, whereas a larger R² indicates higher precision. Tables 1 and 2 show the statistical indicators obtained for the GNNE model for Bandar Abbas and Kerman, respectively. The statistical results reveal that, for both case studies, the GNNE model performs favorably in estimating daily soil temperature at all depths; the predictions of ST at all depths agree very well with the measured data. RMSE and MABE increase and R² decreases as the soil depth increases, and this is clearest when the soil depth rises from 50 to 100 cm. The reason is that the influence of the considered meteorological variables on ST decreases at greater depths.
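The three criteria can be computed directly. The following is a minimal sketch with our own function names; R² is written in the 1 − SSE/SST form, which is one common convention (the paper may instead use the squared correlation coefficient):

```python
import math

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mabe(obs, pred):
    """Mean absolute bias error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def r_squared(obs, pred):
    """Coefficient of determination: 1 - SSE / total sum of squares."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst
```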
Moreover, the daily ST estimated by the GNNE model at all depths is plotted against the measured data for the testing stage for Kerman and Bandar Abbas (Figures 5 and 6). The plots indicate a good correlation between the estimated and measured data. However, the greater scattering of the data points at greater soil depths implies a decrease in prediction accuracy with increasing depth.
To gauge the prediction performance of the GNNE, its accuracy is compared with the classical ELM and the combined SaE-ELM proposed in the authors' previous study for the same case studies (Nahvi et al., 2016). For this validation, the average values of MABE and RMSE over all six depths were calculated for the Bandar Abbas and Kerman cities. Figures 7 and 8 show the results of the comparative study for Bandar Abbas and Kerman, respectively. As seen, the GNNE models notably outperform the SaE-ELM and ELM models on the data for both cities.

Conclusions
A forecast model for soil temperature (especially on a daily horizon) is an important decision-support tool in soil engineering, agriculture, meteorology, and geotechnics; thus, determination of soil temperature at various depths is critical in many research fields. The present research introduces a new GNNE method, coupling NCL with evolutionary algorithms, to predict soil temperature at several depths. The proposed method combines different NN prediction models to build a predictor ensemble. The SGNCL algorithm is implemented for training the components of the GNNE, and the CLMS algorithm is integrated for obtaining the components' optimal weights. Daily weather datasets of two stations in regions of Iran with different climate conditions are used for developing the models. The proficiency of the GNNE models is assessed by comparison with the ELM and SaE-ELM models, using RMSE and MABE as the statistical evaluation criteria. The results demonstrate that the GNNE model provides a significant level of precision in predicting daily soil temperatures at all depths for both stations. For the Bandar Abbas station, the MABE and RMSE values over all depths lie between 0.85 and 1.40°C, and between 1.05 and 1.83°C, respectively. For the Kerman station, the recorded MABE and RMSE values range between 1.56 and 2.23°C, and between 2.07 and 2.85°C, respectively. According to the evaluation criteria, the predictions produced by the GNNE models are superior to those made by the SaE-ELM and ELM methods at both stations. These findings demonstrate the power of the GNNE method for predicting daily soil temperature at different depths and should encourage researchers to utilize the GNNE model in this scientific area. Future research in this field can focus on incorporating other fuzzy systems or neural networks into the NCL process.
Developing effective data decomposition algorithms to construct diverse components in the GNNE is another promising direction for future research.