Deep learning versus gradient boosting machine for pan evaporation prediction

ABSTRACT In the present study, two innovative techniques, namely Deep Learning (DL) and Gradient Boosting Machine (GBM), are developed under a maximum-air-temperature 'univariate modeling scheme' for modeling the monthly pan evaporation (Epan) process. Monthly maximum air temperature and pan evaporation are used to build the predictive models, which are evaluated at the Kiashahr meteorological station in the north of Iran and the Ranichauri station in Uttarakhand State, India. Findings indicated that the DL model performed best at Kiashahr station on the testing dataset, with MAE = 0.5691 mm/month, RMSE = 0.7111 mm/month, NSE = 0.7496, and IOA = 0.9413. It can be concluded that in the semi-arid climate of Iran both methods showed good capability in modeling monthly Epan; however, DL predicted monthly Epan better than GBM. The highest accuracy of the DL model was also observed at the Ranichauri station, with MAE = 0.3693 mm/month, RMSE = 0.4357 mm/month, NSE = 0.8344, and IOA = 0.9507 in the testing stage. Overall, the results demonstrate the superior performance of DL-based models at both study stations, and such models can also be applied to other environmental modeling tasks.


Background
Evaporation is a crucial natural phenomenon of the hydrological cycle and reflects the interaction between sea and air. This process has long been a point of interest in energy and heat flux transfer (Katsaros, 2001). It is an intricate process associated with humidity, temperature, insolation, wind, and other meteorological factors (Shirgure & Rajput, 2011; Tao et al., 2018). Natural evaporation takes place when water below the surface transforms to vapor due to solar energy or atmospheric forcing, and it can be described by examining molecular diffusion and turbulence driven by the atmosphere or the sun (Shuttleworth, 1991). Evaporation consumes insolation as it enters the atmosphere, while terrestrial radiation escapes with the water vapor. Evaporation is an influential environmental process and holds an important place in many analytical and technical innovations, especially in microfluidic technology (Musolino & Trout, 2013). The process gained scientific consensus after the publication of the 'Manual of Meteorology' by Sir Napier Shaw in 1926. Furthermore, Bowen (1926) used the temperature gradient and humidity to measure evaporation. However, the measurement of evaporation remains a topic of dispute (Eames et al., 1997). Previous studies in hydrology provide a significant impression of the methods used for solving environmental problems. These models addressed several issues, such as dealing with missing data, real-time flood forecasting, predicting the implications of deforestation, and the impact of increasing CO2 concentration on hydrology. Moreover, it has been found that the fundamentals behind these rigorous models are often untested hypotheses (Morton, 1990). Examining the impact of hydrology on the environment requires sound basic knowledge of the phenomena and processes involved in order to minimize errors.
Since the 1950s it has been understood that surface evaporation is influenced by meteorological conditions, and various scholars have tried to model and estimate evaporation over a long period (Kalma & Calder, 1994; Lang et al., 1983; Morton, 1990). The Bowen ratio instrumentation technique has been found to be one of the effective methods for measuring both evaporation and evapotranspiration (ETo); in this method, the soil flux and radiation are split into sensible and latent heat (Hatfield et al., 2005). Another technique is eddy correlation, which considers absolute humidity and vertical velocity to provide direct evaporation measurement (Goltz et al., 1970). Evaporation is complex in nature, and its nonlinearity makes it difficult to measure. However, artificial intelligence has proved important in analyzing such complex phenomena, which are difficult to examine using physical equations; this robust approach requires less computational and technical effort to provide accurate outputs (Shirgure & Rajput, 2011). Brutin and Starov (2018) identified that evaporation and wetting have been increasingly analyzed by scholars over the last few decades, mainly because of their wide applicability in the real world. Various techniques and methods have been used to measure evaporation, such as multilinear regression (Baier & Robertson, 1965; Bruton et al., 2000), neural-network-based fuzzy methods (Moghaddamnia et al., 2009; Salih et al., 2019), energy budget estimation (Garratt, 1984), remote sensing and meteorological datasets (Miralles et al., 2011; Rivas & Caselles, 2004), flow gauge methods (Allen & Grime, 1995), and radiation-based approaches (Xu & Singh, 2000). Advanced soft computing models have shown noticeable progress in simulating the evaporation and evapotranspiration processes (Jing et al., 2019; Khosravi et al., 2019; Malik, Kumar, et al., 2020; Sanikhani et al., 2019; Yaseen, Al-Juboori, et al., 2020).
Most recently, scholars have devoted more effort to developing robust and advanced methodologies by combining artificial intelligence models with nature-inspired optimization algorithms (Ashrafzadeh et al., 2019). For instance, the firefly algorithm was hybridized with a classical neural network for Epan prediction to acquire more precise evaporation estimates (Ghorbani et al., 2018). Zhakhovsky et al. (2019) utilized molecular dynamics and the Boltzmann kinetic equation to examine the pathways of condensation and evaporation, emphasizing that advanced methods are required to examine evaporation when solving various engineering problems.
Most importantly, research analysis of the reasons for variation (increases and decreases) in Epan, which would support more accurate and reliable estimation for future purposes, is still very limited. The important parameters for the estimation of pan evaporation are air temperature, solar radiation, relative humidity, wind speed, and precipitation. Da Silva (2004) found significant increasing trends in the time series of maximum, minimum, and mean temperature, Epan, ETo, and the aridity index, along with decreasing trends in relative humidity and precipitation, in Brazil. Wang et al. (2007) examined the changes in Epan and ETo in the Yangtze River basin in China from 1961 to 2000, and found that Epan and ETo decreased during the summer months, contributing most to the total annual reduction. Jhajharia et al. (2009) considered 11 sites in northeast India and examined trends in the temporal features of Epan under humid conditions; they observed that Epan trends decreased in the pre-monsoon and monsoon seasons, and their major finding was that wind speed and sunshine duration strongly affected Epan changes at different sites and regions in different seasons. Analyzing the variability and trends of hydro-climatological parameters of the Volta River Basin in West Africa from 1901 to 2002, Oguntunde et al. (2006) demonstrated positive trends in Epan data, stating that crop water requirements and the evaporative demand of the atmosphere had increased over the last 22 years as the area became drier and warmer. Several authors have estimated ETo and Epan in Iran, and Sabziparvar and Tabari (2010) investigated the temporal trends in ETo and Epan time series, examining the temporal changes in the annual Epan of 12 stations in Hamedan province in western Iran during 1982-2003.
Moreover, the impacts of precipitation and air temperature on the temporal trends observed in Epan were analyzed. Tabari and Marofi (2011) showed that positive correlations between Tmax and Epan were observed at almost all stations. The important factors related to the increase in Epan are the temperature variables (minimum, mean, and maximum), although the variations of pan evaporation are not very sensitive to variations in precipitation.

Machine learning models
It has also been identified that the unavailability of high-resolution data hinders evaporation measurement. Duethmann and Blöschl (2018) combined meteorological datasets with discharge data and catchment characteristics to analyze the changes in Austria. Numerical estimation has been found significant in evaporation measurements (Sacomano Filho et al., 2018). Zhao et al. (2018) used a novel mass model to analyze flash evaporation. Assessment of evaporation has played a crucial role in many disciplines (Brutin & Starov, 2018; Shirgure & Rajput, 2011). Kişi et al. (2012) estimated daily pan evaporation from meteorological variables using a generalized neuro-fuzzy model. Over time, Deep Learning (DL) and hybrid Machine Learning (ML) based models have seen massive application in water resources engineering (Xia et al., 2020). Ardabili et al. (2019) presented a systematic review and comparison of DL and ML techniques in hydrological processes, climate change, and earth systems. Mosavi et al. (2021) applied four ensemble models, the boosted generalized additive model (GamBoost), adaptive boosting classification trees (AdaBoost), bagged classification and regression trees (Bagged CART), and Random Forest (RF), to predict the groundwater potential of the Dezekord-Kamfiruz catchment (Iran), and found better efficacy of ensemble models in groundwater prediction over the study basin. Wu et al. (2021) proposed a hybrid ML model to estimate monthly ETo at 26 weather stations in the Poyang lake basin of south China. The proposed K-means-FFA-KELM (K-means-Firefly Algorithm-Kernel Extreme Learning Machine) model achieved higher accuracy, using solar radiation and maximum and minimum temperature as inputs, than the ANFIS (Adaptive Neuro-Fuzzy Inference System), M5P, and RF models. Bellido-Jiménez et al. (2021) evaluated the performance of six remote-sensing-based ML models for estimating daily ETo in the Andalusian region. The ELM (Extreme Learning Machine) and MLP (Multilayer Perceptron) models showed higher accuracy than the RF, SVM (Support Vector Machine), GRNN (Generalized Regression Neural Network), and XGBoost (Extreme Gradient Boosting) models. Over the past two years, the application of DL models to predicting evaporation processes has been explored successfully, although with limited research output. The available literature has confirmed their potential in solving this complex natural climate problem (Abed et al., 2021; Majhi et al., 2020; Sattari et al., 2020). However, exploration of these models is still at an early stage and needs inspection across different climate regions, different data stochasticity, and different data spans. Hence, there is still room for this research domain to be extended further by testing the feasibility of this new generation of machine learning.

Research objectives
The main focus of the paper is on better understanding and accurate estimation of monthly pan evaporation with a single input parameter. The aim is therefore to find an alternative approach for estimating pan evaporation with limited climate data, and with less error, for large-scale water footprint studies. There is a gap in investigating deep learning models with limited climate parameters for the estimation of pan evaporation. An overview of the techniques used here is as follows. H2O is an open-source platform for artificial intelligence. It includes various machine learning algorithms such as logistic regression, Principal Component Analysis (PCA), Naive Bayes, stacked ensemble models, and k-means clustering, and it combines various models to provide effective and accurate predictions (Candel et al., 2016). Shurbaji and Phillips (1995) identified that such models were effective in examining infiltration and evaporation in soils. Generalized Linear Models (GLM) and Distributed Random Forest (DRF) have been applied to the continuous monitoring of suspended sediment concentration using H2O (Ghorbani, Khatibi, et al., 2020). Based on the cited literature and the authors' experience, the present study was designed to provide an accurate quantification of monthly Epan at the Kiashahr and Ranichauri stations with limited climatic data. Therefore, this study was conducted with the following objectives: (i) to estimate the pan evaporation of two stations using a single input, i.e. maximum temperature; (ii) to compare the estimates of the DL and GBM models with observations; and (iii) to test the capability of the proposed models for Epan at two different locations with different data stochasticity, based on statistical measures. The proposed methods will be instructive for further assessment of evaporation at other spatial scales, and the findings of the study will improve understanding of various intricate engineering problems.

Study area and database
Evaporation estimation was carried out at two stations: (i) Kiashahr, a city of Gilan province, Iran, and (ii) Ranichauri in Uttarakhand State, India. The climatic and physiographic characteristics of the two stations are different. Kiashahr experiences a warm temperate climate (Malik, Kumar, et al., 2021); the average precipitation of the region is 1295 mm, while the difference between the driest and wettest months is 195 mm. Ranichauri is located at 30°18′40″ north latitude and 78°24′35″ east longitude. It is situated 74 km from Rishikesh at an average elevation of 1676.4 m. The station experiences a pleasant climate throughout the year; it has a humid subtropical climate, where the average temperature during summer ranges from 20°C to 25°C, while it drops to 5°C during winter (Malik, Rai, et al., 2020).
Monthly maximum temperature (Tmax) data were used for monthly pan evaporation estimation at both stations. Temperature data for Kiashahr station were collected from the IMO (Iran Meteorological Organization) for 2010-2017, while data for Ranichauri station were obtained from the Crop Research Centre (CRC), Uttarakhand, for 2000-2012. For model training, 70% of the dataset was randomly selected, while the remaining 30% was utilized in the validation (testing) phase. In this study the dependent variable is evaporation, and the independent variable is monthly maximum air temperature. Statistical characteristics of the dataset variables are shown in Table 1. The map of the study area and the locations of the stations are shown in Figures 1 and 2, respectively. Wide monthly variations are observed in monthly maximum temperature and Epan over the two stations (Figure 3(a,b)).

Deep learning
The term 'neural network' derives from artificial neurons, or perceptrons, each of which contains an input layer (dendrites), a cell body, and an output node (the axon).
Deep neural networks are an extension of conventional artificial neural networks in which the computation is composed of several layers. Deep Learning is a field of machine learning based on deep neural networks with representation learning (Huynh et al., 2021). DL builds on the Feed-forward Neural Network (FNN) and Multilayer Perceptron (MLP), which optimize functions through activation functions such as tanh. An MLP contains an input layer, hidden layers, and output-layer nodes, together with the directed graph connections and activation functions associated with each node. In the modern era, DL provides a robust framework for supervised machine learning.
Deep learning trains a neural network whose parameters are enhanced through backpropagation with an adaptive learning rate and a stochastic gradient descent optimizer; the output signal f(α) is relayed to the neurons of the next layer. The MLP pre-activation is denoted by α (Minaee et al., 2021). A perceptron consists of a vector of weights w = [w1, w2, ..., wn], one for each input x = [x1, x2, ..., xn], and a distinguished weight b, called the bias, so the weighted sum can be computed by Eq. (1):

α = Σ_{i=1}^{n} w_i x_i + b    (1)

The activation function is then applied to this sum: each input xi is multiplied by its weight wi, and the bias b is added. The bias is an intercept added to the linear equation so that the model best fits the given data. The adjustment of the weights toward an optimal state is an iterative process governed by a learning rate.
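The weighted-sum-plus-bias computation described above can be sketched in a few lines of Python; the weights, inputs, and bias below are arbitrary example values, not taken from the study.

```python
import math

def perceptron_output(x, w, b):
    """Weighted sum of inputs plus bias (alpha), passed through a tanh activation."""
    alpha = sum(wi * xi for wi, xi in zip(w, x)) + b  # alpha = sum(w_i * x_i) + b
    return math.tanh(alpha)                           # activated output signal f(alpha)

# Example with two inputs: alpha = 0.8*0.5 + 0.3*(-1.0) + 0.1 = 0.2
y = perceptron_output(x=[0.5, -1.0], w=[0.8, 0.3], b=0.1)
```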
The tanh (hyperbolic tangent) is a rescaled and shifted logistic activation function. It follows an S-shaped curve and often performs better than the logistic sigmoid. Its output lies between −1 and +1 and is expressed as:

tanh(x) = (e^x − e^−x) / (e^x + e^−x)

Its gradient is d/dx tanh(x) = 1 − tanh²(x), and its range is (−1, 1).

The distribution of the training dataset is important for understanding how to vectorize the data for modeling. Gaussian processes have long been used as a traditional nonparametric modeling tool, and each distribution has a primary association with a particular loss function. The Gaussian Distribution (GD) is parameterized by two parameters, the mean and the variance, and its mean, median, and mode are equal (Chi et al., 2021).

The one-hidden-layer deep learning network used here consists of 100 neurons; the mapping from the input layer to the first hidden layer f(h_i) uses the tanh activation, the last output layer gives f(h_l), and the network output is f(α) = tanh(h_l). The mean squared error loss function is appropriate for real-valued output, so the loss function in regression is Gaussian with mean square error. A linear activation function is applied at the output node for regression problems or discrete outputs. The GD is a function for continuous targets, defined by its continuous probability density; the MLP network learns the mean and variance of the GD, so the output distribution is controlled by the model and can assign high density to the correct training outcomes. The deep learning weight parameters are initialized with random variables generated from a Gaussian distribution. The loss function L(W, B|j), over weights W and biases B, is minimized by stochastic gradient descent (SGD).
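The tanh activation, its gradient, and the Gaussian (mean squared error) loss described above can be written out directly; this is an illustrative sketch, not the H2O implementation.

```python
import math

def tanh(x):
    # S-shaped activation with output in (-1, 1)
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2
    return 1.0 - tanh(x) ** 2

def mse_loss(y_true, y_pred):
    # Mean squared error, the loss associated with a Gaussian output distribution
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```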

Gradient boosting machine
A Gradient Boosting Machine (GBM) is built from many decision trees, each reducing the residual errors from the previous iteration (Bhagat et al., 2020). The term 'boosting' refers to this iterative process, which uses gradient descent for the optimization and improves the accuracy of the trees. GBM is an ensemble-based method for regression and classification that applies a weak learner to the data to generate a set of decision trees (Bhagat et al., 2021). The optimization problem of estimating the regression function f(.) of a statistical model, relating the input variables X to the outcome Y, can be expressed as in Algorithm 1; the model is fitted by maximizing the corresponding penalized likelihood, with the corresponding deviance. The algorithm steps are summarized below, where the input data are (x_i, y_i), i = 1, ..., N, M is the number of iterations, Ψ(y, f) denotes the loss function, ρ_t the gradient descent step size, and h(x, θ) the chosen base learner model.

Algorithm 1. Gradient Boost Algorithm
Initialize the function estimate f_0(x) with a constant.
For t = 1 to M:
(i) compute the negative gradient g_t(x) of the loss function;
(ii) fit a new base-learner function h(x, θ_t) to g_t(x);
(iii) find the best gradient descent step size ρ_t;
(iv) update the function estimate: f_t ← f_{t−1} + ρ_t h(x, θ_t).
End for.

Gradient boosting thus involves three components: a loss function to be optimized, a weak learner used to make predictions, and an additive model that adds weak learners to minimize the loss function. Gradient boosting builds the first learner on the training dataset to predict the samples and calculates the loss (the difference between the real value and the output of the first learner), which is used to build an improved learner in the second stage. At every step, the residual of the loss function is calculated by the gradient method, and the new residual becomes the target variable for the subsequent iteration.
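The steps above can be sketched with squared loss and a one-split regression stump as the base learner h(x, θ). This is a minimal single-feature illustration of the boosting idea, not H2O's distributed tree implementation; the toy data are invented.

```python
def fit_stump(x, residuals):
    """Find the threshold split minimizing squared error on the residuals."""
    best = None
    for thr in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= thr]
        right = [r for xi, r in zip(x, residuals) if xi > thr]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, thr, lmean, rmean)
    _, thr, lmean, rmean = best
    return lambda xi: lmean if xi <= thr else rmean

def gbm_fit(x, y, n_trees=20, learn_rate=0.1):
    f0 = sum(y) / len(y)  # initial constant prediction f_0
    learners = []
    pred = [f0] * len(y)
    for _ in range(n_trees):
        # Negative gradient of squared loss = current residuals
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        h = fit_stump(x, residuals)
        learners.append(h)
        pred = [pi + learn_rate * h(xi) for pi, xi in zip(pred, x)]
    return lambda xi: f0 + learn_rate * sum(h(xi) for h in learners)

# Toy example: a noisy increasing relationship
x = [1, 2, 3, 4, 5, 6]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
model = gbm_fit(x, y)
```

Each iteration fits the stump to the residuals of the previous ensemble, so the training error shrinks step by step, exactly as in steps (i)-(iv).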

H2O framework
The machine learning models are trained in RStudio using the H2O package (R Core Team, 2013). The H2O package provides an open-source, scalable, lightweight predictive analytics platform for supervised and unsupervised machine learning algorithms and data analysis.
Deep learning, gradient boosting machine, random forest, and generalized linear algorithms are available through its built-in API (application programming interface) written in Java. The platform includes several parameters for limiting model overfitting and ensuring accurate results, such as restricting the number of trees, limiting the number of epochs, and using the right amount of regularization.
H2O distributes work using the map-reduce paradigm. The first step is to import the data file into H2O, where each node loads data in parallel and the data are converted into several chunks by the Map function. H2O uses a distributed key-value store. The map-reduce framework is applied to obtain robust and generalized model performance for regression and classification. The framework of the model is shown in Figure 4.

Data collection and inputs
In the initial phase, we received the datasets for the two stations: Kiashahr, Iran, and Ranichauri in Uttarakhand, India. The single input is the maximum air temperature, and the target is monthly Epan. Temperature is the most influential meteorological parameter for the evaporation process, so relying on it as the sole predictor is physically reasonable.

Data division
To develop the predictive models, the whole dataset is divided into two subsets, i.e. training and testing. The monthly datasets of each selected station have been divided into training and testing subsets in a proportion of 70% and 30%, respectively. For Station 1 (Kiashahr) the training dataset consists of 70 data vectors and the testing dataset of 26 data vectors, while for Station 2 (Ranichauri) the training dataset consists of 110 data vectors and the testing dataset of 46 data vectors.
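A 70/30 division can be sketched as follows; the random seed and the exact sampling scheme are assumptions, since the study only states that the split was random, and the placeholder data below are not the real station records.

```python
import random

def split_70_30(records, seed=42):
    """Randomly partition records into ~70% training and ~30% testing subsets."""
    idx = list(range(len(records)))
    random.Random(seed).shuffle(idx)      # reproducible shuffle
    cut = int(round(0.7 * len(records)))  # 70% boundary
    train = [records[i] for i in idx[:cut]]
    test = [records[i] for i in idx[cut:]]
    return train, test

# Placeholder monthly (T_max, E_pan) pairs, not real station data
data = [(20.0 + i % 12, 2.0 + (i % 12) * 0.2) for i in range(96)]
train, test = split_70_30(data)
```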

Data modeling and analysis
In this phase, the modeling techniques are applied and their parameters calibrated to optimal values. The DL and GBM models are trained on the training sets until satisfactory accuracy is reached.
Various epochs and tuning parameters are explored in building the models to achieve the best accuracy on the training dataset; the models are then applied to the testing datasets after the highest training accuracy is achieved. A five-fold cross-validation approach is applied when training the models. The H2O-DL model is a high-level artificial neural network whose hidden-layer weights are optimized via backpropagation. The adaptive learning rate algorithm (ADADELTA), enabled by default in H2O, generally produces better results than a constant learning rate.
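The five-fold cross-validation index scheme can be sketched generically as below; this uses contiguous folds for illustration, whereas H2O's own fold assignment may differ.

```python
def kfold_indices(n, k=5):
    """Yield (train_idx, valid_idx) index lists for k-fold cross-validation."""
    # Distribute n samples into k contiguous folds of near-equal size
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    for i in range(k):
        valid = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, valid

# e.g. five folds over 70 training vectors
splits = list(kfold_indices(70, k=5))
```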
H2O-GBM uses distributed trees, where each node generates a local histogram in parallel using only node-local data. GBM creates a histogram with a predefined number of bins, which defaults to 1024. The histograms are then merged into one and a split column is selected to make the decision; each row is reassigned to a node and the procedure is repeated. H2O-GBM also provides a stochastic GBM, which can significantly improve performance over the standard GBM execution. First, the GBM model is trained with default parameters, with the number of trees set as ntrees = 100 and the learning rate as 0.1. The early stopping function is used to prevent overfitting and to find the optimal number of trees while training the model. Various epochs and early stopping parameters have also been applied in the deep learning model to obtain better performance and find optimal values.
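The histogram-based split search described above can be sketched for a single column. This is a simplified single-machine illustration of the idea (bin the column, then scan bin boundaries for the best variance-reduction split), not H2O's distributed implementation.

```python
def histogram_best_split(x, residuals, nbins=1024):
    """Approximate split search: bin a column, then scan bin boundaries."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / nbins or 1.0
    # Accumulate per-bin count and residual sum (the 'local histogram')
    counts = [0] * nbins
    sums = [0.0] * nbins
    for xi, ri in zip(x, residuals):
        b = min(int((xi - lo) / width), nbins - 1)
        counts[b] += 1
        sums[b] += ri
    total_n, total_s = sum(counts), sum(sums)
    best_gain, best_edge = 0.0, None
    ln, ls = 0, 0.0
    for b in range(nbins - 1):
        ln += counts[b]
        ls += sums[b]
        rn, rs = total_n - ln, total_s - ls
        if ln == 0 or rn == 0:
            continue
        # Variance-reduction gain for a mean-predicting split at this boundary
        gain = ls * ls / ln + rs * rs / rn - total_s * total_s / total_n
        if gain > best_gain:
            best_gain, best_edge = gain, lo + (b + 1) * width
    return best_edge, best_gain
```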

Evaluation statistical indicators
To evaluate the performance of the models, the following statistical indicators have been selected: mean absolute error (MAE), Nash-Sutcliffe efficiency (NSE), index of agreement (IOA), and root mean square error (RMSE) (Chai & Draxler, 2014; Malik, Tikhamarine, et al., 2021; Tur & Yontem, 2021; Yaseen, Naganna, et al., 2020). The statistical measures are represented by Eqs. (10)-(13):

MAE = (1/N) Σ_{i=1}^{N} |Epan_obs,i − Epan_pre,i|    (10)

RMSE = sqrt[(1/N) Σ_{i=1}^{N} (Epan_obs,i − Epan_pre,i)²]    (11)

NSE = 1 − [Σ_{i=1}^{N} (Epan_obs,i − Epan_pre,i)²] / [Σ_{i=1}^{N} (Epan_obs,i − mean(Epan_obs))²]    (12)

IOA = 1 − [Σ_{i=1}^{N} (Epan_obs,i − Epan_pre,i)²] / [Σ_{i=1}^{N} (|Epan_pre,i − mean(Epan_obs)| + |Epan_obs,i − mean(Epan_obs)|)²]    (13)

where Epan_obs,i and Epan_pre,i refer to the observed and predicted values of pan evaporation for the ith observation, mean(Epan_obs) is the mean of the observed pan evaporation values, and N is the total number of observations.
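The four indicators can be implemented directly from their standard definitions; a plain-Python sketch, with synthetic example values rather than station data:

```python
import math

def mae(obs, pre):
    # Mean absolute error
    return sum(abs(o - p) for o, p in zip(obs, pre)) / len(obs)

def rmse(obs, pre):
    # Root mean square error
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pre)) / len(obs))

def nse(obs, pre):
    # Nash-Sutcliffe efficiency: 1 = perfect, 0 = no better than the mean
    mean_o = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pre))
    den = sum((o - mean_o) ** 2 for o in obs)
    return 1 - num / den

def ioa(obs, pre):
    # Willmott's index of agreement, bounded in [0, 1]
    mean_o = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pre))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pre))
    return 1 - num / den

obs = [1.0, 2.0, 3.0, 4.0]   # synthetic observed values
pre = [1.1, 1.9, 3.2, 3.9]   # synthetic predictions
```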

Results and discussion
The present study aims to estimate evaporation based on H2O models and to explore their performance with temperature as the single input for Station 1 and Station 2. The models were compared through their prediction skill using the statistical metrics of Eqs. (10)-(13). For Station 1, the observed evaporation values ranged from 0.40 to 10.43 mm, while the values estimated by the DL model during testing ranged from 0.71 to 5.76 mm; for the GBM model, the estimated values ranged from 1.21 to 4.13 mm over the testing period.
Machine learning is applied to analyze the complex relationship between independent and dependent variables using training and testing datasets. By optimizing the model architecture, an improved modeling process can be obtained. In some countries, a large number of input parameters is not readily available for modeling.
For this purpose, empirical methods are used for the estimation of pan evaporation or evapotranspiration with a very limited number of parameters, but these empirical methods are not sufficient or applicable in regions with completely different climates. It is not always possible to install evaporation pans, or to apply Penman's equation, at every location. Therefore, an alternative approach is needed for estimating pan evaporation with limited climate data, and with less error, in large-scale water footprint studies. The novelty of this study is the accurate estimation of pan evaporation with a single input parameter.
A temperature-based model can provide accurate results for the estimation of evaporation, and many researchers have predicted pan evaporation with a single equation based on mean temperature. Here, using temperature as the input variable gives superior predicted-evaporation results for Station 1, with MAE values of 0.5691 and 0.5907 on the testing dataset for the DL and GBM models, respectively.
Šermer (1961) derived an equation for the estimation of daily evaporation based on mean air temperature (Tmean).
Mrkvičková (2007) published several equations based on mean temperature, derived from statistical analysis of data from 2001 to 2005, for example:

Epan = 1.2061 Tmean^1.0712 − 1.3906 Tmean + 1.7986    (15)

There are many applications that do not require highly accurate evaporation estimates but do need the estimation error to be quantified with sufficient accuracy. Studies related to water balance, virtual water, evapotranspiration, evaporation, and similar applications can therefore achieve effective results with few input variables.
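As a sketch, Eq. (15) can be evaluated directly; the example temperature below is an arbitrary illustrative value, not taken from the study datasets.

```python
def e_pan_mrkvickova(t_mean):
    """Empirical pan evaporation estimate from mean air temperature, Eq. (15)."""
    return 1.2061 * t_mean ** 1.0712 - 1.3906 * t_mean + 1.7986

# Evaporation estimate at an illustrative mean temperature of 20 degrees C
e = e_pan_mrkvickova(20.0)
```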

Kiashahr Station
The DL and GBM models were trained on 72 data points, while 24 data points were utilized for testing at Station 1 (Kiashahr). Results are presented in Table 2; the DL model performed excellently and was found most suitable for Station 1 over the testing period, indicated by MAE (0.5691 mm/month), RMSE (0.7111 mm/month), NSE (0.7496), and IOA (0.9413). The lower values of MAE and RMSE, and higher values of NSE and IOA, show the superiority of the DL model using the maximum air temperature input parameter.
For Station 1, the GBM model also provides efficient results on the testing dataset, with an MAE of 0.5907 mm/month (Table 2). The essential parameters deployed in the DL and GBM models are likewise presented in Table 2. Based on the statistical results (MAE, RMSE, NSE, and IOA), the DL model was superior to the GBM model over the testing period for the Kiashahr station. Figures 5 and 6 compare the observed and predicted evaporation at Kiashahr station for the DL and GBM models during the testing period; they indicate that the DL model is the most suitable and that its difference between observed and predicted evaporation is minimal.

Y (predicted value) = β_o + β_1 X
We used a linear regression approach to model the relationship between observed and predicted evaporation values, where Y denotes the dependent variable (evaporation), X the independent variable (i.e. Tmax), β_o the intercept of the regression line, and β_1 its slope; β_1 measures the change in mean evaporation for a unit change in X. Figure 5 shows the predicted versus measured evaporation values for the testing period. The scatter plots show that the predicted values are well scattered around the best-fitted regression line, with determination coefficients R² = 0.80 for the DL model and R² = 0.78 for the GBM model; the DL model estimates β_o = 0.98 and β_1 = 0.18. Figure 5 also shows that the residual error for the testing set varies between −1.5% and 1.5%. The linear regression fit between predicted and measured pan evaporation for Station 1 is shown in Figure 6: the regression equation of the GBM model is Y = 1.15 + 0.58 X, where 1.15 is the intercept and 0.58 the slope, the relative error (RE) lies between −2% and 2%, and the GBM predictions have R² = 0.78 at the testing stage.
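The intercept β_o, slope β_1, and R² reported above come from an ordinary least-squares fit, which can be sketched generically as below; the toy data are illustrative, not the station values.

```python
def linfit(x, y):
    """Ordinary least squares: returns intercept b0, slope b1, and R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = my - b1 * mx            # intercept
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return b0, b1, 1 - ss_res / ss_tot  # R^2 = 1 - SS_res / SS_tot
```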

Ranichauri Station
For the Ranichauri station, the observed evaporation values ranged between 0.90 and 5.50 mm, and the DL-model estimates between 1.34 and 4.85 mm, on the testing dataset; for the GBM model, the estimated values ranged between 1.59 and 3.71 mm over the testing period. The DL and GBM models were trained on 110 data points, while the remaining 46 data points were utilized for testing at Station 2 (Ranichauri). The Ranichauri station achieves the highest accuracy with the DL model, with MAE = 0.3693 mm/month, RMSE = 0.4357 mm/month, NSE = 0.8344, and IOA = 0.9507, while the GBM model yields comparatively lower accuracy; Table 3 summarizes the performance metrics at Ranichauri station in the testing phase. The DL model thus provided the best outcomes over the testing period. Figures 7 and 8 represent the accuracy of the two models in terms of predicted versus measured (observed) evaporation at Ranichauri station and show that DL is the best-suited model for Station 2; the difference between predicted and measured evaporation during testing is smaller for the deep learning model (Figure 7). Caution is needed when predicting pan evaporation for temperatures outside the range of values observed in this study, as the fitted relationship may not hold there.
Linear regression can be used to estimate pan evaporation values that lie within the observed range (0.90 mm to 5.50 mm) of the testing dataset at Station 2. Similarly, in Figures 7 and 8, the predicted values of evaporation are well scattered around the best-fitted regression line, with R² = 0.86 for the DL model and R² = 0.84 for the GBM model, respectively. The DL model estimates β0 = 0.52 and β1 = 0.79, and the GBM model estimates β0 = 1.18 and β1 = 0.51, respectively. The figures show that the residual error for the DL and GBM models within the testing set varies between −1.5% and 1.5%.
Similarly, the predictive performance of the applied DL and GBM models was evaluated using the Taylor diagram (Taylor, 2001), which combines multiple measures, i.e. RMSE, standard deviation, and correlation coefficient, in a polar coordinate system. Figure 9(a,b) confirms the same ranking of the models as the results presented in Tables 2 and 3, respectively.
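A Taylor diagram can display all three measures in one polar plot because they are linked by a law-of-cosines identity: the centered RMS difference E′ satisfies E′² = σo² + σp² − 2·σo·σp·R. The sketch below (our own illustration, not the authors' code) computes these statistics and lets the identity be checked numerically.

```python
# Statistics combined in a Taylor diagram: standard deviations of observed
# and predicted series, Pearson correlation R, and centered RMS difference E'.
# They obey E'^2 = s_o^2 + s_p^2 - 2*s_o*s_p*R (illustrative sketch).
import math

def taylor_stats(obs, pred):
    """Return (s_o, s_p, r, e_prime) for equal-length sequences obs and pred."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    s_o = math.sqrt(sum((o - mo) ** 2 for o in obs) / n)    # std of observations
    s_p = math.sqrt(sum((p - mp) ** 2 for p in pred) / n)   # std of predictions
    r = sum((o - mo) * (p - mp)
            for o, p in zip(obs, pred)) / (n * s_o * s_p)   # correlation
    e_prime = math.sqrt(sum(((p - mp) - (o - mo)) ** 2      # centered RMS diff
                            for o, p in zip(obs, pred)) / n)
    return s_o, s_p, r, e_prime
```

On the diagram, each model is a single point whose radius is σp and whose angle encodes R; the better model plots closer to the reference point (σo, R = 1).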
Several scholars have reached similar results using advanced machine learning models (Kişi & Cimen, 2009; Krauss et al., 2017; Laaboudi et al., 2012). Kumar et al. (2016) compared the efficiency of various neural models for estimating evaporation and identified that the extreme learning model has the highest accuracy and is comparatively faster than other models. Wu et al. (2019) also identified that extreme learning models are capable of predicting evaporation with convincing accuracy. Saggi and Jain reported similar findings, identifying that the deep learning model achieves higher accuracy than the generalized linear and gradient boosting machine models in estimating daily evaporation. Dou and Yang (2018) discerned that extreme learning models and adaptive neuro-fuzzy techniques are suitable for evaporation calculation. Guan et al. (2020) applied the Krill Herd Algorithm (KHA) to optimize Support Vector Regression (SVR) for daily E pan prediction in coastal regions of Iran; their findings reveal that the SVR-KHA model outperformed the standalone SVR model. Seifi and Soroush (2020) estimated daily E pan in five climatic regions of Iran (i.e. hyper-arid, arid, semi-arid, sub-humid, and humid) by enhancing ANN with the Whale Optimization Algorithm (WOA), Grey Wolf Optimizer (GWO), and Genetic Algorithm (GA); their results expose the better feasibility of the GA-ANN model over the other models. In the present study, the scatter plots, linear regression fits, and Taylor diagrams confirmed the superior performance of the DL model compared to the GBM model for estimating monthly pan evaporation at both stations.
Towards this end, we have not only implemented but also analyzed the effectiveness of the multilayer deep learning model and the gradient boosting model on the training and testing datasets of the two sites. The DL and GBM models estimated the pan evaporation of Ranichauri station more efficiently than that of Kiashahr station. Additionally, Tables 2 and 3 show that the outcomes of the DL model at Ranichauri station are much better than the Kiashahr station results using a single input variable. Thus, Station 2 demonstrated superior results with both the DL and GBM models.
Pan evaporation analysis was carried out using a linear regression approach to model the relationship between observed and predicted evaporation values.
The best-fitted regression line was obtained with the deep learning model, with a determination coefficient of R² = 0.86 at Ranichauri station as compared to R² = 0.80 at Kiashahr station for the estimation of pan evaporation. Analysis of the relations between E pan and the meteorological variables indicated that E pan has a significant positive correlation with temperature.
This work is limited in temporal scale and to only two stations. In future research, more stations can be investigated, and their modeling performance can be compared with other advanced machine learning models and empirical equations. In addition, more climatic parameters could be considered, and their effect on evaporation mapped at the spatial scale with geospatial methods, for better utilization of available water resources to enhance agricultural practices. Furthermore, the integration of feature selection algorithms as a prior stage for identifying the essential predictors can be inspected for better prediction accuracy (Chen et al., 2021; Hadi et al., 2019).

Conclusion
The present study has explored the effectiveness of DL and GBM models in estimating evaporation at two stations, i.e. Kiashahr (Iran) and Ranichauri (India), using the maximum air temperature as the sole input. The models were built with the H2O framework, and a fivefold cross-validation test was deployed to estimate their performance. The results and accuracy of the models were evaluated and compared with respect to the root mean square error (RMSE), Nash-Sutcliffe efficiency (NSE), index of agreement (IOA), and mean absolute error (MAE). Results revealed that the deep learning model performed best at both the Kiashahr and Ranichauri stations on the testing datasets, as indicated by MAE (0.5691 and 0.3693 mm/month), RMSE (0.7111 and 0.4357 mm/month), NSE (0.7496 and 0.8344), and IOA (0.9413 and 0.9507), respectively, with the Ranichauri station achieving the highest overall accuracy. The findings of the study demonstrate that the deep learning (DL) model outperformed the GBM model at both stations.
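The fivefold cross-validation scheme mentioned above partitions the training data into five folds, each serving once as the validation set. In the study this is handled internally by H2O (via its `nfolds` option); the sketch below is only a hand-rolled, pure-Python equivalent of that fold splitting, for illustration.

```python
# Hand-rolled k-fold index splitting, mirroring what fivefold cross-validation
# does to the training data (illustrative sketch; H2O performs this internally).

def kfold_indices(n_samples, k=5):
    """Yield (train_idx, valid_idx) pairs for k-fold cross-validation."""
    # Assign samples to folds round-robin: fold i gets indices i, i+k, i+2k, ...
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        valid = folds[i]                                     # held-out fold
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield sorted(train), sorted(valid)
```

Each of the k rounds trains on k−1 folds and scores on the held-out fold; averaging the k validation scores gives the cross-validated estimate of model performance.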

Disclosure statement
No potential conflict of interest was reported by the author(s).