Coupling a firefly algorithm with support vector regression to predict evaporation in Northern Iran

ABSTRACT Evaporation accounts for varying shares of the water balance under different climatic conditions, and predicting it correctly is a significant challenge for water resources management in watersheds. Evaporation behaves in a complex and nonlinear way, is not measured at many meteorological stations (at least during some time periods), and is recorded by stations that are unevenly distributed in many developing countries, including Iran. The main objective of this work was therefore to predict evaporation at two meteorological stations (Rasht and Lahijan) located in Gilan province in northern Iran over the 2006–2016 period. To that end, the meteorological parameters recorded at the two stations that had the highest impact on evaporation prediction were identified using the Pearson correlation coefficient. The selected parameters were then used, under separate scenarios, as inputs to support vector regression (SVR) and to an SVR model coupled with a firefly algorithm (SVR-FA) in order to simulate evaporation values on a daily scale. Evaporation showed the highest correlation with net solar radiation at Lahijan station and with saturation vapor pressure deficit at Rasht station. In the testing phase, the root mean square error (RMSE) values of the SVR and SVR-FA predictions ranged from 1.05 to 1.43 and 1.02 to 1.31 mm, respectively, at Lahijan station and from 1.02 to 1.28 and 0.88 to 1.17 mm, respectively, at Rasht station across the scenarios. For the underpredicted evaporation data set, the RMSE reduction from SVR1 to SVR7 was 27% at Lahijan and 18% at Rasht station, whereas the RMSE reduction from SVR-FA1 to SVR-FA7 was 18% at Lahijan and 26% at Rasht station. Thus, for the underpredicted data set, increasing the number of input parameters reduced the prediction error more conspicuously for SVR at Lahijan station and for SVR-FA at Rasht station. Analysis of SVR and SVR-FA performance over 2-mm intervals of measured evaporation showed that the prediction error generally increased with increasing evaporation, with the highest errors observed at the 8-10 mm interval for both Lahijan and Rasht stations (error rates of 3.42 and 2.42 mm/day at Lahijan and 6.13 and 5.84 mm/day at Rasht station, with the SVR1 and SVR-FA1 models, respectively).


Introduction
The accurate prediction of evaporation is a major challenge in the water resources management of watersheds, and its modeling is of great importance in regions where there is insufficient measured data in terms of either spatial or temporal distribution (Dalkiliç, Okkan, & Baykan, 2014). Evaporation varies depending on the climatic conditions and the availability of surface water bodies in any given area, and its contribution to the discharge of surface water and to atmospheric feed also varies accordingly. This variation affects the design, planning, and management of irrigation systems and water resources (Sudheer, Gosain, Mohana Rangan, & Saheb, 2002; Tabari, Marofi, precipitation, wind speed, sunshine hours, and relative humidity (Gavin & Agnew, 2004; Singh & Xu, 1997; Vallet-Coulomb, Legesse, Gasse, Travi, & Chernet, 2001). Dalkiliç et al. (2014) used average temperature, relative humidity, wind speed, minimum air temperature, maximum air temperature, and solar radiation as input parameters for predicting daily evaporation. Their results show that air temperature and wind speed make the most significant contribution to evaporation prediction, while relative humidity makes the least significant contribution. Using meteorological data from eight meteorological stations in China for the period 1961 to 2000, Wang, Kişi, Zounemat-Kermani, and Li (2017) presented a model that uses air temperature, solar radiation, sunshine hours, relative humidity, and wind speed as its input parameters, and that makes the best prediction of evaporation with a root mean square error (RMSE) of 0.77 mm/day. Kim, Shiri, Kişi, and Singh (2013) used air temperature, wind speed, sunshine hours, relative humidity, and solar radiation as input parameters to artificial neural networks (ANNs) for predicting evaporation in South Korea for the period 1985 to 1990, and concluded that air temperature and solar radiation have the most significant impact on daily evaporation prediction. 
Goyal, Bharti, Quilty, Adamowski, and Pandey (2014) use the main meteorological parameters in the form of four scenarios to predict daily evaporation amounts in India using a support vector regression (SVR) model, and report RMSE values in the range of 1.92 to 2.12 mm/day. It is impossible to model hydrological systems in their entirety due to the complexity of determining all the relevant parameters and the lack of statistical information; thus, the use of simulation methods such as artificial intelligence models is essential (Kişi, Genc, Dinc, & Zounemat-Kermani, 2016;Mosavi, Bathla, & Varkonyi-Koczy, 2017;Wu, Chau, & Li, 2009). The ANN technique is one such method, and its suitability for hydrological research applications is verified by the results of a number of studies (Cigizoglu & Kişi, 2006;Cobaner, Unal, & Kişi, 2009;Guven & Kişi, 2011;Kumar, Raghuwanshi, Singh, Wallender, & Pruitt, 2002;Moghaddamnia, Ghafari Gousheh, Piri, Amin, & Han, 2009;Taormina, Chau, & Sivakumar, 2015;Wu & Chau, 2006). The ANN is an effective tool for modeling nonlinear systems as it does not require complex mathematical equations to be defined for the phenomenon under study. This technique has been widely used for predicting daily evaporation (Guven & Kişi, 2011;Kim et al., 2013;Shirsath & Singh, 2010;Tan, Shuy, & Chua, 2007). Tabari et al. (2010) reported better performance in an ANN compared to nonlinear regression (with RMSE values of 0.42 and 0.92, respectively) for their evaporation predictions at five meteorological stations in Hamadan province, Iran over the period 1996 to 2005. A sensitivity analysis of their results shows that air temperature and wind speed are the most significant factors in the evaporation prediction. Dalkiliç et al. (2014) predicted daily evaporation in Erzincan, Turkey over the period 2004 to 2010, and found that their ANN (RMSE = 2.27 mm/day) performed better than their Penman model (RMSE = 3.06 mm/day).
According to the literature, predicting evaporation using artificial intelligence algorithms has resulted, in some cases, in superior performance to that of ANNs. Kim et al. (2013) predicted evaporation values on a daily scale at the Daegu and Ulsan meteorological stations in South Korea and report satisfactory performance for all three of their methods; using a generalized regression neural networks model (GRNNM), an adaptive neurofuzzy inference system (ANFIS), and a multilayer perceptron (MLP) neural network model, they obtained RMSEs of 1.665, 1.235, and 1.396 mm/day at Daegu, respectively, and RMSEs of 1.136, 1.215, and 1.364 mm/day at Ulsan, respectively. Allawi and El-Shafie (2016) obtained a correlation coefficient of .96 using an ANFIS model to predict daily evaporation in Johor, southeastern Malaysia. Kişi et al. (2016) used three methods to predict daily evaporation in Turkey, and found that their ANN (RMSE = 2 mm/day) performed better than their chi-squared automatic interaction detector (CHAID, RMSE = 2.06 mm/day) and their classification and regression tree (CR-T, RMSE = 2.07 mm/day), although the differences are not significant.
SVR models have been widely used in watershed hydrological studies and water resources management systems in recent years (Baydaroglu & Koçak, 2014; Cheng-Ping et al., 2011). Their advantages include a fast data-processing speed and higher precision than other classical methods (Baydaroglu & Koçak, 2014). Wu et al. (2009) applied the methods of autoregressive (integrated moving) average (ARIMA), k-nearest neighbors (KNN), ANN, and SVR to streamflow prediction in China over the period 1974 to 2003, finding that the SVR method has the highest precision (with RMSE values of 376.6, 561.5, 299.7, and 148.0 m³/s, respectively). Kişi (2015) used air temperature, wind speed, relative humidity, and solar radiation as the input parameters of an SVR model to predict evaporation in Turkey over the period 2002 to 2006, and the results are indicative of appropriate performance (RMSE = 0.597 mm/day). Goyal et al. (2014) compared the daily evaporation values predicted by four methods for Jharkhand state, India, and found that the SVR (RMSE = 1.92 mm/day) and fuzzy logic (RMSE = 1.95 mm/day) methods produced better estimates than the ANN (RMSE = 2.34 mm/day) and ANFIS (RMSE = 2.94 mm/day) methods.
Optimization algorithms can be effective for optimizing the training of artificial intelligence models (Chau, 2007;Chen, Chau, & Busari, 2015;Gholami, Chau, Fadaee, Torkaman, & Ghaffari, 2015). The firefly algorithm is a relatively novel optimization technique which has two advantages over other similar algorithms. First, it is based on attraction, and attractiveness decreases with distance. This means that the entire population is automatically divided into subgroups which swarm around local optima until eventually the best solution can be found. Second, these subgroups enable the firefly algorithm to simultaneously find all optimal modes (Yang & He, 2013). Studying groundwater contamination, Kazemzadeh-Parsi, Daneshmand, Ahmadfard, and Adamowski (2015) compared a finite-element method (FEM) numerical solution and a modified firefly algorithm with conventional optimization methods (such as a genetic algorithm), and presented their optimized model as an effective tool that is applicable to the remediation and management of contaminated aquifers. In their study of water-level fluctuations in Urmia lake in Iran, Kişi et al. (2015) conclude that their firefly algorithm and SVR method produced better results than their ANN and genetic algorithm, and that the SVR method coupled with the firefly algorithm can be used as a new model and tool for defining various strategies to predict the lake's water level. Ghorbani et al. (2017) utilized SVR and SVR coupled with a firefly algorithm for predicting the field capacity (FC) and permanent wilting point (PWP) of 215 soil samples collected from the East Azerbaijan province in Iran, finding that the coupled method performed better than the SVR method. The FC predictions produced RMSE values of 18.36 and 8.74 mm/m for the SVR and coupled methods, respectively, and the PWP predictions produced RMSE values of 21.75 and 10.61 mm/m for the SVR and coupled methods, respectively.
The accurate identification and prediction of the components of the hydrological cycle is essential to designing, operating, and analyzing effective water resources systems. The process of evaporation is a fundamental component of the hydrological cycle, but evaporation is subject to nonlinear change and there is a lack of measured evaporation data at many meteorological stations for certain time periods, a problem that is further compounded by the uneven spatial distribution of the stations. Bearing all of this in mind, the main objective of this study is to use an SVR model and a hybrid SVR-based firefly algorithm (SVR-FA) model to simulate the evaporation over the period 2006 to 2016 for the Rasht and Lahijan meteorological stations in Gilan province, northern Iran.

Data collection sites
In this study, the main meteorological parameters recorded at the Rasht and Lahijan stations during the period 2006 to 2016 were used to estimate daily evaporation values using the SVR and SVR-FA methods. The maximum air temperature, mean relative humidity, precipitation, sunshine hours, wind speed, saturation vapor pressure deficit, and net solar radiation were selected based on the maximum values of the Pearson correlation coefficients to predict the amount of evaporation in seven different scenarios. The daily measured values of these parameters (over the period 2006 to 2016) were used. After analyzing the data, outlier values were eliminated from the data set. The selected data set was standardized using Eq. (1) below prior to being fed into the models. A total of 3155 and 3296 data points from the Lahijan and Rasht stations, respectively, were used in the modeling procedure. The recorded meteorological parameters and their ranges are presented in Table 1.
The Lahijan synoptic station is located at a latitude of 37°11'N, a longitude of 50°1'E, and an elevation of 34.2 m above sea level. The Rasht synoptic station is located at a latitude of 37°27'N, a longitude of 49°58'E, and an elevation of 24.9 m above sea level (Figure 1). The average temperature and sunshine hours measured over the time period covered in this study are 16.9°C and 5.22 h at Lahijan and 16.9°C and 6.50 h at Rasht. The average values of annual precipitation recorded are 1399 and 1278 mm/year at Lahijan and Rasht, respectively, and the average daily evaporation rates are 2.7 and 2.4 mm/day at Lahijan and Rasht, respectively. The evaporation amounts range from 0 to 10 mm/day at Lahijan and 0 to 11.4 mm/day at Rasht.

Support vector regression (SVR)
SVR is the regression form of the support vector machine (SVM), a family of supervised learning methods that can be used for classification and regression analysis. Introduced by Vapnik and Chervonenkis in 1974 and founded upon statistical learning theory, this method is based on dual classification in the arbitrary feature space and hence is well suited to prediction problems (Jha & Hayashi, 2014; Pai & Hong, 2007; Yoon, Jun, Hyun, Bae, & Lee, 2011).
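Two ingredients commonly associated with SVR can be illustrated briefly. The sketch below is not taken from the study: it assumes the usual epsilon-insensitive loss (errors smaller than a tolerance `epsilon` are ignored) and a Gaussian (RBF) kernel as one possible kernel function. The function names and the default values of `epsilon` and `gamma` are illustrative assumptions.

```python
import math

def epsilon_insensitive_loss(observed, predicted, epsilon=0.1):
    """Mean epsilon-insensitive loss: errors inside the tube cost nothing."""
    losses = [max(0.0, abs(o - p) - epsilon) for o, p in zip(observed, predicted)]
    return sum(losses) / len(losses)

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

# Hypothetical values: the first error (0.05) falls inside the tube, the second does not.
loss = epsilon_insensitive_loss([1.0, 2.0], [1.05, 2.5], epsilon=0.1)
similarity = rbf_kernel([0.2, 0.4], [0.2, 0.4])
```

In this form, a wider tube (larger `epsilon`) tolerates more error without penalty, and a larger `gamma` makes the kernel more local.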

The firefly algorithm
The firefly algorithm is a bio-inspired algorithm introduced by Yang in 2009 which simulates the social behavior of fireflies. These insects flash light, and each species has its own flash pattern. The attractiveness of a firefly is proportional to its light intensity or brightness. Taking the light intensity of each individual insect as the objective function, the social behavior of fireflies can be modeled as an optimization algorithm (Yang & He, 2013).
Note (Table 1): E = evaporation; e_s − e_a = saturation vapor pressure deficit; P = precipitation; RH = relative humidity; R_n = net solar radiation; SSH = sunshine hours; T = temperature; U_2 = wind speed.
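The attraction mechanism can be sketched in a few lines. The code below is a minimal illustration, not the implementation used in the study: it assumes the commonly used update in which a firefly moves toward any brighter one with attractiveness beta0 * exp(-gamma * r^2) plus a small random step. The function name, the parameter defaults, and the decay factor for the random step are assumptions chosen for illustration.

```python
import math
import random

def firefly_minimize(objective, bounds, n_fireflies=15, n_iterations=100,
                     beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Minimize `objective` inside box `bounds` with a basic firefly scheme."""
    rng = random.Random(seed)
    # Random initial positions inside the search box.
    swarm = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_fireflies)]
    cost = [objective(x) for x in swarm]
    best_cost = min(cost)
    best_x = list(swarm[cost.index(best_cost)])
    for _ in range(n_iterations):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter (lower cost)
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attraction fades with distance
                    for d, (lo, hi) in enumerate(bounds):
                        step = alpha * (rng.random() - 0.5) * (hi - lo)
                        moved = swarm[i][d] + beta * (swarm[j][d] - swarm[i][d]) + step
                        swarm[i][d] = min(hi, max(lo, moved))  # stay inside the box
                    cost[i] = objective(swarm[i])
                    if cost[i] < best_cost:
                        best_cost, best_x = cost[i], list(swarm[i])
        alpha *= 0.97  # gradually shrink the random step (assumed schedule)
    return best_x, best_cost

# Toy usage on a hypothetical objective (sum of squares).
def sphere(x):
    return sum(v * v for v in x)

best_x, best_cost = firefly_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

In a coupled setting, `objective` could plausibly be the validation RMSE of an SVR model as a function of its tuning parameters, although which parameters the firefly search adjusts is not specified in the text.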

Coupling SVR with a firefly algorithm
Not only can SVR implicitly detect complex nonlinear relationships between independent and dependent variables, it also has the ability to detect all possible interactions between predictor variables (Goyal et al., 2014). Compared to other metaheuristic algorithms, the firefly algorithm is robust and efficient in both local and global searches. It also has good exploitation capabilities and can find better solutions because many candidates (fireflies) gather closely around the optimal solution (Ghorbani et al., 2017). Considering the nonlinear trend of variation in evaporation, combining SVR with the firefly algorithm produces a new method which inherits the predictive abilities of SVR and the optimization capabilities of the firefly algorithm. This new method can make predictions with high accuracy at a reasonable speed. The general advantages of combining these two approaches are as follows:
1. It avoids having to explain the complex and nonlinear behaviors of predictive and predicted factors (in the present case, the process of evaporation).
2. It keeps the optimization algorithm from becoming trapped in local minima, thanks to the ability to find local as well as global solutions.
3. It simultaneously takes advantage of SVR models' predictive power and firefly algorithms' optimization potential in order to obtain the best results.
The SVR and SVR-FA models were used in seven different scenarios with various inputs (Table 2 and Figure 2), and the results were compared with the measured evaporation values. A set of daily evaporation values from the period 2006 to 2016 was randomly selected for training and testing the SVR and SVR-FA models, with 2405 and 750 data points used from Lahijan and 2502 and 794 data points from Rasht, respectively. In order to achieve better and faster learning, the meteorological data were normalized between 0 and 1 before being input into the two models and then converted back to their initial values after modeling. The equation used for the normalization process is:

$$X_n = \frac{X_i - X_{\min}}{X_{\max} - X_{\min}} \tag{1}$$

where $X_i$ is the measured data, $X_n$ is the standardized data, and $X_{\max}$ and $X_{\min}$ are the maximum and minimum data points, respectively. The computational procedures, including the development of the SVR and SVR-FA models, were implemented in a MATLAB (The MathWorks, 2012) environment, and parameters of the kernel function were optimized through trial and error.
Table 2 (recovered row): scenario 7 (models SVR7 and SVR-FA7) uses T_max, RH_mean, precipitation, sunshine hours, wind speed, e_s − e_a, and R_n as inputs. Note: e_s − e_a = saturation vapor pressure deficit; RH_mean = mean relative humidity; R_n = net solar radiation; T_max = maximum temperature.
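The scaling to the 0 to 1 range and the conversion back to the original units can be sketched as follows. This is a minimal illustration that assumes the usual min-max form implied by the definitions of X_i, X_n, X_max, and X_min; the function names and the sample values are hypothetical.

```python
def fit_min_max(values):
    """Return the (minimum, maximum) pair used for scaling."""
    return min(values), max(values)

def normalize(values, x_min, x_max):
    """Scale values to the 0-1 range: (x - x_min) / (x_max - x_min)."""
    span = x_max - x_min
    if span == 0:
        raise ValueError("cannot normalize a constant series")
    return [(x - x_min) / span for x in values]

def denormalize(scaled, x_min, x_max):
    """Convert scaled values back to their original units."""
    return [x_min + s * (x_max - x_min) for s in scaled]

# Hypothetical daily evaporation values in mm/day.
evaporation = [0.0, 2.5, 10.0]
x_min, x_max = fit_min_max(evaporation)
scaled = normalize(evaporation, x_min, x_max)
restored = denormalize(scaled, x_min, x_max)
```

One design point worth noting: the minimum and maximum would normally be taken from the training data and reused for the testing data, so that the two sets share one scale. Whether the study did this is not stated.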

Performance evaluation
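Two statistical indices, the root mean square error (RMSE) and the coefficient of determination (R²), were used to evaluate the performance of the two modeling approaches:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(E_{obs,i} - E_{pre,i}\right)^{2}}$$

$$R^{2} = \frac{\left[\sum_{i=1}^{n}\left(E_{obs,i} - \bar{E}_{obs}\right)\left(E_{pre,i} - \bar{E}_{pre}\right)\right]^{2}}{\sum_{i=1}^{n}\left(E_{obs,i} - \bar{E}_{obs}\right)^{2}\,\sum_{i=1}^{n}\left(E_{pre,i} - \bar{E}_{pre}\right)^{2}}$$

where $E_{obs}$, $E_{pre}$, $\bar{E}_{obs}$, and $\bar{E}_{pre}$ are the observed, predicted, average observed, and average predicted values of evaporation, respectively, and $n$ is the number of data points. Taylor (2001) introduced a single diagram to summarize multiple aspects of model evaluation indices, including the RMSE and correlation coefficient values, and recommended its use in natural science and hydrology studies for evaluating models' performance. Taylor diagrams can highlight the accuracy of models' predictions by comparing the measured and predicted values through visualizing a series of points on a polar plot. The azimuth angle of the diagram shows the correlation coefficient between the measured and predicted values, and the radial distance from the origin represents the standard deviation of the simulated values normalized by that of the measured values; the distance between a model's point and the reference point corresponds to the centered root mean square difference.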
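The two indices can be computed as in the sketch below. It assumes that the coefficient of determination is the squared Pearson correlation between observed and predicted values, which is what the definitions of the four quantities above suggest; the function names and sample numbers are illustrative.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination as the squared Pearson correlation."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)

# Hypothetical observed and predicted evaporation values (mm/day).
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
error = rmse(observed, predicted)
fit = r_squared(observed, predicted)
```

Note that this form of R² measures linear association only: predictions that are consistently biased can still score close to 1, which is why it is reported alongside the RMSE.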

Results and discussion
The trend of variation in measured evaporation values, which directly depends on some of the climatic parameters and is indirectly a function of time, along with the average recorded values of this component (Table 1) highlight its role in the water balance in the studied area. Evaporation values as high as about 10 mm/day were recorded at both stations, thus making the importance of generating accurate predictions even more evident. The evaporation that occurs on land largely depends on the soil's moisture content, especially in the top layers, which are fed by irrigation and/or precipitation. Changes in precipitation are therefore expected to have a significant effect on the evaporation that takes place in the surface soil. The situation is different for surface water bodies, as precipitation does not play a significant role (except when the water level is so low that precipitation will significantly increase it), while other climatic factors -such as the received solar radiation and the humidity -can limit or intensify evaporation. Figure 3 shows the relationship between the precipitation and evaporation measured at the Rasht and Lahijan stations, which are both located in a high-precipitation area of Iran -the average precipitation at Rasht and Lahijan is 1278 and 1399 mm/year, respectively, compared with an average of 250 mm/year for the entire country -so precipitation cannot be considered a significant factor. As shown in Figure 3, not only did an increase in precipitation fail to result in higher amounts of evaporation, but lower precipitation (that is, in the warm season when the received solar radiation is higher) has been accompanied by increased evaporation.

Correlation between evaporation and meteorological parameters
Pearson correlation coefficients were used to find the climatic parameters exhibiting the highest impact on the evaporation predictions, and the results are presented in Tables 3 and 4 (all of the correlation coefficients are significant at the 95% confidence level, p ≤ .05). It can be seen that the evaporation varies directly with the temperature and inversely with the relative humidity at both stations. The maximum temperature and average relative humidity were selected as inputs for the models based on the Pearson correlation coefficient values. As expected due to the climatic conditions of the study area, the evaporation showed a high correlation with the number of sunshine hours, saturation vapor pressure deficit, and net solar radiation, with the net solar radiation and saturation vapor pressure deficit exhibiting the highest correlation coefficients for the Lahijan and Rasht stations, respectively.
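Ranking candidate inputs by their correlation with evaporation can be sketched as follows. The variable names and the small data set are hypothetical and are not the station records; the ranking by absolute correlation is an assumption about how "highest correlation" would be applied when both positive and negative relationships occur.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rank_inputs(candidates, target):
    """Sort candidate inputs by the absolute value of their correlation."""
    scores = {name: pearson_r(series, target) for name, series in candidates.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical daily values, for illustration only.
evaporation = [1.0, 2.0, 3.0, 4.0]
candidates = {
    "net_radiation": [2.0, 4.1, 5.9, 8.0],
    "relative_humidity": [90.0, 80.0, 75.0, 60.0],
    "precipitation": [5.0, 0.0, 7.0, 1.0],
}
ranking = rank_inputs(candidates, evaporation)
```

With these toy numbers, the net radiation series ranks first and the relative humidity series has a negative coefficient, mirroring the direct and inverse relationships described above.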

Evaporation prediction
The evaporation values measured at the Lahijan and Rasht stations, and those predicted by the SVR and SVR-FA models at the testing phase, are shown in Figures 4 and 5. The RMSEs and coefficients of determination between the measured and predicted evaporation amounts for both stations in both the training and testing phases are presented in Table 5, which shows that for both the training and testing phases for both stations, the RMSEs of the evaporation prediction decrease as the number of model inputs increase, although the magnitude of the decrement varies for different meteorological parameters. For the Lahijan station, using the SVR model in the training phase, the highest RMSE decrease occurs for the scenarios in which the saturation vapor pressure deficit, sunshine hours, and net solar radiation are added to the set of input parameters (the RMSE decreases by 9.8, 8.2, and 8.1%, respectively). In the testing phase, the saturation vapor pressure deficit and sunshine hours are the parameters with the most significant roles in estimating evaporation, resulting in decreases of 9.7 and 7.4% in the RMSE, respectively. For the SVR-FA model, the addition of the sunshine hours and relative humidity to the input parameters in the training phase reduces the prediction error by approximately 10.5 and 9.8%, respectively, whereas in the testing phase, only the addition of the relative humidity results in a relatively significant decrease (9.0%) compared to other scenarios. For the Rasht station, for both the SVR and SVR-FA models, the addition of the saturation vapor pressure deficit and the net solar radiation to the input parameters brings about the highest error reduction in the training and testing phases; the addition of these parameters decreases the RMSE by 8.2 and 11.7% in the SVR model, respectively, and by 8.4 and 11.1% in the SVR-FA model, respectively. 
A comparison of Figures 4 and 5, as well as the results in Table 5, indicates that for the same scenarios, the SVR-FA model provides more accurate predictions than the SVR model for both stations in both the training and testing phases, with the highest RMSE decrement from SVR to SVR-FA occurring in the testing phase in the third scenario for the Lahijan station (approximately 14.0%) and the seventh scenario for the Rasht station (approximately 13.7%). In a study conducted to predict daily evaporation over the period 1987 to 1990 in California, Kişi (2006) showed that the temperature and solar radiation had the highest impact on the estimation of this component, whereas the wind speed and precipitation had the lowest impact. A look at the days when the evaporation amount is underpredicted or overpredicted (Table 6) shows that at both the Lahijan and Rasht stations a higher number of inputs resulted in lower prediction errors for both the SVR and SVR-FA models. This is also the case with the entire set of evaporation data (Table 5), except for the SVR-FA models for Lahijan, which show a slight increase in prediction errors from SVR-FA5 to SVR-FA6 in the underpredicted data set and from SVR-FA4 to SVR-FA5 in the overpredicted data set. 
Note (Tables 3 and 4): e_s − e_a = saturation vapor pressure deficit; RH_max = maximum relative humidity; RH_mean = mean relative humidity; RH_min = minimum relative humidity; R_n = net solar radiation; T_max = maximum temperature; T_mean = mean temperature; T_min = minimum temperature.
For the Lahijan station, the lowest and highest differences between the numbers of days with underprediction or overprediction of evaporation amounts are 6 (SVR7) and 70 (SVR3) days for the SVR models and 6 (SVR-FA6) and 38 (SVR-FA7) days for the SVR-FA models, indicating that the difference between the numbers of underpredicted and overpredicted days is lower for the SVR-FA models. For the Rasht station, the values are 4 (SVR1) and 50 (SVR6) days for the SVR models and 8 (SVR-FA5) and 58 (SVR-FA1) days for the SVR-FA models. Another important point drawn from Table 6 is that at both stations and in all scenarios, both the SVR and SVR-FA models performed better on the days when the evaporation was overpredicted. For the Lahijan station, the lowest and highest RMSE decrements between the underpredicted and overpredicted sets are approximately 14.3% (SVR1) and 19.4% (SVR3) for the SVR models and 7.3% (SVR-FA1) and 21.3% (SVR-FA4) for the SVR-FA models. For the Rasht station, these values are approximately 27.9% (SVR2) and 38.5% (SVR7) for the SVR models and 32.7% (SVR-FA7) and 40.0% (SVR-FA4 and SVR-FA6) for the SVR-FA models. A comparison of the above figures indicates that the reduction in prediction errors for both the SVR and SVR-FA models between the underpredicted and overpredicted sets is higher for the Rasht station.
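The split into underpredicted and overpredicted days can be sketched as below. This is an illustrative reading of the analysis, assuming a day counts as underpredicted when the prediction is below the measurement and overpredicted when it is above; exact ties are left out of both groups. The function name and sample values are hypothetical.

```python
import math

def split_rmse(observed, predicted):
    """Count days and compute the RMSE separately for under- and overprediction."""
    groups = {"under": [], "over": []}
    for o, p in zip(observed, predicted):
        if p < o:
            groups["under"].append(p - o)
        elif p > o:
            groups["over"].append(p - o)
    summary = {}
    for name, errors in groups.items():
        rmse = math.sqrt(sum(e * e for e in errors) / len(errors)) if errors else None
        summary[name] = {"days": len(errors), "rmse": rmse}
    return summary

# Hypothetical measured and predicted values (mm/day).
summary = split_rmse([2.0, 4.0, 6.0, 8.0], [1.0, 5.0, 3.0, 9.0])
day_difference = abs(summary["under"]["days"] - summary["over"]["days"])
```

The `day_difference` value corresponds to the kind of figure quoted above for the gap between the numbers of underpredicted and overpredicted days.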
Although the distribution of the predicted evaporation values compared to the distribution of the measured values and the 1:1 line (Figures 4 and 5) partially depicts the performance of the SVR and SVR-FA models for both stations, the authors examined their performance in more detail. If the average of the measured evaporation amounts at each station is taken as a threshold value of this component, Figure 6 shows the error rates of both the SVR and SVR-FA models when predicting evaporation values that are lower or higher than this threshold. The difference in the performance of the two models in predicting evaporation values lower and higher than the threshold (average measured evaporation at each station) is more evident for the Rasht station, which is indicative of the different behaviors of the SVR and SVR-FA models in predicting various evaporation intervals at the Rasht station. Overall, the highest prediction error is observed for the Rasht station, for evaporation amounts higher than the average. As shown in Figure 6, both models performed better when predicting evaporation amounts lower than the threshold than when predicting amounts above it. For a better analysis of the results, measured values from each station were divided into five 2-mm intervals and the evaporation prediction error rates of both models were calculated for each interval in order to determine the intervals with the highest prediction errors (Figure 7). Figure 7 shows that with both models for both stations, the magnitude of the prediction errors generally increases at higher evaporation intervals, with the highest rates of error increment observed in the 6-8 and >8 mm intervals. With the SVR model and the 0-6 mm interval at Lahijan, however, the prediction errors decreased for higher measured evaporation values, although this decrease is not significant. 
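The interval analysis can be sketched as follows. The code assumes five bins of 2 mm width on the measured values (0-2, 2-4, 4-6, 6-8, and above 8 mm), with the last bin collecting everything above 8 mm, and uses the RMSE within each bin as the error rate. Both the binning rule and the error measure are assumptions, and the sample values are hypothetical.

```python
import math

def rmse_by_interval(observed, predicted, width=2.0, n_bins=5):
    """RMSE of predictions grouped by interval of the measured value."""
    squared_sums = [0.0] * n_bins
    counts = [0] * n_bins
    for o, p in zip(observed, predicted):
        # Values above the last edge fall into the final (open-ended) bin.
        index = min(int(o // width), n_bins - 1)
        squared_sums[index] += (o - p) ** 2
        counts[index] += 1
    return [math.sqrt(s / c) if c else None for s, c in zip(squared_sums, counts)]

# Hypothetical measured and predicted values (mm/day).
measured = [1.0, 3.0, 3.5, 9.0, 11.0]
modelled = [1.5, 3.0, 2.5, 6.0, 7.0]
errors = rmse_by_interval(measured, modelled)
```

Bins without any measured values return `None` rather than zero, so that empty intervals are not mistaken for perfect predictions.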
Another important point understood from Figure 7 is that the influence of having more input parameters on the reduction of the error rates of the models is most evident with the higher evaporation values (especially above 8 mm). Figure 8 shows the results of the Taylor diagram analysis for the evaporation prediction for the Lahijan and Rasht stations in the training (left) and testing (right) phases. The reference point in a Taylor diagram is determined according to the standard deviation of the data, which here is approximately 2.24 and 1.91 at the Lahijan and Rasht stations, respectively, reflecting the wider range of data at the former station. This is also evident from the wider scattering of the statistical indices of the SVR and SVR-FA models (Figure 8). Each point in this diagram represents the performance of a particular model, and being closer to the reference point means that the model has made a more accurate prediction. Accordingly, for both stations in both the training and testing phases, the SVR and SVR-FA models had the highest and the lowest error rates, respectively, although the range of RMSE changes in the testing phase is higher at the Lahijan station than at the Rasht station.
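The quantities that place a model on a Taylor diagram can be computed as in the sketch below. It follows the standard construction from Taylor (2001), in which the standard deviations, the correlation coefficient R, and the centered root mean square difference E' satisfy E'^2 = s_o^2 + s_p^2 - 2 s_o s_p R. The function name, dictionary keys, and sample values are illustrative assumptions.

```python
import math

def taylor_statistics(observed, predicted):
    """Standard deviations, correlation, and centered RMS difference."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    dev_o = [o - mean_o for o in observed]
    dev_p = [p - mean_p for p in predicted]
    std_o = math.sqrt(sum(d * d for d in dev_o) / n)
    std_p = math.sqrt(sum(d * d for d in dev_p) / n)
    corr = sum(a * b for a, b in zip(dev_o, dev_p)) / (n * std_o * std_p)
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_p)) / n)
    return {"std_obs": std_o, "std_pred": std_p, "corr": corr, "crmsd": crmsd}

# Hypothetical measured and predicted values (mm/day).
stats = taylor_statistics([1.0, 2.0, 3.0, 4.0, 5.0], [1.5, 1.8, 3.2, 3.9, 5.5])
# Law-of-cosines identity that underlies the diagram geometry.
identity = (stats["std_obs"] ** 2 + stats["std_pred"] ** 2
            - 2 * stats["std_obs"] * stats["std_pred"] * stats["corr"])
```

Because of this identity, a single point encodes all three statistics: a model plotted closer to the reference point has a smaller centered RMS difference.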
For the Lahijan station in the training phase, the RMSEs range from 0.96 to 1.44 for SVR-FA7 and SVR1, and the coefficient of determination varies from .60 to .82 for SVR1 and SVR-FA7, respectively (Table 5 and Figure 8), whereas in the testing phase the RMSEs and coefficients of determination are in the range of 1.02 to 1.43 and .61 to .79, respectively, for SVR1 and SVR-FA7. The lower error rates for SVR4, SVR5, SVR6 and SVR7 compared to SVR-FA1 (Table 5 and Figure 8) emphasize the importance of the number and type of inputs used in the models in this study, and disprove the hypothesis that a hybrid firefly algorithm always leads to better results than SVR. Increasing the number of inputs reduces the prediction error, especially in the case of the SVR models. Comparisons between SVR-FA1 and SVR-FA7 and between SVR1 and SVR7 show 22 and 27% decreases in the RMSEs, respectively. On the other hand, comparing the performance of SVR1 with SVR-FA1 and SVR7 with SVR-FA7 (showing approximately 8 and 3% decreases in the RMSEs, respectively) indicates that the difference between the SVR and SVR-FA models is more evident when using fewer input parameters, and that the difference in performance between the two types of model decreases as the number of inputs increases. In other words, the difference in performance between the SVR and SVR-FA models is more apparent when fewer measured meteorological parameters are available to input into them.
For the Rasht station, the RMSE values are in the range of 0.83 to 1.23 in the training phase and 0.88 to 1.28 in the testing phase for SVR-FA7 and SVR1, respectively, and the coefficient of determination values are in the range of .59 to .81 in the training phase and .60 to .81 in the testing phase for SVR1 and SVR-FA7, respectively. The Taylor diagrams of the models for the Rasht station ( Figure 8) show that SVR6 and SVR7 performed better than SVR-FA1. A comparison of the percentages of the RMSE changes in the validation set indicates that the SVR-FA models are more sensitive to an increase in the number of inputs than the SVR models, with the increased number of inputs (changing from the first to seventh scenarios) resulting in 25 and 20% decreases in the RMSEs for the SVR-FA and SVR models, respectively. The behavior of the SVR and SVR-FA models with various inputs for the Rasht station is different from that for the Lahijan station. SVR-FA1 resulted in an 8.5% decrease in the RMSE compared with SVR1, which is comparable to the corresponding value at the Lahijan station (about 8.4%); however, using SVR-FA7 reduced the RMSE by about 13.7% compared to SVR7, which is approximately five times greater than that observed for the Lahijan station. In other words, the advantage of the SVR-FA models over the SVR models, as reflected in the RMSE decrements, is clearer with higher numbers of inputs for the Rasht station (changing SVR7 to SVR-FA7 vs. changing SVR1 to SVR-FA1) but with lower input numbers for the Lahijan station (changing SVR1 to SVR-FA1 vs. changing SVR7 to SVR-FA7).
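The percentage decreases quoted for the testing phase can be reproduced from the reported RMSE values with a one-line formula, 100 × (baseline − improved) / baseline. The sketch below uses the testing-phase RMSEs stated in the text (mm/day); the function and dictionary names are illustrative.

```python
def percent_decrease(baseline, improved):
    """Relative RMSE reduction, in percent, when moving from baseline to improved."""
    return 100.0 * (baseline - improved) / baseline

# Testing-phase RMSE values (mm/day) as reported in the text.
testing_rmse = {
    "Lahijan": {"SVR1": 1.43, "SVR7": 1.05, "SVR-FA1": 1.31, "SVR-FA7": 1.02},
    "Rasht": {"SVR1": 1.28, "SVR7": 1.02, "SVR-FA1": 1.17, "SVR-FA7": 0.88},
}

for station, rmse in testing_rmse.items():
    more_inputs_svr = percent_decrease(rmse["SVR1"], rmse["SVR7"])
    more_inputs_fa = percent_decrease(rmse["SVR-FA1"], rmse["SVR-FA7"])
    coupling_one_input = percent_decrease(rmse["SVR1"], rmse["SVR-FA1"])
    coupling_all_inputs = percent_decrease(rmse["SVR7"], rmse["SVR-FA7"])
    print(station, round(more_inputs_svr), round(more_inputs_fa),
          round(coupling_one_input, 1), round(coupling_all_inputs, 1))
```

Because the inputs are RMSEs rounded to two decimals, the results can differ from the quoted percentages by about a tenth of a percentage point.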

Conclusion
Unfortunately, few studies have been conducted on the application of hybrid algorithms such as the SVR-FA for evaporation prediction, despite the great importance of this component in understanding the water balance in watersheds. In some developing countries, including Iran, many meteorological stations do not record evaporation -thus, using novel hybrid algorithms to predict evaporation can be a useful alternative. The aim of this study was to assess the performance of SVR and SVR-FA models in predicting daily evaporation amounts at the Lahijan and Rasht stations in northern Iran. According to the Pearson correlation coefficient values, the solar radiation and saturation vapor pressure deficit show the highest correlation with the evaporation amounts at Lahijan and Rasht, respectively. Identifying the climatic parameters with the highest impact on evaporation prediction and using them as inputs for the SVR and SVR-FA models in seven different scenarios showed that the prediction error decreases as the number of inputs increases. For the Lahijan station, in the testing phase the prediction errors (mm/day) decreased from 1.43 (SVR1) to 1.05 (SVR7) in the SVR models and from 1.31 (SVR-FA1) to 1.02 (SVR-FA7) in the SVR-FA models. For the Rasht station, in the testing phase the prediction errors (mm/day) decreased from 1.28 (SVR1) to 1.02 (SVR7) in the SVR models and from 1.17 (SVR-FA1) to 0.88 (SVR-FA7) in the SVR-FA models. For the Lahijan station, the coefficient of determination varies from approximately .61 (SVR1) to .78 (SVR7) for the SVR models and from .66 (SVR-FA1) to .79 (SVR-FA7) for the SVR-FA models. For the Rasht station, the coefficient of determination varies from approximately .60 (SVR1) to .74 (SVR7) for the SVR models and from .65 (SVR-FA1) to .81 (SVR-FA7) for the SVR-FA models. 
Using the average measured evaporation values for each station as a threshold value, both the SVR and SVR-FA models provided more appropriate results when predicting evaporation amounts lower than the threshold. The highest variation in performance for the seven scenarios (reduction of prediction error) for both the SVR and SVR-FA models (excluding the SVR model for the Rasht station) is observed at the 8-10 mm/day interval of measured evaporation amounts. Future research could conduct comparisons between the results from the various empirical methods described in the literature with the results of the algorithms used in the present study, as well as with other hybrid optimization algorithms, in order to scrutinize their performance in the area of watershed water resources management.