Soil moisture estimation using novel bio-inspired soft computing approaches

ABSTRACT Soil moisture (SM) is of paramount importance in irrigation scheduling, infiltration, runoff, and agricultural drought monitoring. This work aimed at evaluating the performance of the classical ANFIS (Adaptive Neuro-Fuzzy Inference System) model as well as ANFIS coupled with three bio-inspired metaheuristic optimization methods including whale optimization algorithm (ANFIS-WOA), krill herd algorithm (ANFIS-KHA) and firefly algorithm (ANFIS-FA) in estimating SM. Daily air temperature, relative humidity, wind speed and sunshine hours data at Istanbul Bolge station in Turkey and soil temperature values measured over 2008–2009 were fed into the models under six different scenarios. ANFIS-WOA (RMSE = 1.68, MAPE = 0.04) and ANFIS (RMSE = 2.55, MAPE = 0.07) exhibited the best and worst performance in SM estimation, respectively. All three hybrid models (ANFIS-WOA, ANFIS-KHA and ANFIS-FA) improved SM estimates, reducing RMSE by 34, 28 and 27% relative to the base ANFIS model, respectively. A more detailed analysis of model performances in estimating moisture content over three intervals including [15–25), [25–35) and ≥35% revealed that ANFIS-WOA has had the lowest errors with RMSEs of 1.69, 1.89 and 1.55 in the three SM intervals, respectively. From the perspective of under- or over-estimation of moisture values, ANFIS-WOA (RMSE = 1.44, MAPE = 0.03) in under-estimation set and ANFIS-KHA (RMSE = 1.94, MAPE = 0.05) in over-estimation set showed the highest accuracies. Overall, all three hybrid models performed better in the underestimation set compared to overestimation set. GRAPHICAL ABSTRACT


Introduction
Moisture is one of the most important soil properties and plays a substantial role in energy partitioning between topsoil and atmosphere (Kornelsen & Coulibaly, 2014;Maroufpoor et al., 2019;Wang et al., 2011). This parameter is dramatically affected by the interactions occurring within the soil profile -such as changes in physical and chemical properties of the soil -or by such external factors as evaporation, irrigation, precipitation and vegetation (Elshorbagy & Parasuraman, 2008;Gaur & Mohanty, in irrigation scheduling, and determines irrigation water requirement and irrigation timing. Over recent years, the idea of precision agriculture on the farm scale has been very popular among researchers and farmers. One of the approaches in precision agriculture is determination of irrigation water requirement and the appropriate irrigation time, which has been realized with the use of moisture sensors and by accurate determination of SM. However, spatial and temporal variability, which is a characteristic feature of SM, complicates its monitoring (Dobriyal et al., 2012).
The significance of SM in irrigation scheduling becomes much more apparent on the small scale of a farm. Accordingly, several studies have been conducted to estimate SM at field capacity (FC) and permanent wilting point (PWP). Application of moisture sensors for exact irrigation timing based on real-time SM measurement has been also developed under the concept of 'precision agriculture' and with the aim of optimizing water consumption and minimizing the water stress imposed on plants (Seneviratne et al., 2010;Zhou et al., 2019). The role of this parameter is obvious on a large scale, as in catchments, where it plays a part in controlling such processes as infiltration, runoff and evapotranspiration and eventually the water balance of catchments.
Gravimetric and volumetric methods, neutron probes and time domain reflectometers (TDR) are some of the methods and tools used for measuring SM. Capacitancebased sensors, which measure the moisture content in real-time and allow measurements to be recorded and stored by data loggers, have also attracted the attention of researchers in recent years (Jimenez et al., 2019). Although these methods provide measurements close to actual values -apart from measurement and instrumentation errors -they do have limitations such as being time consuming, labor intensive, expensive, or complex.
In the unsaturated phase, moisture is a function of matric potential and soil hydraulic properties and varies by soil depth and time. One of the best equations incorporating these variations is Richard's equation, which can be solved either numerically or analytically with certain boundary conditions (Abbasi et al., 2003;Sadeghi et al., 2011;Sadeghi & Jones, 2012). Two general approaches can be used for estimating SM. In the first approach, SM is regarded as a function of matric potential and estimated by pedotransfer functions. In parametric functions, the parameters of an equation describing (θ − h) relation are expressed in the form of pedotransfer functions (Javanshir et al., 2020). Point transfer functions are a class of functions which explicitly estimate moisture content for a particular suction (Dobarco et al., 2019), most importantly at −33 kPa and −1500 kPa (Ghorbani et al., 2017;Vaheddoost et al., 2020). In the second approach, SM is estimated regardless of the magnitude of suction by various techniques and using the soil physical, chemical and hydraulic parameters.
Two approaches can be adopted for improving SM estimation. The first is to better understand the physics governing the relationships which occur in the soil profile. This, however, appears to be very difficult and complex due to the considerable diversity of the variables controlling the soil environment; although researchers who work in the pure branch of soil science have always shouldered this task (Amente et al., 2000;Chanasyk & Neath, 1996;Elliott & Price, 2020;Gao et al., 2013;Jung et al., 2019;Pan et al., 2015;Pan & Nieswiadomy, 2016;Pardossi et al., 2009;Pires et al., 2005). The second approach, which has been taken by engineers from different disciplines, seeks improvement of the tools employed and application of more efficient techniques, so that soil properties can be better approximated with the least complexity (Zhang, Shao, et al., 2017;Maroufpoor et al., 2019;Moazenzadeh & Mohammadi, 2019;Pezij et al., 2019;Penghui et al., 2020).
SM is also one of the pivotal components used in agricultural drought assessment, and various SM-based indicators have been developed for drought monitoring (e.g. normalized difference moisture index or NDMI, soil adjusted vegetation index or SAVI, moisture stress index or MSI, and soil wetness deficit index or SWDI). The use of satellite images and various algorithms which have enabled SM monitoring on different temporal and spatial scales (Kolassa et al., 2018;Lees et al., 2021;Mishra et al., 2021;Moran et al., 2004;Sadeghi et al., 2015;Sadeghi et al., 2020;Schnur et al., 2010;Vergopolan et al., 2020;Wang et al., 2011;Yang et al., 2019), application of AIbased models (Ahmad et al., 2010;Srivastava et al., 2015;Im et al., 2016;Karandish & Simnek, 2016;Brandhorst et al., 2017;Han et al., 2018;Shin et al., 2018), or simultaneous application of both options have led to significant improvements in SM estimation studies. An AI-based model with meteorological variables including air temperature, wind speed, relative humidity and solar radiation as inputs was developed by Tsang and Jim (2016) to determine irrigation time and depth based on SM. Coopersmith et al. (2016) estimated SM at a depth of 5 cm by a machine learning technique using measured moisture contents at 10 cm depth and antecedent precipitation in 2010 and 2013, and reported RMSEs of 0.0173 and 0.0215 m 3 .m −3 , respectively. Using dynamic variables (albedo, NDVI (normalized difference vegetation index), and LST (land surface temperature)) extracted from satellite images as well as steadystate variables (DEM, longitude, and latitude), Liu et al. (2020) investigated the capability of six machine learning algorithms for SM spatial downscaling in four regions in North America, Spain, China, and Australia. Their findings showed that the RF (random forest) model has had the best performance and has been mostly affected by DEM, LST and NDVI variables. Effectiveness of AI-based models has been also proved in various areas of study including soil temperature estimation (Moazenzadeh & Mohammadi, 2019), water resources engineering (Deo et al., 2017) and solar energy management (Feng et al., 2018;Moazenzadeh et al., 2022;Qin et al., 2018;Qin et al., 2019;Qiu et al., 2022;Wang et al., 2016;Wang et al., 2019).
The SM content in the rhizosphere is important from the viewpoint of irrigation scheduling. On the one hand, the moisture required by a plant is mainly absorbed from topsoil and on the other, topsoil is the layer from which evaporation occurs, and it plays an important role in energy exchange with the atmosphere. A review of the literature shows that SM estimation is impacted by complexities associated with nonlinear changes in both space and time. Meteorological parameters, physicochemical parameters of the soil and interactions that occur in this environment will control soil moisture variations. Although pedotransfer functions (at specific matric potentials) and satellite imagery have been successfully used for SM estimation, the need for a database of soil physical and chemical properties in the former, and the constraints of obtaining appropriate images and complexity of algorithms in the latter case do limit the application of these methods. Therefore, using the main meteorological parameters (which are measured at most stations) and AI models (which possess a remarkable ability to simulate nonlinear phenomena) can be an effective solution worth examining. As a novel contribution in terms of applied algorithms, this study scrutinizes the applicability of three bio-inspired metaheuristic optimization algorithms for hybridization with ANFIS to construct robust models for SM computation using meteorological variables as model inputs.

Study area and data collection
Istanbul is the largest city in the north-west of Turkey and the Strait of Istanbul (the Bosphorus) bridges Asia and Europe. Its climate is a combination of the Black Sea and Mediterranean climate types, with warm summers and cold winters. According to the European Commission and European Soil Bureau Network (2006), four soil types including calisols, cambisols, leptosols and fluvisols are found in Turkey. Istanbul Bolge weather station is located in the Kartal district of Istanbul at 40°54' N, 29°09' E.
Most of the agricultural soils in the region are classified as leptosols. With an area of about 5540 km 2 and 815 mm of precipitation per year (Cuceloglu et al., 2017), Istanbul has approximately 74100 hectares of cultivated land. Wheat, barley, oats, paddy, corn, broad beans, chickpeas, beans, vetch, sugar beet, sunflower, potato and alfalfa are produced in Istanbul. Wheat, sunflower and barley are the most important plant products in the region. In addition, although the production of vegetables such as tomatoes, lettuce, beans, watermelon and spinach is dominant in Istanbul, almost all other vegetables are also grown such as cucumber, lettuce and spring onion. According to the Ministry of Agriculture and Forestry (2017) report, between 1975 and 2015, 1209 floods took place in Turkey which caused almost $100 million economic losses annually. As Istanbul is the most populated city in Turkey, it is a high-risk region in terms of flood hazard. The flood taken place in Marmara region of Istanbul is 2009 resulted in a death of at least 31 people in Istanbul and considerable economic losses (Altunkaynak & Bizimana, 2020;Ekmekcioğlu et al., 2021). Therefore, the study area is important both in terms of agriculture (from the perspective of irrigation scheduling and drought monitoring in the region) and water balance management at the catchment level (through controlling the infiltration component and surface runoff) as well as in terms of flood control.
In the present study, SM content and soil temperature were measured once a day at a depth of 20 cm in different parts of the study area at various time intervals totaling 490 days between 2008 and 2009. SM was measured by the volumetric method using a Campbell Scientific CS616 device which measures the volumetric water content of porous media from zero to saturation by vertically inserting its probe rods into the soil. This device gives an indication of the water content in the upper 30 cm of soil. In this study, measurements were performed at 20 cm below the soil surface. The main meteorological variables including air temperature (minimum and maximum), relative humidity, wind speed and sunshine hours measured at Istanbul Bolge station over these same time intervals were also obtained from the Turkish State Meteorological Service, a governmental bureau authorized with collecting climatic and meteorological data throughout the country. The study area is shown in Figure 1.
Statistical indices for measured parameters (soil temperature and moisture at 20 cm depth as well as meteorological data) are given in Table 1, and distribution of moisture contents of the soil samples is plotted in Figure  2. As can be seen in Figure 2, the highest density of SM contents falls in the range above 35% (by volume). Moisture contents of about 16% of samples are in the range  [15-25), about 31% of samples have moisture contents in the range [25][26][27][28][29][30][31][32][33][34][35], and about 53% have moisture contents higher than 35%.

ANFIS model
Introduced by Jang (1993), ANFIS is a combination of ANNs and fuzzy inference systems (FISs). This intelligence system uses a non-linear process from input to output and has a multi-layer form with three sections connected with a number of nodes (Leski & Czogala, 1999). In the first part, a rule-based approach including fuzzy rules is selected. The second part contains the database. A learning algorithm with the compatible membership functions (MFs) is used to create a set of fuzzy if-then laws from certain input-output subsets. For this purpose, ANFIS receives the primary FIS and modifies it with a propagation algorithm. Finally, in the last section, a logical inference system is used to remove it from the rules and input data to reach an acceptable output. The most famous models used in fuzzy systems are Mamdani, Sugeno and Takagi models. Figure 3 illustrates an ANFIS structure with two inputs (x 1 , x 2 ) and one output (F) in which a circle represents a stable node and a square shows an adaptive node .

Hybrid models
In the present study, classical ANFIS model's ability was improved by three bio-inspired optimization algorithms in the form of hybrid models. The algorithms included Firefly Algorithm (FA), Krill Herd Algorithm (KHA), and Whale Optimization Algorithm (WOA). These algorithms find the optimal ANFIS parameters considering the lowest error rate (Devi & Vijayalakshmi, 2020). Hybrid models were implemented in three steps: (i) Calculating error rates with optimal ANFIS parameters (ii) Computing the series values (iii) Fitting the ANFIS model to the values computed in previous steps.

Firefly algorithm (FA).
FA was initially presented by Yang (2009), and its main idea was inspired by the real life of fireflies (Yang, 2009). It can be considered a new manifestation of swarm intelligence, which has been widely used as a boosting tool in hydrological modeling (Moazenzadeh & Mohammadi, 2019). Researchers have found that fireflies use flashing lights as a protective mechanism, alerting their conspecifics in the environment. The rhythm or frequency of flashing light, the rate at which light flashes, and the duration of light blinking by fireflies form the different parts of the communication system of these insects (Yang, 2009). Since the attractiveness of a firefly is proportional to the light intensity seen by adjacent (neighboring) fireflies, the attractiveness parameter can be defined by the following equation: where β • is attractiveness of the brighter insect at r = 0. The fireflies find the optimized value of each objective function.

Krill Herd Algorithm (KHA).
KHA was introduced by Gandomi and Alavi (2012) as a solution for optimization problems. It is a metaheuristic populationbased algorithm and was inspired by the natural behavior of krill herds. The fitness function of each krill is unique and equals the distance from food to the highest congestion density. Types of krill movement include: (1) Herd movement; (2) Activity to find food; (3) Random displacement. If a herd is attacked by predators, a group of krill will be removed, resulting in a reduction in their density. Krill shrinkage is a multi-purpose process with two main purposes: increasing density and reaching food. Each shrimp moves toward the best possible solution, which is the shortest distance to the highest density and maximum food (Gandomi & Alavi, 2012).

Whale Optimization Algorithm (WOA).
WOA is another bio-inspired algorithm that mimics the hunting behavior of whales in nature (Mirjalili & Lewis, 2016). It is performed in three phases as follows: (i) Siege hunting; (ii) Operation phase: the method of attacking the net bubble; (iii) Exploration phase: Hunting search. WOA starts with a set of random solutions. In each repetition, the search agents update their position using the three aforementioned operators (Vaheddoost et al., 2020). This algorithm supposes that the best answer at the moment (the best solution) is prey, so it recognizes the prey and then surrounds it. Once the best search agents are identified, other search agents will update their location to the best search agent. This behavior is expressed by Equations (2) and (3) (Mirjalili & Lewis, 2016): where t denotes the present iteration, X is the location vector, X * is the location vector of the best solution obtained at the present time, and A and C are coefficient vectors. It should be noted that if there is a better answer, then the parameter of X * should be updated in each iteration. Vectors A and C are determined as follows (Vaheddoost et al., 2020):  where a decreases linearly from 2 to 0 during iterations (in both exploration and extraction phases), and r is a random vector ranging between 0-1. Figure 4 depicts the general structure used in the present study for estimating SM from meteorological parameters in the form of the ANFIS model and its hybrids with bio-inspired optimization algorithms. Structure of the coupling process between ANFIS and bio-inspired algorithms is shown in Figure 5

Scenario definition
In the present work, we first determined the best combination of input variables for the base ANFIS model. For this purpose, input variables were fed into ANFIS through different scenarios and their performances evaluated according to SM estimation error rates in testing set. The best results of each scenario based on the number of input variables are given in Table 2. After selecting the best combination of input parameters fed into the base   (Scenario 3), parameter values were optimized for this model using the optimization algorithms in the form of hybrid models.

Model evaluation
Five statistical indices including root mean square error (RMSE), mean absolute percentage error (MAPE), relative root mean square error (RRMSE), Nash-Sutcliffe Efficiency (NSE) and Index of Agreement (IA) were used for assessing model performance as follows: where SM i,obs , SM i,est and SM obs are observed, estimated and average of measured SM values, respectively, and n refers to the number of data points. Ertekin and Yaldiz (2000) proposed the following categories for rating models according to accuracy: A model is excellent if its RRMSE is below 10%, good if 10% < RRMSE < 20%, fair if 20% < RRMSE < 30%, and poor if RRMSE is higher than 30%. Figure 6 shows measured SM values at 20 cm depth as well as corresponding values estimated using ANFIS as the base model and ANFIS coupled with three bioinspired optimization algorithms. Model performance indicators are given in Table 3. In training set, application of each of the three optimization algorithms to the ANFIS model improved SM estimates, and the best values of all three indices (RMSE, MAPE, and RRMSE) were obtained with ANFIS-WOA. According to the results, the lowest and highest decreases in RMSE (RRMSE) were approximately 24% (ANFIS-KHA) and 36% (ANFIS-WOA), and minimum and maximum decreases in MAPE were about 31% (ANFIS-KHA) and 43% (ANFIS-WOA), respectively. In training phase, the increase in NSE from the base ANFIS model (NSE = 0.8) to hybrid models ranged from 11% (ANFIS-KHA model) to 15% (ANFIS-WOA model); and the lowest IA was 0.95 for the base ANFIS model, whereas all three hybrid models exhibited similar performances with IA values of 0.97 and 0.98. In testing set, all optimization algorithms improved results when coupled with the base ANFIS model. According to the results, the lowest and highest decreases in RMSE (RRMSE) were about 27% (ANFIS-FA) and 34% (ANFIS-WOA), and the lowest and highest decreases in MAPE were about 30% (ANFIS-FA) and 41% (ANFIS-WOA), respectively. In validation phase, the lowest NSE (approximately 0.79) was again obtained with the base ANFIS model as expected. ANFIS-KHA and ANFIS-FA showed a similar performance with NSE = 0.9 (13.9% higher than ANFIS) and ANFIS-WOA outperformed all models with NSE = 0.91 (15.2% higher than ANFIS). In this phase, the IA index was improved by about 3% for ANFIS-WOA and ANFIS-KHA and by about 2% for ANFIS-FA relative to the base model, indicating the relatively similar performance of the hybrid models. There are two main reasons for the differences in model performance: (1) Different results due to the differences in training data (input values), (2) Different results due to the differences in learning algorithms or in optimizing method of the learning algorithm; and differences in the models result from the mathematical equations behind them. ANFIS model uses fuzzy if-then rules by integrating neural networks technique. Then ANFIS finds input-output pairs using its membership functions. Regarding the membership functions of ANFIS in the present study, the differences in ANFIS results are caused by different meteorological variables. The FA algorithm can be implemented for parallel problems and then the system can easily find optimum minimum (Khan et al., 2016). The WOA is a capable method in global search, has a suitable performance in terms of convergence accuracy, and is able to solve complex optimization problems (Ning & Cao, 2021). The KHA considers the neighbor responses around possible optimum response for analyzing all possible optimum values (Gandomi & Alavi, 2012). Therefore, differences in the type and composition of the data used for training of the simulator models and differences in training algorithms, how the parameters of the hybrid model were optimized, and mathematical relationships and simplifications in the course of implementation of each optimization algorithm can be among the most important reasons of the different performances of simulator models.

SM estimation
As one of green water sources, SM is a highly important factor in agricultural drought monitoring in any region, to the extent that some drought indicators have been developed based on it. Also on a large scale (e.g. catchments), SM controls the rate of water infiltration and thus can affect one of the main components of water balance, namely surface runoff. All of the above shows that accurate estimation of SM can greatly help in proper management of water resources, both on the farm scale by reducing water loss and on the catchment scale by controlling water balance components. Using ERAinterim data, Srivastava et al. (2015) compared performance of the WRF-Noah LSM model (Weather Research and Forecasting model coupled with Noah land surface model) with PDM (Probability Distributed Model) in estimating soil moisture deficit and reported NSEs of 0.63 and 0.7 in calibration and validation phases, respectively. Elliott and Price (2020) compared soil moisture values measured by triplicate CS605 TDR probes in different depths and in two profiles with those estimated by three methods of determining van Genuchten model parameters including RETC (SSL), inverse modeling using HYDRUS-1D (TF) and alternate strategy for TF parametrization (ALT) and obtained NSE values of −1.4 and 0.24 for SSL, 0.89 and 0.91 for TF, and 0.89 and 0.79 for ALT in the two mentioned profiles. Using a number of soil properties, performance of three AIbased models (ANN, SVR (support vector regression), and ANFIS) and a hybrid model (ANFIS coupled with Gray Wolf Optimization (GWO)) in simulating SM in the northwest of Iran (Sanandaj, Dehgolan plain) was investigated by Maroufpoor et al. (2019). According to an approximately 50% improvement of estimation accuracy of ANFIS-GWO compared to ANFIS and SVR, they proposed ANFIS-GWO as a powerful tool for SM estimation. Capability of Random Forest (RF), a type of learning machine with input variables including meteorological, vegetation and soil parameters, was examined for root zone soil moisture estimation in an agricultural catchment by Carranza et al. (2021). With high R 2 values ( > 0.75) and low RMSEs ( < 0.06 m 3 .m −3 ), they proved the accuracy of data-driven models in estimating SM. Accuracy of an ANN-based model (with climatic data and rooting depth as inputs) in estimating SM during the plant growing season from 2008 to 2010 was examined and compared to that of a processbased model (RZWQM2), and NRMSEs of 7.64, 6.37 and 6.66 per cent were reported for the three years of the experiment, respectively (Gu et al., 2021). With NRM-SEs of 1.27 and 0.02 per cent for 0-15 and 150-200 cm layers, respectively, the ANN-based model proved to be more efficient for evaluating SM dynamics throughout the growth season. Elshorbagy and Parasuraman (2008) compared the performance of a higher-order neural network (HONNs) with that of traditional ANNs and a conceptual model (a site-specific system dynamic watershed, (SDW)) in estimating moisture in peat and till soils of various depths. Inputs of their neural network model included precipitation, air temperature, net radiation and soil temperature at different depths. In a 35 cm deep soil, the proposed HONNs model (with RMSEs of 0.05 and 0.01 cm 3 .cm 3 in peat layer and till layer, respectively) outperformed SDW (with corresponding RMSEs of 0.08 and 0.1, respectively). They concluded that application and ≥ 35% (last interval) were examined separately in testing set, with the results given in Table 4. Also, the trend of RMSE variations over the same three intervals along with RMSE values for the entire measured moisture data set ('All data') in testing set are represented in Figure 7.
Examination of the results showed that the lowest and highest RMSE, MAPE and RRMSE values for all models have been obtained in the last interval (θ S ≥ 35) and the middle interval (25 ≤ θ S < 35), respectively. In other words, the trend of variations in error rates between the three intervals has been first upward (from first to middle interval) and then downward (from middle to last interval), so that the minimum increase in error rate is 2.8% (ANFIS-FA) and its maximum is 39% (ANFIS); and the lowest and highest decreases in error rate are 18% (ANFIS-WOA) and 48.9% (ANFIS), respectively. For all models, error reduction from middle interval (25 ≤ θ S < 35) to last interval (θ S ≥ 35) is greater than corresponding increase in error from first interval (15 ≤ θ S < 25) to middle interval (25 ≤ θ S < 35), although this trend is stronger for ANFIS-FA compared to the other models.  According to the results shown in Table 4, minimum and maximum RMSE increments between the last interval (θ S ≥ 35) and the entire testing set (All data) are 8.4% (ANFIS-WOA) and 42% (ANFIS), respectively. The highest difference in estimation error ( RMSE) in the first interval (15 ≤ θ S < 25), middle interval (25 ≤ θ S < 35) and last interval (θ S ≥ 35) are also about 0.85, 1.63, and 0.25, respectively, between ANFIS and ANFIS-WOA in all mentioned intervals. It can be therefore concluded that the differences in performance between the optimization algorithms, and particularly their difference with the base ANFIS model, in the entire testing set (All data) are caused by the differences in how they simulate moisture in the first two intervals (15 ≤ θ S < 25 and 25 ≤ θ S < 35), that is the same intervals in which the base ANFIS model had the poorest performance among all models with RMSEs of 2.534% and 3.524%, respectively. According to the results (Table  4), the highest decreases in RMSE in the first, middle and last intervals relative to the base ANFIS model are 33.4, 46.3 and 13.8 per cent, all of them obtained with ANFIS-WOA.

Under-and over-estimation
Distribution of measured and estimated SM values for all models in testing set and statistical indicators of performance of models in under-and over-estimation sets are given in Figure 8 and Table 5, respectively.
In under-estimation set, ANFIS-WOA has had the best performance with RMSE, MAPE and RRMSE of 1.441, 0.032 and 4.288 per cent, respectively. In overestimation set, ANFIS-KHA has outperformed the other models with RMSE and RRMSE of 1.936 and 6.155 per cent, respectively. ANFIS-WOA and ANFIS-KHA share the best MAPE value (MAPE = 0.049), although it is not much different from that of ANFIS-FA (MAPE = 0.053). Comparison of the results shown in Table 5 reveals that apart from the base ANFIS model, all other models have performed better in under-estimation set, with minimum and maximum increases in RMSE from under-to overestimation set being approximately 10.8 and 36 per cent, for ANFIS-KHA and ANFIS-WOA, respectively. One of the main reasons for the different performance of ANFIS compared to other models (better performance in overestimation set) has been its moisture estimates at three particular points which are marked in Figure 9. RMSE of these three points alone is about 8.35%, while RMSE of the whole under-estimated subset of the testing set for the ANFIS model is about 2.72%; and the high error rate in estimating these three particular points has decreased the overall accuracy in under-estimation set.
In under-estimation set and in comparison to the base ANFIS model, minimum and maximum decreases in RMSE were 35.7% (ANFIS-KHA) and 47% (ANFIS-WOA), respectively; whereas corresponding values in over-estimation set were 13.8% (ANFIS-FA) and 19.3% (ANFIS-KHA). These figures not only confirm better performance of the models in under-estimation set compared to the over-estimation set, but also show that the percentage of error reduction in estimation of SM due to application of each of the optimization algorithms to the base ANFIS model has been higher in under-compared to over-estimation set.

Conclusions
Soil moisture plays a key role in irrigation scheduling as the goal of irrigation is to increase current soil moisture until reaching field capacity. Therefore, knowing SM content will be very helpful in determining irrigation depth and frequency, which are effective in optimal water consumption and prevention of water stress to plants, respectively. This work examined the capability of bio-inspired optimization algorithms in estimating SM over 2007-2008 in Turkey using meteorological variables as model inputs. Hybridization of all bioinspired algorithms with ANFIS (ANFIS-WOA, ANFIS-KHA and ANFIS-FA) improved SM estimates in comparison with the base ANFIS model, although ANFIS-WOA performed best with an RMSE of 1.68. All models showed almost the same performance for measured moisture contents higher than 35%, so it can be concluded that the differences in performance among the optimization algorithms and their differences with the base ANFIS model in estimating SM has stemmed from the first two intervals, i.e. [15-25) and [25-35)%. In terms of model performance from the viewpoint of under-or overestimation of SM, ANFIS-WOA in under-estimation set and ANFIS-KHA in over-estimation set outperformed the other models, with RMSEs of 1.44 and 1.94, respectively. All models except ANFIS had lower errors in under-estimation set compared to the over-estimation set. Limitations of the present study are twofold. The first is SM simulation in the form of AI models, which, similar to all modeling attempts, may be accompanied by uncertainties. The structure of the ANFIS model is complex, and the number and type of its membership functions can add to this complexity. Metaheuristic optimization algorithms also suffer such disadvantages as parameter setting, high computational complexity, getting trapped in local optimums, and high running time. The second limitation is the constraints that affect the structures defined in this study. For example, the authors tried to estimate SM -which is one of the hard-to-measure soil parameters and only measured in rare and special conditions -using the main meteorological parameters which are measured at all stations together with a single soil parameter (soil temperature) as the inputs of AI models.
Obviously, the use of other readily available soil parameters along with the parameters used in this study can lead to a better understanding of SM dynamics and thus to better SM estimates. The basic idea of the present work can be evaluated more rigorously in further studies by employing other AI models and optimization algorithms under various climates, and using remote sensing indices that indirectly indicate the SM status and can help in making more accurate SM estimates across larger areas.

Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.