Performance evaluation of sediment ejector efficiency using hybrid neuro-fuzzy models

ABSTRACT Sediment transport in the ejector is highly stochastic and non-linear in nature, and its accurate estimation is a complex and challenging task. This study investigates the estimation of sediment removal by a sediment ejector using newly developed hybrid data-intelligence models. The proposed models are based on the hybridization of adaptive neuro-fuzzy inference systems (ANFIS) with different metaheuristic algorithms, namely, particle swarm optimization (PSO), genetic algorithm (GA), differential evolution (DE), and ant colony optimization (ACO). The proposed models are constructed with various related input variables such as sediment concentration, flow depth, velocity, sediment size, Froude number, extraction ratio, number of tunnels and sub-tunnels, and flow depth upstream of the sediment ejector. The estimation capacity of the developed hybrid models is assessed using several statistical evaluation indices. The modeling results obtained for the studied ejector sediment removal estimation were promising. Among the developed hybrid models, the ANFIS-PSO model exhibited the best predictability potential, with maximum correlation coefficient values of CC (train) = 0.915 and CC (test) = 0.916.


Research background
Sediment control in irrigation canals has been considered one of the most challenging issues since the advent of diversion headworks, because excessive deposition of silt reduces the carrying capacity of the canal (Vázquez-Méndez et al., 2018). As a result, farmers at the tail end of the canal do not get sufficient water to irrigate their crops (Depeweg et al., 2014; Lawrence & Atkinson, 1998).
CONTACT Zaher Mundher Yaseen yaseen@alayen.edu.iq
To avoid problems caused by sedimentation, a number of methods of silt control have been developed (Chavarrías et al., 2019). They are generally categorized into two groups: preventive and curative methods. Preventive methods address problems before they start forming or become serious. For these methods, tunnel-type excluder devices (Kothyari et al., 1992; Tiwari, Sihag, and Das, 2019) have mainly been used. Staggered tunnels are placed at the intake of the offtake canal, at the bed of the river, and a relatively silt-free upper layer of water is allowed to enter the offtake canal. One disadvantage of these devices is that, however efficient they are, a large amount of silt is bound to enter the offtake canal. This is due to the turbulence created at the mouth of the offtake canal at the sediment excluder, which brings sediment into suspension and keeps it there. Curative methods, on the other hand, correct the problems after they have started. Silt ejectors are usually installed in the offtake canal at a suitable distance downstream of the head regulator (Lisé-Pronovost et al., 2019). Depending on the situation and position of the problem, different types of silt ejectors are used to mitigate and manage silt in the canal. A traditional settling basin is used where water is scarce, since it can be employed successfully without any loss of water (Garde et al., 1990; Raju et al., 1999). However, this device has several disadvantages: it requires a large construction area, a long residence period for settling, and periodic physical cleaning, which interrupts the farmers' work. These drawbacks have been addressed by refining traditional settling basins into vortex settling basins, as discussed and analyzed by Athar et al. (2003) and Paul et al. (1991).
However, this fluidic structure also suffers from two shortcomings: it works only when discharge is low, and it removes the finest particles, which are beneficial for crop growth. Vortex tube sand traps remain ineffective under higher suspended load, as they remove only the bed load from the flow (Dashtbozorgi & Asareh, 2015a, 2015b; Lawrence & Atkinson, 1998; Moradi et al., 2013; Tiwari, Sihag, Kumar, et al., 2020). The tunnel-type sediment ejector has advantages over other devices due to its simple construction and effective function. At the same time, it does not suffer from the same limitations as other devices as long as water availability is not an issue (Kothyari et al., 1994). It comprises a flat roof slab above the canal bed, which separates the silt-laden lower portion of the flow from the upper portion. Under the roof slab, tunnels carry the bottom layers of sediment through escape routes to the river downstream of the diversion headworks, while the comparatively silt-free upper part of the water is allowed to pass over the roof to the canal downstream of the sediment ejector.

Basic hydraulic principle
In a movable-bed channel, silt moves as suspended load: the vertical turbulent component of the stream forces fine material upward, where it is kept in suspension and transported by the current. Heavier particles that cannot be forced into suspension move along the surface of the bed by either sliding or bouncing. However, there is only a very fine dividing line between bed sediment and suspended sediment, because particles are continuously exchanged, floating up from and sinking down to the bed of the channel. The intensity of silt in the lower part of the flow is higher than in the upper part. As a consequence, diverting the lower layers through the ejector decreases the silt intensity in the canal downstream of the sediment ejector.

Present state of knowledge, novelty of the work and objective
Sediment ejectors have been examined in Indian Standard (IS) 6004-1980 and in earlier studies (Dhillon et al., 1977; Garde & Pande, 1976). However, all these studies depend entirely on physical model work in which only hydraulic principles have been used. The exact estimation of tunnel ejector sediment removal efficiency therefore remains inconclusive, because traditional models do not consider the complexity of the flow in the tunnel ejector. Recently, soft computing models have been widely used in the field of water resources and hydraulic engineering, since they require neither knowledge of the mechanisms involved nor physical model studies in the laboratory or field (Ansari, 2014; Ansari & Athar, 2013; Singh et al., 2018; Tiwari, Sihag, Kumar, et al., 2020). Artificial Intelligence (AI) models perform very well in mapping the actual mechanism of the simulated hydraulic applications (Yaseen, Sulaiman, et al., 2019). However, applications of AI models to hydraulic engineering problems remain limited. Hence, the development of new models has garnered great interest among hydraulic engineering groups.
Among several AI models, the adaptive neuro-fuzzy inference system (ANFIS) is one of the most reliable and robust predictive models applied in the field of water resources and hydraulic engineering (Gholami et al., 2017; Safavi et al., 2015). The ANFIS model is limited by the tuning of its internal parameters, and hybridization with metaheuristic optimization algorithms is a solution to this limitation. Given that hybrid ANFIS models have attained remarkably better prediction accuracy than the standalone ANFIS model for diverse hydraulic engineering problems (Azimi et al., 2017, 2018; Gholami et al., 2018), the primary aim of this work is to develop hybrid predictive models based on the hybridization of the ANFIS model with different metaheuristic algorithms. ANFIS-metaheuristic models with different input parameters, including sediment concentration, velocity, sediment size, Froude number, extraction ratio, number of tunnels and sub-tunnels, and flow depth upstream of the sediment ejector, were trained and evaluated in terms of statistical evaluation indices to find the best estimator models. The accuracy of the proposed hybrid models, in which ANFIS is trained by particle swarm optimization (PSO), genetic algorithm (GA), differential evolution (DE), and ant colony optimization (ACO), is compared with that of the standalone ANFIS. These metaheuristic algorithms were selected owing to their successful implementation in diverse hydraulic and sediment problems (Chen et al., 2017; Panahi et al., 2020; Ray et al., 2021; Sharafati et al., 2019). This research is considered the first of its kind on the implementation of hybrid ANFIS models for estimating tunnel ejector sediment removal efficiency.
The structure of the manuscript is designed as follows. Section 2 presents the essential parameters influencing the removal efficiency. Section 3 reports the overview of the applied predictive models. Section 4 exhibits the modeling development of the hybrid ANFIS models. The application result and discussion are presented in Section 5. In the final section, the conclusion is stated.

Parameters affecting the removal efficiency of settling basins
In field and laboratory studies, the most important parameters affecting the removal efficiency are those representing the sedimentation characteristics, the hydraulic flow conditions and the geometrical characteristics of the flume or canal (Athar et al., 2005; Athar et al., 2002; Garde et al., 1990; Mashauri, 1986). The relationship between the removal efficiency and the variables that affect it is expressed in the following: In the above equation, the variables represent sediment concentration ($C$), flow depth ($D$), discharge ($Q$), sediment size ($d_n$), Froude number ($Fr$), flow depth upstream of the settling basin ($H_1$), the number of main tunnels ($m$), extraction ratio ($ex$), number of secondary tunnels ($s$), velocity ($V$), and the width of the flow ($b$). Dimensional analysis was also applied to investigate the dimensions of the variables involved in the removal efficiency. Using the Buckingham π theorem, the relationship between the removal efficiency and the parameters that affect it is expressed in the following equation: In the equation above, the additional variables represent hydraulic radius ($R_h$), shear velocity ($U_*$) and particle falling velocity ($\omega_j$).
The 198 laboratory data points were collected from a physical model study conducted by Singh (2018). The experiments were carried out in a rectangular flume with dimensions of 0.45 × 1.0 × 24 m at the National Institute of Technology University, India. The sediment trapping efficiency was investigated under different flow conditions, sediment properties, extraction ratios and combinations of main and sub-tunnels. The physical model was equipped with a re-circulation system, a stilling chamber and a baffle wall to pass water through the flume and reduce its turbulence. In addition, a transition zone was designed to eliminate the effects of turbulent flow. A sediment ejector was installed at an appropriate distance from the flume inlet, and different numbers of tunnels and sub-tunnels were fitted to perform the experiments.
The input data were taken from the laboratory study. The variables used to estimate the removal efficiency were combined on the basis of the correlation of the non-dimensional laboratory variables with the target variable (Table 1). The variables with the lowest correlation were then eliminated one at a time, and new combinations were defined from the correlation test, as shown in Table 2. In other words, the input combinations defined in the present study are based on the correlation coefficient between the target and predictive variables. The first combination (M1) includes all of the predictive variables. The second combination (M2) comprises all of the predictive variables except the input variable $V/U_*$, which has the lowest correlation with the target variable (r = −0.012). Continuing in this way, the last combination (M11) includes only the predictive variable $D/d_n$, which has the highest correlation (r = −0.644).
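The correlation-based construction of the input combinations can be sketched as follows. This is a minimal illustration, assuming that variables are ranked by the absolute value of the Pearson coefficient and dropped one at a time; the function names and the toy numbers are hypothetical and are not the laboratory data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def nested_combinations(features, target):
    """Return input sets M1, M2, ... by dropping the weakest predictor each time.

    features: dict mapping a variable name to its list of values.
    Ranking uses |r| (an assumption) so strong negative correlations count as strong.
    """
    ranked = sorted(features,
                    key=lambda name: abs(pearson_r(features[name], target)),
                    reverse=True)
    # M1 keeps every variable; the last set keeps only the strongest one.
    return [ranked[:k] for k in range(len(ranked), 0, -1)]

# Hypothetical toy data, for illustration only.
toy = {
    "D/dn": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Fr":   [0.2, 0.1, 0.4, 0.3, 0.5],
    "V/U*": [3.0, 1.0, 2.0, 1.0, 3.0],
}
efficiency = [50.0, 41.0, 29.0, 22.0, 10.0]
combos = nested_combinations(toy, efficiency)
```

With these toy values, `combos[0]` contains all three variables and `combos[-1]` contains only the most strongly correlated one.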
Table 2. Input variables combination for prediction of sediment removal efficiency.
In this study, almost 75% of the data (152 samples) are used for training, and the remainder (46 samples) is used for the testing phase. The ranges of the input parameters in both stages are presented in Table 3.
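A random partition with these sample counts can be sketched as below. The helper name and the seed are assumptions made for illustration; the partition actually used in the study is not reproduced here.

```python
import random

def split_indices(n_samples, n_train, seed=0):
    """Shuffle sample indices and return (train, test) index lists."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # seeded so the split is repeatable
    return sorted(idx[:n_train]), sorted(idx[n_train:])

# 198 samples in total: 152 for training, 46 for testing.
train_idx, test_idx = split_indices(198, 152)
```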

Adaptive neuro-fuzzy inference systems (ANFIS)
ANFIS is a soft computing technique that integrates neural networks and fuzzy inference systems. An ANFIS model makes use of neural networks for solving nonlinear problems and for their ability to recognize relations between variables. It also makes use of fuzzy inference systems for reasoning in complicated environments using principles derived from human decision-making (Sharafati et al., 2019). ANFIS models have already been used as a reliable prediction tool for scouring, precipitation and water level monitoring (Sharafati et al., 2019; Yaseen, Sulaiman, et al., 2019).
In summary, an ANFIS model applies if/then rules to express nonlinear relations between inputs and outputs. With fuzzy logic, each input ($x$ and $y$) is mapped to a fuzzy set ($A$ and $B$) through membership functions. The if/then rules are expressed as follows:

$$\text{Rule } i:\quad \text{if } x \text{ is } A_i \text{ and } y \text{ is } B_i, \text{ then } f_i = p_i x + q_i y + r_i$$

where $A_i$ and $B_i$ are fuzzy sets, $f_i$ represents the output of fuzzy rule $i$, and $p_i$, $q_i$ and $r_i$ are parameters that are 'tuned' in the training stage. In the training stage, the neural network adjusts these parameters using backpropagation and least-squares methods. The adjustment of the parameters is an iterative process that stops once the defined criteria are satisfied. Generally, the structure of an ANFIS model has five layers (Figure 1), described as follows:

Fuzzification layer (Layer 1), with two main tasks:
1. Computing the membership function (MF) values of the input variables
2. Generating the input variables for the next layer
The results of the first task are expressed as $O_{1,i} = \mu_{A_i}(x)$, where $\mu(x)$ is a bell-shaped function. In this study, the Gaussian membership function (GMF) is employed to fit appropriate relations between the input variables and the target variable (trapping efficiency). The GMF can be described as follows:

$$\mu(x) = \exp\left(-\frac{(x - c)^2}{2\sigma^2}\right)$$

where $x$, $c$ and $\sigma$ are the input parameter, the mean of the input parameter and its standard deviation, respectively.

Rule layer (Layer 2), which determines the firing strength $\omega_i$ of each rule via multiplication of the input signals. The main task of this layer is finding the best rule that relates the inputs. This layer is computed as follows:

$$O_{2,i} = \omega_i = \mu_{A_i}(x)\,\mu_{B_i}(y)$$

where $O_{2,i}$ and $\omega_i$ are the output of Layer 2 and the firing strength, respectively.

Normalization layer (Layer 3), which normalizes the firing strengths obtained from each rule in the previous layer as follows:

$$O_{3,i} = \bar{\omega}_i = \frac{\omega_i}{\sum_j \omega_j}$$

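Defuzzification layer (Layer 4): The node function of each adaptive node is computed from the output values obtained from the 'if-then' rules as follows:

$$O_{4,i} = \bar{\omega}_i f_i = \bar{\omega}_i\,(a_i x + b_i y + c_i)$$

where $\bar{\omega}_i$ is the normalized firing strength from Layer 3 and $\{a_i, b_i, c_i\}$ are the defuzzification parameters determined in the training stage (the consequent parameters written $p_i$, $q_i$ and $r_i$ in the rule definitions).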

Output layer (Layer 5): The weighted average of the overall outputs is computed in this layer as follows:

$$O_5 = \sum_i \bar{\omega}_i f_i = \frac{\sum_i \omega_i f_i}{\sum_i \omega_i}$$

The structure of the ANFIS model is presented in Figure 1. In addition, the adjustable parameters of ANFIS are reported in Table 4.
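The five layers can be traced in a short forward-pass sketch. It assumes a two-input, first-order Sugeno system with one Gaussian membership function per input and per rule; the function names and parameter layout are illustrative and do not reproduce the configuration in Table 4.

```python
import math

def gaussian_mf(x, c, sigma):
    """Gaussian membership function with mean c and standard deviation sigma."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def anfis_forward(x, y, mf_x, mf_y, consequents):
    """Forward pass of a two-input first-order Sugeno ANFIS.

    mf_x[i], mf_y[i]: (c, sigma) of the membership functions used by rule i.
    consequents[i]:   (p, q, r) of rule i, so that f_i = p*x + q*y + r.
    """
    # Layer 1: fuzzification (membership degrees of each input)
    mu_a = [gaussian_mf(x, c, s) for c, s in mf_x]
    mu_b = [gaussian_mf(y, c, s) for c, s in mf_y]
    # Layer 2: rule firing strengths (product of membership degrees)
    w = [a * b for a, b in zip(mu_a, mu_b)]
    # Layer 3: normalization (assumes at least one rule fires, i.e. sum(w) > 0)
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: normalized strength times the linear consequent of each rule
    rule_out = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequents)]
    # Layer 5: overall output as the sum of the weighted rule outputs
    return sum(rule_out)
```

For example, with two rules centred at 0 and 2 and the input point (1, 1), both rules fire equally, so the output is the plain average of the two rule outputs.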

Ant algorithm
This algorithm is categorized as computational intelligence or swarm intelligence. It is a type of intelligence that emerges from the collective behavior of a number of agents. The key strength of this model lies in the collective behavior of its members; it therefore does not depend on the intelligence of any individual.
The key factors in swarm intelligence include:
i. Population
ii. Interaction between population members: this distinguishes the algorithm from the genetic algorithm, in which there is no swarm intelligence because there is no behavioral interaction between population members
iii. Communication: any collaboration requires communication between population members
iv. Information exchange
v. Information flow
vi. Adherence of the population members to their rules
If there is a flow of information and self-discipline, there must be swarm intelligence. Ant behavior was first studied by a scientist named Goss using Argentine ants. In this experiment, the different paths that ants took to reach their prey were studied. The ants start in the nest, and one of the paths is shorter than the other. The ants do not know the environment at first. They release a substance called pheromone on their path. On the shorter path, which more ants traverse, the pheromone concentration is higher. Ants are naturally attracted to pheromones. Little by little, they settle into a stable pattern and gradually converge to the short path. It should be noted that the choice of the shortest path is possible because of the collective behavior of the ants rather than their individual actions. The pheromone's effect upon the colony acts like numerical information distributed throughout the solution space, and the ants use it while the algorithm runs in order to share their experiences with one another. The artificial ants used in the ant colony algorithm produce random solutions by repeatedly adding solution components to a partial solution. To create the answer to the problem, they use the heuristic information within the problem, the synthetic pheromone trails that change during the solution process, and the experience gained by the search agents (the ants). The route that an ant chooses from the nest toward food depends on the amount of pheromone secreted on the route and on the information available about the cost of selecting that route, which is defined in the following way: In the above relation, $p^k_{i,j}$ is the probability that ant $k$ chooses the path from point $i$ to point $j$. The variable $\tau_{i,j}$ is the amount of pheromone secreted by the ants on the route from point $i$ to point $j$.
The value of $\tau_{i,j}$ is expressed as the sum of the pheromone secreted in the preceding steps and the pheromone $\tau^k_{i,j}$ secreted by each ant on the route from point $i$ to point $j$. The experience gained by the ants, $\eta_{i,j}$, is treated as a dependent variable and is defined in the following way: In the above relation, $F(x)$ is the value of the cost function produced in each ant's past experiences. Moreover, the variables $\alpha$ and $\beta$ are the weighting coefficients of the pheromone and experience variables, respectively. The variables $\tau_{im}$ and $\eta_{im}$ are the mean pheromone and mean cost function values, respectively. In each iteration, a number of wrong answers must be eliminated. Hence, a variable named the 'evaporation coefficient' $\rho$ influences the amount of pheromone secreted on each path by ant $k$: In the equation above, the variables represent the evaporation coefficient ($\rho$) and the cost function ($J(\psi_t)$) obtained from the experience of the best ant that has passed a specific route. The variable $\tau^*_{i,j}$ is the amount of pheromone accumulated on each route in the previous steps. It is defined in the following way: The ant algorithm flowchart is presented in Figure 2(a).
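The roles of the pheromone, the heuristic information and the evaporation coefficient can be illustrated with the commonly used Ant System form of the rules. This form is an assumption made for illustration and may differ in detail from the relations used in this study; the function names and default parameter values are hypothetical.

```python
def transition_probabilities(tau, eta, alpha=1.0, beta=2.0):
    """Probability of choosing each candidate route.

    tau: pheromone on each candidate route
    eta: heuristic desirability (experience) of each candidate route
    alpha, beta: weights of the pheromone and heuristic terms
    """
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau, eta)]
    total = sum(weights)
    return [w / total for w in weights]

def update_pheromone(tau, deposits, rho=0.5):
    """Evaporate a fraction rho of the old pheromone, then add new deposits."""
    return [(1.0 - rho) * t + d for t, d in zip(tau, deposits)]
```

A larger evaporation coefficient makes the colony forget poor routes faster, at the cost of also weakening the memory of good ones.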

Particle swarm algorithm
This algorithm was first proposed by Kennedy and Eberhart, who named it particle swarm optimization (PSO). It is a social search algorithm modeled on the social behavior of bird flocks. Initially, the algorithm was designed to discover the patterns governing the simultaneous flight of birds and their sudden changes of direction in the search space.
In a flock of birds, each bird always flies towards the leader. If the leader deviates, the others deviate in the same way. The leader is in the best position. In this algorithm, every particle (a member of the population) looks for the optimal point. It therefore keeps moving under all circumstances, and this movement is rapid. Like other population-based algorithms, the PSO algorithm uses a set of candidate answers, which are updated until an optimal response is found or the termination conditions of the algorithm are met. In this method, each response $x$ is represented as a particle (in the genetic algorithm, each response is called a chromosome), and, as in all other population-based algorithms, an initial population must be formed. The velocity equation guarantees the movement of particles toward the optimal area. This equation is usually based on three basic elements: (a) the cognitive component: the personal best ($p_{best}$) is the best position a particle has reached; (b) the collective component: the global best ($g_{best}$) is the best particle found so far; and (c) the velocity ($V_i$).
In the simulation of this algorithm, the behavior of each particle is influenced by $p_{best}$, the best solution that the particle itself has ever had within a specific neighborhood, and by $g_{best}$, the best particle compared with all other particles.
The PSO algorithm flowchart is presented in Figure 2(b).
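The interplay of the three elements can be sketched with the widely used inertia-weight form of the velocity update. This is a minimal sketch under assumed settings: the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the swarm size and the test function are illustrative choices and are not the values reported for this study.

```python
import random

def pso_minimize(cost, dim, bounds, n_particles=20, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize cost(position) with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]                 # personal best positions
    p_best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: p_best_cost[i])
    g_best, g_best_cost = p_best[g][:], p_best_cost[g]   # global best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + cognitive pull (p_best) + collective pull (g_best)
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (p_best[i][d] - pos[i][d])
                             + c2 * r2 * (g_best[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < p_best_cost[i]:
                p_best[i], p_best_cost[i] = pos[i][:], c
                if c < g_best_cost:
                    g_best, g_best_cost = pos[i][:], c
    return g_best, g_best_cost

def sphere(p):
    """Simple test function with its minimum of 0 at the origin."""
    return sum(v * v for v in p)

best_position, best_cost = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```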

Genetic algorithm
The general structure of the genetic algorithm was described by Goldberg in 1989. The genetic algorithm is a random search technique based on the natural selection mechanism. It begins with an initial set of answers called the population. Each member of the population is called a chromosome $X_n$ ($X_n \in R$) and is a candidate answer to the problem under investigation. Each chromosome includes a sequence of numbers called genes.
Gene values, which represent the values of the decision variables, can be defined as real numbers or in binary form. The basic differences between the genetic algorithm and other common optimization algorithms and search structures include:
i. The genetic algorithm does not work directly with the answers themselves but deals with the codes that represent them.
ii. The genetic algorithm searches a set of answers instead of a single answer.
iii. The genetic algorithm uses fitness function information instead of derivatives or other auxiliary information.
iv. The genetic algorithm uses probabilistic selection rules instead of deterministic rules.

Selection
In the genetic algorithm, selection is based on the principle of Darwin's natural selection. The criteria, methods and thresholds of selection are considered the most important components and processes in the genetic algorithm. On the other hand, when the selection pressure is reduced, high dispersion of the generation reduces the possibility of reaching the final answer. There are various methods for the selection mechanism, including ranking, competition (tournament) and roulette wheel. In this study, roulette wheel selection is employed.
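Roulette wheel selection can be sketched as follows, assuming non-negative fitness values where larger means better; for a minimization problem such as RMSE, the cost would first have to be converted into such a fitness. The function name is hypothetical.

```python
import random

def roulette_wheel_select(fitness, rng):
    """Return the index of one individual, chosen with probability
    proportional to its (non-negative) fitness."""
    total = sum(fitness)
    pick = rng.uniform(0.0, total)   # where the wheel stops
    running = 0.0
    for i, f in enumerate(fitness):
        running += f                 # each individual owns a slice of width f
        if pick < running:
            return i
    return len(fitness) - 1          # guard against floating-point round-off
```

An individual with three times the fitness of another is, on average, selected three times as often, which is how the selection pressure is applied.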

Crossover
Genetic crossover is another operator of the genetic algorithm. It combines parental genes to produce new offspring that enter the next generation. This process can be performed in different forms. Consider parental genes as $X$ and $Y$: By selecting $K \in [1, n]$ as the random point of crossover, the offspring can be expressed in the following way: $X_{new}$ and $Y_{new}$, the new offspring of the parents $X$ and $Y$, inherit the corresponding genetic attributes of both parents.
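A single-point crossover at position $K$ can be sketched as below. Which offspring keeps the head of which parent is a convention assumed here for illustration, and the function name is hypothetical.

```python
def single_point_crossover(x, y, k):
    """Swap the tails of two equal-length parents after position k (1 <= k <= n)."""
    if len(x) != len(y):
        raise ValueError("parents must have the same length")
    if not 1 <= k <= len(x):
        raise ValueError("crossover point must lie in [1, n]")
    x_new = x[:k] + y[k:]   # head of X, tail of Y
    y_new = y[:k] + x[k:]   # head of Y, tail of X
    return x_new, y_new
```

With `k` equal to the chromosome length, the offspring are copies of the parents.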

Genetic mutation
A mutation is an operator that causes changes in different chromosomes. One simple way to mutate a chromosome is to change one or more genes at random. As with crossover, a chromosome is selected at random, and the mutation is performed on it. The chromosome mutation is performed in the following way: In the above relation, $X_K$ is a randomly chosen gene in the range $[X^L_K, X^U_K]$. The values of $X^L_K$ and $X^U_K$ are the minimum and maximum boundary values associated with the variable, respectively. The gene $X_K$ can be replaced by either $X^L_K$ or $X^U_K$ with equal probability. In the genetic algorithm, mutation plays a vital role in restoring lost genes to the population, so that they can be present in a new form, and in introducing new genes that were not present in the initial population. The GA flowchart is presented in Figure 2(c).
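The boundary mutation described above, in which one randomly chosen gene is replaced by its lower or upper bound with equal probability, can be sketched as follows. The function name and the returned index are illustrative additions.

```python
import random

def boundary_mutation(chromosome, lower, upper, rng):
    """Replace one random gene by its lower or upper bound (equal probability).

    Returns the mutated copy and the index of the mutated gene.
    """
    child = list(chromosome)          # leave the parent untouched
    k = rng.randrange(len(child))     # gene chosen at random
    child[k] = lower[k] if rng.random() < 0.5 else upper[k]
    return child, k
```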

Differential evolution algorithm
The DE algorithm, first introduced by Storn and Price (1997), is a random population-based algorithm. The main feature distinguishing this algorithm from other metaheuristic algorithms is differential mutation (Takagi & Sugeno, 1985). The main emphasis in developing the DE algorithm was on solving complex optimization problems in which local search is lacking. In addition, the order of the selection, mutation and crossover operators is different: in this algorithm, the mutation operator is applied before the crossover operation. The DE algorithm starts by producing a random population, each member of which represents a candidate solution to the studied problem. The DE flowchart is presented in Figure 2(d).
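One generation of the classic DE/rand/1/bin scheme illustrates the order of operations (mutation, then crossover, then selection). The scale factor `f`, the crossover rate `cr` and the function name are assumptions made for illustration, not the settings of this study.

```python
import random

def de_step(pop, cost, rng, f=0.5, cr=0.9):
    """Run one DE/rand/1/bin generation; pop needs at least four members."""
    n, dim = len(pop), len(pop[0])
    new_pop = []
    for i, target in enumerate(pop):
        # Differential mutation: base vector plus a scaled difference vector
        r1, r2, r3 = rng.sample([j for j in range(n) if j != i], 3)
        mutant = [pop[r1][d] + f * (pop[r2][d] - pop[r3][d]) for d in range(dim)]
        # Binomial crossover: take at least one gene from the mutant
        j_rand = rng.randrange(dim)
        trial = [mutant[d] if (rng.random() < cr or d == j_rand) else target[d]
                 for d in range(dim)]
        # Greedy selection: keep the trial only if it is no worse
        new_pop.append(trial if cost(trial) <= cost(target) else target)
    return new_pop
```

Because of the greedy selection, no member of the population can get worse from one generation to the next.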

Models hybridization
Researchers have combined different algorithms to overcome the limitations of each individual algorithm. These combinations fall into four groups. In the first group, evolutionary algorithms act as input preprocessors, and another intelligent approach is used to find the best possible answer. In the second group, another intelligent approach processes the data, and the best possible answer is then obtained by evolutionary algorithms. In the third group, both algorithms work together to optimize and adjust each other's parameters. In the fourth group, the intelligent approach is used solely to prepare and determine the necessary parameters of the evolutionary algorithm. Standalone estimators such as support vector regression, ANFIS and ANN employ classical approaches such as gradient descent to adapt their adjustable parameters. In some cases, especially when the number of input variables increases, they encounter difficulties such as high computational cost or imprecise results. There is therefore a strong need for other approaches, such as evolutionary algorithms (EA), to enhance the capability of standalone estimators to find the best solutions. These models are dynamic and can improve the flexibility of standalone models in tackling complex problems. However, there are many types of EA, and they use different paradigms to solve a problem. Some of these algorithms become trapped in local solutions, while others find the best solution easily at lower computational cost. This study focuses on finding the best EA for enhancing the capabilities of the standalone ANFIS.
The ANFIS includes two important sections, named the antecedent and consequent sections. In the standalone ANFIS, a backpropagation neural network is employed to find appropriate values for the parameters of each section. In the hybrid models, the membership function parameters of the antecedent section, discussed in Section 3.1, are tuned by the evolutionary algorithms. In addition, the evolutionary algorithm attempts to find the optimum values of the consequent section, which comprises the parameters $p_i$, $q_i$ and $r_i$ in relation 9. The computations are iterative and are terminated once the termination criteria are satisfied. The hybridization flowchart is presented in Figure 3.
There are different methods for the hybridization of the algorithms:
i. The solutions in the initial population of an evolutionary algorithm can be obtained by a heuristic method.
ii. In some methods, the solutions obtained by an evolutionary algorithm can be improved by local search. These types of algorithms are known as memetic algorithms.
iii. The solutions are represented indirectly, and a dedicated decoding algorithm maps the indirect solutions to actual ones. The mapping is performed in such a way that the decoding algorithm extracts the problem characteristics, and then the metaheuristic algorithm starts to solve the problem.
iv. Some methods extract the solution to the problem with variation operators that are based on problem information. For example, in the combination of some algorithms, the secondary population is better than the integration and combination of the primary population.
In this section, ANFIS is combined with the evolutionary algorithms mentioned in the previous section in the following steps:
i. Receiving training data: in this step, the laboratory data are entered into the model, randomly split into 75% for training and 25% for testing.
ii. Creating a base fuzzy inference system (FIS).
iii. Adjusting the parameters of the fuzzy system with respect to the performance error indices by means of the evolutionary algorithms.
iv. Choosing the fuzzy system with the best parameters and the least error as the final result.
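Step iii can be pictured as minimizing a cost function over a flat vector that holds both the antecedent and the consequent parameters. The sketch below assumes a hypothetical one-input, two-rule system and RMSE as the cost; the parameter layout `theta` and the function names are illustrative, and any metaheuristic that minimizes a function of a parameter vector could be plugged in.

```python
import math

def anfis_predict(theta, x):
    """One-input, two-rule first-order Sugeno ANFIS (hypothetical layout).

    theta = [c1, s1, c2, s2, p1, r1, p2, r2]
    (c, s): Gaussian mean and standard deviation (antecedent section)
    (p, r): slope and intercept of each rule output (consequent section)
    """
    c1, s1, c2, s2, p1, r1, p2, r2 = theta
    w1 = math.exp(-((x - c1) ** 2) / (2.0 * s1 ** 2))
    w2 = math.exp(-((x - c2) ** 2) / (2.0 * s2 ** 2))
    return (w1 * (p1 * x + r1) + w2 * (p2 * x + r2)) / (w1 + w2)

def rmse_cost(theta, xs, ys):
    """Cost seen by the optimizer: RMSE of the ANFIS defined by theta."""
    errors = [(anfis_predict(theta, x) - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(errors) / len(errors))
```

The optimizer never needs gradients of this cost, which is what allows the same ANFIS to be paired with PSO, GA, DE or ACO.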
The user-defined parameters of the optimization methods are reported in Table 5. The developed hybridized ANFIS models were assessed using several statistical indices, including the correlation coefficient (CC), root mean square error (RMSE), scaled root mean square error (SRMSE) and Willmott Index (W-Index). The mathematical formulation can be expressed as follows: where $N$ is the number of data points, $(SR)_{obs}$ and $(SR)_{pre}$ are the observed and predicted sediment removal, and $\overline{(SR)}_{obs}$ and $\overline{(SR)}_{pre}$ are the mean values of the observed and predicted sediment removal.
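The four indices can be computed as in the sketch below. CC, RMSE and the Willmott index follow their usual textbook definitions; the scaling of SRMSE by the mean of the observations is an assumption, since other scalings (for example by the standard deviation) are also in use.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def cc(obs, pre):
    """Pearson correlation coefficient between observed and predicted values."""
    mo, mp = _mean(obs), _mean(pre)
    sxy = sum((o - mo) * (p - mp) for o, p in zip(obs, pre))
    sxx = sum((o - mo) ** 2 for o in obs)
    syy = sum((p - mp) ** 2 for p in pre)
    return sxy / math.sqrt(sxx * syy)

def rmse(obs, pre):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pre)) / len(obs))

def srmse(obs, pre):
    """RMSE scaled by the mean observation (assumed scaling)."""
    return rmse(obs, pre) / _mean(obs)

def willmott_index(obs, pre):
    """Willmott index of agreement (1 means perfect agreement)."""
    mo = _mean(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pre))
    den = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in zip(obs, pre))
    return 1.0 - num / den
```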

Results and discussion
In this study, dimensionless variables (C, Fr, m, ex, s, $R_h$ … capabilities of the improved ANFIS models in predicting the depth, several computed statistical indices were adopted for the models' evaluation. RMSE represents the standard deviation of the residuals (the differences between the predicted values and the observed data). This criterion states how closely the data are concentrated around the line of best fit. CC is an index of how closely two variables are related to each other. The W-Index measures the agreement between predicted and observed data, and SRMSE is the normalized RMSE.
The best performance results of each model in the training and testing stages are tabulated in Table 6. The errors in the training stage show that the ANFIS-PSO model predicts removal efficiency better than the other models. The values of the performance indices in the training stage indicate that the ANFIS-PSO model has the lowest RMSE (5.276) and the highest CC (0.915). This model is more powerful than the standalone ANFIS model. This is best explained by the ability of the PSO metaheuristic algorithm to tune the essential internal parameters of the ANFIS model so as to attain a reliable learning process for the studied problem. However, the non-hybrid ANFIS model also performs better than … In the testing stage, the ANFIS-PSO model (CC = 0.916) performed better than the standalone ANFIS model (CC = 0.897). Based on the results presented in Table 6, ANFIS-GA and ANFIS-ACO achieved their best performance with similar input combinations, whereas the other hybrid ANFIS models favor different combinations because of the different paradigms that their optimization methods use to find the best solution.
The scatter diagrams of all developed predictive models are shown in Figure 4 for the testing stage. According to these diagrams, the ANFIS-ACO (CC = 0.862), ANFIS-DE (CC = 0.845) and ANFIS-GA (CC = 0.877) models have the lowest linear correlation between the observed and predicted data. A heat map was used to compare the models applied to predict the removal efficiency. In the heat map, superior elements with a more regular pattern in the performance index values are shown in dark red, and unbalanced elements with a more irregular trend in dark blue. A model whose performance indices behave more regularly and report lower error is more appropriate (more reliable, with less error). As presented in Figure 5, models whose indices show stronger predictive power appear in cooler colors, whereas models with more irregularity, more error and weaker performance appear in warmer colors. In both stages, the ANFIS-PSO model has the coolest color, indicating the least weakness and irregularity, and the ANFIS-DE model has the warmest color, indicating the greatest weakness and the most irregularity in the performance indices.
For a further graphical comparison, a Taylor diagram was generated. Its results agree well with the scatter plots. Figure 6 shows the closer relative agreement between the ANFIS-PSO predictive model (green rectangle) and the observed data: among the predictive models, ANFIS-PSO has the highest agreement with the observed data (over 0.9) in both the training and testing stages. The ANFIS-DE predictive model (yellow circle), by contrast, has the lowest agreement with the observed data.
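A Taylor diagram summarizes three statistics at once: the standard deviations of the observed and predicted series, their correlation coefficient, and the centered root mean square difference. The sketch below computes these quantities for one model; it is an illustration under the standard Taylor diagram definitions, not the code used to produce Figure 6, and the function name and dictionary keys are hypothetical.

```python
import math


def taylor_statistics(observed, predicted):
    """Return the statistics that place one model on a Taylor diagram."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Population standard deviations of both series.
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted) / n)
    # Pearson correlation coefficient.
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    cc = cov / (std_o * std_p)
    # Centered RMS difference (means removed before differencing).
    crmsd = math.sqrt(sum(((p - mean_p) - (o - mean_o)) ** 2
                          for o, p in zip(observed, predicted)) / n)
    return {"std_obs": std_o, "std_pred": std_p, "cc": cc, "crmsd": crmsd}
```

These statistics satisfy crmsd² = std_obs² + std_pred² − 2 · std_obs · std_pred · cc, which is the law-of-cosines relation that allows all three to be read from a single point on the diagram.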
A box diagram is a standard way of displaying the distribution of data, and one of its applications is to display symmetry. As shown in Figure 7, the results obtained by the ANFIS-PSO model are in good agreement with the observed data. This assessment is based on the characteristics of the box diagram (minimum, maximum, first quartile, second quartile or median, and third quartile).

The influence of model uncertainty on the performance indices
In this section, the impact of model uncertainty on the performance indices (RMSE, CC, W-Index and SRMSE) is investigated. According to the results reported in Figure 8, the W-Index (IQR = 0.78) has the highest sensitivity of all indices over the testing stage. The RMSE and SRMSE indices have the same IQR (0.76), and the CC has the lowest IQR (0.75).
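The sensitivity comparison rests on the interquartile range (IQR) of each index across model runs. The sketch below shows one way to compute it; the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) and any scaling applied to the indices beforehand are assumptions, since the paper does not state them, and the function names are hypothetical.

```python
import statistics


def interquartile_range(values):
    """IQR = Q3 - Q1, using inclusive quartiles (an assumed convention)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def most_sensitive_index(runs_by_index):
    """Given {index name: values over runs}, return the name with the
    largest IQR together with the IQR of every index."""
    iqr = {name: interquartile_range(vals)
           for name, vals in runs_by_index.items()}
    return max(iqr, key=iqr.get), iqr
```

A larger IQR means the index varies more from run to run, which is the sense in which an index is described as more sensitive here.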

The influence of input variables on the performance indices
The impact of the input variables on the performance indices was investigated for the ANFIS-PSO model because of its superiority in predicting sediment removal. The index values were obtained by eliminating some of the input variables from the ANFIS-PSO model. The results show that, in the testing stage, the CC (IQR = 0.17) is the index most sensitive to the choice and combination of independent variables entered into the model, while the other indices have similar IQR values (0.16). Figure 9 shows how each index responds to changes in the inputs of the ANFIS-PSO model.

Impact of the data number on performance indices
To investigate the effect of the number of input data on the sediment removal efficiency prediction of the superior model (i.e. ANFIS-PSO), data were randomly eliminated at percentages of 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45% and 50%.
In the testing stage, the CC (IQR = 0.76) and the W-Index (IQR = 0.66) showed the highest sensitivity to the number of input data, whereas the SRMSE (IQR = 0.46) and the RMSE (IQR = 0.34) were less sensitive. The performance of the ANFIS-PSO model after random elimination of the data at the different percentages is shown in Figure 10.
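The random elimination procedure can be sketched as follows. This is an illustrative outline only: the sampling scheme, the seed and the function names are assumptions, and the model would be retrained and re-evaluated on each reduced data set before the spread of each index is compared.

```python
import random

# Elimination percentages reported in the text.
PERCENTAGES = [10, 15, 20, 25, 30, 35, 40, 45, 50]


def eliminate_randomly(records, percentage, seed=0):
    """Drop the given percentage of records at random, keeping the
    original order of the remaining records."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    n_drop = round(len(records) * percentage / 100)
    keep = sorted(rng.sample(range(len(records)), len(records) - n_drop))
    return [records[i] for i in keep]


def subsets_by_percentage(records, percentages=PERCENTAGES, seed=0):
    """Build one reduced data set per elimination percentage."""
    return {pct: eliminate_randomly(records, pct, seed) for pct in percentages}
```

Each reduced data set would then be passed through the same training and testing routine, and the resulting index values collected for comparison.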

Comparison of the obtained results to empirical relations
The results obtained from the ANFIS-PSO model, as the superior predictive model, were validated against the well-established empirical relation of Singh (2018) (Table 7). The validation showed a noticeable improvement of the developed ANFIS-PSO model over the regression relation.
In the training stage, the ANFIS-PSO model improved the correlation between the predicted and observed data by 23.648% relative to the regression relation and reduced the RMSE by 40.65%. In the testing stage, the correlation between the predicted and observed data increased by 29.01% and the RMSE decreased by 46.01% compared with the regression relation.
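Relative improvements of this kind are usually expressed as a percentage change with respect to the reference method. The sketch below assumes this definition, with the regression relation as the reference; the function name is hypothetical, and the paper does not state the exact formula used.

```python
def percent_change(reference, candidate):
    """Percentage change of `candidate` relative to `reference`.
    Positive values are increases, negative values are decreases."""
    return (candidate - reference) * 100.0 / abs(reference)
```

For example, with purely illustrative numbers (not taken from Table 7), a reference RMSE of 10 and a model RMSE of 6 give a change of −40%, that is, a 40% reduction in error.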

Conclusion
In this paper, four hybrid AI models (i.e. ANFIS-PSO, ANFIS-GA, ANFIS-ACO and ANFIS-DE) were investigated to predict the removal efficiency of sediment ejectors. The data published by Singh (2018) were used to build the proposed hybrid AI models. Ten non-dimensional input variables were employed in 11 input combinations, and the best input combination was selected according to the modeling performance. In the training stage, the ANFIS-PSO model attained the lowest error (RMSE = 5.276), with W-Index = 0.954 and the highest correlation coefficient (CC = 0.915) among the models. In the testing stage, the ANFIS-PSO model likewise showed the superior prediction capacity (RMSE = 6.176, W-Index = 0.944 and CC = 0.916). The distribution indices of the obtained results (median, Q1 and Q3) showed that the ANFIS-PSO model has better symmetry with the observed data; the errors in these indices over the testing stage were 4.863% (median), 7.36% (Q1) and 2.037% (Q3). The study also investigated the effect of eliminating one, two, three, four and five input variables, and the results demonstrated that the performance indices are sensitive to the number of input variables and their combination. In the final stage, the data were randomly eliminated at 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45% and 50%, and the spread of the distribution of each index was investigated. The results indicated that the correlation coefficient was the most sensitive index over the testing stage.