Estimation and interpretation of equilibrium scour depth around circular bridge piers by using optimized XGBoost and SHAP

Most bridge failures result from scouring around bridge piers, resulting in economic losses and risks to public safety. The conventional equations for predicting the depth of scour at bridge piers have several limitations: (1) They mainly use regression-based techniques that cannot robustly capture the nonlinear relationship between the scour depth and its effective variables; (2) they are applicable only to a narrow range of variability of the data; and (3) they are typically calibrated using laboratory data rather than field measurements and thus cannot simulate the prototype environment. To overcome these limitations, in this study, three novel hybrid machine learning methods, namely particle swarm optimization-extreme gradient boosting (PSO-XGBoost), red fox optimization-XGBoost (RFO-XGBoost), and relativistic particle swarm optimization-XGBoost (RPSO-XGBoost), are applied to estimate the scour depth around circular bridge piers, and their effectiveness is validated using three statistical metrics, i.e. the mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2). RPSO-XGBoost generates the best results for both dimensional and dimensionless data. Moreover, the proposed approaches outperform the state-of-the-art techniques. The SHapley Additive exPlanations (SHAP) method is used to assess the relative significance of the contributing factors for predicting the scour depth.


Introduction
Scouring around bridge piers is a notable driving factor of bridge collapses (Wardhana & Hadipriono, 2003; Wang et al., 2017). According to the US Department of Transportation, scouring led to the damage of 17 bridges in New York and New England during the flood in 1987 (Richardson & Davis, 2001). The Federal Highway Administration defines scouring as the process in which streambed sediments are eroded and removed from around bridges by the water flowing over the bed (Mueller & Wagner, 2005). Different bed materials undergo different levels of scouring, leading to different scour rates. For example, non-cohesive materials such as sand and gravel may erode within a few hours, whereas cohesive bed materials require more time to erode (Richardson & Davis, 2001).
The scouring process around bridge piers is composed of two mechanisms that together render it complex: three-dimensional flow separation and sediment transport (Sreedhara et al., 2019). The water rushing around the pier experiences three-dimensional separation, which generates a pressure gradient resulting in a downward flow upstream of the pier. This downward flow creates a vortex system upstream of the pier, termed a horseshoe vortex owing to its appearance resembling a horseshoe when viewed from above (Pandey et al., 2020a). The shear stress increases due to the vortex system and the corresponding downflow around the pier, accelerating sediment transport and finally creating a scour hole near the pier.
After the scour hole is fully developed and reaches its maximum size, the flow pattern is altered, and the shear stress and sediment transport diminish (Pandey et al., 2020b). The safe design of bridge foundations is critical for preventing bridge failures, and the scour depth must be precisely estimated to ensure the economic and secure design of bridges. Hence, many regression-based formulas have been established to determine the depth of equilibrium scour around bridge piers (Laursen & Toch, 1956; Shen et al., 1969; Hancu, 1971; Jain & Fischer, 1979; Kothyari et al., 1992; Lee & Sturm, 2009; Melville & Coleman, 2000; Khan et al., 2017; Pandey et al., 2018). Notably, the existing equations suffer from the following limitations: (1) They are mainly based on limited experimental data and are thus applicable to only a narrow range of variability of the inputs (e.g. flow depth, flow velocity, pier diameter, and sediment size) and output (scour depth) for which they are calibrated; (2) they cannot robustly capture the highly nonlinear relationship between the scour depth and its determining factors; and (3) they are typically calibrated based on laboratory data and thus cannot accurately capture the prototype environment, resulting in overestimation of the scour depth (Bateni et al., 2007a).
In a departure from regression-based approaches, machine learning (ML) techniques have been used to predict scour depth. Sharafi et al. (2016) showed that the support vector machine (SVM) outperforms artificial neural networks (ANN), adaptive neuro-fuzzy inference systems (ANFIS), and nonlinear regression methods in estimating the scour depth. Ebtehaj et al. (2017) compared the scour depth prediction performance of the self-adaptive extreme learning machine with those of regression-based, ANN, and SVM techniques. Choi et al. (2017) used ANFIS to determine the scour depth around circular bridge piers. While their approach outperformed the conventional equations, the small number of data points in their study restricts the generalizability of their proposed method. Genetic programming and ANFIS were proposed as alternatives to conventional methods owing to their promising performance in estimating the scour depth near piers (Abd El-Hady Rady, 2020). Shamshirband et al. (2020) developed several PSO-based equations to predict the scour depth around circular bridge piers using different input variables. Dang et al. (2021) optimized ANN by PSO and firefly algorithms to enhance its efficiency in scour depth estimation. However, the performances of their proposed methods were not compared with other studies, limiting the ability to assess their effectiveness. Sreedhara et al. (2021) utilized gradient tree boosting (GTB) to estimate the temporal scour depth around circular, rectangular, sharp-nosed, and round-nosed bridge piers. Their proposed ML model was calibrated based on a limited number of laboratory data points and was not evaluated against previous studies. Chou and Nguyen (2022) introduced the Metaheuristic-Optimized Stacking System (MOSS) to estimate the scour depth around bridge piers. By combining a metaheuristic algorithm with multiple ML methods, they showed the superiority of MOSS over single ML models, conventional regression-based methods, and mathematical approaches. Despite the strength of their study, it used a small dataset, which hinders its generalization. Also, none of the abovementioned studies used explainable artificial intelligence (AI) to show the relative contribution of input features to scour depth estimations.
This study overcomes the drawbacks of the existing studies by (1) using 841 experimental data points from 35 field and laboratory studies in clear-water conditions, thereby covering a wide range of variability of the scour depth and other relevant variables; (2) applying three novel hybrid ML techniques: particle swarm optimization-extreme gradient boosting (PSO-XGBoost), red fox optimization-XGBoost (RFO-XGBoost), and relativistic particle swarm optimization-XGBoost (RPSO-XGBoost), given that PSO, RFO, and RPSO can tune the hyperparameters of XGBoost to improve its performance; and (3) interpreting the proposed ML models using the SHapley Additive exPlanations (SHAP) method. In particular, SHAP represents an intuitive and effective tool for exploring the contribution of each input feature to the output of an ML model.
XGBoost is a robust and scalable ML method that was proposed by Chen et al. (2015). This framework operates on the basis of a gradient boosting (GB) tree for gradient enrichment. XGBoost has demonstrated robust performance in regression problems, in terms of speed, memory usage, scalability, and hardware (Lucca et al., 2021; Qiu et al., 2021), in applications such as detection in water delivery systems (Wu et al., 2022), evaluation of flash flood risk (Ma et al., 2021), groundwater level estimation (Osman et al., 2021), and flood forecasting (Venkatesan & Mahindrakar, 2019). In general, XGBoost can model a variety of water resource problems, and its performance can be improved by tuning its hyperparameters (Ni et al., 2020; Yu et al., 2020; Shi et al., 2021; Nguyen et al., 2021; Demir & Sahin, 2023). Although this tuning can be performed through trial and error, such approaches are extremely time-consuming and cumbersome.
PSO is a popular swarm-based metaheuristic optimization technique that draws inspiration from animal foraging behaviours (Kennedy & Eberhart, 1995). RFO, one of the newest metaheuristic optimization tools, was introduced by Połap and Woźniak (2021); its concept is based on the predation tricks of red foxes (Cui et al., 2022; Natarajan et al., 2022). RPSO, introduced by Roder et al. (2020), is a variant of the PSO algorithm that functions based on the concept of relativity. These optimization strategies have been successfully used to adjust hyperparameters of ANN (Alizamir & Sobhanardakani, 2018), XGBoost (Yu et al., 2020; X. Zhang et al., 2020), and random forest platforms (Pham et al., 2020).
In this study, we exploit the PSO, RFO, and RPSO techniques to identify the optimal hyperparameters of XGBoost and thereby improve its performance (Gu et al., 2021; Lucca et al., 2021). To the best of our knowledge, the proposed hybrid approaches have not been previously used to predict the scour depth. Moreover, the performances of the developed hybrid models are compared with those of state-of-the-art methods.
Furthermore, the interpretation of ML methods is crucial to explain how these algorithms obtain optimal predictions. To this end, explainable artificial intelligence (XAI) methods have attracted increasing attention in recent years. Representative XAI techniques include SHAP, accumulated local effects, and partial dependence plots (Ishfaque et al., 2022).
In this study, the relative significance of the different input parameters for scour depth prediction is determined using SHAP values. SHAP, proposed by Lundberg and Lee (2017), quantifies the significance of each feature and explains how ML models make predictions. The predictions of a model can be expressed as the summation of a fixed base value and each feature's corresponding SHAP values. Unlike conventional measures such as feature importance and mean decrease impurity (Mangalathu et al., 2020; Demir & Sahin, 2023), SHAP can determine whether the contribution of each feature is positive or negative. SHAP has been used to interpret several ML methods for dam seepage problems (Ishfaque et al., 2022) and drought forecasting (Dikshit & Pradhan, 2021) owing to its notable advantages, such as local and global model interpretation, scalability, robustness, and feasibility for a vast range of problems including regression, classification, and ranking (Lundberg et al., 2020). To the best of the authors' knowledge, this study represents the first attempt at performing SHAP analysis to interpret ML methods for scour depth prediction.
The rest of the paper is organized as follows: Section 2 describes the variables affecting the scour depth and the data sources. Section 3 discusses the XGBoost, PSO, RFO, RPSO, and SHAP techniques. Section 4 presents the findings, and Section 5 presents the concluding remarks.

Data
The scour depth, influenced by flow characteristics, sediment properties, and pier dimensions, is inherently complex due to the intricate interplay of hydraulic forces, sediment dynamics, and morphological changes (Choi et al., 2017). This complexity accentuates the need for extensive understanding and modeling techniques to analyze and predict the scour depth accurately. However, conducting three-dimensional physics-based simulations around bridge piers to accurately represent the scour mechanism is challenging (Dang et al., 2021). Given the physical complexity and the limitations of physical models, the availability of a comprehensive scour dataset becomes crucial. When utilized with ML approaches, such a dataset provides valuable insights into the intricate scour phenomena and enables the rapid generation of accurate scour depth estimates (Dang et al., 2021). Importantly, our study stands out as the first to employ such a comprehensive dataset to model the scour depth around circular bridge piers, further highlighting its significance.
The maximum scour depth, also known as the equilibrium scour depth, is reached around bridge piers after a period of gradual development. The equilibrium scour depth near circular bridge piers is determined by the fluid flow, bed materials, and pier diameter (Melville & Coleman, 2000):

d_se = f(ρ, μ, V, Y, g, d_50, V_c, D)    (1)

where d_se is the equilibrium scour depth, ρ is the density of water, μ is the dynamic viscosity of water, V is the mean flow velocity, Y is the flow depth, g is the gravitational acceleration, d_50 is the median sediment size, V_c is the mean critical velocity, and D is the pier diameter.
Using the Buckingham π theorem and considering ρ, D, and V as the repeating parameters, Eq. (1) can be expressed in the dimensionless form of Eq. (2), in which V/√(gY) is the Froude number (Fr) and ρVD/μ is the Reynolds number (Re).
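As a quick numerical illustration of these two dimensionless groups, the sketch below evaluates Fr and Re for hypothetical flow values (not taken from the paper's dataset):

```python
import math

# Hypothetical flow values for illustration only (not from the paper's dataset):
rho = 1000.0   # water density (kg/m^3)
mu = 1.0e-3    # dynamic viscosity of water (Pa.s)
V = 0.5        # mean flow velocity (m/s)
Y = 0.3        # flow depth (m)
D = 0.1        # pier diameter (m)
g = 9.81       # gravitational acceleration (m/s^2)

Fr = V / math.sqrt(g * Y)   # Froude number, V / sqrt(gY)
Re = rho * V * D / mu       # pier Reynolds number, rho*V*D/mu

print(round(Fr, 3), Re)
```

For these values, the flow is subcritical (Fr below 1) while Re is large enough to indicate a fully turbulent regime around the pier.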
The Reynolds number determines the flow regime around bridge piers; a higher Reynolds number signifies a more turbulent condition that may enhance the erosive potential and increase scouring (Hu et al., 2022). Therefore, including Re when simulating scouring around bridge piers can be beneficial to account for its underlying effect and ensure a thorough understanding of the scouring process (Tavouktsoglou et al., 2017). However, it is worth mentioning that our results in Section 4.2 show that Re has a marginal impact on the scour depth estimates, and its effect on d_se/Y is smaller than that of the other variables in Eq. (2).
In this study, 841 field and laboratory data points under clear-water conditions for circular bridge piers are collected from 35 published and unpublished reports, which are listed in Table 1.
Table 1. Sources of the scour depth data used in this study.

(Table 1 columns: Study, Type of data, Number of data points; the first listed source is Chabert and Engeldinger (1956).)

Table 2 summarizes the mean, minimum value, maximum value, standard deviation (SD), and coefficient of variation (CV) of the data.

XGBoost
XGBoost is a novel variant of the GB technique (Friedman, 2001, 2002). Unlike the GB technique, XGBoost can efficiently manage a considerable amount of data by generating boosted trees and implementing them in parallel (Le et al., 2019; Zhang et al., 2020). Consequently, reliable and fast simulations can be performed for engineering problems with a large number of data points. The complexity of the trees is controlled through the loss function: in each iteration, the residuals are used to correct the former predictions by optimizing the loss function. Furthermore, XGBoost includes a regularization term in the objective function (Zhou et al., 2021).
The process flow of XGBoost can be described mathematically as follows. Consider a dataset with m data points and n features, where x_i and y_i are the input and output, respectively. The output of a tree ensemble model is obtained by adding P functions:

ŷ_i = Σ_{p=1}^{P} f_p(x_i),  f_p ∈ F    (3)

where F is the space of regression trees, expressed as:

F = { f(x) = ω_{q(x)} },  q: R^n → {1, …, T},  ω ∈ R^T    (4)

where q maps each data point to its associated leaf in the tree structure, T is the number of leaves on the tree, f_p is a function corresponding to a stand-alone tree structure, and ω is the vector of leaf output weights.
The target function (J) of the XGBoost method is defined as in Eq. (5) to minimize the error of the ensemble of trees:

J^(t) = Σ_{i=1}^{m} L(y_i, ŷ_i^(t)) + Σ_p Ω(f_p),  Ω(f) = γT + (1/2) λ ||ω||²    (5)

where L is the training loss function used to determine the distance between the estimated (ŷ_i) and observed (y_i) values, superscript t denotes the number of iterations, and Ω is the regularization function that controls the model complexity. λ is the penalty coefficient, and γ penalizes the complexity of each tree; a higher value of γ yields less complex trees. The hyperparameters of XGBoost are the learning rate, max_delta_step, max_depth, and min_child_weight. The learning rate controls the shrinkage (reduction in the size of the incremental steps) used to prevent overfitting in the learning process; shrinking the weights at each step yields a more conservative model. Max_delta_step bounds the weight estimate of each tree; a positive value results in a more conservative update step. The maximum depth of a tree, max_depth, is used to control overfitting in the learning phase. Min_child_weight is the minimum sum of instance weights required in a child node; higher values make the model more conservative. In this study, we apply three optimization techniques to find the optimal hyperparameters of XGBoost, as described in the following subsections.
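The regularized objective of Eq. (5) can be sketched in plain Python for a toy ensemble; the numbers, the squared-error loss, and the function name are illustrative assumptions, not the paper's implementation:

```python
# Sketch (toy numbers): the objective combines a training loss with a
# per-tree regularization term gamma*T + 0.5*lambda*||omega||^2.

def xgb_objective(y_obs, y_pred, leaf_weights_per_tree, gam, lam):
    # squared-error training loss over all data points
    loss = sum((yo - yp) ** 2 for yo, yp in zip(y_obs, y_pred))
    # regularization: gamma penalizes the number of leaves T,
    # lambda penalizes the magnitude of the leaf weights omega
    reg = sum(gam * len(w) + 0.5 * lam * sum(wi ** 2 for wi in w)
              for w in leaf_weights_per_tree)
    return loss + reg

J = xgb_objective([1.0, 2.0], [0.8, 2.1],
                  leaf_weights_per_tree=[[0.5, -0.2], [0.3]],
                  gam=0.1, lam=1.0)
print(round(J, 3))
```

Increasing gam or lam in this sketch raises J for the same fit, which is exactly how the regularization term steers XGBoost toward simpler trees.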

PSO
PSO, introduced by Kennedy and Eberhart (1995), is a metaheuristic algorithm that draws inspiration from nature. Each particle has an assigned position (x) and velocity (v). The initial position and velocity of each particle are defined by an n-dimensional random vector and a vector of zeros, respectively, with each dimension representing a specific decision parameter. Assume that particle i moves at velocity v_i^t at iteration t, as part of a swarm of size P, where i ∈ {1, 2, …, P}. The particle velocity at iteration t + 1 (v_i^(t+1)) can be obtained as (Le et al., 2019):

v_i^(t+1) = w v_i^t + c_1 r_1 (x_i* − x_i^t) + c_2 r_2 (g − x_i^t)    (7)

where x_i* represents the best location achieved by particle i up to the current iteration, g is the present best global solution among the swarm, w is the inertial weight, c_1 and c_2 are the cognitive and social parameters, respectively, and r_1 and r_2 are random numbers that vary from 0 to 1.
In the following step, given the updated velocity of particle i, the particle's position is adjusted as follows:

x_i^(t+1) = x_i^t + v_i^(t+1)    (8)

where x_i^t indicates the location of particle i at iteration t.
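The two update rules above can be sketched as a minimal PSO minimizing the sphere function; the swarm size, coefficients, and search bounds below are illustrative choices, not the settings used in this study:

```python
import random

# Minimal PSO sketch: velocity update (inertia + cognitive + social terms)
# followed by the position update, applied to a swarm minimizing sum(x^2).

def pso(fitness, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    x = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]   # zero initial velocities
    pbest = [xi[:] for xi in x]                     # personal bests x*_i
    pval = [fitness(xi) for xi in x]
    g = pbest[pval.index(min(pval))][:]             # global best g
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # v_i = w*v_i + c1*r1*(x*_i - x_i) + c2*r2*(g - x_i)
                v[i][d] = (w * v[i][d] + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (g[d] - x[i][d]))
                x[i][d] += v[i][d]                  # position update
            f = fitness(x[i])
            if f < pval[i]:
                pval[i], pbest[i] = f, x[i][:]
                if f < fitness(g):
                    g = x[i][:]
    return g, fitness(g)

best, best_f = pso(lambda p: sum(xi ** 2 for xi in p), dim=3)
print(best_f < 1e-3)
```

In the hybrid models, the "fitness" would instead be a validation error of XGBoost, and each particle dimension would encode one hyperparameter.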

RFO
The red fox is a widespread species that can live and survive in diverse climatic conditions. Based on the red fox's lifestyle and method of hunting, Połap and Woźniak (2021) developed a metaheuristic optimization algorithm named RFO. The RFO is initialized with a fixed number of foxes, with the coordinates of each fox defined as:

(X)_i = (x_0^i, x_1^i, …, x_{n−1}^i)    (9)

where n describes the number of coordinates. Each fox in each iteration is identified using the notation (X_j^i)^t, where i indicates the number of a fox in the population, j is the coordinate based on the dimension of the search area, and t is the number of iterations.
Assuming f ∈ R^n to be a criterion function of n variables, each point in the search space [a, b]^n, where a, b ∈ R, can be evaluated as in Eq. (10). (X)_i is considered the optimum solution once the value of f((X)_i) yields the global optimum. Every fox must participate in protecting the flock from threats. If adequate prey is not available in the area, the foxes must move to farther regions, which serves the exploration phase.
The information collected by a fox for the best location is shared with the others if a region with adequate prey is found.The fitness value is used to sort the results of the evaluated function.
The Euclidean distance (D) is used to find (X_best)^t:

D((X_i)^t, (X_best)^t) = || (X_i)^t − (X_best)^t ||    (11)

where ||.|| is the Euclidean norm. Thus, the foxes travel toward the optimum solution as follows:

(X_i)^t = (X_i)^t + a · sign((X_best)^t − (X_i)^t)    (12)

where a is a randomly chosen value in the range (0, D((X_i)^t, (X_best)^t)). The appropriate solution is then obtained using the updated positions of the candidates; otherwise, the former location is retained. The probability of a fox being observed while it moves to capture prey is modeled using a random variable λ:

move closer if λ > 0.75, stay and camouflage if λ ≤ 0.75    (13)

The vision radius (r) of the predator fox is another important term in the RFO (Eq. 14), in which α is the scaling coefficient that models changes in the fox's vision radius while it approaches prey and lies in the range [0, 0.2]; β is a random value between 0 and 1 that denotes the impact of adverse weather, such as rain or fog, on the fox's vision angle; and ϕ_0, which represents the fox's angle of sight, varies in the range [0, 2π]. The fox population approaching prey is then simulated by updating each coordinate through Eq. (15), where x_New is the updated coordinate, x_Old is the coordinate in the previous iteration, and ϕ_1, ϕ_2, …, ϕ_{n−1} are the foxes' angles of sight for the corresponding coordinates.
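A hedged sketch of the global relocation step of Eqs. (11)-(12), in which a fox moves toward the best individual; the exact sampling range of the step a follows our reading of Połap and Woźniak (2021) and is an assumption of this sketch:

```python
import math
import random

# Each fox takes a random step a in (0, D) toward the best fox, applied
# coordinate-wise through the sign of the direction to the best location.

def euclid(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def relocate(fox, best, rng):
    d = euclid(fox, best)                 # D((X_i)^t, (X_best)^t)
    a = rng.uniform(0.0, d)               # random step length in (0, D)
    sign = lambda s: (s > 0) - (s < 0)
    return [xi + a * sign(bi - xi) for xi, bi in zip(fox, best)]

rng = random.Random(42)
fox, best = [0.0, 0.0], [3.0, 4.0]
moved = relocate(fox, best, rng)
# The relocated fox should be closer to the best location than before.
print(euclid(moved, best) < euclid(fox, best))
```

The local hunting phase (vision radius and sight angles) would refine the position further; it is omitted here to keep the sketch focused on the relocation step.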

RPSO
RPSO is a variant of PSO introduced by Roder et al. (2020). To illustrate RPSO, we first recall the theory of relativity, one of the most significant theories in physics, proposed by Albert Einstein in 1916. One of its most famous results pertains to momentum, which can be calculated in three-dimensional coordinates as:

M = ς m v    (16)

where M is the momentum, and v = (v_x, v_y, v_z) and m are the velocity and mass of a body, respectively. ς is the Lorentz factor, which is defined as follows:

ς = 1 / √(1 − |v|² / c²)    (17)

where |v| denotes the magnitude of vector v, and c is the speed of light. Unlike PSO, RPSO considers the effects of the speed of light, an improved social behaviour of swarms, and the particles' mass on the optimum values of the solution. RPSO extends the three-dimensional momentum formula (Eq. 16) to n dimensions to compute the velocity of each particle in an n-dimensional search space (Eq. 18), where M(.) is the relativistic momentum extended to the n-dimensional space. The mass m of each particle is drawn from a uniform distribution on [0, 1] to compute its relativistic momentum (Roder et al., 2020).
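The Lorentz factor and the relativistic momentum of Eqs. (16)-(17) can be illustrated directly; normalizing c to 1 (i.e. treating c as the maximum admissible particle speed) is an assumption of this sketch:

```python
import math

# Lorentz factor: zeta = 1 / sqrt(1 - |v|^2 / c^2); it grows without bound
# as |v| approaches c, which damps further acceleration of fast particles.

def lorentz_factor(v, c=1.0):
    speed_sq = sum(vi ** 2 for vi in v)
    return 1.0 / math.sqrt(1.0 - speed_sq / c ** 2)

def relativistic_momentum(v, m, c=1.0):
    # M = zeta * m * v, applied component-wise in n dimensions
    z = lorentz_factor(v, c)
    return [z * m * vi for vi in v]

print(round(lorentz_factor([0.6, 0.0]), 4))   # |v| = 0.6c gives zeta = 1.25
```

In RPSO, this momentum replaces the plain velocity term of PSO, so particles approaching the speed limit become effectively "heavier" and harder to accelerate further.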

Model performance evaluation metrics
Three statistical indices are used to assess the efficiency of the proposed methods: root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R2). Lower MAE and RMSE values denote better results. R2, which indicates the fraction of the variation in the dependent variable that can be accounted for by the independent variables, lies in the range [0, 1]. R2 values closer to 1 indicate a stronger relationship between the predictions and observations. The three metrics are defined as follows:

RMSE = √( (1/N) Σ_{i=1}^{N} (d_se,i^pred − d_se,i^obs)² )

MAE = (1/N) Σ_{i=1}^{N} | d_se,i^pred − d_se,i^obs |

R2 = 1 − Σ_{i=1}^{N} (d_se,i^obs − d_se,i^pred)² / Σ_{i=1}^{N} (d_se,i^obs − d̄_se)²

where N is the number of data points, d_se,i^pred and d_se,i^obs are the estimated and observed scour depths for the ith data point, respectively, and d̄_se indicates the average of the observed scour depths.

SHAP

Lundberg and Lee (2017) developed the SHAP method based on game theory to interpret ML models. The underlying concept was originally used in game theory to assess a player's participation in a cooperative team game and distribute a fair reward among players based on their contributions (Shapley, 1953). SHAP determines the degree of contribution of each input to the model's output prediction. Moreover, it evaluates whether the effect of each input on the model's prediction is positive or negative. The use of SHAP to interpret ML models can help prevent these models from operating as black boxes. The model behaviour in both global (for the entire dataset) and local (for a single prediction) aspects is explained by assigning SHAP values to each feature (Zhang et al., 2022). The contribution of each input feature can be determined from the additive explanation

g(z) = κ_0 + Σ_{i=1}^{n} κ_i z_i

where κ_i indicates the contribution of the i-th feature (in terms of Shapley values) computed over G, the collection of all features, κ_0 is a base value, and z ∈ {0, 1}^n, where z_i = 1 when a feature is present and z_i = 0 otherwise. Figure 1 illustrates the process of developing hybrid ML models for scour depth prediction.
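The three metrics can be written as small plain-Python functions; the observation and prediction arrays below are illustrative toy values:

```python
import math

# Plain-Python versions of the three evaluation metrics used in this study.

def rmse(obs, pred):
    n = len(obs)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / n)

def mae(obs, pred):
    n = len(obs)
    return sum(abs(o - p) for o, p in zip(obs, pred)) / n

def r2(obs, pred):
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

obs = [0.10, 0.20, 0.30, 0.40]     # toy observed scour depths (m)
pred = [0.12, 0.18, 0.33, 0.39]    # toy predicted scour depths (m)
print(round(rmse(obs, pred), 4), round(mae(obs, pred), 4), round(r2(obs, pred), 4))
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two metrics can rank models differently on datasets with a few extreme scour depths.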

Model evaluation
The 841 experimental data points are split randomly into training (70%), validation (15%), and testing (15%) datasets. The training dataset is used to train the ML models and typically includes most of the data. The model uses the training data to minimize the differences between the observed and predicted values by adjusting its parameters (Brownlee, 2020). The validation dataset is used to assess the model efficiency during training and to obtain the best set of hyperparameters; it is kept distinct from the training dataset to prevent overfitting (Brownlee, 2020). The performance of the final (calibrated) ML model is assessed using the testing dataset, which is completely withheld from the model during the training and validation stages (Mangalathu et al., 2020). In this manner, the testing data enable an unbiased evaluation of the ML models. All three datasets should be used to ensure that the ML model is precise, robust, and not overfitted to the training data. In this study, the hyperparameters of the XGBoost method, i.e. the learning rate, max_delta_step, max_depth, min_child_weight, and n_estimators, are optimized using PSO, RFO, and RPSO. The default values of these parameters are used to initialize XGBoost. The merits of hybrid models include higher accuracy, flexibility, and robustness in handling high-dimensional data (Jalil et al., 2022). Tables 3 and 4 present the optimal values of the hyperparameters obtained by the different optimization approaches for the dimensional and dimensionless datasets, respectively.
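The 70/15/15 random split can be sketched as follows; the shuffling seed is an arbitrary choice for reproducibility, not the one used in the study:

```python
import random

# Randomly partition n indices into 70% training, 15% validation, and the
# remainder (about 15%) for testing.

def split_indices(n, train=0.70, val=0.15, seed=0):
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train)
    n_val = int(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

tr, va, te = split_indices(841)
print(len(tr), len(va), len(te))
```

For 841 points this yields 588 training, 126 validation, and 127 testing samples, with the three subsets disjoint by construction.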
First, the original variables in Eq. (1) are used in the developed models to estimate the scour depth. Figure 2 presents the scatter plot of the scour depth predictions from XGBoost, PSO-XGBoost, RFO-XGBoost, and RPSO-XGBoost versus the observations for the training, validation, and testing stages. The scour depth estimations from all models are close to the 45° line, which shows that the proposed models can efficiently predict the scour depth. However, the predictions for larger scour depths deviate from the 1:1 line in the validation and testing steps, owing to the lower frequency of data in this range. The scour predictions in the training and validation phases are better than those in the testing step because the training and validation data are used in the learning process.
Table 5 summarizes the performance metrics of the proposed models (i.e. RMSE, MAE, and R2) for the training, validation, and testing stages.
Subsequently, the dimensionless parameters in Eq. (2) are used in the models to estimate d_se/Y. Figure 3 shows that the scour depth predictions from all proposed models are concentrated around the 45° line, indicating their satisfactory accuracy. A comparison of Figs. 2 and 3 demonstrates that all models produce more reliable results when trained on dimensional data (Eq. (1)) instead of dimensionless data (Eq. (2)). A possible explanation is that the use of raw variables rather than combinations of variables enhances the flexibility of the models in simulating the highly nonlinear relationship between the inputs and output. This result is in line with those of Bateni et al. (2007a, 2007b) and Zounemat-Kermani et al. (2009), who reported that models fed with raw (dimensional) variables outperform those fed with dimensionless groups. Therefore, the proposed models can be ranked in decreasing order of their accuracy in predicting d_se/Y as RPSO-XGBoost, RFO-XGBoost, PSO-XGBoost, and XGBoost. This finding is consistent with the previous ranking of the methods for scour depth prediction using the dimensional variables in Eq. (1).
The superiority of RPSO-XGBoost over PSO-XGBoost and RFO-XGBoost is attributed to several factors. First, the RPSO technique incorporates relativistic effects, allowing particles to move at speeds that can approach the speed of light (Roder et al., 2020). These relativistic effects enable RPSO-XGBoost to explore the search space extensively, conducting a more comprehensive and thorough search for the optimal hyperparameters (Roder et al., 2020). Second, the relativistic effects in RPSO facilitate more efficient convergence of the particles towards the global optimum (Roder et al., 2020). By navigating the complex search space effectively, RPSO-XGBoost can reveal significant patterns and dependencies that influence the scour depth, resulting in highly accurate predictions.
Based on the testing data, Table 7 compares the performances of the proposed models and 28 existing techniques. The numbers in parentheses show the percentage increase in the RMSE/MAE and decrease in R2 compared with those of RPSO-XGBoost, which is the highest-performing approach developed in this study. Because the testing data are withheld from the model in the training and validation steps, they can be used as an independent data source to perform a fair evaluation of the final models. The results show that the four proposed methods outperform all the existing approaches. Among the 28 existing methods, the model proposed by Shamshirband et al. (2020) achieves the best scour depth predictions, with RMSE, MAE, and R2 of 0.0654 m, 0.0470 m, and 0.8268, respectively. Shamshirband et al. (2020) used the PSO algorithm to develop an explicit equation for estimating the scour depth around circular bridge piers. Their equation was developed by minimizing the discrepancy between the predicted and observed scour depth values via the PSO algorithm (Shamshirband et al., 2020). The PSO technique enhances the capability of their expression to capture the inherent relationships between the input and output data, leading to improved prediction accuracy. Furthermore, the optimization process performed by the PSO algorithm facilitates the generalizability of the equation based on the available dataset (Taormina & Chau, 2015). As a result, their derived equation has the potential to be applied to a diverse range of scenarios beyond the specific dataset utilized for its development. The ability to generalize effectively is of considerable significance for practical applications.
Most of the other studies in Table 7 used regression techniques to derive equations for predicting d_se. Traditional regression-based approaches cannot robustly capture the nonlinearity and complexity of the relationship between the scour depth and its influential variables (Ebtehaj et al., 2017).
The RMSE (MAE) of the scour depth estimates from RPSO-XGBoost is 39.14% (55.95%) lower than that of Shamshirband et al. (2020). The model proposed by Blench (1969) produces the least accurate results, with RMSE = 0.7712 m, MAE = 0.7463 m, and R2 = 0.1318. On average, RPSO-XGBoost improves the RMSE and MAE of the scour depth predictions by 64.91% and 70.94% compared with those of the 28 existing approaches. Figure 4 visually compares the scour depth estimates from the four proposed models with those obtained using the model of Shamshirband et al. (2020), the highest-performing method among the 28 considered techniques. Most of the scour depth predictions obtained using the proposed models are closer to the 1:1 line, which demonstrates their superiority over the model proposed by Shamshirband et al. (2020).
The violin plot in Fig. 5a shows the distribution of the scour depth estimates from RPSO-XGBoost (the best proposed model) in comparison with those obtained using two well-known models (i.e. HEC-18 and Sheppard et al. (2014)) and the best method among the 28 considered existing methods (i.e. Shamshirband et al. (2020)). A similar comparison is made for d_se/Y in Fig. 5b. In general, violin plots combine a kernel density plot with a box plot. Unlike box plots, violin plots can show both the summary statistics and the density of the data, thereby explaining the variability of the data. The small white dot in the violin plots in Fig. 5 represents the median of the data. The interquartile range is presented by a thick black line, and the thin black line indicates the data beyond the interquartile range, excluding outliers. The shape of the data distribution is shown by the kernel density approximation on the two sides of the black line. A wider shape of the violin plot around the median denotes a high concentration of data in this region, and the tapered ends of the violin plot indicate a lower concentration of data in those areas.
Figure 5 shows that the median and interquartile values of the scour depths predicted by RPSO-XGBoost are close to the observations, highlighting its high predictive accuracy. For both dimensional and dimensionless data, the scour depth estimates from all methods are clustered around the median, indicating a higher probability in this region. In the case of dimensional data, the distribution of the scour depth predictions from RPSO-XGBoost is similar to that of the measurements. In addition, the interquartile range, range of data, and median of the predictions from RPSO-XGBoost are closer to the observations for both dimensional and dimensionless data, compared with those of Shamshirband et al. (2020), Sheppard et al. (2014), and HEC-18. A comparison of Figure 5(a,b) confirms the findings of the previous analyses demonstrating the superiority of the scour depth predictions obtained using the dimensional variables in Eq. (1) over those obtained using the dimensionless variables in Eq. (2). The upper end of the violin plots for the d_se/Y predictions (Figure 5(b)) extends higher than that of the d_se/Y measurements because the models overestimate d_se/Y in the case of large values.
To further investigate the ability of the proposed methods, the Taylor diagram for scour depth measurements and estimates is derived, as shown in Figure 6. Taylor diagrams are valuable tools for visualizing results, as they graphically summarize several evaluation metrics, i.e. the SD, the correlation coefficient between the predicted and measured values, and the centered RMSE, in a single plot. Thus, Taylor diagrams can help obtain reliable conclusions regarding the efficiency of the proposed models. Notably, the statistical indicators shown in the Taylor diagram in Figure 6 correspond to the testing data. When using the dimensional dataset, the proposed XGBoost, PSO-XGBoost, and RPSO-XGBoost models underestimate the range of scour depth variation, whereas RFO-XGBoost tends to overestimate it. When using dimensionless data, all the proposed methods overestimate the variations in the scour depth, as indicated by their higher SD compared with the observations (i.e. the values lie beyond the red dashed curve). Additionally, Figure 6 provides further evidence of the higher accuracy of the proposed models using dimensional variables compared with those using dimensionless variables, as the correlation coefficient of the former exceeds 0.95 whereas that of the latter ranges between 0.9 and 0.95. RPSO-XGBoost yields better results for both input configurations, given its higher correlation with the observations and lower RMSE compared with those of the other three methods. Furthermore, the variability of its predictions is closest to that of the observations.
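A single Taylor-diagram point can encode all three metrics because they are tied together by a law-of-cosines identity, E'^2 = sigma_o^2 + sigma_p^2 - 2 sigma_o sigma_p R, where E' is the centered RMSE, sigma_o and sigma_p are the SDs of observations and predictions, and R is their correlation. A minimal sketch with illustrative numbers (not the study's data):

```python
import math

def taylor_stats(obs, pred):
    """Standard deviations, correlation coefficient, and centered RMSE:
    the three quantities summarized by one Taylor-diagram point."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    so = math.sqrt(sum((o - mo) ** 2 for o in obs) / n)
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred) / n)
    r = sum((o - mo) * (p - mp) for o, p in zip(obs, pred)) / (n * so * sp)
    crmse = math.sqrt(
        sum(((p - mp) - (o - mo)) ** 2 for o, p in zip(obs, pred)) / n)
    return so, sp, r, crmse

# Illustrative observed and predicted scour depths (m), not the study's data.
obs = [0.5, 1.0, 1.5, 2.0, 2.5]
pred = [0.6, 0.9, 1.6, 1.9, 2.6]
so, sp, r, crmse = taylor_stats(obs, pred)
# Law-of-cosines identity underlying the Taylor diagram:
assert abs(crmse ** 2 - (so ** 2 + sp ** 2 - 2 * so * sp * r)) < 1e-12
```

In the diagram, a predicted-vs-observed SD ratio above 1 (sp > so) corresponds to the points lying beyond the red dashed curve mentioned above.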

Model interpretation
A SHAP analysis is performed to identify the variables with the most notable contributions to the scour depth predictions. As mentioned previously, SHAP values not only allow us to determine how much each variable contributes to the prediction but also to identify and visualize the significant relationships in the model. SHAP values can be visualized using various approaches, such as waterfall plots, force plots, decision plots, mean SHAP plots, and beeswarm plots. A beeswarm plot aggregates the SHAP values of all observations so that all SHAP values can be visualized at once. Figure 8(a,b) show the beeswarm plots based on RPSO-XGBoost (the most accurate model) for the dimensional and dimensionless variables, respectively.
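The feature ordering on a beeswarm plot's y-axis is simply the mean absolute SHAP value per feature. A sketch of that ranking step, using a made-up SHAP-value matrix for illustration (these are not values from the fitted RPSO-XGBoost model):

```python
def rank_features(shap_values, names):
    """Order features by mean |SHAP| -- the importance ordering
    used on the y-axis of a beeswarm plot."""
    n = len(shap_values)
    mean_abs = [sum(abs(row[j]) for row in shap_values) / n
                for j in range(len(names))]
    return sorted(zip(names, mean_abs), key=lambda t: -t[1])

# Hypothetical SHAP values for three predictions (columns: D, Y, V).
sv = [[0.40, -0.10, 0.05],
      [0.35, 0.20, -0.02],
      [-0.30, 0.15, 0.04]]
print(rank_features(sv, ["D", "Y", "V"]))
```

Each row of the matrix is one prediction; a positive entry pushes that prediction above the model's base value, a negative entry pushes it below.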
In this plot, the variables are ordered by their importance on the y-axis. The x-axis denotes the SHAP value, and the colour of each point shows the actual value of that feature (i.e. not the SHAP value). Thicker parts of the beeswarm plot imply a higher density of predictions in that region. Figure 8(a) indicates that for dimensional data, the pier diameter (D) has the most notable effect on scour depth prediction. The second most influential variable is the flow depth (Y), followed by the flow velocity (V), median grain size (d_50), and critical velocity of sediments (V_c). This ranking is consistent with the study of Bateni et al. (2007a), who reported that the pier diameter and critical velocity have the greatest and least impact, respectively, on the estimation of scour depth using ANN. Larger values of the pier diameter correspond to larger SHAP values, which in turn yield higher scour depth estimates. A similar positive relationship is observed for the flow depth and flow velocity. In contrast, a clear inverse relationship is found between the scour depth and the median grain size and critical velocity, both of which are associated with sediment characteristics. These relationships are compatible with the physical nature of the scouring process, as a larger pier diameter, flow depth, and flow velocity result in higher scour depths, and piers located in riverbeds with finer sediments and sediments with a lower critical velocity typically experience larger scour depths.
To ensure the physical consistency of scour depth estimates from ML models, the following multifaceted approach is essential (Najafzadeh & Oliveto, 2020): (1) Compare predictions from ML models with those obtained from established physical models and/or widely used empirical equations to verify the physical consistency of the results (Kollet et al., 2017). In this study, we met this criterion by comparing the scour depth estimations from XGBoost, PSO-XGBoost, RFO-XGBoost, and RPSO-XGBoost with those of 28 existing equations in the literature (see Table 7). (2) Identify the importance of input features to ensure alignment with the governing physical principles. Herein, the SHAP method is used to find the relative significance of each input feature for the scour depth predictions; the results of SHAP are consistent with those of Bateni et al. (2007a, 2007b) and Shamshirband et al. (2020). (3) Evaluate the performance of ML models using unseen data to provide additional evidence of their physical consistency and generalization capability. This study assessed the feasibility of all ML models via the testing dataset, which was kept hidden from the models during the training and validation phases. Hence, this study considered all the criteria to ensure the physical consistency of our results.

Conclusions
Three novel hybrid ML methods, PSO-XGBoost, RFO-XGBoost, and RPSO-XGBoost, are applied to estimate the equilibrium scour depth around circular bridge piers. A comprehensive laboratory and field dataset containing 841 data points from 35 studies is used to train, validate, and test the proposed ML methods. The hyperparameters of the XGBoost method are tuned using three optimization techniques (PSO, RFO, and RPSO). The proposed models are trained using both dimensional and dimensionless datasets. The results are evaluated using three statistical metrics, i.e. the RMSE, MAE, and R2. All three proposed methods accurately predict the scour depth. The higher accuracy of the hybrid models over XGBoost shows that the three optimization techniques can efficiently tune the hyperparameters of XGBoost and enhance its accuracy in scour depth prediction.
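As a sketch of how PSO-style optimizers tune hyperparameters, the minimal swarm below minimizes a toy surrogate of validation error over a (learning-rate, max-depth) box. The loss function, bounds, and coefficient values are illustrative assumptions, not the study's configuration; in the study's setting the loss would be the validation RMSE of a trained XGBoost model.

```python
import random

def pso_minimize(loss, bounds, n_particles=12, iters=60, seed=0):
    """Minimal particle swarm optimizer (a sketch, not the study's
    PSO/RFO/RPSO variants): particles track personal and global bests
    and are pulled toward both at each iteration."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration coefficients (assumed)
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                # Clip each coordinate back into the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], bounds[d][0]), bounds[d][1])
            f = loss(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f

# Toy surrogate with minimum at learning_rate = 0.1, max_depth = 6.
loss = lambda p: (p[0] - 0.1) ** 2 + (p[1] - 6.0) ** 2
best, best_f = pso_minimize(loss, [(0.01, 0.5), (2.0, 10.0)])
```

RFO and RPSO differ from this baseline in how candidate positions are updated, but the outer loop (evaluate a candidate hyperparameter set, keep the best) is the same.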
For models trained on dimensional data, RPSO-XGBoost yields the best scour depth predictions. The results of the proposed methods are compared with those of 28 existing methods, and their superiority over the existing techniques is demonstrated.

Figure 1. Process flow of the proposed hybrid ML models and SHAP for estimating the scour depth near circular bridge piers.
Figure 3 compares the d_se/Y estimates from XGBoost, PSO-XGBoost, RFO-XGBoost, and RPSO-XGBoost with observations for the training, validation, and testing stages.

Figure 4. Visual comparison of the predictions obtained using the proposed methods and the method proposed by Shamshirband et al. (2020), based on testing data.

Figure 5. Violin plots comparing the distributions of (top) dimensional-dataset and (bottom) dimensionless-dataset estimates from different approaches with measurements.
Figure 7(a,b) show the RMSEs of the d_se and d_se/Y predictions from RPSO-XGBoost (the most accurate method) versus the number of iterations, respectively. As the number of iterations increases, the RMSEs of the d_se and d_se/Y predictions from RPSO-XGBoost decrease rapidly to their asymptotes. A learning curve is considered a good fit (i.e. neither underfit nor overfit) if it satisfies two conditions: (1) the RMSEs in the training phase decrease to a final stable value, and (2) the RMSEs in the validation phase follow the trend of the training phase with minor differences. According to Figure 7, the learning behaviour of the proposed model (RPSO-XGBoost) can be considered a good fit.
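The two good-fit conditions can be written as a simple check. The tolerances and RMSE curves below are illustrative assumptions, not the study's values:

```python
def is_good_fit(train_rmse, val_rmse, plateau_tol=0.01, gap_tol=0.1):
    """Encodes the two good-fit conditions: (1) the training RMSE has
    stabilized near its final value, and (2) the validation RMSE tracks
    the training curve with only a minor gap."""
    plateaued = abs(train_rmse[-1] - train_rmse[-3]) < plateau_tol
    small_gap = (val_rmse[-1] - train_rmse[-1]) < gap_tol
    decreasing = val_rmse[-1] <= val_rmse[0]
    return plateaued and small_gap and decreasing

# Illustrative RMSE curves over training iterations (not the study's values).
train = [0.90, 0.50, 0.32, 0.25, 0.22, 0.21, 0.205, 0.203]
val = [0.95, 0.60, 0.40, 0.33, 0.30, 0.29, 0.285, 0.284]
print(is_good_fit(train, val))
```

A validation curve that turns upward while the training curve keeps falling would fail the gap condition, signalling overfitting rather than a good fit.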

Figure 6. Taylor diagrams for scour depth estimates from various approaches and observations.

Figure 7. Variations in the RMSE of (left) d_se and (right) d_se/Y predictions from RPSO-XGBoost (the best model in this study) versus the number of iterations in the training and validation stages.
As outlined in Figure 8(b), the most and least significant dimensionless variables for predicting d_se/Y are D/Y and Re, respectively. The remaining dimensionless variables, in decreasing order of importance, are V/V_c, d_50/Y, and Fr. Higher values of D/Y and V/V_c lead to higher d_se/Y predictions. Because of certain outliers in the SHAP values of d_50/Y, Fr, and Re, no clear relationship can be established between these variables and the d_se/Y predictions.

Figure 8. SHAP summary plots based on RPSO-XGBoost to rank the significance of each feature in estimating (a) d_se and (b) d_se/Y.
pier shape factor; D_p is the projected width of the pier. 20. Melville and Sutherland (1988): d_se/D = K_d K_I K_y K_α K_s, where K_d = sediment size coefficient, K_I = flow intensity coefficient, K_y = flow shallowness coefficient, K_s = pier shape coefficient, and K_α = pier alignment coefficient. 21. Ansari and Qadar (1994): d_se = 0.024 D^3 for D < 7.2 ft; d_se = 2.238 D^0.4 for D > 7.2 ft.

K_1 = depth-size coefficient; K_I = flow intensity coefficient; K_d = sediment size coefficient; K_s = pier shape coefficient; K_θ = pier alignment coefficient. K_1 = pier shape factor; K_2 = flow angle of attack factor; K_3 = bed condition factor. 26. Sheppard et al.

Table 2. Statistical indices of variables.

Table 3. Initial hyperparameters of the XGBoost model and values optimized using the PSO, RFO, and RPSO techniques for the dimensional dataset.

Table 4. Initial hyperparameters of the XGBoost model and values optimized using the PSO, RFO, and RPSO techniques for the dimensionless dataset.

Table 5. Statistical indicators of scour depth predictions made by the proposed machine learning techniques using the dimensional variables in Eq. (1).

Table 6. Statistical indices of d_se/Y predictions from the proposed methods using the dimensionless variables in Eq. (2).

RPSO-XGBoost again performs best, with RMSE = 0.2065, MAE = 0.1354, and R2 = 0.9714 for testing; for this phase, its RMSE is 4.36% and 5.48% lower than those of RFO-XGBoost and PSO-XGBoost, respectively. XGBoost has the weakest performance among all proposed methods, with RMSE = 0.1825, MAE = 0.0763, and R2 = 0.9735 in the training phase; RMSE = 0.4066, MAE = 0.1832, and R2 = 0.8889 in the validation phase; and RMSE = 0.3435, MAE = 0.1712, and R2 = 0.7824 in the testing phase. Table 6 and Figure 3 also highlight the superiority of all hybrid models over XGBoost.
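For reference, the three metrics reported in these tables can be computed as follows; the observed and predicted values below are illustrative numbers only, not entries from the study's dataset:

```python
import math

def rmse_mae_r2(obs, pred):
    """Root mean square error, mean absolute error, and coefficient of
    determination -- the three evaluation metrics used in the study."""
    n = len(obs)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / n)
    mae = sum(abs(o - p) for o, p in zip(obs, pred)) / n
    mo = sum(obs) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mo) ** 2 for o in obs)
    r2 = 1 - ss_res / ss_tot
    return rmse, mae, r2

# Illustrative observed and predicted scour depths (m).
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
rmse, mae, r2 = rmse_mae_r2(obs, pred)
```

Lower RMSE and MAE and an R2 closer to 1 indicate a better fit, which is how the hybrid models are ranked throughout Tables 5 and 6.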