Novel hybrids of adaptive neuro-fuzzy inference system (ANFIS) with several metaheuristic algorithms for spatial susceptibility assessment of seismic-induced landslide

Abstract Strong ground motions usually trigger lots of slope failures in the affected area. In this work, we analyse the occurrence likelihood of earthquake-triggered landslide by employing the ensembles of adaptive neuro-fuzzy inference systems (ANFIS) with four well-known metaheuristics techniques, namely particle swarm optimization (PSO), genetic algorithm (GA), ant colony optimization (ACO), and differential evolution (DE) algorithms. Twelve landslide conditioning factors namely, elevation, slope degree, lithology, peak ground acceleration (PGA), stream power index (SPI), topographic wetness index (TWI), distance to road, distance to river, distance to fault, normalized difference vegetation index (NDVI), slope aspect, and plan curvature are considered within the geographic information system (GIS) to produce the required spatial database. In this paper, frequency ratio (FR) model is used to evaluate the spatial interaction between the landslides and conditioning factors. Meantime, among a total of 458 marked earthquake-induced landslides, 366 (80%) are specified to the learning process, and the remaining 92 (20%) landslides are used to evaluate the accuracy of applied models. The landslide susceptibility maps are generated in the GIS environment. Three accuracy criteria of mean square error (MSE), root mean square error (RMSE), and area under the receiving operating characteristic curve (AUROC) are used to develop a ranking system for comparing the integrity of the designed models. The total ranking scores (TRSs) of 15, 8, 10, and 18, respectively, obtained for PSO-ANFIS, GA-ANFIS, ACO-ANFIS, and DE-ANFIS revealed the superiority of the DE algorithm compared to other metaheuristics techniques. Also, the DE-ANFIS emerged as the fastest ensemble, due to the highest convergence speed obtained for this model.


Introduction
As a widespread disaster, landslides cause severe loss of property and human life worldwide. China has been faced with numerous landslides, mostly due to the strong ground motions which bring huge slope failures. Ludian earthquake, for example, occurred at 4.30pm on 3 August 2014, at a focal depth of 12 km (Hong et al. 2016b). The intensity was reported as M6.5 with an epicentre located at 27 14 0 42 00 N and 103 25 0 37 00 E. The meizoseismal area's peak ground acceleration (PGA) reached 948.5 cm/s 2 , according to the U.S Geological Survey (USGS). Also, the depth of the aftershocks ranged between 3 km and 15 km (Cheng et al. 2015). This earthquake killed 617 people and caused 3143 injuries. Moreover, thousands of houses ruined or seriously damaged by the Ludian earthquake. Even though various methods have been proposed to investigate the seismic landslide, it has remained a serious problem for scientists. Up to now, different landslide assessment methods have been developed to appraise the risk of landslide occurrence in a particular study area (Moayedi et al. 2011;Pham et al. 2018;Yan et al. 2018;Nguyen et al. 2019;Xi et al. 2019;Yang et al. 2019;Yuan and Moayedi 2019;Moayedi et al. 2019b). Xie et al. (2015) provided the landslide susceptibility map of the Ludian after the Ludian 2014 earthquake by using random forest model. With approximate accuracy of 92%, they concluded that 80% of the marked landslides are in the areas that have been categorized as moderate and high landslide risk.
Lots of researchers have investigated the possibility of landslide occurrence by providing landslide susceptibility maps or generating formulas to calculate the landslide hazard index (Pradhan 2010;Pourghasemi et al. 2013;Regmi et al. 2014;Youssef et al. 2015;Vakhshoori and Zare 2016;Hong et al. 2016a). These studies have mainly considered the altitude, aspect, soil type, slope, vegetation density, lithology, distance from linear phenomena like faults, rivers and roads, stream power index (SPI), etc. as the landslide conditioning factors. Moreover, plenty of studies have outlined the landslide risk evaluation by means of simple predictive methods like statistical index (SI), frequency ratio (FR), weights of evidence (WoE) or regression-based approaches (Demir et al. 2015;Youssef et al. 2015;Chen et al. 2016). Chen et al. (2018a) proposed three ensembles of evidential belief function (EBF), SI, and WoE approaches with kernel logistic regression (KLR) for landslide hazard zonation at Chongren of China. Referring to the calculated area under the curve (AUC) as well as the prediction rate (AUC_P) and success rate (AUC_T) criteria, it was demonstrated that EBF-KLR (with AUC ¼ 0.753, AUC_P ¼ 0.7615, and AUC_T ¼ 0.8511) presents the most reliable prediction, followed by SI-KLR and WoE-KLR hybrid methods.
Also, looking for more reliable approximations, soft artificial intelligence methods have been used and emerged as effective approaches for landslide risk analyse (Yao et al. 2008;Zare et al. 2013;Pham et al. 2018;Shirzadi et al. 2018;Binh Thai et al. 2018a;Dieu Tien et al. 2018a;Chen et al. 2018b;Polykretis et al. 2019;Tian et al. 2019;Chen et al. 2019d). In a research carried out by Lee et al. (2017), support vector machine (SVM) was successfully used to map the landslide hazard in PyeongChang and Inje districts of Gangwon Province, Korea. Referring to the obtained accuracies 81.36% and 77.49%, they found that the SVM can be effectively used for this aim. Also, numerous comparative attempts have dealt with evaluating the applicability of different developed models. Chen et al. (2017) examined the integrity of three stateof-the-art intelligent models namely, adaptive neuro-fuzzy inference system combined with frequency ratio (ANFIS-FR), generalized additive model (GAM), and SVM in landslide hazard mapping in Hanyuan County, China. This study showed that SVM outperforms ANFIS-FR and GAM with respective accuracies of 87.5%, 85.1%, and 84.6%. Bui et al. (2016) applied five predictive tools including SVMs, multi-layer perceptron neural networks (MLP Neural Nets), radial basis function neural networks (RBF Neural Nets), KLR, and logistic model trees (LMT) to model the risk of landslide in Son La hydropower basin, Vietnam. To select the most compatible subset of conditioning factors, they used a k-fold cross-validation technique. The results of this paper revealed that MLP and SVR could perform more efficiently than RBF, KLR, and LMT.
Likewise, robust evolutionary techniques have been effectively applied to improve the applicability of typical intelligent models (Bui et al. 2017b;Moayedi et al. 2019b;Gao et al. 2018a;Binh Thai et al. 2018b;Dieu Tien et al. 2018b;Gao et al. 2018b;Tien Bui et al. 2018b;Binh Thai et al. 2018c;Gao et al. 2018cGao et al. , 2018dGao et al. , 2019Jaafari et al. 2019;Chen et al. 2019b). In this sense, Moayedi et al. (2019b), compared the efficiency of multilayer perceptron neural network against its improved version, that is, artificial neural network synthesized with particle swarm optimization algorithm (PSO-ANN), in landslide susceptibility modelling at Kermanshah province, west of Iran. They considered 12 landslide conditioning factors including elevation, plan curvature, lithology slope aspect, slope degree, soil type, distance to road, distance to river, distance to fault, SPI, topographic wetness index (TWI), as the landslide-independent parameters. As the main result of this study, they found particle swarm optimization (PSO) performs as a promising optimization algorithm for landslide hazard assessment. Li et al. (2017) employed a new genetic algorithm (GA) for optimal selection of landslide-related factors in Wenchuan, Ludshan, and Ludian areas, China. Considering the acquired AUCs of 93.48% and 93.47%, 83.48% and 83.45%, and 82.28% and 82.21%, respectively in Wenchuan, Lushan, and Ludian districts, the proposed method outperformed the typical GA. Moreover, differential evolution (DE) algorithm was satisfactorily used by Bui et al. (2016) to find the optimal tuning parameters of least-squares support vector machines (LSSVM) model in hazard assessment of rainfall-induced shallow landslide in central Vietnam. Their results revealed the superiority of the proposed models (with the approximate accuracy of 82%) in comparison with SVMs, multi-layer perceptron neural networks (MLP), and J48 decision trees. The applicability of three optimization methods namely, GA, DE, and PSO synthesized with the adaptive neuro-fuzzy inference system (ANFIS) for landslide spatial modelling was evaluated by Chen et al. (2017). Referring to the acquired AUC values, they introduced the ANFIS-DE (AUC ¼ 0.844) as the most accurate ensemble data mining technique, compared to ANFIS-GA (AUC ¼ 0.821), and ANFIS-PSO (AUC ¼ 0.780).
Besides, similar optimization methods have been successfully used to generate the flood susceptibility map (Ahmadlou et al. 2018;Hong et al. 2018;Termeh et al. 2018;Tien Bui et al. 2018a) or the groundwater spring potential for different areas (Tien Bui et al. 2018a;Chen et al. 2019aChen et al. , 2019c through discerning the spatial relationship between the proposed phenomena and related factors. This paper investigates the efficiency of PSO, GA, ant colony optimization (ACO), and DE applied to ANFIS for spatial prediction of the earthquake-induced landslide in Ludian County, China. To this end, after providing the required dataset in the geographic information system (GIS), the proposed models were implemented, and the landslide susceptibility maps were developed. To evaluate the reliability of the results, we used the area under the receiving operating characteristic (ROC) curve (AUROC), mean square error (MSE), and root mean square error (RMSE) of the results.

Study area
The study area is located in Ludian district, Yunnan province which is one of the southernmost provinces of China ( Figure 1). The area of the selected region is roughly 14877 km 2 and lies within the east longitude 103 09 0 to 103 40 0 and north latitude 26 59 0 to 27 32 0 . Topographically, the altitude ranges from 568 to 3356 m, under the monsoon climate. In addition, Dongchuan and Jialing Jian group, Dolomite, Quartzose, sandstone, and Chert Limestone are common rocks in this region. The slope varies in the approximate range of 0-90 so that more than half of this area has the slope lower than 15 . Meanwhile, the average annual values of temperature and rainfall are approximately 12.1 C and 924 mm. As Figure 1 illustrates, most of the landslides have occurred along the territorial roads and rivers as well as the recognized faults.
According to Zhou et al. (2016b), which investigated different aspects of the landslides triggered by the Ludian earthquake, this catastrophe occurred at the Baogunao-Xiaohe fault which is located in the Anninghe-Zemuhe-Xiaojiang fault zone. In their study, two useful statistical indices of landslide area ratio (LAR) and landslide density (LD) were calculated for a part of the study area. In this sense, LAR and LD were obtained 2.62% and 2.48 landslides/km 2 . Also, the area of the discovered landslides varies from 76 m 2 to 0.45 km 2 . About the type of the marked landslides, they are mainly classified as rock falls and shallow, disrupted landslides from steep slopes. Around 20% of the mapped landslides have an area greater than 10,000 m 2 , where 21% of them cover an area greater than 50,000 m 2 . Moreover, around 6% of the landslides cover less than 1000 m 2 in area. More information are available in Zhou et al. (2016b) and Zhou et al. (2016a).
Due to the importance of the seismology and structure geology of the affected area in evaluation of the earthquake effect (Rezaie and Panahi 2015), these maps are presented in Figures 2 and 3(a), respectively. The PGA map of the Ludian earthquake is also shown in Figure 3

Data preparation and spatial interaction between the landslide and related factors
Providing the landslide inventory map is a crucial step in any kind of landslide hazard assessment (Ercanoglu and Gokceoglu 2004). In this work, a total of 458 landslide points were identified within the Ludian County utilizing the previous recorded, aerial photos interpretation, and field monitoring using GPS in 1:25,000 scale (see Figure 1). Then, Similar to other studies Regmi et al. 2016;Xie et al. 2017;Moayedi et al. 2019b), the landslide locations were randomly divided into two categories containing 366 landslides (80% of whole landslide locations) for training the proposed ensembles and 92 landslides (20% of whole landslide locations) for validating their performance. Remarkably, these points were assigned a value of '1'. Next, the same number of non-landslide points was randomly produced over the areas devoid of landslide. They were then divided into the training and testing samples as well, and were assigned a value of '0' Chen et al. 2019b).
Considering the landslide-causative parameters of the intended district and also regarding the occurred earthquake, twelve of geological and hydrological landsliderelated factors namely, elevation, slope degree, lithology, PGA, SPI, TWI, distance to road, distance to river, distance to fault, normalized difference vegetation index (NDVI), slope aspect, and plan curvature were considered within the GIS. After the preparation process, the mentioned GIS rasters were generated from their initial formats (i.e. polygons, polylines, points, and tabular data).  Similar to Dou et al. (2019), most of the considered conditioning layers were first classified using 'Natural Breaks'. Table 1 details the spatial distribution of the landslides based on the sub-classes of each effective factor. In the last column of Table 1, the FR of each sub-class is featured indicating the proportion of the observed slope failures over the percentage of domain occupied by the proposed sub-class. Note that, a high correlation is represented by FR value more than 1 and vice versa (Oh et al. 2011). The slope degree, slope aspect and plan curvature GIS layers were derived from the digital elevation model (DEM) of the Ludian County acquired from Landsat 8 imagery. As stated before, elevation varies from 568 to 3356 m so that approximately 82% of the studied area has an altitude between 1300 and 2300. As Table 1 reports, the largest FR for the elevation sub-classes is 2.18 which is obtained for the altitude 1312-1755 m. Slope degree ranges from 0 to 87 degrees. In this factor, the highest and lowest values of FR were 1.32 and 0.87, respectively for the slope groups of [62-87] and . From this, it can be concluded that the slope degree is positively correlated with the number of slope failures. Based on the lithology map of Ludian (after Li et al. (2017)), various types of bedrocks suchlike Dongchuan and Jialing Jian group, Dolomite, Quartzose, sandstone, and Chert Limestone form the geology of this area. Accordingly, 46.52% and 13.69% of the slope failures have been observed on the bedrocks labelled as 'Shale and sandstone (Upper Permian)' and 'Shale', respectively. The seismic parameter of PGA was used to consider the effect of the Ludian earthquake. Note that, PGA is the most commonly used criterion to determine the shaking intensity of damaged area (Nath 2004). For this study, earthquake records were used to draw the PGA map in the GIS environment. Not surprisingly, the more PGA values, the higher the calculated FR. In other words, as we get close to the epicentre, the FR raises which reaches 1.37 for the range of 500-1100 gal. Also, to examine the effect of the geo-morphometric conditions, we calculated two widely-used secondary factors of SPI and TWI. The SPI and TWI parameters respectively symbolize the erosion power of streams and the amount of accumulated water in a place, which are expressed by Equations 1 and 2 (BEVEN and Kirkby 1979; Moore et al. 1991): where a stands for specific catchment, and b denotes the gradient. Moreover, to investigate the effect of linear phenomena (e.g. the rivers), three factors namely, distance from faults, distance from rivers, and distance from roads are used as the independent landslide parameters in the current study. As Table 1 denotes, the distance of the farthest point from the fault, river, and road lineaments equals to 19,185 m, 16,986 m, and 7566 m. Referring to Figure 1, most of the landslides have occurred along the faults, rivers and territorial roads. This claim can also be supported by the respective FRs of 1.28, 1.42, and 1.25 acquired for the first subclasses of the mentioned landslide conditioning factors. As for NDVI factor which indicates a quantitative evaluation of the vegetation growth and biomass (Yilmaz Table 1. Spatial interaction between the landslide and its causative factors.  . Roughly 47% of the study area is categorized as Flat which contains the same share of the landslide points. Analysis of FR between earthquake-induced landslides and different aspects of slope demonstrates that North-West (FR ¼ 1.12) and North-East (FR ¼ 0.84) subclasses have the largest and least correlation. Additionally, plan curvature factor was taken into consideration to examine the effect of the convergence or divergence in declivitous streams regarding the erosion of slopes (Ercanoglu and Gokceoglu 2002). In the present study, the plan curvature map was stratified as concave, flat, and convex with the approximate percentages of 27%, 47%, and 26% of the study area. The highest FR value (1.03) was seen for the concave sub-class and then flat and convex stands at the value of 0.99.

Methodology
As stated earlier, in this paper we utilized the combination of the ANFIS models with four robust optimization techniques, namely PSO, GA, ACO, and DE for analyzing the occurrence risk of landslide triggered earthquake in the Ludian County of China. To this end, after preparing the proper database, the proposed predictive methods were implemented to estimate the landslide susceptibility index. Figure 4 portrays the procedure we carried out to achieve the landslide susceptibility map. The description

Adaptive neuro-fuzzy inference system (ANFIS)
Recently, the fuzzy theory has widely gained attention by researchers due to its ability for displaying complex processes by employing the concepts and if-then rules. The fuzzy theory is inspired by the concept of decision making in human life. However, it should be mentioned that preferable results are not offered for unforeseen situations by this algorithm (Dehnavi et al. 2015). Thus, to take on a self-learner property, the artificial neural network (ANN) has been utilized to optimize the fuzzy theory as ANFIS. Jang (1993) has proposed ANFIS as a hybrid of ANN and a fuzzy system. To solve nonlinear problems, it can be mentioned that the ANFIS offers better results than a fuzzy inference system (FIS). According to the ANFIS, a FIS is employed for training in a multilayer feed-forward network (Dixon 2005). By combining least-squares methods and backpropagation gradient descent, FIS membership function (MF) parameters can be trained through training the input data by the ANFIS. As Figure 5 illustrates, in the ANFIS model, the layers are structured as follows (Termeh et al. 2018): Each node in layer 1 contains adaptive nodes as: where x and y symbolize the input nodes. A and B indicate the linguistic variables, and lA i ðxÞ and lB i ðyÞ represent the MFs of the proposed node. In the second layer, Equation (5) expresses the output of each node which is the outcome of all input signals to the proposed node.
where W i indicates the product of each node. The normalized outputs of layer 2 are considered as the nodes of layer 3.
For layer 4, a node function is employed to associate each node as follows: where w l represents the normalized firepower of layer 3, and p i ; q i ; and r i denote the node parameters. In layer 4, the parameters are considered as the result parameters.
The sum of all input signals and all the way to the output is considered as a single node of layer 5.

Particle swarm optimization (PSO) algorithm
During the past several decades, improving the optimization approaches based on particle intelligence technique has increasingly gained attention by a number of researchers because of its potential for modelling the real-life swarm behaviours such as bird-flocking, fish-schooling, etc. Due to the high run strength and flexibility, the PSO has always been amongst researchers' top priorities for many issues and has been employed for many nonlinear-continuous and discrete systems (Kennedy 2010;Bui et al. 2017a). Figure 6 exhibits the flowchart of the PSO technique. In this model, a population of random solutions is considered for the system; subsequently, by updating generations, the system searches for optima. Therefore, it should be mentioned that the PSO algorithm shares many similarities with conventional optimization methods. However, unlike evolutionary computation techniques, the position of particles depends on their velocity parameter (step motor) according to the PSO algorithm. Based on the PSO algorithm, each particle possesses a utility function and a speed vector which signify its orientation and step, respectively (Clerc and Kennedy 2002). Each particle moves in the search space, and the position of its previous best position and velocity has been kept as a record. By considering its own moving experience, the velocity and position of each particle are dynamically updated in the evolution of each generation. Thus, each particle goes to a better location, and the optimal solution is determined for the system. For each particle, the mathematical expression for updating the velocity is presented by the following equation.
where p ij ðtÞ and g i ðtÞ stand for the best total personal experience and the most proper cumulative experience, respectively. W indicates the inertia coefficient which illustrates the tendency of a parameter for balance. c 1 and c 2 represent the personalized coefficient learning and the collective learning coefficient, respectively. In addition, and r 1 and r 2 denote uniform random numbers which can range between 0 and 1. By considering the updated velocity of each particle, the new particle position can be obtained by Equation (10): Consequently, for the next stage, the particle position and new particle velocity are updated according to the above equation.

Genetic algorithm (GA)
To solve learning processes and optimization problems in different fields, GA has been utilized by the researchers as a heuristic algorithm. This algorithm is inspired by Darwin's evolution theory as well as biological evolution in nature (Sheta and Turabieh 2006). Holland (1975) has introduced the GA, and then it has been developed. In this algorithm, the structure is composed of a population, and each individual of the population, called chromosome, is considered as a solution to the problem. Different problem variables act as genes for an individual in this algorithm (Mitchell 1998). Figure 7 depicts the mechanism of this method. According to GA, a random population of chromosomes is developed in order to create the search procedure. Subsequently, three operators produce the next generation of the population as follows: Selection Operator: Based on this operator, the fitness function of each chromosome is calculated, and the obtained results are evaluated in each stage. Consequently, the best solutions to the problem are identified as the best chromosomes of society. Furthermore, to produce the next generation, the chromosomes are employed as parents; therefore, new child chromosomes are produced as offspring. It is worth noting that better chromosomes have a higher chance of combining with other individuals.
Crossover Operator: According to this operator, the combined selected chromosomes have been used; therefore, the new child chromosomes have better fitness than its parents. Thus, the ratio and structure of the offspring's chromosome compared with its parents' chromosomes have been determined by employing the crossover operator. Various methods, including Uniform, Cycle, N-Point, Tournament, Order, Ranking Selection, and partially mapped can be employed for implementing the crossover operator (Saeidian et al. 2016). In this study, the uniform method has been employed for this issue.
Mutation Operator: Searching new areas has been carried out in the available dimension based on this operator. A mutation is produced in the chromosome by changing one or more genes inside the chromosomes randomly. Thus, the value of the selected gene is changed by selecting a chromosome for mutation. It should be mentioned that the mutation operator is employed for applying to either a terminal node (variables and constants) or a function node (arithmetic operations) (Nourani et al. 2014).
By employing these three operators, new child chromosomes are produced and they are considered as the next generation parents. In addition, the trend goes on in order to the termination condition is met.

Ant colony optimization (ACO) algorithm
As a multi-criteria technique, Dorigo and Blum (2005) have proposed the ant algorithm for solving optimization problems, including travelling salesman. The ant algorithm has been introduced by considering a principle that collective clever behaviour is achieved by local simple and confined interactions of members or class of the population. The speed and parallel performance, distributed creatures' transactions, and lack of centralized control can be mentioned as the main superiority of collective intelligence (Beckers et al. 1992). According to the ant algorithm, the trail of different pheromones is left along the path by the ants, and they help each other for choosing the shortest path. For an n-dimension space, a route between X i and X j ðor Y i and Y j Þ has the probability of selecting by an ant q as follows (Beckers et al. 1992): where s ij ðtÞ; t; g ij ðtÞ; and N q i indicate the pheromone intensity, the time, the value of cost function between the points i and j, and the sum of point i and neighbours of the ant q, respectively. In addition, the cost function and weight of pheromone intensity have been determined by the control parameters b and a, respectively. In the real world, each ant follows paths with more pheromones which have been secreted by other ants in their way back to the nest. Thus, the ants choose a shorter route which has more pheromones because its pheromones are strengthened faster than the pheromones of a longer path. According to the ant algorithm, the process of finding the best solution is based on the mentioned action of ants. The learning of model and evaluating the selected route through back and forth trial are inspired by the mentioned action. For an ant q, P q ij ðtÞ denotes the probability of transferring from the point i to j in t the iteration. If a point has not been visited by an ant, P q ij ðtÞ is zero, and j 6 2 N q i . Equation (12) formulates the intensity of pheromone is upgraded for selecting a new route: where q and s 0 denote the evaporation intensity of pheromone and the pheromone's initial value, respectively. The error probability of choosing a shorter route by ants can be increased for an attractive route because its food may end, but the remaining pheromones are still present on the track. Consequently, the pheromone intensity is refreshed at the end of each iteration by the following formulation: where Ds ij ðtÞ; Q; and L k ðtÞ represent the amount of pheromone added to s ij by ants, the constant parameter, and the optimal value of cost function, respectively.

Differential evolutionary (DE) algorithm
Recently, as another widely used evolution algorithm, the DE algorithm has been employed to find the globally optimal solution for a problem with continuous space (Storn and Price 1997;Storn 2008;Das et al. 2009). Storn and Price (1997) have proposed the DE algorithm which the next optimal generation is produced by three operators, including, mutation, crossover, and selection. By producing a random population, the DE algorithm starts its operation, and each individual is considered as a symbol of a solution for the problem. The vector X G i ¼ ðx G 1;i ; x G 2;i ; x G 3;i ; :::; x G D;i Þ stands for an individual of the population and i ¼ f0; 1; 2; :::; NPg indicates the number of the mentioned individual in the population. D denotes the search dimension (i.e. the number of components in a problem). G ¼ f0; 1; 2; :::; G max g is the generation times in which G max shows the total number of produced generations. By supposing X L ¼ fx 1;L ; x 2;L ; x 3;L ; :::; x D;L g and X U ¼ 11 as the lower and upper limits of each search dimension, the initial population can be defined as Equation (14): where randð0:1Þ represents a uniformly distributed random number in [0,1]. Mutation: As the first operator of DE algorithm, the mutation operator employs each individual, called the target vector X G i , to produce the mutant vector (donor vec- where r1; r2; r3; and r4 indicate random integers from [0, NP] in which r1 6 ¼ r2 6 ¼ r3 6 ¼ r4. F symbolizes a number that is randomly chosen from [0, 1] and denotes a scale factor that defines the mutation scale. X G best presents an individual who will have the most appropriate fitness value in the generation population. Crossover: Producing the trial vector U G j;i ¼ fu G 1;i ; u G 2;i ; u G 3;i ; :::; u G D;i g is the aim of the crossover operator. Therefore, some relations of the target vector X G i are replaced with the mutant vector V G i in order to achieve this goal. The crossover operator can be formulated by Equation (16) (Storn and Price 1997): where i 2 f1; 2; 3; :::; NPg and j 2 f1; 2; 3; :::; Dg. j rand denotes a random number in [1, D]. CR is uniformly distributed at random in [0, 1] and indicates the crossover rate. Selection: According to this operator, the best choice is determined for the next generation by comparing the fitness value of the trial vector U G i with the target vector X G i as follows (Storn and Price 1997):

Results and discussion
In this paper, using the programming software of MATLAB version 14.0 and ArcGIS version 10.2 we developed four optimization techniques, namely PSO, GA, ACO, and DE, synthesized with ANFIS for analyzing the earthquake-triggered landslide risk. The Ludian County of China was selected as the study area, due to the Ludian earthquake occurred in 2014 which caused several slope failures in this region. Notably, we considered 12 landslide causative factors (namely, elevation, slope degree, lithology, PGA, SPI, TWI, distance to road, distance to river, distance to fault, NDVI, slope aspect, and plan curvature) to produce the spatial database. In this sense, factors with basic formats (point, polygon, polyline) were converted to GIS rasters. Also, 80% of landslide points (i.e. 366 landslides) devoted to the training process, and the remaining 20% (i.e. 92 landslides) were used to validate the performance of PSO-ANFIS, ACO-ANFIS, GA-ANFIS, and DE-ANFIS models.
In the context of model performance, PSO, GA, ACO, and DE metaheuristic algorithms were coded to fine-tuning the parameters of ANFIS MFs (Mahapatra et al. 2015). More specifically, when the algorithm starts, a basic fuzzy system is generated using the received training data. Next, among the stored basic parameters, the optimization algorithms select the best ones according to a predefined objective function (OF). An example of the optimized ANFIS structure is illustrated in Figure 8.
In this study, we used mean square error (MSE) as the OF which is defined by Equation (18): where N is the number of instances, and Y i observed , and Y i predicted denote the target data and predicted values of landslide susceptibility index, respectively. Each model performed within 1000 repetitions. The convergence curves of the designed models are depicted in Figure 9 in term of MSE. According to this chart, DE-ANFIS yielded the converged solution in the first iteration that indicates the highest convergence speed. The ACO-ANFIS reached the lowest value of the OF after 271 iterations. This is while, the ensembles of PSO-ANFIS and GA-ANFIS kept the optimization until the 858 th and the latest iteration, respectively.
Then, the proposed ensemble methods were implemented to calculate the landslide susceptibility index. The landslide susceptibility maps were generated by transferring the outputs into the ArcGIS environment. Then, a natural break classification (Akgun et al. 2012;Xu et al. 2012;Pourghasemi et al. 2013;Jaafari et al. 2014) was applied to stratify the resulted maps into five susceptibility groups namely, 'Very low', 'Low', 'Moderate', 'High', and 'Very high'. Figure     The statistical report from the performance of the applied models are presented in Figure 11 for both training and testing phases based on the (i) comparison between the targets and outputs, (ii) MSE and RMSE value of data, and (iii) frequency errors of data samples. In addition, Figure 12 illustrates the ROC curve we depicted in SPSS environment for each ensemble model. This diagram shows the true positive rate (on the vertical axis) against the false positive rate (on the horizontal axis) of the model's estimations. Remarkably, the AUROC criterion represents the accuracy of the proposed prediction so that a casual prediction is determined by 0.5, and 1 indicates an ideal estimation.
Table 2 denotes the ranking system developed based on the values calculated for MSE, RMSE, and AUROC. According to this table, each predictive model gains a total ranking score (TRS) which determines the final rank of each model. In this sense, the total scores of 15, 8, 10, and 18, respectively acquired for PSO-ANFIS, GA-ANFIS, ACO-ANFIS, and DE-ANFIS methods demonstrate that the combination of adaptive neuro-fuzzy inference system and DE algorithm suggests the most robust method for the assessment of earthquake-induced landslide hazard. After that, PSO-ANFIS outperforms the ACO-ANFIS and GA-ANFIS which have featured as the models with gratifying accuracy.
Based on the results, the combination of the used fuzzy-based tool with evolutionary algorithms can perform efficiently for the purpose of landslide susceptibility mapping. Putting a detailed comparison between various ensembles of ANFIS aside, the greatest achievement of this research lies in presenting an accurate and inexpensive  evaluative approach for spatial assessment of earthquake induced-landslide hazard. In addition, due to the high speed of the DE-ANFIS in the convergence, the quickness of this model can be mentioned as an advantage in comparison with other intelligent and statistical-based approaches.
In comparison with previous studies that developed the susceptibility map for the landslides triggered by the Ludian earthquake, the results of this study are more satisfying. In details, as explained, Li et al. (2017) generated the landslide susceptibility map of the Ludian along with two other study areas, by designing a GA and a novel GA models for proper selection of conditioning factors, which led to 82.21% and 82.28% accuracy of the results for the Ludian, respectively. However, the GA-based ensemble of our study performed worse than their models, other three ensembles of PSO-ANFIS, ACO-ANFIS, and DE-ANFIS outperformed those. Moreover, Tian et al. (2017) investigated the susceptibility analysis of the pre-earthquake and coseismic landslides (immediately after the earthquake) related to the mentioned earthquake, in the southern part of the Ludian. They used the Information value method in that study. The AUC-based accuracy for the spatial risk analysis of the coseismic landslides was 77.05%. As is seen, the accuracy of all four methods we used in this study is considerably higher than this value. This is noteworthy that they considered only seven landslide-related factors, namely slope angle, elevation, slope aspect, distance from drainages, intensity, curvature, and lithology, while we used five more factors. Along with the higher potential of the used employed models, it may be a reason for more accuracy of our research.
Furthermore, in comparison with the studies that used the same methods for the landslide susceptibility mapping in different areas, it was revealed that the main results of this study are more accurate. In Xie et al. (2017), for instance, they used ANFIS-PSO, ANFIS-DE, and ANFIS-GA for landslide hazard assessment in Hanyuan County of China. In term of AUROC, the ANFIS-DE was found the best model with 0.844, followed by ANFIS-GA with 0.821, and ANFIS-PSO with 0.780. As is seen, however their GA-based ensemble performed slightly better than ours', the PSO-ANFIS (AUROC ¼ 0.833) and DE-ANFIS (AUROC ¼ 0.852) of this study present a more reliable prediction. Notably, in term of the MSE and RMSE, all three methods of the present research showed a more efficient performance. Other than some differences between the used conditioning factors, the considered ratio for the division of the marked landslides (i.e. into the training and testing datasets) may be another influential factor. This ratio was determined as 80:20 and 70:30 in the present study and .
For more details, the percentages of landslide susceptibility classes are presented in the form of a column chart in Figure 13. As the chart illustrates, 13.43%, 9.03%, 14.83%, and 14.31% of the studied area is located in the very low susceptibility class, respectively by PSO-ANFIS, GA-ANFIS, ACO-ANFIS, and DE-ANFIS ensembles. Likewise, respective proportions of 35.45%, 26.36%, 32.00%, and 31.46% of the Ludian County are distinguished by the low risk of earthquake-induced landslides. All designed ensembles have detected close ratios (between 28% and 33%) of the intended region for the moderate susceptibility class. As for the dangerous areas, about 22.80%, 32.40%, 22.80%, and 25.70% of the Ludian district is under the high and very high landslide occurrence risk, respectively based on the PSO-ANFIS, GA-ANFIS, ACO-ANFIS, and DE-ANFIS predictions.
In the following, a novel idea was performed to calculate the percentage of the training and testing landslides located in each susceptibility class. To do this, we crossed the landslide layers with the generated susceptibility maps. The results of this process are shown by 3D bar charts in Figure 14(a,b) for the training and validation landslide points. According to these charts, GA-ANFIS has categorized the greatest percentage of the identified landslides (32.75% and 25.88%, respectively for the training and testing records) in very high susceptibility class. In a glance, it can be induced that the highest and lowest density of the landslide locations in both datasets observed within the high and very low susceptibility classes, respectively. The ACO-ANFIS ensemble has stratified approximately 12% and 29% of the training landslides in low and moderate susceptibility classes which are the greatest proportions in the mentioned classes. Investigating the results of the elite models, i.e. DE-ANFIS and PSO-ANFIS, about 58% and 63% of the training landslides and also 46% and 43% of the landslides devoted to the validation phase were located in perilous regions of the Ludian County (high and very high categories).

Conclusions and remarks
Due to the crucial role of the landslide susceptibility mapping in hazard mitigation, this study deals with the application of ensembles of ANFIS with four metaheuristics techniques, namely PSO, GA, ACO, and DE algorithm for analyzing the occurrence likelihood of earthquake-induced landslide in Ludian County, China. To achieve this aim, 12 landslide-related factors, namely: elevation, slope degree, lithology, PGA, SPI, TWI, distance to road, distance to river, distance to fault, NDVI, slope aspect, and plan curvature were taken into consideration within the GIS. Also, the landslide inventory map was composed of 458 earthquake-induced landslides. Among those, 366 (80%) landslides were devoted to the learning process, and the remaining 92 (20%) were used to evaluate the accuracy of results. Three well-known accuracy criteria of MSE, RMSE, and AUROC were used to develop a ranking system to compare the usefulness of the applied models. The following conclusions were obtained based on the results. It shall be noted that, to have a coherent presentation of the results, calculated values of MSE, RMSE, and AUROC have been reported in the respective order for PSO-ANFIS, GA-ANFIS, ACO-ANFIS, and DE-ANFIS in all cases.
The convergence rate of the applied models showed that the DE-ANFIS is the fastest ensemble which achieved the optimum solution after the first iteration. The ACO-ANFIS and PSO-ANFIS were found to be the second and third quick models, due to reaching the lowest OF after 271 and 858 iterations, respectively. This is while, GA-ANFIS kept decreasing the error till the 1000th iteration. All four ensembles of PSO-ANFIS, GA-ANFIS, ACO-ANFIS, and DE-ANFIS have presented an effective approximation of the landslide hazard, due to the acquired values of accuracy criteria. Based on the MSEs computed for training (0.086, 0.088, 0.087, and 0.087) and testing (0.080, 0.081, 0.084, and 0.077) phases, the DE-ANFIS has shown the best performance in the validation phase. Also, the PSO algorithm outperformed GA and ACO for enhancing the prediction capability of ANFIS. In term of RMSE, the training error values (0.294, 0.297, 0.295, and 0.294) indicate the superiority of PSO-ANFIS and DE-ANFIS with equal RMSEs. Likewise, the RMSE of the testing samples (0.284, 0.285, 0.290, and 0.279) shows that the highest and lowest RMSEs are obtained for ACO-ANFIS and DE-ANFIS, respectively. The calculated AUROCs report 83.3%, 81.9%, 83.9%, and 85.2% accuracy for the designed systems. Unlike the MSE and RMSE, ACO-ANFIS performed slightly better than PSO-ANFIS. However, similar to prior conclusions, DE improved ANFIS more efficiently compared to the other three techniques. All in all, based on the acquired TRSs, DE-ANFIS (TRS ¼ 18) was found to be the most capable algorithm in this research, followed by PSO (TRS ¼ 15), ACO-ANFIS (TRS ¼ 10), and GA-ANFIS (TRS ¼ 8). Classifying the study area into five landslide susceptibility groups, it was revealed that about 22.80%, 32.40%, 22.80%, and 25.70% of the Ludian district is under the high and very high landslide occurrence risk. Last but not least, analyzing the distribution of the identified landslides, it was concluded that 63%, 66%, 56%, and 58% of the training landslides, and also 43%, 58%, 45%, and 46% of the testing landslides were located in the perilous areas of the Ludian.

Disclosure statement
No potential conflict of interest was reported by the authors.