Potential of hybrid evolutionary approaches for assessment of geo-hazard landslide susceptibility mapping

Abstract As a prevalent disaster, landslides cause severe loss of property and human life worldwide. The specific objective of this study is to evaluate the capability of an artificial neural network (ANN) synthesized with the artificial bee colony (ABC) and particle swarm optimization (PSO) evolutionary algorithms for drawing the landslide susceptibility map (LSM) of Golestan province, Iran. The required spatial database was created from 12 landslide conditioning factors. The area under the curve (AUC) criterion was used to assess the integrity of the employed predictive approaches. The calculated AUCs of 90.10%, 85.70%, 80.30% and 76.60%, respectively, for the statistical index (SI), PSO-ANN, ABC-ANN and ANN showed that all models are sufficiently accurate for simulating the LSM, although SI presents the best performance. The landslide vulnerability map obtained by the PSO-ANN model is more accurate than those of the other intelligent techniques; in addition, training the ANN with the ABC and PSO optimization algorithms enhanced the reliability of this model. Note that a total of 76.72%, 23.96%, 30.55% and 5.37% of the study area were labeled as perilous (High and Very high susceptibility classes) by the SI, PSO-ANN, ABC-ANN and ANN results, respectively.


Introduction
Landslides are among the most complex natural disaster phenomena. Producing a landslide susceptibility map (LSM) is the first step in risk assessment and landslide hazard mitigation. During the last three decades, landslides have gained considerable attention due to the damage caused by this catastrophe worldwide. In a rough estimation performed by the Iranian Landslide Working Party (2007), about 187 people are killed every year by landslides (Pourghasemi et al. 2012b). Different landslide assessment techniques have been implemented to compute landslide susceptibility values (LSVs) in particular study areas (Youssef et al. 2015; Balamurugan et al. 2016; Vakhshoori and Zare 2016). The most influential factors in calculating the LSV are elevation, aspect, slope, curvature, soil type, lithology, distances of predefined cells to faults, rivers and roads, land use, the stream power index (SPI) and the topographic wetness index (TWI). Numerous researchers, such as Pradhan (2010) and Hong et al. (2016), have introduced formulas or LSMs to provide a reliable approximation of the likelihood of landslide occurrence in the future. Many attempts and case studies (Parsons et al. 2014; Chaytor et al. 2016; Meng et al. 2016; Yavari-Ramshe and Ataie-Ashtiani 2016) have focused on estimating landslide hazard zonation maps or the deformation of large-scale structures (Zou et al. 2018). An LSM can give a reliable estimation for a specific place only if it is produced by a proper procedure with a real dataset and at an appropriate working scale (Dong et al. 2017). One of the most efficient uses of LSMs is guiding the future expansion of land use planning and residential areas so as to avoid psychological and financial consequences and damage (Pourghasemi et al. 2012a).
Because of the great difficulty of relocating people from their own homes (Mokoena and Musakwa 2018; Hoque Mozumder et al. 2018), various tools and methods have been used to estimate landslide susceptibility for different study areas. Some of these studies, such as Baeza and Corominas (2001) and Moayedi et al. (2011), were based on simple methods like the statistical index (SI), frequency ratio (FR) or regression methods. Wang et al. (2015) developed the LSM of Qianyang County, China, using the index of entropy (IOE) and certainty factor (CF) models. In that work, they considered fifteen landslide-triggering factors, namely slope angle, slope aspect, general curvature, plan curvature, profile curvature, altitude, distance to faults, distance to rivers, distance to roads, the sediment transport index, the SPI, the TWI, geomorphology, lithology and rainfall. Based on their findings, CF outperformed IOE, with respective prediction accuracies of 82.32% and 80.88%. Chen et al. (2015) compared the applicability of the SI, FR and IOE models for landslide hazard analysis in the Baozhong Region of China. In that study, 70% of the marked landslides were selected for training the methods, and the remaining 30% were used to calculate the area under the curve (AUC) index to evaluate the accuracy of each model. Accordingly, the highest prediction accuracy was obtained for the FR map (84.95%), followed by the SI map (82.37%) and the IOE map (82.05%).
Also, to achieve more accurate results, the application of soft computing techniques has been reported in many studies (Lee et al. 2004; Lian et al. 2013; Shahri 2016; Gao et al. 2018c, 2018d). Pradhan and Lee (2010) applied a back-propagation (BP) neural network to the measured data points of landslides in Cameron Highland, Malaysia, and estimated the possibility of landslides with an accuracy rate of 83%. Other soft computing techniques have also been used as capable tools for landslide hazard zonation (Arora et al. 2004; Farrokhzad et al. 2011). In another case, Oh and Pradhan (2011) used the ANFIS tool with four membership functions, including triangular, trapezoidal, generalized bell and polynomial, and introduced ANFIS as a very useful tool in regional landslide susceptibility assessment. Recently, many studies have compared different developed models. For example, in Pourghasemi et al. (2016), the analytical hierarchy process (AHP) and fuzzy logic (FL) methods were compared, and FL proved the more successful method, with 89.7% accuracy. In other similar cases, the AHP model was used for landslide hazard mapping (Pourghasemi et al. 2012b; Althuwaynee et al. 2014; Chen et al. 2016). Also, in recent years, new techniques and optimization algorithms have been used to model LSMs (Chen et al. 2017a; Hong et al. 2017). Chen et al. (2017b) examined the capability of the genetic algorithm (GA), differential evolution (DE) and particle swarm optimization (PSO) synthesized with the adaptive neuro-fuzzy inference system (ANFIS) for landslide spatial modelling in Hanyuan County, China. Referring to the acquired AUC values, they introduced ANFIS-DE (AUC = 0.844) as the most accurate ensemble data mining technique, compared to ANFIS-GA (AUC = 0.821) and ANFIS-PSO (AUC = 0.780).
As explained above, evolutionary algorithms have been extensively used to enhance the performance of various predictive models. To the best of the authors' knowledge, however, although the ABC and PSO algorithms have been applied in many studies to remedy the shortcomings of intelligent models (e.g. SVM and ANFIS) (Cheng and Hoang 2015; Bui et al. 2017; Chen et al. 2019), no prior attempt has focused on comparing the capability of these techniques in improving the performance of ANNs for landslide hazard assessment. Therefore, in this research, we applied the ABC and PSO algorithms to a typical artificial neural network (ANN) to improve its applicability for landslide hazard mapping in the northern district of Golestan province, Iran. The capability of the mentioned models (ANN, ABC-ANN, PSO-ANN) was evaluated by comparing the obtained results with actual landslide locations; in this regard, 20% of the total landslide cases were used to calculate the AUC index. The input dataset contains elevation, slope degree, lithology, rainfall, SPI, TWI, distance to road, distance to river, land use, slope aspect, soil type and plan curvature, with the LSV taken as the output. In what follows, section 2 describes the study area, data preparation and statistical analysis are presented in section 3, a detailed description of the used models is given in section 4, the outcomes are presented and discussed in section 5, and the conclusion is expressed in section 6.

Study area
The study area is located in Kalaleh County, Golestan province, Iran (Figure 1). Golestan is one of the northernmost provinces of Iran and neighbours the Caspian Sea. The selected region covers roughly 163 km² and lies between 37°34′ and 37°44′N latitude and between 55°35′ and 55°47′E longitude. The district is chiefly under a Mediterranean climate, and the difference between the highest and lowest reported annual rainfall reaches 59 mm per year. Most of the area has low slope values, so that slopes steeper than 40° are rarely observed. Topographically, the altitude (i.e. elevation) varies in the range of 200-890 m, and the majority of the area is covered by Calciustepts and Xerorthents soil units. It is noteworthy that the majority of the observed slope failures cover an area of less than 100 m² (10 × 10 m cells) and are categorized as 'rotational' and 'translational' slides. As is clearly shown in Figure 1, most of the landslides have occurred along the territorial roads and rivers, as well as near the villages. Note that no significant fault has been recognized in the study area.

Data preparation and landslide conditioning factors
Landslides are a prevalent disaster in Iran, especially in the northern parts of the country, due to the dominant climate condition. In any kind of landslide modelling, preparing the landslide inventory map is the most substantial step (Ercanoglu and Gokceoglu 2004). In this work, this map was acquired through aerial photo interpretation and an extensive field survey using GPS at 1:25,000 scale (see Figure 1). Considering the regional climatology, 12 geological and hydrological factors (i.e. landslide-triggering factors) were listed as independent parameters in ArcGIS: elevation, slope degree, lithology, rainfall, SPI, TWI, distance to road, distance to river, land use, slope aspect, soil type and plan curvature (see Figure 2). All of the mentioned GIS layers were converted to raster format with a 10 × 10 m cell size to achieve appropriate formats. The elevation map of the area was derived from the digital elevation model (DEM) of Golestan; the minimum altitude is 200 m, and altitudes above 750 m are rarely observed (see Figure 2(a)). The terrain slope map was subsequently produced from the DEM, in the range of 0°-43° (see Figure 2(b)). According to the lithology distribution map, this region is constructed from five types of rocks. As illustrated in Figures 2(c) and 3(a), more than 93% of the region is underlain by swamp sedimentary rocks; the remaining parts are represented by different rock categories such as shale and sandstone, bedded limestone and dolomite, and ammonite-bearing shale interbedded with orbitolina limestone. Regarding land utilization, as depicted in Figures 2(i) and 3(b), 13 utilization classes have been reported; however, 82.37% of the area is stratified as a mix of agriculture, forest, rangeland, etc. Furthermore, to investigate the effects of geo-morphometric conditions, two widely used secondary factors, namely the TWI and SPI, were also calculated.
TWI measures the amount of water that accumulates at a location and is defined by Equation (1) (Moore et al. 1991):

TWI = ln(a / tan b)    (1)

where a is the cumulative catchment area and b is the slope gradient (see Figure 2(f)). Also, the SPI, a topographic attribute indicating the erosion power of a stream, is defined by Equation (2), based on the formulation of Moore et al. (1991):

SPI = a × tan b    (2)

where, similarly, a stands for the specific catchment area and b for the gradient (see Figure 2(e)). Two parameters relating to the effect of proximity to linear phenomena, namely distance to road and distance to river, were acquired by applying the 'Euclidean distance' tool to vector lines in ArcMap (Pradhan and Lee 2010). For these factors, the distance varies from 0 to 4234 m for roads and from 0 to 4230 m for rivers (see Figure 2(g,h)). Also, a slope aspect map was derived from the DEM and classified into nine classes, beginning with North (0°-22.5°) (see Figures 2(j) and 3(c)). The soil unit map comprises five types of soils, classified in groups 1 to 5; the majority of the region is covered by Calciustepts and Xerorthents soils (83%), and the remaining parts are formed by Rock Outcrops/Lithic Xerorthents, Calcixerepts and Fluventic Haploxerolls soil units (see Figures 2(k) and 3(d)). The efficiency of convergence or divergence in declivitous streams, which is important for slope erosion, can be measured by the plan curvature factor (Ercanoglu and Gokceoglu 2002). In the current research, the plan curvature map was produced from the DEM and classified as concave (−4 to −0.001), flat (−0.001 to 0.001) and convex (0.001 to 4). It is noteworthy that the majority of the study area is labeled as flat (see Figure 2(l)). In addition, the description and distribution of the nominal parameters are presented in Figure 3.
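As a minimal illustration of the TWI and SPI definitions of Moore et al. (1991) given above, the following Python sketch evaluates both indices for hypothetical raster cells. The catchment areas and slope values are illustrative only, not taken from the study area, and the slope is assumed to be supplied in degrees:

```python
import numpy as np

def twi(catchment_area, slope_deg):
    """Topographic wetness index, Equation (1): TWI = ln(a / tan b)."""
    tan_b = np.tan(np.radians(slope_deg))
    return np.log(catchment_area / tan_b)

def spi(catchment_area, slope_deg):
    """Stream power index, Equation (2): SPI = a * tan b."""
    return catchment_area * np.tan(np.radians(slope_deg))

# Example: two cells with the same catchment area but different gradients.
a = np.array([500.0, 500.0])   # hypothetical cumulative catchment area (m^2/m)
b = np.array([5.0, 30.0])      # hypothetical slope in degrees
print(twi(a, b))               # higher TWI (wetter) on the gentler slope
print(spi(a, b))               # higher SPI (more erosive) on the steeper slope
```

In a real workflow both rasters would be computed cell by cell from the flow-accumulation and slope grids derived from the DEM.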

Methodology
As stated previously, the main motivation of this research is to improve the efficiency of a typical ANN model using two potent optimization algorithms, namely the artificial bee colony (ABC) and PSO. The aim is to investigate the risk of landslide occurrence in the northern part of Iran using a statistical approach (SI) and an ANN optimized with PSO (PSO-ANN) and ABC (ABC-ANN). The general steps of the LSM production process are as follows. First, data layers from all sources were prepared: overall, 12 landslide-conditioning factors (i.e. influential factors affecting landslides) were derived from various sources. To achieve this, factors in basic formats (point, polygon, polyline) were converted to raster and extracted for the intended area. Similar to other studies (Wu et al. 2014; Regmi et al. 2016; Xie et al. 2017), to reveal the relationship between the causative factors and landslide occurrence risk, 80% of the total landslide points (i.e. 59) were selected for the training process, and the remaining 20% (i.e. 15 landslides) were used to evaluate the performance of the proposed methods. Then, SI, PSO-ANN, ABC-ANN and ANN were implemented to calculate the LSVs. Eventually, the maps produced by each method were drawn in GIS and stratified into five susceptibility classes, namely very low, low, moderate, high and very high. To assess the accuracy of the proposed predictive models, the AUC index, a well-known criterion for such works, was calculated for the resulting maps. Figure 4 illustrates the graphical methodology of the present study.

SI method
One of the most simply used methods for forecasting landslide hazard is the SI. The essence of this technique is bivariate statistical analysis, as presented by van Westen (1997a) for LSM. According to Yesilnacar (2005), the SI technique provides proper direct mapping associated with different GIS options, such as data-driven analytical ability (Pourghasemi et al. 2013b; Gao et al. 2018a, 2019). In simple terms, the importance of each sub-class of every independent factor is determined by calculating a weight that considers the distribution of the landslide-prone pixels. To do so, the number of pixels in which landslides have occurred is divided by the total number of pixels, first for the intended class and then for the whole map. Dividing these densities and taking the natural logarithm yields the weight for the corresponding class (Equation (3)) (Çevik and Topal 2003):

W_sc = ln(D_class / D_map) = ln[(N_pix(D_i) / N_pix(E_i)) / (ΣN_pix(D_i) / ΣN_pix(E_i))]    (3)

where W_sc is the weight attributed to the categorical unit, and D_class and D_map are the landslide densities for the intended class and the entire map, respectively. Also, N_pix(D_i) indicates the number of landslide pixels in the intended class, and N_pix(E_i) stands for the total number of pixels of that class. Although the landslide map can be plotted with any range of values (i.e. even with normalized values), in this equation the natural logarithm is applied to produce negative and positive values of W for sub-classes that are less and more susceptible than average, respectively (Van Westen 1997b; Gao et al. 2018b).
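The density-ratio weighting described above can be sketched in a few lines of Python. The pixel counts below are hypothetical, chosen only to show how a sub-class that concentrates landslides receives a positive weight:

```python
import math

def si_weight(n_slide_class, n_pix_class, n_slide_total, n_pix_total):
    """Statistical-index weight: ln(class landslide density / map density)."""
    d_class = n_slide_class / n_pix_class   # density within the sub-class
    d_map = n_slide_total / n_pix_total     # density over the whole map
    return math.log(d_class / d_map)

# Hypothetical counts: 40 of 100 landslide pixels fall in a class that
# covers only 10% of the map, so the class is over-represented.
w = si_weight(n_slide_class=40, n_pix_class=10_000,
              n_slide_total=100, n_pix_total=100_000)
print(round(w, 3))  # ln(4) ≈ 1.386, i.e. more susceptible than average
```

A sub-class with a below-average landslide density would yield a negative weight by the same formula, matching the sign convention noted in the text.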

Multi-layer perceptron (MLP)
In the last two decades, the application of computational intelligence, especially ANNs, has grown rapidly for solving various engineering problems. The structure of ANNs is inspired by biological neural networks, and they are capable tools for establishing non-linear relationships between the input(s) and output(s) of a model (Wang 2003). Ease of implementation is the main superiority of this method compared to approaches based on statistical rules: the statistical distribution of the input layers and landslide events is not a determinant factor for training the ANN (Asadi et al. 2011a, 2011b; Moayedi and Rezaei 2017; Mosallanezhad and Moayedi 2017). In other words, numerical data do not need to be divided into classes to be used by the network. The MLP neural network (MLPNN) is one of the most efficient types of ANN. As the name connotes, the MLP network is constructed from three layers, namely the input layer, hidden layer and output layer. Note that an MLP can contain more than one hidden layer, but it has been shown in various studies that one hidden layer is sufficient for modelling any complex problem (Hornik et al. 1989; Moayedi and Armaghani 2017; Moayedi 2018; Moayedi and Hayati 2018a; Alnaqi et al. 2019). The graphical methodology of the MLP is shown in Figure 5. Each layer contains a number of 'processing units' called computational neurons. When the input-target pairs are received by a neuron, it commences acquiring the non-linear relationship between them; this is carried out by updating the weights and biases in each iteration. Considering the input and output vectors, the jth neuron calculates its output as follows (Equation (4)):

O_j = F(Σ_i W_ij I_i + b_j)    (4)

where I denotes the input values, W and b are the weights and bias belonging to the neuron, and F is the activation function, which in this study is taken as the hyperbolic tangent sigmoid (briefly, Tansig).
Equation (5) expresses this function:

F(x) = 2 / (1 + e^(−2x)) − 1    (5)

Moreover, in this study the feed-forward back-propagation (FFBP) method is applied, in which the initial weights and biases are adjusted along a backward path in response to the computed error. More detail on the theory of FFBP networks can be found in Haykin (1994). Besides, the Levenberg-Marquardt training algorithm is employed for the landslide hazard approximation, due to its better performance in comparison with conventional gradient descent (GD) techniques (Hagan and Menhaj 1994; El-Bakry 2003).
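The neuron output of Equation (4) and the Tansig activation of Equation (5) can be sketched as a single forward pass. The weights below are random placeholders standing in for the trained parameters of Table 2, and the 12-element input stands for one pixel's conditioning factors:

```python
import numpy as np

def tansig(x):
    """Hyperbolic tangent sigmoid, Equation (5): 2/(1+exp(-2x)) - 1."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def mlp_forward(x, W_h, b_h, W_o, b_o):
    """Equation (4) applied twice: a Tansig hidden layer, then a linear output."""
    hidden = tansig(W_h @ x + b_h)
    return float(W_o @ hidden + b_o)

rng = np.random.default_rng(0)
x = rng.random(12)                       # 12 conditioning factors for one pixel
W_h = rng.standard_normal((5, 12))       # hypothetical 12 x 5 x 1 network
b_h = rng.standard_normal(5)
W_o = rng.standard_normal(5)
b_o = rng.standard_normal()
print(mlp_forward(x, W_h, b_h, W_o, b_o))  # the pixel's raw LSV estimate
```

Note that Tansig is algebraically identical to the hyperbolic tangent, which is why its outputs are bounded in (−1, 1).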

ABC algorithm
Inspired by the behaviour of natural bees, Karaboga (2005) suggested the ABC algorithm. Searching for food sources (nectar) and sharing the information with other honey bees is the essence of this idea. The search is performed by three kinds of bees, namely employed bees, onlookers and scouts; onlookers and scouts are known as unemployed bees. In this approach, each possible solution is represented by the position of a food source, and its efficiency (fitness) is evaluated by the amount of nectar in that food source. Since every employed bee belongs to one food source, the numbers of employed bees and food sources are equal. Like other optimization algorithms, ABC commences with an initialization phase; the main search cycle is then executed repeatedly until an acceptable performance rate or the maximum number of iterations is reached. During the initialization phase, an initial population is constructed with a random distribution, after which the solutions are subjected to the repeated search processes of the employed, onlooker and scout bees. The employed bees seek new food sources (V_ij) with more nectar within the vicinity of a recognized food source (X_ij), and then examine the quality of the discovered source. The new source is defined by the following formula (Equation (6)):

V_ij = X_ij + φ_ij (X_ij − X_mj)    (6)

where m indicates a randomly picked food source index (m ≠ i), j is a randomly chosen dimension, and φ_ij is a random number in [−1, 1]. Also, V_ij and X_mj stand for the new and the randomly selected food sources, respectively. After obtaining the fresh food source V_ij, the better source is chosen between V_ij and X_ij with respect to the corresponding fitness.
Onlooker bees choose their food sources based on the information shared by the employed bees. This selection depends on probability values (P_i) derived from the calculated fitness values, defined by the following equation (Equation (7)):

P_i = f_i / Σ_n f_n    (7)

where f_i describes the fitness of the ith food source and the sum runs over all food sources. Following this, the algorithm updates the food sources by recruiting more onlooker individuals. In the ABC programming, an 'abandonment criterion' is defined, representing employed bees whose solutions cannot be enhanced during a fixed number of trials. These bees become scouts, their solutions are deserted, and they begin to look for new random solutions (Karaboga et al. 2007). Figure 6(a) illustrates the flowchart of the ABC algorithm implementation. Figure 6(b) exhibits the learning process of the ABC-ANN algorithm. This technique aims to generate the optimum values for the interconnection weights and biases of the ANN. First, the ANN is initialized with some starting parameters. Then, the ABC operators (i.e. employed, onlooker and scout bees) discover and evaluate various positions (food sources) and attempt to replace the initial ANN parameters with the most appropriate alternatives. This procedure is guided by an objective function, here the mean squared error (MSE), to achieve the best solution (Sarangi et al. 2014). More details about this technique can be found in Nandy et al. (2012).
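The employed/onlooker/scout cycle described above can be sketched as a compact ABC minimizer. This is an illustrative implementation, not the authors' code: the objective is a toy sphere function standing in for the network's MSE, the fitness transform 1/(1+cost) is one common choice for Equation (7), and all parameter values (colony size, abandonment limit) are assumptions:

```python
import random

def abc_minimize(f, dim, bounds, n_food=20, limit=30, iters=200, seed=1):
    """Minimal artificial bee colony: employed, onlooker and scout phases."""
    rng = random.Random(seed)
    lo, hi = bounds
    foods = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_food)]
    costs = [f(x) for x in foods]
    trials = [0] * n_food

    def neighbour(i):
        # Equation (6): V_ij = X_ij + phi * (X_ij - X_mj), m != i
        m = rng.choice([k for k in range(n_food) if k != i])
        j = rng.randrange(dim)
        v = foods[i][:]
        v[j] += rng.uniform(-1.0, 1.0) * (foods[i][j] - foods[m][j])
        v[j] = min(max(v[j], lo), hi)
        return v

    def greedy(i, v):
        # Keep the better of the old and new food source.
        c = f(v)
        if c < costs[i]:
            foods[i], costs[i], trials[i] = v, c, 0
        else:
            trials[i] += 1

    for _ in range(iters):
        for i in range(n_food):                  # employed bees
            greedy(i, neighbour(i))
        fits = [1.0 / (1.0 + c) for c in costs]  # fitness for Equation (7)
        for _ in range(n_food):                  # onlookers pick by P_i
            i = rng.choices(range(n_food), weights=fits)[0]
            greedy(i, neighbour(i))
        for i in range(n_food):                  # scouts replace stale sources
            if trials[i] > limit:
                foods[i] = [rng.uniform(lo, hi) for _ in range(dim)]
                costs[i], trials[i] = f(foods[i]), 0
    best = min(range(n_food), key=lambda i: costs[i])
    return foods[best], costs[best]

# Toy objective standing in for the ANN's MSE over its weights and biases.
x_best, c_best = abc_minimize(lambda x: sum(v * v for v in x),
                              dim=3, bounds=(-5.0, 5.0))
print(c_best)  # close to zero for this convex toy problem
```

In the hybrid ABC-ANN, the "position" vector would hold all network weights and biases, and f would return the training MSE of the network built from that vector.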

PSO algorithm
PSO is a powerful metaheuristic algorithm first designed by Eberhart and Kennedy (1995). Its higher learning speed and lower memory requirements are the main advantages of this technique compared to other optimization algorithms (e.g. GA and the imperialist competition algorithm). During the PSO implementation, two quantities, g_best (the best global position) and p_best (the best personal position), are tracked from the particles' activities. Equations (8) and (9) formulate the velocity and position updates of the particles, respectively:

V_2 = ω V_1 + C_1 r_1 (p_best − X_1) + C_2 r_2 (g_best − X_1)    (8)

X_2 = X_1 + V_2    (9)
in which X_1 and X_2 are the current and new positions of the particles, and V_1 and V_2 denote the current and new velocities, respectively. Also, C_1 and C_2 symbolize two constant, user-defined acceleration coefficients, r_1 and r_2 stand for random values drawn from (0, 1), and ω indicates the inertia weight. Moreover, Figure 7(a) shows a graphical description of the PSO performance. Figure 7(b) depicts the learning process of the hybrid PSO-ANN model. Similar to the ABC algorithm, PSO starts with an initialization phase, which selects the ANN connection parameters (the particles' positions) randomly. The convergence of the trained network is checked recursively through an objective function (e.g. MSE). The optimization continues by updating the g_best and p_best values; PSO aims to minimize the MSE criterion until one of the defined stopping conditions is met. The outcome of this procedure is the best solution, obtained by adjusting the weights and biases of the proposed ANN. Further information about this model is well detailed in various studies (Chen et al. 2017a).
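Equations (8) and (9) can be sketched as a minimal PSO loop. This is an illustrative implementation, not the study's code: the objective is a toy sphere function standing in for the network's MSE, and ω = 0.25, C1 = C2 = 2 echo the parameter choices reported later in the paper:

```python
import random

def pso_minimize(f, dim, bounds, n_particles=30, iters=200,
                 w=0.25, c1=2.0, c2=2.0, seed=1):
    """Minimal PSO using the velocity/position updates of Equations (8)-(9)."""
    rng = random.Random(seed)
    lo, hi = bounds
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    p_best = [x[:] for x in X]
    p_cost = [f(x) for x in X]
    g_idx = min(range(n_particles), key=lambda i: p_cost[i])
    g_best, g_cost = p_best[g_idx][:], p_cost[g_idx]

    for _ in range(iters):
        for i in range(n_particles):
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Equation (8): V2 = w*V1 + C1*r1*(p_best-X1) + C2*r2*(g_best-X1)
                V[i][j] = (w * V[i][j]
                           + c1 * r1 * (p_best[i][j] - X[i][j])
                           + c2 * r2 * (g_best[j] - X[i][j]))
                # Equation (9): X2 = X1 + V2 (clamped to the search bounds)
                X[i][j] = min(max(X[i][j] + V[i][j], lo), hi)
            c = f(X[i])
            if c < p_cost[i]:
                p_best[i], p_cost[i] = X[i][:], c
                if c < g_cost:
                    g_best, g_cost = X[i][:], c
    return g_best, g_cost

# Toy objective standing in for the ANN's training MSE.
best, cost = pso_minimize(lambda x: sum(v * v for v in x),
                          dim=3, bounds=(-5.0, 5.0))
print(cost)  # close to zero for this convex toy problem
```

As with ABC-ANN, the hybrid model would encode all network weights and biases in each particle's position and use the training MSE as f.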

Statistical index
The landslide distribution map and its effective factor layers were prepared in ArcGIS 10.2 software. The values of each factor were classified based on the authors' experience and previous studies (Regmi et al. 2014; Moayedi et al. 2018). The landslide inventory dataset was divided into two classes, indicating landslide (1) and non-landslide (0) occurrence. For the SI model, the independent layers were crossed with the landslide raster one by one to acquire the spatial relationship between them. The attribute table of the resulting layer contains the distribution of landslides in each sub-class (see Table 1). As clearly described in Table 1, no slope failure has been reported for altitudes above 600 m. Additionally, almost all of the landslide-prone pixels fall within the range of 0°-10° for the slope degree conditioning factor, even though a higher frequency of landslides would be expected on steep slopes, given the higher shear stresses associated with high gradients. As for the rainfall factor, the study area is located in the northern part of Iran, which is known for heavy precipitation, and regular precipitation causes loss of shear strength in soil slopes. It was observed that the distribution of landslides is directly proportional to the annual amount of rainfall: approximately one-third (33.87%) of the pixels indicating landslide occurrence fall in the class with precipitation of more than 524 mm/year. Regarding the lithology of the study area, almost all landslides (99.71% of the prone cells) have been observed in the fifth group of rocks, namely the swamp sedimentary rocks. Due to the proximity of the linear phenomena (i.e. roads and rivers) to each other, a nearly equal contribution of landslides is obtained for them. In the case of distance from the road, for example, many mass movements have been triggered by road cutting operations (Nefeslioglu et al. 2008).
Also, the presence of streams affects the likelihood of slope failure by changing the drainage rate, air humidity and erosion susceptibility of the study area (Pourghasemi et al. 2013a; Hu et al. 2018). In this regard, the distance range of 0-100 m contains approximately half of the landslide-representative cells (45.92% and 55.41% for roads and rivers, respectively). The computation process of the SI method was carried out in Excel 2010 software.

Artificial intelligence models
The second part of the current study aimed to produce the LSM based on the proposed ANN and hybrid (ABC-ANN and PSO-ANN) methods. The dataset required for developing the mentioned networks was derived by converting the GIS spatial database from raster to ASCII format; every raster contains specific values assigned to specific pixels. The landslide distribution map was used to compose the target data, and the 12 inputs (elevation, slope degree, lithology, rainfall, SPI, TWI, distance to road, distance to river, land use, slope aspect, soil type and plan curvature) were used to train the predictive network models and produce the LSV as the output. For nominal data (lithology, for instance), the title of each class was replaced with a continuous value equal to its group number (shown in Figure 3) to make it acceptable to the network. Implementing artificial intelligence tools requires at least two subsets of data. The first part, called the training stage, is the principal part of the dataset and is used to educate the system; the learning process includes adjusting the inter-layer weights and biases to reduce the error in each iteration. The trained network must then be evaluated in the second stage, called the test phase; the test data are disjoint from the training data, and the performance accuracy of the trained network is determined from them. The dataset was divided in an 80%/20% ratio (59 and 15 landslide cases) for the training and test phases, respectively. Finding the appropriate network architecture is a fundamental step in soft computing utilization, so an extensive trial-and-error process was carried out for the ANN, ABC-ANN and PSO-ANN networks, with the MSE index used to evaluate the performance of the models. Once the optimum structure was determined, the optimized weights and biases were arranged to compute the outputs of each predictive model.

Artificial neural network
To find the best ANN architecture, an MLP network was tested with 10 different numbers of neurons in its single hidden layer. Figure 8 shows the result of this procedure. As is clearly seen, the ANN with five neurons in the hidden layer presented the most precise execution (i.e. the lowest error). Therefore, the optimum ANN structure can be defined as 12 × 5 × 1, indicating an MLP with 12, 5 and 1 neurons in its input, hidden and output layers, respectively. In addition, the weights and biases extracted from the ANN are presented in Table 2.
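The size of the search space behind this trial-and-error sweep is easy to quantify: each candidate hidden-layer width changes the number of weights and biases the training algorithm must fit. The helper below is an illustrative calculation, not part of the authors' workflow:

```python
def mlp_param_count(n_in, n_hidden, n_out):
    """Trainable parameters in a single-hidden-layer MLP (n_in x n_hidden x n_out)."""
    weights = n_in * n_hidden + n_hidden * n_out  # input->hidden, hidden->output
    biases = n_hidden + n_out                     # one bias per non-input neuron
    return weights + biases

# Sweep over 1..10 hidden neurons, mirroring the architecture search:
for h in range(1, 11):
    print(h, mlp_param_count(12, h, 1))

# The selected 12 x 5 x 1 structure carries 12*5 + 5*1 + 5 + 1 = 71 parameters.
print(mlp_param_count(12, 5, 1))  # 71
```

These 71 values are exactly the weights and biases that the ABC and PSO hybrids later re-optimize in place of back-propagation.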

ABC-ANN model
In order to remedy the defects of the BP learning method in the ANN, the ABC optimization algorithm was applied to adjust the ANN parameters, repeating until the lowest error was reached. To find the most suitable ABC architecture, different colony sizes were tested within 1000 iterations. As Figure 9 shows, the colony size was varied from 50 to 500 in intervals of 50 to assess the impact of changing the number of bees. It must be mentioned that no significant change in the MSE values was observed after the thousandth cycle; according to Koopialipoor et al. (2018), the tendency of the bees to convene at the place with the best response is the main cause of this behaviour. In this respect, the least MSE was acquired for the ABC network with a colony size equal to 150. The produced landslide hazard map and the ABC-ANN-derived optimal weights and biases are presented in Figure 11(c) and Table 3, respectively.

PSO-ANN model
Considering the previous explanation of the PSO algorithm, apart from the number of iterations, several characteristics such as the swarm size, inertia weight and the coefficients of the velocity equation also affect the performance of an applied PSO algorithm. Given the variety of determinant factors, the authors decided to use values that have been successfully assigned in previous PSO works: the velocity equation coefficients and the inertia weight were set to 2 and 0.25, respectively (Clerc and Kennedy 2002). In the next step, similar to ABC-ANN, a trial-and-error process was carried out to find the proper swarm size. Ten different swarm sizes (from 50 to 500 in intervals of 50) were employed, and the MSE was calculated. Based on the obtained errors (Figure 10), the best model was found to be constructed with 400 particles. The PSO-ANN approximation of the LSVs is illustrated in Figure 11(d). Furthermore, the ANN parameters optimized by the PSO evolutionary algorithm are featured in Table 4.
All developed LSMs (i.e. for the SI, ANN, ABC-ANN and PSO-ANN models) were depicted in five susceptibility classes with respect to the natural break classification (Xu et al. 2012). The results represent an acceptable performance rate for all used techniques. According to Figure 11(a-d), the main landslide distribution path, showing the locations of the test and training landslides, has been correctly categorized as dangerous (i.e. the High and Very high hazard classes) by all four models. The main discernible distinction between the above maps concerns the ANN response: the majority of the ANN map is classified as safe areas, whereas the statistical method gives a more conservative prediction. For more detail, the percentage of pixels occupied by each susceptibility category is illustrated in Figure 12 for all employed models. According to this column chart, the majority of the area (86.83%) is identified as 'Very low' susceptibility by the ANN. In addition, the ABC-ANN and PSO-ANN methods exhibit almost equivalent approximations, particularly for the first four hazard groups (Figure 12). Moreover, a total of 76.72%, 23.96%, 30.55% and 5.37% of the study area were found to be exposed to significant landslide risk (the High and Very high susceptibility classes) by the SI, PSO-ANN, ABC-ANN and ANN predictive models, respectively.
Subsequently, the AUC index was used to evaluate the prediction accuracy of the landslide predictive models (Pradhan and Lee 2010). Note that the testing landslides (i.e. the landslide validation set) were considered here to evaluate the competency of each model. To do so, SPSS software was used to draw the receiver operating characteristic (ROC) curve, available in Figure 13, which plots the true positive rate (on the vertical axis) against the false positive rate (on the horizontal axis). It is noteworthy that the AUC can vary from 0.5 to 1, where 1 indicates ideal performance and, conversely, 0.5 indicates a random prediction. Referring to the calculated AUC values (0.901, 0.857, 0.803 and 0.766 for SI, PSO-ANN, ABC-ANN and ANN, respectively), it can be concluded that SI, with 90.1% accuracy, has shown the highest reliability for landslide susceptibility modelling in this study. Regarding the artificial intelligence methods, it is concluded that applying the optimization algorithms, including ABC and PSO, improves the performance of the ANN model; the accuracy rate of 85.7% reveals the superiority of PSO-ANN compared with the ABC-ANN (80.3%) and ANN (76.6%) approaches. In summary, the results of this study suggest that LSM for the intended study area is viable, and the authors believe that the resulting maps could be helpful for urban land use planning and engineers in Golestan province.
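The AUC reported above can also be computed directly from validation pixels without plotting the ROC curve, via its rank-statistic (Mann-Whitney) formulation: the probability that a randomly chosen landslide pixel receives a higher susceptibility score than a randomly chosen non-landslide pixel. The labels and scores below are hypothetical, not the study's validation data:

```python
def auc_score(labels, scores):
    """AUC as P(score of a random positive > score of a random negative),
    counting ties as half a win (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical susceptibility scores for seven validation pixels
# (1 = observed landslide, 0 = non-landslide).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
print(auc_score(labels, scores))  # 11/12 ≈ 0.917
```

A score of 0.5 from this function corresponds to the random-prediction baseline mentioned in the text, and 1.0 to a perfect ranking of landslide over non-landslide pixels.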

Conclusion and remarks
Recent years have witnessed broad use of various soft computing approaches for analyzing landslide hazard worldwide. In this paper, we assessed the feasibility and efficacy of ABC and PSO for landslide susceptibility zonation. To do so, the mentioned evolutionary algorithms were synthesized with an ANN to enhance its performance, and the SI method was also implemented. To generate a reliable LSM, 12 geological and hydrological effective factors, namely elevation, slope degree, lithology, rainfall, SPI, TWI, distance to road, distance to river, land use, slope aspect, soil type and plan curvature, were used as independent variables. Based on the results, the following conclusions were obtained: Referring to the calculated AUCs of 0.901, 0.857, 0.803 and 0.766 for SI, PSO-ANN, ABC-ANN and ANN, respectively, all applied approaches exhibited a satisfying performance. The efficiency of the typical ANN was effectively enhanced by applying the PSO and ABC metaheuristic algorithms. From a comparison viewpoint, the SI model (90.1% accuracy) presented the most accurate prediction, followed by the PSO-ANN (85.7%), ABC-ANN (80.3%) and ANN (76.6%) models. According to the achieved landslide hazard maps, a total of 76.72%, 23.96%, 30.55% and 5.37% of the study area were found to be exposed to high landslide risk by SI, PSO-ANN, ABC-ANN and ANN, respectively.
Finally, the results of the current study are offered for use by decision makers and in land use planning for the study area.