Two novel neural-evolutionary predictive techniques of dragonfly algorithm (DA) and biogeography-based optimization (BBO) for landslide susceptibility analysis

Abstract Due to the wide application of evolutionary science in different engineering problems, the main aim of this paper is to present two novel optimizations of multi-layer perceptron (MLP) neural network, namely dragonfly algorithm (DA) and biogeography-based optimization (BBO) for landslide susceptibility assessment at a study area, West of Iran. Utilizing 14 landslide conditioning factors, namely elevation, slope aspect, plan curvature, profile curvature, soil type, lithology, distance to the river, distance to the road, distance to the fault, land cover, slope degree, stream power index (SPI), and topographic wetness index (TWI) and rainfall as the input variables, and 208 historical landslides as target variable, the required spatial database is created. Then, the MLP is synthesized with the mentioned algorithms to develop the proposed DA-MLP and BBO-MLP ensembles. Three accuracy criteria of mean square error, mean absolute error, and area under the receiving operating characteristic curve are used to evaluate the performance of the models and also to develop a score-based ranking system. As the first outcome, the application of the DA and BBO metaheuristic algorithms enhances the accuracy of the MLP. Moreover, referring to the calculated total ranking scores of 6, 14, and 16, it was revealed that the BBO performs more efficiently than DA in optimizing the MLP.


Introduction
As a frequent and ubiquitous natural hazard, landslides are described as downward mass movements (triggered by gravity), and as a consequent of anthropogenic and natural activities (Varnes and Radbruch-Hall 1976;Cruden 1991;Aksoy et al. 2016;Aksoy and Gor 2017;Bui et al. 2019;. Due to undesirable impacts of this environmental threat on urbanization and environmental developments, studying the landslides has received a growing attraction from the side of international scientific community (Aleotti and Chowdhury 1999;Tsangaratos and Ilia 2016). Globally, the vast majority (more than 90%) of the landslides have occurred in developing countries, and they are responsible for around 17% of the reported fatalities (Pourghasemi et al. 2012). In Iran, the landslides have caused more than 185 loss of lives. Considering the distribution of the historical landslide events, the western part of Iran is known as a landslide-prone region. Seimareh landslide (occurred in Lorestan Province), for example, it was the largest debris flow observed worldwide (Shoaei and Ghayoumian 1998). Therefore, analysing the landslide susceptibility is a crucial prerequisite in facing this phenomenon .
Due to the involvement of various geo-environmental causative factors, investigating the landslide susceptibility entails establishing complex and non-linear relationships. So far, many studies have conducted proper landslide susceptibility mapping using statistical techniques (Youssef et al. 2015;Chen et al. 2016;Razavizadeh et al. 2017). Also, multicriteria decision models have shown good capability in this field (Kumar and Anbalagan 2016;Myronidis et al. 2016). In a comparative study, Nicu (2018) employed analytic hierarchy process (AHP), statistical index (SI), and frequency ratio (FR) for landslide susceptibility mapping in Romania. It was found that all three models achieve the prediction rate higher than 50%. However, the AHP (with around 80% accuracy in both training and testing phases) outperforms two other used approaches.
According to Pradhan (2013), utilizing machine learning tools are one of the most effective strategies suggested for analysing the relationship between the landslide and conditioning factors. These models have also shown high robustness for other environmental modelling like flood (Mojaddadi et al. 2017;Termeh et al. 2018) and groundwater contamination . In this sense, Pourghasemi, Jirandeh, et al. (2013) showed the efficiency of support vector machine by testing different kernel classifiers for landslide susceptibility assessment in Golestan Province of Iran. Similarly, Oh and Pradhan (2011) showed the efficiency of adaptive neuro-fuzzy inference system (ANFIS) for regional landslide susceptibility mapping in a prone district of Penang Island, Malaysia. Moreover, many studies have successfully used artificial neural network (ANN) for discerning the spatial relationship between landslide and its conditioning factors (Lee et al. 2004;Kawabata and Bandibas 2009;Can et al. 2019;Tian et al. 2019). Yilmaz (2009) showed the superiority of the ANN when is compared with FR and logistic regression (LR) models. Accordingly, area under the curve (AUC) of the used models was 0.852, 0.826, and 0.842, respectively for the ANN, FR, and LR.
Looking for more powerful evaluative tools, many scholars have suggested the use of hybrid metaheuristic algorithms ). More clearly, utilizing typical intelligent models is associated with some computational drawbacks like local minimum (Moayedi et al. 2018). Optimization algorithms enhance the accuracy of these tools by overcoming the mentioned problems (Chen et al. 2017;Gao, Wang, et al. 2018;Gao et al. 2019;. Bui et al. (2017) used artificial bee colony optimization technique for determining proper set of hyper-parameters of least squares support vector machines (LSSVM) in spatial analysis of rainfall-triggered landslides at Lao Cai area, Vietnam. With 90% accuracy, the susceptibility map produced by the proposed model was more accurate than which produced by the typical SVM prediction. Likewise, Tien Bui et al. (2018) used imperialist competitive algorithm (ICA) for optimizing relevance vector machine (RVM) to produce the landslide susceptibility map of Lang Son city of Vietnam. Their findings indicated that the suggested RVM-ICA (AUC ¼ 0.92) performs more efficiently than two benchmarks, namely the SVM (AUC ¼ 0.91) and LR (AUC ¼ 0.87).
The focal point of this paper is to present two novel optimizations of the ANN, namely dragonfly algorithm (DA) and biogeography-based optimization (BBO) for landslide susceptibility mapping at a landslide-prone area, West of Iran. Although various optimization techniques have been applied to intelligent models, the authors did not come across any precedent study focused on combining the DA and BBO with ANN for the mentioned purpose. This is worth noting that the main contribution of these algorithms to the spatial analysis of the landslide susceptibility is finding the most appropriate computational parameters of the ANN which are assigned to the conditioning factors.

Study area
The study area is located between the longitude 46 00 0 to 47 20 0 E, and the latitude 34 45 0 to 35 48 0 N, West of Iran ( Figure 1). It covers an area of nearly 7811 km 2 . The study area is known as a mountainous area with around 500 mm of average annual precipitation. Two major climates in this region are humid weather and the Mediterranean warm, which results in heavy rainfalls and snowfalls . Our proposed area comprises three cities of 'Marivan', 'Sanandaj', and 'Kamyaran'. The altitude ranges approximately from 750 to 3100, and the maximum terrain slope is about 60%. The land is mostly covered by good ranges, and the economy of the inhabitants is significantly influenced dependent on dry farming and irrigated agriculture (Rahmati et al. 2015). According to the lithology map, 40 different lithology units are found there, where 'Dark grey argillaceous shale' is the most common one, covering around 18% of the area. Moreover, five soil groups can be found, namely 'Inceptisols', 'Rock Outcrops/Entisols', 'Rock Outcrops/Inceptisols', 'Water Body', and 'Entisols/Inceptisols', where the majority of the area (around 80%) is covered by the second soil group.

Data preparation and spatial interaction between the landslide and conditioning factors
When it comes to artificial intelligence techniques, the used dataset consists of two types of parameters, including a target variable, influenced by one or more independent variable(s). In this research, the landslide inventory map plays the role of the target variable, and the input variables were 14 landslide conditioning factors, including elevation, slope aspect, plan curvature, profile curvature, soil type, lithology, distance to the river, distance to the road, distance to the fault, land cover, slope degree, stream power index (SPI), and topographic wetness index (TWI) and rainfall. These layers were created and processed in geographic information system (GIS) with a pixel size of 10 Â 10 m Vakhshoori and Pourghasemi 2018). A total of 208 historical landslide events were identified by using previous records Mezaal et al. 2018;Mojaddadi Rizeei et al. 2019;Rizeei and Pradhan 2019) as well as interpreting the aerial photos (in 1:25,000 scale), supported by field monitoring. It is worth noting that, typologically, the marked landslides were mostly rotational slides. Therefore, based on the landslide classification of Varnes (1978), other types of slope failures suchlike translational slides, flow, and lateral spreading were eliminated (Nguyen, Bui, et al. 2019). The same number of non-landslide points were then randomly created over the areas devoid of the landslide. Also, in order to explore the spatial relationship between the landslide and its conditioning factors, the FR theory is used. In this method, the correlation between each sub-group and landslide is directly proportional to the magnitude of the calculated FR . Let N landslide and N domain stand for the percentage of the landslides marked in the proposed sub-group and the percentage of the terrain covered by it, respectively; then the FR is calculated as follows: Figure 2 portrays the map of each conditioning factors along with the obtained FRs. Moreover, Table 1 presents the description of the lithological units (Figure 3), as well as the calculated FRs. As these charts and the table denote, the categories of (1000-1500) m for elevation, North (0-22.5 ) for slope aspect, 'Concave' for plan curvature (0.001-1.1576e þ 10) for profile curvature, 'Inceptisols' for soil type (100-200) m for distance to the river (200-300) m for distance to the road (300-400) for distance to the fault, 'Agriculture' for land cover (10-15) degrees for slope (85e5-23e6) for SPI (À9.38 to À7.40) for TWI (500-600) mm for rainfall, and 'Jugr' for lithology are the most correlated sub-groups, due to the largest FRs obtained for them.

Methodology
Acceding to Figure 4, the overall methodology of this research can be presented in three major steps. Firstly, the spatial database consisting of landslide inventory map as well as 14 landslide conditioning layers is provided, then the identified landslide points are randomly divided into the training (i.e. 70% or 146 landslides) and testing (the remaining 30% or 62 landslides) parts. Note that, the first group is devoted to develop and train the intelligent models, and the second group is used as unseen landslides to evaluate the accuracy of the implemented models. After that, the MLP (multi-layer perceptron) is coupled with the DA and BBO evolutionary algorithms to develop the DA-MLP and BBO-MLP neural ensembles. Lastly, after generating the landslide susceptibility maps, the performance of the models is evaluated by three accuracy criteria, namely mean square error (MSE), mean absolute error (MAE), and area under the receiving operating characteristic curve (AUROC). Equations (2) and Figure 2. The calculated FR for: (a and b) elevation, (c and d) slope aspect, (e and f) plan curvature, (g and h) profile curvature, (i and j) soil type, (k and l) distance to river, (m and n) distance to road, (o and p) distance to fault, (q and r) land cover, (s and t) slope degree, (u and v) SPI, (w and x) TWI, and (y and z) rainfall. (3) express the formula of the MSE and MAE: in which N shows the number of involved samples, and Y i observed and Y i predicted are the desired and predicted values of landslide susceptibility, respectively. Here, the used intelligent models are described:

Artificial neural network
Mimicking the connecting behaviour of the biological neural networks, the idea of ANN was first presented by McCulloch and Pitts (1943). The ability of non-linear analysis has made the ANNs capable tools for input-output mapping of a set of data samples (ASCE Task Committee 2000). Different types of ANNs have been promisingly used for simulating various engineering problems (Moayedi and Rezaei 2017;Hayati 2018a, 2018b;Azeez et al. 2019). MLP is one of the most regularly used types of ANNs which is composed of at least three layers which contain the main processor units (see Figure 5). It has been stated that one hidden layer is sufficient for neural modelling of any problem (Cybenko 1989;Hornik et al. 1989).
Kg be the input and output vectors, respectively. Then, the response of the jth node is calculated as follows: in which W and b stand for the weight and bias. Also, F symbolizes the activation function that is set to be hyperbolic tangent sigmoid (Tansig) in this study: (5)

Dragonfly algorithm
The DA has been proposed as a proper algorithm for optimization tasks (such as adjusting the ANN synaptic weights). Mirjalili (2016) proposed this algorithm for the first time.
It is an algorithm that inspired using the static and dynamic conducts of dragonflies. In the life cycle of dragonflies, there are two different stages including (i) the nymph stage which forms most of their cycle of life and (ii) transformation to the adult stage. The stationary behaviour of dragonflies is the same as the utilization stage in meta-heuristics (Russell et al. 1998). From a different aspect, the dynamic conduct is same to the exploration stage that is where they enter to groups and fly aboard of various zones to discover food resources (Wikelski et al. 2006). The DA is derived by the Reynolds Swarm Intelligence. It draws on around five distinct principles, which are fundamental in discovering the solution of weights.
1. The separation origin has been known as the collision digression of a dragonfly of other dragonflies, which are near its zone (Figure 6(a)). 2. The alignment origin shows velocity conforming of a dragonfly for other different dragonflies, which are near to zone (Figure 6(b)). 3. The cohesion origin is considered as the trend of a dragonfly to the space centre and includes other dragonflies near to its location (Figure 6(c)). 4. Whereas the basic aim of dragonfly swarms is to survive, attraction to food origin ( Figure 6(d)) shows that dragonflies have to move near food sources. 5. The distraction from enemies' origins (Figure 6(e)) occurs when dragonflies run far away from the enemy sources for surviving (Mirjalili 2016).
By Equation (6), we can calculate the separation value. X shows the current dragonfly position, X j stands for the of the jth dragonfly position near X, n indicates the dragonflies number. The alignment amount can be computed by Equation (7). The term Vj shows the velocity of jth dragonfly near the current one. The cohesion amount can be computed by Equation (8). The attraction to food from dragonflies can be computed by Equation (9). X f stands for the food source position. The confusion of enemy amount can be predicted by Equation (10). X e shows the enemy position. To update the dragonflies position for examining different weight solution as well as possess another fitness amount, DX and X have been computed by Equations (11) and (12) (Mirjalili 2016): in which, a, s, e, f, c, e, and w are the weights of their related element.

Biogeography-based optimization
BBO is a population-based search method which was developed according to the biogeography theory (Simon 2008). Mirjalili et al. (2014) coupled this algorithm with a MLP neural network for the first time. The flowchart of the BBO is presented in Figure 7. Similar to other evolutionary techniques, at the first step of the BBO, a random population is produced named 'habitat'. Two parameters of habitat suitability index (HSI) and suitability index variable (SIV) are defined to evaluate the goodness of these individuals (i.e. possible solutions) and the habitability of the habitats and areas. The BBO originally includes two operations, namely migration and mutation that are briefly explained in the following (Hadidi 2015). Firstly, the possible solutions are modified in order to enhance their goodness. In this phase, to decide whether the modification of the SIVs is necessary or not, an immigration rate (kg) is defined (Bhattacharya and Chattopadhyay 2010). In the cases where the modification is required, an emigration rate (lg) is defined. It is used to probabilistically determine the solution that will migrate. This is worth noting that the algorithm keeps away the highly fitted solutions to prevent probabilistic corruption (Roy et al. 2010).
As is known there are various natural hazards which threaten a geographical area. Hence, it is likely to observe some abrupt changes in HIS values. Consequently, during the mutation process, the habitat might deviate from its equilibrium HIS. In this stage, a probability factor is calculated for each relation of the population decide whether it needs to mutate or not. Moreover, this factor determines the solution to the existing problem. In this sense, the more value of probability, the more closeness to the overall solution (Hadidi 2015). Let S be the number of species, then, the rate of mutation is expressed as follows:

Model implementation
As mentioned previously, this paper outlines two metaheuristic optimization of the neural intelligence for the problem of landslide susceptibility. To do this, the DA and BBO techniques are synthesized with a MLP neural network to enhance its prediction capability. More specifically, the solutions that are suggested by these algorithms contain the optimized values of the weights and biases of the MLP. Therefore, it needs to be mathematically introduced to the algorithms. This process was carried out by the assist of the programming language of MATLAB 2014. The proposed DA-MLP and BBO-MLP networks were implemented with population size of 250 and 1000 repetitions. Also, the MSE was defined as the objective function (OF) to measures the performance error at the end of each iteration. The convergence curves of the implemented models are presented in Figure 8. As can be seen, there are some significant distinctions between the optimization behaviours of the DA and BBO algorithms. Although the maximum value of the OF is between 0.23 and 0.25 for both of them, the BBO continues decreasing error until the last repetition. This is while in DA, the majority of the error reduction has occurred between 220 and 424th repetitions. Also, at the end of the process, the training error of the DA-MLP and BBO-MLP reaches to 0.1923 and 0.1378, respectively.

Landslide susceptibility mapping
After developing the DA-MLP and BBO-MLP predictive ensembles, the calculated landslide susceptibility values were transformed into the GIS environment to produce the susceptibility maps of each model. Next, based on these values, the study area was divided into five susceptibility classes, namely 'Very low', 'Low', 'Moderate', 'High', and 'Very high' . It was done by using natural break classification method which tries to reduce the variance within classes and maximize the variance between classes (Jenks 1967;Liu and Duan 2018). The main reason for selecting natural break was the highest popularity of this method for this aim (Irigaray et al. 2007;Akgun et al. 2012;Xu et al. 2012;Jaafari et al. 2014;Gao, Guirao, et al. 2018;. Also, it is proper to note that other existing methods, like equal interval and quantile method, have some weaknesses for being used in such works    Figure 9(a-c), respectively for the typical MLP, DA-MLP, and BBO-MLP prediction. The percentage of the area covered by each susceptibility category is calculated and presented in Table 2. As the table denotes, the MLP has classified 71.02% (around 5548 km 2 ) of the area as hazardous regions (i.e. under the High and Very high susceptibility). This value is obtained as 50.06% (around 3911 km 2 ) and 23.51% (around 1836 km 2 ) by the DA-MLP and BBO-MLP, respectively. At the same time, considering the safe areas (i.e. Very low susceptibility class), it can be deduced that the MLP has presented more cautious prediction. In this regard, 3.07%, 11.67% and 20.99% of the area is recognized by the least landslide susceptibility.

Performance assessment of the models
The performance assessment of the implemented models consists of two parts. Statistically, the MSE and MAE are defined to measure the training and testing error of the implemented MLP, DA-MLP, and BBO-MLP. The results are portrayed in Figure 10, showing a graphical comparison between the predicted and observed landslide susceptibility indices, as well as the histogram of the errors. According to these figures, synthesizing the MLP with the DA and BBO evolutionary techniques has effectively helped it to enjoy more learning and prediction capability of landslide pattern. More clearly, in the training phase, the MSE experienced a decrease by around 16% and 40% as the result of applying the DA and BBO algorithms, respectively. About the MAE, it was reduced by around 12% and 29%. As for the testing phase, the DA was more successful than BBO, in terms of decreasing the MSE (nearly 11% vs. 16%). But the MAE results show a slightly lower mean absolute error for the BBO products (0.4134 vs. 0.4142).
The third considered criterion is the area under the receiver operating characteristic curves (AUROC) which measures the accuracy of the developed susceptibility maps. In this regard, many studies have recommended the ROC curves for evaluating the predictions related to diagnostic problems (Egan 1975), like natural hazards (Beguer ıa 2006). When it comes to landslide susceptibility modelling, the ROC shows the proportion of the non-landslide grid cells which are correctly labelled 'no-failure' (i.e. the specificity) versus the proportion of the landslide grid cells which are correctly labelled 'failure' (i.e. the sensitivity) (Swets 1988;Lasko et al. 2005;Beguer ıa 2006). The obtained ROC curves of both training and testing phases of the MLP, DA-MLP, and BBO-MLP models are presented in Figure 11(a,b), respectively. As the

Discussion and comparison
The obtained results were presented in the previous section. In this part, a scorebased ranking system is developed to compare the efficiency of the applied models.
To this end, based on the calculated MSE, MAE, and AUROC measures, one of the scores 3, 2, and 1 is assigned to them so that the lager scores indicate more accuracy of the model (i.e. the smaller MSE and MAE, and larger AUROC). Finally, the total ranking score (TRS) is obtained by summing the mentioned scores for both training and testing phases, to rank the implemented MLP, DA-MLP, and BBO-MLP. The results are presented in Table 3. As can be seen, the BBO-based neural ensemble outperforms the DA-based and typical MLP networks in training phase in terms of all three MAE, MSE, and AUROC indices. As for the resting phase, the DA-MLP was the superior model, due to the lower MAE and also a higher AUROC in comparison with MLP and BBO-MLP. However, the MAE of the BBO-MLP is slightly lower than DA-MLP. All in all, the obtained TRSs of 6, 14, and 16, respectively for the MLP, DA-MLP, and BBO-MLP indicate that the BBO improved the performance of the MLP, and at the same time, performed more efficiently than the DA. In this paper, it was shown that the solution suggested by the BBO is more successful than DA. In other words, the weights and biases of the MLP which were optimized by the BBO contribute better to the spatial susceptibility assessment. In this sense, many scholars have stated that the weights of each landslide conditioning factor are used to determine the relative importance of that layer (Lee et al. 2004;. Examining the optimization process, it was observed that the DE achieves a reasonable MSE after around 420 tries and remains more and less steady after that. While the BBO kept decreasing the error until the last iteration. It resulted in a huge distinction between the training accuracy of the models. Moreover, since both DA and BBO presented a very close accuracy for the resting phase (DA outperformed BBO in terms of MSE and AUROC), it can be deduced that a better-trained network does not necessarily guarantee a higher prediction accuracy. Besides, simultaneous investigation of the calculation time and accuracy gives that, however, the BBO featured as the more reliable model, the DA seems to be a more reasonable algorithm for landslide susceptibility mapping when the time comes out as a determinant factor. This is because the DA minimized the error in nearly 3590 s and achieve a more accurate prediction, compared to the BBO which took about 4690 s.

Conclusion
Recently, hybrid metaheuristic algorithms have gained a growing tendency for solving various engineering issues with high complexity. In this work, two novel notions of these algorithms, namely DA and BBO were applied to the problem of landslide susceptibility mapping. The mentioned algorithms were coupled with a MLP neural network to optimize its performance in discerning the spatial relationship between the landslide and the considered conditioning factors. As the first outcome, utilizing evolutionary science is an effective way for increasing the reliability of neural computing. In results, it was revealed that the DA-MLP ensemble needs less time to minimizes the training error, compared to the BBO-MLP. However, the BBO-MLP achieved a lower error (MSE BBO-MLP ¼ 0.1378 and MSE DA-MLP ¼ 0.1923). Also, despite the superiority of the BBO-MLP in learning landslide pattern, both ensembles presented a close prediction accuracy (AUC BBO-MLP ¼ 0.722 and AUC DA-MLP ¼ 0.744). ALL in all, the BBO-MLP was the most successful model of the current paper. Finally, the produced landslide susceptibility maps can be used in land use planning for alleviating the damages caused by this natural disaster.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This research was supported by Ton Duc Thang University (Vietnam).