Mapping landslide susceptibility and types using Random Forest

ABSTRACT Landslides are one of the most destructive natural hazards; they can drastically alter landscape morphology, destroy man-made structures, and endanger people’s life. Landslide susceptibility maps (LSMs), which show the spatial likelihood of landslide occurrence, are crucial for environmental management, urban planning, and minimizing economic losses. To date, the majority of research into data mining LSM uses small-scale case studies focusing on a single type of landslide. This paper presents a data mining approach to producing LSM for a large, heterogeneous region that is susceptible to multiple types of landslides. Using a case study of Piedmont, Italy, a Random Forest algorithm is applied to produce both susceptibility maps and classification maps. These maps are combined to give a highly accurate (over 85% classification accuracy) LSM which contains a large amount of information and is easy to interpret. This novel method of mapping landslide susceptibility demonstrates the efficacy of Random Forest to produce highly accurate susceptibility maps for a large heterogeneous region without the need for multiple susceptibility assessments.


Introduction
Landslides can be broadly defined as a movement of a mass of rock, earth or debris down a slope (Cruden, 1991). They are one of the most destructive natural hazards; they can drastically alter landscape morphology, destroy man-made structures, and endanger people's life. Identifying areas which are predisposed to landslides is vital for ensuring human safety, environmental management, urban planning, and minimizing economic losses (Kavzoglu, Sahin, & Colkesen, 2014;Zêzere, 2002).
Landslide can range from an individual rock fall to large creep failures. The type of landslide is central to impacts such as casualties, damage to structures, and socioeconomic consequences (Alexoudi, & Papaliangas, 2010). In areas where numerous varieties of landslide occur, it is useful to differentiate between and explicitly map susceptibility to each type (Mantovani, Soeters, & Van Westen, 1996). As an example, rock avalanches are highly destructive, fast-moving debris flows can cause widespread damage and casualties, while slow-moving landslides typically cause damage to property (Guzzetti, Carrara, Cardinali, & Reichenbach, 1999). Events which trigger landslides LSM which treats all landslides as a single class, showing overall susceptibility. The second is to classify the most probable landslide type for each grid cell in the study area. By combining the two maps, it is possible to identify both highly susceptible areas and attribute the area with a landslide class. This has the benefit of presenting a large volume of information on a single map, which can be easily interpreted by planners and decision makers.
The method proposed in this paper is a Random Forest (RF) data mining algorithm. RF offers many appealing characteristics for classification task. As RF is a non-linear, nonparametric algorithm, it can deal with large datasets containing both categorical and numerical data and account for complex interactions and non-linearity between variables. Second, it can handle the case where there are more predictors than observations and incorporate interaction between multiple predictors. Third, RF is able to handle missing values and maintain accuracy for missing data. Furthermore, compared to other machine learning methods, such as artificial neural network and support vector machine, RF does not require much fine-tuning of hyper-parameters. In many cases, using default parameter settings can achieve excellent performance. Comparing with other treeensemble methods (e.g. Boosting), RF is computationally light. Therefore, RF is commonly used in for large-scale mapping and classification applications in ecology (Akar & Güngör, 2015;Prasad, Iverson, & Liaw, 2006), soil science (Hengl et al., 2015;Taalab, Corstanje, Creamer, & Whelan, 2013), and flood mapping (Feng, Liu, & Gong, 2015).
The remainder of the paper is organized as follows. First, we describe the framework and methodology in Section 2. Then, an empirical case study of Piedmont, Italy is presented in Section 3. Finally, conclusion and future research are demonstrated in Section 4.

Framework and methodology
In sum, we need to train a binary RF classifier to predict the landslide susceptibility and train a multi-class RF classifier to infer the landslide type. Given the conditioning factor (input features) of a certain area, we first estimate the susceptibility probability of an area, if the probability is higher than a predefined threshold, we then predict the most probable landslide category, as shown on Figure 1. As the RF is the model to do the binary and multi-class prediction, we mainly describe the RF in the following subsection and then we give a brief introduction to the dataset.

Random Forest (RF) for landslide susceptibility mapping and classification
RF is a data mining algorithm that is able to accurately classify large amounts of data using an ensemble of decision trees (Breiman, 2001). Decision trees are predictive models which use a set of binary rules to determine a target variable ( Figure 2). The data used to train the model is comprised of a set of predictor variables and the target variable, which is being predicted. The purpose of the decision tree is to use the predictor variables to partition the data into homogeneous datasets with respect to the target variable. In the simplest case, the target variable will be a binary classification (e.g. the presence or absence of landslides). Both target and predictor data are present in the root node. The algorithm then tests the ability of each predictor variable to divide the target variable into the two classes. The predictor  variable which leads to the most accurate classification is selected and the process is repeated at each of the new nodes. The splitting process continues until there are no more splits to be made (i.e. each terminal node contains target variable data of only a single class).
A single decision tree is a weak classifier. Typically, it has either high variance (the splits at each node are so closely aligned to the training data that the model cannot be used to predict new data) or high bias (the splits do not accurately represent the relationship between predictor and target variables). RF mitigates these problems by using an ensemble of decision trees, which strikes a balance between the two sources of error.
If the same data were used to train multiple decision trees, they would all be identical, which defeats the object of an ensemble model. To avoid this, RF increases the diversity of the trees by making them grow from different sub training datasets created via bagging. Bagging is used when our goal is to reduce the variance of a decision tree. It generates several subsets of data by randomly sampling with replacement from training samples (typically two-thirds) to train multiple decision trees. The rest of the data that is not used to train the decision tree is known as "Out-Of-Bag" (OOB) data. RF further extends the concept of bagging. Besides using the subset of data, it also takes the random selection of predictor variables rather than using all of them to grow trees. The classification accuracy of each decision tree and ultimately the RF is assessed by predicting the mean square error (MSE) of the "out-of-bag" portion of the data, then averaging over the entire forest (known as the OOB error): whereẑ OOB i is the mean out of bag prediction for the ith observation. RF modeling also provides a measure of fit comparable to the R 2 values of the other models. This "pseudo R 2 " is labeled the "percent variance explained" and is calculated using: whereσ 2 y is the total variance of the dependent variable calculated with n as the divisor, rather than n-1 (Liaw & Wiener, 2002).
The OOB data allow RF to rank predictor variables in order of importance. This is done by measuring how much the OOB estimate error increases when data for a particular variable is "removed" from the analysis and the other variables are left intact. This is done on a tree-by-tree basis for the entire forest. The variables which cause the greatest increase in OOB error when removed are deemed to be those of greatest importance. RF needs to be defined two parameters for generating a classification model: the number of trees in the forest (ntree) and the number of variables tested at each node (mtry) to make the tree grow. There are no specific rules about the number of trees required in a RF and increasing the number of trees does not automatically increase the accuracy of the model, however, it will increase the computational burden. A rule of thumb is that the number of trees should be tested and increased to the point where the OOB error stabilizes. Models can be sensitive to the mtry parameter, as using a greater number of variables at each split will increase the strength of the individual tree but also increase the correlation between trees in the forest. Increasing correlation between trees typically increases the error rate in predictions, while increasing the predictive accuracy of individual trees decreases error rate. For this reason, it is necessary to test a number of mtry values. RF makes predictions by running new data down every tree in the forest. The new data is assigned a classification on the basis of majority vote. The proportion of votes that a class receives is used to attribute a probability of class membership (Bostrom, 2007). For example, if we are using a RF made up of 100 trees and we are trying to predict whether a new location is susceptible to landslides we take all the conditioning factors from that location and feed them into the RF. Based on the splitting rules of each 100 trees, the new data will be classified as either susceptible or nonsusceptible. If trees classify the new data as susceptible and 25 classify as non-175 susceptible the point will be classified as susceptible (majority vote) and be given the susceptible class probability of 0.75.
For LSM, the predictor variables are the conditioning factors and the target variable is the presence or absence of landslides. At each node, a number of conditioning factors are tested to determine which can most accurately differentiate between susceptible and non-susceptible areas. Using the example in Figure 2, the first split is above and below slopes of 5°. Following the right-hand branch of the tree, the next split is based on separating alluvial lithology from all over groups. This creates terminal nodes where all data used to train this decision tree has been split into homogenous "Landslide" and "No Landslide" groups, based on two splitting rules.
The process of identifying the landslide types is very similar to the binary split of "landslide" and "no landslide" as described above. The only difference is that now the target variable (landslide class) is multiclass rather than binary. The purpose of splitting decisions is to create homogenous groups of landslide types. Generally, as there are multiple classes, there will need to be more splits to reach terminal nodes, meaning trees will be bigger. New data are still predicted by a majority vote of all trees in the forest, and the proportion of the votes received represents the probability of class membership. One difference is that now the majority votes no longer needs to be the overall majority. For instance, classes A, B, and C receive 40%, 30%, and 30% of the votes, the data will be classified as a despite the fact that there is a 60% probability that it does not belong to that class.

Training data and sampling resolution
Developing a RF model for LSM needs data to train the model and validate its predictions. For a LSM, the primary requirement is data that represents both susceptible and non-susceptible areas. Typically, susceptible data is sampled from in and around landslides identified by the inventory (Nefeslioglu, Gokceoglu, & Sonmez, 2008). Non-susceptible areas are taken from areas beyond a buffer zone of previous landslides (Park, Choi, Kim, & Kim, 2013) or from areas where landslides are physically unlikely to occur (Gomez & Kavzoglu, 2005). Inventory data and data on conditioning factors are usually stored as GIS layers made up of grid cells of a given resolution. Once susceptible and non-susceptible areas have been identified, sampling conditioning factors from corresponding spatial locations is a straightforward task using GIS. As data mining models are typically "black box", meaning it is very difficult to define the relationship between variables, the accuracy of the models needs to be tested. Typically, the total training dataset is split with approximately 70% of samples used to train models and the remainder used for validation.
The physical factors known to control slope stability include slope angle, a range of physical soil properties such as shear resistance and cohesion, hydrological properties and the influence of biomass (Vorpahl, Elsenbeer, Märker, & Schröder, 2012). These data are generally not available at large scale; therefore, it is common to use geomorphological attributes that act as a proxy. There are a huge number of data which can be used as conditioning factors and there is no definitive method of selecting which should be used. While some commonly used conditioning factors, such as slope and lithology, are widely accepted as influential, there is debate about the merits of others (Catani, Lagomarsino, Segoni, & Tofani, 2013). In many cases, selection will depend on data availability, quality, and the findings of related studies.
Landslide inventories often contain other information, including a classification. This is generally based on the volume of material involved, speed, and type of movement, as well as the underlying geological conditions Varnes (1978). There is, however, no single exhaustive taxonomy; they can vary both regionally and nationally. The end users of the LSMs will typically be familiar with the classification system used in their region.
It is necessary to consider the both the format and resolution at which to represent LSMs, this decision should be focused on the end user. Typically, LSMs are used for planning purposes, hence the map needs to be of sufficient resolution to be fit-forpurpose, as well as being easy to interpret for non-specialists. This study will present LSMs at a 100 m resolution grid cell format. In comparison to terrain or slope units, grid cells are more easily aggregated to various administrative boundaries, which can aid decision making (Trigila et al., 2013). This approach is also more straightforward when sampling input parameters. Catani et al. (2013) found that a grid size of 50-100 provided the most accurate LSM classification, therefore, the input data in this study 240 was dis/ aggregated to a 100 m grid.

Case study area
To demonstrate the efficacy of RF for LSM and classification, the method will be applied to the case study region of Piedmont, a 25,402 km 2 region in northwest Italy ( Figure 2). This region is particularly appropriate as since 1950, in Italy alone the economic cost of landslides has been more than 53,249 billion Euros. They have also caused more than 3500 deaths in the country (Trezzini, Giannella, & Guida, 2013). Italy has the most infrastructure (in terms of km of roads) and land (in terms of km 2 ) exposed to landslides within Europe, moreover, Piedmont has been identified as being within a landslide "hotspot" (Jaedicke et al., 2014). For this reason we would like to determine both the susceptibility to landslides and the type of landslides to which an area is susceptible.

Landslide inventory
The landslide inventory used in this study is SiFRAP (Sistema Informativo Frane in Piemonte Landslide information system in Piedmont) is a dataset containing 30,439 landslides dating from the early twentieth century to 2006, mapped at a scale of 1:10,000 (Figure 3(b)) (Lanteri & Colombo, 2013). This is an update of the IFFI (Inventario dei fenomeni franosi in Italia-Inventory of Landslide in Italy) project (Amanti, Bertolini, & Ramasco, 2001). Of the 30,439 landslides, 236,715 have been classified based on the type of mass movement involved (Table 1). A comprehensive description of the classification taxonomy is available from SiFRAP (Piemonte, 2009). The location of the landslides in the inventory are shown in Figure 3(b).

Conditioning factors
A description of some commonly used LSM conditioning factors is shown in Table 2, along with a description of the ranges of data found in our case study region.
Landslide areas were sampled at a 100 m grid resolution, from within the boundaries of historic landslides within the inventory. The non-landslide data points were also sampled on a 100 m grid, selected randomly from the within the study area, excluding areas within 200 m of the existing landslide sites. The number of nonlandslide points was equal to the number of landslide points, give a total of 479,412, divided into a training dataset of 335,740 samples and a test dataset 143,692 samples. As not all the landslides have been categorized, a subset of the 236,715 samples, divided into a training dataset of 165,698 and a validation dataset of 71,016 was used to train and test the classification model. The accuracy of the maps produced will be assessed using a confusion matrix, as recommended by Kavzoglu et al. (2014). For the RF model, we determined 200 trees to be sufficient to produce stable models, echoing other experimental results Lanteri & Colombo, 2013). After testing, the optimal mtry was determined to be 2 variables at each split. The models were generated using the "RandomForest" package (Liaw & Wiener, 2002) in the R statistical computing language.

Results
The LSMs produced by RF are shown in Figure 4. Figure 4(a) shows a binary classification, where the region is classified as either susceptible to landslides (black) or not susceptible to landslides (grey). Figure 4(b) shows landslides susceptibility on a continuous scale from high to low. As stated, RF can predict both class and the probability of class membership. Figure 4(b) shows the probability of memberships to the susceptible/ landslide prone class. Across the region, susceptibility is highest in the mountainous areas in the north, west and south of the region and lowest in the alluvial plain in the centre and east.
Using the test dataset, the recall and precision is 88.41% and 88.66%, respectively and the overall classification accuracy of the model is just over 88% (Table 3). This means that over 88% of the test dataset was correctly identified as being either a landslide area Multiple Areas affected by one or more landslides and/or by morpho-structural elements associated with them. These landslides are generally not dated, and often involve entire slopes.
or a non-landslide area (Figure 4(a)). Results show that the largest source of error comes from landslide prone areas being classified as non-susceptible. This is to be expected as many on the mountainous regions will be highly susceptible to landslide occurrence despite the lack of previously recorded landslides in the area. Slope is the angle formed between any part of the surface of the earth and the horizontal. The angle is a prominent controlling factor on the shear stress experienced by earth and rock mass on a slope. This is on a 20 m grid derived from the DEM ranging between 0°and 87°. Aspect The aspect of each grid cell will have a bearing on the amount of rainfall and intensity of rainfall it experiences, as well as the amount and intensity of solar radiation. Aspect is defined as the compass direction of a slope. From 0°to 360°, flat areas are assigned −1.

Curvature
Curvature can be thought of as the slope of a slope. This will affect both stress on the material on the slope and the movement of water across the slope surface. Derived from the DEM on a 20m grid. The values range from −510 to 192. Profile curvature This is the rate at which the slope gradient changes parallel to the direction of maximum slope. A positive value indicates the cell is part of an upwardly concave slope. A negative value indicates that the slope is upwardly convex. In the study areas this ranges from 295.7 to −192.3.

Plan curvature
The slope perpendicular to the direction of maximum slope. A positive value indicates the cell is part of a sidewardly concave slope. A negative value indicates that the slope is sidewardly convex. In the study areas this ranges from 59.9 to 214.5.

Parent material Lithology
Lithology represents the geomechanical properties of bedrock and is a controlling factor in the structural and chemical propertied of soil. This study uses a 1 km raster grid showing the dominant parent material, divided in 12 classes in the region (Van Liedekerke, Jones, & Panagos, 2006). Land form Pennock landform classification, divided the study areas into seven classes of threedimensional landform elements. (Pennock, Zebarth, & De Jong, 1987). Landform has been shown to strongly influence LSM (Schulz, 2007).

Topographic wetness index (TWI)
Topographic wetness index represents a hypothetical measure of the accumulation of water flow at any point within a river basin. This can be considered to represent the distribution of soil moisture in the region. This was derived from the DEM on a 100m grid. Values range from 3.9 to 30.4. Soil classification Soils influence how water moves across the landscape. Some soils are more cohesive, others more prone to erosion. This will affect the conditions which trigger landslides and the type of landslide which occurs. World Reference Base (WRB) soil classification on a 1 km grid, divided in 15 classes in the region (Van Liedekerke et al., 2006).

Land cover
Represents vegetation and how the land is used, both of which can influence susceptibility. We use the CORINE land cover map 2006. A 1:100,000 scale land cover map divided into 16 land cover classes in the region, produced by interpretation of Landsat TM and SPOT HRV satellite imagery. Distance from road Building roads can destabilize slopes, leaving them predisposed to landslides. Furthermore, the vibrations caused by traffic can become a triggering mechanism. This was derived from the OpenStreetMap road network map of Italy using a GIS operation to calculate distance from a line. This produced a raster grid of 100 m resolution. Distance from river Proximity to the stream network has been shown to influence susceptibility. Streams have the power to erode soil and apparent material (Gomez & Kavzoglu, 2005). This was derived from a river network map of Italy (available from ISPRA) using a GIS operation to calculate distance from a line. This produced a raster grid of 100 m resolution.

Average annual rainfall
Rainfall is generally considered as a triggering mechanism for landslides, however, it is rarely included in LSM. In an study area this large, the rainfall will be spatially variable and should therefore be considered as a predisposing factor . Here the average annual rainfall on a 20 km 2 grid ranging from 684 to 2640 mm y -1 .

Hydrogeological complex
A classification based on hydrogeological formations, which contain similar geological, hydrogeological, productivity, and hydrogeochemical facies. This map is produced by ISPRA at 1:10,000 scale, divided into 11 classes within the region.
RF can also rank predictor variables in order of importance, based on each of their relative contribution to the classification accuracy of the model. Figure 5 shows that distance to rivers, TWI and rainfall are all important predictors, whereas, the various measures of curvature are relatively ineffective predictors of the presence or absence of landslides.
In the SiFRAP dataset, landslides have been divided into the multiple classes shown in Table 1. Figure 6 shows a RF prediction of the dominant landslide classes distributed across the study area. Using the test dataset, the random forest model is shown to have an overall classification accuracy, recall, and precision of 77.51%, 60.34%, and 77.26%, respectively. According to Table 4, among all categories, the "crash/roll over" and "fast dripping" landslide have the lowest recall value of 18.45% and 11.41%, respectively. This phenomenon may result from the class-imbalanced data samples, which should be further considered in the future research. Over 75% (Table 4). The probability of membership to each landslide class is calculated for each of the points in the testing    Figure 6. a) The dominant landslide class prediction. b) Areas of high landslides susceptibility (susceptibility >0.5) where the probability of landslide class membership is over 0.5, applying this threshold has been shown to improve overall classification accuracy to over 85% (Table 5).   data. Each point is assigned to the class of with the highest probability, giving the results in Table 4. This means that some points are assigned to a class that they probably do not belong to (e.g. if a point is has a 40% probability of belonging to the collapsing/ overturning class, a 30% probability that it belongs to DGPV and a 30% probability that it belongs to sliding Crash/rollover it will be classified as collapsing/overturning, despite the fact the model shows that there is a 60% probability that this is incorrect). As planners will be most interested in focusing resources in areas that are most likely to be impacted, it is possible to combine the susceptibility and classification maps. Figure 6(b) shows areas of high landslides susceptibility (susceptibility > 0.5) where the probability of landslide class membership is over 0.5, applying this threshold has been shown to improve overall classification accuracy to over 85% ( Table 5).
The relative importance of the conditioning variables for landslide classification is shown in Figure 7. Again, that distance to rivers and rainfall are important predictors, as is aspect, while in this instance TWI is the least powerful predictor.
The spatial distribution of the relative probability of each landslide class is shown in Figure 8. This shows the probability of class membership of each landslide type for every grid cell in the study area. This demonstrates that some landslides, such as collapsing/ overturning and multiple (Figure 8(g,i) respectively) are strongly associated with specific regions, whereas others such as sliding rotational/translational (Figure 8(b)) can be expected in a much greater range of locations. Complex landslides (Figure 8(e)), which are a combination of two or more other types of landslide have the potential to occur in virtually all locations.
To improve the accuracy of the classification, it is possible to impose thresholds so only points above a set probability are classified. Increasing the threshold increases the classification accuracy (Table 5), however, this approach reduces the number of samples that can be classified in the validation dataset (also the amount of the study area that can be mapped by landslide class).

Discussion
The RF LSM procedure described in this paper has been shown to be a highly effective way of predicting and susceptibility and landslide class. Visualising susceptibility and presenting it in a way that can be easily interpreted is a key element of LSM. The rationale is that landslide management strategies have limited resources and should therefore focus on the locations most susceptible to landslides, they also differ depending on the type of landslides which occur (Guzzetti et al., 1999). Identifying areas that are highly susceptible to landslides is the key issue for those planning mitigation strategies. Knowing the type of landslide that is likely to occur in susceptible regions provides extra information that can improve resource allocation. For ease of interpretation, we suggest a masking process when visualizing results. Figure 6(b) shows landslide classes only for the susceptible areas and only where the probability of class membership is above a 0.5% threshold, this has been shown to improve overall classification accuracy to over 85% (Table 5) and will allow decision makers to focus on highly susceptible and accurately classified areas.
Susceptibility mapping by landslide class (Figure 6(b)) and showing the distribution of landslides types in the study area is a novel approach to LSM. It combines the ease of interpretation provided by multiple maps for individual landslide classes with the   potential to be incorporated in a generalized hazard assessment (Guzzetti et al., 1999). This method addresses the problem of combining and visualizing the spatial likelihood of multiple types of landslides across a large, heterogeneous region without the need for distinct hazard evaluations. The RF algorithm allows us to predict an overall susceptibility across a large, heterogeneous region that experiences a wide range of landslide types without the need for multiple susceptibility analyses. The addition of a classification provides more important information which can inform the strategic management of landslide hazards. The overall classification accuracy for the RF LSM ( Figure 4) over 88%, which is favourable compared to other studies using data-mining approaches for LSM which are typically around 70-80% (Kavzoglu et al., 2014). The case study used is also substantially larger (at least an order of magnitude) than many others (e.g. (Arnone et al., 2014;Gomez & Kavzoglu, 2005;Pourghasemi et al., 2013;Vorpahl et al., 2012)) suggesting that RF LSM can be applied to heterogeneous landscapes at a regional or national-scale projects.
Of the predictor variables for susceptibility, distance from rivers and TWI are most important. These findings contradict those of Catani et al. (2013) who found that neither TWI nor proximity to rivers were important predictors of landslides. The difference can probably be attributed to the different landslide inventories and taxonomies, used in different studies. Both TWI and proximity to rivers have been shown to be important predictors of landslide susceptibility in mountainous regions (Devkota et al., 2013). This seems logical as the majority of landslides in the region occur in mountainous areas, which would coincide with low TWI and larger distances from rivers. This underlines the assertion that empirical LSM models are not readily extrapolated to neighbouring regions (Guzzetti et al., 1999). This also highlights a limitation of the data-mining approach, although RF ranks variables in order of importance, there is no way of knowing the process that the variables represent. Nor is there any way of determining a definitive ranking of variables as this will depend on the location of the case study, type of model, resolution of training data, and sampling scheme.
In terms of classification, overall, more than three quarters of the test data is classified accurately, however, there are substantial differences between the accuracy of each class. The accuracy, which is the percentage of the test dataset labelled a certain landslide class in the map is really this class, is all between 70 and 84% accurate. The producer's accuracy, which is the percentage of a landslide class in the test data being classified correctly, varies considerably. Only around 18% of the crass/roll over class and 11% of the fast dripping classes are correctly identified. This may be attributed to the relative lack of training samples for each class. Many of the crash/rollover landslides have been classified as collapsing and overturning or DGPV. This is unsurprising as they have a similar spatial distribution ( Figure 3) and are predicted as occurring in the same areas, associated with high elevation and steep slopes (Figure 8).
onversely, both DGPV and Multiple landslides have over 90% accuracy. Both classes are strongly associated with specific, fairly homogeneous spatial regions and have a relatively high number of samples in the training data, making classification more straightforward. The relative importance of predictors is shown in Figure 7. Distance from rivers and average annual rainfall are shown as important predictors, as is aspect. Aspect may be a strong predictor as it can be used to distinguish between classes associated with low relief, flat areas such as sliding rotational/translational, slow dripping, fast dripping, and other classes found on the slopes. It may seem unusual that TWI has changed from being a strong predictor of susceptibility to a weak predictor of class, however, this may be due to the clear spatial divide between susceptible and nonsusceptible areas in the study area (Figure 4), which can be linked to low and high TWI respectively. Within each area, there are a number of classes of landslide, here TWI may have virtually no discriminatory power and lead to misclassification.

Conclusions and future research
The combination of susceptibility mapping and landslide classification using RF is a novel method which directly addresses the challenges of large-scale LSM in a region that experiences multiple types of landslides. This study had specifically demonstrated that it is possible to use RF to produce highly accurate susceptibility maps for a large heterogeneous region without the need for multiple susceptibility assessments. Moreover, by joining classification and susceptibility predictions, it is possible to use this method to present planners and decision makers with an LSM that both contains a large amount of information and is easy to interpret. This method can be used to target resources on areas that are highly susceptible to specific landslides.
Results suggest that there is significant scope to develop a joint classification-susceptibility model. Rather than a two-stage modeling process, susceptibility is determined using a single RF model. The RF is trained to predict landslide classes, with a further "non-susceptible" class added to the target variables. Research challenges involved in this approach include how best to sample non-landslide areas and the number of non-susceptible samples to use in classification.

Data availability statement
The data referred to in this paper is not publicly available at the current time.