Comparison of four kernel functions used in support vector machines for landslide susceptibility mapping: a case study at Suichuan area (China)

ABSTRACT Suichuan is a mountainous area in Jiangxi Province, Central China, where rainfall-induced landslides occur frequently. The purpose of this study is to assess the landslide susceptibility of this region using the support vector machine (SVM) with four kernel functions: polynomial (PL), radial basis function (RBF), sigmoid (SIG), and linear (LN). A total of 178 landslides were used, of which 125 (70%) were randomly selected for training the landslide susceptibility models, whereas the remaining 53 (30%) were used for model validation. Fifteen landslide conditioning factors were considered: slope-angle, altitude, slope-aspect, topographic wetness index (TWI), sediment transport index (STI), stream power index (SPI), plan curvature, profile curvature, distance to rivers, distance to faults, distance to roads, precipitation, landuse, normalized difference vegetation index (NDVI), and lithology. Using the training dataset, nine landslide susceptibility models for the Suichuan area were constructed with the four kernel functions. To evaluate the performance of these models, the receiver-operating characteristic (ROC) curve and the area under the curve (AUC) were used. Using the training dataset, the AUC values for the SVM-PL models with PL degrees 1–6 are 0.715, 0.801, 0.856, 0.891, 0.919, and 0.953, respectively, and those for the SVM-RBF, SVM-SIG, and SVM-LN models are 0.833, 0.680, and 0.716, respectively. Using the validation dataset, the AUC values for the SVM-PL models with PL degrees 1–6 are 0.738, 0.730, 0.683, 0.648, 0.608, and 0.598, respectively, and those for the SVM-RBF, SVM-SIG, and SVM-LN models are 0.716, 0.741, and 0.740, respectively. Our results suggest that the SVM-RBF model is the most suitable for landslide susceptibility assessment in the study area.


Introduction
In mountainous regions, landslides are among the most costly and damaging natural hazards, causing thousands of deaths and billions of dollars in losses every year (Michel et al. 2014). Landslides result from complicated processes (Feuillet et al. 2014; Perrone et al. 2014), and earthquakes and rainfall are generally considered the two major triggering factors (Ding et al. 2014; West et al. 2014). To reduce the serious consequences of landslides, many scientists have in recent years engaged in landslide susceptibility mapping, and consequently various methods and techniques have been developed (Carey & Petley 2014; Hassaballa et al. 2014; Lissak et al. 2014). These methods and techniques have been combined with Geographic Information Systems (GIS) and remote sensing (RS), which in general makes the production of landslide susceptibility maps easier and more accurate (Klose et al. 2014; Paulín et al. 2014).
Although many models have been proposed for landslide susceptibility mapping, scholars still hold different opinions about model selection. Some of them compare several models within one study area, which may be a good way to assess the advantages and disadvantages of each model (Yalcin et al. 2011; Pourghasemi et al. 2012a; Kavzoglu et al. 2014; Umar et al. 2014; Youssef et al. 2014, 2015). Several models have been used to produce landslide susceptibility maps, including logistic regression (Ercanoglu & Temiz 2011; Akgun 2012; Conoscenti et al. 2014; Kavzoglu et al. 2014), artificial neural network, support vector machine (SVM) (Chen et al. 2016a; Hong et al. 2015, 2016; Li and Kong 2014; Peng et al. 2014), decision tree (Yeon et al. 2010; Pradhan, 2011; Alkhasawneh et al. 2014), evidential belief functions (Althuwaynee et al. 2012), index of entropy (Constantin et al. 2011; Pourghasemi et al. 2012b), weights of evidence (Chen et al. 2016b; Neuhaeuser et al. 2012; Tehrany et al. 2014), analytical hierarchy process (Chen et al. 2016c; Shahabi et al. 2014), and frequency ratio (Pradhan and Lee, 2010; Demir et al. 2013). Among these methods, the SVM model is a relatively new technique in landslide susceptibility mapping and is becoming more and more popular because its procedure is based on soft computing and statistical theory (Yilmaz et al. 2010; Xu et al. 2012).
China is the most populous country in the world. Thousands of years of human activity, a long history of wars, and especially the rapid economic development and population growth of recent decades have increased the use of natural resources and strongly disturbed the natural environment (Miao et al., 2014). In the eastern and central regions of China, the large-scale extraction of groundwater and the massive exploitation of mineral resources (including oil and gas resources) have damaged groundwater resources and changed the geotechnical equilibrium and tectonic stress state. These changes have induced and exacerbated land subsidence, ground fissures, land salinization, swamping, and geological disasters such as collapses, slides, flows, and mine disasters (Dong et al. 2014; Xu & Xu 2014a, 2014b; Xu et al., 2013a, 2013b). In the western region of China, the over-development of land, grasslands, forests, and water resources has raised different problems, including accelerated soil erosion, desertification, collapses, landslides, and mudslides (Yin, 2014). Landslides cause huge economic losses and casualties every year. Therefore, the prevention and control of landslide disasters have special significance for China (Zhuang et al. 2014). In summary, landslide susceptibility mapping is becoming more and more important in landuse planning and government management all over the world (Coe 2012; Moretti et al. 2012).
The aim of this study is to produce landslide susceptibility maps using the SVM model in the Suichuan area of China. The main objective is to compare the results of four kernel functions: polynomial (PL), linear (LN), radial basis function (RBF), and sigmoid (SIG). In addition, for the PL kernel, six degrees from 1 to 6 were applied to assess the accuracy of the kernel function. Finally, nine landslide susceptibility maps were produced using the four kernel functions in the SVM model.

Study area
The Suichuan area is located in the southern section of the Luoxiao Mountains, on the southwest border of Jiangxi Province, China. The study area lies between latitudes 25°28′32″N and 26°42′55″N and longitudes 113°56′51″E and 114°45′45″E. It covers an area of 3,144 km². The Suichuan area extends from the Wanyang Mountains in the southwest towards the northeast and comprises low mountains, hills, and river valley plains. The county has two major rivers; the Shu River is a tributary of the Ganjiang River (http://www.jxyh.gov.cn).
The Suichuan area has a subtropical monsoon climate. The annual precipitation ranges from 1,111.2 mm to 2,241.3 mm, with an average of approximately 1,653 mm. The rainy season lasts from March to September and accounts for 77.6% of the yearly rainfall, according to meteorological data of the Suichuan area (http://www.weather.org.cn). The area is characterized by an average annual temperature of approximately 18.6 °C and an average annual sunshine duration of 1,720.3 hours. For the Suichuan area, no information was available about earthquake-induced landslides or about the high amounts of precipitation that induced landslides. Figure 1 shows the landslide locations and some recent photographs of landslide disasters. The altitude of the area ranges from -44.6 to 1,229.7 m above sea level. Around 33.6% of the study area has a slope gradient of less than 15°, whereas areas with a slope gradient larger than 30° account for 13.6% of the total study area. Areas in the 15°–30° slope category account for 52.8% of the total study area.
The geological structure of the Suichuan area is complex. More than 48 geological groups and units are recognized (Table 1). The main lithological units in the study area are limestone, sandstone, silty slate, and carbonaceous slate (Figure 2).

Landslide inventory map
Preparing a landslide inventory map is an important step in landslide susceptibility assessment, and the map can be constructed using various methods such as field surveys, satellite image interpretation, aerial photographs, and historical records (Pham et al. 2015). In this study, a landslide inventory map with 178 landslide events was established; these landslides were identified from the interpretation of high-resolution satellite images in Google Earth®, historical records, and field surveys.
Our analysis of these landslides shows that the smallest landslide is 12 m², the largest is 45,000 m², and the average is 2,508.5 m². The landslide inventory map consists of 104 rotational slides and 74 translational slides; 84 slides are shallow and 94 are deep. Large landslides (>800 m²) account for around 5.8% of the total number of landslides and have been reported to affect 1,987 people. Around 27.7% of the landslides are medium-sized (200–800 m²) and affected 1,134 people. Small landslides (<200 m²), which affected 985 people, account for 66.5% of the total.

Landslide predisposing factors
The factors that predispose slopes to landslides are very complex, and so far there is no agreement on the full set of underlying causes. However, most studies examine the relationship between landslide occurrence and conditioning factors such as topographical, geological, and climatic conditions. Based on a literature review and an analysis of the characteristics of the landslide inventory map of the Suichuan area, 15 factors were selected as the major factors for producing the landslide susceptibility map of the study area: slope-angle, altitude, slope-aspect, topographic wetness index (TWI), sediment transport index (STI), stream power index (SPI), plan curvature, profile curvature, distance to rivers, distance to faults, distance to roads, precipitation, landuse, normalized difference vegetation index (NDVI), and lithology.

Digital elevation model and derivatives
A digital elevation model (DEM) for the study area with a spatial resolution of 25 × 25 m was generated from topographic maps. The DEM was used to extract the conditioning factors slope-angle, altitude, slope-aspect, TWI, STI, SPI, plan curvature, and profile curvature. Slope-angle is a quantitative description of the degree of ground tilt and a basic landform index; through its influence on gravity, surface runoff, and soil erosion, it affects the occurrence and intensity of erosion. For distributed hydrological and soil erosion models at medium-sized basin and regional scales, the slope-angle of the surface is the most basic model parameter (Pedrazzini et al. 2013; Muceku and Korini 2014). The slope-angle map was prepared from the DEM and reclassified into four categories: (1) 0–5°, (2) 5–15°, (3) 15–30°, and (4) >30° (Figure 3a). Altitude was classified into five categories: <200 m, 200–400 m, 400–600 m, 600–800 m, and >800 m (Figure 3b). The slope-aspect values (Figure 3c) are grouped into nine classes based on the common standard classification: flat (-1), north (337.5°–360° and 0°–22.5°), northeast (22.5°–67.5°), east (67.5°–112.5°), southeast (112.5°–157.5°), south (157.5°–202.5°), southwest (202.5°–247.5°), west (247.5°–292.5°), and northwest (292.5°–337.5°). TWI quantitatively describes the runoff contributing area and also reflects the soil moisture and runoff generation capacity of a watershed. It is defined as
$$\mathrm{TWI} = \ln\left(\frac{a}{\tan\beta}\right) \qquad (1)$$
where $a$ is the cumulative upslope area draining through a point (per unit contour length), and $\beta$ is the slope-angle at the point. It reflects the tendency of water to accumulate at any point in the catchment (in terms of $a$) and the tendency of gravitational forces to move that water downslope (expressed in terms of $\tan\beta$ as an approximate hydraulic gradient) (Moore & Grayson 1991; Poudyal et al. 2010). In the present study, TWI is divided into three classes: <7, 7–11, and >11 (Figure 3d).
STI represents the potential for soil loss resulting from the combined slope properties (Figure 3e). This index is derived from unit stream-power theory and is sometimes used in place of the length-slope factor in the revised universal soil loss equation (RUSLE) for slope lengths less than 100 m and slopes less than 14°. STI depends on two parameters: $A_s$, the upslope contributing area, and $\beta$, the local slope gradient in degrees. In the current study, the STI factor was classified into three categories (<10, 10–30, and >30) and was prepared according to the following equation:
$$\mathrm{STI} = \left(\frac{A_s}{22.13}\right)^{0.6}\left(\frac{\sin\beta}{0.0896}\right)^{1.3} \qquad (2)$$
The SPI measures the erosive power of flowing water based on the assumption that discharge is proportional to the specific catchment area (Moore & Grayson 1991). The SPI depends on two parameters. The SPI (Figure 3f) can be defined as (Moore & Grayson 1991)
$$\mathrm{SPI} = A_s \tan\beta \qquad (3)$$
where $A_s$ is the specific catchment area and $\beta$ is the local slope gradient measured in degrees. In the current study, SPI was reclassified into five categories: <20, 20–40, 40–60, 60–80, and >80. Plan curvature reflects the structure and morphology of the terrain and affects the distribution of soil organic matter; it therefore has important implications for surface process simulation, hydrology, and soil science (Hapke and Green, 2006). Profile curvature measures the rate of change of the slope gradient along the direction of maximum gradient (May et al. 2013). In the current study, plan curvature (Figure 3g) was divided into three categories: concave, flat, and convex. Profile curvature (Figure 3h) was divided into three classes: < -0.001, -0.001 to 0.001, and >0.001.
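The three DEM-derived indices can be illustrated with a short Python sketch. It assumes the commonly used forms TWI = ln(a / tan β), SPI = A_s tan β, and STI = (A_s / 22.13)^0.6 (sin β / 0.0896)^1.3, evaluated per cell from a specific catchment area and a slope in degrees. The function names and the `min_slope_deg` floor are hypothetical choices for this illustration and do not describe the GIS workflow used in the study.

```python
import math


def terrain_indices(specific_catchment_area, slope_deg, min_slope_deg=0.1):
    """Return (TWI, SPI, STI) for a single cell.

    specific_catchment_area: upslope contributing area per unit contour length.
    slope_deg: local slope gradient in degrees.
    min_slope_deg: hypothetical floor that avoids division by zero on flat cells.
    """
    beta = math.radians(max(slope_deg, min_slope_deg))
    twi = math.log(specific_catchment_area / math.tan(beta))
    spi = specific_catchment_area * math.tan(beta)
    sti = (specific_catchment_area / 22.13) ** 0.6 * (math.sin(beta) / 0.0896) ** 1.3
    return twi, spi, sti


def twi_class(twi):
    """Assign a TWI value to the three classes used in the text (<7, 7-11, >11)."""
    if twi < 7:
        return "<7"
    if twi <= 11:
        return "7-11"
    return ">11"
```

For example, `terrain_indices(100.0, 45.0)` returns a TWI close to ln(100), because tan 45° is 1.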

Distance to rivers, distance to faults, and distance to roads
Some authors found that faults could induce zones of weakness (reduced bulk-rock strength) that increase hillslope susceptibility to failure (Klose et al. 2014; Paulín et al. 2014). In addition, extensive landsliding in response to a large outburst flood indicates that lateral river erosion is a key driver of landslide erosion on threshold hillslopes; faults and rivers are therefore key factors causing landslides (Weng et al. 2011; Scheingross et al. 2013). The river network that undercuts slopes was extracted from the topographic map (scale 1:50,000), and the distance to rivers map was constructed by buffering the river lines. The rivers buffer map was classified into five categories: <100 m, 100–300 m, 300–500 m, 500–700 m, and >700 m (Figure 4a). Similarly, the distance to faults map was constructed by buffering the fault lines and classified into five categories: <500 m, 500–1,000 m, 1,000–2,000 m, 2,000–3,000 m, and >3,000 m (Figure 4b). The distance to roads is also an important factor for landslides. Many landslides occur along roads because of uncontrolled rock cuts. Highway and road construction can disturb slopes, increasing the strain behind the slope and leading to the development of tension cracks. In the current study, many landslides were recorded along the roads. The distance to roads map was prepared by buffering the road lines and classified into five categories: <500 m, 500–1,000 m, 1,000–2,000 m, 2,000–3,000 m, and >3,000 m (Figure 4c).

Precipitation
Precipitation is one of the major triggering factors of landslides and has received much attention from scientists (Raia et al. 2013). The precipitation data were extracted from a database of the Jiangxi Province Meteorological Bureau. The mean annual precipitation for the period 1960–2014 at 23 weather stations was used to draw the rainfall map using the Kriging method. The precipitation map was classified into five classes: 697.1–994.4 mm, 994.4–1140.7 mm, 1140.7–1306.5 mm, 1306.5–1545.3 mm, and 1545.3–1940.2 mm (Figure 4d).

Landuse
Landuse and landslides influence each other; for example, unreasonable mining and building may induce landslides (Hadmoko et al. 2010). To evaluate the role of landuse distribution in landslide susceptibility, the maximum likelihood classification method was applied in ENVI software to a Landsat 7 ETM+ satellite image (acquired on 10 December 1999). The maximum likelihood classification of the Suichuan area data yielded high accuracy (Kappa coefficient = 0.924). The landuse map of the study area was divided into six classes (Figure 4e), namely, water, residential area, forest, bare, farmland, and grass. The forest unit represents the maximum percentage (about 58.9%) of the landuse map, whereas the water unit represents the minimum percentage (about 0.02%).

Normalized difference vegetation index
NDVI is defined by
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{VIS}}{\mathrm{NIR} + \mathrm{VIS}} \qquad (4)$$
where NIR is the reflectance of the Earth's surface in the near infrared channel (0.725–1.1 μm) and VIS is the reflectance in the visible portion of the spectrum or the red channel (0.5–0.68 μm) (Tucker & Sellers 1986). The NDVI map of the current study was produced from a Landsat 7 ETM+ image (acquired on 10 December 1999). The NDVI was reclassified into five classes: <0.1, 0.1–0.2, 0.2–0.3, 0.3–0.4, and >0.4 (Figure 4f).
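As an illustration of the NDVI definition and the five-class reclassification, the following Python sketch computes NDVI for one pixel from near-infrared and red reflectance. The function names, the handling of a zero denominator, and the assignment of boundary values to the upper class are assumptions made for this example only.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat pixels with no signal as NDVI = 0
    return (nir - red) / total


def ndvi_class(value):
    """Assign an NDVI value to the five classes listed in the text."""
    bounds = [0.1, 0.2, 0.3, 0.4]
    labels = ["<0.1", "0.1-0.2", "0.2-0.3", "0.3-0.4", ">0.4"]
    for bound, label in zip(bounds, labels):
        if value < bound:
            return label
    return labels[-1]
```

A pixel with NIR reflectance 0.5 and red reflectance 0.1 has an NDVI of 0.4 / 0.6, which falls in the >0.4 class.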

Lithology
It is widely recognized that the degree of erodibility of rocks is the main criterion for distinguishing lithology types. Landslides are heavily influenced by rock properties and their changes, and most scholars have taken lithology as an important factor in landslide susceptibility mapping (Chen et al. 2011). The lithology map of the Suichuan area was obtained from the China Geology Organization (http://gsd.cgs.cn) (Figure 5 and Table 1). The lithological units of the study area consist of ten classes (A, B, C, D, E, F, G, H, I, and J) (Table 1). About 45.8% of the study area falls within class J (Eight village group high group; Eight village group Stone Group), which includes grey, greyish green sandstones, with grey green silty slate, slate and a small amount of carbonaceous slate: grey green striped strip slate with metaclastics, bottom common lenticular limestone (Table 1). In addition, 20.3% of the study area is covered by class G (The waterwheel, Guidong, snow top super unit; The ZuoAnchao estuary, Nanping Hill unit unitunit, large clutch unit; Tang Huchao unit, Fu Fangchao unit car brain unit high delta unit; Fu Fangchao unit Gaoping unit, cat nasal Yin unit), which includes monzonitic granite; granodiorite; tonalite diorite, porphyritic granodiorite, granite, porphyritic two porphyritic moyite; monzonitic granite. Other units constitute about 33.9% of the study area (Figure 5 and Table 1).

Support vector machine
SVM is a supervised machine learning method. When the samples are not linearly separable in the low-dimensional input space, a nonlinear mapping transforms them into a high-dimensional feature space in which they become linearly separable, so that a linear algorithm can be used to analyse the nonlinear characteristics of the samples (Micheletti et al. 2014). The two classes {1, -1} denote landslide pixels and no-landslide pixels. The aim of the SVM classification is to find an optimal separating hyperplane that can distinguish the two classes, i.e. landslides and no landslides {1, -1}, from the set of training data. For the case of linearly separable data, a separating hyperplane can be defined as
$$y_i\,(w \cdot x_i + b) \geq 1 - \xi_i \qquad (5)$$
where $w$ is a coefficient vector that determines the orientation of the hyperplane in the feature space, $b$ is the offset of the hyperplane from the origin, and $\xi_i$ are the positive slack variables (Cortes and Vapnik 1995). The determination of an optimal hyperplane leads to solving the following optimization problem (Equations 6 and 7) using Lagrangian multipliers (Samui 2008):
$$\text{Maximize} \quad \sum_{i=1}^{n} \alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j\,(x_i \cdot x_j) \qquad (6)$$
$$\text{Subject to} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \quad 0 \leq \alpha_i \leq C \qquad (7)$$
where $\alpha_i$ are the Lagrange multipliers, $C$ is the penalty, and the slack variables $\xi_i$ allow for penalized constraint violation. The decision function, which is used for the classification of new data, can then be written as
$$g(x) = \mathrm{sign}\left(\sum_{i=1}^{n} y_i \alpha_i\,(x_i \cdot x) + b\right) \qquad (8)$$
In cases where it is impossible to find the separating hyperplane using the linear kernel function, the original input data may be transferred into a high-dimensional feature space through nonlinear kernel functions. The classification decision function is then written as
$$g(x) = \mathrm{sign}\left(\sum_{i=1}^{n} y_i \alpha_i\,K(x_i, x) + b\right) \qquad (9)$$
where $K$ is the kernel function. In the present study, four types of kernels provided by the SVM classifier were used to perform the landslide susceptibility mapping: radial basis function (RBF), PL, SIG, and linear (LN).
The mathematical representation of each kernel (RBF, PL, SIG, and LN) is listed as follows (Pourghasemi et al. 2013):
$$\text{Radial basis function:} \quad K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right), \quad \gamma > 0 \qquad (10)$$
$$\text{Polynomial:} \quad K(x_i, x_j) = \left(\gamma\, x_i^{T} x_j + r\right)^{d}, \quad \gamma > 0 \qquad (11)$$
$$\text{Sigmoid:} \quad K(x_i, x_j) = \tanh\left(\gamma\, x_i^{T} x_j + r\right) \qquad (12)$$
$$\text{Linear:} \quad K(x_i, x_j) = x_i^{T} x_j \qquad (13)$$
where $K(x_i, x_j)$ is the kernel function; $\gamma$ is the gamma term in the kernel function for all kernel types except linear; $d$ is the PL degree term in the kernel function for the PL kernel; and $r$ is the bias term in the kernel function for the PL and SIG kernels. $\gamma$, $d$, and $r$ are user-controlled parameters, and their correct definition significantly increases the accuracy of the SVM solution.
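The four kernels and the kernel form of the decision function can be sketched in plain Python as follows. This is a minimal illustration using the standard kernel definitions; the default parameter values and the toy support vectors in the usage note are hypothetical, and in practice the multipliers and offset come from a trained SVM rather than being set by hand.

```python
import math


def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, gamma=1.0, r=0.0, d=2):
    return (gamma * dot(u, v) + r) ** d


def rbf_kernel(u, v, gamma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma=1.0, r=0.0):
    return math.tanh(gamma * dot(u, v) + r)


def decision(x, support_vectors, labels, alphas, b, kernel):
    """Classify x as 1 (landslide) or -1 (non-landslide) from a kernel expansion."""
    score = b + sum(
        alpha * y * kernel(sv, x)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )
    return 1 if score >= 0 else -1
```

With two hand-picked support vectors `[[1, 0], [-1, 0]]`, labels `[1, -1]`, multipliers `[0.5, 0.5]`, and `b = 0`, the linear kernel assigns the point `[2, 0]` to class 1 and `[-2, 0]` to class -1.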

Preparation of training and validation datasets
In the present study, the 178 landslide events were randomly split into two parts: 125 landslides (70%) were selected for model construction and the remaining 53 landslides (30%) were used for model validation. These landslides were assigned a value of '1'. Since landslide modeling using SVMs is a binary classification, in which the resulting models classify pixels into two classes, 'landslide' and 'non-landslide', it is necessary to collect non-landslide points (Tien . The non-landslide areas were identified using Google Earth® and the analysis of high-resolution DEMs. Areas that can potentially be classified as non-landslide areas are characterized by gentle morphometric characteristics without any changes. The height difference, the steepness, and the orientation of slopes, as well as the absence of concavities and convexities, are the main criteria for identifying the non-landslide areas. To avoid bias, the same number of non-landslide points was randomly generated from the landslide-free area using GIS tools, and these points were assigned a value of '-1' (Tien . Finally, the values of the 15 landslide conditioning factors were extracted for all the landslide pixels and the non-landslide points to obtain the training and validation datasets.
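The 70/30 split and the balanced sampling of non-landslide points can be sketched as follows. This is only an illustration of the sampling logic under assumed inputs: the identifiers, the fixed random seed, and the function names are hypothetical, and the study itself generated the points with GIS tools.

```python
import random


def split_inventory(landslide_ids, train_fraction=0.7, seed=42):
    """Randomly split landslide identifiers into training and validation parts."""
    rng = random.Random(seed)
    shuffled = list(landslide_ids)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def build_dataset(landslide_ids, candidate_non_landslide_ids, seed=42):
    """Pair each landslide (label 1) with an equal number of non-landslide points (label -1)."""
    rng = random.Random(seed)
    negatives = rng.sample(list(candidate_non_landslide_ids), len(landslide_ids))
    return [(i, 1) for i in landslide_ids] + [(j, -1) for j in negatives]
```

Applied to 178 identifiers, `split_inventory` yields 125 training and 53 validation landslides, matching the counts reported above.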

Landslide susceptibility mapping
In this research, SVM with four types of kernel functions, namely RBF, PL (with six selected degrees, from degree 1 to degree 6), SIG, and linear (LN), was used in a GIS platform for landslide susceptibility mapping. A total of 178 landslides were mapped using field surveys. Fifteen landslide conditioning factors were considered: slope-angle, altitude, slope-aspect, TWI, STI, SPI, plan curvature, profile curvature, distance to rivers, distance to faults, distance to roads, precipitation, landuse, NDVI, and lithology. The spatial relationship between landslide occurrences and conditioning factors obtained using the frequency ratio (FR) model is shown in Table 2. For the slope-angle class 0–5°, the frequency ratio was 0.70, which indicates a low probability of landslide occurrence. For the slope-angle class 5°–15°, the ratio was 1.31, which indicates a high probability of landslide occurrence. The frequency ratio between landslide occurrence and altitude showed that the altitude class 200–400 m had the highest FR value (1.11) and the altitude class 600–800 m had the lowest value (0.69). The frequency ratio for slope-aspect was high for southeast-facing and south-facing slopes (FR values of 1.39 and 1.27, respectively) but low for the flat class (0.00). The frequency ratios for TWI, SPI, and STI were high for the classes 7–11, 40–60, and 10–30, respectively, with FR values of 1.07, 1.28, and 1.06, respectively. In the case of plan curvature, the convex class has a higher FR value (1.04) than the concave and flat classes. In the case of profile curvature, most of the landslides occurred in the class -0.001 to 0.001, with an FR value of 1.54. In addition, it was found that the distance to rivers class <100 m had an FR value of 1.78, the distance to faults class 1,000–2,000 m had a higher FR value of 1.31, and the distance to roads class <500 m had the higher FR value of 1.65.
In the case of precipitation, the class 1545.3–1940.2 mm had the highest FR value (2.62). In the case of landuse, the FR value was high in farmland areas (1.27), whereas water had the lowest FR value (0.00). In the case of NDVI, the class 0.05–0.10 had a high FR (1.48). There were ten groups of lithological units within the study area. The FR between landslide occurrence and lithology suggests that group I (i.e. Z (LechangXia Group)), which includes grey purple feldspar quartz sandstone intercalated with siltstone slate; light grey chert sandwiched phyllite: grey, greyish green sandstones, had the highest value (5.89), whereas group A, with grey sandstone, siltstone, shale, carbonaceous shale and coal seam clamp: grey quartz conglomerate, pebbly sandstone, sandstone; purple red sandstone, had the lowest value (0.00). Finally, the landslide susceptibility maps were produced with the SVM models using the RBF, PL, SIG, and linear (LN) kernels. The landslide susceptibility value (LSPV) ranges from 0 to 1; a higher value indicates a higher susceptibility to landslide occurrence. Figure 6 shows the landslide susceptibility maps for the six PL degrees, with degree 1 to degree 6 shown in panels (a) to (f); the LSPV ranges for the six degrees were 0.0860–0.8652, 0.0905–0.8174, 0.0420–0.8674, 0.1127–0.8390, 0.1613–0.8244, and 0.1699–0.8009, respectively. Figure 7 shows the landslide susceptibility maps obtained using the RBF, SIG, and linear (LN) kernels, for which the LSPV ranges were 0.0698–0.8864, 0.0768–0.7834, and 0.0843–0.8660, respectively.
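The frequency ratio values discussed above can be illustrated with a short Python sketch. It assumes the usual definition of FR for a class, namely the share of landslides falling in the class divided by the share of the study area covered by the class; the function name and the counts in the usage note are hypothetical and are not taken from Table 2.

```python
def frequency_ratio(landslides_per_class, pixels_per_class):
    """Return the frequency ratio of each class of one conditioning factor.

    landslides_per_class: number of landslides observed in each class.
    pixels_per_class: number of pixels (area) belonging to each class.
    """
    total_landslides = sum(landslides_per_class.values())
    total_pixels = sum(pixels_per_class.values())
    ratios = {}
    for name, pixels in pixels_per_class.items():
        landslide_share = landslides_per_class.get(name, 0) / total_landslides
        area_share = pixels / total_pixels
        ratios[name] = landslide_share / area_share if area_share > 0 else 0.0
    return ratios
```

With hypothetical counts of 10, 30, and 0 landslides in classes covering 500, 300, and 200 pixels, the ratios are 0.5, 2.5, and 0.0. A value above 1 means the class contains more landslides than its area share would suggest.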

Validation and comparison
In this study, the receiver-operating characteristic (ROC) curve and the area under the curve (AUC) were used to evaluate and compare the performance and prediction capability of the landslide models. The ROC curve is a graph constructed from the sensitivity and 1 - specificity at different cut-off values. The AUC varies from 0.5 to 1.0, and the model with the higher AUC is considered the better one. In most studies, both the success rate and the prediction rate are used to validate and rank the models, so both are used in the current study. Note that the success rate and the prediction rate here are derived from the ROC curve and differ from those described in Chung and Fabbri (2003).
The success rate results were obtained by estimating the AUC of the susceptibility models using the training dataset, whereas the prediction rate results were derived in the same way using the validation dataset. Figure 8 shows the success rate curves for the six degrees of the PL kernel: degree 6 has the highest AUC (0.953) and degree 1 has the lowest AUC (0.715). Figure 9 shows the success rate curves for the other kernels, SIG, RBF, and linear, for which the AUC values were 0.680, 0.833, and 0.716, respectively. The prediction rates are shown in Figures 10 and 11: the AUC values for the six degrees of PL were 0.738, 0.730, 0.683, 0.648, 0.608, and 0.598, respectively, and those for SIG, RBF, and linear were 0.741, 0.716, and 0.740, respectively.
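The AUC used for both the success rate and the prediction rate can be computed directly from labels and susceptibility scores. The sketch below uses the pairwise ranking formulation, which is equivalent to the area under the ROC curve: the AUC is the fraction of (landslide, non-landslide) pairs in which the landslide sample receives the higher score, with ties counted as half. The function name and the scores in the usage note are hypothetical, and this is not necessarily the procedure used by the authors' software.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from labels (1 or -1) and susceptibility scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y != 1]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute the AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))
```

Evaluating the function on the training samples gives the success rate, and evaluating it on the validation samples gives the prediction rate. For labels `[1, 1, -1, -1]` and scores `[0.9, 0.3, 0.6, 0.1]`, three of the four pairs are ranked correctly, so the AUC is 0.75.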

Discussions and conclusions
A landslide susceptibility map is considered a valuable tool for land use planning and management (Akgun 2012); therefore, these maps should be produced by high-performance models. However, it is still difficult to obtain landslide models with high accuracy because a landslide is a non-linear and complex process that relates to various conditioning factors (Tien . Although no method or technique is the best for all regions, SVMs are considered to be among the most efficient methods and have been shown to outperform conventional methods for susceptibility mapping (Tien Bui et al. 2012; Yao et al. 2008). It is well known that the performance of SVM models is strongly influenced by the kernel function used and its parameters. However, investigations of kernel functions in SVM models for landslide susceptibility modeling are still rare. We fill this gap in the literature by investigating and comparing four kernel functions (RBF, PL, SIG, and LN) used in SVMs with a case study of the Suichuan area, Jiangxi Province (China). For this purpose, a landslide database with 178 landslide locations and 15 conditioning factors was established and then used to build and validate different SVM models. The results show that the performance of landslide models depends strongly on the kernel function used. For the PL function, a total of six degrees were checked; the model with the first-degree PL function has the lowest degree of fit but the highest prediction capability. The findings of this study show that the higher the degree of the PL function, the better the performance of the model on the training data (Figure 8). In contrast to the results for the training dataset, the prediction capability of the model on the validation dataset decreases as the degree of the PL function increases. This indicates that the models with a high degree of the PL function suffer from overfitting. Note that SVM models aim to build hyperplanes that separate pixels into two classes, 'landslide' and 'non-landslide'. With a higher degree of the PL function, more training samples (called support vectors) lie on the hyperplanes, which increases the loss of generality. Consequently, the prediction capability of the models decreases.
Figure 8. Success rate curves for the landslide potential maps by polynomial function (PL): degree 1, degree 2, degree 3, degree 4, degree 5, and degree 6.
For the SVM models with the LN, RBF, and SIG functions, the model with RBF has the highest performance on the training dataset, with AUC = 0.833 (followed by SVM-LN with AUC = 0.716 and SVM-SIG with AUC = 0.680); however, the prediction capability check shows that the SVM-RBF model is slightly (approximately 2%) lower than the SVM-LN model and the SVM-SIG model. The overfitting problem is alleviated for these models because the differences between their AUCs on the training and validation datasets are low. Based on the above analysis, we conclude that the SVM-RBF model is the best for this study. This finding is in agreement with landslide studies such as Tien Bui et al. (2012) and Hong et al. (2016), who stated that SVM models with the RBF function have the highest prediction capability.
In fact, the performance of the SVM-RBF model is influenced by the selection of the C and γ parameter values (see Section 3.1); in this study, these parameters were derived using the grid-search technique. Therefore, the performance of the SVM-RBF model could be enhanced if C and γ were selected using new optimization techniques. Future studies on the application of SVMs for landslide susceptibility mapping should focus on using soft computing optimization techniques to optimize the kernel parameter values. Overall, this study contributes to the body of knowledge on landslide susceptibility by investigating the potential application of SVMs with four kernel functions in a case study of the Suichuan area (China). According to this study, the SVM model with the RBF function is best suited to the data at hand, followed by the SVM model with the second-degree PL, the SVM model with LN, and the SVM model with SIG. In conclusion, the results of this study are useful for land use planning and management in landslide-prone areas.
Figure 10. Prediction rate curves for the landslide potential maps by polynomial function (PL): degree 1, degree 2, degree 3, degree 4, degree 5, and degree 6.

Disclosure statement
No potential conflict of interest was reported by the authors.