Landslide rainfall threshold for landslide warning in Northern Thailand

Abstract Northern Thailand is a hotspot for landslides. Rainfall-triggered landslides in this region have caused much suffering and many fatalities. In this work, a landslide-triggering rainfall threshold for Northern Thailand is proposed based on rainfall data for 48 triggering rainfall events that caused 59 landslides in the study area. To account for different mechanisms of landslide formation, the threshold was partitioned into two parts according to the duration of the rainfall event. A split point of 3 days was chosen, giving 1) a threshold for rainfall events lasting no longer than 3 days and 2) a threshold for rainfall events lasting longer than 3 days. The threshold also required a suitable variable for antecedent rainfall, which was found to be a cumulative rainfall over a 25-day period (CR25) of 140 mm. The thresholds combining cumulative rainfall with rainfall event-duration (CED) were therefore established by incorporating the CR25 of 140 mm into the traditional ED threshold. This is the first attempt to incorporate different mechanisms of landslide formation by dividing the CED threshold into two portions for different rainfall durations. The introduced threshold performs well, particularly in terms of false alarm rate, false alarm ratio, and critical success index, and will be useful for landslide warning systems in the study area.


Introduction
Every year, landslides result in economic and human losses. Understanding, managing, monitoring, and preventing these major natural hazards can mitigate their human and economic impacts. Studies of the many different aspects of landslide hazards have investigated triggering factors and hydrological responses (Chinkulkijniwat, Yubonchit et al. 2016; Yang et al. 2021), biological stability (Indraratna et al. 2006), and landslide hazard assessment (Grozavu and Patriche 2021). Work carried out on landslide risk assessment has made some of the most vital contributions to landslide mitigation measures. Since rainfall is known to be an important factor in landslide events (Iida 2004; Fan et al. 2016), landslide rainfall thresholds are commonly utilized as an important component of landslide early warning systems (Guzzetti et al. 1994; Aleotti 2004; Wieczorek and Glade 2005; Ya'acob et al. 2019; Maturidi et al. 2020; Yang et al. 2020; Rosi et al. 2021). The most common landslide-triggering rainfall thresholds are based on event rainfall parameters, particularly the combination of rainfall intensity and rainfall event duration, known as the ID threshold (Caine 1980; Crosta and Frattini 2001; Ahmad 2003; Aleotti 2004; Guzzetti et al. 2008; Yubonchit et al. 2017). Since the rainfall variables used in the ID threshold are not independent (Gariano et al. 2020), certain studies (Vessia et al. 2014; Gariano et al. 2015; Peruccacci et al. 2017; Gariano et al. 2019; He et al. 2020; Germain et al. 2021; Lee et al. 2021) have preferred to use a threshold that takes into account event rainfall and rainfall duration, known as the ED threshold.
Thailand's Northern Region regularly experiences rainfall-triggered landslides that cause tragedy, injuries and loss of life (Yumuang 2006; Teerarungsigul et al. 2016; Komolvilas et al. 2021). In 2001, 176 people lost their lives in rainfall-triggered landslide events in the area. In 2006, 87 fatalities were recorded, and in 2018, eight people died but 260 casualties were reported. In 2003, Thailand's Environmental Geology Division reported that 6563 villages, in 1084 rural subdistricts, in 54 provinces, mostly in Northern Thailand, were located in landslide hazard zones. According to Segoni et al. (2018), who reviewed the literature on rainfall thresholds for landslide occurrence published during 2008-2016 in journals indexed in the Scopus or ISI Web of Knowledge databases, there was only one report on a landslide rainfall threshold in Thailand in that period (Kanjanakul et al. 2016). The present work determines a landslide rainfall threshold at regional scale for Northern Thailand. The introduced threshold was modified from a landslide rainfall threshold for the Southern Thailand region that combined cumulative rainfall with rainfall event-duration, known as the CED threshold (Salee et al. 2022). The modification was achieved by partitioning the CED threshold into two portions: one for short-duration rainfall events and the other for long-duration rainfall events. In general, short-duration, high-intensity rainfall events trigger shallow landslides, while long-duration, low- to medium-intensity rainfall events cause deep-seated landslides (Caine 1980; Giannecchini et al. 2012, 2015; Zhang et al. 2019). By taking rainfall duration into account in the landslide rainfall threshold, the different mechanisms of landslide formation can be incorporated into the threshold. Contingency tables and sets of skill scores were used to assess the performance of the thresholds.
The threshold introduced in this study will be useful for rainfall-triggered landslide warning in Northern Thailand. Furthermore, this study is the first attempt to incorporate the different mechanisms of landslide formation by dividing the CED threshold into two portions for different durations of rainfall event.

Background of the study area
The Northern Thailand region (Figure 1) consists of nine administrative provinces, namely Chiang Rai, Mae Hong Son, Chiang Mai, Lamphun, Lampang, Phayao, Nan, Phrae, and Uttaradit. The region covers approximately 93,691 km². The landscape of Northern Thailand is dominated by mountain ranges in the western and northeastern parts of the region. These ranges are part of the wider system that covers neighboring Burma and Laos. Broadly defined based on geological composition, there are two mountainous subsystems in the study area. In the western part of the region, mountains run southwards from the Daen Lao Range with the two parallel chains of the Thanon Thong Chai Range, which includes the highest mountain in Thailand, Doi Inthanon (2,565 m above mean sea level). In the northeastern part of the region, parallel ranges extending into northern Laos include the Khun Tan Range, the Phi Pan Nam Range, the Phlueng Range, and the western part of the Luang Prabang Range. A set of strike-slip faults also exists in this region. However, landslides triggered by seismic events are rare in Thailand, and the most recent earthquakes in 2006 and 2014 did not lead to significant landslides (Schmidt-Thomé et al. 2018). The annual average minimum and maximum temperatures are 4 and 40 °C, respectively. The average annual rainfall of 943.2 mm is spread over 122 days on average. Rainfall in this area is under the influence of the southwest monsoon, which starts in May and ends in October. Streams of warm moist air from the Indian Ocean bring abundant rain to the region, especially to the windward side of mountain ranges. However, the southwest monsoon is not the only source of precipitation during this period. The Inter Tropical Convergence Zone and tropical cyclones can also bring large amounts of rain. Based on the available records, all major landslides in this area have been triggered by heavy rainfall caused by tropical cyclones.
Landslides in Phrae and Phetchabun provinces in 2001 were triggered by continuous heavy rain that fell during Typhoon Usagi. Several landslides in Uttaradit, Sukhothai, Phrae, Lampang and Nan provinces in 2006 were caused by continuous heavy rainfall in the wake of Typhoon Xangsane. More recently, in 2018, landslides at Huay Khab village in Nan province followed ten days of continuous rainfall caused by Typhoon Son-Tinh.

Data collection and rainfall characterization
This study considered 59 landslide events recorded in Northern Thailand during the years 2002 to 2018. Data were collected mainly from scientific papers published by the Department of Mineral Resources, Ministry of Natural Resources and Environment, and partly from local newspapers. For an event to be taken into consideration, the available information had to convey at least the following details: (1) the date of the landslide, (2) the location of the landslide, and (3) the consequential damage. Triggering and non-triggering rainfall data from the years when these landslide events occurred were gathered from Thai Meteorological Department (TMD) rain gauge stations. These rain gauge stations were located in the catchment areas where the considered landslides occurred. The locations of landslide events and TMD rain gauge stations in the study area are indicated in Figure 1. To estimate rainfall at landslide locations, rainfall data from TMD rain gauge stations were processed using inverse distance weighting (IDW). Based on inverse functions of distance, IDW assigns a larger weight to a station closer to a landslide location than to a station further away. Although IDW is a deterministic model, it has been considered a reliable method of spatial interpolation in applications such as point spread function (Gentile et al. 2013), and baseflow measurement and baseflow index calculation (Ditthakit et al. 2021). IDW has also been successfully applied to the interpolation of rainfall data in various locations by Kong and Tong (2008), Kurtzman et al. (2009), Chen et al. (2010), and Yang et al. (2015), among others.
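The IDW estimate described above can be illustrated with a minimal sketch. The distance exponent (`power=2.0`), the planar coordinates, and the station values are assumptions made for illustration only; the study does not state the exponent or the coordinate system it used.

```python
import math

def idw_rainfall(target, stations, power=2.0):
    """Estimate rainfall at `target` (x, y) from (x, y, rain_mm) station tuples.

    `power` is the distance exponent; 2.0 is a common default and is an
    assumption here, not a value reported in the study.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, rain_mm in stations:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return rain_mm  # the target coincides with a gauge
        weight = 1.0 / dist ** power  # closer stations get larger weights
        weighted_sum += weight * rain_mm
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical gauges: (x_km, y_km, daily rainfall in mm)
stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 30.0)]
estimate = idw_rainfall((2.0, 0.0), stations)
```

Because the target in this example lies much closer to the first gauge, the estimate stays near that gauge's 10 mm reading rather than the simple average of 20 mm.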
In order to characterize rainfall in this region, criteria must be identified that distinguish two consecutive rainfalls. The inter-event criterion (IEC) used in this study to separate two consecutive rainfalls is shown in Figure 2. In Figure 2, the inter-event criterion IEC A,B is a combination of the rainfall intensity threshold A and the duration B. Two rainfalls were treated as separate events only if the combined criterion was satisfied: if rainfall intensity was no greater than A mm/day for at least B consecutive days, two consecutive rainfall events were considered to have occurred. Conversely, if the rainfall intensity and duration between two rainfalls did not meet the IEC A,B, these two rainfalls were considered as one continuous rainfall. The determination of a suitable IEC was crucial to establishing a suitable landslide-triggering rainfall threshold. An IEC that is easy to meet might split a continuous rainfall into separate events, and an IEC that is hard to meet might produce an overly long rainfall duration that merges independent rainfall events.
In this study, the suitable IEC was identified using all rainfall data (both triggering and non-triggering rainfall events) from the years in which landslide events occurred in the study area. Since the suitable IEC can vary depending on seasonal and climatic conditions, and there is a clear distinction between two consecutive rainfalls in the pre-monsoon period, the determination of the suitable IEC in this study was based on a dataset that excluded inter-event rainfall in the pre-monsoon period. Twelve sets of variables A and B were examined to identify the suitable IEC A,B. For each IEC A,B, the inter-event times between all consecutive rainfalls were extracted and used to calculate the mean and the standard deviation of the inter-event times. Based on the assumption that inter-event times follow an exponential distribution, for which the mean equals the standard deviation (Bonta and Rao 1988), the suitable IEC was identified as the one giving a coefficient of variation (CV) of inter-event times equal to 1.0. The CV, defined as the ratio of the standard deviation to the mean, is presented in Table 1. As shown in Table 1, the IEC that returned the CV closest to 1.0 was the IEC 5,1, which stands for the condition that rainfall intensity was no greater than 5 mm/day for at least 1 day.
Based on rainfall events defined by IEC 5,1, frequency distribution tables were produced of rainfall duration in days (Table 2) and rainfall event in mm (Table 3) for rainfall events from the years in which landslide events occurred in the study area. Eighty-four percent of the collected rainfalls lasted no longer than 3 days. With regard to rainfall depth, eighty-three percent of the collected rainfalls had a depth no greater than 50 mm. Figure 3 presents average monthly rainfall in mm (blue line) calculated from rainfall data in this study compared with the 30-year average monthly rainfall from years 1981-2010 (gray column) and the monthly rainfall of the recent year 2021 (green column) sourced from Thai Meteorological Department (2022). The rainfall data gathered in this study produced a similar distribution to the results from gauge readings throughout Northern Thailand. Since the rainfall data in this study were collected from the years when the landslide events occurred and from selected rain gauge stations located in the same catchments as the considered landslides, the average monthly rainfall from the data in this study was, as expected, higher than the 30-year average and the recent year.
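The separation rule can be sketched as follows. This is one possible reading of the IEC, assuming a daily rainfall series and treating a run of at least B consecutive days at or below A mm/day as the gap between events; the function name, defaults, and the sample series are hypothetical.

```python
def split_events(daily_mm, a_mm=5.0, b_days=1):
    """Split a daily rainfall series into events under an IEC(A, B) rule.

    Returns (start, end, total_mm) tuples with inclusive day indices.
    A run of at least `b_days` consecutive days with rainfall no greater
    than `a_mm` closes the current event; shorter low-rain runs are
    absorbed into the event.
    """
    events = []
    start = None     # first day of the current event
    last_wet = None  # most recent day with rainfall above a_mm
    low_run = 0      # consecutive days at or below a_mm
    for day, rain in enumerate(daily_mm):
        if rain > a_mm:
            if start is None:
                start = day
            last_wet = day
            low_run = 0
        elif start is not None:
            low_run += 1
            if low_run >= b_days:
                events.append((start, last_wet, sum(daily_mm[start:last_wet + 1])))
                start = None
                low_run = 0
    if start is not None:  # series ended while an event was still open
        events.append((start, last_wet, sum(daily_mm[start:last_wet + 1])))
    return events

# Hypothetical daily rainfall in mm
daily = [0, 12, 30, 2, 8, 0, 0, 40]
events_5_1 = split_events(daily, a_mm=5.0, b_days=1)
events_5_2 = split_events(daily, a_mm=5.0, b_days=2)
```

With B = 1 the 2 mm day splits the first four days into two events, whereas with B = 2 the same days merge into one longer event, which illustrates why the choice of IEC affects event duration and depth.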
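The selection of the IEC by its coefficient of variation can be sketched as below. The inter-event times are invented for illustration, and the use of the sample standard deviation is an assumption, since the study does not say which estimator was used.

```python
import statistics

def coefficient_of_variation(inter_event_days):
    """CV = standard deviation / mean of the inter-event times."""
    mean = statistics.mean(inter_event_days)
    return statistics.stdev(inter_event_days) / mean  # sample std (assumed)

def pick_iec(candidates):
    """Return the (A, B) key whose inter-event times give a CV closest to 1.0.

    `candidates` maps (A, B) to the list of inter-event times (days)
    obtained when events are separated with that criterion.
    """
    return min(
        candidates,
        key=lambda key: abs(coefficient_of_variation(candidates[key]) - 1.0),
    )

# Hypothetical inter-event times for two candidate criteria
candidates = {
    (5, 1): [1, 1, 2, 3, 8],
    (10, 2): [4, 5, 4, 5, 4],
}
best_iec = pick_iec(candidates)
cv_best = coefficient_of_variation(candidates[best_iec])
```

In this toy example the first candidate has widely spread gaps (CV near 1), while the second has nearly constant gaps (CV near 0.12), so the first is selected.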

Measures of evaluation
To evaluate the performance of the thresholds established in this study, we used measures derived from the contingency table, which comprises the numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The hit rate (HR) in Eq. 1 indicates the proportion of correctly predicted landslide-triggering rainfall events among all triggering rainfall events. The HR ranges from zero (0) at the poor end to one (1) at the good end. The false alarm rate (FAR) in Eq. 2 measures the number of false alarms per total number of non-triggering rainfalls. The false alarm ratio (FA) in Eq. 3 measures the fraction of forecasted events that did not occur. The FAR and FA range from zero (0) at the good end to one (1) at the poor end. The Hanssen-Kuipers skill score (HK) in Eq. 4 represents the hit rate relative to the false alarm rate and remains positive while the hit rate is higher than the false alarm rate. The best possible HK score is 1, which is returned when the HR is 1 and the FAR is 0. An HK of 0, which is returned when HR = FAR, indicates no skill. The critical success index (CSI) in Eq. 5 combines HR and FA into one score for low-frequency events. This score measures the fraction of observed and/or forecast events that were correctly predicted. It ranges from zero (0) at the poor end to one (1) at the good end.
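Eqs. 1 to 5 are not reproduced in this text. The sketch below uses the standard contingency-table definitions that match the verbal descriptions above; the counts passed in are hypothetical and are not taken from the study's tables.

```python
def skill_scores(tp, fn, fp, tn):
    """Skill scores from contingency counts (standard definitions, assumed
    to correspond to Eqs. 1-5).

    Raises ZeroDivisionError if a denominator is zero, for example when
    the threshold issues no alarms at all (tp + fp == 0).
    """
    hr = tp / (tp + fn)        # hit rate: hits over all triggering events
    far = fp / (fp + tn)       # false alarm rate: false alarms over non-triggering events
    fa = fp / (tp + fp)        # false alarm ratio: false alarms over all alarms
    hk = hr - far              # Hanssen-Kuipers skill score
    csi = tp / (tp + fn + fp)  # critical success index
    return {"HR": hr, "FAR": far, "FA": fa, "HK": hk, "CSI": csi}

# Hypothetical counts: 48 triggering and 952 non-triggering rainfall events
scores = skill_scores(tp=40, fn=8, fp=60, tn=892)
```

The example shows how FAR can look small (about 0.06) while FA is large (0.6): non-triggering events vastly outnumber triggering ones, so a modest share of them still produces more false alarms than hits.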
Other than the aforementioned scores, the receiver operating characteristic (ROC) curves of HR against FAR were plotted at various probabilistic levels of landslide threshold and the corresponding areas under the ROC curves (AUC) were calculated to determine predictability. Furthermore, for each probabilistic level, the Euclidean distance, d, was calculated between the point corresponding to the threshold on the ROC curve and the ideal coordinate (0,1).
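A minimal sketch of the two ROC-based measures follows. It assumes the AUC is obtained by trapezoidal integration over the (FAR, HR) points of the probabilistic levels, with the corner points (0, 0) and (1, 1) added; the study does not describe its integration method, and the sample points are hypothetical.

```python
import math

def roc_auc(points):
    """Trapezoidal area under a ROC curve given (FAR, HR) points.

    The corner points (0, 0) and (1, 1) are added so that the curve spans
    the full FAR range (an assumption of this sketch).
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(pts, pts[1:])
    )

def distance_to_ideal(far, hr):
    """Euclidean distance d between (FAR, HR) and the ideal point (0, 1)."""
    return math.hypot(far, 1.0 - hr)

# Hypothetical (FAR, HR) pairs for two probabilistic levels
roc_points = [(0.1, 0.8), (0.3, 0.95)]
auc = roc_auc(roc_points)
d = distance_to_ideal(0.1, 0.8)
```

A smaller d means the threshold sits closer to perfect prediction, so d complements the HK score when choosing among probabilistic levels.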

The event rainfall-duration thresholds
Based on rainfall events defined by IEC 5,1, rainfall event (E) and rainfall duration (D) data points of non-triggering and triggering rainfalls were plotted on a double logarithmic scale in Figure 4a. The threshold was established from the rainfall event (E) and rainfall duration (D) of landslide-triggering rainfall events in Northern Thailand. Quantile regression (Koenker and Bassett 1978) was employed to generate sets of rainfall thresholds at various probabilistic levels using Eq. 6.
where a and b are regression coefficients. Using the above relationship, the ED threshold plots as a straight line on a double logarithmic scale.
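Eq. 6 is not reproduced in this text. A two-coefficient relation that plots as a straight line on a double logarithmic scale is the power law E = a·D^b, and the sketch below assumes that form, with a as the scale and b as the exponent. It fits a linear quantile regression in log space by exhaustive search over lines through pairs of data points, relying on the property that an optimal linear quantile fit can be found among lines passing through two observations. This brute-force search is only practical for small datasets and is not the solver used in the study; all names and data are hypothetical.

```python
import math
from itertools import combinations

def pinball_loss(residuals, tau):
    """Quantile (check) loss summed over residuals y - y_hat."""
    return sum(tau * r if r >= 0 else (tau - 1.0) * r for r in residuals)

def fit_ed_threshold(durations, rainfalls, tau):
    """Fit E = a * D**b at quantile level tau (0 < tau < 1); returns (a, b)."""
    xs = [math.log10(d) for d in durations]
    ys = [math.log10(e) for e in rainfalls]
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line is not a valid candidate
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        loss = pinball_loss(
            [y - (intercept + slope * x) for x, y in zip(xs, ys)], tau
        )
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two distinct durations")
    _, intercept, slope = best
    return 10.0 ** intercept, slope

# Hypothetical triggering events: duration (days) and event rainfall (mm)
durations = [1, 1, 4, 4]
rainfalls = [10.0, 40.0, 20.0, 80.0]
a_low, b_low = fit_ed_threshold(durations, rainfalls, tau=0.1)
a_high, b_high = fit_ed_threshold(durations, rainfalls, tau=0.9)
```

In this toy dataset the points lie on two parallel power laws, so the 10% fit follows the lower one (a near 10) and the 90% fit follows the upper one (a near 40), both with b near 0.5.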
To account for short- and long-duration rainfall events, the rainfall events were divided into two groups: short- and long-duration rainfall events. However, owing to the wide variety of hydrogeological conditions, the split point between short- and long-duration rainfall thresholds reported in the literature ranges from several hours to a few days. He et al. (2020) divided rainfalls into short- and long-duration groups using 48 hours as a split point to establish a landslide rainfall threshold in China. Wicki et al. (2020) used a rainfall duration of 6 hours to classify rainfalls as short- or long-duration. Chen and Chen (2022) characterized rainfalls that triggered landslides in Taiwan into three types: high rainfall intensity over a short duration (<12 h), high-intensity and prolonged rainfall, and high cumulative rainfall over a long duration (>36 h). Based on the distribution of rainfall duration presented in Table 2, most of the rainfall events (almost 70%) lasted no longer than 2 days and few rainfall events (less than 10%) lasted longer than 4 days. Therefore, the split point between short- and long-duration rainfall thresholds could lie within 2-4 days. In order to define a suitable split point, three sets of ED thresholds with split points at 2 days (Figure 4b), 3 days (Figure 4c) and 4 days (Figure 4d) were established and assessed. Figure 5 presents the ROC curves and the corresponding AUC of the three ED thresholds split at 2, 3, and 4 days. The performance was fair (AUC less than 0.76) for the ED threshold with the split point at 2 days. For the split points at 3 and 4 days, the ED thresholds yielded good predictability, with AUC values greater than 0.80. Among the three ED thresholds, the one that used 3 days as the split point between short- and long-duration rainfalls exhibited the highest AUC.
Hence, the split point at 3 days was chosen as the separator for establishing the ED threshold for short-duration rainfall events (ED S threshold) and the ED threshold for long-duration rainfall events (ED L threshold). Threshold parameters a and b for exceedance probabilities from 5 to 90% are reported in Table 4. Table 5 summarizes the four contingency scores and the six skill scores at ten probabilistic levels from 5 to 90% for the ED threshold partitioned into short- and long-duration rainfall thresholds at a duration of 3 days. At low probabilistic levels, the threshold yielded a very high FA value (i.e. FA = 0.96 at the probabilistic level of 5%). A threshold with a high FA causes operators to lose trust in its reliability. Furthermore, the CSI value generated by the threshold was much lower than 0.50 at every probabilistic level, suggesting that the forecast had little or no skill. Hence, we concluded that the ED threshold is not practically useful in the study area.
Figure 4. (a) Double logarithmic scatter plot of rainfall event versus rainfall duration for non-triggering and triggering rainfalls. The rainfall event-duration (ED) threshold was determined at various probability levels using quantile regression. (b) The ED threshold divided into a short-duration rainfall threshold and a long-duration rainfall threshold using a split point at 2 days. (c) The ED threshold using a split point at 3 days. (d) The ED threshold using a split point at 4 days.
Table 4. Threshold parameters a and b (see Eq. 6) for exceedance probabilities from 5 to 90%. The threshold was partitioned into two parts, the ED S (for short-duration rainfall events) and ED L (for long-duration rainfall events) thresholds, using a split point at 3 days.

The cumulative rainfall with rainfall event-duration threshold
To account for antecedent rainfall in a simple way, cumulative rainfall over a certain period prior to the failure day was integrated into the established threshold. Figure 6a presents the rainfall on a failure day (DRf) against cumulative rainfall over 3-, 5-, 10-, 15-, 20-, and 25-day periods prior to the failure day. The plots were divided into two portions by a 1:1 line to show whether the scatter was biased towards the rainfall on the failure day or towards the cumulative rainfall prior to the failure day. Of the 48 triggering rainfall events, 34 were biased toward the cumulative rainfall over the 3-day period prior to the failure day. The number of events biased toward cumulative rainfall increased to 35, 40, 42, 42, and 46 when the period of cumulative rainfall increased to 5, 10, 15, 20, and 25 days, respectively. Therefore, cumulative rainfall over the 25-day period prior to the failure day (CR25) was considered a suitable threshold variable for landslides in the study area. Figure 6b presents a scatter plot of data points representing DRf and CR25 for the 48 triggering rainfall events. The scatter plot revealed that the highest value of CR25 that returned only a few triggering rainfall events was a CR25 of 149 mm. Hence, the CR25 value of 140 mm was an indicator that could potentially be used as a landslide-triggering threshold in the study area.
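The comparison against the 1:1 line can be sketched as follows. The sketch assumes that the cumulative window covers the days strictly before the failure day and that an event counts as biased toward antecedent rainfall when the cumulative value exceeds the failure-day rainfall; the daily series is hypothetical.

```python
def antecedent_rainfall(daily_mm, failure_day, window_days):
    """Cumulative rainfall over the `window_days` days before `failure_day`.

    The failure day itself is excluded (an assumption of this sketch), and
    the window is truncated if the record starts later than the window.
    """
    start = max(0, failure_day - window_days)
    return sum(daily_mm[start:failure_day])

def count_antecedent_dominated(records, window_days):
    """Count events whose antecedent rainfall exceeds the failure-day rainfall.

    `records` is a list of (daily_mm, failure_day_index) pairs; such events
    fall on the cumulative-rainfall side of the 1:1 line.
    """
    return sum(
        1
        for daily_mm, failure_day in records
        if antecedent_rainfall(daily_mm, failure_day, window_days) > daily_mm[failure_day]
    )

# Hypothetical 25-day record ending on the failure day (index 24, 50 mm)
daily = [20] * 5 + [0] * 15 + [10, 0, 30, 5, 50]
records = [(daily, 24)]
cr3 = antecedent_rainfall(daily, 24, 3)
cr25 = antecedent_rainfall(daily, 24, 25)
```

Here the 3-day window (35 mm) stays below the failure-day rainfall while the 25-day window (145 mm) exceeds it, which mirrors how longer windows move more events to the cumulative-rainfall side of the 1:1 line.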
The CR25 of 140 mm was introduced into the ED threshold presented in Figure 4c to establish a CED threshold partitioned by rainfall duration (Figure 7). This threshold comprised a threshold for rainfall events lasting no longer than 3 days and a threshold for rainfall events lasting longer than 3 days. Table 6 presents the four contingency scores and the six skill scores for ten probabilistic levels from 5 to 90% calculated for the CED threshold. Introducing the CR25 of 140 mm into the threshold resulted in notably fewer FP cases, and hence the FAR for the CED threshold was considerably lower than the FAR for the ED threshold. The reduction of FAR substantially improved the overall reliability of the threshold, as indicated by the ROC curve and the corresponding AUC of the CED threshold (Figure 5). The reliability of prediction with the CED threshold was very good (AUC = 0.96). The FA of the CED threshold (FA = 0.44) was also considerably lower than the FA yielded by the ED threshold (FA = 0.96). The best compromise between the minimum number of incorrect landslide predictions (FP, FN) and the maximum number of correct predictions (TP, TN) was indicated by the combination of the largest value of HK and the smallest value of d. The best compromise was obtained at the probabilistic level of 5% for the CED threshold (HK = 0.92 and d = 0.07). The CED thresholds proposed in this study can be employed as shown in Figure 8.
Figure 6. (a) Rainfall on a failure day in mm plotted against cumulative rainfall over 3-, 5-, 10-, 15-, 20-, and 25-day periods prior to the failure day. (b) Rainfall on a failure day in mm plotted against cumulative rainfall over the 25-day period before the failure day.
Figure 7. The CED threshold plotted in three-dimensional space. The threshold was partitioned at 3 days into the CED threshold for rainfall events lasting no longer than 3 days and the CED threshold for rainfall events lasting longer than 3 days.
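One way the partitioned CED threshold might be applied in a warning routine is sketched below. It assumes a power-law ED threshold E = a·D^b, that a warning requires both CR25 of at least 140 mm and event rainfall at or above the ED threshold for the relevant duration class, and it uses placeholder (a, b) values; the actual parameters are those of Table 4, which is not reproduced here, and the actual procedure is the one in Figure 8.

```python
# Placeholder (a, b) pairs for illustration only; NOT the values in Table 4.
SHORT_PARAMS = (20.0, 0.5)  # hypothetical threshold for events of 3 days or less
LONG_PARAMS = (15.0, 0.8)   # hypothetical threshold for events longer than 3 days

def ced_alert(event_mm, duration_days, cr25_mm,
              short_params=SHORT_PARAMS, long_params=LONG_PARAMS,
              split_days=3, cr25_min_mm=140.0):
    """Return True when both CED conditions are met (assumed AND logic).

    1) antecedent condition: 25-day cumulative rainfall >= cr25_min_mm
    2) event condition: event rainfall >= a * D**b, with (a, b) chosen by
       whether the event duration exceeds the split point
    """
    a, b = short_params if duration_days <= split_days else long_params
    exceeds_ed = event_mm >= a * duration_days ** b
    return cr25_mm >= cr25_min_mm and exceeds_ed

wet_short = ced_alert(event_mm=60.0, duration_days=2, cr25_mm=150.0)
dry_short = ced_alert(event_mm=60.0, duration_days=2, cr25_mm=100.0)
wet_long = ced_alert(event_mm=40.0, duration_days=5, cr25_mm=150.0)
```

The second call shows the intended effect of the antecedent condition: the same event that exceeds the ED threshold raises no alarm when CR25 is below 140 mm, which is how the CED threshold reduces false positives relative to the ED threshold alone.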

Conclusion
Rainfall data corresponding to 59 landslides recorded in Northern Thailand during the years 2002 to 2018 were used to establish a landslide rainfall threshold for the study area. Based on the coefficient of variation (CV) of inter-event times, a suitable inter-event criterion (IEC) to separate two consecutive rainfalls was IEC 5,1, which stands for the condition that rainfall intensity was no greater than 5 mm/day for at least 1 day. The threshold introduced in this study explicitly included both rainfall event and antecedent rainfall parameters, namely the cumulative rainfall with rainfall event-duration (CED) threshold. A value of 140 mm for the cumulative rainfall over the 25-day period prior to a failure day (CR25) was found to be a suitable indicator of antecedent rainfall. Based on the threshold predictability and the distribution of rainfall duration, a period of 3 days was chosen to separate short- and long-duration rainfall events. The introduced threshold therefore comprised the CED threshold for rainfall events lasting no longer than 3 days and the CED threshold for rainfall events lasting longer than 3 days. Introducing the CR25 of 140 mm into the threshold resulted in notably fewer false positive (FP) cases, and hence the false alarm rate (FAR) for the CED threshold was considerably lower than the FAR for the rainfall event-duration (ED) threshold. The reduction of FAR substantially improved the overall reliability of the threshold, as indicated by the area under the receiver operating characteristic curve. Furthermore, the false alarm ratio (FA) of the CED threshold was considerably lower than the FA yielded by the ED threshold.
Since the CED threshold at the probabilistic level of 5% returned the largest Hanssen-Kuipers (HK) score and the smallest value of d, this threshold can be recommended as the landslide rainfall threshold for Northern Thailand.

List of abbreviations and symbols
AUC: area under receiver operating characteristic curve
CED threshold: cumulative rainfall with rainfall event-duration threshold
CR25: cumulative rainfall over 25-day period before a failure day
CSI: critical success index
CV: coefficient of variation
E: rainfall event in mm
ED threshold: rainfall event-duration threshold
ED L threshold: rainfall event-duration threshold for long duration rainfalls
ED S threshold: rainfall event-duration threshold for short duration rainfalls

Availability of data and material
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.