Rainfall thresholds of debris flows based on varying rainfall intensity types in the mountain areas of Beijing

Abstract Three strong storms that occurred on July 21, 2012, July 20, 2016, and July 16, 2018 caused severe flooding and debris flows in the mountain areas of Beijing. The detailed records of these storms and the actual occurrence time of the debris flows provide an opportunity for evaluating the thresholds derived from different types of rainfall intensities. Herein, a new rainfall threshold of debris flows is derived from the real-time rainfall intensities in Beijing. In addition, thresholds are estimated based on the average rainfall intensities over the interval from the beginning of the rainfall to the debris flow occurrence and over the entire rainfall duration. The results show that the various types of rainfall thresholds differ significantly in their ability to separate the storms with positive debris-flow response from those with negative debris-flow response, and that the real-time rainfall threshold exhibits the best performance. Moreover, our data indicate that the debris flows in Beijing are triggered by the combined effect of rainfall intensity and cumulative precipitation. A debris flow is initiated only when the cumulative precipitation and rainfall intensity simultaneously reach a threshold level.


Introduction
Debris flow is a major geohazard in the mountain areas of Beijing and has caused immense loss of lives and property in the past. In 1888, debris flows destroyed 39 villages and caused a significant loss of human lives and assets. Most importantly, the geohazard survey shows that more than 900 sites, including many villages and tourism sites, are located in regions with high debris-flow risk. Therefore, early and accurate forecasting of debris flows is of special significance for geohazard prevention and mitigation in Beijing. An intensive storm is well known to be the trigger of debris flows, though numerous factors, such as geological and geomorphological characteristics, compositions of sediments, and vegetation coverage rates, potentially impact debris flows (Iverson 1997b; David-Novak et al. 2004; Guzzetti et al. 2007; Kean et al. 2011; Guzzetti et al. 2020). Thus, capturing the rainfall pattern for debris flow initiation is of special significance in debris flow forecasting (Guzzetti et al. 2008; Baum and Godt 2010).
Empirical thresholds are the most widespread method used to predict debris flow occurrences, though mathematical and numerical models and early warning systems have also been employed in previous studies. Empirical thresholds are often derived from an assumed power relationship between rainfall intensities or total rainfall and durations of the past storms that have initiated debris flows (Caine 1980; Peruccacci et al. 2017), and some studies combine antecedent precipitation with short-term intensities (Smolíková et al. 2016, 2021). This method is more suitable for the development of rainfall thresholds at a regional scale provided a sufficient amount of information is available (Berti et al. 2012). In contrast, the models and warning systems try to predict debris flows by establishing physical links between the rainfall patterns and debris flows in various ways, such as infinite-slope stability analysis (Papa et al. 2013), coupling of infinite-slope stability with hydrology (Berti and Simoni 2005; Berti et al. 2020), modeling initiating processes of both catchment outflows and debris flows via channel runoff, hydrological models (Bernard and Gregoretti 2021), integrating geological and geomorphological conditions of the studied basins with the radar storm tracking method (Tiranti et al. 2014), or Bayesian networks accounting for both technical failures and inherent system abilities (Sättele et al. 2015). These methods need detailed information on the geological, hydrological, morphological, and soil characteristics that are associated with debris flow initiations. However, as pointed out by Guzzetti et al. (2007), this information is often difficult to collect precisely over large areas and is rarely available outside specifically equipped test fields, limiting the widespread application of these models and warning systems.
The efficiency of an empirical rainfall threshold depends on the accuracy of the past rainfall and debris flow data (Guzzetti et al. 2007; Peruccacci et al. 2017; Melillo et al. 2018; Marra 2019; Gariano et al. 2020). Even a small uncertainty (1%) can significantly decrease the performance of threshold-based predictive models (Gariano et al. 2015). There might be a large uncertainty in the data used to estimate empirical thresholds (Peres et al. 2018), especially regarding the detailed processes of rainfall and the exact time of debris flow occurrences (Guzzetti et al. 2008). The data used to derive empirical thresholds, such as historical records of rainfall, short-term rain gauges, and radar-derived rainfall estimates (Guzzetti et al. 2007; Bernard and Gregoretti 2021), particularly the historical records, often lack information on the occurrence time of debris flows. As a result, the actual rainfall intensities initiating debris flows cannot be accurately constrained. Instead, various types of rainfall intensities over different temporal resolutions, such as average and peak intensities, have been used to estimate rainfall thresholds in previous studies (Guzzetti et al. 2008). Most previous studies used average intensities, and more than 40% of them employed daily or even coarser resolution data (Segoni et al. 2018). In addition, some studies assumed that debris flow occurrences coincide with periods of intense rainfall during a rainstorm (McCoy et al. 2010; Kean et al. 2011; Tang et al. 2011; Parise and Cannon 2012), and suggested the peak or maximum rainfall intensity as the trigger of debris flows (Abraham et al. 2020). However, recent observations, including the records from Beijing (Wu 2001; Ma et al. 2018), showed that debris flow occurrences did not always correspond to the maximum rainfall intensity.
Similarly, some studies demonstrated that using the peak rainfall intensity instead of the actual triggering intensity might cause an overestimation of the rainfall threshold and thus a high rate of missed alarms (Staley et al. 2013). Conversely, an underestimated threshold results in a high rate of false alarms. Therefore, research on the real-time rainfall intensities that coincide with debris flow occurrences is of special significance for accurately estimating rainfall thresholds.
The detailed records of three strong storms and the debris flows they triggered on July 21, 2012, July 20, 2016, and July 16, 2018 in Beijing provide the first opportunity for evaluating the efficiencies of the thresholds derived from rainfall intensities of different typologies. This paper systematically analyzes the evolution of the three storms, particularly the changes in rainfall intensity and precipitation, and their association with debris flow initiation. The main aims of this study are as follows: 1) establishing the first objective rainfall threshold based on real-time intensity ($I_i$) in the mountain areas of Beijing; 2) evaluating the efficiencies of the thresholds derived from different types of rainfall intensities; 3) investigating the dynamics of debris flows in the mountain region of Beijing by analyzing the rainfall patterns that initiated debris flows.

Study area
The debris flows in the mountain areas of Beijing have a close association with regional geology and geography. Beijing is located at the northern tip of the North China Plain, near the meeting point of the Xishan and Yanshan mountain ranges. Therefore, faults, folds, and bedrock fissures have intensively developed in the mountain areas of Beijing. These tectonic factors, together with rock weathering, cause considerable amounts of heterogeneous, coarse, and unconsolidated sediments to accumulate on the surface of the mountain valleys (Cui et al. 2019), providing sufficient debris materials for debris flows. This, together with the steep valley slopes and low vegetation coverage rate, sets a favorable condition for debris flow initiations (Wu 2001; Zhong et al. 2004; Wang 2008). For the debris flows in this study, 20-100-mm-sized gravels account for more than 90% of the deposits of the debris flows on July 21, 2012 (Liu 2017), and the fine-grained material content is less than 1.4% in the deposits of the debris flow on July 16, 2018 (Li et al. 2019). The valleys where the debris flows occurred have slopes of 30°-60°, with some as high as 70°, and a low vegetation coverage rate with an average value of <50% (Liu 2017).
The debris flows in the mountain areas of Beijing mainly occur in the rainfall season (summer), particularly during the period from June to August. Almost all of the debris flows during 1949-2018 were triggered by the strong rainstorms between June and August (Wang 2020). This is related to the climatic characteristic of Beijing. The climate in the studied areas is controlled by the Asian Monsoon, characterized by a hot and humid climate in the summer, and a dry and cold climate in the winter. The annual average rainfall is around 600 mm with the strong convective storms and a large amount of precipitation occurring in summer. The precipitation during June-August accounts for more than 80% of the annual total rainfall.
These characteristics are consistent with the geological, geomorphic, and climatic conditions of the runoff-generated debris flows in many mountain areas across the globe (Imaizumi et al. 2006; Gregoretti and Dalla Fontana 2007; Coe et al. 2008; Simoni et al. 2020). Therefore, the debris flows in the mountain areas of Beijing are predominantly runoff-generated (Ma et al. 2018; Li et al. 2019), and their initiation mechanism differs from that of landslide-induced debris flows (Iverson 1997a, 1997b).

Material and methods
The detailed temporal and spatial changes of the storms on July 21, 2012, July 20, 2016, and July 16, 2018, recorded by the dense network of automatic weather stations, FY-2E infrared images, and Doppler weather radar (Zhong et al. 2015; Yang et al. 2018; Lei et al. 2020), provide good constraints on the rainfall patterns initiating debris flows (Figure 1a). Some meteorological stations, such as Hebeizhen and Nanjao, are located in the storm center areas, where the serious debris flows occurred (Figure 1b). The distances between the other debris flow sites and meteorological sites are within 5 km (Figure 1b). Most importantly, the infrared images and Doppler weather radar records suggest that all the selected stations are located exactly in the path of the rainstorm (Sun et al. 2013) and experienced rainfall processes similar to those at the debris flow sites (Figure 1a) (Huanling et al. 2014).
The occurrence time of the debris flows was taken from news reports, geohazard survey reports, and interviews with local residents. The interval between the recorded or witnessed time and the actual occurrence of each event is less than 30 min owing to the small size of the drainage area and the short distance between the hazard sites and witnesses.
The intensity-duration (I-D) model is the most common model employed for estimating empirical rainfall thresholds (Guzzetti et al. 2007, 2008; Baum and Godt 2010; Saito et al. 2010), and it has the following general form:

$$I = c + a D^{b}$$

where I is rainfall intensity (in mm/h), D is rainfall duration (in h), and c, a, and b are constants. In most cases, c is taken as zero (Guzzetti et al. 2007), and the equation becomes a simple power law.
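As a minimal sketch of how such a threshold could be evaluated, the snippet below assumes the general form I = c + a·D^b described above. The function names and the coefficients A_DEMO and B_DEMO are hypothetical and chosen only for illustration; they are not the values fitted in this study.

```python
def id_threshold(duration_h, a, b, c=0.0):
    """Threshold intensity (mm/h) from I = c + a * D**b, with D in hours."""
    if duration_h <= 0:
        raise ValueError("duration_h must be positive")
    return c + a * duration_h ** b


def exceeds_threshold(intensity_mm_h, duration_h, a, b, c=0.0):
    """True when an observed (D, I) pair lies on or above the threshold curve."""
    return intensity_mm_h >= id_threshold(duration_h, a, b, c)


# Hypothetical coefficients for illustration only (NOT the fitted values of this study).
A_DEMO, B_DEMO = 50.0, -0.5
```

With these illustrative coefficients, a 4-h rainfall would need an intensity of at least `id_threshold(4.0, A_DEMO, B_DEMO)` mm/h to fall in the risk region above the curve.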
Various methods have been used to define the rainfall I-D thresholds, from simple regression to integrated statistical methods. The simple I-D threshold is usually obtained by subjectively drawing the minimum-level lines for the rainfall intensity (Y-axis) and the duration condition that causes debris flows and landslides (X-axis) in Cartesian semi-logarithmic or double logarithmic coordinates (Guzzetti et al. 2007; Saito et al. 2010). In contrast, statistical methods try to estimate an objective threshold. Among the statistical methods, two models are often used to define the objective rainfall I-D thresholds: one is based on Bayesian inference, and the other uses the Frequentist method. The Frequentist method usually requires a large dataset that consistently covers the rainfall duration and mean intensity ranges to yield accurate results. In contrast, the Bayesian inference method is better suited to examining small datasets owing to its sensitivity to the position of a few data points (Leonarduzzi et al. 2017). Furthermore, the fitted I-D curves obtained by other methods, such as receiver operating characteristic curves (Fawcett 2006; Gorsevski et al. 2006) and True Skill Statistic methods (Staley et al. 2013; Gariano et al. 2015), are very similar to those derived using the Bayesian inference method (Leonarduzzi et al. 2017). Therefore, the Bayesian inference method is used in this study, given the small dataset. The Bayesian inference method is a probabilistic approach used to obtain estimates for the scale (the intercept) and the shape (the slope) of the power law curve representing the threshold, based on a set of rainfall intensity (I) and duration (D) conditions that have initiated debris flows (Guzzetti et al. 2007). This is obtained by defining a Bernoulli probability ($0 \le p \le 1$, $p \in \mathbb{R}^{+}$) of a data point occurring at a given value of rainfall intensity and duration.
The estimates of a and b, obtained through Bayesian inference of their posterior probability distributions given the model and the empirical data, are used to define the minimum I-D threshold curve. As in previous studies (Brunetti et al. 2010), this study uses the WinBUGS program (Lunn et al. 2000, http://www.mrc-bsu.cam.ac.uk/bugs/) to perform the Bayesian inference and estimate the rainfall thresholds.
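For readers without access to WinBUGS, the idea of estimating a and b and then defining a minimum curve can be illustrated with a much simpler stand-in. The sketch below is not the Bayesian procedure used in this study: it assumes c = 0, fits log10(I) = log10(a) + b·log10(D) by ordinary least squares, and then lowers the intercept until no triggering event lies below the curve. All function names are hypothetical.

```python
import math


def fit_power_law(durations_h, intensities_mm_h):
    """Least-squares fit of I = a * D**b in log-log space; returns (a, b)."""
    xs = [math.log10(d) for d in durations_h]
    ys = [math.log10(i) for i in intensities_mm_h]
    n = len(xs)
    if n < 2:
        raise ValueError("at least two (D, I) pairs are required")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("durations must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = 10 ** (mean_y - b * mean_x)
    return a, b


def lower_envelope(durations_h, intensities_mm_h, a, b):
    """Shift the intercept down so every triggering event lies on or above the curve."""
    min_residual = min(
        math.log10(i) - (math.log10(a) + b * math.log10(d))
        for d, i in zip(durations_h, intensities_mm_h)
    )
    return a * 10 ** min_residual, b
```

The slope b is kept from the regression and only the intercept a is lowered, so the resulting curve is a minimum-level line through the lowest triggering event rather than a best-fit line.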
Three rainfall thresholds are established based on the rainstorm data. One is derived from the real-time intensity ($I_i$), which is calculated from the hourly accumulated rainfall; the other two are based on average intensities, averaged over the interval from the beginning of the storm to the debris flow occurrence ($I_a$) and over the entire rainstorm duration ($I_w$), both of which are widely employed in previous studies.
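The three intensity types can be illustrated with a short sketch. It assumes an hourly rainfall series that starts at the beginning of the storm and takes the real-time intensity to be the rainfall accumulated in the hour in which the debris flow occurred; the function name and the 1-based hour index are assumptions made for this example.

```python
def intensity_types(hourly_mm, trigger_hour):
    """Return (I_i, I_a, I_w) in mm/h for one storm.

    hourly_mm    -- hourly rainfall totals (mm), starting at storm onset
    trigger_hour -- 1-based index of the hour in which the debris flow occurred
    """
    if not 1 <= trigger_hour <= len(hourly_mm):
        raise ValueError("trigger_hour must fall within the storm record")
    # Real-time intensity: rainfall accumulated during the triggering hour.
    i_i = hourly_mm[trigger_hour - 1]
    # Average intensity from storm onset up to the debris flow occurrence.
    i_a = sum(hourly_mm[:trigger_hour]) / trigger_hour
    # Average intensity over the whole storm duration.
    i_w = sum(hourly_mm) / len(hourly_mm)
    return i_i, i_a, i_w
```

For a made-up series such as `[2.0, 10.0, 30.0, 50.0, 8.0]` with the debris flow in hour 4, both averages fall well below the intensity of the triggering hour, which is why thresholds built on them sit lower.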

Results
There are distinct differences in rainfall, duration, and maximum intensity among the rainstorms of July 21, 2012, July 20, 2016, and July 16, 2018. The average recorded rainfall was around 170 mm for the storm on July 21, 2012, 203 mm for the storm on July 20, 2016, and 102 mm for the storm on July 16, 2018. The storm on July 21, 2012 was the shortest of the three, with a duration of 19 hours, in contrast to the long durations of the storms on July 20, 2016 (55 hours) and July 16, 2018 (60 hours). The storm on July 21, 2012 brought the largest cumulative precipitation of the three storms, with the maximum cumulative rainfall of 541 mm recorded in Hebeizhen of the Fangshan District. In addition, seven meteorological stations registered 24-h precipitations with a 100-year return period, while eight stations registered 24-h precipitations with a 50-year return period during this storm. In contrast, the cumulative precipitation in the storm center was 352 and 278 mm for the storms of July 20, 2016 and July 16, 2018, respectively (Table 1).
Similar to the rainfall patterns, the damage and losses caused by the three storms differ significantly. The most severe geohazards corresponded to the highest recorded rainfall intensity, that of the storm on July 21, 2012 (Figure 1a). A total of 22 debris flows and 10 landslides were triggered by this storm in Fangshan District. The serious disasters occurred in the center of the storm, such as Hebeizhen (7 disasters), Xiayunling (13 disasters), and Nanjao (3 disasters) (Figure 1). In contrast, the rainstorms on July 20, 2016 and July 16, 2018 each caused one debris flow in their respective storm centers.
The debris flows in the mountain areas of Beijing display a close association with both rainfall intensities and cumulative precipitations (Figure 2). All the debris flows occurred during periods with high rainfall intensities. The rainfall intensities ranged from 48.5 to 98.9 mm/h for the debris flows on July 21, 2012, and were 46 and 69.1 mm/h for the two events of 2016 and 2018, respectively. In addition, all the debris flows occurred after the cumulative precipitation reached a certain level (Figure 2), which is around 128 mm, 217 mm, and 174 mm for the debris flows on July 21, 2012, July 20, 2016, and July 16, 2018, respectively (Table 1). No debris flow occurred before the cumulative precipitation reached a threshold level, despite a high rainfall intensity (Table 1). For example, the debris flows at the Mentougou and Longquan sites were not initiated by the maximum intensity, but by later intensities that lagged the maximum by more than three hours, when the cumulative precipitation reached the threshold level (Figure 2).
Based on the data of the three storms in Beijing, we obtained the objective I-D models by Bayesian inference: Here, $D_i$, $D_a$, and $D_w$ respectively indicate the real-time rainfall duration, the interval from the beginning of the storm to the debris flow occurrence, and the whole rainfall duration. The results show that the thresholds derived from the rainfall intensities of different typologies differ significantly from each other (Figure 3). The $I_i$-$D_i$ threshold is the largest, and the $I_a$-$D_a$ threshold is the smallest.

Efficiency of the rainfall thresholds derived from rainfall intensities of different typology
The efficiencies of the thresholds derived from rainfall intensities of different typologies are evaluated by their capabilities in discriminating between the rainfalls with positive and negative debris-flow response (Guzzetti et al. 2007). To test the efficiencies of the various thresholds, the rainfall intensities with values of ≥10 mm/h that did not initiate debris flows in the mountain areas of Beijing are plotted in Figure 3a. Herein, 10 mm/h is used to delimit the lower limit of rainfall intensities that initiate debris flows because overland flow is generated on the surface of a slope after the cumulative rainfall reaches 10 mm in the studied region (Ma et al. 2016). Additionally, the peak and average rainfall intensities derived from historical records (Ma et al. 2016; Wang 2020) are plotted in Figure 3a. The rainfall threshold derived from the real-time intensities exhibits a high ability to separate the rainfalls with positive and negative responses in Beijing. Almost all the data without debris flow initiation, particularly those with similar rainfall durations, are in the safety region delimited by the $I_i$-$D_i$ threshold (Figure 3a). Furthermore, the rainfall data with the specific initiation time of the debris flows on July 21-22, 1989 (Wu 2001) are in the risk region delimited by our $I_i$-$D_i$ threshold (Figure 3a). In contrast, the thresholds derived from the two average intensities could not differentiate well between the rainfalls with positive and negative debris-flow responses. Many of the averaged intensities (35 of 92, 38%) with negative debris-flow responses are in the risk region defined by the $I_a$-$D_a$ threshold (Figure 3a). Similar to the $I_a$-$D_a$ threshold, the threshold derived from the intensities averaged over the whole rainstorm durations exhibits a lower discriminating ability (Figure 3a).
These data consistently suggest a high efficiency of the real-time threshold in discriminating the rainfalls with positive debris-flow response from those with negative response. The thresholds derived from average intensities ($I_a$-$D_a$ and $I_w$-$D_w$) underestimate the precipitation or intensity triggering debris flows and thus have a high rate of false alarms. Nevertheless, the $I_i$-$D_i$ model needs to be further refined owing to the limited spatial and temporal coverage of the three storms considered herein.
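The discrimination test described above can be expressed as a simple contingency count. The sketch below assumes a power-law threshold with c = 0 and a list of rainfall events labelled by whether a debris flow occurred; it also reports the True Skill Statistic mentioned in the methods. The function names and the event layout are assumptions made for this example, not part of the original analysis.

```python
def evaluate_threshold(events, a, b):
    """Count hits, false alarms, missed alarms, and correct negatives.

    events -- iterable of (duration_h, intensity_mm_h, debris_flow_occurred)
    a, b   -- coefficients of the threshold I = a * D**b
    """
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for duration_h, intensity_mm_h, occurred in events:
        in_risk_region = intensity_mm_h >= a * duration_h ** b
        if in_risk_region and occurred:
            counts["tp"] += 1  # correctly flagged debris flow
        elif in_risk_region and not occurred:
            counts["fp"] += 1  # false alarm
        elif occurred:
            counts["fn"] += 1  # missed alarm
        else:
            counts["tn"] += 1  # correctly in the safety region
    return counts


def true_skill_statistic(counts):
    """TSS = hit rate minus false-alarm rate, ranging from -1 to 1."""
    positives = counts["tp"] + counts["fn"]
    negatives = counts["fp"] + counts["tn"]
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative event")
    return counts["tp"] / positives - counts["fp"] / negatives
```

A threshold that is too low places many non-triggering rainfalls in the risk region and inflates `fp`, whereas one that is too high inflates `fn`.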
To evaluate the reliability of the thresholds established in previous studies, we compare our thresholds with the previous thresholds in Beijing. All of the previous thresholds were derived from historical records (Figure 3b), including the regional and local thresholds, the thresholds before and after 2000 AD based on data of 23 debris flows during 1963-2012 (Ma et al. 2016), the threshold based on the 49 events during 1949-2012 (Wang 2020), and one derived from peak intensities of 18 events of 1989-2012 (Tu et al. 2017). The $I_w$-$D_w$ (averaged over the whole storm) threshold of this study is very similar to that derived from the intensities of the same typology during 1949-2012 (Wang 2020). This evidence suggests a negligible difference in the rainfall patterns initiating debris flows between the three storms in this study and those of 1949-2012. Therefore, the rainfall thresholds of this study should be temporally representative of the rainfalls triggering debris flows in the mountain areas of Beijing. However, as discussed above, the $I_w$-$D_w$ threshold has a low efficiency in discriminating between the rainfalls with positive and negative debris-flow response. Regarding the reliability of previous rainfall thresholds, Figure 3b shows that almost half of the real-time intensities fall into the safe region defined by the previous I-D models, suggesting that previous thresholds have not captured the actual rainfall patterns triggering debris flows in Beijing. In contrast, all of the storm data from 1949 to 2012 that induced debris flows are in the risk region delimited by our $I_a$-$D_a$ threshold (Figure 3b). These data indicate that the $I_a$-$D_a$ threshold in this study provides a conservative (safe) estimate of the rainfalls initiating debris flows in Beijing, although it also yields a high rate of false alarms.

Comparison with the global and regional thresholds in other regions and physical implications of the parameters employed in the empirical model
Figure 3c shows a comparison of our rainfall thresholds with the global thresholds and the regional thresholds in other regions. The noteworthy characteristic in Figure 3c is that the $I_i$-$D_i$ threshold of Beijing exhibits a pattern similar to the thresholds of debris flows triggered by short, intense storms (Guadagno 1991; Larsen and Simon 1993; Wieczorek et al. 2000; Chen et al. 2005; Giannecchini 2005), particularly the threshold in Blue Ridge, Madison County, Virginia (Wieczorek et al. 2000), while it is higher than all global and most regional thresholds. This similarity might be associated with the small difference between the average and the real-time intensities for short, intense storms. Theoretically, the difference between the average and real-time intensities is small for short, intense storms, but it is significant for long, low-intensity rainfall. Therefore, the thresholds derived from the intensities averaged over short, intense storms would be higher than those from long-duration rainfalls. This assumption is consistent with the results of previous studies. Globally, the thresholds derived from short, intense storms are indeed larger than those derived from the intensities of other rainfall types (Guzzetti et al. 2007). This evidence, together with the high efficiency of the $I_i$-$D_i$ model in Beijing, demonstrates that the thresholds derived from the intensities averaged over long-duration rainfalls underestimate the actual rainfall intensities initiating debris flows.
The similarity between our $I_i$-$D_i$ threshold and the thresholds of debris flows triggered by short, intense storms has some implications for the physical interpretation of the parameters employed in empirical I-D curves. Figure 3c clearly shows that the slope of the $I_i$-$D_i$ curve (b) is similar to those of short, intense storms initiating debris flows, but the intercepts (a) vary substantially among regions. The similar value of b is consistent with the assumed physical interpretation of the slope. The parameter b represents the slope of the catchment outflow threshold, which depends on the land cover upstream of the initiation area, and its value is fixed. In contrast, the intercept of the curve represents the initial loss in headwater catchments, which is associated with regional catchment conditions. The initial loss includes rainfall loss due to interception, depression storage, percolation through the fractures and holes in rocks and soils, and wetting of the surface soils and/or rocks of the catchments. Therefore, the initial loss (a value) differs significantly among regions owing to their geological and geomorphological differences.

Dynamics of the debris flows in the mountain areas of Beijing
As discussed above, the debris flows in the mountain areas of Beijing cannot be explained by the rainfall intensity alone; they occurred only when both cumulative rainfall and intensity simultaneously reached a certain threshold. On July 21, 2012, no debris flow was triggered while the cumulative precipitation was relatively low at the Mentougou (118.4 mm) and Longquan (84.3 mm) sites, despite the rainfall intensity reaching its maximum (Figure 2). Similarly, the rainstorms with high intensity and low precipitation during the storm on July 16, 2018 caused few debris flows (Table 1). The data of this study indicate that the minimum values of accumulated rainfall and rainfall duration for debris flow initiation are 174 mm and 5 hours, respectively (Table 1). These results indicate that the debris flows in Beijing are initiated by the combination of high rainfall intensity and cumulative rainfall. The same scenario has been detected in many areas across the globe (Guzzetti et al. 2008), including central and southern Europe (Guzzetti et al. 2007), Japan (Saito et al. 2010), America (Baum and Godt 2010), and the areas with high-gradient slopes in the Himalaya mountains (Dahal and Hasegawa 2008) and Taiwan Island (Chen et al. 2005).
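The joint condition described above can be sketched as a check on an hourly rainfall series. The function name and the two limit parameters are hypothetical, and any values passed to them are illustrative rather than thresholds recommended by this study.

```python
def first_trigger_hour(hourly_mm, min_cumulative_mm, min_intensity_mm_h):
    """Return the first hour (1-based) in which BOTH conditions hold, else None.

    hourly_mm          -- hourly rainfall totals (mm), starting at storm onset
    min_cumulative_mm  -- assumed cumulative-rainfall level required for initiation
    min_intensity_mm_h -- assumed hourly intensity required for initiation
    """
    cumulative = 0.0
    for hour, rain in enumerate(hourly_mm, start=1):
        cumulative += rain
        # Intensity alone is not sufficient: the cumulative rainfall
        # must also have reached its level in the same hour.
        if cumulative >= min_cumulative_mm and rain >= min_intensity_mm_h:
            return hour
    return None
```

With a made-up series `[5, 60, 20, 15, 50, 10]`, a cumulative level of 120 mm, and an intensity level of 45 mm/h, the peak in hour 2 does not satisfy the joint condition and the first qualifying hour is hour 5, mirroring the delayed initiation relative to the maximum intensity.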
The contribution of antecedent precipitation to initiating debris flows is negligible in Beijing because of the low antecedent precipitation for the debris flows in this study (Ma et al. 2018; Li et al. 2019), though it played a significant role in some regions (Guzzetti et al. 2007; Smolíková et al. 202). For example, the 30-, 15-, and 5-day antecedent precipitations were respectively ~103.4, 102.2, and 30 mm in Xiabailianyu for the storm on July 16, 2018, which are considerably lower than the cumulative rainfall before the debris flow initiation during the storm. However, the high cumulative precipitations and long rainfall durations before the debris flow occurrence suggest that antecedent precipitation will play a role in debris flow initiation in the mountain areas of Beijing once it reaches a high level. High antecedent rainfall would lead to a high value of the antecedent moisture condition, which is an important correction factor in the Curve Number method of the Soil Conservation Service for effective rainfall calculation and a function of the initial conditions (Gregoretti and Dalla Fontana 2007, 2008).
Our data provide some implications for the dynamics of debris flow initiation. Based on the above discussion, the combination of cumulative precipitation and high rainfall intensity triggered the debris flows in Beijing. This characteristic provides a strong test for the assumed mechanisms of debris flow initiation (Iverson 1997b). According to Iverson (1997a), three factors control the development of debris flows: 1) failure of debris masses, 2) sufficient water for saturating the mass, and 3) sufficient conversion of the gravitational potential energy to internal kinetic energy for changing the motion to a flow. These three factors must be satisfied almost simultaneously for a debris flow to initiate (Ellen and Fleming 1987; Anderson and Sitar 1995; Iverson 1997a). For the debris flows in Beijing, the long-duration heavy rainfall and the resultant high cumulative rainfall before debris flow initiation provide sufficient time and water for infiltration and saturation of the debris sediments, subsequently mobilizing the sediments by increasing their pore pressures. Simultaneously, high water flows and surges caused by the intensive storms incorporate and retain the mobilized sediments, which flow down the slope and form debris flows.

Conclusion
This study systematically analyzed the evolution of three recent rainstorms and their association with debris flow initiation in the mountain areas of Beijing. The debris flows occurred during intervals with high rainfall intensities, but they did not always correspond to the maximum rainfall intensity; some lagged the maximum intensity by 3-4 hours. All the debris flows occurred when both the intensities and cumulative precipitations reached a certain level simultaneously, indicating a combined influence of rainfall intensity and cumulative precipitation.
The rainfall thresholds derived from the rainfall intensities of different typologies differ significantly in their efficiency in discriminating the rainfalls initiating debris flows. The threshold derived from the real-time intensities exhibits a high ability to separate the storms causing debris flows from those not causing debris flows. In contrast, the $I_a$-$D_a$ and $I_w$-$D_w$ thresholds exhibit a low discriminating ability, though the $I_a$-$D_a$ model provides a conservative (safe) threshold. The results of this study demonstrate the reliability of the real-time rainfall intensities in discriminating the rainfalls initiating debris flows, and the necessity of employing the real-time rainfall intensity for accurately estimating rainfall thresholds in future studies. Additionally, owing to the limited spatial and temporal coverage of the three storms, the $I_i$-$D_i$ threshold needs to be tested and refined using more datasets.

Disclosure statement
No potential conflict of interest was reported by the author.

Data availability statement
The data that support the findings of this study are available from the corresponding author, Bing Xu, upon reasonable request.