Reliability analysis for a large and complex landslide in the Three Gorges Reservoir area (China) based on incomplete information

Abstract The soil parameters for large, complex landslides are typically derived from incomplete information based on a small sample set due to budgetary constraints. This informational incompleteness results in large statistical uncertainty in landslide reliability analyses. In this article, the bootstrap technique is proposed to quantify the statistical uncertainties associated with a small sample set, and a practice-oriented reliability analysis is performed. The results suggest that the obtained reliability indices are characterized by a long tail, in which the worst-case scenario has a local extreme value and a small population. The statistical uncertainties are quantified and characterized by a confidence interval at a specified confidence level. The confidence interval of the reliability index and identification of the worst-case scenario enable engineers to make more informed decisions.


Introduction
Landslides are one of the most common hazards worldwide (Barbarella et al. 2015). The movement and failure of landslides, especially large-scale landslides, have potentially catastrophic societal and economic consequences (Ma et al. 2017a, 2017b). For instance, the Jiweishan rockslide, a large-scale rockslide that occurred on June 5, 2009, in Wulong County, Chongqing, China, took the lives of 74 people (Tang et al. 2015c).
The volume of large landslide events can range from a few to hundreds of millions of cubic meters, and the velocity of such a landslide can easily surpass 100 m/s (Hürlimann et al. 2001). Engineering measures are frequently inappropriate for addressing such large-scale landslides due to the extremely high kinetic energy involved. Therefore, suitable emergency planning is the only effective method of reducing the vulnerability of potentially affected areas (Crosta and Agliardi 2003). Identifying the triggering factors and evolution processes, developing a satisfactory understanding of the landslide geometry and kinematics, and performing reasonable stability assessments are required for managing such emergencies. Geological investigations (Hasegawa et al. 2009; Tang et al. 2015a; Deng et al. 2017), monitoring (Borgatti et al. 2006; Peyret et al. 2008), and model testing have been performed to study the evolution mechanisms of large-scale landslides, and analyses of monitoring data have been performed to improve the knowledge of sliding kinematics (Peyret et al. 2008) and derive appropriate criteria (Corominas 1996; Crosta and Agliardi 2003). Numerical analytical techniques, e.g. the limit equilibrium method (Sun et al. 2016), reliability-based method (Wu et al. 2014, 2017; Tang et al. 2015b; Zhao et al. 2016), finite element method (Hu et al. 2012), and discrete element method (Tang et al. 2009; Chang et al. 2012), have been applied in stability assessments of large-scale landslides.
Stability assessments of large-scale landslides are complex and remain a key challenge in natural hazard research because of the uncertainties involved (Giasi et al. 2003). In fact, three main sources of uncertainty are associated with soil parameter modeling in stability assessments: in situ variability, measurement errors, and statistical uncertainty (schematically illustrated in Figure 1) (Giasi et al. 2003; Most and Knabe 2010; Luo et al. 2013). In situ variability is associated with variations in the mineral composition, natural processes, or stress history of the soil mass ((a) in Figure 1). Measurement errors are caused by sampling disturbances, test imperfections and human error ((b) in Figure 1); this type of uncertainty can be addressed with multiple or repeated measurements or by providing measurements with error bounds. Moreover, assuming that measurement errors are minimal for soil data obtained in research programs is reasonable because good equipment and procedural controls are likely to be maintained (Orchant et al. 1988). Statistical uncertainty mainly results from a limited number of tests and samples ((c) in Figure 1). In practical applications, only limited numbers of samples are tested due to budgetary constraints associated with in situ investigations and field and laboratory tests; this incompleteness can lead to substantial statistical uncertainty in soil parameter modeling (Luo et al. 2013; Li et al. 2015).
The reliability-based method has the ability to account for estimation uncertainties in geological, geotechnical, and geomorphological parameters (Singh et al. 2013); thus, it has gained increasing popularity in the stability assessments of large-scale landslides. For example, Wu et al. (2014) used the random-fuzzy method to estimate the reliability index values for a large-scale reservoir landslide. Tang et al. (2015b) performed dynamic reliability analysis on a large earthquake-induced landslide by considering the energy-time distribution. Additionally, Wu et al. (2017) studied variations in the reliability index values of the large and complex Huangtupo landslide under periodic rainfall and water level fluctuations.
However, current research mostly focuses on the reliability analysis of large landslides under different driving factors, and few studies have examined the effect of statistical uncertainty on the landslide reliability index. The purpose of this study is to perform a practice-oriented reliability analysis for a large, complex landslide with incomplete information. A bootstrap technique is proposed to quantify the statistical uncertainty associated with a small sample set and achieve a more meaningful evaluation of the landslide reliability. For this purpose, the Outang landslide, a large and complex landslide in the Three Gorges Reservoir area, is chosen as a case study.

Bootstrap
The bootstrap method (Efron 1979) is a nonparametric technique that can be used to estimate variations in sample statistics derived from a small sample size (for example, fewer than 30 data points) (Mojtahedi et al. 2009; Luo et al. 2013; Li et al. 2015; Valášková et al. 2015). This technique is flexible, easy to implement, and applicable in nonparametric settings, and it requires a minimal set of assumptions (Mojtahedi et al. 2009; Ma et al. 2018). The notion behind bootstrapping is to create large sets of bootstrap samples by random sampling with replacement from the original observations. Bootstrap samples can have a smaller dimension than the original observations. The best results are achieved if the size of the single bootstrap sample is set to the original sample size (Luo et al. 2013; Valášková et al. 2015). Let $X = \{x_1, x_2, \ldots, x_n\}$ denote the original observations. A bootstrap sample set $B_j = \{x_1^j, x_2^j, \ldots, x_n^j\}$ can be constructed by random sampling with replacement from the original observations $X$ (schematically illustrated in Table 1). An observation $x_i$ in the original observations may appear once, more than once or not at all. The sample mean and standard deviation of the constructed bootstrap sample set $B_j$ are given as follows:

$$\mu_{B_j} = \frac{1}{n}\sum_{i=1}^{n} B_i^j \qquad (1)$$

$$\sigma_{B_j} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(B_i^j - \mu_{B_j}\right)^2} \qquad (2)$$

where $B_i^j$ is the ith sample in the jth bootstrap sample set. The above sampling procedure is repeated many times, and B bootstrap sample sets are obtained.
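The resampling procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the function name `bootstrap_sample_stats`, the seed, and the observation values are hypothetical, and the use of the n - 1 divisor for the standard deviation is an assumption.

```python
import numpy as np

def bootstrap_sample_stats(x, n_boot=2000, seed=0):
    """Draw n_boot bootstrap sample sets from x and return their means and stds.

    Each bootstrap set has the same size n as the original observations and is
    drawn by random sampling with replacement.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    rng = np.random.default_rng(seed)
    # Each row holds the indices of one bootstrap sample set B_j.
    idx = rng.integers(0, n, size=(n_boot, n))
    sets = x[idx]
    return {
        "mean": sets.mean(axis=1),
        # ddof=1 gives the n - 1 divisor (an assumption of this sketch).
        "std": sets.std(axis=1, ddof=1),
    }

# Hypothetical placeholder observations, NOT the measurements in Table 3.
x_obs = [7.5, 8.2, 9.1, 8.8, 9.6, 8.4, 9.9, 8.9]
stats = bootstrap_sample_stats(x_obs, n_boot=2000, seed=1)
print(stats["mean"][:5], stats["std"][:5])
```

Because indices are drawn independently for every position, a given observation can appear once, several times, or not at all in any one bootstrap set.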

Features of the Outang landslide
The Outang landslide (Figure 2(a)), which is a large and complex landslide in the Three Gorges Reservoir, occurred in Anping Town of Fengjie County, Chongqing, on the south bank of the Yangtze River approximately 177 km from the Three Gorges Reservoir Dam (see Figure 2(b) and (c) for its location). More than 4119 people reside at the landslide site. The main sliding direction of the landslide is 340-350° (Figure 3). The Outang landslide has a maximum length and width of 1800 m and 1100 m, respectively. The average thickness of the landslide body is approximately 50.8 m. The entire planar area of the landslide is approximately 1.78 million m², and its estimated volume is 90 million m³. The landslide body extends from an elevation of 95 m above sea level (a.s.l.) at the toe to 705 m a.s.l. at the crown (Figures 3 and 4). The landslide has an average slope gradient of 25°.
The landslide is composed of three blocks (Figures 3 and 4). Block 3 extends from an elevation of 95 m a.s.l. at the toe to 370 m a.s.l. at the crown. The planar area of Block 3 is approximately 0.92 million m², and the volume of Block 3 is approximately 64.8 million m³. Block 2 lies at an elevation between 250 and 530 m a.s.l. Block 2 has an estimated planar area and volume of 0.32 million m² and 10.2 million m³, respectively. Block 1 extends from an elevation of 400 m a.s.l. at the toe to 705 m a.s.l. at the crown. The planar area of Block 1 is 0.54 million m², and the volume of Block 1 is approximately 14.5 million m³. The stratigraphic units present in the study area (Figure 3) are loose Quaternary deposits (Q₄) in addition to the Zhenzhuchong formation (J₁z) and Xujiahe formation (T₃xj). A detailed description of the stratigraphic units is given in Table 2. A site-specific investigation involving exploratory boreholes, open test pits, and exploratory tunnels indicated that the landslide masses are composed of silty clay with gravel and fractured rock mass (Figure 4). The sliding zone is composed of carbonaceous clay with a thickness of 5-8 cm. The sliding mass is underlain by sandstone of the Jurassic Zhenzhuchong formation with an average dip direction of 320-350° and a dip angle of 20-28°. Groundwater is abundant in the landslide (see Figure 3 for the groundwater level). A flow rate ranging from 3 to 7 tons per day was measured in February 2004 during the excavation of an exploratory tunnel (PD1).
The Outang landslide, a large ancient landslide, was reactivated by the initial impoundment of the Three Gorges Reservoir Dam in June 2003. Tension cracks at the crown between the elevations of 700 and 705 m a.s.l. (see Figure 3 for the crack locations) were first noticed by the locals during the 1950s. The crown crack was perpendicular to the sliding direction, and it has increased in width with time under the combined effects of water level fluctuations and rainfall. A site-specific investigation showed that the crown crack reached approximately 6 m in width (Figure 5) and that 160 noticeable cracks distributed on the landslide surface endanger the residents living at the landslide site. The residents began to relocate in October 2014 due to these precarious circumstances (Yin et al. 2016), and their houses have since been demolished. For the purpose of landslide monitoring and early warning, a multiple monitoring system consisting of 37 GPS monuments and 17 inclinometer boreholes was installed on the landslide mass.

In situ shear test
In situ shear tests represent an effective method of obtaining shear parameters because of the following potential benefits. First, in situ shear tests can be performed in the natural environment without sampling disturbances. Second, in situ tests can be performed on soils, such as dry and single-grain sand, which are either impossible or difficult to sample without the use of expensive and specialized methods. Third, a larger volume of soil can be tested than is normally practicable for laboratory testing. This larger volume may be more representative of the soil mass. Fourth, the detection of the sheared surface is much more likely and practical. Because of these advantages, in situ shear tests have been increasingly used in recent decades (Wu and Watson 1998; Li et al. 2004).
In this study, in situ shear tests (Figure 6) were performed in an exploratory tunnel (PD1) and open pit (TJ4) (see Figure 3 for the test locations) to study the shear strength of the sliding zone soil. Each soil block was 50 cm long, 50 cm wide, and 35 cm high. A total of eight sets of in situ shear tests were performed due to budgetary constraints. Each test set contained five soil blocks. In the exploratory tunnel and open pit, the sliding zone soil was saturated with groundwater. The saturated shear parameters of the sliding zone measured from the in situ shear test are shown in Table 3. From a statistical perspective, the dataset of shear strength parameters is considered incomplete information and is based on a small sample size, which will cause large statistical uncertainty in the soil parameters. The reliability index is particularly sensitive to uncertainties in the shear strength parameters. Therefore, the shear strength parameters were treated as uncertain variables, and the densities of the landslide materials were treated as constants. The deterministic quantities of the density for silty clay and fractured rock mass are as follows: $\rho_{\text{silty clay}} = 2010$ kg/m³ and $\rho_{\text{fractured rock mass}} = 2500$ kg/m³.

Reliability analysis for the Outang landslide
Table 3. Saturated shear strength parameters of the sliding zone soil measured during the in situ shear test (see Figure 3 for the test location).
The sample statistics derived for the Outang landslide from the small sample set measured from the in situ shear tests are given in Table 3. These sample statistics were used to generate random values for the reliability analysis. Previous studies have shown that most soil parameters can be adequately modeled with a truncated normal distribution (Most and Knabe 2010; Luo et al. 2013). Therefore, the shear parameters of the sliding zone soil for the Outang landslide were modeled with truncated normal distributions. For the sake of simplicity, reliability analyses were performed under the assumption of fully saturated conditions. In fact, saturated conditions are commonly assumed to correspond to the most critical conditions in practical applications (Gofar and Rahardjo 2017). The SLOPE/W module of the landslide stability analysis software program GeoStudio (Geo-Slope International Ltd. 2012) was used to calculate the reliability index and probability of failure of the Outang landslide, and the Monte Carlo simulation (MCS) was run with 10,000 trials. The calculations of the safety factor were based on the Morgenstern-Price method. The reliability index based on the MCS of the Outang landslide was calculated as 0.78, and the corresponding probability of failure was found to be 21.9%.
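To make the MCS workflow concrete, the sketch below samples cohesion and friction angle from truncated normal distributions, evaluates a factor of safety for each trial, and summarizes the trials as a reliability index and a failure probability. It is an illustration under stated assumptions only: the study used the Morgenstern-Price method in SLOPE/W, whereas this sketch swaps in a hypothetical single-block planar sliding model; the block weight, base length, dip angle, truncation bounds of three standard deviations, and all input values are placeholders rather than data from the paper. The reliability index is taken here as (mean FS - 1) / std FS, which is also an assumption of the sketch.

```python
import numpy as np

def truncated_normal(rng, mean, std, low, high, size):
    """Sample a normal distribution truncated to [low, high] by rejection."""
    out = np.empty(size)
    filled = 0
    while filled < size:
        draw = rng.normal(mean, std, size=size)
        draw = draw[(draw >= low) & (draw <= high)]
        take = min(draw.size, size - filled)
        out[filled:filled + take] = draw[:take]
        filled += take
    return out

def planar_factor_of_safety(c_kpa, phi_deg, weight_kn=5000.0,
                            base_len_m=60.0, dip_deg=22.0):
    """Hypothetical planar block model (per metre width), NOT Morgenstern-Price."""
    dip = np.radians(dip_deg)
    resisting = (c_kpa * base_len_m
                 + weight_kn * np.cos(dip) * np.tan(np.radians(phi_deg)))
    driving = weight_kn * np.sin(dip)
    return resisting / driving

def mcs_reliability(c_mean, c_std, phi_mean, phi_std, n_trials=10_000, seed=0):
    """Return (reliability index, failure probability) from n_trials MCS trials."""
    rng = np.random.default_rng(seed)
    # Truncation at mean +/- 3 std (floored at zero) is an assumption.
    c = truncated_normal(rng, c_mean, c_std,
                         max(0.0, c_mean - 3 * c_std), c_mean + 3 * c_std, n_trials)
    phi = truncated_normal(rng, phi_mean, phi_std,
                           max(0.0, phi_mean - 3 * phi_std), phi_mean + 3 * phi_std,
                           n_trials)
    fs = planar_factor_of_safety(c, phi)
    beta = (fs.mean() - 1.0) / fs.std(ddof=1)
    pf = np.mean(fs < 1.0)  # fraction of trials with FS below 1
    return float(beta), float(pf)

# Placeholder statistics, not the Table 3 values.
beta, pf = mcs_reliability(c_mean=8.8, c_std=1.0, phi_mean=16.5, phi_std=1.0)
print(f"beta = {beta:.2f}, Pf = {pf:.1%}")
```

The numbers printed by this sketch depend entirely on the placeholder geometry and are not comparable to the reported values of 0.78 and 21.9%.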

Reliability analysis for the Outang landslide using bootstrapping
The overall framework of the reliability analysis for the Outang landslide using bootstrapping is shown in Figure 7. A total of B bootstrap sample sets were generated from the original observations shown in Table 3. The cohesive strength and friction angle were determined from the same soil sample; therefore, they are treated as a pair of data when they are resampled using bootstrapping. Bootstrap sampling represents a tradeoff between the maximum number of replications that can be performed in a reasonable amount of time and the minimum number of replications needed to accurately estimate the sample statistics. Generally, the number of bootstrap sample sets B must be very large, e.g. 10,000, to obtain converged results in the statistical evaluation (Most and Knabe 2010; Luo et al. 2013). However, for a large and complex landslide such as the Outang landslide, for which a single reliability analysis is already numerically demanding, the computational cost of a reliability analysis using large-scale bootstrap sampling can become prohibitive. To reduce the computational cost, a sensitivity analysis was performed to determine the optimal bootstrap number B. In the sensitivity analysis, different bootstrap replicate numbers ranging from 10 to 20,000 were used to generate bootstrap sample sets of different sizes. The bootstrap means of the sample mean value were calculated as follows:

$$\bar{\mu}_B = \frac{1}{B}\sum_{j=1}^{B}\mu_{B_j} \qquad (3)$$

where $\mu_{B_j}$ is the sample mean of the jth bootstrap sample set. Figure 8 shows the fluctuations in the bootstrap means of the sample mean values with respect to the number of bootstrap replicate simulations. Figure 8 illustrates that the converged result of the bootstrap means was achieved at 2000 bootstrap replicate simulations. The bootstrap means of the sample mean cohesive strength and friction angle values are 8.81 kPa and 16.49°, respectively. These numbers correspond well with the original sample means shown in Table 3, suggesting that bootstrapping reflects the main characteristics of the original observations. Therefore, the bootstrap replicate number B was set to 2000 in the paired bootstrap method.
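The sensitivity analysis on B can be sketched as follows: for each candidate number of replicates, bootstrap sample sets are drawn and the mean of their sample means is computed, so that the stabilization of this quantity with increasing B can be inspected. The function name, the list of replicate numbers, and the observation values are hypothetical and serve only to illustrate the idea.

```python
import numpy as np

def bootstrap_mean_convergence(x, sizes=(10, 100, 500, 2000, 20000), seed=0):
    """Return {B: mean of the B bootstrap sample means} for each B in sizes."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    result = {}
    for b in sizes:
        # b bootstrap sets, each the same size as the original sample.
        idx = rng.integers(0, x.size, size=(b, x.size))
        result[b] = float(x[idx].mean(axis=1).mean())
    return result

# Hypothetical placeholder observations, NOT the measurements in Table 3.
phi_obs = [15.8, 16.1, 16.9, 16.4, 17.2, 16.0, 17.0, 16.5]
conv = bootstrap_mean_convergence(phi_obs)
for b, value in conv.items():
    print(f"B = {b:>6d}: bootstrap mean of sample means = {value:.3f}")
```

In practice, B would be chosen as the smallest value beyond which the printed quantity no longer changes appreciably, which mirrors the reasoning behind the choice of 2000 replicates.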
After 2000 bootstrap replications, a total of 2000 bootstrap sample sets were generated, and the sample statistics (mean value and standard deviation) for each bootstrap sample set were calculated. The sample statistics of the 2000 bootstrap sample sets generated from the original observations are shown in Figure 9, which shows that the mean value and standard deviation of the bootstrap sample sets both follow a normal distribution.
The sample statistics shown in Figure 9 were set as inputs for the probability density function to generate random values for the reliability analysis in the MCS. The MCS-based reliability analysis described in the preceding section was replicated 2000 times; consequently, a total of 2000 reliability indices were obtained. The corresponding reliability indices are shown in Table 4 and Figure 10. The variation in the computed reliability indices resulting from the variation in the sample statistics can be observed in Figure 10. Reliability indices ranging from -0.28 to 6.75 were obtained, and they presented a median value of 0.84 and a standard deviation of 0.44. A confidence interval, which is a range of values that is likely to contain the true value, was used to express the variation in the computed reliability indices. A confidence interval at the 90% level, for example, can be derived to quantify the statistical uncertainties based on the most obvious technique: the percentile method (Chernick 2008). By ordering the bootstrap estimates from smallest to largest, we could define an interval by identifying the percentiles corresponding to the desired confidence level as the lower and upper bounds (e.g. 5% and 95% for a 90% confidence interval). In this study, the 90% confidence interval of the reliability index was computed. Instead of a single crisp reliability index based on the MCS, an interval ranging from 0.31 to 1.68 was obtained. The high standard deviation and wide confidence interval signify the presence of a high level of uncertainty associated with incomplete information. Figure 10 indicates that the obtained reliability indices were characterized by a long tail consisting of local extreme values with a small population located far away from the mean on both sides (Bruce 2004).
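The percentile method amounts to reading two quantiles off the sorted bootstrap estimates. The sketch below shows this for a 90% interval; the function name is hypothetical, and the array of reliability indices is synthetic (drawn from a normal distribution purely for illustration), so its interval will not reproduce the 0.31 to 1.68 range reported here.

```python
import numpy as np

def percentile_interval(values, level=0.90):
    """Percentile-method confidence interval at the given confidence level."""
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(values, [alpha, 1.0 - alpha])
    return float(lower), float(upper)

# Synthetic stand-in for the 2000 bootstrap reliability indices (assumption).
rng = np.random.default_rng(0)
betas = rng.normal(loc=0.84, scale=0.44, size=2000)

ci_low, ci_high = percentile_interval(betas, level=0.90)
print(f"90% interval: [{ci_low:.2f}, {ci_high:.2f}]")
print(f"worst case (minimum index): {betas.min():.2f}")
```

The minimum of the bootstrap estimates is reported alongside the interval because the percentile bounds, by construction, exclude the extreme tail that defines the worst-case scenario.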
Here, these extreme tails are important because they provide a notion of the worst possible scenarios that could be anticipated for the reliability of the landslide, and these tails must be particularly investigated. In the worst possible scenario, the reliability index of the Outang landslide decreased to -0.28 and presented a failure probability of 60.98%. This probability is nearly three times that obtained based on the MCS (21.9%).
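As a rough cross-check of how a reliability index maps to a failure probability, the relation Pf = Φ(-β) can be evaluated, where Φ is the standard normal cumulative distribution function. This relation assumes a normally distributed safety margin, which is an assumption of this sketch; the probabilities reported in the study come from the MCS itself and therefore need not match it exactly. The function name is hypothetical.

```python
import math

def failure_probability(beta):
    """Pf = Phi(-beta) for a normally distributed safety margin (assumption)."""
    return 0.5 * math.erfc(beta / math.sqrt(2.0))

for label, beta in [("MCS", 0.78), ("bootstrap median", 0.84), ("worst case", -0.28)]:
    print(f"{label:>16s}: beta = {beta:+.2f}, Pf = {failure_probability(beta):.1%}")
```

A negative reliability index always corresponds to a failure probability above 50% under this relation, which is why the worst-case index of -0.28 signals a markedly less stable state than the MCS result.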
The above results clearly indicate the need to consider the statistical uncertainty caused by incomplete information to assess the quality of the computed reliability indices and failure probability.

Comparison of MCS-based and bootstrapping-based reliability analyses
Based on the hierarchy indices of the failure probability and landslide stability state (Wu et al. 2017), the MCS-based reliability analysis indicates that the Outang landslide is basically stable under the most critical state, whereas the bootstrapping-based reliability analysis indicates that the Outang landslide is basically stable to highly unstable under the most critical state.
A heavy rainfall event occurred in northeastern Chongqing from August 31 to September 2, 2014. The maximum accumulated rainfall was recorded in Yunyang County (31.26° N, 108.92° E), where the 12-hour rainfall amount reached 210 mm. An hourly rainfall amount of 38 mm was recorded in Yunyang County at 07:00 on September 1, 2014. The landslide area also experienced heavy rainfall with a total rainfall amount of 217.6 mm. Under this heavy rainfall, the landslide mass deformed severely. A maximum displacement of 96 mm was recorded by the GPS monument. Additionally, significant deep deformation with a magnitude of 52 mm was recorded in borehole ZK 89 by the inclinometer at a depth of 34.5 m. Moreover, cracks appeared in the concrete sidewall of the exploratory tunnel, and the crack reached a length of 3 m and a width of 3 mm in late September 2014. These deformation characteristics indicate that the Outang landslide is at a high risk of failure under the most critical conditions; consequently, the government decided to implement relocation and demolition. The finding of a high failure risk is consistent with the reliability analysis based on incomplete information using bootstrapping. Therefore, we conclude that the bootstrapping-based reliability analysis provides a reasonable estimate of the stability state.
In this study, a reliability index of 0.78 and a failure probability of 21.9% were obtained from the MCS, and they correspond well with the median value obtained from bootstrapping, which suggests that conventional MCS-based reliability analyses can reflect the average characteristics of the original observations. However, the computed failure probability ranges from 19.7% in the median case to 60.98% in the worst possible scenario, which is equivalent to a roughly threefold variation in the magnitude of the failure probability.
For many engineers, bootstrapping-based reliability analysis and constructed confidence estimations are attractive but difficult to implement in landslide engineering. Therefore, considerable care should be exercised by decision makers when using such MCS-based estimates. An emergency backup plan should be implemented for the worst possible emergency.
To simplify matters, reliability analyses were performed with the assumption of fully saturated conditions. We hope that this shortcoming will not significantly affect the clarity of the present findings.

Conclusions
Reliability analyses of the large and complex Outang landslide were performed based on incomplete information using the bootstrap technique. The results indicate that for the Outang landslide, a high level of uncertainty is associated with incomplete information. The obtained reliability indices using bootstrapping are characterized by a long tail consisting of a worst-case scenario with a local extreme reliability index and a small population. The statistical uncertainties are quantified and characterized by a confidence interval at a specified confidence level. Generating a confidence interval of the reliability index and identifying the worst-case scenario enables engineers to make more informed decisions.
The successful implementation suggests that the bootstrap technique and proposed methodology could be useful for quantifying existing statistical uncertainties associated with small sample sets and achieving more meaningful evaluations of landslide reliability. However, the bootstrap technique also has certain limitations when applied to small sample sets. A main concern for very small sample sizes (e.g. two or three samples) is that the bootstrap samples will underrepresent the true variability: with only a few values from which to select, the same observations are drawn repeatedly, and thus, identical bootstrap samples recur frequently.
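The limitation for very small samples can be made tangible with a short combinatorial sketch. For n distinct observations, the number of distinct bootstrap sample sets of size n (ignoring order) is the binomial coefficient C(2n - 1, n), and the probability that a bootstrap set consists of one observation repeated n times is n^(1 - n). The function names are hypothetical, and the assumption that all observations are distinct is made only for this illustration.

```python
import math

def distinct_bootstrap_sets(n):
    """Number of distinct size-n bootstrap sets from n distinct observations."""
    return math.comb(2 * n - 1, n)

def degenerate_set_probability(n):
    """Probability that one bootstrap set repeats a single observation n times."""
    return n * (1.0 / n) ** n

for n in (2, 3, 8, 30):
    print(f"n = {n:>2d}: distinct sets = {distinct_bootstrap_sets(n)}, "
          f"P(degenerate set) = {degenerate_set_probability(n):.3g}")
```

With two or three observations there are only 3 or 10 distinct bootstrap sets, and a sizeable share of them have zero standard deviation, so increasing B cannot add information that the original sample does not contain.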

Disclosure statement
No potential conflict of interest was reported by the authors.