Implementation of reliability-based thresholds to excavation of shotcrete-supported rock tunnels

ABSTRACT Modern tunnelling in rock relies heavily on information from monitoring and observations during construction as a means to reduce the considerable uncertainty that originates from a lack of knowledge about the ground conditions. By formally integrating the monitoring information into the structural safety evaluation, a comprehensive risk-based design framework can be achieved. The establishment of relevant thresholds against unacceptable structural behaviour is a key aspect of this work. This paper presents an extensive application example based on real-case data, showing how a reliability-based threshold can be established to ensure the serviceability of a shotcrete lining in a rock tunnel in a challenging geological setting. The authors address and discuss practical approaches to consider the model uncertainty related to the confinement loss caused by the tunnel front advance, as well as the transformation uncertainty related to the use of indirect monitoring information. The established threshold is discussed in the context of the observational method.


Introduction
Monitoring and observation of the structural behaviour during construction are key components of modern tunnelling, because of the considerable prevailing uncertainties of the geological and geotechnical conditions in the ground; this has paved the way for observation-based tunnelling approaches (Schubert 2008;Stille and Holmberg 2010;Spross and Larsson 2014;Palmström and Stille 2015;Bjureland et al. 2017;Spross et al. 2018). These uncertainties originate either from natural variability in the ground conditions (aleatory uncertainty) or from a lack of knowledge of the conditions (epistemic uncertainty), where the latter is caused mainly by the limited extent of the pre-investigations performed. The knowledge gained from monitoring and observations during construction are therefore vital in satisfying the structural safety requirements, as additional information generally implies that the epistemic uncertainties are reduced and that the calculated structural reliability is improved .
If the tunnel has been designed with reliability-based methods (e.g. Langford and Diederichs 2013;Lü et al. 2017;Kroetz et al. 2018;Napa-García, Beck, and Celestino 2018;Bjureland et al. 2019), the knowledge gained can be accounted for straightforwardly through Bayesian updating, as demonstrated recently for tunnels by Feng and Jimenez (2015) and Feng et al. (2019), and for other geotechnical structures by Schweckendiek and Vrouwenvelder (2013), Li et al. (2016) and Contreras and Brown (2019). However, the primary purpose of performing monitoring and observations of the structural performance is normally to gain information about when specific measures are needed to ensure that the considered limit states are not attained. This requires that the monitoring or observations be linked to a threshold that triggers the implementation of safetyenhancing modifications to the structure. Consequently, a threshold needs to be established so that the triggered measure can be implemented before the corresponding limit state is attained. From a structural reliability point of view, the threshold shall be established such that the probability of failure of the structural component is satisfactory (i.e. the calculated probability of failure, p F , is less than the target, p F,T ), as long as the monitoring result does not violate the threshold.
Although such thresholds often have a key role in the execution of a geotechnical design, there is, however, a considerable lack of practical guidance on how to establish the threshold so that it provides a sufficient safety margin with respect to the problem at hand; for example, the application guidelines to Eurocode 7 (Frank et al. 2004) state only that "it is the designer's responsibility to prepare and communicate specifications for any such monitoring". Spross and Johansson (2017) showed in a simplified example how such reliability-based thresholds can be determined and applied to monitor the deformation of a rock pillar in the context of the observational method. The procedure for determining reliability-based thresholds was later theoretically extended by Spross and Gasch (2019) to a more general class of engineering problems, by making use of the Finite Element Method and Subset simulation. There is, however, a need for more comprehensive case examples that discuss the practical implementation of such thresholds to complex geotechnical structures, where the epistemic uncertainty typically is considerable. In this paper, we, therefore, investigate how reliability-based thresholds can be established to ensure the structural reliability of the shotcrete lining in a rock tunnel. In particular, we discuss the practical management of the threshold in the design of tunnels with the observational method (Peck 1969;CEN 2004) and the practical challenges in accounting for the involved model errors. The study uses the challenging geological conditions that were encountered in the Stockholm bypass rock tunnel project (European highway E4), where the excavation passed through a fault zone under lake Mälaren, west of Stockholm, Sweden.
2. Method to establish reliability-based thresholds 2.1. Threshold definition Describing a limit state as a function G(X) = 0, with X = [X 1 , . . . , X m ] being a vector of the relevant random variables, Spross and Johansson (2017) showed how a set of threshold values, x alarm , can be determined from an equality, so that the thresholds facilitate the structural safety requirements: In short, the procedure utilises the fact that a potential monitoring result can be described in terms of a probability of violating the threshold. For example, assuming monitoring of X 1 , the probability becomes P(h(X) = x 1,alarm − X 1 ≤ 0), where the function h(X) can be interpreted as a limit state function of its own. Thereby, the threshold value x 1,alarm can be determined using any structural reliability method, as Equation (1) may be reformulated into the following equality, after taking the new information Z from the monitoring into account (Straub 2014(Straub , 2015: where the numerator can be seen as parallel-system multiple failure modes and the denominator as a single failure mode. (In the simplest case with only one monitored parameter, Equation 2 is simply solved for the only unknown variable x 1,alarm ). Here, h(X) is defined so that critical behaviour corresponds to exceedance of the threshold, but thresholds against too low readings can also straightforwardly be determined. Note that if more than one threshold is to be established, Equation (1) becomes underdetermined, prompting additional assumptions or information (Spross and Gasch 2019). Such assumptions may, for example, concern the probability of exceeding the threshold. In this paper, however, we limit the study to one threshold.

Algorithm based on Subset simulation
The procedure to establish reliability-based thresholds from Equation (2) requires an iterative approach. This can be solved with crude Monte Carlo simulation (Spross and Johansson 2017), but when the limit state function requires input from a computationally demanding structural model (e.g. a Finite Element model), crude Monte Carlo becomes very inefficient. Spross and Gasch (2019), therefore, developed a general computational algorithm that efficiently computes such thresholds. The complete algorithm is provided in Appendix A and outlined in the following. The algorithm applies Subset simulation (Au 2001(Au , 2014, which is an adaptive form of Monte Carlo simulation that is particularly efficient for estimation of low failure probabilities when there are many random variables (Straub, Papaioannou, and Betz 2016). This efficiency facilitates the evaluation of the structural response with computationally demanding Finite Element models. A recent geotechnical application was presented by Gao et al. (2019).
In short, subset simulation calculates the p F as a product of larger conditional probabilities by defining p F from a number of nested intermediate failure events, such that F 0 . F 1 . . . . . F M , where F 0 is a certain event and F M is the structural failure described by the event {G(X) ≤ 0}. The probability of failure can then be formulated as in which F k = {G(X) ≤ c k }, where the limit c k for each intermediate event corresponds to a predefined probability p 0 for the event F k . The simulation entails a stepwise pushing of the sampling (by decreasing the value of c k ) toward the failure event of interest, where c M = 0, corresponding to the investigated limit state. This is facilitated using Markov chain Monte Carlo (MCMC) simulation by letting the samples that satisfied the previous intermediate event F k−1 be the seeds of the next sampling round. Considering that p 0 is a predefined probability, the p F can be estimated as where p M is an estimation of the last conditional probability P(F M |F M−1 ) and given by where I F,M is the indicator function of F M that is evaluated with the samples in the last round of sampling, x M−1 , which were generated conditionally on the event F M−1 . This general sampling procedure for subset simulation was adjusted in the algorithm for establishing reliability-based thresholds (Appendix A), to account for the information provided by the monitoring, by using the approach suggested by Straub, Papaioannou, and Betz (2016). To simulate the conditional probability of Equation (2), the intermediate events in , which gives the calculated probability of attaining the limit state G(X) ≤ 0, conditional on the information Z that the threshold described by h(X) has not been violated: Conceptually, the algorithm uses an iterative process to find the threshold x 1,alarm that satisfies Equation (2), based on a first guess of x 1,alarm (suggestions for reasonable starting values and revised guesses are provided in Spross and Gasch (2019)). In each iteration, the described Subset simulation procedure is performed using independent-component MCMC (Au and Wang 2014), aiming to minimise the error,1, between p F|Z and p F,T . The iterations end when1 is less than a predefined tolerance, t, which adjusts the accuracy of how well the structural safety corresponds to p F,T when considering the information that the monitoring result is not violating x 1,alarm . In the practical application, the reliability-based threshold should be interpreted such that as long as x 1,alarm has not been exceeded, the structural safety is acceptable with respect to the considered limit state.

Implementation details
In this study, all steps of the algorithm outlined in Section 2.2 and detailed in Appendix A are implemented in the general computational modelling software COM-SOL Multiphysics (Comsol 2018). A crude sampling approach is adopted in all realisations of the random variables in X. Important to emphasise is that our implementation puts no limitation on the structural model evaluated in steps 2b and 3dii of Table A.1 (in Appendix A), which can be 3-D or 2-D, and include, for example, material nonlinearity and time-dependent effects. Actually, the incorporation of our algorithm in a general Finite Element software gives us access to all functionality of the specific code when setting up a physical model. The algorithm is, in fact, not limited to structural cases, and could just as well be applied for monitoring the inflow of groundwater to a tunnel or the concentration of some chemical species in the groundwater close to the construction site. Any measurement error in the observation of X 1 can be considered straightforwardly by increasing its variability correspondingly in the algorithm when samples of X 1 are generated or evaluated from the numerical model. Details on models for measurement errors are provided by, for example, Baecher and Christian (2003). Generally, increasing measurement errors leads to stricter thresholds.

Description of the Stockholm bypass tunnel project
The Stockholm bypass project will provide a new route for the European highway E4 and will connect the Northern and Southern parts of Stockholm County with three lines in each direction, separated into two parallel 18-km-long tunnels. The tunnels are excavated in rock under lake Mälaren and the Lovön and Kungshatt islands west of Stockholm, Sweden (Figure 1). At the passage under lake Mälaren, south of Kungshatt, the excavation intersects a regional strike-slip fault zone, where the south side also has rotated in a reversed dip-slip movement. The fault zone is approximately 200 m wide at the passage. The interpreted results from the pre-investigations indicate a number of parallel weakness zones mainly oriented along the fault line, but with occasional subhorizontal weakness zones in the tunnel direction, where blocks have rotated upwards. The pre-investigations indicate that the weakness zones, which consist of gauge, cataclasites, mylonites or breccias, may be limited in extension and surrounded by areas of higher quality rock mass.
When passing the regional fault zone, the tunnel will be located 64 m below the lake surface. The weak rock mass is overlain by till and clay, which forms the lake bottom. The engineering geological forecast identified two typical rock qualities, denoted as rock classes IV and V, within the fault zone. Rock class IV describes mainly the quality of the rock mass in the transition zone of the fault zone, while rock class V describes mainly the quality of the rock mass in the core of the fault zone. This paper analyses only rock class IV, which is characterised by an adjusted Rock Mass Rating (RMR) value within the range 38-59 with 40 being the assigned typical value. (For reference, rock class V has an adjusted RMR in the range 28-33 with 29 being the assigned typical value.) The Swedish "adjusted RMR" is used in early design phases; it is based on Bieniawski (1989), but sets ground water conditions to "completely dry" and discontinuity orientation to "very favourable", as these conditions are not yet known.
The tunnels are excavated with drilling and blasting and temporarily supported with a systematic pattern of rock bolts and a layer of shotcrete. Where poor rock quality is expected, a pipe umbrella system is installed as pre-support. Because of the poor rock mass quality and the large width of the tunnel, the excavation is divided into a sequence of gallery and bench excavations, with an advance rate of 2 m per sequence. As the expected rock cover is limited, extensive rock grouting is performed to limit the risk of stability issues caused by flowing ground conditions. A permanent cast concrete lining will be installed at a later stage, to account for long-term issues related to, for example, potential swelling clay. Further details of the geological conditions and the applied technical solution have been presented by Stille et al. (2019).

Limit state definition
The excavation of the two tunnels is subject to several potential failure modes, including the collapse of the pipe umbrella system, failure of temporary support, face collapse and flowing ground. Instability caused by the attainment of the rock mass strength due to large in-situ stress is judged to be the main failure mechanism. This causes a loose core of rock that may lead to a progressive, large-scale collapse. A complicating factor is the stress conditions, which are affected by the different advance rates of the two tunnel faces. The complete design is, therefore, rather complex.
For this reason, the case analysed in this paper is a simplification of the real tunnel design, to better highlight the features of an underground rock excavation that can be related to the reliability-based thresholds. We consider, therefore, only the failure mode related to the exceedance of the shotcrete compressive strength, as this is a good indicator of the structural behaviour of the main support system. Moreover, we limit the analysis to the cross section of one tunnel with a simplified geometry, without considering any effects from the other parallel tunnel. As illustrated in Figure 2, the tunnel is assumed to be subjected to an overburden of 27 m of rock with rock class IV, 26 m of soil and 11 m of lake water. The support consists of 300 mm of shotcrete and 5-m rock bolts. As we investigated potential roof collapse, we did not consider the pipe umbrella support system in the finite element model, as the main purpose of the pipe umbrellas is to prevent face collapse.
Shotcrete can fail due to either attained compression or attained tensile strength. In the general case, multiple failure modes are analysed. Here, however, we analyse only compressive failure, which gives the following limit state function: where f c is the compressive strength of the shotcrete and s is the maximum compressive stress that any material point in the shotcrete support is subjected to; using the standard definition of stress in solid mechanics, we have that where s 3 is the minimum (compressive) principal stress, X is the coordinate vector and V s is the shotcrete domain. Thus, s depends on the complex structural interaction between the rock support, the rock mass properties and the in-situ stresses in the rock. This limit state corresponds to unsatisfactory serviceability due to the inward deformation of the shotcrete arch. Effects from bending deformation or local deformation around, for example, rock bolts, are not well described by this limit state, and rather imply a tensile failure of the shotcrete support. Extension of our procedure to consider also such failure modes is conceptually straightforward by considering multiple limit state functions.

Description of threshold violation as complementary limit state function
To verify structural safety, the vertical shotcrete displacement, d, is to be monitored at the crown of the tunnel. Displacement monitoring is common in tunnel construction, especially when tunnelling in complex geological conditions with large geotechnical uncertainty, such as the fault zone passage under Lake Mälaren in the selected case. The observed displacements are used as an indirect indication of the attainment of the limit state in Equation (7), which therefore can be rewritten into where C is a transformation factor using observations of d in the crown as indirect observations of s anywhere in the shotcrete support. The C is given by where C is a model error describing the uncertainty in this transformation. The C can here be interpreted as an uncertain stiffness-like parameter of the shotcrete-rock interaction at the measurement point, which accounts for the behaviour of the entire support system. The magnitude and determination of C is further discussed in the next chapter. By defining the event of identifying allowable shotcrete stress with the function where d alarm is a threshold for unacceptable vertical crown displacement, Equation (2) can be applied to find the d alarm that must not be violated, in order to ensure that the p F,T is satisfied: Geometries of the rock surface and the shotcrete thickness, as well as the shotcrete properties, are assumed to be without irregularities or spatial variability. While partly an effect of a need for mathematical  parameters Confinement loss at support installation c l front Triangular a tr = 0.49; b tr = 0.80; c tr = 0.53 a The s h and s H are based on in-situ stress measurements in the Stockholm region. The randomly generated values of s h and s H are increased with respect to the depth below the rock surface by a constant 13.75 and 37.5 kPa/m, respectively. b Determined as described in the section estimation of transformation model error. c Determined as described in the section estimation of confinement loss at support installation. convenience in this study, the assumption is judged to be a reasonable simplification for the analysed largescale serviceability limit state of a rather thick lining (300 mm) at the design stage. We base this on recent research studies on the effect of spatial variability of shotcrete properties and thickness on its failure behaviour (Bjureland et al. 2020;Sjölander, Ansell, and Malm 2021). They indicate that even for local blockfall failures, there is a substantial averaging effect. Furthermore, Malmgren and Nordlund (2008) studied numerically the effect of having an uneven rock surface and an evenly thick shotcrete lining. For this case, they found that the surface roughness is an important factor for the shotcrete behaviour, as it inflicted bending of the shotcrete at the edges of the rock surface in their model. In reality, however, rock surface irregularities tend to be evened out in the spraying of the shotcrete, especially for thicker linings, reducing the bending. (For ultimate limit states or thin linings, the effect of irregularities in the rock surface and shotcrete thickness may be more prominent and therefore require that spatial variation be accounted for in the model, to capture the effect of local weaknesses.)

Set-up of structural model
Dimensions of the simplified version of the tunnel used to set up our finite element model are shown in Figure 2. The rock bolts are placed with a spacing of 1 m, and relevant boundary conditions and loads are depicted in the figure. Only one-half of the tunnel-rock geometry was included in the model by assuming a vertical symmetry plane. The geometry was discretised with a mesh as shown in the left part of the figure (only showing the mesh close to the tunnel) and with an assumption of 2-D plane strain. In total, the mesh consists of 3123 elements, and, given a quadratic displacement field, the finite element model includes 19350 degrees of freedom (DOFs). Note that the rock bolts are placed along mesh element boundaries and, thus, are assumed to fully interact with the rock/shotcrete, and therefore add no additional DOFs to the model.
Both rock bolts and the shotcrete were assumed to be elastic, while the rock mass was considered elastoplastic. A Dücker-Prager yield criterion with the non-associated flow was used for the rock mass, with parameters determined from the Mohr-Coloumb criterion. The elastic constants of the rock mass and shotcrete, the cohesion, c, and the friction angle, w, of the rock mass were considered random variables, while the dilatancy angle, c, controlling the plastic flow in the rock mass was assumed fully correlated with the friction angle, such that c = w/8.
In 2-D modelling of rock tunnels, a key parameter is the inward displacement of the rock mass, u, here taken as the vertical displacement at the measurement point indicated in Figure 2. The principal behaviour of this displacement and its relation to d are shown in Figure 3: a supported rock mass in a considered tunnel section exhibits smaller displacement than an unsupported rock mass, as the excavation continues. A design challenge lies in the assessment of how much inward displacement, u 0 , and corresponding confinement loss, l front , has already occurred in front of the tunnel face. This structural behaviour can be analysed with longitudinal displacement profiles (LDP). However, generating LDPs for the general case requires full 3-D models to capture effects from, for example, non-hydrostatic stress conditions and complex tunnel geometries (Vlachopoulos and Diederichs 2009;Langford and Diederichs 2013). Because of the substantial computational effort that this would require in a reliability analysis, we have applied a simplified approach in 2-D to evaluate probabilistically the u 0 and the corresponding confinement loss, using findings of Vlachopoulos and Diederichs (2009) as described in the following.
In our 2-D model, the confinement loss is considered by using an incremental analysis, where the support given by the rock to be excavated is incrementally removed by reducing its stiffness and stress. This means that the tunnel was actually included in the discretised geometry, as is shown in the left part of Figure   Figure 3. Principal behaviour of inward tunnel displacement as a tunnel is excavated. The u denotes the total displacement of the rock mass, while δ denotes the displacement of the shotcrete (Bjureland et al. (2017), CC-BY-NC-ND 4.0, https:// creativecommons.org/licenses/by-nc-nd/4.0/, with minor adjustments).
2. This incremental procedure can be described by the confinement loss parameter, l, which ranges from l = 0 simulating full confinement to l = 1 simulating no confinement (González-Nicieza et al. 2008). Formally, the stress tensor s and the fourth-order elasticity tensor D in the domain to be excavated can be described by where s 0 is the in-situ stress, and D 0 describes the stiffness of the intact rock. By incremental increase of l toward no confinement in the model, the resulting u can be plotted as a ground reaction curve. Moreover, the support structure (shotcrete and bolts) is installed in a stress-free state at an intermediate step, where the support from the excavated rock has been partially removed. Assuming that the rock support is installed at the tunnel front, the l front describes the point in the incremental relaxation of the applied pressure in the 2-D model, when the support is to be implemented in the model. The l front was considered a random variable and its determination is further discussed in the next section. To summarise, the FE simulation can be described by the following steps: 1. Initialise in-situ stress in the non-excavated rock domain.
2. Incrementally remove the support from the tunnel by reducing its stiffness and stress (Equation (13)). 3. Insert the rock support system in a stress-free state given by l front . 4. Continue incremental removal procedure until l = 1.

Estimation of confinement loss at support installation
To estimate the displacement u 0 at the tunnel front, we assessed the ratio of the maximum radius of the plastic zone and the approximate equivalent tunnel radius (r P /r T ) to be approximately 2, based on initial simulations of a 2-D model of the analysed tunnel section (unsupported). According to Vlachopoulos and Diederichs (2009), r P /r T = 2 corresponds to the u 0 being 25% of the expected maximum deformation of the unsupported tunnel, u max . For 100 simulated ground reaction curves of the 2-D model without support, using crude Monte Carlo, we then investigated the uncertainty in the confinement loss that corresponded to u 0 = 0.25u max , that is, the confinement loss l front (Figure 4). Based on the 100 observations of l front , we assigned it a triangular distribution, as presented in the left chart of Figure 4 and Table 1.

Estimation of transformation model error
The magnitude of the transformation model error C in Equation (10) was evaluated from the crude Monte Carlo simulations of the structural model (step 2.b in Table A.1 in Appendix A), by calculating the variability in the ratio s/d, based on the output from the finite element analyses. The variability of C is shown in Table 1 as the COV of C. The C was then implemented in the limit state evaluations in the subsequent steps 2.c and 3.d.ii in the algorithm.

Simulation constants
To run the algorithm that determines d alarm for the analysed case, the following simulation constants need to be defined: target probability of failure, p F,T ; the probability of the intermediate events in the subset simulation, p 0 ; the tolerance, t, of the error between p F|Z and p F,T for the proposed threshold; the number of samples in the subset simulation, N; and the constant k, which ensures that there are enough samples in the initial crude Monte Carlo simulation (see details in Appendix A). As summarised in Table 2, the p F,T was set to 0.005, as the analysed limit state can be seen as a serviceability limit state for the temporary rock support. The p F,T used corresponds to suggested target probabilities for other serviceability limit states (Fenton, Naghibi, and Griffiths 2016), but in practice the required safety level is an issue to study further in development of reliability-based design codes. The p 0 was set to 0.1, following recommendations by Zuev et al. (2012), and t was set to 0.1. Based on preliminary runs on the algorithm, k was set to 3. Figure 5 shows the simulation results for the last, accepted threshold, d alarm = 4.4 mm. Figure 5(a) shows an output space in terms of deformation quantities, while Figure 5(b) shows the same simulations but in terms of strength-stress. The grey dots and the contour lines represent the initial crude Monte Carlo simulations (step 2 in the algorithm in Appendix A). The black dots represent the accepted data points after truncation at the proposed threshold value (step 3b). The red dots represent the last subset simulation level, that is, the simulations used to determine the probability of event F * M (cf. Equation (6) and step 3d). The probability of satisfying the threshold and therefore also satisfying the p F,T is 71%. Without adhering to an established threshold, the calculated p F would be 0.012 for the analysed design solution in this limit state in rock class IV. This illustrates the advantage of designing with planned monitoring in an observational approach: if the monitoring were not accounted for in the structural safety assessment, a more conservative design would be required to satisfy p F,T = 0.005. Figure 6 illustrates four evaluations of the final element model (marked in Figure 5), where (a) represents a typical structural behaviour, (b) represents Table 2. Simulation constants used in the example.

Constant
Value failure behaviour (i.e. violation of the limit state function), (c) represents a simulation with low horizontal in-situ stress that however did not violate the limit state and (d) represents a failure behaviour despite limited plasticised rock mass volume. The difference in structural behaviour between the figures indicates the complexity of the case: there are many possible combinations of input data, and the resulting behaviour in the evaluation of the structural model in each simulation is very difficult to predict in advance.

Effects of model errors and other simplifications
Model errors affect the calculated threshold considerably for this application. In the presented example, we introduced a transformation model error (C) to account for the uncertainty related to the fact that we used the displacement measurement in the tunnel crown to capture f c exceedance anywhere in the lining. A comparison of the two graphs in Figure 5 illustrates the significance of this error clearly: the data points representing displacements below the threshold in Figure 5(a) (black dots) exhibit a considerable scatter in the strength-stress output plot of Figure 5(b). The uncertainty represented by C depends partly on the extent of the monitoring programme. By measuring the shotcrete displacement at the expected location of the largest s, instead of at the tunnel crown, a smaller transformation error would be expected. Another model error is introduced through our simplified approach to assess the occurred confinement loss at the tunnel front at the time of support installation (l front in Figure 4). In this calculation example, we derived a triangular distribution for this model error. For simplicity, we did not, however, account for any uncertainty in the estimation of the ratios r P /r T and u 0 /u max that were needed to assess the uncertainty of l front . A straightforward approach to this issue would be to also assign a model uncertainty to u 0 /u max , for example, by assigning this ratio a uniform distribution between, say, 0.2 and 0.3, before evaluating l front , instead of using the deterministic value 0.25.
We also disregarded the potential correlations between l front , the rock mass properties, and the insitu stress conditions; ground conditions that are prone to cause substantial wall deformation (large u max ), because of significant plastic ground behaviour, may potentially be associated with more curved shapes of the normalised ground reaction curve (Figure 4). Such curved shapes may, in turn, be associated with less remaining confinement at the tunnel front (i.e. larger l front ).
The calculation example assumes that the shotcrete is installed directly at the tunnel front and achieves full strength instantly. In practice, the tunnel support would be installed normally some distance behind the tunnel front and gain strength as the shotcrete cures. The principles of how to account for these aspects in tunnel support design have been discussed for deterministic design by, for example, Chang (1994), Carranza-Torres and Fairhurst (2000) and Oke, Vlachopoulos, and Diederichs (2018), but their application to reliability-based design remains an issue for future studies.
In addition to the aforementioned model errors, there may also be intrinsic errors in the applied FEmodel. Such model errors have not been quantified and considered in this paper, which imposes a degree of arbitrariness on the calculated threshold. We note however that the effect of intrinsic model errors remains a general issue for future research in practical application of reliability-based analyses of rock engineering structures.

Accuracy of threshold and challenges with large uncertainties
The iterative nature of the applied algorithm (Appendix A) implies that a proposed threshold is accepted, if the conditional event of not violating the proposed threshold provides a calculated p F that is close enough to p F,T . This tolerance (t) between p F and p F,T is defined by the user. The larger the τ is, the faster the algorithm is expected to find an acceptable threshold value, but the larger becomes the potential deviation from the required p F,T when the threshold is applied. We believe a quite large τ can often be acceptable in practice, especially for serviceability limit states, meaning that the established threshold only needs to provide alarms related to the correct order of magnitude of the required p F,T .
In case there is only limited information regarding the site conditions, the uncertainty of the involved geotechnical parameters may be considerable. While this does not pose a problem for the algorithm itself, its accuracy might be impaired for cases where a non-linear FE-model is considered. With large uncertainty represented in the probability distributions, the likelihood increases of drawing sample combinations that characterise extremely low-quality rock mass with virtually no strength. For such samples, the chosen structural model may not converge for a non-negligible number of evaluations, that is, decreasing the number of valid data points. This would indicate a need for another approach for discretisation of the real conditions, for example, by using another material model or numerical method. Such considerations would, of course, also be needed in a deterministic sensitivity analysis for the same situation. For practical application, we recommend to monitor the number of non-converged model evaluations to ensure sufficient accuracy of the evaluated threshold.

Practical implementation with the observational method
The definition of the threshold (Equation (1)) implies that a violated threshold by definition means exceeding the p F,T . For the analysed limit state, the shotcrete in a considered cross section is loaded sequentially as the tunnel excavation progresses with each blasting round. Measuring the occurred inward displacement of the shotcrete (d) after each blasting round, a logarithmic pattern can be expected between d and the normalised distance to the tunnel front, l/r T (Figure 7). After a few blasting rounds with corresponding displacement measurements at distances [l 1 , l 2 , . . . , l i ] from the tunnel front, a prediction of the final displacement can be Figure 7. Expected logarithmic inward displacement behaviour with increased distance to the tunnel front. Using Bayesian regression analysis on the retrieved data points, a prediction can be made of the potential violation of the threshold. made, using a linear regression model with logarithmic transformation on the format where a and b are regression coefficients. Based on initial work by Stille (2009), Bjureland et al. (2017) showed how such measurement data can be analysed and implemented within a reliabilitybased framework for the observational method: using Equation (14) in a Bayesian framework, the final displacement can be predicted through extrapolation, which will be increasingly accurate as more data is collected. If this prediction indicates that the derived threshold d alarm is likely to be violated, the responsible decision maker regarding the tunnel support is alerted to take action, for example, by ensuring that additional support is installed. The use of reliability-based thresholds as a part of the observational method can thereby integrate the shotcrete design into a larger risk-based framework for tunnel design and construction, which was proposed by Bjureland et al. (2020). In our application example, we considered only one measurement point in the cross section and let the transformation factor C account for the behaviour of the complete shotcreted area. However, the roof and the walls are typically subjected to different structural behaviour, and moreover, the structural behaviour is affected considerably by the elastoplastic behaviour of the rock mass, as indicated in Figure 6. It is, therefore, likely to be necessary in a real-world application to implement more than one point of measurement and to consider multiple limit state functions, such as both tensile and compressive failure. This requires several thresholds to be derived. However, as previously mentioned, having more than one threshold makes Equation (1) underdetermined. How to find the optimal set of threshold levels in x alarm remains an issue for future research.

Conclusion
We have presented an extensive calculation example of how reliability-based thresholds can be established for the inward displacement of a shotcrete lining in a rock tunnel. The example considered a serviceability limit state function, for which a finite element model was needed to evaluate the ground-structure interaction associated with the loss of confinement from the advancing tunnel and the installed support. Because of the considerable computational effort required to evaluate the final element model probabilistically, an algorithm using subset simulation was employed. The effect of model uncertainty related to the confinement loss and transformation uncertainty related to information from indirect monitoring was investigated and discussed extensively. In particular, we find that there is a need for future research on how to consider the uncertainty in the modelling of confinement loss in 2-D models, as full longitudinal displacement profiles (LDPs) in 3-D still may be unreasonable to employ in probabilistic analyses.
The derived threshold may serve as a key component in designing a tunnel with the observational method. Our analysis in this paper showed how applying such thresholds could facilitate less conservative designs, as the thresholds ensure sufficient safety by being related to a target reliability and account for prevailing uncertainties. Thereby, the threshold can become an integrated part in the performed risk management work in the tunnel project. ii. Run the structural model for each set of samples inb k to find the complete set of proposed samples,x k iii. Set, for all N parameter sets of proposed samples, iv. Order the samples in x k in increasing order of magnitude of their limit state value G(x) and let c k+1 be the p 0 -percentile of the ordered samples. Let the N c first samples be denoted x (seed) k+1 and let the corresponding b (seed) k+1 contain the next seeds of the Markov chains. v. k = k + 1. e. Identify the number, N F , of sample sets for which x k−1 [ F * M and f. Calculate the error betweenp F and p F,T as1 = |(p F − p F,T )| p F,T .