Effects of flood wave shape on probabilistic slope stability of dikes under transient groundwater conditions

ABSTRACT The time-dependent response of pore water pressures during floods largely determines the safety against geotechnical failure of dikes, which is deemed to be highly dependent on the uncertain shape (duration, maximum height, etc.) of the flood discharge wave. This paper derives the uncertainty of flood wave shape from a database of precalculated hydrographs (GRADE) and evaluates the effect of shape variability on probabilistic safety estimates of slope stability, using a modelling chain consisting of a transient hydrological model (MODFLOW) and a probabilistic dike slope safety assessment (FORM). Accounting for flood wave uncertainty with transient groundwater flow generally leads to higher reliability estimates for slope stability, compared to the steady-state groundwater condition and other conservative assumptions, but to lower reliability estimates compared to a single design flood wave. Furthermore, the uncertainty of the flood wave shape can be as important as the uncertainty in geotechnical properties. For landside dike slope stability, the volume of the flood wave is the most important factor, while riverside slope stability depends mainly on the total water level drop after the peak. These two waveform characteristics are thus essential uncertainties to consider in probabilistic assessments of dike safety with transient groundwater conditions.


Introduction
Dikes (i.e.earthen flood defenses) form an extensive network along many major rivers around the world aimed at mitigating flood risk and prevent flooding.Climate change may increase the risk of a society to flooding (Middelkoop et al. 2001), for example through expedited snow melt or an increase in extreme precipitation events in the upstream drainage area (IPCC 2022).Under current and future climatic conditions, levee breaches can be caused by for example dike overtopping, slope instability due to seepage or under-seepage.Slope instability during river floods is one of the major failure mechanisms of river dikes and tends to occur rapidly, leaving little room for mitigation.Slope instability of these river dikes is associated with large uncertainties, mainly relating to soil properties (van der Krogt, Schweckendiek, and Kok 2019) and groundwater pore pressures (van Woerkom et al. 2021).The pore water pressure (pressure heads) in the dike and subsoil is also one of the main drivers leading to slope instability during floods.
When river water levels increase, water seeps into the dike and the subsoil, increasing the pressure heads, reducing the effective stress and strength of the soil.Pressure heads during floods are often assumed as steady state seepage conditions based on analytical solutions for typical conditions (U.S. Army Corps of Engineers 2003;TAW 2004;Lendering, Schweckendiek, and Kok 2018), which may lead to conservative estimates of dike safety.The development of pressure heads can be modelled more realistically using a timedependent seepage analysis.In such an analysis the effects of soil layering, variable permeability, and flood wave shape can be considered.The hydrological forcing can be derived from a copula-based model considering variability of the peak flow discharge and flow duration (Balistrocchi et al. 2019;Curran, De Bruijn, and Kok 2020) or from a single synthetic design flood wave (Butera and Tanda 2006).However, they do not account for the full variability of flood wave shapes from variable weather conditions (Hegnauer et al. 2014).Thus, the impact of the flood wave shape uncertainty on the pressure head evolution and the resulting slope stability is currently unknown.The flood wave shape is strongly influenced by the river basin characteristics, and especially capturing multi-peak floods in statistical properties is not feasible (Yue et al. 2002).As such, we define flood wave shape uncertainty as a scenario uncertainty (Baecher 2016), in which multiple pre-calculated flood wave shapes are assumed equally probable.
This paper thus explores the impact of the uncertainty of the flood wave shape on the probabilistic safety assessment of slope stability with transient groundwater conditions.A multiplicity of flood waves is obtained from a database of pre-calculated hydrographs (Hegnauer et al. 2014).First, a method is proposed to incorporate the variability of these flood wave shapes into the probabilistic reliability analysis for slope stability for a case study in the Netherlands in Section 2. This method includes the calculation of time-dependent pressure heads with the MODFLOW hydrological model.The results of this method are compared with dike slope reliability analyses based on pressure head assumptions used in standard engineering practice.Second, we assess the dynamic slope stability response to variability in flood wave shape for two simplified case studies in Section 3. The two cases are: a permeable sand dike and a less permeable clay dike on a shallow aquitard blanket layer.Hereafter further findings of our research are discussed.

Methods
To account for the scenario uncertainty of flood wave shapes from variable weather conditions in dike slope stability calculations, multiple flood wave shapes should be considered.Using a combination of rainfall-runoff modelling and hydrodynamic river models, the flood wave shape at a given point on a river can be derived.Such a derivation for river discharge on the Rhine river at Lobith, the Netherlands, is available in the GRADE dataset (Hegnauer et al. 2014), which will be exemplary applied in this manuscript.

Flood wave selection
The GRADE dataset (Hegnauer et al. 2014) consists of a database of 50,000 calculated flood wave shapes, each containing discharges at Lobith (the Netherlands) for 15 days before and 15 days after the simulated annual maximum discharge.The GRADE method is a tool for generating synthetic extreme rainfall and discharge events and includes three main components: (1) A stochastic weather generator that generates synthetic precipitation and temperature series based on nearest-neighbour resampling, preserving the statistical properties of the original data, run by the Royal Netherlands Meteorological Institute (KNMI).
(2) The HBV model, a rainfall-runoff model that calculates runoff and accounts for evapotranspiration and snow storage, which is widely used internationally (Lindström et al. 1997).
(3) Hydrologic and hydrodynamic routeing component that routes the runoff generated by HBV through the river stretches, using Sobek hydrodynamic model.Only the largest flood waves are selected from the results with the simple built-in routeing in the hydrological model, due to computational limitations.
Thus, GRADE uses simulated precipitation series from the weather generator that maintains the statistical characteristics of multi-day precipitation events to calculate discharges.So to say, each discharge event is the direct result of a statistically generated weather.
The discharges are translated to water levels at the Lobith gauge station using a qf-rating curve (Bom and van Leeuwen 2019), which considers steady flow, unsteady effects by water level hysteresis and the effect of weirs.Given a particular flood wave, we define the maximum water level of that flood wave as h max .Often, the expected maximum water level at a given location or return period is known (h max ), but the variation in flood wave shapes given that maximum water level is not.To derive the variation in flood wave shapes given a selected maximum (design) water level (h max ), first the 1000 flood waves with their maximum water level nearest to the selected water level are sampled from the dataset.Second, the sampled flood waves are scaled to the selected h max resulting in a dataset of flood waves with equal maximum water levels, but varying shapes.Ideally all 1000 selected flood waves would be used, but to constrain calculation times the selected flood waves are divided in 50 subsets.An analysis of the silhouette coefficient (to determine the minimum number of subsets required, Rousseeuw 1987) and a visual inspection of the subset variation both indicated that 50 subsets was sufficient and suitable.The subsets are based on five flood wave shape parameters: H 0 , A peak , A tot , DH ds , Ds max (Table 1, Figure 1(A)).The sub-setting is performed using k-means clustering on these parameters, which aims at minimising the combined within-subset variance.Each subset contains between 1 and 48 waves, with a median value of 19 and a mode of 7 waves per subset.Finally, a random flood wave is drawn from each of the subsets and an occurrence probability of that flood wave is calculated following the size (number of waves) of the corresponding subset relative to the entire sample of 1000 selected flood waves.
The 50 selected flood waves are assumed to represent all possible flood wave shapes given the corresponding maximum water level.The flood waves that represent a larger group of waves with similar characteristics have a larger occurrence probability (Figure 1(B)).The resulting sample set (Figure 1(C)) contains flood waves with multiple peaks or various shapes of the rising and falling limb.The extreme flood wave shapes, for example those with a very wide peak or very steep rising limb, generally have a smaller occurrence probability than those central in the selection (Figure 1(B)).A more extensive analysis of the water level uncertainty can be found in the GRADE report (Hegnauer et al. 2014).

Combined reliability for multiple flood waves
To calculate the reliability for slope stability accounting for the uncertainty in the flood wave shape scenarios, we follow a three-step approach.First, for each of the 50 selected flood waves (denoted by w i ), we calculate the probability for slope instability for each time step t.Then we combine the failure probability of all time steps over the entire flood wave, to obtain the probability of failure conditional to the flood wave (shape): P f | w i , shortened to P f ,i .Finally, we weigh these probabilities with the likelihood of a flood wave shape P(w i ).The probability of failure (P f ,i ) for each time step (t j ) of each flood wave is calculated from the reliability index β (Hasofer and Lind 1974) as where Φ stands for the cumulative distribution function of the standard normal distribution, and b(t j ) is the reliability index conditional to the groundwater state at time t j .Note that reliability and failure throughout this paper refer only to dike slope failure, and are defined by the moment that a dike slope collapses, independent of whether this collapse also leads to a dike breach.Furthermore, note that the failure probabilities throughout this paper represent a unitless conditional failure probability, for example given a certain groundwater condition or set of flood waves.For the combination over the time steps t 0 , . . ., t n , we assume that all time steps within one flood wave are fully dependent.First, because we only combine different time steps within one flood wave, and second because the geo-mechanical stochastic variables are time-invariant (at least on the time scale of a flood event) and thus highly correlated.Hence, the probability of failure (conditional to flood wave i), is the maximum probability of all time steps.P f ,i = max {P f ,i t 0 ( ), P f ,i t 1 ( ), . . ., P f ,i t j , . . ., P f ,i (t n )} (2)  with j the time step and n the number of time steps in the event.To combine the failure probabilities conditional to a given flood wave into a marginal failure probability including wave uncertainty, we assume the flood waves are mutually exclusive: if one wave shape happens, all other waves shapes cannot happen at the same time.Following the law of total probability, the combination is the sum of the probability conditional to the wave shape, weighted with the probability of each wave scenario to occur P(w i ), see Equation (3) and Figure 2(E).
Here m is the total number of flood waves analysed (here 50) and P(w i ) is the normalised occurrence probability of that given flood wave, which is estimated as the size (number of waves) of the corresponding subset relative to the entire sample of 1000 selected flood waves (see Section 2.1).
Thus, we perform the entire analysis above conditional to different values of the maximum water level of a flood wave max {h i (t j )}, further denoted as h max .Typically, existing methods (that neglect the variability in flood wave shapes) calculate annual failure probabilities.The presented conditional failure probabilities for a given event size h max (Equation (3)) can in theory be further converted to annual probabilities using the probability weighted sum of this conditional probability of failure for a given h max and the probability density (pdf) of the annual maximum of that water level (1/year) (Schweckendiek et al. 2017).

Methods to estimate groundwater pressure heads
The method proposed in the previous section provides, given a sample of individual flood waves and their effect on groundwater pressure heads, a maximum failure probability conditional to a single flood wave (sample set) and a weighted failure probability of these maxima (realistic).This method takes the full variation in flood wave shapes from variable weather conditions (Hegnauer et al. 2014) into account.However, such a large dataset of flood wave shapes is often not present, thus current solutions are based on steady-state or analytical solutions for typical conditions (U.S. Army Corps of Engineers 2003; TAW 2004) or a single synthetic design flood wave (Butera and Tanda 2006).We will compare the estimated failure probabilities of our analysis (including uncertain hydrograph shapes) with failure probabilities based on several frequently used methods to approximate pressure heads (Figures 1(C) and 2 (F)).These methods are described here, followed by the name they will be referred too, in italic.Each of the groundwater pressure head estimates relate to the maximum flood water level (h max ).Thus, the probabilistic dike reliability is calculated according to Section 2.5.4 and represents a failure probability conditional to the given the maximum flood water level (h max ) and corresponding estimated pressure heads.
A steady-state method (steady-state) is based on a steady-state hydrological simulation, assuming an infinite duration of the maximum river water level.A second method, as prescribed for the Dutch flood safety assessment (TAW 2004), approximates steady-state conditions.This method (analytical) provides an estimation of pressure heads at two depths; at the base of the dike and in the sandy aquifer, from which the other values within and underneath the dike are interpolated (TAW 2004).
Synthetic design flood waves have an advantage over steady-state methods as they incorporate the temporal component of groundwater pressure heads, calculated by the method proposed in Section 2.5.3.Such a synthetic design flood wave has been previously determined for any location along the Dutch river branches (Botterhuis, Waterman, and Geerse 2017) and has a default trapezoid wave shape (trapezoid) (Figure 1(C)).We add a second synthetic flood wave by taking the average for each individual time step of the initial selection of 1000 flood waves (average).As this choice is arbitrary, we cannot prove the accuracy of this representation, thus we only use it as an example of a synthetic flood wave.

Summary of the methodology: from flood wave shape to dike slope failure probability
Figure 2 provides the summary of the method devised to include flood wave shape variability and its transient groundwater response in probabilistic slope stability analysis, in four consecutive steps.First, a case location is selected, and a surface geometry and subsurface schematisation is created of both the dike and the natural subsurface (Figure 2(A)).Second, representative flood wave shapes for that location are retrieved from a large selection of flood waves (Figure 2(B)).Third, time-dependent groundwater flow is simulated with a hydrological model given the surface geometry and subsurface schematisation (Figure 2(C)).Fourth, coupling the hydrological output with probabilistic slope stability analyses results in a dike slope failure probability (slope reliability) per flood wave (Figure 2(D)).These are finally combined into a single dike slope failure probability, which thus takes the variation in flood wave shapes into account (Figure 2(C)).In the following section, the entire workflow is laid out into more detail.

Modelling chain
The general approach as described in the previous sections, can in principle be applied to any location, using any groundwater simulation model code and slope stability algorithm.Here we describe the cases and modelling algorithms selected in more detail.

Selected cases; surface geometry and subsurface schematizations
We selected three cases: a real dike case study of an existing Dutch river dike and two simplified hypothetical cross-sections of typical dike archetypes (Figure 3).
The real dike cross-section (near Lent, the Netherlands (51.867N, 5.884 E) has its floodplain elevation at approximately 10 m above mean sea level and the dike crest at 16.37 m above mean sea level.This case study consists of a loamy sand dike with a clay loam cover layer (Figure 3(A)).In the floodplain a layer of loamy sand is located on top of a silty clay loam aquitard with a thickness of 4 m.In the hinterland, the schematised aquitard thickness is approximately 1.5 m.A large sandy aquifer is located below the aquitard, which is in contact with the river, located 300 m from the dike under non-flooding conditions.
The two simplified hypothetical case studies are typical for the Netherlands and consist of a standard dike and two idealised subsurface scenarios (Figure 3(B-C)).The surface geometry includes a 6 m high dike with a 1:3 slope and a crest width of 5 m, on a flat floodplain with an elevation of 0 m above mean sea level.The dike is located 100 m inland from the river channel.The first subsurface scenario (sand dike) consists of a (loamy) sand dike on a thinner cover layer (Figure 3(B)) and the second subsurface scenario (clay dike) consists of a clay dike on a thick impermeable clay cover layer (Figure 3 (C)).In both scenarios the dike has a clayey cover layer at the surface; due to weathering and roots of vegetation this cover layer is likely to be more permeable and less cohesive, hence it is schematised as clay loam.

Maximum flood wave water levels for selected cases
To quantify the influence of variable flood shapes on dike slope failure probability we use the real dike cross-section (Figure 3(A)) and a h max of 15.96 m above mean sea level, which is 0.41 m below the dike crest height, corresponding to its safety standard.To further assess the influence of subsurface material and maximum river water level on failure probability we use the hypothetical typical cross-sections (Figure 3(B-C)).For these cross-sections, three hypothetical maximum water levels are selected: a h max of 2, 3.5 and 5 m above the floodplain, or respectively 4, 2.5 and 1 m below the dike crest height.These hypothetical maximum water levels cannot directly be related to a given safety standard but do provide an additional assessment on the influence of variable flood wave shapes under characteristic conditions and for various loads.

Hydrological model setup and parameters
The selected flood waves are used to simulate the pressure heads in the dike and subsurface (Figure 2 (C)) using the groundwater model software MOD-FLOW 6 (Langevin et al. 2019).Initial groundwater conditions are provided by a steady-state MODFLOW simulation.These initial conditions are based on the average winter river water level and average winter excess precipitation.The time-dependent groundwater response to changing river water levels, provided by the selected GRADE flood waves, is simulated using a transient MODFLOW simulation.The model is run with a 2-hour time step, which was found to not impact results compared to smaller time steps.The hydrological model has a cell size of 0.5 m both horizontally and vertically.
On the river side, the MODFLOW river package enables river-groundwater interaction.On the landward side, the drain package (Hughes, Langevin, and Banta 2017) enables outflow only if pressure heads become higher than the surface elevation.This package is also used to model a drainage ditch 20 m behind the landside dike toe with a depth of 1 m.The average winter excess precipitation (2.25 mm day −1 ) is simulated by the recharge package (Hughes, Langevin, and Banta 2017).A saturated conductivity (K sat ) is assigned to the subsurface material types based on typical values (Tables 2 and 3).The dike cover type K sat is from TAW (1996), the other material types K sat are based on García-Gutiérrez, Pachepsky, and Ángel Martín (2018).

Stability model setup and parameters
The hydraulic heads from the MODFLOW groundwater model are used as input for a detailed schematisation of the groundwater pressure heads in D-Stability (Figure 2(D)).D-Stability (Deltares 2019) is used to calculate the failure probability.Note again that this only represents dike slope failure, and does not take into account other possible failure modes (overtopping, under-seepage).The failure probability is calculated using the First Order Reliability Method (FORM, Figure 3. Three selected subsurface cases: One real dike cross-section near Lent, the Netherlands (A), and two hypothetical cross-sections that are typical for river dikes.The hypothetical cross-sections are a typical sand dike (B) and a typical clay dike (C).In the main text they are also referred to as real dike, sand dike and clay dike.Further information on the details of these cross-sections can be found in Section 2.5.1.
Hasofer and Lind 1974) based on the following limit state function: where F s (x(u)) is the factor of safety, with x(u) a realisation of the stochastic variables in the geo-mechanical parameter space as function of the realisation u in standard normal space.FORM has been widely applied to structural reliability problems and involves a linearisation of the limit state equation around a most probable failure point (design point) instead of the mean value.It aims to find the design point, and defines the reliability index (β, Equation ( 1)) as the shortest distance between the origin in standard normal space and the failure surface (Hasofer and Lind 1974).
The geo-mechanical parameters are assumed subject to uncertainty and treated as independent stochastic variables.The model uncertainty factor Y d is also a stochastic variable (with a corresponding coordinate u), accounting for the inaccuracy of the Limit Equilibrium Method.The SHANSEP (Stress History And Normalized Soil Engineering Properties) model is used in the real dike case study to calculate the undrained shear strength of the material.Two parameters correspond to this method: S, a friction parameter for characterising the undrained shear strength of soil under normally consolidated conditions, and m, a parameter that determines the extent to which the effect of the load history is reflected in the undrained shear strength.The properties of the geo-mechanical parameters g sat , f, c, S and m for the real dike case (Table 2) are based on local data, i.e. the results of a Direct Simple Shear (DSS) tests on either normally consolidated or over-consolidated soils, provided by the governing organisation.For the hypothetical cross-sections the values are based on EC7 (NEN 2017) (Table 2).
The reliability is calculated for a fixed slip plane, which is iteratively selected by searching for the most critical slip plane in the probabilistic design point, see the procedure described in Huber, van der Krogt, and Kanning (2017).The Uplift-Van model (Van 2001) was used for the slope stability calculation to account for long-shaped slip planes with strength loss due to uplifting of relatively thin clay blanket layers (with a particle swarm search algorithm to determine the most critical slip circle).The reliability is calculated for the landside of the dike and the riverside of the dike.

Estimating sensitivity of failure probability to wave form parameters
To evaluate to what degree the variability of flood wave shape contributes to the failure probability, and which flood wave shape parameters are most important, we use a first order variance-based sensitivity index (S i ).Given a generic model this sensitivity index determines the statistical dependence of Y and X i by quantifying how much the model uncertainty is reduced if the input parameter X i is set to a certain value (Borgonovo et al. 2017;Ratto et al. 2007).In our case, Y represents the failure probability and X i parameters that determine this probability (flood wave shape parameters and geo-mechanics parameters to a certain value.The expected value of V(Y|X i ) over all possible values for X i is where a large value of indicates a large contribution of X i to the variability of Y and is called the first-order effect of X i on Y (e.g.Saltelli et al. 2008).This conditional variance can be normalised by the total variance to result in the first-order sensitivity index S i by where high values of S i signals X i to be an important parameter (Saltelli et al. 2008).

Results and discussion
3.1.Failure probability of the case study lent using variable flood waves

Uncertainty and time-effects of wavedependent failure probabilities
We analysed the failure probabilities of a sample set of 50 flood waves (Figure 1).Each flood wave results in a different temporal development of failure probability; thus 50 different flood wave shapes lead to highly variable outcomes for the reliability.The variability of the reliability is larger for landside slope stability than riverside slope stability, see Figure 4. Furthermore, the sample set β's all have a similar development over time on the landside: the reliability indices are usually lowest several days after the peak water level (t 15 ).Riverside slope reliability reaches its maximum at or just before the peak water level but has a highly variable development before and after this maximum, as a result of the variability of the flood wave.The minimum β is usually reached after the peak water level, although the reliability is hardly lower than before the peak.
The resulting realistic reliability, being the weighted average of the highest failure probability of each sample set wave (Section 2.2), is shifted towards the waves with lower reliability indices (Figure 4).The realistic failure probability is thus dominated by only a few adverse flood wave shapes.The realistic β is lower on the riverside than on the landside, which is mostly caused by the lower sloping berm on the landside of the dike.

Comparison with other methods of groundwater pressure head estimations
The general development of the reliability index over time for the dynamic methods based on a synthetic flood wave shape (average, trapezoid) is comparable to the 50 individual selected flood waves.As such, the highest failure probability (lowest reliability index) of these dynamic estimation methods (P f ,i , Section 2.2) are also similar (Figure 4, crosses).The non-dynamic methods for pressure head calculation (analytical, steady-state) and the realistic method only have a single reliability index and thus no temporal component.
The realistic method has a β of 5.94 and 3.46 on the landside and riverside, the average method of 6.22 and 3.51 and the trapezoid method of 5.65 and -1.10 respectively.The analytical β is −0.34 and 2.46 on the landside and riverside and the steady-state β is 4.63 and -∞ respectively.Both steady-state and analytical result in higher failure probabilities than the realistic method proposed in this paper, only the analytical method on riverside slope stability provides a failure probability estimate that is close to that of the dynamic methods.Note: In case of a probabilistic parameter the distribution type and standard deviation are also provided.

Comparison between methods of the effects of subsurface properties and maximum river water level on failure probability
As the time step with the lowest reliability index β determines the failure probability for one flood wave, we only present the P f ,i of the entire flood wave in the next sections (determined by the lowest reliability over all time steps).We compared the reliability for two cross-sections that are the same in terms of surface geometry but differ in subsurface properties (see Figure 5).We compare the reliability results for the various static and dynamic methods, and for flood waves with three different maximum water levels (Sections 2.3 and 2.5.2).
The reliability indices of the 50 flood wave shapes (sample set) generally decrease with a higher maximum water level.However, higher water levels also lead to a larger uncertainty of the dike slope reliability.For riverside slope stability with h max = 5 the resulting realistic (weighted mean of the selected waves) reliability hardly decreases with respect to lower water levels.Sand dike landside slope stability is most dependent on the maximum water level.Furthermore, in these two typical dike cases (Figure 3(B-C)) with the same landside and riverside slope steepness, the riverside has a lower reliability index in most cases.
As noted in the previous section, the combined reliability index from the 50 selected flood waves (realistic, Section 2.3) is again shifted towards the sample set waves resulting in lower reliability indices.The realistic reliability is again more similar to the dynamic methods than to the static approximations.For landside slope stability, dike reliability is overestimated by the average and trapezoid methods with respect to the realistic reliability.On the other hand, dike safety is underestimated by the analytical and steady-state methods, independent of the maximum water level.As the analytical method is based on steady-state conditions, we would expect it to be closer to the steady-state method.However, both phreatic levels as aquifer pressure heads are higher for steadystate than for analytical, thus leading to lower reliability indices.For riverside slope stability the average, trapezoid and analytical methods all provide reasonable estimates of dike safety.On the other hand, a large underestimation of dike slope reliability is made using the steady-state method.The reliability is thus often consistently either overestimated or underestimated using the selected methods, which illustrates the necessity of considering the uncertainty of the flood wave shape in dike safety calculations.

Further exploration of the dynamic slope stability response to flood wave shape
To further explore the influence of variable flood wave shapes for dike stability, we provide an analysis of: .The variation in the selected flood waves and the influence of flood wave shape on groundwater pressure heads. .The timing of lowest failure probability.
. The fraction of the failure probability uncertainty explained by the flood wave shape. .The most critical flood waves and the most important flood wave shape parameters.
All these analyses are done separately for landside and riverside slope stability and for the typical clay dike and sand dike cross-sections (Figure 3(B-C)).

Variability of the flood waves and the impact on groundwater conditions
The selected flood waves are normalised to have the same maximum water level after 15 days (t 15 ) (Figure 6, top row), but have different water levels before and after the peak.With h max = 5, the mean and standard deviation of the river water levels at the start of the simulation (t 0 ) are −2.88 + 1.44 m, −2.24 + 1.74 m, and 1.24 + 0.97 m, given a maximum water level at t 15 of at 2, 3.5 and 5 m. 15 days after the maximum water level (t 30 ), the mean and standard deviation equal −2.29 + 1.23 m, −1.53 + 1.42 m, and 1.71 + 0.84 m respectively.Thus, 15 days after the peak the water levels are generally higher than 15 days before the peak, but the variation is smaller.
The two flood waves with either the largest (high wave) or smallest area (low wave) under the curve are selected to analyse their effect on groundwater pressure heads.At the landside toe of the dike the resulting pressure heads of the high wave are much higher than those of the low wave, while at the riverside toe this difference is much smaller.For both the high wave and low wave, pressure heads at the landside toe are on average highest at t 30 , while at the riverside toe pressure heads are highest during peak river water levels (t 15 ).This delay of pressure head response is more prominent in the clay dike case.
Furthermore, the pressure heads in the aquifer underneath the dike do not reach the same height as the river water level, but gradually increase towards t 15 .The increase of pressure heads in the aquifer continues towards t 30 for the clay dike case, even for the low wave in which river water levels decreased again substantially.For the maximum water levels of 2 and 3.5 m the observed pattern is similar, although the response is smaller.For the real dike case (not shown), the groundwater response is closest to the hypothetical sand dike case.
If the saturated conductivity of the geological units were to decrease, it could result in a lower hydraulic conductivity, which may delay and dampen the pressure head response of the dike or of the aquifer beneath the dike to the flood waves.Conversely, if the saturated conductivity were to increase, it may result in a more rapid and pronounced pressure head response of the aquifer to the flood waves.In more complex situations, such as a more permeable dike on a less permeable cover layer, the pressure head response of the dike may be enhanced due to the decreased capacity to drain to lower layers (van Woerkom et al. 2021).However, despite these potential differences, we expect that the observed patterns will remain relatively similar.

Timing of lowest stability
The timing of the lowest dike stability is an important step to gain better insight in the stability response to flood waves.The timing of the lowest β (T b min ) is >2.5 days after the occurrence of the peak water level for all cases.For landside slope stability the median T b min is 15 and 5 days after the peak water level for the clay dike and sand dike cases respectively for the most extreme water levels (Figure 7).The median T b min for riverside slope stability is 15 and 12.5 days after the peak water level for the clay dike and sand dike cases respectively.The real delay of T b min can be even larger as many selected flood waves reach the lowest stability on the last time step in the simulation.With lower maximum water levels, T b min is usually closer to the timing of the peak water level.On the sand dike landside this is not the case, as the pressure heads decrease faster with decreasing river water levels, thereby also decreasing dike slope failure probability.
The delay of the lowest reliability index with respect to the peak river water for riverside slope stability has been widely acknowledged (van der Meer 2020).On the other hand landside slope stability is often assumed to be an instantaneous process, although a delay of several days has already been reported (Moellmann, Vermeer, and Huber 2011;van Leeuwen 2019).Due to the delayed landside failure, as water levels may have also dropped substantially by that time (van der Meer 2020) extensive flooding after failure is less probable.This would reduce the economic damage and number of casualties after a slope instability and could lead to a reconsideration of dike safety in the flood risk framework, for example if it is combined with emergency repair to pose the risk of flooding by a quickly re- occurring high water level.As such, we recommend analysing return times of high river water levels shortly after a possible dike failure (van der Meer et al. 2021).

Fraction of uncertainty in dike slope reliability explained by flood wave shape
To quantify the importance of flood wave shape and associated groundwater levels, we compare the uncertainty in dike slope reliability arising from flood wave shape with the uncertainty arising from geo-mechanical parameters (Section 2.5.4).More specifically, we compare the variance of Y (dike slope reliability) if a certain flood wave shape is selected (X i ), being V(Y|X i ), to the total variance of Y, being V(Y), according to Equations ( 6) and ( 7) (Saltelli et al. 2008).After iteratively selecting each flood wave the analysis results in the first-order sensitivity index (S i , Section 2.6) of flood wave shape induced groundwater conditions, i.e. the relative influence of flood wave shape on dike slope reliability uncertainty.The analysis is done for each of the maximum water levels and for both landside and riverside slope stability.
The influence of the flood wave shape increases with increasing maximum water levels (h max ) (Figure 8).The S i of flood wave shape for the sand dike case increases from 0.15 and 0.18 to 0.56 and 0.52 for landside and riverside slope stability respectively.The influence of variable flood wave shape increases fast towards h max = 5 on the clay dike riverside, with a S i of 0.55.For clay dike landside slope stability on the other hand, the influence of variable flood wave shape hardly increases with increasing h max .For both landside and riverside slope  stability, the remaining uncertainty resulting from the geo-mechanical uncertainty in the slope stability analysis is mostly caused by the friction angle of the cover layer (roughly 50%), with cohesion of the cover layer (roughly 25%) and model uncertainty (roughly 20%) being the other main drivers (Table 3).
The influence of flood wave shape variability on dike slope reliability is strongly correlated with the variability in pressure heads.For example, high pressure heads hardly develop on the landside slope of a clay dike (Figure 6), leading to a low S i of flood wave shape on the landside.Contrary, the S i for sand dikes is larger, probably because large pressure head differences are present between waves (Figure 6).The stabilising effect of high river water levels is an additional explanation for the large influence of flood wave shape on riverside slope stability.To conclude, flood wave shape can contribute more than 50% to the uncertainty in a dike slope stability assessment with time-dependent groundwater analysis, pointing out the relevance of the entire flood wave shape, rather than only the maximum water level.
3.3.4.Critical flood waves and important flood wave shape parameters Lastly, we focus on the most critical flood waves and determine to which flood wave shape parameter the reliability index is most sensitive.
We selected the five flood waves that resulted in the highest dike failure probability, which are almost the same for the clay dike and sand dike cases (Figure 9).The most critical waves for riverside slope stability differ before the peak water level, but all show a steep decline in river water levels after the peak.The most critical waves for landside slope stability have relatively high water levels over the entire duration, but especially before the peak at t 15 .
To identify the flood wave shape parameter that has the greatest impact on dike slope reliability, we used the first-order sensitivity index (S i ) to analyse flood waves with a maximum water level of 5 m.Now, we iteratively fixed one of the flood wave shape parameters (X i ) to a particular value and calculated the conditional variance of dike slope reliability (Y ), denoted by V(Y|X i ), compared to the total variance of Y denoted by V(Y) (Saltelli et al. 2008).By evaluating each flood wave shape parameter, we determined their relative influence on dike slope reliability.It is important to note that the sum of all S i values here is greater than 1 (Figure 9), reflecting the non-independence of the wave shape parameters.Therefore, these results cannot be directly compared to those in Figure 8.Nevertheless, the S i value for each wave shape parameter represents a further breakdown of the flood wave shape S i at h max = 5 discussed in the previous section.In line with the visual inspection of most critical flood waves, the maximum river water level difference in the declining limb (DH ds ) is most influential for riverside slope stability.Remarkably, it has over twice the influence of the maximum gradient of this decline (Ds max ), which is often assumed as more important (Gao et al. 2019).As expected (van der Meer 2020), riverside slope failure is less sensitive to the flood wave shape before the water level peak.The total wave area (A tot ) and flood wave area before the peak (A peak ) are the most important parameters for landside slope stability.Contrary to the riverside, all flood wave shape parameters have some influence on the landside slope stability, but they mostly contribute to the total wave area (A tot ) and have an effect on the reliability as such (van Leeuwen 2019).
3.4.Implications and limitations of using flood wave shapes for calculating probabilistic slope stability 3.4.1.On the use of measures flood wave shapes versus GRADE derived flood wave shapes In our analysis, we rely on the GRADE dataset (Hegnauer et al. 2014) to sample flood wave shapes.Although the GRADE model was thoroughly tested against measurements, using historical measured discharge waves have the advantage of representing actual events and are therefore certain.Using measure discharge waves could be suitable for moderate flood waves, as there is usually sufficient historical data available.However, limitations arise when scaling up to extreme events due to the lack of available data.Vertical scaling may not always be feasible since the overall shape of the flood wave may not remain constant (Hegnauer et al. 2014).Additionally, past measurements may not be representative of the current channel and floodplain geometry.
In contrast, GRADE is better suited for extreme conditions since it can generate weather data and simulate discharge.As such, the database contains a larger number of flood wave shapes with extreme maximum discharges compared to historical records.However, there are uncertainties associated with GRADE, such as the considerable uncertainty of return values of multi-day rainfall extremes.This uncertainty arises due to the limited length of the baseline series used for resampling.Furthermore, the routeing model is used to simulate discharge well outside the historical range, for which it could not be quantified during calibration.
To create a unified approach, all discharge waves in our research are created by the GRADE model.In a levee flood risk assessment, this seems the most appropriate choice for the most extreme expected water levels.However, for flood wave shapes with a lower maximum water level, measured discharge waves may also be used to sample flood wave shape.

Reliability differences between groundwater estimation methods
We compared the realistic dike slope reliability based on multiple flood waves with four other groundwater estimations methods; two based on design discharge waves and dynamic groundwater flow (average, trapezoid) and two based on static groundwater conditions (steadystate, analytical).Overall, the steady-state method tends to produce excessively high pressure heads, making it unlikely to provide accurate results in terms of dike slope reliability.The analytical method is similar to the steady-state method, especially on the landside, but it also suffers from the same limitations.As these methods are most often used in levee assessment and design, current evaluations tend to err on the side of caution, probably to ensure safety.
Dynamic methods, on the other hand, are more accurate, which highlights the importance of using a transient model to calculate groundwater.The dike slope reliability given the realistic, average and trapezoid estimation methods (Figure 5) is often similar.However the slope reliability based on these methods is not always in the same order, due to the complex feedback between hydrological and geo-mechanical parameters, groundwater flow and slope stability.This stresses the importance of selecting an appropriate design wave, ff using multiple waves is not desired or possible, for example due to data unavailability.In order to err on the side of caution, our results suggest (Figure 9) one should select design wave with a steep gradient of decreasing water levels after peak (DH ds ) for riverside slope stability and a large total wave area (A tot ) for landside slope stability.

Conclusions
In this work, a combined transient analysis of flood wave shapes, groundwater pressure heads, and dike stability, indicates that dike stability is not only influenced by the maximum water level, but by the flood wave shape as well.
. Combining multiple flood wave shapes into a singleevent failure probability results in a failure probability that is more realistic than failure probabilities from a single design flood wave or a steady-state approximation.
. Considering only the average flood hydrograph in transient seepage analyses leads to overestimated reliability, urging to consider the variability in flood wave shapes. .Steady-state and analytical approximations of groundwater pressure heads failure probability generally overestimate the failure probability for slope stability up to several orders of magnitude compared to estimates with transient seepage analyses. .Higher maximum water levels result in higher failure probabilities due to higher groundwater pressure heads, but this relation is less strong for riverside slope stability, which is more dependent on the difference between peak water level and minimum water level after the peak. .In the considered case studies, the lowest stability always occurs > 2.5 days after the maximum water level.For landside slope stability and the sand dike case, the mode is 5 days after the maximum water level.Though this implies that water levels will be lower at the time slope instability occurs, there is still a high risk of extensive flooding for example due to erosion.We recommend to further analyse when the delay is such that the risk of flooding can be lower, and how to incorporate it into safety assessments.

Figure 1 .
Figure 1.Visualization of the flood wave shape parameters (A), selection of 50 representative flood waves with occurrence probabilities given their representativeness (B), and a comparison of the 50 selected flood waves with several frequently used flood wave shapes for groundwater pressure head estimation (C).

Figure 2 .
Figure 2. Workflow of the modelling chain and the comparison of its results with other methods.The various methods for dike slope reliability estimation are within quotation marks.More information on these methods and the food wave shapes they are based on can be found in Figure 1(C) and Section 2.3.

Figure 4 .
Figure 4. Temporal evolution (lines) and minimum value (crosses) for reliability indices of the real dike case over time for both landside and riverside slope stability and multiple methods for approximating groundwater conditions (Figure 1(C)).Note that the realistic, analytical, and steady-state methods have no temporal component and are therefore plotted on t 15 .

Figure 5 .
Figure 5. Reliability index (β) of the clay dike and sand dike cases conditional to various maximum river water levels (h max ) and according to multiple methods for groundwater pressure head approximation.The upper row contains the results for landside slope stability and the lower row for riverside slope stability.The box-whisker plots indicate the variation of the 50 sample set flood wave shapes.

Figure 6 .
Figure 6.Various flood wave shapes and the response of the groundwater pressure heads in the sand dike case and clay dike case on two extreme shaped flood waves with a h max of 5 m.

Figure 7 .
Figure7.Timing of lowest reliability index for the most extreme simulated water levels (h max = 5).For both landside and waterside slope stability, and for both the clay dike and sand dike cases, the lowest reliability index occurs several days after the maximum river water level.

Figure 8 .
Figure 8.First order sensitivity indices (S i ) of flood wave shape for various maximum water levels and for landside and riverside slope stability.The values obtained indicate the relative fractional impact of flood wave shape compared to that of geo-mechanical material properties.

Figure 9 .
Figure 9.Most dangerous flood wave shapes (top row) and sensitivity of dike failure probability at h max = 5 to several flood wave shape parameters (bottom row) related to either landside slope stability or riverside slope stability.The S i are interpreted as a further breakdown of the flood wave shape S i in Figure 8. Higher values indicate a higher influence of that parameter on the dike failure probability.

Table 1 .
Selected flood wave shape parameters, which are visualised in Figure1.

Table 2 .
Geo-mechanical and hydrological model parameters for Lent real dike cross-section: saturated unit weight (g sat ) angle of internal friction (w), cohesion (c), saturated conductivity (K sat ), SHANSEP Undrained NC shear strength ratio (S) and SHANSEP shear strength increase exponent (m).
Note: In case of a probabilistic parameter the distribution type and standard deviation are also provided.

Table 3 .
Geo-mechanical and hydrological model parameters for hypothetical cross-sections: saturated unit weight (g sat ) angle of internal friction (w), cohesion (c), saturated conductivity (K sat ).