Cramér–Lundberg asymptotics for spectrally positive Markov additive processes

This paper studies the Cramér–Lundberg asymptotics of the ruin probability for a model in which the reserve level process is described by a spectrally-positive light-tailed Markov additive process. By applying a change-of-measure technique in combination with elements from Wiener-Hopf theory, the exact asymptotics of the ruin probability are expressed in terms of the model primitives. In addition, a simulation algorithm of generalized Siegmund type is presented, under which the returned estimate of the ruin probability has bounded relative error. Numerical experiments show that, compared to direct estimation, this algorithm greatly reduces the number of runs required to achieve an estimate with a given accuracy. The experiments also reveal that our asymptotic results provide a good approximation of the ruin probability even for relatively small initial surplus levels.


Introduction
Primarily motivated by applications in insurance risk, much attention has been directed to evaluating ruin probabilities (Asmussen & Albrecher 2010). In this context, one typically assumes that the driving net cumulative claim process X(•) is of Lévy type (or a subclass thereof, such as the compound Poisson process). With the insurance firm's initial surplus being u, one is mainly interested in computing the probability p(u) of the surplus level process u − X(•) dropping below 0. If this is beyond reach, which is often the case, one may settle for the (exact) tail asymptotics, the goal being to identify an explicit function p̄(u) such that p(u)/p̄(u) → 1 as u → ∞.
Research into such exact asymptotics goes back about a century. In the regime that the driving Lévy process is of compound Poisson type, and in case the claim size distribution is light-tailed, the classical Cramér-Lundberg asymptotics entail that p(u) decays, for u large, like p̄(u) = α e^{−θu} for positive constants α and θ (where the decay rate θ solves the so-called Lundberg equation). In the analysis one often relies on a duality with the stationary workload in the M/G/1 queue; see e.g. (Asmussen & Albrecher 2010, Ch. IV.2) and (Debicki & Mandjes 2015). Whereas the focus in the literature lies on finding the exact tail asymptotics for spectrally-positive (i.e. having only positive jumps) light-tailed Lévy processes, Bertoin & Doney (1994) derive such asymptotics for their spectrally two-sided counterpart (i.e. having jumps in both directions).
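In the compound Poisson case the Lundberg equation can be solved numerically in a few lines. The Python sketch below uses hypothetical parameters (claim arrival rate lam, Exp(mu) claim sizes, premium rate c) and bisects for the positive root θ of the cumulant equation; for exponential claims the root is also known in closed form, θ = μ − λ/c, which the bisection reproduces.

```python
# Hypothetical compound Poisson example: claims arrive at rate lam, have
# Exp(mu) sizes, and premiums come in at rate c (net profit: c > lam/mu).
lam, mu, c = 1.0, 1.0, 2.0

def kappa(theta):
    # cumulant per unit time of the net cumulative claim process:
    # lam * (E e^{theta C} - 1) - c * theta, with C ~ Exp(mu)
    return lam * (mu / (mu - theta) - 1.0) - c * theta

# kappa is convex with kappa(0) = 0 and kappa'(0) < 0, so the Lundberg
# equation kappa(theta) = 0 has one positive root; bisect for it on (0, mu).
lo, hi = 1e-9, mu - 1e-9
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if kappa(mid) < 0.0:
        lo = mid
    else:
        hi = mid
theta = 0.5 * (lo + hi)
# for exponential claims the root is known in closed form: theta = mu - lam/c
```

The same bisection applies to any light-tailed claim distribution for which the moment generating function is available.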
A natural extension of the above body of work concerns exact tail asymptotics of ruin probabilities in the context of Markov additive processes (MAPs) (Çinlar 1972, Neveu 1961). These are essentially Markov-modulated Lévy processes with, in addition, jumps at transition epochs of the modulating process (also often referred to as background process). More explicitly, a MAP Y(•) behaves as the Lévy process X_i(•) when its Markovian background chain J(•) (evolving independently of the process Y(•) itself) is in state i, and may in addition have a jump at each transition of J(•). Because of the state changes, MAPs are particularly suitable for modeling real-world processes whose behavior changes over time.
A main objective of this paper is to establish the exact tail asymptotics of the ruin probability p(u) when the driving process is a spectrally-positive light-tailed MAP. We find that these asymptotics are of Cramér-Lundberg type, i.e. they are of the form p̄(u) = α e^{−θu} for positive α and θ. The present paper complements our recent work (van Kreveld et al. 2022) and its predecessors, such as D'Auria et al. (2010), Dieker & Mandjes (2011), and Ivanovs (2011), which succeeded in identifying the Laplace transform of p(u) for spectrally one-sided MAPs. With the expressions found there, one could in principle try to perform Laplace inversion in order to obtain numerical values for p(u), but a complication is that these become inaccurate in the regime that ruin is rare. The findings of the present paper remedy this issue: we provide an asymptotic expansion that becomes increasingly accurate as u grows.
Cramér-Lundberg asymptotics have been studied for various kinds of specific MAPs. Without providing an exhaustive overview, we discuss a few relevant contributions. Siegl & Tichy (1999) consider a compound Poisson process with dividend barrier and two possible claim frequencies governed by a two-state Markov chain. In Jasiulewicz (2001) a model is analyzed with the premium rate depending continuously on the reserve level and the Markov-modulated claim frequency. More general MAPs of compound Poisson type are studied with constant premium rate (Miyazawa 2002) or constant claim frequency (Yin et al. 2006). The case where all components of the compound Poisson process are allowed to change with the background chain is considered by Zhu & Yang (2009). Typically, for the case that the driving process is of MAP type, the prefactor α is characterized relatively implicitly; see e.g. (Asmussen & Albrecher 2010, Theorem VII.3.7), where α is expressed in terms of random quantities at a stopping time under an alternative probability measure. An exception to this is the work of Miyazawa (2004) on the Cramér-Lundberg asymptotics of a Markov-modulated compound Poisson process including jumps at transition epochs. By considering a ladder-height approach, that work determines an explicit expression for the prefactor α for this model.
In detail, the contributions of this paper are the following.
° In the first place, we succeed in establishing exact asymptotics p̄(u) = α e^{−θu} for the ruin probability p(u) in the model where the net cumulative claim process is represented by a spectrally-positive MAP Y(•) with light-tailed claim sizes and negative drift (such that p(u) → 0 as u → ∞). Through a delicate analysis of the overshoot of Y(•) over level u (under a specific alternative probability measure), we manage to express the constants α and θ in terms of the model primitives. By lower bounding the overshoot by 0 we obtain, as a by-product, a version of the Lundberg inequality for light-tailed spectrally-positive MAPs.
° Secondly, we show that the ruin probability p(u) can be efficiently computed by simulation; this is particularly useful in the regime that ruin is rare, in which direct simulation would be extremely time-consuming. To this end, we construct a generalized version of Siegmund's algorithm (Asmussen & Glynn 2007, Ch. VI.2), which amounts to performing importance sampling using the alternative measure that we identified when establishing the exact asymptotics. The resulting estimator can be considered optimal, in that we succeed in proving that it has bounded relative error. Numerical experiments reveal that the number of runs required to obtain an estimate with a given precision (i.e. the ratio of the width of the confidence interval and the estimate) is essentially constant in u; this is a huge improvement over direct simulation, under which this number of runs roughly grows as e^{θu}. The experiments also show that α e^{−θu} is an accurate approximation of the ruin probability, even for relatively small initial surplus levels u.
A crucial idea underlying our analysis is the observation that, in order to identify whether the spectrally-positive MAP Y(•) has crossed level u, it is sufficient to monitor only the maxima of Y(•) between every two successive transition epochs of the background process; in other words, it is not needed to monitor Y(•) continuously in time. The resulting embedded process turns out to be conceptually substantially simpler than the original one. When deriving the exact asymptotics via the embedded process, two steps play a key role at the technical level. (i) The first concerns the introduction of a specific new probability measure Q under which the net cumulative claim process Y(•) has a positive drift, so that ruin occurs almost surely; see for instance (Asmussen & Albrecher 2010, Ch. VII.3). The main idea is then to write p(u) in terms of the likelihood ratio of a path to ruin under the original measure P, relative to the new measure Q. This reasoning leads to an expression for p(u) in terms of the overshoot of Y(•) over u under Q. (ii) In the second step, which combines application of the Wiener-Hopf decomposition with a result (Ivanovs et al. 2010) on the number of zeroes of the matrix exponent corresponding to the MAP under consideration, this overshoot is analyzed in further detail (in the regime that u → ∞).
The outline of this paper is as follows. Section 2 defines our MAP Y(•), introduces the ruin probability p(u), and highlights a number of relevant preliminary results. The change-of-measure argument is carried out in Section 3, which leads, as mentioned above, to an expression for p(u) in terms of the overshoot of Y(•) over u under Q. As an intermediate step towards the identification of the exact asymptotics of p(u), the purpose of Section 4 is to find the transform of the overshoot distribution under Q. As it turns out, this overshoot transform can be expressed in terms of the solution to a set of linear equations. Combining the above findings, in Section 5 the exact asymptotics of the ruin probability are derived (i.e. the constants α and θ in p̄(u) = α e^{−θu} are identified). A simulation algorithm of generalized Siegmund type, enabling fast estimation of p(u), is given in Section 6. In addition, we give a proof of its bounded relative error, making the algorithm particularly useful in the rare-event regime. Finally, Section 7 presents the output of simulation experiments that provide an indication of the achievable speedup, as well as of the accuracy of the approximation p(u) ≈ p̄(u).

Model and preliminaries
In this section, we first describe in detail the model that we consider in this paper and introduce the corresponding notation. We then discuss the Wiener-Hopf decomposition for spectrally-positive Lévy processes and the result concerning the number of singularities of a spectrally one-sided MAP. Finally, we briefly outline the approach that we will follow in later sections to establish the exact asymptotics. The model, notation and preliminaries strongly resemble those featuring in van Kreveld et al. (2022), save for a few specific aspects (e.g. the restriction to spectrally-positive processes, the background process being irreducible, and a slightly different version of Proposition 2.2).

Model
We start by defining our model. Let the background process J(•) ≡ (J(t))_{t≥0} be an irreducible continuous-time Markov chain with state space {1, …, d} for d > 1. The corresponding transition rate matrix (or: generator matrix) is given by Q = (q_ij)_{i,j=1}^d, with q_i := −q_ii > 0, having invariant probability distribution π = (π_1, …, π_d). Associated with every state i, let X_i(•) ≡ (X_i(t))_{t≥0} be a spectrally-positive Lévy process, evolving independently of J(•). Informally, when J(t) = i, the process Y(•) ≡ (Y(t))_{t≥0} behaves as the Lévy process X_i(•). Additionally, the process is allowed to have non-negative jumps at transition epochs of the background process. For that purpose, let (L^n_ij)_{n∈N} (with i, j ∈ {1, …, d} such that i ≠ j) be a sequence of independent copies of the non-negative random variable L_ij, independent of anything else, representing the jumps at the epochs at which J(•) jumps from state i to state j.
The MAP Y(•) can be formally defined as follows. Let t_n denote the time of the nth transition of the background chain. If J(0) = i, we have that Y(t) = X_i(t) for t ∈ [0, t_1), and, if the transition at t_n is from (say) i to j (where i ≠ j), then
Y(t) = Y(t_n−) + L^n_ij + X_j(t) − X_j(t_n),
where t ∈ [t_n, t_{n+1}) and n ∈ N. We say that the thus constructed process Y(•) is a spectrally-positive Markov additive process (MAP).
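This construction translates directly into a path simulator. The following Python sketch uses hypothetical parameters (a two-state generator Q, per-state compound Poisson claims at rate beta[i] with Exp(mu[i]) sizes, premium rates c[i], and Exp(nu[i][j]) transition jumps) and records Y at its jump epochs, exploiting the memorylessness of exponential sojourn times to race the next claim against the next background transition.

```python
import random

random.seed(7)

# Hypothetical 2-state example: generator Q, per-state claim rate beta[i],
# Exp(mu[i]) claim sizes, premium (downward drift) rate c[i], and Exp(nu[i][j])
# jumps of Y at transition epochs of the background chain.
Q = [[-1.0, 1.0], [2.0, -2.0]]
beta = [1.0, 2.0]
mu = [1.0, 2.0]
c = [2.0, 1.5]
nu = [[None, 4.0], [4.0, None]]

def simulate(horizon):
    """Return (times, values) of Y at its jump epochs up to the horizon; Y(0) = 0."""
    t, y, j = 0.0, 0.0, 0
    times, values = [0.0], [0.0]
    while t < horizon:
        stay = random.expovariate(-Q[j][j])    # residual sojourn time in state j
        claim = random.expovariate(beta[j])    # time until the next claim
        if claim < stay:                       # a claim occurs first
            t += claim
            y += random.expovariate(mu[j]) - c[j] * claim
        else:                                  # the background chain jumps first
            t += stay
            y -= c[j] * stay
            k = 1 - j                          # the only other state in this example
            y += random.expovariate(nu[j][k])  # jump of Y at the transition epoch
            j = k
        times.append(t)
        values.append(y)
    return times, values

ts, ys = simulate(50.0)
```

Between jump epochs the path decreases linearly at rate c[j], so the recorded points suffice to reconstruct the full trajectory.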

Notation and objective
Each of the spectrally-positive Lévy processes X_i(•) is defined through its Laplace exponent ϕ_i(•): for α ≥ 0,
ϕ_i(α) := log E e^{−α X_i(1)},
where the corresponding right inverse is denoted by ψ_i(•). In the sequel, M(α) = (m_ij(α))_{i,j=1,…,d} denotes the matrix with entries
m_ii(α) := ϕ_i(α) + q_ii,   m_ij(α) := q_ij λ_ij(α) for i ≠ j,
with λ_ij(α) := E e^{−αL_ij}. The matrix exponent M(α) can be seen as the MAP counterpart of the Laplace exponent.
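For concreteness, the matrix exponent can be assembled numerically. The Python sketch below uses hypothetical parameters (each X_i a compound Poisson process with Exp(mu[i]) jumps minus a drift c[i], and Exp(nu[i][j]) transition jumps), builds M(α) for a two-state example, and checks that det M(0) = 0, since M(0) reduces to the generator Q.

```python
# Hypothetical 2-state example: X_i is compound Poisson(beta[i]) with Exp(mu[i])
# jumps minus a premium drift c[i]*t; transition jumps L_ij ~ Exp(nu[i][j]).
Q = [[-1.0, 1.0], [2.0, -2.0]]
beta = [1.0, 2.0]
mu = [1.0, 2.0]
c = [2.0, 1.5]
nu = [[None, 4.0], [4.0, None]]

def phi(i, a):
    # Laplace exponent of the spectrally-positive Levy process X_i:
    # log E e^{-a X_i(1)} = c_i*a + beta_i*(mu_i/(mu_i + a) - 1)
    return c[i] * a + beta[i] * (mu[i] / (mu[i] + a) - 1.0)

def lam(i, j, a):
    # Laplace-Stieltjes transform of the transition jump L_ij ~ Exp(nu_ij)
    return nu[i][j] / (nu[i][j] + a)

def M(a):
    # matrix exponent: m_ii = phi_i(a) + q_ii, m_ij = q_ij * lambda_ij(a), i != j
    d = 2
    return [[(phi(i, a) + Q[i][i]) if i == j else Q[i][j] * lam(i, j, a)
             for j in range(d)] for i in range(d)]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

# at a = 0 every phi_i and every (lambda_ij - 1) vanishes, so M(0) = Q
print(det2(M(0.0)))
```

The check det M(0) = 0 is exactly the final statement of Proposition 2.2 below, since a generator matrix is singular.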
The aim of this paper is to identify the tail asymptotics of the maximum of the MAP conditional on the initial background state being i ∈ {1, …, d}. That is, with
p_i(u) := P(∃ t ≥ 0: Y(t) ≥ u | J(0) = i),
we wish to find an explicit function p̄_i(u) such that p_i(u)/p̄_i(u) → 1 as u → ∞. We say that p̄_i(u) are the exact asymptotics of p_i(u). We throughout assume that Y(•) has a negative drift, i.e.,
Σ_{i=1}^{d} π_i (E X_i(1) + Σ_{j≠i} q_ij E L_ij) < 0,
such that exceeding u is increasingly rare as u → ∞; in the actuarial literature, this condition is known as the net profit condition. The focus is on a light-tailed spectrally-positive MAP, under which p_i(u) decays effectively exponentially. We provide a more precise description of what we mean by the MAP being light-tailed in Section 3.

Preliminaries
Two important results play a crucial role in the derivation of the exact asymptotics of p_i(u): the Wiener-Hopf decomposition for spectrally-positive Lévy processes and a fundamental result on the zeroes of det M(α).
The Wiener-Hopf decomposition states that the value of any Lévy process at an exponentially distributed epoch can be written as the difference between two independent non-negative random quantities. In the context of this paper, we specifically consider the case of a spectrally-positive Lévy process (X(t))_{t≥0} with Laplace exponent ϕ(•) and corresponding right inverse ψ(•). Denote by (X̄(t))_{t≥0} the running maximum process associated with (X(t))_{t≥0}, and by (X̲(t))_{t≥0} the corresponding running minimum process. In the sequel, T_ν is an exponentially distributed random variable with mean ν^{−1}, assumed independent of anything else.
Proposition 2.1: Let (X(t))_{t≥0} be a spectrally-positive Lévy process. Then, for any ν > 0, the random variables X̄(T_ν) and X̄(T_ν) − X(T_ν) are independent, X̄(T_ν) − X(T_ν) is distributed as −X̲(T_ν), and −X̲(T_ν) is exponentially distributed with parameter ψ(ν).
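The exponentiality of the minimum can be checked numerically. In the Python sketch below (hypothetical parameters: X(t) equal to a compound Poisson sum of Exp(1) jumps minus a drift 2t, evaluated at an Exp(1) killing epoch), the right inverse evaluates to ψ(1) = 1/√2, and the simulated mean of −X̲(T_1) is compared with 1/ψ(1).

```python
import math
import random

random.seed(42)

beta, mu, c, nu = 1.0, 1.0, 2.0, 1.0   # hypothetical example parameters
# Laplace exponent phi(a) = c*a + beta*(mu/(mu + a) - 1); its right inverse at
# nu = 1 solves 2*psi + 1/(1 + psi) - 1 = 1, i.e. psi = 1/sqrt(2).
psi = 1.0 / math.sqrt(2.0)

def running_min(horizon):
    """Pathwise minimum of X(t) = (sum of Exp(mu) jumps) - c*t over [0, horizon]."""
    t, x, m = 0.0, 0.0, 0.0
    while True:
        a = random.expovariate(beta)              # time until the next jump
        if t + a >= horizon:
            return min(m, x - c * (horizon - t))  # final segment ends at horizon
        t += a
        m = min(m, x - c * a)                     # pre-jump value is a local minimum
        x = x - c * a + random.expovariate(mu)

n = 20000
mean_neg_min = sum(-running_min(random.expovariate(nu)) for _ in range(n)) / n
# Proposition 2.1 predicts -min ~ Exp(psi), so the mean should be near 1/psi
```

Since X decreases continuously between jumps, the pathwise minimum is attained either just before a jump or at the killing epoch, which the sketch exploits.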
The second result concerns a characterization of the zeroes of the determinant of the matrix exponent M(α) of a spectrally-positive MAP. In this result an important role is played by the Lévy processes X_i(•) that are (almost surely) increasing (also referred to as subordinators). Let S↑ denote the set of background states corresponding to these subordinators. The following result is (Ivanovs et al. 2010, Theorem 1).
Proposition 2.2: Let (Y(t))_{t≥0} be a spectrally-positive MAP, and let the underlying background process (J(t))_{t≥0} be irreducible. Then the equation det M(γ) = 0 has d − |S↑| − 1{μ ≤ 0} solutions in ℂ with positive real part, where μ denotes the asymptotic drift of Y(•). Additionally, it holds that det M(0) = 0.

Approach
We conclude this section with a short account of the approach followed in the derivation of the exact asymptotics, as covered by Sections 3-5. The MAP itself being a rather involved object, we work with a more manageable embedded version. This embedded process is constructed in such a way that the event of the MAP exceeding u is equivalent to the event of the embedded process exceeding u. More specifically, throughout this paper we restrict ourselves, similar to the approach followed in van Kreveld et al. (2022), to the value of Y(•) at only three types of time points. In the interval between two transitions of the background process we record the value of Y(•) (i) at the start of the interval (right after the jump at the transition epoch, that is), (ii) at the epoch that the maximum value (within the interval) is achieved, and (iii) at the end of the interval (right before the jump at the transition epoch, that is).
It can be seen that, in order to verify whether Y(•) exceeds u, we can restrict ourselves to the values of the MAP at epochs of type (ii). The increments of the MAP between the embedded time points are relatively straightforward to deal with, as a consequence of Proposition 2.1.
In Section 3, we work with the above embedding to find an alternative expression for the ruin probability under an alternative measure Q. Concretely, we succeed in expressing p_i(u) in terms of the overshoot of Y(•) over level u under Q. This overshoot is then analyzed in Section 4, which eventually leads to the exact asymptotics of p_i(u) in Section 5.

Change of measure
In this section, we derive an alternative expression for the ruin probability p_i(u) by applying a change of measure. The derivation consists of the following four steps:
(1) constructing a system of equations for finding the positive solution to the so-called Lundberg equation, providing the candidate for the exponential decay rate θ appearing in the exact asymptotics of p_i(u) (where we remark that we show later that θ is independent of the initial state i);
(2) defining an embedding of the MAP Y(•), based on the time points of type (i), (ii) and (iii) that were introduced in Section 2.1;
(3) defining an alternative probability measure Q, also corresponding to a MAP, but with different driving Lévy processes, a different Markovian background process, and jumps at transition epochs having different distributions;
(4) calculating the likelihood ratio of the actual probability measure P with respect to the new probability measure Q, through which we find a compact expression for our target probability p_i(u).
The compact expression for p_i(u) provides us with an explicit (exponentially decaying) bound on p_i(u), uniformly in u ≥ 0. More importantly, however, it also eventually allows us to evaluate the exact asymptotics of p_i(u), as shown in Sections 4 and 5.

Step 1: the Lundberg equation
The Lundberg equation, which yields the candidate for the exponential decay rate θ appearing in the exact asymptotics of p_i(u), involves the increment of the MAP between two consecutive epochs at which the background state changes to an (arbitrarily chosen) reference state i_0 ∈ {1, …, d}. The goal of Step 1 is to quantify the moment generating function of this increment, providing the solution to the Lundberg equation.
Denote, for i = 1, …, d, by (V^n_i)_{n∈N} a sequence of independent random variables distributed as the generic random variable V_i := X̄_i(T_{q_i}) (in words: the maximum of the spectrally-positive Lévy process X_i(•) over an exponentially distributed time with mean q_i^{−1}). Also, denote by (W^n_i)_{n∈N} a sequence of independent random variables distributed as the generic random variable W_i := −X̲_i(T_{q_i}) (in words: minus the minimum of the spectrally-positive Lévy process X_i(•) over an exponentially distributed time with mean q_i^{−1}); recall from Proposition 2.1 that W_i is exponentially distributed with parameter δ_i := ψ_i(q_i). Let the resulting 2d sequences be independent. Due to Proposition 2.1, the Laplace-Stieltjes transforms of these random variables are given by
v_i(α) := E e^{−αV_i} = (q_i (δ_i − α))/(δ_i (q_i − ϕ_i(α))),   w_i(α) := E e^{−αW_i} = δ_i/(δ_i + α).
Denote by f^{(P)}_i(•) the density of V_i and by g^{(P)}_i(•) the density of W_i (for ease assumed to exist), both under the original measure P. Below, we will use the notation h^{(P)}_ij(•) for the density of L_ij under P. Now fix a reference state i_0 ∈ {1, …, d}, and define by U the increment of Y(•) between two subsequent visits of the background process to this state i_0. That is, if t_n and t_m are two subsequent times at which the background chain enters state i_0, then U is distributed as Y(t_m) − Y(t_n). We assume that we are in the light-tailed regime, in the sense that the Lundberg equation ũ(θ) := E e^{θU} = 1 has a positive solution, say θ; in the literature, ũ(θ) is referred to as the moment generating function (mgf) of the random variable U.
Four remarks are in place now. In the first place, note that our assumption is similar to that of (Debicki & Mandjes 2015, Section 8.1) for the Lévy case (i.e. covering the situation without any background process). In the second place, the fact that the Lundberg equation has a positive solution implies that the jumps L_ij have a finite mgf in an open interval around the origin, and that the same holds for the mgfs of the possible upward jumps of the Lévy processes X_i(•). It is for this reason that we call the driving MAP Y(•) light-tailed. In the third place, since ũ(0) = 1, ũ′(0) < 0 and lim_{θ→∞} ũ(θ) = ∞, the existence of a positive solution to ũ(θ) = 1 excludes cases in which ũ 'jumps over 1' at some θ > 0. In the fourth place, we note that it may seem that θ depends on the reference state i_0 chosen. Below, however, we argue that the choice of i_0 does not affect the value of θ.
Next, we show that we can express the root θ in terms of a generalized eigensystem. Let y_j ≡ y_j(θ), for j ≠ i_0, be the moment generating function of the increment of Y(•), with the background process starting in state j, before the background process reaches the reference state i_0. To derive a system of equations, consider an increment of Y(•) in an interval of the type (t_n, t_{n+1}], that is, an interval between two successive transition epochs of the background process J(•). By Proposition 2.1, we know that such an increment can be written as the sum of three independent variables: the maximum of the corresponding Lévy process in this interval, the decrease after this maximum, and the jump at the transition epoch. In terms of moment generating functions this leads to the equations
ũ(θ) = v_{i_0}(−θ) (δ_{i_0}/(δ_{i_0} + θ)) Σ_{k≠i_0} (q_{i_0 k}/q_{i_0}) λ_{i_0 k}(−θ) y_k
and, for j ≠ i_0,
y_j = v_j(−θ) (δ_j/(δ_j + θ)) Σ_{k≠j} (q_{jk}/q_j) λ_{jk}(−θ) y_k.
Then we equate ũ(θ) to 1, in order to find θ. We thus end up with the following system of equations:
y_j = v_j(−θ) (δ_j/(δ_j + θ)) Σ_{k≠j} (q_{jk}/q_j) λ_{jk}(−θ) y_k,   j = 1, …, d,
with y_{i_0} = 1. Note that if y solves this system of equations, then so does c • y for any c, implying that the solution θ does not depend on the reference state i_0. It also means that y is unique up to multiplication by a factor, so that ratios of the type y_i/y_j are uniquely determined. Due to the fact that the components of the vector y = y(θ) can be interpreted as moment generating functions, we conclude that y is necessarily componentwise positive. Below we will use the eigenvalue/eigenvector pair (θ, y) when defining the alternative measure Q.
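Since v_j(−θ) δ_j/(δ_j + θ) = q_j/(q_j − ϕ_j(−θ)), the homogeneous system above can be rearranged into M(−θ) y = 0, so that θ can be found numerically as a positive zero of det M(−θ) with a componentwise positive null vector y. The Python sketch below works this out for a hypothetical two-state example (compound Poisson inputs with Exp(mu[i]) claims, premium rates c[i], no transition jumps); all parameter values are illustrative assumptions.

```python
# Hypothetical 2-state example without transition jumps: X_i is compound
# Poisson(beta[i]) with Exp(mu[i]) claims minus a premium drift c[i]*t.
Q = [[-1.0, 1.0], [1.0, -1.0]]
beta = [1.0, 2.0]
mu = [1.0, 2.0]
c = [2.0, 1.5]

def phi(i, a):
    # Laplace exponent of X_i: log E e^{-a X_i(1)}
    return c[i] * a + beta[i] * (mu[i] / (mu[i] + a) - 1.0)

def det_M_minus(theta):
    # det M(-theta) for the 2x2 matrix exponent (no jumps, so lambda_ij = 1)
    m11 = phi(0, -theta) + Q[0][0]
    m22 = phi(1, -theta) + Q[1][1]
    return m11 * m22 - Q[0][1] * Q[1][0]

# The net profit condition holds (stationary drift -0.75 < 0); det M(-theta)
# has a single positive zero below mu[0] = 1: bisect on a bracketing interval,
# using that det M(-0.1) > 0 > det M(-0.9).
lo, hi = 0.1, 0.9
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if det_M_minus(mid) > 0.0:
        lo = mid
    else:
        hi = mid
theta = 0.5 * (lo + hi)

# componentwise positive null vector y of M(-theta), normalized by y[0] = 1
y = [1.0, -(phi(0, -theta) + Q[0][0]) / Q[0][1]]
```

The positivity of y and the vanishing of both rows of M(−θ) y at the root serve as sanity checks on the computed pair (θ, y).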

Step 2: the embedding
The idea is to embed the process Y(•) into a simpler process that still contains all information based on which it can be determined whether it exceeds level u. To this end, observe that it suffices to consider only the values of Y(•) that correspond to maxima between transitions of the background process. This observation is visualized in Figure 1; the maxima between transition epochs (i.e. the dots) give all information that is needed to verify whether level u is ever crossed. Define the embedded process (S_n)_n by
S_n := Σ_{m=0}^{n} V^m_{K_m} − Σ_{m=0}^{n−1} (W^m_{K_m} − L^m_{K_m K_{m+1}}),
with K_0 := J(0) and K_n := J(t_n) denoting the state of the background process right after the nth transition of the background process.
As discussed above, and appealing to Proposition 2.1 to justify the independence between the sequences (V^n_i)_{n∈N} and (W^n_i)_{n∈N}, we can rewrite the ruin probability p_i(u) in terms of the embedded process (S_n)_n, as follows:
p_i(u) = P(∃ n ≥ 0: S_n ≥ u | K_0 = i).
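With a single background state the embedded process reduces to a classical random walk recorded at claim epochs, for which the ruin probability is known in closed form. The Python sketch below (hypothetical parameters: Exp(1) claims arriving at rate 1, premium rate 2) estimates P(∃n: S_n ≥ u) by direct simulation and compares it with the exact value (λ/(cμ)) e^{−(μ−λ/c)u}; a walk that drifts far below zero is counted as survival, which introduces only a negligible truncation error.

```python
import math
import random

random.seed(11)

lam, c, mu, u = 1.0, 2.0, 1.0, 2.0   # hypothetical single-state example

def ruined():
    # embedded walk at claim epochs: S_n = sum of (claim size - premium earned)
    s = 0.0
    while True:
        s += random.expovariate(mu) - c * random.expovariate(lam)
        if s >= u:
            return 1
        if s < -40.0:   # so far below zero that later ruin is negligibly likely
            return 0

n = 20000
est = sum(ruined() for _ in range(n)) / n
exact = (lam / (c * mu)) * math.exp(-(mu - lam / c) * u)   # = 0.5 * e^{-1}
```

For larger u this direct estimator deteriorates rapidly, which is precisely what motivates the change of measure developed next.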

Step 3: construction of the alternative probability measure
To evaluate the ruin probability in the regime that u → ∞, we work with a change of measure. Particularly, we set up the alternative probability measure Q (to be defined below), under which the process is still a MAP, but now has positive drift. The goal is to evaluate p_i(u) where the path of (S_n)_n is sampled under Q. The measure Q is an exponentially twisted version of P, with twist parameter θ (where a more descriptive definition will be given below). This change of measure has been employed for the random walk (Asmussen 2003, Chapter XIII) and Lévy (Debicki & Mandjes 2015, Section VIII.1) cases.
From ũ(•) being convex, ũ′(0) < 0 (due to (2)) and ũ(0) = ũ(θ) = 1, it follows that ũ′(θ) > 0, which means that under the new measure Q the drift of (S_n)_n has become positive, so that u is eventually exceeded almost surely under Q. As an aside, we also note that if state i corresponds to a subordinator process X_i(•) under P, then the same applies under Q (and vice versa).
Let ℓ ≡ ℓ(S) be the appropriate likelihood ratio (or Radon-Nikodym derivative), which records the likelihood of the path of (S_n)_n under P relative to Q (until (S_n)_n exceeds u, that is). Then we have the following standard equality translating the likelihood of outcomes under Q into those under P:
p_i(u) = E_Q [ℓ(S) 1{∃ n ≥ 0: S_n ≥ u}].
An important benefit of working with Q is that under this measure, the event in the indicator function has probability 1 due to the positive drift, whereas under P the same probability vanishes as u becomes large; as a consequence,
p_i(u) = E_Q ℓ(S).
It is noted that by many other definitions of Q we could have achieved a positive drift, and thus the validity of (6). The crucial feature of our specific alternative measure, as given in (5), however, is that for this Q the likelihood ratio takes a simple form, as we will show below.
The alternative measure Q, in an abstract sense defined via (5), is concretely described as follows. Under Q, the distributions of (V^n_i)_n and (W^n_i)_n are characterized through the densities
f^{(Q)}_i(x) := e^{θx} f^{(P)}_i(x)/v_i(−θ),   g^{(Q)}_i(x) := e^{−θx} g^{(P)}_i(x) (δ_i + θ)/δ_i,
respectively, for i = 1, …, d and x > 0. Note that this means that the Laplace-Stieltjes transform of the V^n_i becomes v^{(Q)}_i(α) = v_i(α − θ)/v_i(−θ), and that under Q the (W^n_i)_n are exponentially distributed with parameter δ_i + θ. In addition, the L^n_ij have density h^{(Q)}_ij(x) := e^{θx} h^{(P)}_ij(x)/λ_ij(−θ), where, in the same way as above, the corresponding Laplace-Stieltjes transform can be verified to be λ^{(Q)}_ij(α) = λ_ij(α − θ)/λ_ij(−θ). Now that we have defined the driving Lévy processes and the jumps at transition epochs under Q, we conclude by considering the transition rates of the background process under Q. For i ≠ j these become
q^{(Q)}_ij := q_ij v_i(−θ) (δ_i/(δ_i + θ)) λ_ij(−θ) y_j/y_i.
It is readily verified that this choice implies that the diagonal elements of the transition rate matrix under Q are
q^{(Q)}_ii = −Σ_{j≠i} q^{(Q)}_ij = q_ii,
where the last equality is by virtue of (4).
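The effect of the twist on the W-components can be illustrated numerically: reweighting Exp(δ) samples by e^{−θx} should reproduce an Exp(δ + θ) distribution. A small self-contained Python check, with illustrative values δ = 1 and θ = 0.5:

```python
import math
import random

random.seed(5)

delta, theta = 1.0, 0.5   # hypothetical illustrative values
n = 100000
w = [random.expovariate(delta) for _ in range(n)]
weights = [math.exp(-theta * x) for x in w]

# The importance-weighted mean equals the mean under the exponentially twisted
# distribution, which should be Exp(delta + theta), i.e. mean 1/(delta + theta).
twisted_mean = sum(x * wt for x, wt in zip(w, weights)) / sum(weights)
```

The unnormalized weights average to δ/(δ + θ), the Laplace-Stieltjes transform of W at θ, which is exactly the normalizing constant appearing in g^{(Q)}_i above.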

Step 4: the likelihood ratio
The next step is to evaluate the likelihood ratio ℓ(S) on a path such that (S_n)_n reaches u, the expectation of which equals p_i(u) by (6). Let N ≡ N(u) denote the first n for which S_n ≥ u (which is, under the new measure Q, finite almost surely). Then ℓ ≡ ℓ(S) can be written as the product of the following four factors corresponding to the elements of the MAP that change under Q: the maxima and minima of the underlying Lévy processes between two transition epochs, the jumps at transition epochs, and the transition rates of the background chain.
° In the first place there is the contribution of the 'in between maxima':
Π_{n=0}^{N} f^{(P)}_{K_n}(V^n_{K_n})/f^{(Q)}_{K_n}(V^n_{K_n}) = Π_{n=0}^{N} v_{K_n}(−θ) e^{−θ V^n_{K_n}}.
° Secondly, there is the contribution of the 'in between minima':
Π_{n=0}^{N−1} g^{(P)}_{K_n}(W^n_{K_n})/g^{(Q)}_{K_n}(W^n_{K_n}) = Π_{n=0}^{N−1} (δ_{K_n}/(δ_{K_n} + θ)) e^{θ W^n_{K_n}}.
° In the third place, there is the contribution of the jumps at transition epochs of the background process:
Π_{n=0}^{N−1} h^{(P)}_{K_n K_{n+1}}(L^n_{K_n K_{n+1}})/h^{(Q)}_{K_n K_{n+1}}(L^n_{K_n K_{n+1}}) = Π_{n=0}^{N−1} λ_{K_n K_{n+1}}(−θ) e^{−θ L^n_{K_n K_{m+1}}}.
° And finally there is the contribution due to the jumps of the background process:
Π_{n=0}^{N−1} q_{K_n K_{n+1}}/q^{(Q)}_{K_n K_{n+1}} = Π_{n=0}^{N−1} (y_{K_n}/y_{K_{n+1}}) (v_{K_n}(−θ) (δ_{K_n}/(δ_{K_n} + θ)) λ_{K_n K_{n+1}}(−θ))^{−1}.
It is readily verified (recognizing a 'telescopic product') that this contribution simplifies to
(y_{K_0}/y_{K_N}) Π_{n=0}^{N−1} (v_{K_n}(−θ) (δ_{K_n}/(δ_{K_n} + θ)) λ_{K_n K_{n+1}}(−θ))^{−1}.
It is also noted that, by the definition of the process (S_n)_n, the exponential factors in the above products combine into e^{−θ S_N}. Multiplying the above four components of the likelihood ratio, the resulting expression greatly simplifies: the factors involving δ and λ cancel, and of the factors v_{K_n}(−θ) only the last one (n = N) remains. It means that, upon combining the above, and recalling the identity (6), we have arrived at the following result.
Theorem 3.1: For all u ≥ 0 and i ∈ {1, …, d},
p_i(u) = E_Q [(y_i/y_{K_N}) v_{K_N}(−θ) e^{−θ S_N}].
Remark 3.1: To get insight into the expression found in Theorem 3.1, compare it to its counterpart for the maximum of a random walk with independent and identically distributed increments (X_n)_{n∈N} (distributed as the generic random variable X). With p(u) the probability of the random walk exceeding level u, and S_N denoting the value of this random walk at the moment N that u is crossed, a similar change of measure yields p(u) = E_Q (e^{−θ S_N}), with θ solving E(e^{θX}) = 1 and the X under Q having Laplace-Stieltjes transform E_Q e^{−αX} = E e^{−(α−θ)X}. This principle underlies the celebrated Siegmund algorithm for efficiently estimating p(u); see for a detailed account e.g. (Asmussen & Glynn 2007, Equation (2.5)). The additional factors for the MAP case, as appearing in Theorem 3.1, have two reasons. First, the factor y_i/y_{K_N} reflects the impact of the initial and eventual background states of the MAP. Secondly, regarding v_{K_N}(−θ), one could say that (by definition) the number of 'steps' of the embedded process (S_n)_n is odd: in step n there have been n + 1 contributions 'of the V-type', and just n contributions 'of the (−W + L)-type', the consequence being that at step N the contribution of the moment generating function of one of the V_i (namely the last one) has not been neutralized. This results in the additional factor v_{K_N}(−θ) in the expression for p_i(u).
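The random-walk identity p(u) = E_Q(e^{−θ S_N}) of Remark 3.1 can be turned directly into an importance-sampling estimator. The Python sketch below uses a hypothetical single-state example (Exp(μ) claims at rate λ, premium rate c, so the walk increment is X = C − cA, and exponential twisting yields C ~ Exp(μ − θ) and A ~ Exp(λ + cθ) under Q); it samples the walk under Q, where the drift is positive, averages the likelihood-ratio weights, and compares the result with the known closed form for exponential claims.

```python
import math
import random

random.seed(3)

lam, c, mu, u = 1.0, 2.0, 1.0, 5.0   # hypothetical single-state example
theta = mu - lam / c                  # Lundberg root: E e^{theta X} = 1, X = C - c*A

def one_run():
    # Sample the embedded walk under the twisted measure Q until it crosses u;
    # under Q the drift is positive, so the loop terminates almost surely.
    s = 0.0
    while s < u:
        s += random.expovariate(mu - theta) - c * random.expovariate(lam + c * theta)
    return math.exp(-theta * s)       # likelihood-ratio weight e^{-theta * S_N}

n = 4000
est = sum(one_run() for _ in range(n)) / n
exact = (lam / (c * mu)) * math.exp(-theta * u)
```

Every weight is at most e^{−θu}, which is the mechanism behind the bounded relative error of the Siegmund-type algorithm discussed in the introduction.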
While the focus of the section lies on deriving an alternative expression for p_i(u), as a by-product we obtain a uniform upper bound on p_i(u). To this end, we define
C_i := max_{k=1,…,d} (y_i/y_k) v_k(−θ).
Realizing that by definition S_N ≥ u, the following result, which can be seen as the MAP counterpart of the conventional Lundberg inequality, is an immediate consequence of Theorem 3.1.
Corollary 3.1: For all u ≥ 0 and i ∈ {1, …, d},
p_i(u) ≤ C_i e^{−θu}.
Observe that S_N can be decomposed into u + R(u), with R(u) denoting the 'overshoot' of the embedded process (S_n)_n over level u (i.e. S_{N(u)} − u ≥ 0); we also write N(u) rather than just N to stress the dependence on u. This means that if we manage to compute, for θ > 0,
r_ik(θ) := lim_{u→∞} E_Q [e^{−θ R(u)}; K_{N(u)} = k],
then we obtain
p_i(u) e^{θu} → Σ_{k=1}^{d} (y_i/y_k) v_k(−θ) r_ik(θ),
which is the MAP counterpart of the classical Cramér-Lundberg asymptotics that we were aiming at. Therefore, we would be done if we would be able to devise a way to compute the overshoot transform r_ik(θ). As it turns out, this can be done by taking a second transform (with respect to u): note that
r_ij(θ) = lim_{α↓0} α s_ij(α, θ),   where   s_ij(α, θ) := ∫_0^∞ e^{−αu} E_Q [e^{−θ R(u)}; K_{N(u)} = j] du.
Informally, the object α s_ij(α, θ) corresponds to the overshoot over an exponentially distributed threshold with mean α^{−1}, which grows to ∞ when sending α ↓ 0. In other words: what is left is (i) computing s_ij(α, θ), and (ii) letting α ↓ 0 in α s_ij(α, θ). These are the topics of the next two sections.

Computing the overshoot transform
In this section, we are interested in evaluating the double transform s_ij(α, θ) corresponding to the target level u and the overshoot R(u), as defined in (8). Recall that i and j respectively represent the initial background state and the state at the time level u is crossed. To find an expression for s_ij(α, θ), we distinguish three scenarios for the MAP during the first background state i: (1) Level u has been crossed before the first transition of the background process. This only leads to a contribution if i = j.
(2) Level u has not been crossed before the first transition of the background process, but due to the jump at the transition epoch it crosses level u. This only leads to a contribution if i ≠ j. (3) Level u is not crossed before or at the first transition epoch of the background process (to state k, say), but later it is.
We now split s_ij(α, θ) into the components s^(1)_i(α, θ), s^(2)_ij(α, θ), and s^(3)_ijk(α, θ) corresponding to the above three scenarios. In the first place, for i ≠ j we have a decomposition, displayed in (9), in terms of s^(2)_ij(α, θ) and s^(3)_ijk(α, θ), both evaluated below. The first term in the right-hand side of (9) is interpreted as the contribution due to the scenario in which V_i remains below u, a transition from background state i to background state j is made, and the corresponding jump brings Y(•) above u; in this scenario the overshoot includes an extra V_j (because we consider the embedded process, see Step 2 in Section 3). The second term in the right-hand side of (9) reflects the scenario in which V_i remains below u, and a transition from background state i to background state k ≠ i is made such that after the corresponding jump the process Y(•) is still below u. In the case that i = j we obtain along similar lines the decomposition (10), now also featuring s^(1)_i(α, θ), evaluated below. The first term in the right-hand side of (10) is the contribution due to the scenario in which V_i exceeds u. In the second term in the right-hand side of (10) we have that V_i remains below u, and a transition from background state i to background state k ≠ j is made such that after the corresponding jump Y(•) is still below u. We now further evaluate the expressions for s^(1)_i(α, θ), s^(2)_ij(α, θ), and s^(3)_ijk(α, θ). First, by conditioning on the value of V_i ∈ [u, ∞), we directly find that
s^(1)_i(α, θ) = ∫_0^∞ e^{−αu} ∫_u^∞ e^{−θ(x−u)} f^{(Q)}_i(x) dx du.
Next, for each of the quantities, we swap the order of the integrals so that the most elementary integration (i.e. the one over u) becomes the innermost integral. Then this integral is computed, and after that the other integrals can be evaluated successively.
Along the lines that we sketched above, we obtain a closed-form expression for s^(1)_i(α, θ). For s^(2)_ij(α, θ), a similar strategy can be followed. We calculate s^(3)_ijk(α, θ) in a similar fashion; in addition to the usual rearrangement of the integrals, we perform the change of variable y = x − u + v. We have thus managed to express the entries of the vector s_j(α, θ) := (s_1j(α, θ), …, s_dj(α, θ))^T in terms of themselves. We proceed by writing the resulting linear equations in vector/matrix form. To this end, note that under Q the matrix M(α) has entries of the same form as before, with the model primitives replaced by their counterparts under Q. For the remainder of this section, the notation M(α) implies that we are working under the measure Q. Also, we collect the known terms into a vector b_j(α, θ) with entries b_ij(α, θ). It is useful to observe that if background state i corresponds to a non-decreasing subordinator, we have δ_i → ∞, so that the corresponding expression simplifies accordingly. To obtain a system of equations for s_j(α, θ), we combine (9) and (10) with the expressions for s^(1)_i(α, θ), s^(2)_ij(α, θ), and s^(3)_ijk(α, θ). Multiplying (9) and (10) by an appropriate factor for each i = 1, …, d, it is seen that, for any given α and θ, we obtain the following system of linear equations.
Theorem 4.1: For any α > 0 and θ > 0, and for j = 1, . . ., d, the above system of linear equations holds.

We would be able to determine the vector s_j(α, θ) from Theorem 4.1, were it not for the fact that for i ∉ S↑ and k ∈ {1, . . ., d}, the quantities s_kj(δ_i, θ) appearing in b_ij(α, θ) are unknown. Defining corresponding unknowns ω_ij, we now turn our attention to finding (ω_ij)_{i ∉ S↑}. Note that, using the linear equations given in Theorem 4.1, one may express the vector s_j(α, θ) by relying on Cramér's rule. More concretely, with the matrix M_{b_j,i}(α, θ) denoting the matrix M(α) in which the ith column is replaced by the vector b_j(α), we have that s_ij(α, θ) = det M_{b_j,i}(α, θ) / det M(α). Since s_ij(α, θ) is finite, any zero of the denominator should be a zero of the numerator. According to Proposition 2.2, det M(α) = 0 has d• := d − |S↑| zeroes in the right half of the complex plane (recalling that the asymptotic drift is positive under Q). For ease of exposition, we let these zeroes have multiplicity 1 (and we call them, say, α_1, . . ., α_{d•}). When this multiplicity property does not hold, a reasoning similar to the one below still applies, but one needs to resort to the concept of Jordan chains; we do not discuss this procedure in detail, but instead refer to the in-depth treatment in D'Auria et al. (2010).
Having distinct zeroes guarantees that det M_{b_j,i}(α_m, θ) = 0 for m = 1, . . ., d•; in other words, the zeroes of det M (in the right half of the complex plane, that is) are also zeroes of det M_{b_j,i}, for each i = 1, . . ., d.
By precisely the same argument as the one given in van Kreveld et al. (2022), Equation (12) provides the same information for any i = 1, . . ., d, so it suffices to consider just the equation det M_{b_j,1}(α_m, θ) = 0. With M̃_ik(α) representing the (d − 1) × (d − 1) matrix that results after deleting the ith column and the kth row from M(α), this equation can be rewritten in terms of the unknowns. We thus obtain d• equations (one for each zero α_m) that are linear in the unknowns ω_{1j}, . . ., ω_{d•j}, which can be dealt with in the standard manner, thus yielding a solution for the ω_ij. This procedure can be repeated for each eventual state j. Now that the quantities ω_ij (for i ∉ S↑) are known, Equation (11) expresses s_ij(α, θ) in terms of known quantities.
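Purely as an illustration of the Cramér's rule step above, the following sketch solves a generic small linear system by replacing columns; the matrix M and vector b below are placeholders, not the paper's actual M(α) and b_j(α).

```python
import numpy as np

def cramer_solve(M, b):
    """Solve M x = b via Cramer's rule: x_i equals det(M with its
    i-th column replaced by b), divided by det(M)."""
    d = len(b)
    det_M = np.linalg.det(M)
    x = np.empty(d)
    for i in range(d):
        M_b = M.copy()
        M_b[:, i] = b          # replace the i-th column by b
        x[i] = np.linalg.det(M_b) / det_M
    return x

# Placeholder 3x3 system, chosen only for illustration.
M = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x = cramer_solve(M, b)
assert np.allclose(M @ x, b)   # agrees with a direct solve
```

In practice one would of course use a standard solver; Cramér's rule is used in the paper because the zeroes of the determinant in the denominator carry the information needed to pin down the unknowns ω_ij.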

Exact tail asymptotics
As pointed out at the end of Section 3, in order to evaluate r_ij(θ), we are interested in lim_{α↓0} α s_ij(α, θ). The purpose of this section is to find an explicit expression for this limit, and consequently to determine the exact asymptotics of p_i(u) via Equation (7). We rely on the fact that Section 4 showed how s_ij(α, θ) can be evaluated.
This explains why we first study the behavior of d(α) := det M(α) as α ↓ 0. To this end, taking entry-wise Taylor expansions at α = 0, we write M(α) = Q + αZ + O(α²), where Z = (z_ij)_{i,j=1}^d collects the first-order coefficients. Hence, we can express d(α) as the determinant of a sum. In order to work with determinants of sums, we have the following lemma. Let C^E_k be the matrix consisting of all columns of C, but with its kth column replaced with the kth column of E.

Lemma: det(C + εE) = det C + ε Σ_{k=1}^d det C^E_k + O(ε²).

Proof: Recall that det(C + εE) is the sum of 2^d determinants: one for each possible matrix in which each of the columns equals the corresponding column of C or of εE. Using this rule, one can write det(C + εE) as a polynomial in ε of degree d, where the coefficients of the powers of ε are sums of determinants of the above type. The result follows by isolating the terms that do not depend on ε and those that are linear in ε, and by in addition aggregating all terms that correspond to ε², . . ., ε^d.
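The lemma can be sanity-checked numerically on a small, arbitrary instance; the 2×2 matrices below are illustrative only.

```python
import numpy as np

# Numerical check of the determinant-sum lemma:
#   det(C + eps*E) = det C + eps * sum_k det C^E_k + O(eps^2),
# where C^E_k is C with its k-th column replaced by that of E.
C = np.array([[2.0, 1.0],
              [1.0, 3.0]])
E = np.array([[1.0, 2.0],
              [0.0, 1.0]])

def first_order(C, E, eps):
    """det C plus eps times the sum of the column-replaced determinants."""
    d = C.shape[0]
    lin = 0.0
    for k in range(d):
        CEk = C.copy()
        CEk[:, k] = E[:, k]          # replace the k-th column
        lin += np.linalg.det(CEk)
    return np.linalg.det(C) + eps * lin

eps = 1e-3
err = abs(np.linalg.det(C + eps * E) - first_order(C, E, eps))
# For this particular instance the remainder is exactly eps^2.
assert err < 10 * eps**2
```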
The idea is now to set C = Q + αZ, ε = α², and E any finite matrix in the above lemma. It immediately follows that d(α) = det(Q + αZ) + O(α²). We then use the lemma a second time, but now with C = Q, ε = α, and E = Z, so as to obtain d(α) = det Q + α Σ_{k=1}^d det Q^Z_k + O(α²) = α Σ_{k=1}^d det Q^Z_k + O(α²). Here, in the second equality, we used that Q is the transition rate matrix of a continuous-time Markov chain (also under Q), which implies det Q = 0.
We thus obtain an explicit expression for lim_{α↓0} α s_ij(α, θ), and hence for r_ij(θ).
Combining this with (7), we obtain the main result of the paper: the exact asymptotics for p_i(u).
As mentioned, this result can be considered the MAP counterpart of the classical Cramér-Lundberg result.
Theorem 5.1: For any initial state i ∈ {1, . . ., d}, we have p_i(u) e^{θu} → α_i as u → ∞, with α_i explicitly given in terms of the model primitives.

The above theorem provides a possible approximation for the ruin probability p_i(u) in the regime that u is large. Concretely, we propose the approximation (13): p̂_i(u) := α_i e^{−θu}. In Section 7 the accuracy of (13) is assessed.

Efficient simulation
In this section, we point out how to efficiently estimate p_i(u) by simulation, with emphasis on the regime that u is large. The main idea is to rely on importance sampling, using a generalized version of the celebrated Siegmund algorithm. More specifically, we propose to run independent simulations of the MAP under Q, and subsequently to average the values of the estimator γ(u) that are sampled in each of the runs. By Theorem 3.1 this results in an unbiased estimator. The main objective of this section is to prove that this estimator has bounded relative error. Its efficiency gain (relative to direct simulation, that is) is demonstrated in Section 7 through a series of numerical experiments.
A pseudo-code corresponding to a single run of this generalized Siegmund algorithm is given by Algorithm 1. Here s records the current value of the embedded MAP and j the current background state. Also, the function 'sample(X)' generates and returns a sample of the random variable X, independently of everything that has been sampled before. Finally, the function 'sampleNextState(j, Q^(Q))' returns a new state of the background chain sampled under Q, when the current state is j.
The while loop in Algorithm 1 updates the value of the MAP at the maxima between two successive background transition epochs. Lines 3 through 7 respectively correspond to sampling the decrease of the MAP before the next background transition, sampling the next background state, sampling the jump at the transition epoch, and sampling the maximum between the current and the next background transition.
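Since Algorithm 1 relies on model primitives not reproduced here, the sketch below instead implements the classical single-state specialization of the Siegmund algorithm, for a Gaussian random walk with negative drift: simulate under the exponentially tilted measure (which has positive drift) until level u is crossed, and return the likelihood ratio e^{−θ S_τ} as the unbiased sample. All parameter values are illustrative; the MAP version would additionally track the background state and the eigenvector weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def siegmund_run(u, mu=-1.0, sigma=1.0):
    """One importance-sampling run of the classical Siegmund algorithm
    for estimating P(sup_n S_n > u), where S_n is a random walk with
    N(mu, sigma^2) increments and mu < 0.

    Increments are drawn under the exponentially tilted measure, whose
    drift is -mu > 0, so level u is crossed with probability one; the
    returned likelihood ratio exp(-theta * S_tau) is an unbiased sample."""
    theta = -2.0 * mu / sigma**2      # Lundberg root of E[e^{theta X}] = 1
    s = 0.0
    while s <= u:
        s += rng.normal(-mu, sigma)   # tilted increment: positive drift
    return np.exp(-theta * s)

u = 20.0
samples = [siegmund_run(u) for _ in range(1000)]
# Every run crosses u, so each weight lies strictly below exp(-theta*u).
assert all(0.0 < w < np.exp(-2.0 * u) for w in samples)
```

Note how every sampled weight is at most e^{−θu}: this deterministic upper bound is exactly what drives the bounded-relative-error property discussed next.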
An important performance measure of algorithms estimating small probabilities is their relative error, defined as the standard deviation of the estimate divided by the estimated probability. Not only does γ ≡ γ(u) yield an unbiased estimator of p_i(u); the next theorem entails that the relative error is bounded in u. We refer to this property as bounded relative error (Asmussen & Glynn 2007, Ch. VI.1). Theorem 6.1: A sample of p_i(u) as returned by Algorithm 1 has bounded relative error.

Proof:
The proof is closely related to its counterpart for the random walk case (Asmussen & Glynn 2007, Ch. VI.2). Denote, in line with the notation that we have previously used, by R(u) and K_{N(u)} the overshoot and the background state, respectively, at the time that level u is crossed. Now consider the process (R(u), K_{N(u)}) for u ≥ 0, conditional on K_0 = i. Above, we have computed the transform r_ij(θ), by which we uniquely characterized the limiting distribution of (R(u), K_{N(u)}) as u → ∞; we let (R, K) be distributed as the corresponding limiting random vector.
Data: Initial state i of the background process and target level u; distributions of V_j, W_j, and L_jk under Q; transition rate matrix (q_ij)_{i,j=1}^d under Q; eigenvalue θ and corresponding eigenvector y. Result: Unbiased sample of p_i(u).
Algorithm 1: A single run in the generalized Siegmund algorithm.

As a result of the above, the ruin probability (which equals the mean of γ(u), as defined above) satisfies the following asymptotics, as u → ∞: p_i(u) = C_1 e^{−θu}(1 + o(1)) for some constant C_1 > 0. We have bounded relative error as the second moment of γ(u) vanishes at essentially the same rate as the square of p_i(u); to see this, observe that the second moment of γ(u) under Q equals C_2 e^{−2θu}(1 + o(1)), where it is noted that C_2 > C_1² due to Jensen's inequality. The variance of a single observation γ(u) in our generalized Siegmund algorithm thus satisfies (C_2 − C_1²) e^{−2θu}(1 + o(1)). It now directly follows that for u large the relative error tends to √(C_2 − C_1²)/C_1. The relative error thus loses its dependence on u as u grows, and is bounded by a constant.

Numerical experiments
In this section, we numerically study the asymptotic behavior of p(u), measuring in particular the efficiency of the generalized Siegmund algorithm compared to direct estimation. In addition, we include an experiment that studies the impact of the background process on the ruin probability.
To obtain a direct estimate of the ruin probability p(u), one may simulate the model at hand say n times under the original measure P, and then determine the fraction of runs in which the process exceeds level u. This leads to an unbiased estimator, with relative error √((1 − p(u))/(n p(u))). In order to achieve a relative error of at most ε, one thus requires roughly n ≈ (1 − p(u))/(ε² p(u)) runs. The particularly worrisome element in this quantity is the p(u) in the denominator, which is very small when u is large. Concretely, with p(u) decaying roughly as e^{−θu} and 1 − p(u) ≈ 1, we conclude that n blows up like e^{θu} as u grows.
The generalized Siegmund algorithm, in which the process is simulated under Q, on the other hand has bounded relative error. With c the limiting relative error per run given in (14), and again aiming for a relative error ε, this algorithm requires roughly n ≈ c²/ε² runs for large u. As a result, the generalized Siegmund algorithm gives an accurate estimate for any u, in that for large u the number of runs required becomes independent of u. This in particular means that this number of runs does not blow up as p(u) → 0.
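To make the comparison concrete, the following back-of-the-envelope sketch reproduces the order of magnitude of the run counts quoted in Experiment 7.1, using the constants α_1 ≈ 0.6390 and θ ≈ 0.4066 reported there (the function name is ours).

```python
import math

# Rough run counts for a 5% relative error under direct estimation,
# using the Cramer-Lundberg approximation p_1(u) ~ alpha_1 * exp(-theta*u)
# with the constants reported in Experiment 7.1.
alpha_1, theta, eps = 0.6390, 0.4066, 0.05

def runs_direct(u):
    """n ~ (1 - p)/(eps^2 * p): blows up like exp(theta * u)."""
    p = alpha_1 * math.exp(-theta * u)
    return (1.0 - p) / (eps**2 * p)

for u in (10, 40, 80):
    print(f"u = {u:2d}: direct estimation needs roughly {runs_direct(u):.0e} runs")
# The generalized Siegmund algorithm, in contrast, needs a u-independent
# number of runs (around 250 in Experiment 7.1).
```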
In the remainder of this section we discuss experiments in which we apply our generalized Siegmund algorithm. It should be noted that executing this algorithm requires the ability to sample the random variables V_1, . . ., V_d, while only their respective Laplace-Stieltjes transforms are known (see Proposition 2.1). To this end we first apply numerical inversion to obtain a discretization of the distribution function of each of the V_k. In our numerics we have used the intensively tested and frequently cited algorithm developed in Abate & Whitt (2006); in the experiments reported in this section, we use 10^3 mass points. With this approximate distribution function at our disposal, we can use the inverse distribution function method to sample a random variable distributed according to V_k. Three remarks are in order here. First, observe that, for any k, the Laplace inversion has to be performed just once (say, in a pre-processing phase); once the approximate distribution function of V_k has been computed, the generalized Siegmund algorithm can be executed repeatedly until an estimate of sufficient precision has been produced. Secondly, the simulation of the embedded process under the original measure P has the same inherent issue that the V_k must be sampled; in other words, the need to perform Laplace inversion is not specific to the generalized Siegmund algorithm. Thirdly, the fact that we propose to use numerical Laplace inversion to run our generalized Siegmund algorithm raises the question of why we do not simply numerically invert the transform of p(u) (as obtained in van Kreveld et al. (2022)). The latter method, however, is typically inferior to the generalized Siegmund algorithm, in particular in the current context where the ruin probability p(u) is small, as the Laplace inversion becomes increasingly inaccurate further along the tail.
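As an illustration of the inverse distribution function method on a discretized law, the sketch below samples from a set of mass points; the four points used here are made up, whereas in the actual procedure they would come out of the Abate & Whitt inversion.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_sampler(xs, ps):
    """Inverse-distribution-function sampling from a discretized law,
    given mass points xs with probabilities ps (hypothetical values
    standing in for the output of a numerical Laplace inversion)."""
    xs = np.asarray(xs)
    cdf = np.cumsum(ps)
    cdf = cdf / cdf[-1]                      # guard against rounding
    def sample(n):
        # Inverse transform: the first mass point whose CDF value
        # exceeds a uniform draw.
        return xs[np.searchsorted(cdf, rng.uniform(size=n))]
    return sample

# Hypothetical 4-point discretization of some V_k.
xs = [0.5, 1.0, 1.5, 2.0]
ps = [0.4, 0.3, 0.2, 0.1]
sample = make_sampler(xs, ps)
draws = sample(10_000)
assert set(np.unique(draws)) <= set(xs)
```

This pre-processing step is done once per V_k; all subsequent runs of the Siegmund algorithm reuse the same discretized distribution function.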
We now describe the specific MAP used in our experiments. It consists of two background states, the first and second corresponding respectively to a standard Brownian motion X_1(•) with drift −1/3, and a compound Poisson process X_2(•) with drift −1 and jumps of Exp(1) size arriving at rate 2/3. Note that we constructed this instance such that E X_1(1) = E X_2(1) = −1/3, allowing for a better comparison between the impacts of both processes on the ruin probability. To keep our model as elementary as possible, our setup does not contain jumps at transition epochs; we stress, however, that adding those does not lead to any conceptual complications.
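A quick sanity check of this construction, including a short simulation of the compound Poisson state (the simulation details are our own; the primitives are those stated above):

```python
import numpy as np

rng = np.random.default_rng(2)

# Per-state processes of the example MAP (under P):
# state 1: standard Brownian motion with drift -1/3;
# state 2: compound Poisson with drift -1, Exp(1) jumps at rate 2/3.
# Their means per unit of time coincide: -1/3 and -1 + (2/3)*1 = -1/3.
drift_cp = -1.0 + (2.0 / 3.0) * 1.0
assert abs(drift_cp - (-1.0 / 3.0)) < 1e-12

def sample_X2(t):
    """Simulate X_2(t): linear drift -t plus Exp(1) jumps at rate 2/3."""
    n_jumps = rng.poisson(2.0 / 3.0 * t)
    return -t + rng.exponential(1.0, size=n_jumps).sum()

t = 5_000.0
empirical_rate = sample_X2(t) / t
# Should be close to -1/3 (statistical check, generous tolerance).
assert abs(empirical_rate + 1.0 / 3.0) < 0.1
```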
Experiment 7.1: In the first experiment we vary the value of the exceedance level (or, in risk applications, the initial reserve) u, so as to provide empirical backing for the claims of Theorems 5.1 and 6.1.In addition, we obtain insight into the accuracy of the approximation (13).
We consider the instance q_1 = q_2 = 1, and we vary u from 10 to 80 in steps of 10; see Table 1. Two approximations of the ruin probability are presented: the second column shows estimates of p_1(u) generated by 10^4 runs of Algorithm 1, and the third column presents the approximation p̂_1(u) = α_1 e^{−θu} of (13) (with α_1 ≈ 0.6390 and θ ≈ 0.4066). The two approximations differ by around 1%, even for small values of u. This indicates, for our specific MAP, fast convergence of the expression in Theorem 5.1.
In addition, Table 1 shows the average relative error of a single run under Q (Algorithm 1), based on the sample (fourth column). This is compared to the corresponding error when one estimates the ruin probability directly under P (fifth column). Where the relative error of Algorithm 1 is fairly constant, the same error under direct estimation shows exponential increase in u, as anticipated. If, say, one is interested in an estimate for p_1(u) with relative error at most 5%, the instances u = 10, 40, 80 respectively require approximately 4·10^4, 7·10^9, and 8·10^16 runs. The number of runs required under Q, on the other hand, is around 250 for any value of u.

Experiment 7.2: In the second experiment the background chain transition rates q_1 and q_2 are varied, and with them the proportion of time spent in each of the two background states. For each combination of these parameter values we run Algorithm 1 with u = 40 a total of 10^4 times. The output consists of the estimated ruin probability and the relative error per run based on the sample. The results are shown in Table 2.
As we can see, the ruin probability heavily depends on the transition rates of the background chain. The larger the proportion of time spent in the compound Poisson state (state 2), the larger the ruin probability; conversely, the larger the proportion of time spent in the Brownian state (state 1), the smaller the ruin probability. It is also reassuring to see that, for this MAP, the (bounded) relative error per run is rather low: with 10^4 runs this results in a relative error of the order 5·10^−3.
When q_1 ≫ q_2, one expects that the net cumulative claim process effectively coincides with a compound Poisson process, whose ruin probability asymptotically decays as (2/3) e^{−u/3}. Substituting u = 40 gives 1.0797·10^−6, close to the values of p_1(40) in the top rows. Conversely, when q_1 ≪ q_2, the net cumulative claim process should be close to a Brownian motion, whose ruin probability is e^{−2u/3}. Substituting u = 40 gives 2.6231·10^−12, in line with the values of p_1(40) in the bottom rows.
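Both limiting values can be reproduced directly from the stated formulas:

```python
import math

# The two limiting regimes for the example MAP at u = 40.
u = 40.0
# q1 >> q2: effectively the compound Poisson state; ruin probability
# decays as (2/3) * exp(-u/3).
cp_limit = (2.0 / 3.0) * math.exp(-u / 3.0)
# q1 << q2: effectively the Brownian state; ruin probability exp(-2u/3).
bm_limit = math.exp(-2.0 * u / 3.0)
print(f"compound Poisson limit: {cp_limit:.4e}")   # ~1.0797e-06
print(f"Brownian limit:         {bm_limit:.4e}")   # ~2.6231e-12
```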

Figure 1. Example of a MAP with two background states, with dots representing the embedded process (S_n)_n; that is, the dots represent the maxima over the intervals [t_n, t_{n+1}) for n ∈ N.

Table 1. Ruin probabilities as a function of u, and the relative error per run under Q and under P. Columns: u; p_1(u) estimated by Algorithm 1; α_1 e^{−θu}; relative error per run under Q; relative error per run under P.