Local approximation of Markov chains in time and space

ABSTRACT In epidemic modelling, the emergence of a disease is characterized by low numbers of infectious individuals. In this case, environmental randomness impacts outcomes such as the outbreak or extinction of the disease. This randomness can be accounted for by modelling the system as a continuous time Markov chain (CTMC). The probability of extinction given some initial state is the probability of hitting, from that state, a subset of the state space associated with extinction. This hitting probability can be studied by passing to the embedded discrete time Markov chain (DTMC). An approach is presented to approximate a DTMC on a countably infinite state space by a DTMC on a finite state space for the purpose of solving general hitting problems. This approach is applied to approximate the probability of disease extinction in an epidemic model. It is also applied to evaluate a heterogeneous disease control strategy in a metapopulation.


Introduction
Early mathematical models of disease considered infection in a single population compartmentalized into susceptible and infectious classes [34,44]. Such models continue to be of interest to researchers to this day [2,10,13,24,32,36,37]. In recent years, increases in mobility, or the processes by which individuals change their location, have motivated researchers to adapt models to consider the spatio-temporal spread of disease [7]. One technique to capture spatial heterogeneity is called metapopulation modelling [6,8,31]. In these models, the population is broken down into subpopulations called patches. Individuals experience patch-specific local interactions as well as migration between patches. Patches are often considered to be separate spatial locations [8]. Deterministic metapopulation models of the spread of infectious disease are often concerned with the asymptotic behaviour of solutions in a neighbourhood of the disease-free equilibrium and those in a neighbourhood of an endemic equilibrium [4,5,8,9,27]. Another important topic for analysis is the question of persistence or extinction of the disease. In deterministic models, the basic reproduction number [19,45] is often used to formulate a sharp threshold for the persistence of the disease [18,23,28,39,47].
When considering the case of the emergence or reemergence of the disease, it becomes necessary to account for the effects of random fluctuations on the dynamics of the system. One technique to account for this stochasticity is to formulate a continuous time Markov chain (CTMC) model. In this setting, random fluctuations will always ultimately lead to the extinction of the disease. Keeling describes a stochastic population as persistent if extinction events are rare [31]. This can be measured by calculating the mean extinction time and comparing it to ecological time scales [30][31][32] or by calculating the probability of extinction. The probability of extinction is often impossible to calculate directly from the Markov chain, but it can sometimes be approximated by the extinction of a branching process [3,12,37,38]. In the case of a single infectious type, a branching process model takes the form of the classical Galton-Watson branching process (GWbp) [3,48,51]. In the case of multiple infectious types, the branching process takes the form of a multitype branching process [3,26,37,38]. Whether there is a single infectious type or multiple, we will generally refer to approximation via branching process techniques as the branching process approximation. However, the branching process approximation is not appropriate in all cases [38]. In any case, the probability of extinction is the probability of hitting the subset of the state space associated with the extinction of the disease.
CTMC models of biological processes typically assume that the time until the next transition is exponentially distributed. In this case, the hitting problem can be studied by first passing to the embedded discrete time Markov chain (DTMC). In this article, we present an approach to approximate the general hitting problem in a discrete time Markov chain called local approximation in time and space (LATS). We prove that the hitting probability calculated using LATS converges asymptotically to the true hitting probability in Section 3.3. We illustrate this approach via applications in Sections 4 and 5. The applications in Sections 4 and 5 were also chosen with additional aims in mind. Besides providing examples of two ways in which the LATS technique approximates the probability of extinction, the application in Section 4 illustrates that the LATS approximation is accurate at small population sizes (when branching process approximations fail) and that LATS provides insight into the most likely path to extinction or outbreak. In Section 5, the LATS technique is used to calculate an atypical hitting problem, illustrating its robustness as a method for approximating general, rather than specific, hitting problems.
Stochastic Susceptible-Infected-Susceptible (SIS) models have been a topic of study dating back to the work of Bailey [11]. While some authors have modelled stochasticity using stochastic differential equations [24], we will focus on SIS models in the form of Markov processes. West and Thompson [50] analysed continuous and discrete time SI models and Allen and Burgin [2] compared deterministic and stochastic SIS models. Epidemiological models in the form of stochastic SIS models in metapopulations have been studied in various ways [5,10]. In particular, several authors have considered optimal control [13,35,36]. To illustrate the utility and robustness of the LATS technique, we will formulate the optimal control problem as a hitting problem and analyse it using LATS.
In contrast with branching process approximation, the LATS approach can be used to answer any question about a stochastic metapopulation that can be posed as a hitting problem. For example, the probability of extinction in a single patch may provide important information about a metapopulation. Since patches in a metapopulation are typically coupled together by the migration of their residents, extinction in one patch is a transient event.
However, the probability of extinction in a single patch can be used as a patch-specific indicator of disease risk. To illustrate this, we consider a two-patch SIS model in which disease control that reduces the rate of infection by 20% can only be deployed in a single patch. The probability of extinction in a single patch is used to determine the optimal strategy for deployment. We also calculate the optimal strategy using various standard techniques, for comparison.
The article is organized as follows: we begin by recalling the necessary mathematical preliminaries in Section 2. The LATS approach is presented in Section 3. Section 4 illustrates an application of LATS to calculate the probability of the extinction of the disease in a single population where branching process techniques are not suitable. In Section 5, we formulate an optimal control problem in a metapopulation SIS model as a hitting problem to show that analysis via the LATS technique is perhaps more sensitive to the preferences of decision-makers than standard techniques such as the probability of total extinction and R 0 .

Mathematical background and preliminaries
The LATS technique, which is presented in Section 3, combines basic results from the theory of Markov chains, including collapsed Markov chains, and graph theory to formulate an approximation of a given CTMC. For applications to epidemic models, we will typically be concerned with CTMCs which have exponentially distributed waiting times. However, LATS can be applied more generally. In this section, we first recall existing definitions and results pertinent to the development of LATS. A more thorough introduction to the theory of Markov chains can be found in [1,20,22,29,33,40,41].

Embedded Markov chain
Let (X t ) t≥0 be a homogeneous CTMC on a countable state space S. We ignore the special case of an explosive process [1,40]. Let (W i ) i∈N be the collection of random jump times such that X t jumps to a new state at each time W i and remains in the state X W i for W i ≤ t < W i+1 . The random variables T i = W i+1 − W i represent the random amount of time X t spends in state X W i for each i and are called interevent times. The CTMC X t can be characterized by its transition probabilities for transition from state i to state j in S. The collection of all transition probabilities can be written as a (possibly infinite) matrix P(t) = (p ij (t)).
Let q_ii = −∑_{j≠i, j∈S} q_ij and define the generator matrix Q = (q_ij). Let q_i = −q_ii = ∑_{j≠i, j∈S} q_ij be the off-diagonal ith row sum of Q. The CTMC X_t admits a DTMC Y_n, known as the embedded Markov chain, with the property that Y_n = X_{W_n} for n = 0, 1, 2, …. Define
$$t_{ii} = 0 \quad \text{and} \quad t_{ij} = \frac{q_{ij}}{q_i} \quad \text{for } j \neq i$$
whenever q_i > 0 (states with q_i = 0 are absorbing and are assigned t_ii = 1). Then the matrix T = (t_ij) is the probability transition matrix of the embedded DTMC Y_n. It is equivalent to say that Y_n is formed by conditioning X_t on the fact that a transition has taken place and replacing the index of time with the index of the number of transitions.
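For readers who wish to experiment numerically, the passage from a generator matrix to the embedded DTMC can be sketched as follows; the three-state generator below is a toy example, not one taken from this article.

```python
import numpy as np

def embedded_dtmc(Q):
    """Embedded DTMC matrix: t_ij = q_ij / q_i for j != i and t_ii = 0;
    absorbing states (q_i = 0) are assigned t_ii = 1 by convention."""
    Q = np.asarray(Q, dtype=float)
    T = np.zeros_like(Q)
    for i, qi in enumerate(-np.diag(Q)):
        if qi > 0:
            T[i] = Q[i] / qi
            T[i, i] = 0.0
        else:
            T[i, i] = 1.0
    return T

# toy generator; state 2 is absorbing (q_2 = 0)
Q = np.array([[-3.0,  2.0, 1.0],
              [ 1.0, -1.0, 0.0],
              [ 0.0,  0.0, 0.0]])
T = embedded_dtmc(Q)
# rows of T sum to 1: [[0, 2/3, 1/3], [1, 0, 0], [0, 0, 1]]
```

Each row of T distributes the off-diagonal rates of Q proportionally, which is exactly the conditioning on a transition having taken place.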
Since we restrict our discussion to homogeneous CTMCs on a countable state space, any CTMC we consider admits an embedded DTMC. The notation Y_n and T for the embedded chain and its probability transition matrix can be found in [1] and is used to differentiate between the CTMC and its embedded chain, but is not the standard notation for a DTMC. For the discussions that follow we adopt the following more standard notation for DTMCs. Let X_n be a homogeneous discrete time Markov chain on a countable state space S. Let P be the probability transition matrix associated with X_n. For i, j ∈ S, the one-step transition probability from i to j is given by
$$p_{ij} = P(X_{n+1} = j \mid X_n = i).$$
The m-step transition probability from i to j is given by
$$p_{ij}^{(m)} = P(X_{n+m} = j \mid X_n = i).$$
Since X_n is a homogeneous DTMC, p_ij and p_ij^(m) are independent of n. Then (P)_ij = p_ij and (P^m)_ij = p_ij^(m). If there is an m > 0 such that p_ij^(m) > 0, then we say that j is reachable from i. If j is reachable from i and i is reachable from j, we say that i and j communicate. Communication is an equivalence relation which partitions S into equivalence classes called communication classes. We say a communication class C is closed if whenever i ∈ C and j is reachable from i, then j ∈ C. A state which is in a communication class which is not closed is called a transient state. The probability of first reaching state j from state i with respect to X_n is called a hitting probability. The following definitions of the m-step hitting probability and hitting probability come from Pinsky and Karlin [41], while the notation is closer to that of Norris [40], modified to suit our purposes. The probability of first hitting state j from state i in exactly m transitions with respect to X_n is an m-step hitting probability, given by
$$h_i^{(m)}(j) = P(X_m = j,\; X_k \neq j \text{ for } 1 \le k \le m-1 \mid X_0 = i).$$
The probability of first hitting j from i with respect to X_n is
$$h_i(j) = \sum_{m=1}^{\infty} h_i^{(m)}(j). \tag{4}$$

Graph theory
We now recall a few notions from graph theory [14,15,49]. A graph G is a set of vertices V(G) together with a set of edges E(G). The graph G is called a directed graph or a digraph if E(G) is a set of ordered pairs such that (u, v) ∈ E(G) whenever there is a directed edge from vertex u to vertex v. We say that u is the tail and v is the head of edge (u, v). The out-degree of a vertex v is the number of edges with tail v. The out-degree of the graph G is the supremum of the out-degrees taken over all the vertices. For our purposes, we will consider the case that V(G) is a countable set. In this case, V(G) can be enumerated as a sequence (v_i)_{i∈ℕ}. A u,v-path of length m is a sequence of vertices u = v_{i_0}, v_{i_1}, …, v_{i_m} = v such that (v_{i_{k−1}}, v_{i_k}) ∈ E(G) for k = 1, …, m. The graph distance between vertices u and v, denoted d(u, v), is defined as the length of the shortest u,v-path. We take the convention that d(v, v) = 0 and that d(u, v) = ∞ if there is no u,v-path. Since G is a directed graph, the graph distance is typically a quasi-metric (i.e. it meets all the conditions of a metric except symmetry). The ball of radius r in the graph G centred at the vertex v, denoted B(v, r), is the subset of vertices u such that d(v, u) ≤ r. Let us return our attention to the DTMC X_n on the countable state space S with probability transition matrix P = (p_ij). Let A be the adjacency matrix induced by P such that
$$a_{ij} = \begin{cases} 1, & p_{ij} > 0, \\ 0, & \text{otherwise}. \end{cases} \tag{5}$$
In this way, X_n induces a digraph, G, with states in the state space S as vertices. By the definition of A given in (5), if there is a positive probability of transitioning from i to j in X_n, then there is an edge (i, j) in G. Therefore, if p_ij^(m) > 0, then there is an i,j-path in G of length m. It follows that the distance in the induced graph is given by d(i, j) = inf{m ≥ 0 | p_ij^(m) > 0}.
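The ball B(v, r) in the graph induced by P can be computed by a breadth-first search to depth r; the four-state chain below is a toy example, not a model from this article.

```python
from collections import deque
import numpy as np

def ball(P, v, r):
    """B(v, r): states within directed graph distance <= r of v in the
    graph induced by P (edge i -> j iff p_ij > 0)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue                      # do not expand past depth r
        for w in np.flatnonzero(P[u] > 0):
            w = int(w)
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)

# toy 4-state chain: 0 <-> 1 <-> 2 -> 3 (state 3 absorbing)
P = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.3, 0.0, 0.7],
              [0.0, 0.0, 0.0, 1.0]])
# ball(P, 0, 1) -> {0, 1};  ball(P, 0, 2) -> {0, 1, 2}
```

Because the graph is directed, d(u, v) and d(v, u) may differ; the search above follows edges in the forward direction only, matching the quasi-metric convention.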

Finite absorbing Markov chains
Now let X_n be a homogeneous DTMC on a finite state space S with the properties that there is at least one closed communication class, there is at least one communication class which is not closed and all closed communication classes are singletons. This makes X_n an absorbing Markov chain. The closed communication classes of an absorbing Markov chain are called absorbing states. A more complete discussion of finite absorbing Markov chains can be found in Chapter 3 of [33]. Let P be the probability transition matrix of X_n. Suppose that |S| = m and S contains j < m absorbing states. By simultaneously exchanging rows and columns of P, we can put it in its canonical form
$$P = \begin{pmatrix} I & 0 \\ C & D \end{pmatrix}, \tag{6}$$
where I is the j × j identity matrix associated with the absorbing states, D is the (m − j) × (m − j) matrix of transition probabilities among the transient states, C is the (m − j) × j matrix of transition probabilities from transient states into absorbing states, and 0 is the j × (m − j) zero matrix.

Theorem 2.1 ([33]): Let X_n be a homogeneous DTMC on the state space S with |S| = m. Let the probability transition matrix of X_n be given by (6). Then I − D is nonsingular and, setting H = (I − D)^{-1}C, for i = j + 1, …, m and k = 1, …, j, the entry H_ik associated with i and k is the probability of hitting the absorbing state s_k from the transient state s_i.
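A minimal numerical sketch of Theorem 2.1, on a hypothetical four-state absorbing chain already in canonical form (absorbing states first):

```python
import numpy as np

# Toy absorbing chain: absorbing states {0, 1}, transient states {2, 3}.
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.3, 0.1, 0.2, 0.4],
              [0.0, 0.5, 0.4, 0.1]])
j = 2                               # number of absorbing states
C = P[j:, :j]                       # transient -> absorbing block
D = P[j:, j:]                       # transient -> transient block
H = np.linalg.solve(np.eye(len(D)) - D, C)
# H[i, k] = probability of absorption in state k starting from the
# transient state j + i; each row of H sums to 1.
```

Solving the linear system (I − D)H = C is preferable to forming the inverse explicitly, and scales to the larger state spaces produced by LATS.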

Collapsed chains
We now return to our discussion of the homogeneous DTMC X_n on the countable state space S with probability transition matrix P. Following Hachigian [25], consider the collapsed process Y_n = f(X_n), where f is a many-to-one function from S onto the state space of Y_n. Suppose that the states of Y_n are denoted by the countable sequence (S_α)_{α∈I}. For each α in the index set I, we say that the subset of states i of S given by f^{-1}(S_α) is collapsed into the single state S_α of the process Y_n. It is not generally the case that the process Y_n possesses the Markov property. It is natural to ask under what conditions the process resulting from collapsing a Markov process is again Markov. Several researchers have made contributions to this field [16,17,25,42,43]. The following result gives a sufficient condition in the case that X_n is a finite state space Markov chain. The result is given for non-homogeneous chains.

Theorem 2.2 ([17]): Let X_n, n ≥ 0, be a non-homogeneous Markov chain with any given initial distribution vector p and the state space S = {1, 2, …, m}. Let 1 ≤ r ≤ m, and let S_1, S_2, …, S_r be r pairwise disjoint subsets of S with S = ∪_{i=1}^r S_i. Let Y_n, n ≥ 0, be the collapsed chain defined by Y_n = i if and only if X_n ∈ S_i. Then a sufficient condition for Y_n to be Markov is that, for each n and each pair of indices i, j, the sum ∑_{l∈S_j} p_kl(n) takes a common value for all k ∈ S_i.
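For homogeneous chains, the condition of Theorem 2.2 (often called lumpability) can be checked numerically; the chain and partitions below are toy examples.

```python
import numpy as np

def is_lumpable(P, partition, tol=1e-12):
    """Check the lumpability condition for a homogeneous chain: for every
    pair of blocks (A, B), the row sum sum_{l in B} p_kl must be constant
    over k in A."""
    for A in partition:
        for B in partition:
            sums = [P[k, list(B)].sum() for k in A]
            if max(sums) - min(sums) > tol:
                return False
    return True

P = np.array([[0.2, 0.5, 0.3],
              [0.6, 0.1, 0.3],
              [0.6, 0.2, 0.2]])
# lumpable over {{0}, {1, 2}}: rows 1 and 2 each put 0.6 on block {0}
# and 0.4 on block {1, 2}; not lumpable over {{1}, {0, 2}}.
```

When the condition holds, the collapsed chain's transition probabilities are simply the common block row sums.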

Local approximation in time and space
In this section, we present a technique called LATS for studying certain features of homogeneous DTMCs, or of homogeneous CTMCs by first passing to the embedded DTMC. As such, the details will be presented in terms of a DTMC X_n on a countable state space S with probability transition matrix P. In broad strokes, LATS consists of first restricting to a subset of the state space S and modifying this subset to form a state space S_N(T), so that X_n induces a DTMC Z_n on S_N(T). We will show that Z_n faithfully approximates X_n locally in time and space. We then collapse Z_n to an absorbing DTMC Y_n on a collapsed state space. Analysis of the collapsed chain Y_n is more straightforward and less computationally expensive.

Neighbourhood of a set of states
The first step in the LATS technique is to form a subset of the state space S.
Definition 3.1: Let X_n be a homogeneous DTMC on a countable state space S. Let T ⊂ S and N ∈ ℕ. The N-neighbourhood of the set T with respect to X_n is defined as
$$\mathcal{N}_N(T) = \bigcup_{s \in T} B(s, N), \tag{7}$$
where B(s, N) is the ball of radius N centred at s in the graph G induced by X_n as described in Section 2.2.
Note that N_0(T) = T, since p_ii^(0) = 1 by convention. Let R_N(T) be the submatrix of the probability transition matrix P associated with the states in N_N(T). If T is a proper subset of S, then R_N(T) is typically a substochastic matrix. In order to extend R_N(T) to a stochastic matrix, we extend the N-neighbourhood by adjoining an additional absorbing state, g. Let δ_i(j) be the Dirac delta function with δ_i(j) = 1 when i = j and zero otherwise. Define the extended state space S_N(T) = N_N(T) ∪ {g} and the matrix Q_N(T) = (q_ij) on S_N(T) by
$$q_{ij} = p_{ij}, \qquad q_{ig} = 1 - \sum_{j \in \mathcal{N}_N(T)} p_{ij}, \qquad q_{gj} = \delta_g(j), \qquad i, j \in \mathcal{N}_N(T). \tag{8}$$

Q_N(T) is a stochastic matrix which can be written in block form as
$$Q_N(T) = \begin{pmatrix} R_N(T) & \mathbf{r} \\ \mathbf{0}^{\top} & 1 \end{pmatrix},$$
where the entries of the column vector r are the row deficits 1 − ∑_{j∈N_N(T)} p_ij.
Since Q_N(T) is stochastic, it induces a DTMC, Z_n, on the state space S_N(T). If the out-degree of the graph and the set of initial states T are both finite, then the state space S_N(T) is finite. The next result shows that for any initial distribution supported on states in T, Z_n approximates X_n locally in time.
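The construction of N_N(T) and Q_N(T) with the escape state g can be sketched as follows; the six-state random-walk chain at the end is a toy example.

```python
import numpy as np
from collections import deque

def lats_neighbourhood(P, T, N):
    """Return N_N(T) (BFS to depth N in the graph induced by P) and the
    matrix Q_N(T), whose last row/column is the absorbing escape state g
    collecting all probability mass leaving the neighbourhood."""
    dist = {s: 0 for s in T}
    queue = deque(T)
    while queue:
        u = queue.popleft()
        if dist[u] == N:
            continue
        for w in np.flatnonzero(P[u] > 0):
            w = int(w)
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    states = sorted(dist)                      # N_N(T)
    idx = {s: k for k, s in enumerate(states)}
    n = len(states)
    Q = np.zeros((n + 1, n + 1))
    for s in states:
        for w in np.flatnonzero(P[s] > 0):
            w = int(w)
            if w in idx:
                Q[idx[s], idx[w]] = P[s, w]
        Q[idx[s], n] = 1.0 - Q[idx[s], :n].sum()   # escape mass -> g
    Q[n, n] = 1.0                                   # g is absorbing
    return states, Q

# toy chain: states 0..5, interior random walk, ends absorbing
P = np.zeros((6, 6))
P[0, 0] = P[5, 5] = 1.0
for i in range(1, 5):
    P[i, i - 1] = P[i, i + 1] = 0.5
states, Q = lats_neighbourhood(P, [2], 1)
# states == [1, 2, 3]; states 1 and 3 each send probability 0.5 to g
```

Each row of Q_N(T) is stochastic by construction, so the function returns a valid transition matrix for Z_n.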
Theorem 3.1: Let P = (p_ij) be the probability transition matrix of the DTMC X_n on the countable state space S, let T ⊂ S and N ≥ 1. Suppose that N_N(T) and Q_N(T) = (q_ij) are as defined in (7) and (8). Let π be any initial distribution on S and let supp(π) be the support of π. If supp(π) ⊂ T, then there is a distribution π̃ on S_N(T) which is supported on supp(π) and equal to π there, and for n = 0, 1, …, N,
$$(\pi P^n)_j = (\tilde{\pi}\, Q_N(T)^n)_j \quad \text{for all } j \in \mathcal{N}_N(T).$$
Theorem 3.1 shows that the evolution of the distribution π in the chain X_n is completely captured by the evolution of the distribution π̃ in the chain Z_n for the first N transitions.

Collapsing Z n
The definition of S_N(T) is related to the graph G induced by the underlying Markov chain X_n by (7). If the set of initial states T and the out-degree of G are both finite, then S_N(T) is finite for each N. Nevertheless, |S_N(T)| may be quite large, making a direct analysis of Z_n difficult. In order to reduce the complexity of the mathematical analysis of Z_n, as well as the computational expense of calculations, the next step is to collapse Z_n to an absorbing DTMC Y_n. As noted in Section 2.4, Y_n = f(Z_n) is a collapsed chain for any many-to-one function f from the state space of Z_n to the state space of Y_n. However, not all collapsed chains are themselves Markov. We recalled the definition of a closed communication class above. Now we make a more general definition of a closed subset of the state space.

Definition 3.3: Let X_n be a homogeneous DTMC on the finite state space S and let C ⊂ S. We say that C is closed with respect to communication if, whenever i ∈ C and j ∈ S \ C, then p_ij = 0.
As noted above, we seek a function f so that Y_n = f(Z_n) is an absorbing Markov chain. Absorbing states in a collapsed chain are either the result of the identity map on an absorbing state in the original chain or the result of collapsing a subset of the state space which is closed with respect to communication. Theorem 3.2 shows that collapsing a closed set results in a Markov chain.

Theorem 3.2: Let X_n be a homogeneous DTMC on a countable state space S. Let C ⊂ S be a proper subset of the state space which is closed with respect to communication. Let f be the many-to-one function given by f(i) = i for i ∈ S \ C and f(i) = c for i ∈ C, where c is a single new state. Then Y_n = f(X_n) is a homogeneous Markov chain on (S \ C) ∪ {c}, and c is an absorbing state of Y_n.

To form the state space S_N(T) of the chain Z_n, we eliminated the complement of the N-neighbourhood N_N(T) in the state space S and replaced it with an absorbing state g. In light of Theorem 3.2, we see that the escape state g can be formed, for example, by making all states in the complement of N_N(T) absorbing and collapsing the resulting closed (possibly infinite) set.
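Collapsing a finite set which is closed with respect to communication, as in Theorem 3.2, can be sketched as follows; the four-state chain is a toy example.

```python
import numpy as np

def collapse_closed(P, closed):
    """Collapse a set `closed` that is closed w.r.t. communication into a
    single state, placed last; all other states keep their relative order."""
    keep = [i for i in range(len(P)) if i not in closed]
    n = len(keep)
    Pc = np.zeros((n + 1, n + 1))
    for a, i in enumerate(keep):
        for b, j in enumerate(keep):
            Pc[a, b] = P[i, j]
        Pc[a, n] = P[i, list(closed)].sum()   # mass into the collapsed state
    Pc[n, n] = 1.0                            # closed set => absorbing state
    return Pc

# toy chain in which {2, 3} is closed with respect to communication
P = np.array([[0.5, 0.2, 0.3,  0.0],
              [0.1, 0.4, 0.25, 0.25],
              [0.0, 0.0, 0.6,  0.4],
              [0.0, 0.0, 0.5,  0.5]])
Pc = collapse_closed(P, {2, 3})
# Pc is 3 x 3; the collapsed state receives 0.3 from state 0, 0.5 from state 1
```

Because no probability leaves the closed set, the collapsed state is absorbing and, by Theorem 3.2, the collapsed process is again Markov.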

Corollary 3.3: Let X_n be a homogeneous DTMC on a countable state space S. Suppose that S_1, S_2, …, S_r are pairwise disjoint subsets of S, each closed with respect to communication, and let Y_n be the chain formed by collapsing each S_i to a single state i. Then, for any state x ∈ S \ ∪_{i=1}^r S_i,
$$h^{(x;X)}(S_i) = h^{(x;Y)}(i),$$
where h^{(x;X)}(S_i) and h^{(x;Y)}(i) are hitting probabilities as in (4).
To approximate the behaviour of a homogeneous DTMC near a set of initial states T ⊂ S, we first form the DTMC Z n on a local state space S N (T) either by adjoining an absorbing state to the N-neighbourhood, N N (T), described by (7) or by making the contrivance that S \ N N (T) is closed with respect to communication and collapsing it. Theorem 3.1 shows that Z n approximates the behaviour of X n for N transitions on the condition that X 0 = Z 0 ∈ T. If we are not concerned with the behaviour of X n once it enters a closed communication class, we may further simplify the approximation by forming the collapsed chain Y n = f (Z n ), where f identifies closed communication classes with absorbing states and f is the identity map on transient states. In the next section, we show how LATS can be used to answer the question: what is the probability of first hitting a particular subset of the state space from a particular state in the chain X n ?

LATS approximation of hitting probabilities
Let X_n be a homogeneous DTMC on a countable state space S. In this section, we describe how LATS can be used to approximate the general hitting problem with respect to X_n. By the general hitting problem, we mean the problem of determining the probability h_x(V) of first hitting a subset V ⊂ S from a state x ∈ S. If S is finite but very large, or S is countably infinite, then the general hitting problem can be difficult or even impossible to solve by direct analysis of X_n. For x, y ∈ S, the probability of first hitting y from x, given by h_x(y), can be reformulated in terms of x,y-paths in the induced graph G. By a path of length m which first hits y from x in the graph G we mean an x,y-path of length m, as in Section 2.2, with v_{i_0} = x, v_{i_m} = y and v_{i_k} ≠ y for k = 1, 2, …, m − 1. Let (γ_i(x, y))_{i∈I_m} be the family of such paths with index set I_m. For i ∈ I_m, let p_{γ_i} be the probability of path γ_i. Then Equation (4) can be rewritten as
$$h_x(y) = \sum_{m=1}^{\infty} \sum_{i \in I_m} p_{\gamma_i}.$$
In order to describe how to use LATS to approximate the general hitting problem, we make the following assumption.

(A1) The set of initial states T ⊂ S is finite, the graph induced by X_n has finite out-degree, and the target set V ⊂ S is identified with a single absorbing state y in the LATS collapsed chain Y_n.

Proposition 3.4: Assume (A1) and suppose that x ∈ T is such that h^{(x;Y)}(y) > 0. Then x is a transient state of Y_n.

Proof: If h^{(x;Y)}(y) > 0, then there is an x,y-path in the graph induced by Y_n. Since there is a path from x to y, y is reachable from x. Since y is an absorbing state, the singleton {y} is a closed communication class. Therefore, x is not reachable from y. Hence, the communication class of x is not closed and x is transient.

Theorem 3.5: Assume (A1) and let x ∈ T. Then
$$\left| h^{(x;X)}(V) - h^{(x;Y)}(y) \right| \le h^{(x;Y)}(g),$$
and h^{(x;Y)}(g) is non-increasing as a function of N. If N_N(T) increases to S with N, then lim_{N→∞} h^{(x;Y)}(g) = 0 and hence lim_{N→∞} h^{(x;Y)}(y) = h^{(x;X)}(V).

Proof (sketch): In the case γ_i ⊄ N_N(T), there is a vertex of γ_i which is an element of S \ N_N(T), and γ_i corresponds to an x,g-path in the collapsed chain Y_n. The total probability of such paths is therefore at most h^{(x;Y)}(g). Since N_N(T) is non-decreasing in N, the family of paths contained in N_N(T) is non-decreasing, and h^{(x;Y)}(g) is non-increasing. The final claim follows from the fact that every path which first hits V from x is eventually contained in N_N(T) if lim_{N→∞} N_N(T) = S.

Theorem 3.5 proves that h^{(x;Y)}(y) approximates h^{(x;X)}(V) and that h^{(x;Y)}(g) is an upper bound on the error of the approximation. To show that the error goes to zero requires the additional assumption that N_N(T) increases to S as N increases to infinity. This seems like a very strong assumption. However, biological phenomena are typically modelled by generalized birth-death processes, which satisfy this assumption. In these and other cases for which the assumption holds, the proof only shows that the error goes to zero asymptotically. However, we will see that in applications the approximation can be very accurate for N < 30.

Corollary 3.6: Assume (A1). For x ∈ T and m ≤ N, the m-step hitting probabilities with respect to X_n and Y_n coincide:
$$h_m^{(x;X)}(V) = h_m^{(x;Y)}(y).$$

Proof: Let (γ_i)_{i∈I_k} be the collection of all paths of length k which first hit V from x in the graph induced by X_n, and let p_{γ_i} be the probability of path γ_i. If m ≤ N, then every path of length m which first hits V from x is contained in N_N(T) and corresponds to a path which first hits y from x in the graph induced by Y_n, with the same probability. Summing over I_m gives the result.

This is a powerful result for practical applications. Since I − D is a nonsingular M-matrix, to find a hitting probability using the LATS method we can use either Q_N(T)^n for sufficiently large n or (I − D)^{-1}C. We will exhibit the practicality of this technique and the nature of the convergence for a particular DTMC by way of example in the next section.
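The two computational routes, powers of the transition matrix versus the fundamental-matrix formula, can be compared on a toy absorbing chain in canonical form:

```python
import numpy as np

# Toy absorbing chain in canonical form: absorbing states {0, 1} first,
# transient states {2, 3} after.
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.2, 0.1, 0.4, 0.3],
              [0.1, 0.3, 0.3, 0.3]])
j = 2
C, D = P[j:, :j], P[j:, j:]

H = np.linalg.solve(np.eye(2) - D, C)   # exact limiting hitting probabilities
Pn = np.linalg.matrix_power(P, 200)     # large power of the transition matrix
# Pn[j:, :j] agrees with H to machine precision, since the spectral
# radius of D is strictly less than 1
```

For large state spaces, a single linear solve is typically far cheaper than repeated matrix powers, which is the point made above.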

LATS vs. branching process techniques
Infectious Salmon Anemia virus (ISAv) causes Infectious Salmon Anemia (ISA) in a variety of finfish including Atlantic salmon (Salmo salar). ISA causes high mortality [21] and affects all major salmon producing countries [46]. The disease is important to the salmon culture industry and has caused significant economic losses according to the Global Aquaculture Alliance. Deterministic models of an ISAv outbreak in one and two patches are proposed and analysed in [39]. The one-patch model is a system of ordinary differential equations in which S denotes susceptible fish, I denotes infected fish and V denotes free-virus in the environment. Susceptible fish experience logistic population growth with birth and death rates β and μ, respectively. The rates of infection due to contact with infected individuals and contact with free-virus are given by η and ρ, respectively. The rate of mortality of infected fish is given by α and the rate at which infected fish shed virus particles into the environment is δ. Free-virus denatures and is cleared from the environment at rate ω. By scaling the system, we may eliminate η and ρ. This yields the reduced system (15), in which the state variable V represents the number of infectious doses of free-virus in the environment.
The basic reproduction number is shown to be a critical parameter, with the sharp threshold R_0 = 1 separating persistence of the disease (R_0 > 1) from its extinction (R_0 ≤ 1). The emergence of the disease is characterized by low prevalence of free-virus in the environment and few infected individuals. This setting demands the use of stochastic modelling techniques to account for inherent random fluctuations. Continuous time Markov chain companion models are developed and analysed in [38]. The CTMC model of a single patch is given by X_t = (S_t, I_t, V_t) with associated infinitesimal transition probabilities and transition rates σ(i, j) given in Table 1.
For deterministic epidemic models, it is common for R 0 > 1 to be the invasion criterion that implies persistence. In that case, the disease persists with certainty for all time. In the case R 0 < 1, the disease goes extinct with certainty. For stochastic epidemic models, there is always a chance that the disease will go extinct. We say that the disease persists if the event that the disease goes extinct is rare. That is, the probability of extinction is close to zero. The probability of extinction is the probability of hitting the subset of the state space associated with extinction starting from an initial state in its complement. If the initial state is near the disease-free quasistationary distribution and certain other assumptions hold, then the probability of extinction can be approximated by a branching process [3,26,37]. In this case, R 0 is shown once again to be a critical parameter, where R 0 ≤ 1 implies almost sure extinction and R 0 > 1 implies that the probability of extinction is less than 1 [5].
The additional assumptions are that all transitions are independent and that the initial susceptible population is sufficiently large. Biologically speaking, assuming that all transitions are independent is a strong assumption, but one that is commonly made in the formulation of mathematical models. Since I and V are distinct types of infectious individuals, X_t is approximated using a multitype branching process. The offspring probability generating function (pgf) for the multitype branching process is a function F = (F_1, F_2) of two variables in which S̄ = β/μ is the population size at the disease-free distribution of X_t. It is easy to verify that F(1, 1) = (1, 1). If R_0 > 1, then there exists a unique (q_1, q_2) in [0, 1) × [0, 1) such that F(q_1, q_2) = (q_1, q_2). In this case, the probability of the extinction of all infectious types, given that there are initially i individuals of type I and j individuals of type V, is given by q_1^i q_2^j [3,26]. The branching process approximation of X_t is a linearization and therefore does not capture the nonlinear behaviour of X_t. This is the reason that the branching process approximation diverges from the Monte Carlo approximation in Figure 1. For the purpose of this illustration, we assume that μ = 1, α = 3.3, δ = 1.3, ω = 4 are fixed and allow β to vary. We assume that there is initially one individual of type I and no individuals of type V. Therefore, the branching process probability of extinction is given by q_1, which is a continuous function of the parameters. Monte Carlo simulations are performed at 10 unit increments of β from β = 10 to β = 50. At β = 10, the branching process approximation of the probability of extinction is 0.2953, while Monte Carlo simulation gives 0.3893.
Using LATS to approximate the probability of extinction faithfully captures the behaviour of X_t near the disease-free distribution, including any nonlinear behaviour. As a result, the LATS approximation in Figure 1 remains accurate as β is varied. The fact that LATS detects nonlinear behaviour is one of the benefits of this technique. Since X_t is a non-explosive process, for both branching process approximation and LATS, we first pass to the embedded DTMC, which we denote X_n. We say an outbreak has occurred in X_n when I_n + V_n ≥ Ī + V̄, where Ī, V̄ are the numbers of infected individuals and infectious doses of free-virus at the outbreak quasistationary distribution (equivalent to the endemic equilibrium of the deterministic model (15)). We say extinction has occurred when I_n + V_n = 0. In Assumption (A1), above, we consider the set of initial states T and seek to approximate the probability of hitting a single set V. In this case, we let T be the singleton {x_0}. There is no x_0,g-path of length N or less in the graph induced by Y_n; therefore, (Q_N(x_0)^N)_{x_0 g} = 0. Table 2 illustrates the convergence of the LATS approximations as N increases. Theorem 3.5 shows that the LATS collapsed chain Y_n approximates the probabilities of extinction and outbreak in another way, as well. Since the LATS collapsed chain Y_n is a finite absorbing chain, by Theorem 2.1, Q_N(x_0) has canonical form given by (6) and h^{(x_0;Y)}(j) = (H)_{x_0 j} for j = e, o or g, where H = (I − D)^{-1}C. In applications, approximating hitting probabilities with respect to X_n using the LATS collapsed chain Y_n via entries of the matrix H may converge faster and with less computational expense than approximating via the Nth power of the matrix Q_N(x_0). This is illustrated in Table 3.
The results in Table 3 illustrate that for N = 25, the probability that realizations with the initial state x_0 in the LATS collapsed chain Y_n leave the N-neighbourhood before hitting the absorbing states associated with extinction or outbreak is so small as to be negligible. Therefore, the LATS approximation Y_n is an accurate representation of the embedded DTMC X_n near the initial state x_0 for N ≥ 25. Furthermore, in Table 3, the probability of extinction converges faster (for smaller N) than the probability of outbreak. This means that realizations with the initial state x_0 that lead to extinction are more likely to remain close to x_0 than those that lead to outbreak. This is reasonable, since the initial state x_0 is one transition in X_n from the subspace E ⊂ S associated with extinction, as is evident from the first row of Table 2. Nevertheless, the fact that LATS approximation gives insight into the behaviour of X_n at such fine resolution is another benefit of the technique.

[Table 3 caption: probabilities of hitting the states of S_N(T), the state space of the LATS collapsed chain Y_n, associated with extinction, outbreak, and escape from the N-neighbourhood, respectively. The probability of escape does not actually reach 0, but becomes so small as to be negligible for the purpose of numerical calculations. Parameter values are the same as those used in Table 2.]
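The full LATS workflow of this section can be sketched end-to-end on a simpler toy model: the embedded chain of a linear birth-death process with hypothetical rates λ and μ, standing in for the infected count (this is not the ISAv model). Extinction corresponds to hitting 0 and outbreak to hitting a hypothetical threshold K, and the result can be checked against the classical gambler's-ruin formula.

```python
import numpy as np

# Embedded chain of a linear birth-death process: from state i >= 1 the
# chain moves to i+1 with probability p and to i-1 with probability 1-p,
# where p = lam / (lam + mu).
lam, mu, K = 2.0, 1.0, 10
p = lam / (lam + mu)

# With T = {1} and N >= K, the region {0, ..., K} lies inside the
# N-neighbourhood, so the escape state g is unreachable and the LATS
# collapsed chain has transient states 1..K-1 and absorbing states
# e (extinction, i = 0) and o (outbreak, i >= K collapsed).
n = K - 1
D = np.zeros((n, n))               # transient -> transient
C = np.zeros((n, 2))               # transient -> [e, o]
for s in range(1, K):
    if s > 1:
        D[s - 1, s - 2] = 1 - p    # down-step stays transient
    else:
        C[s - 1, 0] = 1 - p        # 1 -> 0 is extinction
    if s < K - 1:
        D[s - 1, s] = p            # up-step stays transient
    else:
        C[s - 1, 1] = p            # (K-1) -> K is outbreak
H = np.linalg.solve(np.eye(n) - D, C)
h_ext = H[0, 0]                    # P(extinction before outbreak | X_0 = 1)
```

For these parameters the gambler's-ruin formula gives h_ext = (r − r^K)/(1 − r^K) with r = μ/λ, which the computed value matches to machine precision; letting K grow recovers the branching-process extinction probability μ/λ.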

Locally deployed disease control
In this section, we consider two small communities coupled together by migration. This metapopulation experiences a disease outbreak in which infected individuals become fully susceptible immediately upon recovery. First, a deterministic SIS model in the form of a system of nonlinear ODEs is proposed and analysed, then a related CTMC is developed. Consider the following hypothetical: suppose that both patches are at risk of outbreak in the sense that their patch type reproduction numbers are greater than 1, with patch 2 at higher risk than patch 1. Suppose that patch 1 has the resources to deploy disease control that reduces the rate of infection by 20%, but the control can only be deployed in one patch. We consider the strategies of deployment in patch 1 only or in patch 2 only, and show which strategy minimizes the basic reproduction number for a particular choice of parameters. LATS is used to determine the optimal strategy for control in the CTMC, taking into account outcome preferences of decision makers. First we consider the following deterministic two-patch SIS model:
$$\dot{S}_i = -\beta_i \frac{S_i I_i}{N_i} + \gamma_i I_i + d(S_j - S_i), \qquad \dot{I}_i = \beta_i \frac{S_i I_i}{N_i} - \gamma_i I_i + d(I_j - I_i), \tag{19}$$
for (i, j) ∈ {(1, 2), (2, 1)}, where, in patch i, S_i is the number of susceptible individuals, I_i is the number of infected individuals, N_i = S_i + I_i is the total number of individuals, β_i is the rate of infection and γ_i is the rate of recovery for i = 1, 2. The rate of migration between patches is d and N = N_1 + N_2 is the total number of individuals in the system. This model corresponds to the model studied in [37] with n = 2, α_1 = α_2 = 0 and d_12 = d_21 = d. When I_1 = I_2 = 0, the system admits the disease-free steady state S_1 = S_2 = N/2. Following [37], the basic reproduction number is
$$R_0 = \rho(FV^{-1}), \qquad F = \begin{pmatrix} \beta_1 & 0 \\ 0 & \beta_2 \end{pmatrix}, \qquad V = \begin{pmatrix} \gamma_1 + d & -d \\ -d & \gamma_2 + d \end{pmatrix},$$
where ρ denotes the spectral radius. The patch type reproduction number is given by R_0i = β_i/γ_i for i = 1, 2 and corresponds to the basic reproduction number in each patch in the absence of migration. Suppose that β_1 = 0.4, β_2 = 0.5, γ_1 = 0.2, γ_2 = 0.1, and d = 0.1. Then R_01 = 2, R_02 = 5 and R_0 ≈ 3.4358.
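As a numerical check, the next-generation computation can be sketched directly; the block below assumes the standard next-generation form for this model, F = diag(β_1, β_2) and V = [[γ_1 + d, −d], [−d, γ_2 + d]], and reproduces the quoted value R_0 ≈ 3.4358.

```python
import numpy as np

beta = np.array([0.4, 0.5])
gamma = np.array([0.2, 0.1])
d = 0.1

# next-generation matrices (assumed standard form, linearized at the DFE)
F = np.diag(beta)
V = np.array([[gamma[0] + d, -d],
              [-d, gamma[1] + d]])
R0 = max(abs(np.linalg.eigvals(F @ np.linalg.inv(V))))

R01, R02 = beta / gamma            # patch type reproduction numbers
# R01 = 2.0, R02 = 5.0, R0 ≈ 3.4358
```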
Now we modify system (19) to include parameters θ 1 and θ 2 related to the control.
The new system is given by

Ṡ i = −θ i β i S i I i /N i + γ i I i + d(S j − S i ),
İ i = θ i β i S i I i /N i − γ i I i + d(I j − I i ), i, j = 1, 2, i ≠ j,

which corresponds to system (19) when θ 1 = θ 2 = 1. The strategy of deploying disease control in patch 1 corresponds to (θ 1 , θ 2 ) = (0.8, 1) and the complementary strategy corresponds to (θ 1 , θ 2 ) = (1, 0.8). The effect these strategies have on R 0 and R 0i for i = 1,2 is found in Table 4. Clearly, the strategy of deploying control in patch 2 minimizes R 0 as well as the sum of the patch type reproduction numbers, R 01 + R 02 . Regardless of which strategy is chosen, R 0 > 1 and the disease persists [27]. If d = 0, then the system decouples and the basic reproduction number in a single patch is given by the patch type reproduction number R 0i = β i /γ i for i = 1,2. For N i sufficiently large, the probability of extinction following the introduction of a single infected individual is given by the GWbp approximation q = 1/R 0i = γ i /β i , a decreasing function of R 0i [3,51].
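The two strategies can be compared by repeating the next-generation-matrix computation with θ i scaling the infection rates. Again a sketch under the same assumptions as above, not the authors' code.

```python
import numpy as np

# R_0 of the controlled system: theta_i scales the infection rate beta_i in
# patch i, so F = diag(theta_i * beta_i) while V is unchanged.
def controlled_R0(theta1, theta2, beta=(0.4, 0.5), gamma=(0.2, 0.1), d=0.1):
    F = np.diag([theta1 * beta[0], theta2 * beta[1]])
    V = np.array([[gamma[0] + d, -d],
                  [-d, gamma[1] + d]])
    return max(np.linalg.eigvals(F @ np.linalg.inv(V)).real)

r_none = controlled_R0(1.0, 1.0)  # no control, ~3.44
r_p1 = controlled_R0(0.8, 1.0)    # deploy control in patch 1, ~3.31
r_p2 = controlled_R0(1.0, 0.8)    # deploy control in patch 2, ~2.89
# Deployment in patch 2 minimizes R_0, yet R_0 > 1 under both strategies.
```

This reproduces the qualitative conclusion of Table 4: deployment in patch 2 yields the smaller R 0 , but neither strategy drives R 0 below 1.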
In order to investigate what happens at small population sizes, we consider the CTMC X t = (S 1 (t), I 1 (t), S 2 (t), I 2 (t)) with the infinitesimal transition probabilities p ij (Δt) = P(X t+Δt = j | X t = i) = σ(i, j)Δt + o(Δt) and associated rates σ(i, j) found in Table 5.
Consider the two-patch SIS CTMC, X t , with the parameters β 1 = 0.4, β 2 = 0.5, γ 1 = 0.2, γ 2 = 0.1, d = 0 and control strategy (θ 1 , θ 2 ). Since d = 0, the two patches are decoupled and Ṅ i = 0 for i = 1,2. Suppose there are initially 10 individuals in patch 1 and 5 individuals in patch 2. Using LATS, we determine the probability of extinction P 0i in patch i from the initial state (9, 1) in patch 1 and (4, 1) in patch 2, with and without control. As noted above, Whittle showed that in this case the GWbp approximation of the probability of extinction in patch i is the reciprocal of the patch reproduction number [51].
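The single-patch calculation reduces to a hitting problem for a birth-death chain in the number of infected individuals I. The sketch below is illustrative only: it assumes the standard SIS event rates (infection at rate βI(n − I)/n, recovery at rate γI), and, as a stand-in for the escape boundary of the LATS neighbourhood, it absorbs at I = n as well as at I = 0.

```python
import numpy as np

# Probability of hitting I = 0 (extinction) before I = n, for the embedded
# DTMC of a single-patch SIS birth-death chain, starting from i0 infecteds.
def p_extinction(beta, gamma, n, i0=1):
    lam = lambda i: beta * i * (n - i) / n   # infection (birth) rate
    mu = lambda i: gamma * i                 # recovery (death) rate
    # Linear system over interior states 1..n-1; boundary values h(0) = 1,
    # h(n) = 0 are folded into the right-hand side.
    A, b = np.eye(n - 1), np.zeros(n - 1)
    for k, i in enumerate(range(1, n)):
        up = lam(i) / (lam(i) + mu(i))       # embedded DTMC step probability
        down = 1.0 - up
        if i + 1 < n:
            A[k, k + 1] = -up
        if i - 1 > 0:
            A[k, k - 1] = -down
        else:
            b[k] = down                      # stepping to I = 0 contributes h = 1
    return np.linalg.solve(A, b)[i0 - 1]

p1 = p_extinction(0.4, 0.2, 10)  # patch 1; GWbp approximation gives 1/R01 = 0.5
p2 = p_extinction(0.5, 0.1, 5)   # patch 2; GWbp approximation gives 1/R02 = 0.2
```

At these small population sizes the hitting probabilities differ noticeably from the GWbp values, but the ordering p1 > p2 agrees with the extinction probability being decreasing in the patch reproduction number.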
The results of the LATS approximation of the probability of extinction are reported alongside the patch reproduction numbers and their reciprocals in Table 6. The results of the experiments reported in Table 6 are consistent with the hypothesis that the probability of extinction in a single patch is decreasing in R 0 for small populations as well. Now let d = 0.1 so that the two patches form a metapopulation. Let us return to the hypothetical in which the residents of patch 1 have resources available to deploy a control strategy that reduces the rate of infection in a single patch. The residents of patch 1 wish to deploy the control in the patch that yields the greatest benefit to themselves. To account for this preference, we want to calculate the probability of extinction in patch 1 only from the initial state x 0 = (10, 1, 1, 3). This is the state of the system if patch 1 is at its single patch quasistationary disease-free state (10, 0), patch 2 is at its single patch quasistationary endemic state (1, 4) and a single infected individual migrates from endemic patch 2 to patch 1 as the rate of movement between patches, d, becomes positive. Recall that the state space S of X t is the subset S = {(S 1 , I 1 , S 2 , I 2 )} ⊂ N 4 with the property that S 1 + I 1 + S 2 + I 2 = S 1 (0) + I 1 (0) + S 2 (0) + I 2 (0), i.e. that the total population is preserved. Let A = {(S 1 , I 1 , S 2 , I 2 ) ∈ S : I 1 = 0} be the subset of S associated with the extinction of the disease in patch 1 only and let E = {(S 1 , I 1 , S 2 , I 2 ) ∈ S : I 1 = I 2 = 0} be the subset of S associated with total extinction of the disease. Then E ⊂ A. Define the probability of partial extinction in patch 1 to be the probability of hitting the set A. Since d > 0, the event X t = a for a ∈ A \ E is transient. However, we take it as an indicator of the disease risk in patch 1. Let X n be the embedded DTMC.
We use LATS as described in (A1) with V = A to approximate h (x 0 ;X) (A) and again with V = E to approximate h (x 0 ;X) (E). Results in Table 7 show that the strategy (θ 1 , θ 2 ) = (0.8, 1), corresponding to deployment in patch 1, maximizes the probability of partial extinction in patch 1, h (x 0 ;X) (A). In contrast, the probability of total extinction, h (x 0 ;X) (E), indicates that deployment in patch 2 maximizes the probability of total extinction of the disease.
Table 7. Probability of partial extinction in patch 1 only and total extinction from initial state x 0 = (10, 1, 1, 3).
Table 8. The probability of extinction from initial states x 1 = (7, 1, 7, 0) and x 2 = (7, 0, 7, 1).
These are conflicting results. In order to make sense of them, we consider the classical setting in which a single infected individual appears in one patch or the other and calculate the probability of total extinction. The disease-free quasistationary distribution of the two-patch model is (7.5, 0, 7.5, 0), which is not a state of the system. Therefore, we expect X t to fluctuate between the states (8, 0, 7, 0) and (7, 0, 8, 0) when it is near the quasistationary distribution. We therefore consider the probability of extinction from the initial states x 1 = (7, 1, 7, 0) and x 2 = (7, 0, 7, 1). These probabilities are reported in Table 8.
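The whole procedure can be sketched as a miniature LATS: collect the N-neighbourhood of x 0 in the transition graph, make the target set and the boundary absorbing, and solve the linear system for the hitting probability of the embedded DTMC. The event rates below are our assumptions in the spirit of Table 5 (which is not reproduced here): infection θ i β i S i I i /N i , recovery γ i I i , and per-capita migration at rate d. The code is an illustration of the technique, not the authors' implementation, and its output is not claimed to match Tables 7 and 8.

```python
import numpy as np
from collections import deque

beta, gamma, d = (0.4, 0.5), (0.2, 0.1), 0.1

def transitions(x, theta):
    S1, I1, S2, I2 = x
    t = []
    if S1 > 0 and I1 > 0:  # infection in patch 1
        t.append(((S1 - 1, I1 + 1, S2, I2), theta[0] * beta[0] * S1 * I1 / (S1 + I1)))
    if S2 > 0 and I2 > 0:  # infection in patch 2
        t.append(((S1, I1, S2 - 1, I2 + 1), theta[1] * beta[1] * S2 * I2 / (S2 + I2)))
    if I1 > 0:             # recovery in patch 1
        t.append(((S1 + 1, I1 - 1, S2, I2), gamma[0] * I1))
    if I2 > 0:             # recovery in patch 2
        t.append(((S1, I1, S2 + 1, I2 - 1), gamma[1] * I2))
    if S1 > 0:             # susceptible migration 1 -> 2
        t.append(((S1 - 1, I1, S2 + 1, I2), d * S1))
    if S2 > 0:             # susceptible migration 2 -> 1
        t.append(((S1 + 1, I1, S2 - 1, I2), d * S2))
    if I1 > 0:             # infected migration 1 -> 2
        t.append(((S1, I1 - 1, S2, I2 + 1), d * I1))
    if I2 > 0:             # infected migration 2 -> 1
        t.append(((S1, I1 + 1, S2, I2 - 1), d * I2))
    return t

def lats_hit(x0, in_target, theta, N=25):
    # BFS collects the N-neighbourhood of x0; target and boundary states
    # are absorbing, as in the collapsed chain Y_n.
    dist, q = {x0: 0}, deque([x0])
    while q:
        x = q.popleft()
        if in_target(x) or dist[x] == N:
            continue
        for y, _ in transitions(x, theta):
            if y not in dist:
                dist[y] = dist[x] + 1
                q.append(y)
    idx = {x: k for k, x in enumerate(dist)}
    A, b = np.eye(len(idx)), np.zeros(len(idx))
    for x, k in idx.items():
        if in_target(x):
            b[k] = 1.0              # hit the target set: h = 1
        elif dist[x] < N:           # interior state of the embedded DTMC
            trans = transitions(x, theta)
            total = sum(r for _, r in trans)
            for y, r in trans:
                A[k, idx[y]] -= r / total
        # boundary states (dist == N) escape the neighbourhood: h = 0
    return np.linalg.solve(A, b)[idx[x0]]

x0 = (10, 1, 1, 3)
in_A = lambda x: x[1] == 0                  # partial extinction in patch 1
in_E = lambda x: x[1] == 0 and x[3] == 0    # total extinction
hA = lats_hit(x0, in_A, theta=(0.8, 1.0))
hE = lats_hit(x0, in_E, theta=(0.8, 1.0))
```

Since E ⊂ A, every path absorbed in E before escape is also absorbed in A, so hE ≤ hA by construction; changing `in_target` is all that is needed to pass from total to partial extinction, which is what makes LATS convenient for general hitting problems.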
The results in Table 8 provide insight into the effect of the different control strategies. First, disease control measures deployed in a single patch promote disease extinction in both patches. The probability of partial extinction in patch 1, h (x 0 ;X) (A), reported in Table 7, and h (x 1 ;X) (E), reported in Table 8, reflect a bias towards patch 1, while h (x 2 ;X) (E) reflects a bias towards patch 2. From the perspective of a given patch, the optimal outcome results from deploying control measures in that patch, which may or may not coincide with minimizing R 0 . The product of the columns h (x 1 ;X) (E) and h (x 2 ;X) (E) can be viewed as giving a global perspective. From this perspective, it is still optimal to deploy control measures in a way that also minimizes R 0 . Partial extinction probabilities, like h (x 0 ;X) (A), may therefore be useful tools for decision-making in the context of heterogeneous control in metapopulations. Moreover, h (x 0 ;X) (A) cannot be calculated using a branching process approximation. The fact that we are able to calculate this probability using LATS illustrates another benefit of the technique, namely, that it is suited to approximating hitting problems in general.

Discussion
The LATS technique combines localization and collapsing of Markov chains to reduce a DTMC on a countably infinite state space to a finite absorbing Markov chain. It is a robust tool for approximating general hitting probabilities for typical chains when the local neighbourhood is sufficiently large. However, enlarging the neighbourhood increases the computational expense, particularly if the degree of the induced graph is large. LATS is shown to be effective in some cases where branching process techniques are not suitable. This is due, in part, to the fact that LATS captures the nonlinear behaviour of the Markov chain it seeks to approximate. In addition to solving the general hitting problem, LATS also gives information about the number of transitions in a path as well as the likelihood of long sojourns. It is therefore a valuable addition to the set of tools used to analyse stochastic models of real-world phenomena.
In Section 5, partial extinction probabilities are shown to have the potential to help inform deployment of spatially heterogeneous disease control strategies. There are likely other interesting biological questions that can be formulated as general hitting problems. LATS is well suited to provide approximations for these problems.
In the context of heterogeneous control strategies, outcome preferences are important drivers of decision-making. Standard techniques for calculating the probability of extinction may not be the ideal way to account for this. Formulating hitting problems that account for preference, such as partial extinction probabilities, may be useful for optimizing decision-making subject to constraints.