The large graph limit of a stochastic epidemic model on a dynamic multilayer network

We consider an SIR-type (Susceptible $\to$ Infected $\to$ Recovered) stochastic epidemic process with multiple modes of transmission on a contact network. The network is given by a random graph following a multilayer configuration model where edges in different layers correspond to potentially infectious contacts of different types. We assume that the graph structure evolves in response to the epidemic via activation or deactivation of edges. We derive a large graph limit theorem that gives a system of ordinary differential equations (ODEs) describing the evolution of quantities of interest, such as the proportions of infected and susceptible vertices, as the number of nodes tends to infinity. Analysis of the limiting system elucidates how the coupling of edge activation and deactivation to infection status affects disease dynamics, as illustrated by a two-layer network example with edge types corresponding to community and healthcare contacts. Our theorem extends some earlier results deriving the deterministic limit of stochastic SIR processes on static, single-layer configuration model graphs. We also describe precisely the conditions for equivalence between our limiting ODEs and the systems obtained via pair approximation, which are widely used in the epidemiological and ecological literature to approximate disease dynamics on networks. Potential applications include modeling Ebola dynamics in West Africa, which was the motivation for this study.


Introduction
A fundamental issue in disease dynamics is that contact patterns change in response to infection.This is particularly salient in the study of disease dynamics on contact networks: infected individuals curtail contacts with their regular community due to illness (e.g.being too sick to go to school or work) but increase their contacts with other segments of the population, such as healthcare workers or caretakers in the home.The recent Ebola outbreak in West Africa provides a stark example.The array and severity of symptoms, including high fever, diarrhea, vomiting, and hemorrhaging, make symptomatic individuals too ill to engage their regular community contacts and, instead, cause individuals to seek care in the home, hospital, or other facility.This coupling of evolution of network structure to disease status is basic, but a theoretical understanding of how this affects disease dynamics is lacking.
Disease dynamics on networks has been an extremely active area of research in the past 20 years, typically within the SIR-type (Susceptible → Infected → Recovered) modeling framework [14,21,32,49,56,58,74].This has been stimulated in part by the explosion of data on networks of various sorts [5,12,13,17,24,55,65,66,76] and the recognition that network structure can have a dramatic impact on disease dynamics.Theoretical findings include applications of percolation theory to static networks [30,56].Less theory has been developed for networks that change over time, with much work in this area focusing on concurrent partnerships forming and breaking independent of disease status [1,2,8,23,40,41,72].The study of adaptive networks, where the contact structure changes in response to disease progression, is an emerging area, as reviewed by Funk et al. [28].One popular approach is to assume that susceptible individuals break connections to avoid infection [25,31,64,79].Related works examine behavioral changes due to awareness of infection [28,29].As these studies indicate, evolving network structure may lead to rich dynamics that are of practical importance for disease forecasting and evaluating public health interventions [31,63,64].
A challenge for understanding disease dynamics on networks is their high dimensionality -for example, a modest-sized network or graph (here, and elsewhere in the paper, we use these terms interchangeably) may have tens of thousands of nodes and over a hundred thousand edges.Various approaches have been developed for deriving simpler models to approximate the full network dynamics including grouping vertices by degree [9,10,57] or "effective degree" [6,43,45] and considering stationary degree distributions for dynamic graphs [2,40,41].Two approaches particularly relevant to this work are the pairwise approach (e.g.early work includes [35,36,59]) and the edge-based approach of Volz and Miller that is applicable to graphs with a specified degree distribution [53,54,71].Both approaches naturally lead to consideration of the disease dynamics in the large graph limit, i.e. when the number of nodes tends to infinity.Whereas Volz and Miller derived their results heuristically, recent mathematical work has rigorously shown that a deterministic edge-based system of equations is the large graph limit of an SIR continuous-time Markov process on a static random graph [20,33].
Multilayer networks, which allow for more complex disease dynamics, have also received much attention recently, as reviewed by Kivelä et al. [38].In particular, multilayer networks where the interconnected layers can represent different populations have been considered [15,60,73,78,80].The effect of degree correlation on twolayer networks has also been studied [62] with each layer an Erdős-Rényi or Barabási-Albert random graph.Other models involving two-layer graphs were also considered where one layer corresponded to information-spreading and the second to disease transmission [26,27,29,34] or where two competing pathogens spread on the two layers [61,75].Multilayer networks have also recently been employed to model temporal networks as sequences of static networks [67,68].The particular class of multilayer networks studied here are those in which the set of nodes is identical in each layer (i.e.node-aligned [38]) with distinct edge types corresponding to each layer.These are often referred to as multiplex (or multi-relational) networks [11,16,19,77] and can be represented as a graph with edges colored according to type.
In this work, we consider the problem of modeling an epidemic with different modes of disease transmission on a dynamic contact network.Specifically, we formulate an SIR continuous-time stochastic process on a multilayer graph, with specified degree distribution, where nodes represent individuals and edges represent potentially infectious contacts.Each layer contains the same set of nodes but corresponds to a different transmission mechanism (i.e. a multiplex network).In addition, we allow edges to be active or dormant with transmission occurring only along an active edge.The dynamic network structure allows us to incorporate behavioral changes due to infection.A simple example is a two-layer network with one layer corresponding to community contacts and the other to healthcare contacts where we assume that infected individuals deactivate their community edges, for example due to decreased mobility or isolation, while their healthcare edges are being activated as they seek care (Figure 1).
The main result of this work, Theorem 1 in Section 3, derives the large graph limit for the stochastic SIR process on the dynamic multilayer network.According to the theorem, the scaled counts of different edge and node types converge uniformly in probability to the solution of a deterministic system of equations.Thus, we obtain a relatively simple limiting model in the setting where network connectivity changes with the evolution of the disease process.In particular, it follows that for a certain class of random graphs the large graph limit coincides with the model obtained using either the pairwise or edge-based approach.As we demonstrate with the two-layer network example, the limiting systems are amenable for mathematical analysis, allowing us to gain biological insight into how changing network structure influences disease dynamics.Moreover, Theorem 1 extends previous results on edge-based models [20,33].
This paper is organized as follows.Section 2 introduces the stochastic model that is considered along with the necessary notation.Section 3 presents our main result, a law of large numbers for the stochastic process on the dynamic multilayer network, and considers several important special cases, which relate our result to edge-based and pairwise models.The two-layer community-healthcare network model and its analysis is given in Section 4. The proof of our main theorem is given in Section 5. We conclude with a discussion in Section 6.The Appendices provide further mathematical details.

Stochastic model
We start by introducing some notation and relevant definitions for dynamic multilayer networks.This includes defining the class of random graphs to be considered and extending the notions of degree and excess degree to the multilayer setting.Then, we introduce the stochastic process considered in this work, which is the appropriately modified version of the SIR process.

Layered configuration model
Let r denote the number of layers and, for any vectors The probability generating function (pgf) of the multivariate degree distribution is given by where p k k k = P(k 1 , . . ., k r ) is the probability of a node being of (multi-)degree k k k, i.e. having k i neighbors in layer i.Given a realization of the degree distribution on n nodes, we construct a multilayer graph as follows.Each node is assigned a collection of half-edges in each layer corresponding to its degree, and then half-edges within each layer are paired uniformly at random.Thus, restricted to the j-th layer, the resulting graph is a realization of a configuration model [55] with the degree distribution given by the j-th marginal of ψ.We refer to the collection of such realized graphs as the layered configuration model (LCM) and denote it by G r (ψ, n).

Excess degree distributions.
In a single-layer graph, the excess degree of a node u is calculated by following an edge to u from a neighbor v and counting the number of other neighbors (not including v) of u [55].It will be convenient to extend the notion of an excess degree distribution to the multilayer setting.Let P j|i (l) denote the probability that a randomly selected i-neighbor (i.e.neighbor in layer i) of a node u has j-degree (i.e.degree in layer j) equal to l.Then, by LCM construction, P j|i (l) is given as is the average i-degree, ∂ i denotes the partial derivative with respect to x i , and 1 is the vector of ones in R r .Correspondingly, let ψ ex j|i denote the pgf of the excess j-degree distribution of a node randomly selected as a i-neighbor.Then, where x j is the vector of ones with the jth coordinate replaced by x j .The average excess j-degree of an i-neighbor is then given by Finally, we define the normalized average excess j-degree of an i-neighbor as Note that, for the univariate (i.e.single layer) case when r = 1, P i|i (k) = kp k /µ, which is the well-known distribution of the degree of a neighbor (also referred to as the size-biased distribution [70] and corresponding to the excess degree distribution q k = (k + 1)p k+1 /µ [55]), and κ is the ratio of the average excess degree to average degree.

SIdaR process
Assume that we have a realization of an LCM G r (ψ, n) specifying the contact network for a population of size n.
The disease modeling framework adopted is the standard SIR compartmental model where individuals are classified based on their infection status [37].S, I and R correspond, respectively, to susceptible, infected, and recovered (or removed) individuals.We assume that edges within layers represent potentially infectious contacts of a certain type and we allow the network to be dynamic in response to infection.That is, we assume that nodes will either activate or deactivate their edges, depending on their infection status.An infectious node drops (resp.activates) edges in layer j at rate δ j (resp.η j ).We assume that a layer cannot be both activating and deactivating, i.e. at most one of δ j and η j are nonzero, and we also assume that all deactivating layer edges are initially activated and all activating layer edges are initially deactivated.Let j = 1, . . ., k denote the deactivating layers (with η j = 0) and let j = k + 1, . . ., r denote the activating layers (with δ j = 0).Then, 2r + 1 event types may occur: infection (I) along an edge of any of the r types, drop (d) of a deactivating edge or activation (a) of an activating edge, and recovery (R).The timings of all events are assumed to follow independent exponential clocks with the following rates: β j rate of infection along jedges (S j −→ I), j = 1, . . ., r δ j rate of deactivation (drop) of jedges, j = 1, . . ., k η j rate of activation of jedges, For a susceptible node u, let X j SI,u and X j SS,u denote, respectively, the number of infectious and susceptible active j-neighbors of u.Similarly, let X j SI,u and X j SS,u denote the number of deactivated j-neighbors of u.Also, for an infected node u, let X j IS,u and X j IS,u denote the number of susceptible active and deactivated, respectively, j-neighbors of u.We consider aggregate variables that are the total number of nodes or pairs of neighboring nodes (i.e.dyads) with a given disease status.For example, the total number of j-edges between susceptible and infectious individuals is denoted X j SI and is given by X j SI = ∑ u∈S X j SI,u .We denote the aggregate dyad counts as vectors in R r , e.g.X SI = (X 1 SI , . . ., X r SI ) and likewise for X SI , X SS , and X SS .Note that X SS and X SS count the edges twice.We let X(t) = (X S , X I , X SI , X SI , X SS , X SS )(t) denote the state of the aggregate stochastic process at time t > 0 where X S and X I denote the number of susceptible and infectious nodes, respectively.Note that the number of recovered individuals is given by X R = n − X S − X I and so, for the sake of simplicity, we ignore the equation for X R throughout.The transitions for the aggregate process are listed in Table 1.We refer to such a process as SIdaR(r, k) in order to emphasize the activation and deactivation events.The analysis of this process is complicated, partially due to the aggregation of the nodes that destroys the Markov property (see, e.g.[2]).
Note that, as above, the notation of a dyad subscript (e.g.X SI ) is understood throughout to denote a (row) vector in R r .Also, with a slight abuse of notation we take multiplication, division, integration and ordering of vectors to be coordinatewise.The state variables depend on n but we do not explicitly acknowledge this in our notation.
Consider the SIdaR(r, k) process X(t) on the LCM G r (ψ, n) with transitions as outlined in Table 1.The Doob-Meyer decomposition theorem [48] guarantees the existence of a zero-mean martingale where the integrable function We now define two more variables that will help us describe the evolution of the process in the large graph limit.Let X S• (t) = (X 1 S• (t), . . ., X r S• (t)) where X j S• (t) is the number of j-edges belonging to susceptible nodes at time t.We partition the collection of susceptible nodes S by their degree k k k ≥ 0 so that S = ∪S k k k which corresponds also to We also define θ = (θ 1 , . . ., θ r ) by where β = (β 1 , . . ., β r ).We may interpret θ j (t) as the probability of no infection along a j-edge by time t in a susceptible node of j-degree one, given that the node was susceptible at t = 0.That is, θ 1 is the probability that a susceptible node of (multi-)degree 1 has not been infected through any layer by time t, given that the node was susceptible at t = 0. Note also that we may equivalently write where θ (0) = 1 and As shown in Theorem 1 below, θ plays a key role in describing the evolution of X(t) in the large graph limit.The use of such a variable was pioneered by Volz [71] and Miller [50] in their edge-based approach.In fact, as shown in Section 3.2, in the single-layer, static network case, the large graph limit of θ corresponds to the variable in the standard SIR edge-based model [53].

Large graph limit theorems
The stochastic process defined in Section 2.2 is complex and difficult to analyze.In this section we present a limit theorem (Theorem 1) that shows, under mild technical assumptions, that the stochastic process converges to a relatively simple system of ODEs as the number of nodes tends to infinity.The limiting ODEs retain key features of the epidemic process while being amenable to analysis.In the case of a finite but large population, analysis of this deterministic system provides a good approximation to disease dynamics.In the general case, the evolution of the quantities of interest, X(t), will involve a function of the θ variable defined in the preceding section.In Section 3.2, we state corollaries that relate our result to edge-based models in the special case of static graphs [54].Finally, in Section 3.3 we show that, for a particular class of degree distributions, the evolution of X(t) decouples from θ , revealing a perhaps surprising connection between our limiting system and low-dimensional models obtained via pair approximation.

General case
All limits considered below, unless otherwise noted, are for n → ∞.We use P → to denote convergence in probability in the space of right-continuous with finite left limits (càdlàg) stochastic processes on random configurations drawn according to an LCM G r (ψ, n).We say that a sequence of random variables Y n → ∞ with high probability (w.h.p.) if P(Y n > k) → 1 for any k > 0. Let 0 < T < ∞.We make the following assumptions: (A2) The fractions of initially susceptible, infected, and recovered nodes converge, respectively, to some α S , α Furthermore, α S > 0, α I > 0, and the initially infected and recovered nodes are chosen randomly.
The assumption (A1) implies that, for large graphs, the infection does not overtake the graph.That is, some proportion of the individuals remain susceptible on [0, T ] and, hence, θ is well-defined in (6).Furthermore, it implies that the average j-degree of a randomly chosen node, i.e. ∂ j ψ(1), is positive since 0 < lim inf n −1 X S• ≤ ∂ ψ (1).
Assumption (A2) implies that the initial conditions for the dyads, scaled by n, also converge in probability, i.e.
where, for the deactivating layers j = 1, . . ., k, and, for the activating layers j = k + 1, . . ., r, For example, we obtain the initial condition α j SI for j = 1, . . ., k as follows.By assumption, all deactivating layer edges are initially activated and all activating layer edges are initially deactivated.Then, according to (A2), the limiting probability of selecting a random node that is susceptible is α S .The average number of j-edges a node has is µ j , and the limiting probability that a given edge connects to an infected node is α I .Therefore, α j SI = α S α I µ j .The other dyad initial conditions are obtained similarly.We denote α = (α S , α The assumption (A3) implies that ∑ k k k k 2 j p k k k < ∞ for j = 1, . . ., r (i.e. the second moments of the marginal degree distributions are finite) and also that the multigraph considered when matching half-edges uniformly at random is a simple graph with positive probability [33].
Before stating the main result, we define a quantity that plays a key role in describing how the network structure affects the large graph limit.For 1 ≤ j, l ≤ r, let κ jl be defined by As we discuss in Section 3.3.3,κ jl (θ ) can be interpreted as the ratio of the average excess j-degree of a susceptible node chosen randomly as an l-neighbor of an infectious node to the average j-degree of a susceptible node.We also define the function that, by Theorem 1, describes the evolution of (X(t), θ (t)) in the large graph limit.
Thus, Theorem 1 specifies the large graph limit of the aggregated SIdaR(r, k) process on G r (ψ, n) under conditions (A1)−(A3).That is, (X(t)/n, θ (t)) converges uniformly in probability on any finite interval [0, T ] to the solution D(t) of the deterministic set of equations given by (12).

Edge-based limiting systems
We consider two special cases of the large graph limit theorem for multilayer networks.First, we consider a static network, i.e. the case where δ j = η j = 0 for j = 1, . . ., n. Corollary 1 shows that in this case our system ( 12) is equivalent to an edge-based model with multiple modes of transmission.The model is one proposed by Miller and Volz [54] but with a modification to allow for a large number of initially infected nodes (following [51] where Miller modifies the standard SIR edge-based model for such a scenario).In the case that the initially infected nodes are randomly chosen (which we assume in (A2)), the model is given by Details of the equivalency, i.e. proof of Corollary 1, are given in Appendix A.
Corollary 1. Assume δ j = η j = 0 for j = 1, . . ., n and the conditions of Theorem 1 hold.Then, the conclusions of Theorem 1 hold where D(t) is equivalent to the solution of the edge-based model with multiple modes of transmission (13).
We further consider the special case of a static, single layer graph (i.e.r = 1 and δ 1 = η 1 = 0).In the case r = 1, (13) reduces to the well-known edge-based SIR model of Volz and Miller et al. [53,71], which has been proven to be the large graph limit of the SIR stochastic process on a static configuration model graph [20,33].By taking r = 1 in Corollary 1 we have provided an alternative proof of this fact.
Corollary 2. Assume r = 1, δ 1 = η 1 = 0 and the conditions of Theorem 1 hold.Then, the conclusions of Theorem 1 hold where D(t) is equivalent to the solution of the edge-based SIR model (13) with r = 1.

Pairwise limiting systems
In this section we consider a certain class of LCMs where κi j as defined by equation ( 10) is constant.This affords a substantial simplification to the limiting system (11) and, in fact, the system of differential equations defining the large graph limit will coincide with the model derived via the pairwise approach.

Poisson-type distributions
We define a multivariate Poisson-type (PT) distribution to be a distribution with a pgf that satisfies where we recall the definition of the normalized average excess degree, κ i j , in equation (2).At first glance, ( 14) may seem opaque; however, in fact it is satisfied by a broad class of distributions.For example, in the single layer case this condition is equivalent to which is satisfied by the univariate Poisson (κ = 1), binomial (κ < 1), and negative binomial (κ > 1) distributions.Note that a k-regular graph (where all nodes have degree k) is a special case of the binomial distribution.Additionally, the geometric distribution is a special case of the negative binomial distribution.Bansal et al. have shown that the geometric distribution (i.e. the discrete analog of the exponential distribution) gives the best fit for several empirical contact networks [7].In the multilayer case with r > 1, if the layers are independent, i.e. the pgf ψ(x) can be written as ψ(x) = ∏ r j=1 ψ r (x r ), then the PT condition ( 14) reduces to each layer being a (univariate) Poisson-type distribution.

Pairwise model
If the degree distribution is PT as defined by condition ( 14), the limiting system ( 11) defining H = (H X , H Θ ) has a particularly simple form.Indeed, substituting the constant κ jl for κ jl (x Θ ) decouples H X from H Θ .We consider the resulting model in this section and introduce some new notation to do so.Let [XY ] j and [ XY ] j denote, respectively, the number of activated and deactivated edges of type j between a node of status X and a node of status Y .Let [XY Z] i j denote the number of triples with an active i-edge between nodes of status X and Y and an active j-edge between node the node of status Y and one of status Z.Similarly, [ XY Z] i j will denote such triples where the i edge is dormant.
Under the correlation equations approach of Rand [59], triples are needed to describe the evolution of pairs, quadruples (e.g.[XY ZW ] i jk ) are needed to describe the evolution of triples, and so forth.A pair approximation for triples is used in order to close the system at the level of pairs [59].For consideration of triples in the multilayer setting, we must take into account the edge types and the appropriate excess degrees as defined in Section 2.1.Let p(u = X|A) denote the probability that a node u has disease status X ∈ {S, I, R} given an edge (or triple) arrangement A for u.The pair approximation of [XY Z] i j is then calculated as follows: with κ i j as defined in equation (2).We can alternatively calculate Applying the correlation equations approach using the pair approximation (15) to the SIdaR(r, k) dynamics de-scribed in Section 2.2 results exactly in the same equations as the limiting system (12) in the case of a PT distribution: Here, we have derived the system (16) using absolute pair and triple counts.Notice that, if we scale all variables by the graph size n, the nondimensional quantities satisfy the same system of equations (this holds for any n and, hence by continuity of the solution, also in the limit).From here on we will consider only the scaled variables, which will be convenient when we state the law of large numbers in Corollary 3. Accordingly, we set the initial conditions to be (S, I, so that they agree with the initial conditions in Theorem 1 as n → ∞. Observe also that, in fact, we can reduce the dimension of the system ( 16) since we only need to keep track of the deactivated edges for the activating layers, i.e. j = k + 1, . . ., r.Also, [SS] j ≡ 0 for an activating layer since its initial condition is zero and, hence, we only must track [SS] j for j = 1, . . ., k.We refer to system (16) with its initial condition (17) as the pairwise model.
The discussion above is summarized in the following corollary.
Corollary 3. Assume the conditions of Theorem 1 hold for LCM G r (ψ, n).Then, the conclusions of Theorem 1 hold where D(t) is the solution of the pairwise model (16) if and only if G r (ψ, n) has a multivariate Poisson-type degree distribution.
Furthermore, we can consider the implications of Corollary 3 in the static, single layer case.If r = 1 and δ 1 = η 1 = 0, then the pairwise model ( 16) reduces to the well-known correlation equations model of Rand [59]: Corollary 4. Assume the conditions of Theorem 1 hold for G 1 (ψ, n), a static graph (i.e.δ 1 = η 1 = 0).Then, the conclusions of Theorem 1 hold where D(t) is the pairwise SIR model of Rand (18) if and only if G 1 (ψ, n) has a univariate Poisson-type degree distribution.
Note that together Corollaries 1 and 3 imply that, in the case of a multivariate PT degree distribution on a static graph, the pairwise model ( 16) is equivalent to the edge-based model with multiple modes of transmission (13).Likewise, Corollaries 2 and 4 indicate that the pairwise SIR model ( 18) is equivalent to the edge-based SIR model, (13) with r = 1, when the distribution is PT.We note that the edge-based SIR model has previously been shown to be equivalent [32,52] to a higher dimensional pairwise model of Eames and Keeling [22].The latter model stratifies the susceptible nodes by degree and, hence, has dimension K + 3 where K represents the number of distinct degrees [52].The model of dimension K + 3 was derived as an approximation to an earlier well-known model of Eames and Keeling, which is of dimension 3K 2 /2 + 3K/2 + 1 [22].We observe that, in fact, (18) can be reduced to two differential equations.Separation of variables on d[SS]/dS (see, e.g., [3] Subsequent inspection of d[SI]/dS yields a linear differential equation that can be solved to express [SI] explicitly as a function of S: Thus, in Corollary 4, we have identified a condition on the degree distribution under which the dimension of a pairwise model that is equivalent to the edge-based SIR model has been reduced from K + 3 to two.

Large-graph-consistent pair approximation
Corollary 3 and the derivation of the pair approximation (15) motivate us to consider a more careful approximation of the triples in the general case when the distribution is not necessarily PT.Let µ ex|SI i| j denote the average excess i-degree of a susceptible node chosen randomly as a j-neighbor of an infectious node and let µ S i denote the average excess i-degree of a susceptible node.We now make more precise our comment in Section 3.1 that κi j can be interpreted as the limiting ratio of these quantities.
For the LCM G r (ψ, n) with θ as defined in equation ( 6), we derive µ S i and µ ex|SI i| j (see Appendix B) to be Thus, we see from the definition of κ in (10) that We note that µ S i and µ ex|SI i| j are dependent on time and differ, respectively, from µ i and µ ex i| j defined in Section 2.1.Indeed, susceptible nodes of high degree are preferentially infected [47].On the other hand, the naive approximation of a triple with κ i j = µ ex i| j /µ i in (15) uses the average degree and excess degree over all nodes, which remain constant for all t ≥ 0. Note that it is only necessary to approximate triples of the form [XSI] i j where X ∈ {S, I}.Therefore, we can more carefully derive the pair approximation using µ S i and µ ex|SI i| j : Note that the approximation p(u = X|[uSI] i j ) ≈ p(u = X|[uS] i ) becomes exact in the large graph limit as the configuration model becomes locally treelike [52].
Moreover, it follows from Theorem 1 that κi j (θ ) P → κi j (D Θ ) uniformly on any finite interval [0, T ].We then see from (21) that, if we took the correlation equations approach for SIdaR dynamics on a general LCM but instead approximated the triples with the system of equations obtained would be exactly that given by the limiting system (12).In this sense, the pair approximation ( 22) is "correct", i.e. it is consistent with the large graph limit.
4 Community-healthcare network example In this section, we discuss the two-layer dynamic network model that was mentioned in the Introduction as a concrete example of the stochastic SIdaR process and use it to illustrate how our limit theorems can be applied to gain insight into disease dynamics in this complex setting.Consider the SIdaR(2, 1) process, as described in Section 2.2, on a two-layer LCM G 2 (ψ, n) with multivariate PT degree distribution where the two edge types correspond to community and healthcare type contacts.We assume that infected individuals deactivate their community contacts, for example due to decreased mobility or isolation, while they activate their healthcare contacts as they seek care (Figure 1).Note that the healthcare network may include both care provided by healthcare professionals at hospitals or other facilities as well as care provided in the home.This model is motivated by the recent 2014−2015 outbreak of Ebola virus in West Africa.The multitype contact features are particularly relevant for Ebola, given the disproportionate Ebola risk experienced by healthcare workers [18,46] and women (primary caregivers in the home in West Africa) in the 2014 West Africa outbreak as well as Ebola outbreaks in the Democratic Republic of the Congo [39].We use this example to demonstrate that, even in the quite complicated scenario where network dynamics are tied to infection status, the pairwise limiting system we obtain from Theorem 1 is tractable for analysis.In particular, analysis of this model illustrates how the activation and deactivation of edges in response to infection couples the different layers of the network, and how this coupling affects basic features of disease dynamics such as invasibility and outbreak size.Let C denote community edges and H denote healthcare edges.As in Section 2.2, the stochastic events are assumed to follow independent exponential clocks where the rates of infection along Cand H-edges are, respectively, β C and β H , the rate of deactivation of a C-edge is δ , the rate of activation of an H-edge is η and the rate of recovery is γ.
The SIdaR(2, 1) stochastic process above satisfies the conditions of Theorem 1 and Corollary 3 and, thus, converges to the following system of ODEs in the large graph limit: As in Section 3.3.2, the variables in (23) are scaled by n and the initial conditions are given by [SI] Recall that µ C and µ H are the average Cand H-degrees, respectively, of a randomly chosen node and that (25) corresponds to all community edges being active and all healthcare edges deactivated at t = 0. System (23) can now be analyzed to gain insight into how the structure of the different layers of the network and their coupling through activation / deactivation of edges in response to infection affects disease dynamics.In particular, we can compute the basic reproduction number, R 0 , which determines whether disease invasion is possible [69].Consider the disease free equilibrium x 0 = (1, 0, 0, 0, 0, µ C , µ H ). Let .
Note that κ CC µ C = µ ex C , the average excess C-degree, so we can interpret R C as the average number of secondary cases transmitted through the community network in a susceptible population and, likewise, R H represents the secondary cases caused by healthcare transmission.The next-generation matrix method [69] gives (see Appendix C.1) the basic reproduction number as where The R CH term can be interpreted as the number of secondary infections created in the community contact network from an initial infection along an H edge, and R HC as the secondary infections created in the healthcare contact network from an initial infection along a C edge.Examination of (26) yields two key insights.First, the presence of both R C and R H in R 0 reflects the coupling of the different layers through edge activation / deactivation.In the limit of fast deactivation of community contacts (i.e.δ → ∞), R 0 → R H and disease invasibility depends solely upon the healthcare layer.Similarly, in the limit of slow activation (i.e.η → 0), R 0 → R C and the basic reproduction number is driven by the community layer.For intermediate activation and deactivation rates, R 0 depends upon both layers and the multilayer aspect of the model plays an important role in affecting disease dynamics.Both transmission "within" (e.g.R C , R H terms) and "between" layers (e.g.R CH , R HC terms) contribute to R 0 , as seen in (26b).
Second, (26) shows how R 0 depends upon the structure of the different layers.Consider the cross term in (26a).In the case where κ CH κ HC /κ CC κ HH = 1 (for example, independent layers with Poisson degree distributions), the cross term vanishes and R 0 = R C + R H .In general the sign of the cross term may be either positive or negative, and thus R 0 may be either larger or smaller than R C + R H .In fact for fixed R C and R H it is possible for R 0 to be either greater than or less than one, depending upon the structure of the layers.This is illustrated in Figure 2 where we plot prevalence (i.e.infected proportion of the population) over the course of an epidemic for two different scenarios for the structure of the healthcare layer.We assume the layers are independent (κ CH = κ HC = 1) and that the degree distribution of the community layer is Poisson (κ CC = 1) with µ C = 10.We fix R C = 0.75 and R H = 0.5.Let the degree distribution of the healthcare layer be negative binomial with µ H = 8; in the first scenario we set κ HH = 1.1 (blue) while in the second we set κ HH = 4 (orange).Figure 2 shows, for each scenario, the deterministic solution to the limiting system (23) (solid line) as well as the approximate 95% empirical confidence interval calculated from 500 stochastic simulations of the corresponding SIdaR(2, 1) process with n = 10 4 (shaded region).The basic reproduction numbers are calculated from (26a) to be, respectively, R 0 = 1.17 and R 0 = 0.91.Correspondingly, in the first case a large epidemic occurs while in the second case the infection dies out.The increase in κ HH from the first case to the second corresponds to an increase in the variance of the H degree distribution.In fact, analogous to previous results (see, e.g., [42]), inspection of (26b) reveals that, if µ C and µ H are kept constant, R 0 is an increasing function of the variances of the C and H degree distributions as well as their covariance.
In many situations R 0 not only determines the ability of disease to invade, but also the size of an outbreak if one occurs [44].The system of ODEs (23) obtained via Theorem 1 and Corollary 3 can similarly be analyzed to determine a relation for the final size of the epidemic, as in [3,4].To illustrate, consider the special case where the community and healthcare layers are independent with Poisson degree distributions (i.e.κ CC = κ HH = κ CH = 1).Let S ∞ denote the fraction of the population that escapes infection.Analysis of a transformed model (as in [3]) or, alternatively, a result of Arino et al. [4] can then be used (see Appendix C.2) to derive the final size relation: which agrees with the classic result for mean-field, homogeneously mixed populations as α I → 0 and α S → 1 [37].We conclude our analysis of the community-healthcare network model by noting that system (23) is simple enough to be amenable for practical application to outbreaks of interest, for example for parameter estimation and intervention assessment from available data (for example, medical records and contact tracing data).Further details on statistical inference and application to specific outbreaks will be presented elsewhere.Here we briefly show how system (23) can be further reduced by finding invariants that allow for dimension reduction.Details for the following methods are provided in Appendix C.3.
Consider the case where κ CH = κ CC .We can then eliminate [ SS] H by expressing it as a function of S and [SS] C : where In the case of independent layers (κ CH = 1) neither of which has a Poisson degree distribution (i.e.κ CC = 1 and κ HH = 1), we are able to find two additional invariants, equations ( 32) and ( 33) below.The reduced system, of dimension four, is given by where [ SS] H is given by ( 29) and In the case of independent layers with Poisson degree distributions (κ CC = κ HH = κ CH = 1, see Appendix C.2), we have [SS] C = µ C S 2 and [ SS] H = µ H S 2 .We can further reduce the dimension of the system by one with the following invariant:

Proof of the limit theorem
In this section, we provide proof of our main result, Theorem 1, preceded by two lemmas.The derivation of our results relies on the key observation, summarized in Remark 5.1, that in a finite graph, the neighborhood of a susceptible node may be described by a certain multivariate hypergeometric distribution.Lemma 1 shows that X S k k k and X S• can be expressed in the limit as functions of θ given by ( 6).Lemma 2 shows that the dynamics of the scaled process on the finite graph converges, in the appropriate sense, to the dynamics described by the ODE system (11) involving θ .Theorem 1 then follows from Lemma 2 using Doob's martingale and Gronwall's inequalities.
Recall that we take operations on vectors such as multiplication, division, integration and ordering to be coordinatewise.We first provide an important remark about the neighborhood of a susceptible node of degree k in layer j, i.e. (X j SI,i (t), X j SI,i (t), X j SS,i (t), X j SS,i (t)) for i ∈ S k .The distribution of the neighborhood is critical when we consider the expectations of ), which are mixed moments with respect to the neighborhood distribution.Recall that we consider the evolution of the SIdaR(r, k) process on a realization of the LCM random graph that has been generated by time t = 0. We could alternatively consider an equivalent process whereby the graph is revealed dynamically as infections occur, as in [20,33].In the latter process, a susceptible node i ∈ S k remains unpaired until it becomes infected.We could, however, pair off all unpaired edges at time t > 0 (uniformly at random according to the LCM construction) in order to define the neighborhood of i.The following remark is perhaps most easily understood by keeping this equivalent construction in mind and recalling the relevant probability space mentioned in Section 3.1.
Remark 5.1.For k j ≥ 1 and i ∈ S k , and conditionally on the process history up to time t, the vector (X j SI,i (t), X j SI,i (t), X j SS,i (t), X j SS,i (t)) follows the multivariate hypergeometric distribution with probability mass function where E h denotes expectation with respect to the hypergeometric distribution.We also note that, based on the LCM construction, the neighborhoods of i in distinct layers are independent, i.e.

P(X SI
It follows from above that the mixed moments are given by for l = j and Likewise, for any 1 ≤ j, l ≤ r, . Keeping in mind this remark, we proceed to prove the first lemma, which shows that X S k k k and X S• can be expressed in the limit as functions of θ as given by ( 6).

Lemma 1. Assume (A1) and
where Z i (t) ∈ {0, 1} indicates whether node i is of degree k k k and susceptible at time t > 0. Recall from Remark 5.1 that EX j SI,i = k j X j SI /X j S• for i ∈ S k k k .We claim that 0) by (A2) and is a càdlàg martingale process with mean zero and finite variation.By the triangle inequality, The second term tends to zero by assumption (A2) and for the first term we have, by Doob's martingale inequality, Since there are at most n jumps for X S k k k and each is of size one, it follows that the quadratic variation of X S k k k = O(n), which gives the needed result.
(b).By equation ( 5), we can write Consider arbitrarily large N and ε > 0. By Markov's inequality, we have for n sufficiently large, α S ≤ 1, and we may apply the Monotone Convergence Theorem.So the tail of the sum is negligible since N is arbitrary and, by assumption, The result then follows since in (a) we showed convergence for fixed k k k.
Before proceeding with the next lemma, we give a brief remark on boundedness of our variables and define some useful functions.
. We will use the facts that, in the limit, the hypergeometric mixed moments are approximately multinomial and we can replace X S• with a function of θ by Lemma 1.Therefore, it will be convenient to define the following compensators. Let and so that the hypergeometric mixed moment in equation ( 35) is given by We also define a function related to the multinomial distribution, C jl,(k k k) m and (Note that, if we considered n −1 X j SI /z j (t) and n −1 X j SS /z j (t) to be actual probabilities, then C jl,(k k k) m (t, z(t)) would be a mixed moment with respect to the multinomial distribution.)Observe that there exists L > 0 such that for any j, l (and uniformly in n) since the domain of z is bounded away from 0 and n −1 X SI , n −1 X SS are uniformly bounded above by Remark 5.2.It also follows that Proof.(a).Define J : . By Remark 5.2 and (A1), (t, n −1 X S• ) and (t, α S θ ∂ ψ(θ )) are in the domain of J for t ∈ [0, T ].By definition of ∆ Θ in (39), Since J is Lipschitz continuous in z, (b) We note that, by definition in (38), ∆ S = ∆ I = 0. We will show that sup 0<t≤T ||∆ SI (t)|| P → 0 as it follows similarly for ∆ SI , ∆ SS and ∆ SS and together these imply sup 0<t≤T ||∆ X (t)|| P → 0. In fact, we observe that ∆ SI = (∆ 1 SI , . . ., ∆ r SI ) and so it suffices to show sup 0<t≤T for j = 1, . . ., r.Let 1 ≤ j ≤ r.We can rewrite ∆ j SI as Thus, we define . Hence, it suffices to show This is done in what follows in two separate steps.Consider an arbitrary pair ( j, l) 1 ≤ j, l ≤ r.We first show that as To this end, observe that for some C > 0 since X S k /n ≤ 2p k for n sufficiently large.From the bound on C jl,(k k k) m in (45), we also have Then (48) follows from ( 49) and ( 50) together with (A3).
Next we show that sup 0<t≤T ∆ jl,(k k k) SI (t) P → 0 for any k k k.We write and we will show that each of these terms tends to zero uniformly in probability.By Remark 5.1 and equation (42), the process (t) is a zero-mean, piecewiseconstant càdlàg martingale that jumps only if infection/recovery of a node of degree k k k (or a neighbor of a node of degree k k k) occurs or activation/drop of a j-edge or l-edge belonging to a node of degree k k k occurs.Recall that, for each layer, either activation or drops are possible, not both.Consider events impacting a node u ∈ S k k k .For infection or recovery events, of which there are at most 2(1 + k j + k l ) corresponding to infection and recovery of u itself or one of its jor l-neighbors), the jump size is at most k j k l .For deactivation and activation events of an l or j-edge, of which there are at most k l + k j affecting u, the jump size is also at most k j k l .Recall that the number of nodes of degree k k k is approximately np k k k for large n.The quadratic variation of M jl,(k k k) h (t) is the sum of its squared jumps [see, e.g., 3, Chapter 9] and, thus, satisfies i.e. the term in (51) tends to zero uniformly in probability.
In consideration of the term in (52), we note that n −1 X S k ≤ 1 and (t, n −1 X S• ) for j = l.For the case l = j, we have for some L > 0 and since X j S• (t) is non-increasing on [0, T ].Thus, the term in (52) tends to zero uniformly in probability by (A1).
For the term in (53), we observe in (45) and Lemma 1(a).Finally, since C jl,(k k k) m (t, z(t)) is Lipschitz continuous in z, we have for some L > 0 and so the term in (54) tends to zero uniformly in probability by Lemma 1(b).Therefore, recalling also (48) we conclude that (47) holds and hence (46) follows.
Proof of Theorem 1. Recall the definition of ∆ = (∆ X , ∆ Θ ) from equations (38) and (39).Note that, by equations ( 3) and ( 7), where Note that each of the coordinate processes of the vector process M(t) = (M S , M I , M SI , M SI , M SS , M SS )(t) is a pure jump, càdlàg, zero mean, martingale process.Consider the process M j SI which, by equation ( 3), jumps only if infection of a node, recovery of a node or a j-edge drop/activation occurs at time s.Recall that, for each j, either activations or drops are possible, not both.Consider events corresponding to a node of degree k k k.For infection and recovery events the jump size, δ M j SI (s), is not greater than that node's j-degree, and for activation and deactivation events, of which there are at most 2k j affecting that node, the jump size is one.Since the number of nodes of degree k k k is approximately np k k k for large n, the corresponding quadratic variation process satisfies for some L > 0, since the first term in the parenthesis tends to zero by (A2) and (9).The assertion follows when we take t = T < ∞.

Discussion
The complexity of dynamic multilayer networks makes understanding disease dynamics on them a challenge.Working with simple models that retain key features of network evolution and disease transmission is essential for gaining biological insight into mechanisms underlying basic disease features such as invasion, persistence, and outbreak size.
In this work, we have developed a framework for modeling infectious diseases with multiple modes of transmission in the setting where the network changes in response to infection.Even in this seemingly complicated scenario, it is relatively straightforward to formulate a continuous-time stochastic process by considering transitions in the states of nodes and connected pairs of nodes.However, the state space of the Markov process becomes unmanageable as the size of the network increases and analysis of the non-Markovian aggregate process is complicated.Our main result, Theorem 1, rigorously derives the large graph limit of the stochastic process on a layered configuration model graph and, thus, gives a simple model retaining key features of the epidemic process while being amenable to analysis.Moreover, our results extend previous ones for the SIR process on a static, single-layer network [20,33] to the dynamic, multiplex network case.In contrast to previous results, we have formulated the limiting system in terms of dyads via the function κ that provides the large-graph-consistent approximation of triples.Expressing the system of equations in this way allows for a simple characterization of the class of graphs (i.e.those with κ ≡ κ) for which the model obtained via ordinary pair approximation [59] coincides with our large graph limiting system.Numerical comparison of κ and κ could then potentially be used to quantify how well a pairwise model accurately reflects the relevant disease dynamics.
A model that is useful to inform public health interventions must be simple enough to perform analysis and statistical inference.As demonstrated in the two-layer community-healthcare example in Section 4, the large graph limit we have derived is indeed tractable.The basic reproduction number can be calculated and its analysis allows us to gain insight into how the structure of the different layers and their coupling through the dynamic network structure affects disease invasion and the final size of the outbreak.The current framework is flexible and could be adapted to diseases with multiple modes of transmission (e.g.those with sexually and non-sexually transmitted infections).Application of this framework to specific diseases, such as Ebola, will require investigation into parameter identifiability and statistical estimation methods.The large graph limit derived here will aid such analysis and, in fact, suggests a hybrid approach in which the node state transitions remain stochastic but the dyads are approximated using the limiting differential equations (or, if possible, invariants such as those found in Section 4).Even for large networks, the resulting Markov process approximation would allow for computationally inexpensive maximum likelihood estimation.Finally, we mention that interventions (e.g.vaccination) can be incorporated into this framework, which would be critical for providing actionable recommendations to public health policymakers with the aim of curbing current epidemics or preventing future outbreaks.

A Equivalence with edge-based multiple modes of transmission model (Proof of Corollary 1)
We provide in this section the proof of Corollary 1, i.e. equivalence of the system (12) with the edge-based model with multiple modes of transmission given by system (13).First, we observe that D S (0) = α S ψ(D Θ (0)) = α S and i.e. α S ψ(D Θ ) satisfies the same differential equation as D S .Therefore, by the uniqueness of the ODE solution, Secondly, we claim that Indeed, D j SS (0) = α 2 S ∂ j ψ(1) and which is the same differential equation as that satisfied by D j SS , which proves (56).We define We now show that D j SI α S ∂ j ψ(D Θ ) satisfies the same differential equation: where we have used (55) and (56).Since Hence, dD j Θ = −β j D I, j φ , i.e.
which is the same differential equation as that for θ j in (13).Furthermore, D j Θ (1) = θ j (1) which implies that D j Θ = θ j .It subsequently follows from (13) and equation ( 55) that S, I and R are also equivalent for the two models.

B Derivations of µ S
i and µ ex|SI i| j We provide here the derivations for µ S i and µ ex|SI i| j as given in equation (20).Recall that µ S i is the average i-degree of a susceptible node.By equation (37), the probability that a node u is susceptible and of degree k k k is given by P We can then calculate µ S i as follows: Recall that µ ex|SI i| j is the average excess i-degree of a susceptible node chosen randomly as a j-neighbor of an infectious node.That is, we randomly select a j-edge between a susceptible node and an infectious node, and we calculate the excess i degree of the susceptible node.Let E j SI denote the set of j-edges between susceptible nodes and infectious nodes and let E j S k k k I denote the set of j-edges between susceptible nodes of degree k k k and infectious nodes, so that E j SI = ∪ k k k E j S k k k I .Recall that X j SI,u is the number of infectious j-neighbors of a susceptible node u.Also recall from Remark 5.1 that, given u ∈ S k k k , the neighborhood of u has a hypergeometric distribution and E h [X j SI,u ] = k j X j SI /X j S• .We first assume i = j and calculate .
If j = i, we likewise get .
Therefore, equation (20) follows.Alternatively, we recall the equivalent model with dynamic graph construction mentioned in Section 5 and consider a j-half edge of an infectious node that is forced to pair with a j-half edge of a susceptible node, which we denote by e, at time t.Let E j S denote the set of j-half edges belonging to susceptible nodes and let E j S k k k denote those belonging to susceptible nodes of degree k k k.We first assume i = j and calculate µ ex|SI i| j as follows: .
The case i = j follows likewise as above.
C Calculations for community-healthcare model C.1 Basic reproduction number, R 0 We use the next-generation matrix method (and corresponding notation) from [69] to calculate R 0 .We consider I, [SI] C , [SI] H , and [ SI] H to be the infective compartments (m = 4).We consider the disease-free equilibrium for the system (23) given by x 0 = (1, 0, 0, 0, 0, µ C , µ H ). The matrix corresponding to terms for new infections, F, is then given by and the matrix corresponding to terms from all other transitions, V , is Therefore, and the next-generation matrix is given by Then, R 0 is the spectral radius of the next-generation matrix, i.e. the largest absolute value of an eigenvalue.We see that FV −1 has two zero eigenvalues, and the other two eigenvalues are determined by the characteristic polynomial: We solve p(λ ) = 0 to determine
Then, from integrating equation ( 59 Transforming back to our original variables, this gives the invariant (34): Let S ∞ denote the fraction of the population that escapes infection.Taking the limit t → ∞ gives the final size relation ( 28

C.3 Model reduction
Let σ and λ be as defined by equation (30).We will derive relation (29) to demonstrate our method of finding invariants.Suppose we have a function f which satisfies Thus, d f /dt = 0 if the following system of equations is satisfied: For a given C, the solution to this system is: Hence, the invariant is i.e. relation (29).The derivations of relations (32) and (33), the additional invariants in the independent layers case, are similar.For (32), we look for an invariant of the form f (S, [SS] C , [SI] C ) = 0. Likewise, for (33) we consider an invariant of the form f (S, [ SS] H , [SI] H , [ SS] H ) = 0.The analysis follow analogously to above.

Figure 1 :
Figure 1: (Left) Neighborhood of a susceptible vertex (labeled 1) with an infected (red) neighbor.Community (top/blue) and healthcare (bottom/green) contacts are shown as active (solid) or deactivated (dashed).(Right) After infection, two healthcare contacts are activated and one community contact is dropped.

Figure 2 :
Figure 2: Prevalence curves for two different scenarios for the structure of a community-healthcare network corresponding to large (top/blue) and small (bottom/orange) outbreaks.The degree distribution of the C-layer is Pois(10).The degree distribution of the H-layer is NB(10, 4/9) with κ HH = 1.1 in the first case (top/blue) and NB(1/3, 0.96) with κ HH = 4 in the second case (bottom/orange).The solid lines show the deterministic solutions to (23) while the shaded regions indicate the approximate 95% confidence intervals based on 500 numerical simulations of the stochastic SIdaR(2, 1) processes with n = 10 4 .Initially infected nodes (α I = 0.002, α S = 0.998) are randomly chosen and γ = 0.1, δ = 0.1, and η = 0.3.We fix R C = 0.75 and R H = 0.5 which corresponds to β C = 0.0162 and, respectively for the two scenarios, β H = 0.0082 and β H = 0.0021.