Influence of non-homogeneous mixing on final epidemic size in a meta-population model

ABSTRACT In meta-population models for infectious diseases, the basic reproduction number can be as much as 70% larger in the case of preferential mixing than that in homogeneous mixing [J.W. Glasser, Z. Feng, S.B. Omer, P.J. Smith, and L.E. Rodewald, The effect of heterogeneity in uptake of the measles, mumps, and rubella vaccine on the potential for outbreaks of measles: A modelling study, Lancet ID 16 (2016), pp. 599–605. doi: 10.1016/S1473-3099(16)00004-9]. This suggests that realistic mixing can be an important factor to consider in order for the models to provide a reliable assessment of intervention strategies. The influence of mixing is more significant when the population is highly heterogeneous. In this paper, another quantity, the final epidemic size () of an outbreak, is considered to examine the influence of mixing and population heterogeneity. Final size relation is derived for a meta-population model accounting for a general mixing. The results show that can be influenced by the pattern of mixing in a significant way. Another interesting finding is that, heterogeneity in various sub-population characteristics may have the opposite effect on and .


Introduction
Epidemiological models can be used to identify key factors affecting disease outbreaks and to evaluate intervention strategies. However, because the model predictions and evaluations can be influenced by assumptions made to simplify the analysis, it is critical to understand how certain simplifying assumptions may affect the model outcomes, particularly if the models are used to inform public health policy-making.
One of the most commonly used assumptions in epidemiological models is the homogeneous, or random, mixing in the whole population, which allows the model to ignore heterogeneity in various sub-population factors including activity (pertinent to disease transmission), susceptibility, preference in contact, group size, etc. Although these simplifying assumptions are reasonable in many cases depending on the questions to be studied, there are cases where these heterogeneous factors cannot be ignored (see e.g. [11][12][13]).  (18) for the case of a homogeneous population with random mixing. It shows that F is an increasing function of R 0 .
When epidemic models are used to evaluate strategies for disease control and prevention, the quantities used often include the basic or effective reproduction number (R 0 or R e ), the final epidemic size (F ), and the peak size of an outbreak (P). Andreasen [2] studied the role of heterogeneity in susceptibility and infectiousness on final epidemic size for the case of proportionate mixing and discussed the relationship between F and R 0 . Tildesley and Keeling [19] considered a model with a spatial structure to discuss the issue of using R 0 as a predictor of final epidemic size for the case of foot-and-mouth disease in the UK. Results concerning final epidemic sizes can also be found in other studies including [1,3,4,10,15,16]. None of these studies considered the influence of heterogeneity on the final size or the relationship between F and R 0 in a meta-population with preferential mixing.
In this paper, we consider an epidemic meta-population model that includes a general mixing function (which includes proportionate, preferential, and homogeneous mixing as special cases). We examine how heterogeneity in various sub-population factors may influence R 0 , F, and P, and how preferential mixing may affect the relationship between F and R 0 . Our results show that models that do not account for heterogeneous mixing may produce biased or even incorrect information.
From previous epidemic models with homogeneous mixing, a general understanding is that F and R 0 are positively related (see Figure 1). We show in this paper that, when preferential mixing is considered in the model with heterogeneity in activity and/or preference, it is possible that F and R 0 are negatively related (see Figures 4 and 5). Moreover, when vaccination is considered, the homogeneous coverage may not be the most effective strategy (see Figures 7 and 8).
The paper is organized as follows. In Section 2, we present the model with n subpopulations and non-homogeneous mixing. Derivations for the reproduction number and a final epidemic size relation are also included in this section. Section 3 focuses on the investigation of the effects of heterogeneity in various sub-population factors and mixing on the final epidemic size and epidemic peak during an outbreak. Two examples of using the meta-population model to evaluate intervention strategies are also presented in this section. Section 4 includes discussions of the results.

The model and final epidemic size
Consider a meta-population with n sub-populations or groups with sizes N i (1 ≤ i ≤ n). The population in each group i is divided into three epidemiological classes denoted by S i (susceptible), I i (infectious), and R i (recovered and immune due to infection). Thus, Because we are concerned with the final epidemic size of a single disease outbreak, births and deaths are ignored. The epidemic model with n sub-groups consists of the following ordinary differential equations: with initial conditions: In model (1), a i is the per capita contact rate (referred to as activity), β i is the probability that a susceptible person in group i becomes infected upon contacting an infectious person, γ denotes the per-capita recovery rate. The contacts between sub-groups are described by the mixing matrix C = (c ij ), where c ij is the proportion of the ith sub-group's contacts that is with members of the jth group, and I j /N j is the probability that a randomly encountered member of group j is infectious. As pointed out in Busenberg and Castillo-Chavez [6], a mixing function needs to satisfy the following basic conditions: n j=1 c ij = 1, One of the commonly used mixing functions, as originally considered by Nold [17] and later extended by Jacquez et al. [14], is the preferential mixing with the form: In the preferential mixing (4), i denotes the fraction of contacts reserved for one's own group (referred to as preferences), and δ ij is the Kronecker delta (1 when i = j and 0 otherwise). The function f j describes mixing that is random, i.e. proportional to unreserved contacts, (1 − j )a j N j . All parameters are non-negative.
The isolated basic reproduction number for group i is The next generation matrix is where C = (c ij ) is the mixing matrix. The basic reproduction number for the metapopulation is when ρ(K) denotes the dominant eigenvalue of K. In general, we can only compute ρ(K) numerically when n > 2. Two special cases are when the mixing is proportionate (i.e. when i = 0) or isolated (i.e. when i = 1) for all i. In these cases, ρ(K) is given by either the trace of K or the maximum of R 0i . That is, the analytic expressions are: If vaccines are available before the epidemic starts, a certain level of population immunity can be achieved via vaccination. Let p i denote the immunity of sub-population i at time t = 0 (0 ≤ p i ≤ 1). Then the disease dynamics can be modelled by the equations in (1) with modified initial conditions: In this case, the isolated effective reproduction number for group i is R ei = (1 − p i )R 0i , and the next generation matrix is K e = diag(R e1 , R e2 , . . . , R en )C. The effective reproduction number for the meta-population is R e = ρ(K e ).
If the effect of another type of control or intervention programme is to reduce activities a i and/or probabilities of infection β i , let R ei denote the isolated effective reproduction number for group i (same formula as for R 0i in (5) but with the new values of a i and β i ), then the effective reproduction number for the meta-population R e is given by the dominant eigenvalue of the next generation matrix K e , which is the same as K in (6) but with R 0i being replaced by R ei .
We now use (1) with initial condition (2) to derive a relation for the final epidemic size. Note from the S i and I i equations in (1) that Because S i (t) ≥ 0 and I i (t) ≥ 0 for all t > 0, S i (t) + I i (t) is a non-negative decreasing function and hence has a limit as t → ∞ and ∞ t=0 I i (t)dt = 0. It follows that I i (t) → 0 as t → ∞ (see [5], Section 2.4). Thus, S i (∞) . = lim t→∞ S(t) exists. Using I i (∞) = 0 and S i (0) + I i (0) = N i , and from (10) we get Dividing the S i equation in (1) by S i (t) on both sides, we get Integration of both sides of (12) from 0 to ∞ yields From (11), (13), and R 0i = a i β i /γ , we obtain Let Thus, from (14) and (15), we know that Z i can be determined by a solution of the following set of equations: Let N = n i=1 N i denote the total population size, and let F i = Z i /N denote the fraction of infected in group i (over the total N). We define the final epidemic size F by In the special case when the population is considered homogeneous with random mixing, i.e. the parameter values for all sub-groups are identical (e.g. where Z = n i=1 Z i and R 0 = R 0i = aβ/γ . If we vary R 0 by varying β while keeping a and γ fixed, the final size relation (18) is illustrated in Figure 1, which shows that F is an increasing function of R 0 .
As shown in Feng et al. [11] and Glasser et al. [13], in meta-population models with preferential mixing, even when β and the mean activity n i=1 a i /n are constant (with all other parameter values fixed), R 0 may vary with the level of heterogeneity in a i and preference i . Therefore, the dependence of F on R 0 may not be increasing as exhibited in Figure 1. In the following sections, we will investigate how population heterogeneity in models with non-random mixing may influence the final size F, the relationship between F and R 0 , and model evaluations of intervention programmes.

Effect of heterogeneity on final size and the relationship between F and R 0
A general understanding based on epidemic models for a homogeneous population with random mixing is that F is an increasing function of R 0 , as shown in Figure 1. Recall that the increase in R 0 is caused by increasing β. When heterogeneity such as activity (a i ), preference of contacts ( i ), and sub-population size (N i ) are considered in a metapopulation model with preferential mixing, R 0 may vary by a significantly magnitude even when β i are fixed (see e.g. [13]). In addition, it has been shown in Poghotanyan et al. [18] that, among a large class of mixing matrices (c ij ) including the preferential mixing in (4), R 0 = ρ(K) is bounded below and above by the basic reproduction numbers corresponding to proportionate mixing ( i = 0 for all i) and isolated mixing where It is clear from (19) that we can keep β i fixed and change R 0 by varying a i and N i while keeping the same mean activity and mean population size. Particularly, different i values may also affect how R 0 and F may be influenced by heterogeneity in a i or N i . Thus, the relationship between F and R 0 may be more complex in meta-population models than the one illustrated in Figure 1. This will be explored next. Results in this section are for the case of n = 2 groups, but they can be extended to the case of n > 2. Although the equations for Z i in (16) cannot be solved analytically, we can explore the dependence of Z i on other parameters numerically. Rewrite (16) as where  For demonstration purposes, we used the following parameter values in Figure 2 (the time unit is days): β 1 = β 2 = 0.035, 1/γ = 5 (infectious period is 5 days), a 1 + a 2 = 20 with a 1 = 14 and a 2 = 6 (heterogeneous activity with the mean activity being 10), 1 = 2 = 0 (proportionate mixing), N 1 + N 2 = 10 4 , σ = N 1 /N = 0.5 (homogeneous sub-population size with N 1 = N 2 = 5000), and I 1 (0) = I 2 (0) = 1. For these parameter values, the isolated reproduction numbers for sub-populations 1 and 2 are R 01 = 2.45 and R 02 = 1.05, respectively. The overall effective reproduction number for the metapopulation is R 0 = 2.03. From the equations in (16), we have Z 1 = 4210 and Z 2 = 2730. These values are also consistent with the cumulative infections in groups 1 and 2 presented in Figure 2(b) from numerical simulations of system (1) with the initial condition (2).

Influence of heterogeneity and non-homogeneous mixing
To explore the effect of mixing on model predictions, we consider preferential mixing (4) and examine heterogeneity in various characteristics affecting the mixing function including activity a i , group size N i , and preference i . For demonstration purposes, we fix the same parameter values β 1 = β 2 = 0.035, N 1 = N 2 = 5000, and the mean activity (a 1 + a 2 )/2 = 10. Values for other parameters may vary depending on applications.

Heterogeneity in activity a i
Consider first the case of proportionate mixing ( 1 = 2 = 0). The effect of heterogeneity and variability in activity a i on F is illustrated in Figure 3, which shows the final sizes for three sets of activities (a 1 , a 2 ). The a i values used in the three rows from top to bottom are (with increasing variability) Top : (a 1 , a 2 ) = (10, 10), Middle :(a 1 , a 2 ) = (14, 6), Bottom :(a 1 , a 2 ) = (18, 2).
The left column shows the values of Z i determined by equations in (16), as the intersection of F = 1 and G = 1 (marked by the dot). The right column shows the epidemic curves based on the simulations of the system (1) with initial condition (2). Notice that the number of individuals in the R class represents the cumulative infections at anytime.
In this case, the basic reproduction numbers are R 0 = 1.75, 2.03, and 2.87, respectively, while the final sizes are F = 0.713, 0.695, and 0.599, respectively. We observe the following changes as the variability in activity increases • F decreases; • R 0 increases; • P increases; • the time to epidemic peak decreases.
To make it more transparent how variability in a i affects R 0 , F, and their relationship, Figure 4 plots F and R 0 for several sets of activity (a 1 , a 2 ) that have the same mean. It shows that the homogeneous activity (a 1 = a 2 = 10) corresponds to the smallest R 0 = 1.75 and largest F = 0.713, while the heterogeneous activity with the highest variability (a 1 = 18, a 2 = 2) corresponds to the largest R 0 = 2.87 and smallest F = 0.599. This provides an example that the final size does not vary with R 0 positively. That is, the usual understanding based on models with homogeneous assumption that F increases with R 0 (see Figure 1) may not be true when more realistic mixing functions are incorporated in meta-population models. Such results may have important implications for public health policy-making. For example, when a control programme is aimed at reducing R 0 , it should also check whether or not it may lead to an increased final epidemic size.

Preference level in mixing
The examples in Section 3.1.1 are for proportionate mixing (i.e. i = 0). A more realistic mixing, in general, is when i > 0, representing a preference of contacts with others in the same sub-group. Heterogeneity in factors such as a i and i may have an important influence in R 0 and F, as suggested in (19). We now explore the effect of variability in a i and the joint effect of a i and i on F and R 0 . Consider first the simpler case when 1 = 2 . Figure 5 provides a similar information as in Figure 4 but includes cases of i > 0.
We observe in Figure 5 that (i) for each fixed i value, R 0 increases but F decreases with the variability in a i , and the changes are more dramatic for higher i ; and (ii) for each fixed heterogeneous pair (a 1 , a 2 ), R 0 increases and F decreases with i . Because the bar charts in Figure 5(a,b) are for the same sets of a i and i , it shows again that F and R 0 are related negatively when heterogeneity in a i and/or i is varied. This illustrates again an opposite relationship between F and R 0 to that shown in Figure 1.

Sub-population size N i
The influence of heterogeneity in N i can be examined by fixing all other parameter values. For example, Figure 6 shows the case when a 1 = 15, a 2 = 5, i = 0, and all other parameter values except N i are the same as before. The total N = N 1 + N 2 = 10 4 is fixed but   When (a 1 , a 2 ) changes from (10,10) to (18,2), R 0 increases from 1.75 to 2.87 whereas F decreases from 0.713 to 0.599. Figure 5. Plots of F and R 0 for various preference level i and activity variability in a i (with equal mean (a 1 + a 2 )/2 = 10), showing that as i or variability in a i increases, F decreases while R 0 increases. σ = N 1 /N varies. It shows that both F and R 0 increase with σ , and thus, in this case, F and R 0 are positively related. The values of F and R 0 for the homogeneous case (σ = 0.5 and a 1 = a 2 = 10) are also shown by the dot-dashed and dashed lines, respectively. Thus, heterogeneity in characteristics including a i and N i can either increase or decrease F and R 0 .

Examples
Because population heterogeneity in factors such as a i or i may affect the final size and the basic reproduction number in significant ways, we investigate in this section how the model evaluations of intervention strategies may be affected. Two examples are discussed below.

Heterogeneity in vaccine coverage p i
Consider the model equations in (1) with the initial condition in (9). Assume that the vaccine efficacy is 100% so that the vaccination coverage in group i is the same as the  immunity p i , and the total number of vaccine doses is p 1 N 1 + p 2 N 2 . When N 1 = N 2 with fixed total size N = N 1 + N 2 , fixing the total number of vaccine doses is equivalent to fixing the value of p = p 1 + p 2 . Thus, we can examine the effect of heterogeneity in vaccine allocation (p 1 , p 2 ) on F and R e for fixed p. Choose parameter values to be the same as before except that 1 = 2 = 0.6, β = 0.05, a 1 = 8, a 2 = 12, and 1/γ = 7. For these parameter values, the basic reproduction numbers for the two groups are R 01 = 2.8 and R 02 = 4.2 with R 0 = 3.8. For demonstration purpose, let p 1 + p 2 = 1.32. For the homogeneous case (a 1 = a 2 = 10 and p 1 = p 2 = 0.66), the final size is F (hom) = 0.1 and the effective reproduction number is R (hom) e = 1.2, which are labelled in Figure 7 by the dot-dashed and dashed lines, respectively.
In Figure 7, F and R e for several sets of heterogeneous coverage with p 1 + p 2 = 1.32 are plotted. It is interesting to notice that for p 1 = 0.55, both F and R e are smaller than This shows that, for a fixed amount of vaccine doses, although the homogeneous coverage p 1 = p 2 can provide a relatively more effective allocation than many heterogeneous vaccine allocations in reducing R e , it may not be the best choice. This result is consistent with those found in Feng et al. [11], in which it is shown that the optimal vaccine allocation can be determined using a gradient approach based on a Lagrange optimization problem. The results in Feng et al. [11] is formulated in terms of minimizing the effective reproduction number.
In addition to the final size, the peak size (P) and time to peak (T ) are also sensitive to the heterogeneity in p i . This is illustrated in Figure 8 (the p 1 values for different curves are labelled in the legend). All parameter values are the same as in Figure 7.
We observe in Figure 8(a) that, as p 1 increases from 0.35 to 0.55, the number of cumulative infections decreases (see the solid curves). However, as p 1 continues to increase to 0.65 until 0.85, the number of cumulative infections becomes increasing (the dot-dashed to dashed curves). Similarly in Figure 8(b), P and T decrease from p 1 = 0.35 to 0.55 (solid curves) but increase from p 1 = 0.65 to 0.85 (dashed curves).

The 2011 acute haemorrhagic conjunctivitis outbreak in China
In this section, we examine how population heterogeneity in various characteristics may affect model evaluations of control and intervention programmes. We demonstrate this by using an example based on the 2011 outbreak of AHC in Changsha, China. Studies on the efficacy of quarantine during this outbreak is presented in [7,8]. By fitting a homogeneous SIR model to the data from a case report for a school population, they obtained an estimate for the transmission coefficient β. Based on this, we adopt β = 0.05 as a baseline value for the probability of infection on contact with a baseline value for activity to be a = 15. The infectious period for this AHC outbreak in China is about 7-10 days, so we choose 1/γ to be 8 days. Under these parameter values, the baseline value for the reproduction number is R 0 = βa/γ = 6. This is much higher than the value of the reproduction number R * 0 presented in [9] for the 2003 AHC outbreak in Mexico, where R * 0 = 2.64. The main reason for this difference is in the value of infectious period, which are about 8 and 3 days for the outbreaks in China and Mexico, respectively.
Consider the case when the population is divided into two groups, and individuals in group 1 have a lower risk of being infected due to a reduced contact rate, better health habit (e.g. washing hands to increase protection against infection) or other behavioural changes as suggested by public health officials. Thus, let β 2 = 0.05 (baseline value) and N = N 1 + N 2 = 10 3 (e.g. a school population) be fixed. Assume that individuals in the low risk group have a reduced activity with a 1 = 5 (< a 2 = 15), and let The mixing is assumed to be proportionate. We will examine the changes in F due to interventions aimed at decreasing δ and increasing σ . The simulation results are presented in Figure 9, which shows the model evaluations for the effect of reducing δ and increasing σ on the final epidemic size F. We observe in Figure 9 that increasing σ can be effective in reducing the final size only if δ is sufficiently small. For example, if 60% of individuals are in group 1 (σ = 0.6) and the reduction factor for the risk of infection is δ = 0.6, the final size is 0.79. The reason for the high percentage of cumulative infections is because of the high value of the basic reproduction number R 20 = 6 for group 2. In fact, in the absence of control, almost the entire population will be infected during the outbreak. For the same value of δ = 0.6, even when σ = N 1 /N = 1, the final size will still be as high as 32%. The only possibility to have the final size below 10% is when δ ≤ 0.2 and σ ≥ 0.9. However, as long as σ < 1 (or N 2 /N > 0), there will be a significant number of infections. Figure 10. The surface is a plot of the final size F versus δ and σ , and the plane corresponds to F = 0.2. The intersection curve identifies the region for (δ, σ ) (where the surface is below the plane) in which the final size is below 20%. Figure 10 plots the surface of the final size as a function of δ and σ , which is obtained by connecting the values of the bar chart. We can identify the region in the (δ, σ ) plane in which the final size is below a certain level. For example, the intersection curve of the F surface with the plane with F = 0.2 determines the proportion σ of individuals in group 1 (for a given δ value), above which the final size is below 20%.

Discussion
The results presented in this paper provide another example that incorporation of population heterogeneity and non-homogeneous mixing in epidemic models can generate outcomes that are qualitatively different from that based on models for populations with homogeneous mixing. Particularly, for two of the most important quantities, reproduction number (R 0 or R e ) and final epidemic size (F ), we demonstrated how they may be influenced by population heterogeneity in various characteristics and mixing by comparing the results from a meta-population model with and without heterogeneity and non-homogeneous mixing. The final epidemic size relation and the reproduction number derived from the meta-population model (1) with a general mixing matrix C = (c ij ) allows us to conduct various comparisons (see Sections 3.1.1-3.1.3). We also presented two examples to illustrate the effect of heterogeneity and mixing in the model evaluation of intervention strategies (Section 3.2). An interesting result is that, although homogeneous vaccine coverage may produce higher reduction in F than many heterogeneous combinations of coverage, it may not be the most effective option (see Figures 7 and 8) if the objective is to maximally reduce F for a given number of vaccine doses.
Another interesting finding of this study is the correlation between F and R 0 . In models with homogeneous mixing, F and R 0 are positively related, as shown in Figure 1. However, this relation may be reversed when heterogeneity is considered in a meta-population model with non-homogeneous mixing, as illustrated in Figures 4 and 5. Particularly, an increase of variability in activity (a 1 − a 2 ) or preference level ( i ) can increase R 0 but decrease F. Thus, in this case, F and R 0 are negatively related. The finding that R 0 can be increased by population heterogeneity in various factors is also supported by the fact pointed out by Tilman and Kareiva [20] that R 0 is higher in situations with significant spatial or heterogeneity features.
The final size relation derived in this paper from a meta-population model, which considers n sub-populations, multiple types of heterogeneity, and a general mixing function including preferential mixing, provides a useful evaluation tool for intervention strategies. It also raised the warning issue that, although in many models control measures can be evaluated using the effective reproduction number R e , it should not be the only quantity to consider, especially if a programme reduces R e but increases F.