Early estimates of epidemic final sizes

ABSTRACT Early in a disease outbreak, it is important to be able to estimate the final size of the epidemic in order to assess needs for treatment and to be able to compare the effects of different treatment approaches. However, it is common for epidemics, especially of diseases considered dangerous, to grow much more slowly than expected. We suggest that by assuming behavioural changes in the face of an epidemic and heterogeneity of mixing in the population it is possible to obtain reasonable early estimates.


Introduction
It has been standard practice in analysing disease outbreaks to formulate a dynamical system as a deterministic compartmental model, then to use observed early outbreak data to fit parameters to the model, and finally to analyse the dynamical system to predict the course of the disease outbreak and to compare the effects of different management strategies. In general, such models predict an initial stochastic stage (while the number of infectious individuals is small), followed by a period of exponential growth. Measurement of this early exponential growth rate is an essential step in estimating contact rate parameters for the model. However, instances have been noted where the growth rate of an epidemic is clearly slower than what this approach predicts. One of the earliest examples [4] concerns the growth of HIV/AIDS in the United States, and a possible explanation might be the mixture of short-term and long-term contacts. This could be a factor in other diseases where there are repeated contacts in family groups and less frequent contacts outside the home. Some suggested explanations for slower than expected growth include metapopulation models with spatial structure including cross-coupling and mobility, clustering in spatial structure, dynamic contacts, agent-based models with differences in infectivity and susceptibility of individuals, and reactive behavioural changes early in a disease outbreak.
In Liberia, a country with a population of close to 4,000,000, the 2014 epidemic of the Ebola virus produced fewer than 12,000 cases, although estimates of the reproduction number from early data would have predicted more than 1,500,000 cases. Such large discrepancies led to growth models that are not mechanistic but are very useful for fitting data and making short-term predictions. One example is a discrete model [5] that has been applied to Ebola data in [6]. Another approach has been to formulate a so-called general growth model of the form C′(t) = rC(t)^p, where C(t) is the number of disease cases occurring up to time t, r > 0 is a growth rate, and p, 0 ≤ p ≤ 1, is a 'deceleration of growth' parameter [3,12]. These models have been very successful for fitting epidemic growth, but they are not useful for predicting the final size of an epidemic. The parameter p represents the effect of behavioural responses during the initial growth phase of the epidemic but, without some additional term to model the decreasing phase of the epidemic, it cannot be incorporated into a final size relation.
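To illustrate why a growth model of this kind has no final size, the following minimal sketch evaluates the closed-form solution of C′(t) = rC(t)^p, the usual form of the general growth model. The function name, the initial value c0 and the numerical values of r and p are hypothetical and chosen only for illustration.

```python
import math


def ggm_cumulative_cases(t, r, p, c0=1.0):
    """Closed-form solution of C'(t) = r * C(t)**p with C(0) = c0 > 0."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 1.0:
        # p = 1 recovers ordinary exponential growth.
        return c0 * math.exp(r * t)
    # For p < 1, separating variables shows C**(1 - p) grows linearly in t.
    return (c0 ** (1.0 - p) + (1.0 - p) * r * t) ** (1.0 / (1.0 - p))


# Hypothetical values, for illustration only.
for p in (1.0, 0.8, 0.5):
    cases = [round(ggm_cumulative_cases(t, r=0.2, p=p)) for t in (0, 50, 100)]
    print(f"p = {p}: {cases}")
```

For every p in the allowed range the solution keeps increasing with t, which is consistent with the remark that such a model cannot by itself yield a final size.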
An approach that may be more helpful in estimating the final size of an epidemic from early data is the suggestion of [2] that behavioural change may decrease the basic reproduction number in a severe epidemic.
It has been observed that in many disease outbreaks superspreading events, situations in which a small fraction of the population causes more than its share of disease cases, have been significant [9]. For example, superspreading events were important in the SARS outbreak of 2002-2003 [10], in which there was a cluster of at least 125 cases apparently infected by a single index patient as well as another cluster of perhaps 300 cases. For a given basic reproduction number, the number of disease cases is generally smaller if there is superspreading than if the population mixing is homogeneous.
In the 2014-2015 West Africa Ebola epidemic, there is also evidence of superspreading [8]. It is suggested in [7] that typically about 20% of the individuals in a population are responsible for 80% of the disease cases.
In this note, we suggest that these two effects may give useful early estimates of the final size of an epidemic.

Behavioural changes during an epidemic
It is natural to expect that when a disease outbreak begins there may be a behavioural response to attempt to protect against infection. This is particularly evident if the disease is considered to be very dangerous, as has been the case in Ebola virus outbreaks, and much less evident with diseases not considered to be dangerous, such as seasonal influenza outbreaks. A model described in [2] for the Ebola virus outbreak of 2014 in West Africa used data from the first 40 weeks of the outbreak to estimate disease model parameters and then assumed a response leading to a decrease by a factor of 0.6 in the reproduction number. This assumption led to a very good fit of the model to continuing disease data and the ultimate final size for the epidemic.
We suggest that assuming a simple model of SIR or SEIR type for an emerging disease outbreak, even though it may fail to capture many of the properties of the disease, together with an assumption of a decrease in the reproduction number after an initial stage, may give a good start on obtaining a reasonable estimate of the epidemic final size. For simplicity, we will assume that the effect of the measures causing this reduction is immediate, even though it would be more realistic to assume a reduction carried out over a given time interval. Clearly, we would expect the reduction factor in the reproduction number and the time of the reduction to be sensitive parameters.
Example 2.1: The 2014 Ebola virus outbreak in Liberia has been modelled by a system that includes some specific aspects of the Ebola virus [1,2]. At the beginning of an outbreak of an emerging or re-emerging disease, the specific aspects of the disease may not yet be well understood. In order to make an early estimate of the final size of the outbreak of a disease serious enough that there may be substantial behavioural change in response, we suggest using a generic SEIR model incorporating such response to obtain an initial estimate. We use an SEIR model for the Ebola outbreak in Liberia.
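A generic SEIR model with an immediate, one-time behavioural response can be sketched as follows. This is only an illustration under stated assumptions, not the fitted model of the example: the function name and the values of beta, kappa and alpha are hypothetical, the response is modelled by multiplying the contact rate by `factor` from time `t_switch` onward, and time is measured in days so that 40 weeks corresponds to t = 280.

```python
def seir_total_cases(beta, kappa, alpha, n, i0=1.0,
                     t_switch=None, factor=1.0, t_end=2000.0, dt=0.1):
    """Integrate a basic SEIR model with RK4 and return N - S(t_end).

    From time t_switch onward the contact rate beta is multiplied by
    `factor` (an assumed immediate behavioural response).
    """
    def rhs(state, b):
        s, e, i = state
        new_infections = b * s * i / n
        return (-new_infections,
                new_infections - kappa * e,
                kappa * e - alpha * i)

    def shifted(state, deriv, h):
        return tuple(x + h * d for x, d in zip(state, deriv))

    state = (n - i0, 0.0, i0)  # (S, E, I) at t = 0
    for step in range(int(round(t_end / dt))):
        t = step * dt
        responding = t_switch is not None and t >= t_switch
        b = beta * factor if responding else beta
        k1 = rhs(state, b)
        k2 = rhs(shifted(state, k1, dt / 2), b)
        k3 = rhs(shifted(state, k2, dt / 2), b)
        k4 = rhs(shifted(state, k3, dt), b)
        state = tuple(x + dt / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
                      for x, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4))
    return n - state[0]


# Hypothetical parameter values (per day); only N is taken from the text.
N = 4_000_000
params = dict(beta=0.25, kappa=0.1, alpha=1 / 6, n=N)
no_response = seir_total_cases(**params)
with_response = seir_total_cases(**params, t_switch=280.0, factor=0.6)
print(round(no_response), round(with_response))
```

With these illustrative values the reproduction number drops below 1 at the switch, so the run with a response ends far smaller than the run without one. The result is sensitive to both `factor` and `t_switch`.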

Heterogeneity in epidemic models
In order to describe a simple superspreading epidemic model, we consider an SEIR epidemic model in a population divided into two groups. Group 1 members (superspreaders), forming a fraction ρ of the total population, have a contact rate σa and Group 2 members have a contact rate a, with σ > 1. Mixing between groups is proportionate. We take σ = 16, ρ = 0.2 in order to conform to the suggestion in [7] that typically about 20% of the individuals in a population are responsible for 80% of the disease cases.
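As a quick check of this choice, the share of all contacts made by Group 1 under proportionate mixing can be computed directly from the group sizes and contact rates. Reading "responsible for 80% of the disease cases" as "making 80% of the contacts" is an interpretation assumed here.

```latex
\[
\frac{\sigma a\,\rho N}{\sigma a\,\rho N + a\,(1-\rho)N}
  = \frac{\sigma\rho}{\sigma\rho + 1 - \rho}
  = \frac{16 \times 0.2}{16 \times 0.2 + 0.8}
  = \frac{3.2}{4.0}
  = 0.8 .
\]
```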
We assume
• Total population size is N. Group 1 population size is N_1 = ρN. Group 2 population size is N_2 = (1 − ρ)N.
• Group i is divided into susceptibles (S_i), exposed members (E_i), infectives (I_i) and recovered members (R_i).
• Mixing between groups is proportionate. The fractions of contacts of susceptibles with groups 1 and 2 are, respectively, σρ/(σρ + 1 − ρ) and (1 − ρ)/(σρ + 1 − ρ).
• Infectives in each group recover at rate α.
• There are no demographic effects (births, deaths, migration) on the population.
These assumptions give the equations for Group 1 and, similarly, for Group 2, and thus the full model (2). The parameter values corresponding to the SEIR model (1) are given in (3).
To calculate the reproduction number using the next generation method [11], we form the next generation matrix. This matrix has the same non-zero eigenvalues as a 2 × 2 matrix K_L (the next generation matrix with least domain). The matrix K_L has determinant zero, so its only non-zero eigenvalue, the basic reproduction number, is its trace.
From the model (2), we deduce that E_1 → 0, I_1 → 0, and S_1(t) → S_1(∞) ≥ 0, and integration gives the integral of I_1 over the course of the epidemic. Similarly, we deduce that E_2 → 0, I_2 → 0, and S_2(t) → S_2(∞) ≥ 0.
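As a hedged sketch, the listed assumptions are consistent with a two-group model of the following form. The symbols p_1 and p_2 (fractions of contacts made with Groups 1 and 2) and κ (rate of progression from exposed to infective) are notation assumed here for illustration, and a is taken to be the rate of contacts sufficient to transmit infection.

```latex
\[
p_1 = \frac{\sigma\rho}{\sigma\rho + 1 - \rho}, \qquad
p_2 = \frac{1-\rho}{\sigma\rho + 1 - \rho}, \qquad
\lambda(t) = p_1\frac{I_1}{N_1} + p_2\frac{I_2}{N_2},
\]
\[
\begin{aligned}
S_1' &= -\sigma a\, S_1 \lambda, & S_2' &= -a\, S_2 \lambda,\\
E_1' &= \sigma a\, S_1 \lambda - \kappa E_1, & E_2' &= a\, S_2 \lambda - \kappa E_2,\\
I_1' &= \kappa E_1 - \alpha I_1, & I_2' &= \kappa E_2 - \alpha I_2 .
\end{aligned}
\]
% Next generation matrix with least domain: entry (i, j) is the expected
% number of Group i infections caused by one Group j infective.
\[
K_L = \frac{a}{\alpha}
\begin{pmatrix} \sigma p_1 & p_1 \\ \sigma p_2 & p_2 \end{pmatrix},
\qquad \det K_L = 0,
\qquad \mathcal{R}_0 = \operatorname{tr} K_L = \frac{a}{\alpha}\,(\sigma p_1 + p_2).
\]
```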
Integration gives the corresponding integral of I_2. Integration of the equation for S_1 in the model (2) gives an expression for ln(S_1(0)/S_1(∞)), and we combine the expressions. Since S_1′/S_1 = σ S_2′/S_2, we may carry out analogous calculations for S_2, giving a relation with a matrix K. We note that K is the transpose of the next generation matrix with least domain K_L. We now have the final size relation (4). We note a further relation (5), and we could substitute (5) into the final size relation (4) to obtain a single final size equation.
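The steps described here can be sketched as follows, under the assumption that the force of infection has the proportionate-mixing form λ = p_1 I_1/N_1 + p_2 I_2/N_2 and that S_i(0) + E_i(0) + I_i(0) = N_i. The symbols p_1, p_2 and κ are illustrative notation, and the numbering of the original displays is not reproduced.

```latex
% Adding the S, E and I equations of each group and integrating:
\[
(S_i + E_i + I_i)' = -\alpha I_i
\quad\Longrightarrow\quad
\alpha \int_0^\infty I_i(t)\,dt = N_i - S_i(\infty), \qquad i = 1, 2 .
\]
% Integrating S_i'/S_i and substituting the integrals above:
\[
\begin{aligned}
\ln\frac{S_1(0)}{S_1(\infty)} &= \frac{\sigma a}{\alpha}
  \left[ p_1\!\left(1 - \frac{S_1(\infty)}{N_1}\right)
       + p_2\!\left(1 - \frac{S_2(\infty)}{N_2}\right) \right],\\
\ln\frac{S_2(0)}{S_2(\infty)} &= \frac{a}{\alpha}
  \left[ p_1\!\left(1 - \frac{S_1(\infty)}{N_1}\right)
       + p_2\!\left(1 - \frac{S_2(\infty)}{N_2}\right) \right],
\end{aligned}
\qquad
K = \frac{a}{\alpha}
\begin{pmatrix} \sigma p_1 & \sigma p_2 \\ p_1 & p_2 \end{pmatrix}.
\]
% Integrating S_1'/S_1 = sigma S_2'/S_2 links the two groups:
\[
\frac{S_1(\infty)}{S_1(0)} = \left(\frac{S_2(\infty)}{S_2(0)}\right)^{\sigma}.
\]
```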
We simulate the superspreader model (2). The number of new disease cases after t = 280 is 17,718 + 4,531 = 22,249, and the total number of cases is 13,215 + 22,249 = 35,464. This is still larger than the actual number of observed cases, but considerably closer to it than the estimate provided by (1).
The factor σ in the model (2) represents the degree of superspreading. To indicate the extent to which the epidemic final size would depend on the value of the parameter σ, we used the final size relation (4) for the model (without incorporating behavioural response).
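The dependence on σ can be explored with a small solver. The sketch below assumes a two-group proportionate-mixing final size relation of the form ln(S_i(0)/S_i(∞)) = Σ_j K_ij (1 − S_j(∞)/N_j) together with S_1(∞)/S_1(0) = (S_2(∞)/S_2(0))^σ, with almost everyone initially susceptible. The function name and the value R0 = 1.5 are hypothetical and used only for illustration.

```python
import math


def two_group_attack_ratio(r0, sigma, rho, tol=1e-12):
    """Overall attack ratio for a two-group proportionate-mixing model,
    assuming S_i(0) is approximately N_i."""
    if r0 <= 1.0:
        raise ValueError("a major epidemic requires r0 > 1")
    total = sigma * rho + 1.0 - rho
    p1, p2 = sigma * rho / total, (1.0 - rho) / total
    c = r0 / (sigma * p1 + p2)  # value of a / alpha implied by r0

    def g(x):
        # x = S_2(inf) / N_2, so that S_1(inf) / N_1 = x ** sigma.
        return math.log(x) + c * (p1 * (1.0 - x ** sigma) + p2 * (1.0 - x))

    # g is concave, negative near 0 and positive just below 1 when r0 > 1,
    # so bisection brackets the unique root in (0, 1).
    lo, hi = 1e-12, 1.0 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return rho * (1.0 - x ** sigma) + (1.0 - rho) * (1.0 - x)


# Hypothetical reproduction number; rho = 0.2 as in the text.
for sigma in (1.0, 4.0, 16.0):
    print(sigma, round(two_group_attack_ratio(1.5, sigma, 0.2), 3))
```

Setting σ = 1 recovers the homogeneous final size relation, which gives a convenient check on the solver.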
This suggests that superspreading is significant in decreasing the number of disease cases corresponding to a given value of the basic reproduction number.

Conclusion
It is useful to have some idea early in a disease outbreak how serious the outbreak will turn out to be. If the disease is viewed as a serious threat, there will surely be substantial behavioural changes as the disease develops. The standard approaches to estimating the size of a disease outbreak using initial exponential growth rates may give predictions which are so large as to be meaningless. We suggest that it may be possible to use relatively simple SIR or SEIR models, without detailed assumptions about the particular disease being studied but assuming a decrease in the contact rate and heterogeneity of mixing, to obtain more plausible estimates. While a decrease in the contact rate would probably be continuous in reality, we approximate it by a discontinuous decrease at a fixed time. Of course, the results will depend strongly on the size of the assumed decrease and the time at which it is assumed to take place.