Characteristics of an epidemic outbreak with a large initial infection size

ABSTRACT A deterministic model proposed in the literature to approximate the well-known Richards model is investigated. In contrast to earlier work, the assumption of a small initial infection size is relaxed in the current manuscript. Taking advantage of the closed form of the solutions, we establish the epidemic characteristics of disease transmission: the outbreak size, the peak size and the turning point for the cumulative infected cases. It is shown that the usual disease outbreak threshold condition (the basic reproduction number is greater than unity) fails to guarantee the existence of a peaking time and a turning point when the initial infection size is not relatively small. The epidemic characteristics depend not only on the basic reproduction number $R_0$ but also on another index, the net reproduction number $R_0^*$.


Introduction
The Richards model [14]

\[ \frac{dC(t)}{dt} = rC(t)\left[1 - \left(\frac{C(t)}{K_0}\right)^{a}\right], \]

also known as the theta-logistic model [15], was initially formulated as an extension of the logistic model in order to investigate ecological population growth. Over the last decade, it has been widely employed to fit epidemic data. Unlike models with several compartments commonly used to predict the spread of disease, the Richards model considers only the cumulative infective population size, with saturation in growth as the outbreak progresses. The saturation is caused by decreases in recruitment due to attempts to avoid contacts (e.g. wearing a facemask) and the implementation of control measures. In this single equation, $C(t)$ represents the cumulative infected cases at moment $t$, $K_0$ is the carrying capacity or total case number (maximum case number), $r$ is the per-capita growth rate of the infected population and $a$ is the exponent of deviation from the standard logistic curve, which makes the model much more flexible: $a > 1$ (or $a < 1$) signifies that the cumulative number of infections grows faster (or slower) than predicted by the logistic growth model. The solution to the Richards model can be expressed explicitly as

\[ C(t) = K_0\left[1 + a\,e^{-ar(t - t_T)}\right]^{-1/a}, \]

with $t_T$ being the turning point, defined as the time when the second derivative of $C(t)$ equals zero, or equivalently, when $C(t)$ takes the value $K_0(1 + a)^{-1/a}$ [17]. For more details about the Richards model, we refer the reader to the study [4]. Some epidemic patterns suggest a single S-shaped curve for the cumulative cases, which is consistent with the solution of the Richards model. Accordingly, the Richards model fits the single-phase severe acute respiratory syndrome (SARS) outbreaks in Hong Kong and Taiwan well [10,19].
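The closed-form Richards solution can be checked against the differential equation numerically. The following is a minimal sketch, assuming the solution form $C(t) = K_0[1 + a e^{-ar(t-t_T)}]^{-1/a}$, which is consistent with the stated turning-point value $K_0(1+a)^{-1/a}$; the function names and parameter values are hypothetical and serve only as illustration.

```python
import math

def richards_cumulative(t, K0, r, a, t_T):
    """Closed-form solution C(t) = K0 * [1 + a*exp(-a*r*(t - t_T))]**(-1/a)."""
    return K0 * (1.0 + a * math.exp(-a * r * (t - t_T))) ** (-1.0 / a)

def richards_rate(C, K0, r, a):
    """Right-hand side of the Richards equation, dC/dt = r*C*[1 - (C/K0)**a]."""
    return r * C * (1.0 - (C / K0) ** a)

# Hypothetical parameter values, chosen only for illustration.
K0, r, a, t_T = 1000.0, 0.3, 0.5, 20.0

# Cumulative cases at the turning point, and the value stated in the text.
C_turn = richards_cumulative(t_T, K0, r, a, t_T)
C_turn_expected = K0 * (1.0 + a) ** (-1.0 / a)

# Central finite difference of the closed form, to compare with the ODE right-hand side.
h = 1e-5
slope = (richards_cumulative(t_T + h, K0, r, a, t_T)
         - richards_cumulative(t_T - h, K0, r, a, t_T)) / (2.0 * h)

print("C(t_T):", C_turn, "expected:", C_turn_expected)
print("finite-difference slope:", slope, "ODE rate:", richards_rate(C_turn, K0, r, a))
```

Under these assumptions the growth rate $dC/dt$ is largest at $t = t_T$, which is what makes the turning point a natural marker for the change in epidemic trend.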
The Richards model has also been employed to fit the 2005 dengue outbreak in Singapore to study the impact of intervention measures on the turning point [11], the weekly reported dengue case data in Havana City to assess the contribution of hurricanes to dengue transmission [8], and H1N1 influenza [5,9]. The Richards model gives a single S-shaped curve, while some epidemic datasets show a multiphase outbreak pattern, such as dengue outbreaks in Taiwan [6]. To fit these multiple-wave patterns, Hsieh and his collaborators proposed a multiphase Richards model [7], and their method was successfully used to fit the multi-wave dengue outbreak in Taiwan [6] and the 2003 SARS outbreak in Toronto [7]. The Richards model has thus been applied successfully in real-time data fitting and prediction of infection dynamics. However, several of its parameters (e.g. the exponent $a$) have no clear epidemiological interpretation, which poses a puzzle for the theoretical epidemiology community: to find the intrinsic link between the Richards model and well-established deterministic epidemic models, such as the Kermack-McKendrick model. To approximate the Richards model, the authors of [17] recently proposed the following model,

\[ \frac{dS}{dt} = -\frac{\beta S I}{N}, \qquad \frac{dI}{dt} = \frac{\beta S I}{N} - \delta I, \qquad \frac{dN}{dt} = -\delta I, \tag{2} \]

a modified version of the classical SIR model. The novelty of the model lies in the consideration of the 'actually at risk' total population $N(t)$, which is defined as the eventually infected population. That is, $N(t) = S(t) + I(t)$ is the total number of 'actually' vulnerable individuals for the disease transmission at time $t$ [12], $S(t)$ is the number of actually at-risk susceptibles that will be exposed to the pathogen during the entire epidemic under consideration (at 'actual' risk for infection) and $I(t)$ is the number of infected individuals. In this model, the frequency-dependent disease transmission term $\beta SI/N$ is used, while $\delta$ is the removal rate, which covers all forms of removal of infected individuals, whether by disease-induced death or by recovery.
In [17], this model is used to give epidemiological interpretations of the parameters in the Richards model, especially the exponent $a$. The model is also adopted to describe the transmission of avian influenza (H7N9) virus among birds [12] and then extended to infer the dynamics of the cumulative number of infected humans due to infection transmitted by infected birds. By fitting the epidemic data for humans, the authors estimated the key parameters in the model system quantifying the bird and human components of an avian influenza epidemic. More recently, this model has been used to fit the Ebola outbreak in West Africa [18]. All these previous investigations provide important information on epidemic characteristics and suggestions for epidemic control, but most of them implicitly assume that the disease is initialized by an infinitesimally small proportion of the population. That is, the initial number of infected individuals in the deterministic model is assumed to be very small, almost negligible compared to the initial number of susceptibles. However, this underlying assumption for the deterministic model may be inappropriate in some scenarios, as argued in [13]. For example, a discrete formulation for disease spread is vital at the initial contamination stage of an epidemic outbreak when the number of infectives is small [16]. In this case, the randomness should be carefully accounted for, which can be done with a stochastic process, such as the model in [16] and references therein. In other words, the evolution of an epidemic outbreak in an isolated population can be split into two stages: a stochastic Markov process describing the initial contamination and a linked deterministic dynamical system with random initial conditions for the continued development of the outbreak [16]. Motivated by this, we assume that the initial number of infectives $I(0)$ is not negligible compared with the number of susceptibles.
Another reason is late surveillance: a surveillance programme that starts only after the disease has already been spreading records a large initial infection size.
In this work, we relax the assumption of a small initial number of infectious individuals and allow its size to be random. Under this relaxed assumption, we investigate the final size relationship of the epidemic and predict the real-time number and the peak time of infected individuals. In particular, we determine the time when the inflection of the cumulative case curve occurs (the turning point), i.e. the moment when a rapid increase in case numbers is replaced by a slower increase. This inflection point indicates the moment when the rate of increase in the number of cumulative cases reaches its maximum. It is shown that the usual disease outbreak threshold condition (the basic reproduction number is greater than unity) fails to guarantee the existence of a peaking time and a turning point.
The remainder of this paper begins by solving the model system to obtain a closed form of the solutions, based on which the final size, peak time and turning point are discussed. The conclusion and discussion are given in the last section.
To investigate the characteristics of the epidemic pattern, we first solve system (2) directly to obtain a closed form of the solutions in terms of the model parameters and the initial values $S(0) = S_0$, $I(0) = I_0$ and $N(0) = N_0 = S_0 + I_0$.

Explicit solutions
The first and last equations of (2) give $dS/dN = (\beta/\delta)\,S/N$, which indicates

\[ S(t) = S_0\left(\frac{N(t)}{N_0}\right)^{\beta/\delta}. \tag{3} \]

We now solve the system in two cases: $\beta = \delta$ and $\beta \neq \delta$.
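Case $\beta = \delta$. In this case, Equation (3) reduces to

\[ S(t) = \frac{S_0}{N_0}\,N(t). \tag{4} \]

Therefore $I(t) = N(t) - S(t) = I_0 N(t)/N_0$ and the last equation of (2) reads $dN/dt = -\delta I_0 N(t)/N_0$, from which we obtain

\[ N(t) = N_0\,e^{-\delta I_0 t/N_0}. \]

Using the relationship (4), we can express the solution as

\[ S(t) = S_0\,e^{-\delta I_0 t/N_0}, \qquad I(t) = I_0\,e^{-\delta I_0 t/N_0}. \tag{5} \]

Case $\beta \neq \delta$. Define

\[ K = N_0\left(\frac{N_0}{S_0}\right)^{\delta/(\beta-\delta)}, \]

which is a constant determined by the initial values and intrinsic coefficients. Note that $K > N_0$ when $\beta > \delta$, while $K < N_0$ when $\beta < \delta$. In this case, Equation (3) can be written as

\[ S(t) = K\left(\frac{N(t)}{K}\right)^{\beta/\delta}. \tag{6} \]

Substituting Equation (6) into $I = N - S$ gives

\[ I(t) = N(t)\left[1 - \left(\frac{N(t)}{K}\right)^{\beta/\delta - 1}\right], \tag{7} \]

and correspondingly, the $N$-equation in Equation (2) becomes

\[ \frac{dN}{dt} = -\delta N\left[1 - \left(\frac{N}{K}\right)^{\beta/\delta - 1}\right]. \tag{8} \]

Since $\beta \neq \delta$, Equation (8) takes the form of a Bernoulli equation (which is also the form of the Richards equation) and its solution can be obtained explicitly as

\[ \left(\frac{N(t)}{K}\right)^{1 - \beta/\delta} = 1 + \frac{I_0}{S_0}\,e^{(\beta-\delta)t}. \tag{9} \]

Hence, the solution for the component $N(t)$ can be expressed as

\[ N(t) = K\left[1 + \frac{I_0}{S_0}\,e^{(\beta-\delta)t}\right]^{-\delta/(\beta-\delta)}. \]

Differentiating the above expression with respect to $t$, we get

\[ N'(t) = -\delta K\,\frac{I_0}{S_0}\,e^{(\beta-\delta)t}\left[1 + \frac{I_0}{S_0}\,e^{(\beta-\delta)t}\right]^{-\beta/(\beta-\delta)}. \]

On the other hand, $N'(t) = -\delta I(t)$, therefore

\[ I(t) = K\,\frac{I_0}{S_0}\,e^{(\beta-\delta)t}\left[1 + \frac{I_0}{S_0}\,e^{(\beta-\delta)t}\right]^{-\beta/(\beta-\delta)}, \tag{10} \]

and the susceptible population size is

\[ S(t) = N(t) - I(t) = K\left[1 + \frac{I_0}{S_0}\,e^{(\beta-\delta)t}\right]^{-\beta/(\beta-\delta)}. \]

This solution for $S(t)$ can also be obtained directly from Equation (3).

The closed form (10) for the number of infected individuals can also be applied in various scenarios to predict the starting time and the duration of a disease outbreak. For example, to predict the starting time, we just need to solve the equation $I(t) = 1$ for the time $t$ and take the negative root, which is the starting time when only one individual was infected. Similarly, the duration of the disease outbreak can be defined as the period in which the number of daily infected cases $\beta SI/N$ is greater than some threshold value. To predict it, one may evaluate the two roots in $t$ of the equation obtained by setting $\beta S(t)I(t)/N(t)$ equal to that threshold, and take the difference of the two roots. Note that the same closed form of solutions is obtained in [12,18] for the case when $\beta \neq \delta$. In the following sections, we use these explicit solutions to investigate various indices quantifying the epidemic characteristics.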
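The explicit solution for $\beta \neq \delta$ can be compared with a direct numerical integration of the system. The sketch below is illustrative only: it assumes the system $dS/dt = -\beta SI/N$, $dI/dt = \beta SI/N - \delta I$ with $N = S + I$, and the closed form $S = K v^{-\beta/(\beta-\delta)}$, $N = K v^{-\delta/(\beta-\delta)}$ with $v = 1 + (I_0/S_0)e^{(\beta-\delta)t}$ and $K = N_0 (N_0/S_0)^{\delta/(\beta-\delta)}$. The function names, the fixed-step Runge-Kutta integrator and the parameter values are hypothetical choices.

```python
import math

def closed_form(t, beta, delta, S0, I0):
    """Return (S, I, N) at time t from the explicit solution (requires beta != delta)."""
    N0 = S0 + I0
    K = N0 * (N0 / S0) ** (delta / (beta - delta))
    v = 1.0 + (I0 / S0) * math.exp((beta - delta) * t)
    S = K * v ** (-beta / (beta - delta))
    N = K * v ** (-delta / (beta - delta))
    return S, N - S, N

def integrate_rk4(beta, delta, S0, I0, t_end, steps=2000):
    """Classical fixed-step RK4 for dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - delta*I."""
    def rhs(S, I):
        new_cases = beta * S * I / (S + I)
        return -new_cases, new_cases - delta * I

    h = t_end / steps
    S, I = S0, I0
    for _ in range(steps):
        k1 = rhs(S, I)
        k2 = rhs(S + 0.5 * h * k1[0], I + 0.5 * h * k1[1])
        k3 = rhs(S + 0.5 * h * k2[0], I + 0.5 * h * k2[1])
        k4 = rhs(S + h * k3[0], I + h * k3[1])
        S += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        I += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return S, I, S + I

# Hypothetical parameters with a large initial infection size (20% of N0).
beta, delta, S0, I0, t_end = 0.5, 0.2, 800.0, 200.0, 10.0

exact = closed_form(t_end, beta, delta, S0, I0)
numeric = integrate_rk4(beta, delta, S0, I0, t_end)
print("closed form (S, I, N):", exact)
print("RK4         (S, I, N):", numeric)
```

With these values the two results should agree to within the integration error, which supports using the closed form in place of a numerical solver when evaluating the indices below.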

Reproduction numbers
Before investigating the epidemic characteristics, we first introduce two indices, the basic reproduction number $R_0$ and the running reproduction number $R_t^*$ at time $t$ [1,12]. The basic reproduction number of model (2), $R_0 = \beta/\delta$, represents the number of secondary infections generated by the introduction of a primary infection into a total population previously unexposed to the disease. However, since the model is seeded with an initial infection size $I_0$, we also consider the running reproduction number $R_t^*$ at time $t$ [1,12], given by

\[ R_t^* = \frac{\beta}{\delta}\,\frac{S(t)}{N(t)} = R_0\,\frac{S(t)}{N(t)}, \]

which measures the number of secondary infections caused by a single infected individual in the population at time $t$. Naturally, the current magnitude of the running reproduction number (i.e. whether or not it exceeds one) determines whether the infection increases or decreases. It is easy to see that $R_0$ is a static constant, solely dependent on the model parameters, while $R_t^*$ is time-dependent, 'running' with the disease spread. Moreover, because $S(t) < N(t)$ whenever $I(t) > 0$, the basic reproduction number $R_0$ is always greater than the running reproduction number $R_t^*$. We write the net reproduction number [2] as the running reproduction number at the initial time $t = 0$:

\[ R_0^* = \frac{\beta}{\delta}\,\frac{S_0}{N_0} = R_0\,\frac{S_0}{S_0 + I_0}, \tag{12} \]

which will play an important role in the whole paper. Note that the net reproduction number is not equal to the basic reproduction number $R_0$, as $I_0$ is not negligible. The net reproduction number gives the average number of secondary infectious cases resulting from each case in a given population (with a proportion of infectious individuals). From Equation (12) we have $N_0/S_0 = R_0/R_0^*$, so for $R_0 \neq 1$ the parameter $K$ defined in Section 2.1 can be rewritten as

\[ K = N_0\left(\frac{R_0}{R_0^*}\right)^{1/(R_0 - 1)}. \tag{13} \]

On the other hand, from Equation (12), we can also get

\[ \frac{I_0}{S_0} = \frac{R_0 - R_0^*}{R_0^*}. \tag{14} \]

Equalities (13) and (14) will be used in subsequent sections.
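As a small illustration of how the two indices relate to the initial data, the following sketch computes $R_0$ and $R_0^*$ and checks the two identities used later, $K = N_0 (R_0/R_0^*)^{1/(R_0-1)}$ and $I_0/S_0 = (R_0 - R_0^*)/R_0^*$. These identities are written here as they follow from $R_0^* = R_0 S_0/N_0$ and $K = N_0 (N_0/S_0)^{\delta/(\beta-\delta)}$; the function name and numerical values are hypothetical.

```python
def reproduction_numbers(beta, delta, S0, I0):
    """Return (R0, R0_star): basic and net reproduction numbers."""
    R0 = beta / delta
    R0_star = R0 * S0 / (S0 + I0)
    return R0, R0_star

# Hypothetical parameters with a non-negligible initial infection size.
beta, delta, S0, I0 = 0.5, 0.2, 800.0, 200.0
N0 = S0 + I0
R0, R0_star = reproduction_numbers(beta, delta, S0, I0)

# K computed from the initial values and from the reproduction numbers.
K_direct = N0 * (N0 / S0) ** (delta / (beta - delta))
K_from_R = N0 * (R0 / R0_star) ** (1.0 / (R0 - 1.0))

# Initial infected-to-susceptible ratio, directly and from the reproduction numbers.
ratio_direct = I0 / S0
ratio_from_R = (R0 - R0_star) / R0_star

print("R0 =", R0, " R0* =", R0_star)
print("K:", K_direct, K_from_R)
print("I0/S0:", ratio_direct, ratio_from_R)
```

Here 20% of the population is initially infected, so the net reproduction number is 20% smaller than the basic one.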

Outbreak size
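During an epidemic outbreak, most people are keen to know 'How big is an outbreak likely to be?'. One can answer this question with the outbreak size, which is the cumulative number of infected individuals during the disease transmission. Mathematically, it is determined by the quantity $C_\infty = N_0 - \lim_{t\to\infty} S(t)$.
For the case when $\beta = \delta$ (the basic reproduction number $R_0 = 1$), we have $\lim_{t\to\infty} S(t) = 0$ and $\lim_{t\to\infty} I(t) = 0$ by (5) and therefore, the outbreak size is $C_\infty = N_0$. For the case when $\beta \neq \delta$, the explicit solution gives $S(t) = K[1 + (I_0/S_0)e^{(\beta-\delta)t}]^{-\beta/(\beta-\delta)}$. If $\beta > \delta$ ($R_0 > 1$), the bracket tends to infinity and its exponent is negative, so $\lim_{t\to\infty} S(t) = 0$ and again $C_\infty = N_0$. If $\beta < \delta$ ($R_0 < 1$), the bracket tends to 1, so $\lim_{t\to\infty} S(t) = K < N_0$ and $C_\infty = N_0 - K$.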
Summarizing the above argument, we have the following result about the outbreak size:

Proposition 2.1:
The outbreak size is either $N_0$ (when the basic reproduction number $R_0 \geq 1$, which implies that every individual involved will get infected) or $N_0 - K$ (when $R_0 < 1$).
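The outbreak size can be evaluated directly from the parameters and initial values. The sketch below is illustrative and rests on two assumptions derived from the closed-form solution rather than quoted from it: the limit of $S(t)$ is 0 for $\beta \geq \delta$, and it is $K = N_0 (N_0/S_0)^{\delta/(\beta-\delta)}$ for $\beta < \delta$. The function names and values are hypothetical.

```python
import math

def outbreak_size(beta, delta, S0, I0):
    """Cumulative number infected, C_inf = N0 - lim S(t), initial cases included."""
    N0 = S0 + I0
    if beta >= delta:
        # R0 >= 1: every at-risk individual is eventually infected.
        return N0
    # R0 < 1: S(t) tends to K < N0 (assumed limit of the closed form).
    K = N0 * (N0 / S0) ** (delta / (beta - delta))
    return N0 - K

def susceptible(t, beta, delta, S0, I0):
    """Closed-form S(t) for beta != delta, used here only as a cross-check."""
    N0 = S0 + I0
    K = N0 * (N0 / S0) ** (delta / (beta - delta))
    v = 1.0 + (I0 / S0) * math.exp((beta - delta) * t)
    return K * v ** (-beta / (beta - delta))

# Hypothetical values: a decaying outbreak (R0 = 0.4) seeded with 200 cases.
S0, I0 = 800.0, 200.0
size_subcritical = outbreak_size(0.2, 0.5, S0, I0)
size_supercritical = outbreak_size(0.5, 0.2, S0, I0)

print("R0 < 1:", size_subcritical)
print("R0 > 1:", size_supercritical)
```

Even when $R_0 < 1$ the outbreak size exceeds $I_0$ in this example, because the large initial infected group still generates some secondary cases before dying out.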

Epidemic peak
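Once an outbreak has begun, knowing its potential severity helps public health authorities to respond immediately and effectively. The epidemic peak [3] is the largest number of diseased individuals in the population, that is, the maximum value of $I(t)$. It is particularly important to find the time (the peaking time $t_P$) at which the number of infected cases reaches its maximum. In this subsection, we find the peaking time $t_P$ and the peak size at this moment.
From the second equation of (2), $I'(t) = I(t)[\beta S(t)/N(t) - \delta] = \delta I(t)(R_t^* - 1)$. If $R_0 \leq 1$, then $R_t^* < R_0 \leq 1$ and $I(t)$ always decreases. If $R_0 > 1$, Equations (6) and (9) give $R_t^* = R_0[1 + (I_0/S_0)e^{(\beta-\delta)t}]^{-1}$, which decreases strictly from $R_0^*$ to 0. Hence $I'(t) < 0$ for all $t > 0$ when $R_0^* \leq 1$, while for $R_0^* > 1$ there is a unique $t_P > 0$ with $R_{t_P}^* = 1$, that is, $(I_0/S_0)e^{(\beta-\delta)t_P} = R_0 - 1$. Applying Equations (13) and (14), one obtains

\[ t_P = \frac{1}{\delta(R_0 - 1)}\ln\frac{(R_0 - 1)R_0^*}{R_0 - R_0^*}, \qquad N_P = N_0\,(R_0^*)^{-1/(R_0-1)}, \qquad S_P = \frac{N_P}{R_0}, \qquad I_P = \frac{R_0 - 1}{R_0}\,N_P. \tag{15} \]

From the above argument, we can claim that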

Proposition 2.2:
The infected population size always decreases when $R_0 \leq 1$; however, when $R_0 > 1$, the pattern of the evolution of $I(t)$ depends on the net reproduction number $R_0^*$. If $R_0^* \leq 1$, then $I'(t) \leq 0$ for all $t > 0$, while if $R_0^* > 1$, there exists a peak time $t_P$ for the infected cases such that $I'(t) > 0$ if $0 < t < t_P$ and $I'(t) < 0$ if $t > t_P$, where the peak time and the size of each compartment at that time are given by Equation (15).
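The proposition can be illustrated numerically. The following minimal sketch assumes the peak formulas $t_P = \ln[(R_0-1)R_0^*/(R_0-R_0^*)]/[\delta(R_0-1)]$ and $I_P = N_0 \frac{R_0-1}{R_0}(R_0^*)^{-1/(R_0-1)}$, which follow from setting the running reproduction number equal to one in the closed-form solution; the function names and parameter values are hypothetical.

```python
import math

def epidemic_peak(beta, delta, S0, I0):
    """Return (t_P, I_P), or None when the infected population never increases."""
    N0 = S0 + I0
    R0 = beta / delta
    R0_star = R0 * S0 / N0
    if R0 <= 1.0 or R0_star <= 1.0:
        return None  # no peak: I(t) decreases for all t > 0
    t_P = math.log((R0 - 1.0) * R0_star / (R0 - R0_star)) / (delta * (R0 - 1.0))
    I_P = N0 * (R0 - 1.0) / R0 * R0_star ** (-1.0 / (R0 - 1.0))
    return t_P, I_P

def infected(t, beta, delta, S0, I0):
    """Closed-form I(t) for beta != delta, used to cross-check the peak."""
    N0 = S0 + I0
    K = N0 * (N0 / S0) ** (delta / (beta - delta))
    x = (I0 / S0) * math.exp((beta - delta) * t)
    return K * x * (1.0 + x) ** (-beta / (beta - delta))

# Hypothetical parameters: R0 = 2.5 in both runs, but different initial infection sizes.
beta, delta = 0.5, 0.2
with_peak = epidemic_peak(beta, delta, 800.0, 200.0)   # R0* = 2.0 > 1
without_peak = epidemic_peak(beta, delta, 300.0, 700.0)  # R0* = 0.75 <= 1

print("S0=800, I0=200:", with_peak)
print("S0=300, I0=700:", without_peak)
```

Both runs share the same $R_0 > 1$, yet only the first has a peak, which is the point of the proposition: with a large $I_0$ the net reproduction number decides.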

Turning point for the cumulative number of the infected cases
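During an epidemic outbreak, the surveillance programme reports the daily or weekly newly infected cases, and the trend of the newly reported cases indicates whether the epidemic is worsening or improving. This trend can be traced by observing the rate of change of the cumulative cases. Of particular interest is the time (the turning point $t_T$) when the inflection of the cumulative case curve occurs, i.e. the moment when a rapid increase in case numbers is replaced by a slower increase. This moment marks the key turning point when the spread of the disease starts to decline. Theoretically, the turning point $t_T$, defined as the time at which the rate of accumulation changes from increasing to decreasing or vice versa, can be located by finding the inflection point of the epidemic curve.
Let $C(t)$ be the cumulative number of reported cases at time $t$. Then the growth rate of cumulative cases (i.e. the number of newly infected individuals at time $t$) is given by $dC/dt = \beta SI/N$ and the rate of change of the newly infected cases is $d^2C/dt^2 = \frac{d}{dt}(\beta SI/N)$.
If $\beta = \delta$ ($R_0 = 1$), then from Equation (4) and $I(t) = I_0 N(t)/N_0$, we have

\[ \frac{dC}{dt} = \frac{\beta SI}{N} = \frac{\beta S_0 I_0}{N_0^2}\,N(t) = \frac{\beta S_0 I_0}{N_0}\,e^{-\delta I_0 t/N_0}, \]

which shows that the number of newly infected cases always decreases and there is no turning point. If $\beta \neq \delta$ ($R_0 \neq 1$), from Equations (6) and (7), we have $dC/dt = \beta SI/N = \beta K(N/K)^{\beta/\delta}[1 - (N/K)^{\beta/\delta-1}]$ and thus, using $N'(t) = -\delta I(t)$,

\[ C''(t) = -\beta\delta\,I(t)\left(\frac{N}{K}\right)^{\beta/\delta-1}\left[\frac{\beta}{\delta} - \left(\frac{2\beta}{\delta} - 1\right)\left(\frac{N}{K}\right)^{\beta/\delta-1}\right]. \]

Based on Equation (9), we can observe

\[ \frac{\beta}{\delta} - \left(\frac{2\beta}{\delta} - 1\right)\left(\frac{N}{K}\right)^{\beta/\delta-1} = \frac{1 - \dfrac{\beta}{\delta} + \dfrac{\beta}{\delta}\dfrac{I_0}{S_0}\,e^{(\beta-\delta)t}}{1 + \dfrac{I_0}{S_0}\,e^{(\beta-\delta)t}}. \]

Therefore, if $\beta < \delta$, or if $\beta > \delta$ and $1 - \beta/\delta + (\beta/\delta)(I_0/S_0) \geq 0$, then $C''(t) < 0$ for all $t > 0$, which implies that there is no turning point. However, when $\beta > \delta$ and $1 - \beta/\delta + (\beta/\delta)(I_0/S_0) < 0$, there exists a unique positive $t_T$ such that $C''(t_T) = 0$. In fact, $t_T$ can be solved from

\[ 1 - \frac{\beta}{\delta} + \frac{\beta}{\delta}\frac{I_0}{S_0}\,e^{(\beta-\delta)t_T} = 0, \]

which turns out to be

\[ t_T = \frac{1}{\beta-\delta}\ln\frac{(\beta-\delta)S_0}{\beta I_0}. \]

We can obtain the corresponding size of each compartment at the turning point from Equations (6) and (7):

\[ N_T = K\left(\frac{2\beta-\delta}{\beta}\right)^{-\delta/(\beta-\delta)}, \qquad S_T = \frac{\beta}{2\beta-\delta}\,N_T, \qquad I_T = \frac{\beta-\delta}{2\beta-\delta}\,N_T. \]

Furthermore, the number of newly infected cases at the turning point is

\[ C'(t_T) = \frac{\beta S_T I_T}{N_T} = \frac{\beta^2(\beta-\delta)}{(2\beta-\delta)^2}\,N_T. \]

According to Equation (14), when $\beta > \delta$ ($R_0 > 1$), the condition $1 - \beta/\delta + (\beta/\delta)(I_0/S_0) \geq 0$ ($< 0$) is equivalent to $R_0^* \leq R_0^2/(2R_0 - 1)$ ($R_0^* > R_0^2/(2R_0 - 1)$, respectively). Further, applying Equation (13), we have

\[ t_T = \frac{1}{\delta(R_0-1)}\ln\frac{(R_0-1)R_0^*}{R_0(R_0-R_0^*)}, \qquad N_T = N_0\left[\frac{R_0^2}{R_0^*(2R_0-1)}\right]^{1/(R_0-1)}, \]
\[ S_T = \frac{R_0}{2R_0-1}\,N_T, \qquad I_T = \frac{R_0-1}{2R_0-1}\,N_T, \qquad C'(t_T) = \frac{\delta R_0^2(R_0-1)}{(2R_0-1)^2}\,N_T. \tag{16} \]

Summarizing the above argument, we have the following results about the turning point:

Proposition 2.3:
When $R_0 \leq 1$, or when $R_0 > 1$ and $R_0^* \leq R_0^2/(2R_0 - 1)$, there is no turning point, which implies that the growth rate of the cumulative cases decreases for all $t \geq 0$. When $R_0 > 1$ and $R_0^* > R_0^2/(2R_0 - 1)$, there is a unique turning point at $t = t_T$. In this case, the growth rate of cumulative cases is increasing for $t < t_T$ and decreasing for $t > t_T$. The turning point, the size of each compartment at that time, and the maximum rate of increase of cumulative cases are given by Equation (16).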
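The turning-point condition can be checked numerically against the incidence curve. The sketch below is illustrative: it assumes the formula $t_T = \ln[(\beta-\delta)S_0/(\beta I_0)]/(\beta-\delta)$ and the closed-form incidence $\beta SI/N = \beta K x (1+x)^{-(2\beta-\delta)/(\beta-\delta)}$ with $x = (I_0/S_0)e^{(\beta-\delta)t}$, both derived from the explicit solution; the function names and values are hypothetical.

```python
import math

def turning_point(beta, delta, S0, I0):
    """Return t_T, or None when the incidence beta*S*I/N decreases for all t >= 0."""
    if beta <= delta or (beta - delta) * S0 <= beta * I0:
        return None  # equivalent to R0 <= 1 or R0* <= R0**2 / (2*R0 - 1)
    return math.log((beta - delta) * S0 / (beta * I0)) / (beta - delta)

def incidence(t, beta, delta, S0, I0):
    """Newly infected cases per unit time, beta*S*I/N, from the closed form."""
    N0 = S0 + I0
    K = N0 * (N0 / S0) ** (delta / (beta - delta))
    x = (I0 / S0) * math.exp((beta - delta) * t)
    return beta * K * x * (1.0 + x) ** (-(2.0 * beta - delta) / (beta - delta))

# Hypothetical parameters: R0 = 2.5, so the threshold R0**2/(2*R0 - 1) is 1.5625.
beta, delta = 0.5, 0.2
t_T = turning_point(beta, delta, 800.0, 200.0)      # R0* = 2.0  > 1.5625
no_turn = turning_point(beta, delta, 600.0, 400.0)  # R0* = 1.5 <= 1.5625

print("S0=800, I0=200: t_T =", t_T)
print("S0=600, I0=400: t_T =", no_turn)
```

In the second run a peak of $I(t)$ still exists because $R_0^* = 1.5 > 1$, but the incidence is already past its maximum at $t = 0$, so no turning point is observed.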

Conclusion and discussion
In the study of epidemic outbreaks by means of mathematical models, most previous work has implicitly assumed that the disease is initialized by an infinitesimally small proportion of the population. In the current paper, we modify this assumption in order to account for an arbitrarily large initial proportion of infected individuals. Assuming a non-negligible amount of infection in the population fed into the deterministic model, we revisit the model system proposed in [17] (see also [12]). The model admits a closed form of solutions with explicit expressions, based on which we obtain the whole picture of the epidemic characteristics with respect to the model parameters and initial values of the system. In particular, we investigate the final and outbreak sizes of the epidemic, the peak time and the turning point of an epidemic outbreak.
Many results of the current paper are expressed in terms of two indices: the basic reproduction number $R_0 = \beta/\delta$ and the net reproduction number $R_0^* = (\beta/\delta)\,S_0/(S_0 + I_0)$ (the running reproduction number $R_t^*$ at the initial time $t = 0$). The net reproduction number $R_0^*$ is smaller than the basic reproduction number $R_0$, and they are almost equal when the initial infection size $I_0$ is very small compared to $S_0$. The $R_0$-$R_0^*$ coordinate plane can be divided into five different regions (see Figure 1), with one region ($R_0 \leq R_0^*$) being not biologically feasible. The whole picture of the epidemic characteristics can then be drawn for each region, as summarized in Table 1, which shows that the existence of the turning point implies the existence of the peak time, but the converse does not hold. Moreover, the usual condition that the basic reproduction number $R_0 > 1$ guarantees neither the existence of the peak time nor that of the turning point when the initial infection size $I_0$ is not negligible.
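The partition of the $R_0$-$R_0^*$ plane described above can be written as a small decision function. This is a minimal sketch based on the existence conditions stated in Propositions 2.2 and 2.3; the function name and the returned dictionary keys are hypothetical.

```python
def classify_outbreak(R0, R0_star):
    """Report whether a peak time and a turning point exist for given R0 and R0*."""
    if not 0.0 < R0_star < R0:
        # R0* = R0 * S0 / (S0 + I0) is strictly below R0 whenever I0 > 0.
        raise ValueError("not biologically feasible: need 0 < R0* < R0")
    has_peak = R0 > 1.0 and R0_star > 1.0
    has_turning_point = R0 > 1.0 and R0_star > R0 ** 2 / (2.0 * R0 - 1.0)
    return {"peak": has_peak, "turning_point": has_turning_point}

# Hypothetical points, one in each feasible region of the plane.
examples = [(0.8, 0.5), (2.5, 0.75), (2.5, 1.5), (2.5, 2.0)]
results = {point: classify_outbreak(*point) for point in examples}
for point, flags in results.items():
    print(point, flags)
```

Since $R_0^2/(2R_0 - 1) > 1$ for every $R_0 > 1$, the function can never report a turning point without a peak, which mirrors the implication noted in the text.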
The peak time is the moment when the number of infected individuals reaches its maximal value, while the turning point is the moment when the growth rate of cumulative cases attains its maximum. Therefore, it is reasonable to expect a time lag from the turning point to the peak time. Indeed, when $R_0^* > 1$, from the expressions of $t_P$ and $t_T$ it follows that

\[ t_P - t_T = \frac{\ln R_0}{\delta(R_0 - 1)} = \frac{\ln(\beta/\delta)}{\beta - \delta} > 0. \]

Hence, the peak time always occurs later than the turning point, and the time interval between the turning point and the peak time is independent of the initial susceptible and infected population sizes $S_0$ and $I_0$ (as well as the net reproduction number $R_0^*$). Due to the time lag between the turning point and the peak time, in some surveillance programmes for epidemic outbreaks, one cannot observe the turning point, or both the peak time and the turning point.
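The independence of the time lag from the initial values can be seen with a few numbers. The sketch below assumes the expressions $t_P = \ln[(\beta-\delta)S_0/(\delta I_0)]/(\beta-\delta)$ and $t_T = \ln[(\beta-\delta)S_0/(\beta I_0)]/(\beta-\delta)$, derived from the explicit solution; the function name and the chosen initial values are hypothetical.

```python
import math

def peak_and_turning_times(beta, delta, S0, I0):
    """Return (t_P, t_T) for beta > delta, using the closed-form expressions."""
    t_P = math.log((beta - delta) * S0 / (delta * I0)) / (beta - delta)
    t_T = math.log((beta - delta) * S0 / (beta * I0)) / (beta - delta)
    return t_P, t_T

beta, delta = 0.5, 0.2
R0 = beta / delta
expected_lag = math.log(R0) / (delta * (R0 - 1.0))

# Hypothetical initial conditions, all with a positive turning point.
lags = []
for S0, I0 in [(900.0, 100.0), (800.0, 200.0), (700.0, 300.0)]:
    t_P, t_T = peak_and_turning_times(beta, delta, S0, I0)
    lags.append(t_P - t_T)
    print(f"S0={S0:.0f}, I0={I0:.0f}: t_T={t_T:.3f}, t_P={t_P:.3f}, lag={t_P - t_T:.3f}")

print("ln(R0) / (delta * (R0 - 1)) =", expected_lag)
```

Both times shift earlier as $I_0$ grows, but by the same amount, so their difference stays fixed.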
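On the other hand, direct calculation yields the ratio of the infected population sizes at these special moments,

\[ \frac{I_P}{I_T} = \frac{2R_0 - 1}{R_0}\left(\frac{2R_0 - 1}{R_0^2}\right)^{1/(R_0 - 1)}, \]

implying that the ratio of the peak value of infected cases and the size at the turning point is independent of the net reproduction number (as well as the initial condition).

Table 1. Epidemic characteristics of model (2) with respect to the reproduction numbers $R_0$ and $R_0^*$. Note that $R_0^*$ is always less than $R_0$ based on (12), also shown in Figure 1.

Basic reproduction number | Net reproduction number | Outbreak size | Epidemic peak | Turning point
$R_0 < 1$ | $R_0^* < R_0$ | $N_0 - K$ | Not exist | Not exist
$R_0 = 1$ | $R_0^* < 1$ | $N_0$ | Not exist | Not exist
$R_0 > 1$ | $R_0^* \leq 1$ | $N_0$ | Not exist | Not exist
$R_0 > 1$ | $1 < R_0^* \leq R_0^2/(2R_0 - 1)$ | $N_0$ | Exists | Not exist
$R_0 > 1$ | $R_0^2/(2R_0 - 1) < R_0^* < R_0$ | $N_0$ | Exists | Exists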
The following lemma shows that $I_P/I_T > 1$ for $R_0 > 1$.
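The inequality can be spot-checked numerically, which is an illustration and not a proof. The sketch below assumes the ratio $I_P/I_T = \frac{2R_0-1}{R_0}\left(\frac{2R_0-1}{R_0^2}\right)^{1/(R_0-1)}$, obtained by dividing the peak size by the infected size at the turning point in the explicit solution; the function name and the sample values of $R_0$ are hypothetical.

```python
def peak_to_turning_ratio(R0):
    """Ratio I_P / I_T as a function of R0 alone (requires R0 > 1)."""
    if R0 <= 1.0:
        raise ValueError("the ratio is only defined for R0 > 1")
    return (2.0 * R0 - 1.0) / R0 * ((2.0 * R0 - 1.0) / R0 ** 2) ** (1.0 / (R0 - 1.0))

# Hypothetical sample of basic reproduction numbers.
ratios = {R0: peak_to_turning_ratio(R0) for R0 in (1.1, 1.5, 2.0, 5.0, 10.0)}
for R0, ratio in ratios.items():
    print(f"R0 = {R0}: I_P / I_T = {ratio:.4f}")
```

For $R_0 = 2$ the ratio is $1.5 \times 0.75 = 1.125$, so the number of infected individuals at the peak is 12.5% higher than at the turning point, whatever the initial infection size.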
Finally, we highlight the novel conditions for the existence of the peak time and the turning point obtained in the current paper. If the initial infection size is infinitesimally small relative to the number of susceptibles, the existence of the peak time implies the existence of the turning point for the number of cumulative cases, and vice versa. Moreover, they exist if and only if the basic reproduction number $R_0$ is greater than unity [12,17]. However, when the initial infection size is not negligible, as assumed in the current paper, the existence of the peak time (as well as the turning point) depends not only on the basic reproduction number $R_0$, but also on the net reproduction number $R_0^*$. Furthermore, the existence of the turning point implies that of the peak time, but not vice versa.