Inference of Long-Term Screening Outcomes for Individuals with Screening Histories

ABSTRACT We develop a probability model for evaluating long-term outcomes due to regular screening that incorporates the effects of prior screening examinations. Previous models assume that individuals have no prior screening examinations at their current ages. Due to current widespread medical emphasis on screening, the consideration of screening histories is essential, particularly in assessing the benefit of future screening examinations given a certain number of previous negative screens. Screening participants are categorized into four mutually exclusive groups: symptom-free-life, no-early-detection, true-early-detection, and overdiagnosis. For each case, we develop models that incorporate a person’s current age, screening history, expected future screening frequency, screening test sensitivity, and other factors, and derive the probabilities of occurrence for the four groups. The probability of overdiagnosis among screen-detected cases is derived and estimated. The model applies to screening for any disease or condition; for concreteness, we focus on female breast cancer and use data from the study conducted by the Health Insurance Plan of Greater New York (HIP) to estimate these probabilities and corresponding credible intervals. The model can provide policy makers with important information regarding ranges of expected lives saved and percentages of true-early-detection and overdiagnosis among the screen-detected cases.


Introduction
Breast cancer screening for women has been recommended since the 1960s. The U.S. Preventive Services Task Force (USPSTF) provides recommendations regarding mammogram screening; current guidelines state that women between the ages of 50 and 74 should undergo biennial screenings, and women under 50 should consult their physicians. However, no guidelines have been offered for women over 75. Too few screenings may lead to undetected breast cancer, while frequent screenings may lead to a higher probability of overdiagnosis, that is, cases that would never have caused clinical symptoms during a person's lifetime but nonetheless were treated, with the individual dying of other causes (see Welch et al. 2012). The estimated percentage of overdiagnosis from observational studies of breast cancer ranges from 7% to 52% (Welch et al. 2012).
The problem of overdiagnosis can be severe for any type of cancer, especially for prostate cancer: estimates of prevalence in autopsy studies, while dependent on age at death, ethnicity, and other factors, nonetheless can range from 25% to 100% (Haas et al. 2008; Zlotta et al. 2013). Moreover, the problem of overdiagnosis can be exacerbated by multiple screening examinations, each of which could lead to unnecessary treatment or a false positive. (A false positive is not counted as an "overdiagnosed case," as presumably a subsequent biopsy or test confirms negative status; hence the false positive rate can be estimated directly.) Previous models considered only cases when asymptomatic individuals had no screening history. However, more realistically, persons aged 60 or over may already have had at least one prior screening examination. Given, say, $K_1$ previous negative screening exam results, how much benefit is derived by undergoing additional screens? If there is a benefit, how many additional screens should be taken, and how often? Previous models do not address this issue, which has become critical for both public health and policy decision makers (Badgwell et al. 2008). In this article, we develop a probability model that incorporates various characteristics of a potential screening participant, including age, previous screening history (number of and ages at previous negative screens), test sensitivity, and distributions of durations of disease-free status and preclinical disease.
Screening examinations are used for many conditions, most notably breast, prostate, lung, colorectal, and ovarian cancers (via mammography, prostate-specific antigen, computed tomography, sigmoidoscopy, and CA-125, respectively), heart disease (cholesterol, blood pressure), and osteoporosis (bone density scan). For concreteness, we will focus on women's breast cancer, due to the availability of data from the historical HIP study (Health Insurance Plan of Greater New York; Shapiro et al. 1988). (Other randomized screening trials have been conducted, but the data from them are not as publicly available as the HIP data are.) We assume that a woman has had previous breast cancer screening examinations and all were negative, so she currently appears healthy. If she continues to undergo screening exams in the future, then eventually she will fall into one of four groups: symptom-free-life (Group 1), no-early-detection (Group 2), true-early-detection (Group 3), or overdiagnosis (Group 4) (Wu, Kafadar, and Rosner 2014; Wu and Perez 2011).
Many individuals (especially those in older age groups) have had previous screening examinations, so we develop here a probability model for the case where an individual may have had an arbitrary number ($K_1$) of previous screens and expects further screening examinations. In doing so, we can estimate the probabilities of the four groups defined above. Individuals in Group 3 derive the greatest benefit from screening, so it is important to have estimates of the proportion of screened individuals who are likely to benefit from screening (relative to the proportion of screened individuals in Groups 1, 2, and 4, who would not benefit from screening).
In Section 2, we propose a probability model and derive the probabilities for each of the four groups, treating the duration of human lifetime as a random variable, with the cause of death subject to other competing risks. Using data from the breast cancer screening study conducted by the Health Insurance Plan of Greater New York (HIP), we develop simulations to estimate these probabilities for different screening schedules in Section 3. More extensive simulations under different assumptions are given in Section 4; the simulation results can be used to guide decisions regarding the expected benefits when using different screening frequencies, different current ages, etc. We conclude with a discussion in Section 5.

Probability Formulation
Consider a currently asymptomatic individual with a screening history who has had no cancer diagnosis previously. We assume the common disease progression model in which disease develops through three states, $S_0 \to S_p \to S_c$ (Zelen and Feinleib 1969), where $S_0$ refers to the disease-free state or the state in which the disease cannot be detected; $S_p$ refers to the preclinical disease state, in which an asymptomatic individual unknowingly has the disease that a screening exam can detect; and $S_c$ refers to the clinical state at which the disease manifests itself in clinical symptoms. Let $\beta(t)$ be the sensitivity of the screening test at age $t$, that is, the probability that the screening exam is positive given that the individual is in the preclinical state $S_p$ at age $t$, and let $\beta_i = \beta(t_i)$. Finally, let $X$ and $Y$ be random variables that denote the duration in $S_0$ (the disease-free state) and $S_p$ (the preclinical state), respectively; let $w(x)$ and $q(y)$ be their respective probability density functions (pdf), and let $Q(y) = \int_y^{\infty} q(t)\,dt$ be the survivor function of the sojourn time $Y$ (duration of the preclinical phase). We assume that the sojourn time distribution does not depend on the age of entry into $S_p$, that is, the sojourn time $Y$ and the duration of the disease-free state $X$ are independent. Throughout, the time variable $t$ represents an individual's age at time of screening, and $T$ represents a person's lifetime, a continuous random variable with probability density function $f_T(t)$.
Assume that an individual currently aged $t_{K_1}$ has had screening examinations at ages $t_0 < t_1 < \cdots < t_{K_1-1}$, all thus far negative, with plans for $K$ more screens at ages $t_{K_1} < t_{K_1+1} < \cdots < t_{K_1+K-1}$ before the end of his/her life at age $T > t_{K_1+K-1}$ (see Figure 1). To derive the probability of each of the four cases, we proceed as follows:
1. First, we derive the probability for the simplest case when $K_1 = K = 1$ (single previous screen and single future screen) with lifetime $T$ fixed. Then we allow the lifetime $T$ to be a random variable.
2. Second, we generalize the results to any fixed positive integers $K_1$ and $K$, when the lifetime $T$ is fixed. Finally, we allow the lifetime $T$ (and hence the number of future screening exams $K$ as well) to be a random variable.

Probability of Each Case When $K_1 = K = 1$
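Suppose an asymptomatic woman, at current age $t_1$, previously had only one screening exam at age $t_0$ ($< t_1$), and it was negative. We define this event as

$$H_1 = \{\text{a woman had a screening exam at age } t_0, \text{ no breast cancer was found, and she is asymptomatic at current age } t_1\}.$$

Because $P(\text{Case } i \mid H_1, T \ge t_1) = P(\text{Case } i, H_1 \mid T \ge t_1)/P(H_1 \mid T \ge t_1)$, we need to derive the probability $P(H_1 \mid T \ge t_1)$ and the probability of each of the four outcomes $P(\text{Case } i, H_1 \mid T \ge t_1)$; see Figure 2.

To calculate the probability of $H_1$, that is, the conditional probability that no breast cancer was found before or at age $t_1$ when one's lifetime $T$ exceeds $t_1$, we note that this outcome can arise as one of three mutually exclusive events: (i) she remains in the disease-free state through age $t_1$, the probability of which is $1 - \int_0^{t_1} w(x)\,dx$; or (ii) she enters state $S_p$ before $t_0$ and her cancer is missed at the exam at $t_0$, but she remains in $S_p$ with no symptoms before $t_1$, with probability $(1-\beta_0)\int_0^{t_0} w(x)Q(t_1-x)\,dx$; or (iii) she enters $S_p$ in $(t_0, t_1)$ with no symptoms before $t_1$, with probability $\int_{t_0}^{t_1} w(x)Q(t_1-x)\,dx$. Thus, the probability of $H_1$ is the sum of the three probabilities (i)-(iii):

$$P(H_1 \mid T \ge t_1) = 1 - \int_0^{t_1} w(x)\,dx + (1-\beta_0)\int_0^{t_0} w(x)Q(t_1-x)\,dx + \int_{t_0}^{t_1} w(x)Q(t_1-x)\,dx. \tag{1}$$

For the four possible outcomes, we first derive each probability given that the lifetime $T = t$ ($> t_1$) is a fixed value. A woman in Group 1 who never has detectable breast cancer during her lifetime $(0, t)$ can follow one of four mutually exclusive trajectories: (i) she remained in the disease-free state $S_0$ throughout her lifetime $(0, t)$; (ii) she entered the preclinical state $S_p$ before $t_0$, her cancer was not detected at $t_0$ nor at $t_1$, and her sojourn time was so long that no clinical symptoms appeared before her death; (iii) she entered the preclinical state $S_p$ in $(t_0, t_1)$, her cancer was not detected at $t_1$ (false negative, with probability $1 - \beta_1$), and her sojourn time was so long that no clinical symptoms appeared before her death; (iv) she entered the preclinical state $S_p$ in $(t_1, t)$ (after her screening exam at $t_1$), and had a long sojourn time, so no clinical symptoms appeared before her death. Then the conditional probability of a Group 1 case with $T = t$ is

$$P(\text{Case 1}, H_1 \mid T = t) = 1 - \int_0^{t} w(x)\,dx + (1-\beta_0)(1-\beta_1)\int_0^{t_0} w(x)Q(t-x)\,dx + (1-\beta_1)\int_{t_0}^{t_1} w(x)Q(t-x)\,dx + \int_{t_1}^{t} w(x)Q(t-x)\,dx. \tag{2}$$

For a woman in Group 2 whose cancer became symptomatic and was found in $(t_1, T)$, one of three outcomes is possible: (i) she entered $S_p$ before $t_0$ and was not detected by either exam (at $t_0$ or $t_1$); or (ii) she entered the preclinical state in $(t_0, t_1)$ and was missed by the screening exam at $t_1$; or (iii) she entered the preclinical state after $t_1$. In all situations, her sojourn time was less than $(t - x)$, where $x$ is her age upon entering $S_p$; in (i) and (ii) it also exceeded $(t_1 - x)$, since she was asymptomatic at $t_1$. Here the conditional probability is

$$P(\text{Case 2}, H_1 \mid T = t) = (1-\beta_0)(1-\beta_1)\int_0^{t_0} w(x)\,[Q(t_1-x) - Q(t-x)]\,dx + (1-\beta_1)\int_{t_0}^{t_1} w(x)\,[Q(t_1-x) - Q(t-x)]\,dx + \int_{t_1}^{t} w(x)\,[1 - Q(t-x)]\,dx. \tag{3}$$

For a Group 3 case, a woman is truly detected early at $t_1$ by taking the scheduled exam, and her symptoms would have appeared before death. That is, she must have entered $S_p$ at some age $x$ before $t_1$, that is, $x \in (0, t_0)$ or $x \in (t_0, t_1)$, and in either case, her sojourn time lies between $(t_1 - x)$ and $(t - x)$. Then:

$$P(\text{Case 3}, H_1 \mid T = t) = \beta_1\left\{(1-\beta_0)\int_0^{t_0} w(x)\,[Q(t_1-x) - Q(t-x)]\,dx + \int_{t_0}^{t_1} w(x)\,[Q(t_1-x) - Q(t-x)]\,dx\right\}. \tag{4}$$

For a Group 4 case (overdiagnosis), she is diagnosed at the exam at $t_1$, but her symptoms would not have appeared before death. That is, she must have entered $S_p$ at some age $x$ ($x < t_1 < T$), and her sojourn time would have extended beyond $(T - x)$. For this overdiagnosed case,

$$P(\text{Case 4}, H_1 \mid T = t) = \beta_1\left\{(1-\beta_0)\int_0^{t_0} w(x)Q(t-x)\,dx + \int_{t_0}^{t_1} w(x)Q(t-x)\,dx\right\}. \tag{5}$$

The probability of each case when the lifetime $T$ is a random variable and $T \ge t_1$ can be obtained by

$$P(\text{Case } i, H_1 \mid T \ge t_1) = \int_{t_1}^{\infty} P(\text{Case } i, H_1 \mid T = t)\, f_T(t \mid T \ge t_1)\,dt, \quad i = 1, 2, 3, 4, \tag{6}$$

where

$$f_T(t \mid T \ge t_1) = \frac{f_T(t)}{P(T \ge t_1)}, \quad t \ge t_1. \tag{7}$$

By summing Equations (2) through (5), one can verify that, for any $t > t_1$,

$$\sum_{i=1}^{4} P(\text{Case } i, H_1 \mid T = t) = P(H_1 \mid T \ge t_1). \tag{8}$$

The right-hand side of (8) does not depend on $t$, so

$$\sum_{i=1}^{4} P(\text{Case } i, H_1 \mid T \ge t_1) = P(H_1 \mid T \ge t_1), \tag{9}$$

which implies

$$\sum_{i=1}^{4} P(\text{Case } i \mid H_1, T \ge t_1) = 1. \tag{10}$$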
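The trajectories listed above for $K_1 = K = 1$ can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes an illustrative transition density (0.2 times a lognormal pdf), an illustrative loglogistic survivor function with hypothetical parameters `kappa` and `rho`, constant sensitivities `b0` and `b1`, and a simple midpoint rule. The function name `single_history` and all numeric values are assumptions made for illustration.

```python
import math

def w(x, mu=4.2, sigma2=0.1):
    """Illustrative transition density S0 -> Sp: 0.2 times a lognormal pdf."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) ** 2 / (2 * sigma2)
    return 0.2 * math.exp(-z) / (x * math.sqrt(2 * math.pi * sigma2))

def Q(y, kappa=2.5, rho=0.4):
    """Illustrative loglogistic survivor function (hypothetical kappa, rho)."""
    return 1.0 / (1.0 + (y * rho) ** kappa) if y > 0 else 1.0

def integrate(g, a, b, n=2000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def single_history(t0, t1, t, b0, b1):
    """Return P(H1) and [P(Case i, H1 | T = t), i = 1..4] for K1 = K = 1."""
    wq = lambda lo, hi, end: integrate(lambda x: w(x) * Q(end - x), lo, hi)
    a1, at = wq(0, t0, t1), wq(0, t0, t)      # onset before t0
    c1, ct = wq(t0, t1, t1), wq(t0, t1, t)    # onset in (t0, t1)
    dt = wq(t1, t, t)                         # onset in (t1, t)
    W1 = integrate(w, 0, t1)                  # P(onset before t1)
    W2 = integrate(w, t1, t)                  # P(onset in (t1, t))
    p_h1 = 1 - W1 + (1 - b0) * a1 + c1
    case1 = 1 - W1 - W2 + (1 - b0) * (1 - b1) * at + (1 - b1) * ct + dt
    case2 = (1 - b0) * (1 - b1) * (a1 - at) + (1 - b1) * (c1 - ct) + (W2 - dt)
    case3 = b1 * ((1 - b0) * (a1 - at) + (c1 - ct))
    case4 = b1 * ((1 - b0) * at + ct)
    return p_h1, [case1, case2, case3, case4]

# Hypothetical example: one negative exam at 50, current age 60, lifetime 85.
p_h1, cases = single_history(t0=50, t1=60, t=85, b0=0.8, b1=0.8)
print(p_h1, cases, sum(cases))
```

Under these assumptions the four joint probabilities add up to `p_h1` for any fixed lifetime, which is the identity the text states for the sum of the four cases.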

Probability of Outcomes for any $K_1$ and $K$: Multiple Exams with Histories
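We generalize this approach to an individual with a history of any number of screenings, and we focus on modeling the impact of future screenings based on the four possible outcomes: symptom-free-life, no-early-detection, true-early-detection, and overdiagnosis. We assume that an initially asymptomatic individual has gone through $K_1$ screening exams so far, which occurred at ages $t_0 < t_1 < \cdots < t_{K_1-1}$; see Figure 1. We let $t_{-1} = 0$, and her current age is $t_{K_1}$ ($> t_{K_1-1}$). Define the event

$$H_{K_1} = \{\text{a woman had screening exams at ages } t_0 < t_1 < \cdots < t_{K_1-1}, \text{ no breast cancer was found, and she is asymptomatic at her current age } t_{K_1}\}.$$

We first calculate $P(H_{K_1} \mid T \ge t_{K_1})$, the conditional probability that no breast cancer was found before or at age $t_{K_1}$ given that her lifetime $T$ exceeds $t_{K_1}$. $H_{K_1}$ can arise if one of $(K_1 + 2)$ mutually exclusive events occurs: (i) she never progressed out of the disease-free state $S_0$ through her current age, the probability of which is $1 - \int_0^{t_{K_1}} w(x)\,dx$; or (ii) she entered state $S_p$ in the age interval $(t_{j-1}, t_j)$, $j = 0, \ldots, K_1 - 1$, remained in $S_p$ long enough that no symptoms presented before $t_{K_1}$, and was missed by the following $(K_1 - j)$ exams ($K_1$ disjoint events); or (iii) she entered $S_p$ after $t_{K_1-1}$ and had no symptoms before $t_{K_1}$. Thus, the probability of $H_{K_1}$ is the sum of these probabilities:

$$P(H_{K_1} \mid T \ge t_{K_1}) = 1 - \int_0^{t_{K_1}} w(x)\,dx + \sum_{j=0}^{K_1-1} (1-\beta_j)\cdots(1-\beta_{K_1-1}) \int_{t_{j-1}}^{t_j} w(x)Q(t_{K_1}-x)\,dx + \int_{t_{K_1-1}}^{t_{K_1}} w(x)Q(t_{K_1}-x)\,dx. \tag{11}$$

Now if she plans to undergo $K$ screening exams in the future, occurring at ages $t_{K_1} < t_{K_1+1} < \cdots < t_{K_1+K-1}$, we first derive the conditional probability for each outcome when her lifetime $T$ is fixed, and then allow $T$ to be a random variable.

Given that her lifetime $T = t_{K_1+K}$ ($> t_{K_1+K-1}$), a Group 1 case (clinical breast cancer never occurs in her lifetime) can arise as any one of $(K_1 + K + 2)$ mutually exclusive events: (i) she remained in the disease-free state $S_0$ throughout her lifetime, the probability of which is $1 - \int_0^{t_{K_1+K}} w(x)\,dx$; or (ii) she entered the preclinical state $S_p$ when she was between ages $t_{j-1}$ and $t_j$, $j = 0, \ldots, K_1 + K - 1$, she had a long sojourn time, and she was not detected by the following $(K_1 + K - j)$ exams, so no clinical symptoms appeared before her death ($K_1 + K$ disjoint events); or (iii) she entered $S_p$ after $t_{K_1+K-1}$ and had no clinical symptoms before her death. Summing these probabilities:

$$P(\text{Case 1}, H_{K_1} \mid K, T = t_{K_1+K}) = 1 - \int_0^{t_{K_1+K}} w(x)\,dx + \sum_{j=0}^{K_1+K-1} (1-\beta_j)\cdots(1-\beta_{K_1+K-1}) \int_{t_{j-1}}^{t_j} w(x)Q(t_{K_1+K}-x)\,dx + \int_{t_{K_1+K-1}}^{t_{K_1+K}} w(x)Q(t_{K_1+K}-x)\,dx. \tag{12}$$

For a Group 2 case, we calculate the probability of no early detection by defining $I_j$ as the probability of being an interval case (Wu, Rosner, and Broemeling 2007), that is, clinically diagnosed between screens (i.e., not screen-detected) in the interval $(t_{j-1}, t_j)$, $j = K_1 + 1, \ldots, K_1 + K$:

$$I_j = \sum_{i=0}^{j-1} (1-\beta_i)\cdots(1-\beta_{j-1}) \int_{t_{i-1}}^{t_i} w(x)\,[Q(t_{j-1}-x) - Q(t_j-x)]\,dx + \int_{t_{j-1}}^{t_j} w(x)\,[1 - Q(t_j-x)]\,dx,$$

$$P(\text{Case 2}, H_{K_1} \mid K, T = t_{K_1+K}) = \sum_{j=K_1+1}^{K_1+K} I_j. \tag{13}$$

A Group 3 case, true early detection, can arise as one of $K$ disjoint events depending on her age at diagnosis by screening, namely, at $t_j$, $j = K_1, K_1 + 1, \ldots, K_1 + K - 1$. If she is diagnosed at $t_j$, then she must have entered the preclinical state $S_p$ before $t_j$, was not detected by the previous $j$ exams, and her sojourn time must have been in the interval $(t_j - x, t_{K_1+K} - x)$, where $x$ represents the onset time of the preclinical state. Therefore,

$$P(\text{Case 3}, H_{K_1} \mid K, T = t_{K_1+K}) = \sum_{j=K_1}^{K_1+K-1} \beta_j \left\{ \sum_{i=0}^{j-1} (1-\beta_i)\cdots(1-\beta_{j-1}) \int_{t_{i-1}}^{t_i} w(x)\,[Q(t_j-x) - Q(t_{K_1+K}-x)]\,dx + \int_{t_{j-1}}^{t_j} w(x)\,[Q(t_j-x) - Q(t_{K_1+K}-x)]\,dx \right\}. \tag{14}$$

A Group 4 case, overdiagnosis, also can arise as one of $K$ disjoint events. She might have been diagnosed at the exam at $t_j$, but her sojourn time would have exceeded $(t_{K_1+K} - x)$, and thus her symptoms would not have appeared before her death:

$$P(\text{Case 4}, H_{K_1} \mid K, T = t_{K_1+K}) = \sum_{j=K_1}^{K_1+K-1} \beta_j \left\{ \sum_{i=0}^{j-1} (1-\beta_i)\cdots(1-\beta_{j-1}) \int_{t_{i-1}}^{t_i} w(x)Q(t_{K_1+K}-x)\,dx + \int_{t_{j-1}}^{t_j} w(x)Q(t_{K_1+K}-x)\,dx \right\}. \tag{15}$$

We confirm (see supplementary material) that, for any screening numbers $K_1 \ge 1$ and $K \ge 1$,

$$\sum_{i=1}^{4} P(\text{Case } i, H_{K_1} \mid K, T = t_{K_1+K}) = P(H_{K_1} \mid T \ge t_{K_1}). \tag{16}$$

For an individual currently aged $t_{K_1}$ with a random (versus fixed) lifetime $T$, the number of future exams $K(T)$ likewise is random. Thus, if she has $K$ future screens at $t_{K_1} < t_{K_1+1} < \cdots < t_{K_1+K-1}$, then the probability of each case ($i = 1, 2, 3, 4$) when her lifetime $T$ exceeds $t_{K_1}$ is the weighted average:

$$P(\text{Case } i, H_{K_1} \mid T \ge t_{K_1}) = \int_{t_{K_1}}^{\infty} P(\text{Case } i, H_{K_1} \mid K = K(t), T = t)\, f_T(t \mid T \ge t_{K_1})\,dt, \tag{17}$$

where $f_T(t \mid T \ge t_{K_1})$ is the conditional lifetime density and $P(\text{Case } i, H_{K_1} \mid K, T = t)$ is given by (12)-(15). We can verify from (16) that for any future screening schedule when the lifetime $T$ is random,

$$\sum_{i=1}^{4} P(\text{Case } i \mid H_{K_1}, T \ge t_{K_1}) = 1. \tag{18}$$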
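The general case with $K_1$ past and $K$ future exams follows the same bookkeeping: for each onset interval, multiply the chance that every intervening exam misses by the chance that the sojourn time is long enough. The sketch below is an illustration under stated assumptions, not the authors' implementation: the forms and parameters of `w` and `Q`, the constant sensitivity 0.8, the example schedule, and the helper names (`outcome_probs`, `undetected`, `miss`, `wq`) are all hypothetical.

```python
import math

def w(x, mu=4.2, sigma2=0.1):
    """Illustrative transition density: 0.2 times a lognormal pdf."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) ** 2 / (2 * sigma2)
    return 0.2 * math.exp(-z) / (x * math.sqrt(2 * math.pi * sigma2))

def Q(y, kappa=2.5, rho=0.4):
    """Illustrative loglogistic survivor function of the sojourn time."""
    return 1.0 / (1.0 + (y * rho) ** kappa) if y > 0 else 1.0

def integrate(g, a, b, n=200):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def outcome_probs(ages, betas, K1):
    """ages = [t_0, ..., t_{K1+K-1}, T]; betas[j] is the sensitivity at t_j.
    Returns P(H_{K1}) and the four joint probabilities at fixed lifetime T."""
    N = len(ages) - 1                                  # N = K1 + K, ages[N] = T
    left = lambda i: ages[i - 1] if i > 0 else 0.0     # t_{-1} = 0

    def wq(i, end):        # onset in (t_{i-1}, t_i), sojourn time > end - x
        return integrate(lambda x: w(x) * Q(end - x), left(i), ages[i])

    def miss(i, j):        # exams at t_i, ..., t_{j-1} are all negative
        return math.prod(1.0 - betas[l] for l in range(i, j))

    def undetected(j, end):  # onset before t_j, not detected before t_j
        return sum(miss(i, j) * wq(i, end) for i in range(j + 1))

    W = [integrate(w, left(i), ages[i]) for i in range(N + 1)]
    T = ages[N]
    p_h = 1 - sum(W[:K1 + 1]) + undetected(K1, ages[K1])
    case1 = 1 - sum(W) + undetected(N, T)
    case2 = 0.0
    for j in range(K1 + 1, N + 1):                     # interval cases in (t_{j-1}, t_j)
        earlier = sum(miss(i, j) * (wq(i, ages[j - 1]) - wq(i, ages[j]))
                      for i in range(j))
        case2 += earlier + W[j] - wq(j, ages[j])
    case3 = sum(betas[j] * (undetected(j, ages[j]) - undetected(j, T))
                for j in range(K1, N))
    case4 = sum(betas[j] * undetected(j, T) for j in range(K1, N))
    return p_h, [case1, case2, case3, case4]

# Hypothetical schedule: biennial exams at 50..58 (K1 = 5), current age 60,
# biennial future exams at 60..78 (K = 10), fixed lifetime T = 80.
ages = [50, 52, 54, 56, 58] + list(range(60, 80, 2)) + [80]
betas = [0.8] * (len(ages) - 1)
p_h, cases = outcome_probs(ages, betas, K1=5)
print(p_h, cases, sum(cases))
```

The transition integrals are computed piecewise over the screening intervals so that the same quadrature nodes are reused in every term; this keeps the check that the four cases sum to `p_h` accurate to floating-point precision rather than to quadrature error.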

Projection of Long-Term Outcomes using HIP Data
We applied our model to the Health Insurance Plan of Greater New York (HIP) breast cancer screening data (Shapiro et al. 1988), using a lifetime distribution derived from the actuarial life table on the Social Security Administration (SSA) website; see Wu et al. (2012) for a detailed procedure on using this lifetime distribution.

Bayesian Probability Estimates
Using the results in Section 2, the probability of each case is a function of the three key parameters (age-dependent screening sensitivity $\beta(t)$, age-dependent transition density $w(t)$, and survivor function of the sojourn time $Q(x)$), as well as of a person's current age $t_{K_1}$, her screening history, her future screening schedule, and the lifetime distribution. For the three key parameters, Wu, Rosner, and Broemeling (2005) estimated them for the HIP data using parametric models as follows:

$$\beta(t) = \frac{1}{1 + \exp\{-b_0 - b_1 (t - m)\}},$$

$$w(t) = \frac{0.2}{\sqrt{2\pi}\,\sigma t} \exp\left\{-\frac{(\log t - \mu)^2}{2\sigma^2}\right\},$$

$$q(x) = \frac{\kappa x^{\kappa-1} \rho^{\kappa}}{[1 + (x\rho)^{\kappa}]^2}, \qquad Q(x) = \frac{1}{1 + (x\rho)^{\kappa}},$$

where $m$ is the average age of women at the time of their entry into the HIP study. The unknown parameters in this model are $\theta = (b_0, b_1, \mu, \sigma^2, \kappa, \rho)$. The 0.2 in the model for $w(t)$ is the upper limit on the probability of making a transition from the disease-free state to the preclinical state.

Using Markov chain Monte Carlo (MCMC), two parallel chains with dispersed initial values were each run for 30,000 steps, with a burn-in of 10,000 steps, and then sampled every 20 steps (thinning), yielding 2000 Bayesian posterior samples $\theta^*_j$; for details, see Wu, Rosner, and Broemeling (2005). Using the HIP data, the posterior predictive probability of each case can be estimated as

$$P(\text{Case } i \mid H_{K_1}, T \ge t_{K_1}, \text{HIP}) = \int P(\text{Case } i \mid H_{K_1}, T \ge t_{K_1}, \theta)\, f(\theta \mid \text{HIP})\,d\theta \approx \frac{1}{n} \sum_{j=1}^{n} P(\text{Case } i \mid H_{K_1}, T \ge t_{K_1}, \theta^*_j), \tag{19}$$

where $\theta^*_j$ is a random sample drawn from the posterior distribution $f(\theta \mid \text{HIP})$ and $n = 2000$ is the posterior sample size. Based on the 2000 posterior samples, the sensitivity appears to increase with age: from ages 40 to 65 years (the study entry ages in the HIP study), the posterior mean sensitivity increased from 0.603 to 0.875. The posterior mean sojourn time was 1.88 years, and the posterior median of the sojourn time was 1.51 years (Wu, Rosner, and Broemeling 2005).
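The sampling arithmetic and the posterior predictive average can be illustrated with a short sketch. It is not the authors' code: the posterior draws below are simulated placeholders rather than the HIP posterior, the logistic form of the sensitivity and the value `m = 50` are assumptions made for illustration, and the names `sensitivity` and `posterior_predictive` are hypothetical.

```python
import math
import random

# Thinned sample size implied by the MCMC settings described in the text.
chains, steps, burn_in, thin = 2, 30_000, 10_000, 20
n_samples = chains * (steps - burn_in) // thin

def sensitivity(t, b0, b1, m=50.0):
    """Assumed logistic sensitivity in age t; m is a hypothetical centering age."""
    return 1.0 / (1.0 + math.exp(-b0 - b1 * (t - m)))

def posterior_predictive(g, draws):
    """Monte Carlo average of g(theta) over posterior draws."""
    return sum(g(theta) for theta in draws) / len(draws)

# Placeholder draws of (b0, b1); these are NOT the HIP posterior samples.
random.seed(0)
draws = [(random.gauss(1.0, 0.2), random.gauss(0.05, 0.005))
         for _ in range(n_samples)]

mean_beta_40 = posterior_predictive(lambda th: sensitivity(40, *th), draws)
mean_beta_65 = posterior_predictive(lambda th: sensitivity(65, *th), draws)
print(n_samples, mean_beta_40, mean_beta_65)
```

The same `posterior_predictive` helper applies to any function of the parameters, including the case probabilities, since each is evaluated draw by draw and then averaged.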

Results
We applied (19) to the 2000 MCMC posterior samples and conducted Bayesian inference on three hypothetical cohorts of asymptomatic women currently aged 60, 70, and 80, assuming that they started screening at age 50 with annual or biennial screening ($\Delta_1 = 1$ or 2 years), or had no screening until their present age (i.e., 10, 20, or 30 years later). For each group, we assumed that they are asymptomatic at their current age with no cancer previously found, and we considered both annual and biennial screening intervals in the future ($\Delta_2 = 1, 2$ years). The number of screens $K = K(T) = \lceil (T - t_{K_1})/\Delta_2 \rceil$ is a function of the (random) lifetime $T$, and hence also is a random variable. In the simulation, for each posterior sample $\theta^*_j$, we need to calculate $P(\text{Case } i \mid T \ge t_{K_1}, H_{K_1}, \theta^*_j)$ using Equations (11) and (17), since

$$P(\text{Case } i \mid T \ge t_{K_1}, H_{K_1}, \theta^*_j) = \frac{P(\text{Case } i, H_{K_1} \mid T \ge t_{K_1}, \theta^*_j)}{P(H_{K_1} \mid T \ge t_{K_1}, \theta^*_j)}. \tag{20}$$

The calculation of the denominator is done by simply inserting the values of $\theta^*_j$ into Equation (11). The numerator is the integral on $(t_{K_1}, \infty)$ in Equation (17). Based on SSA's actuarial life table, we use an upper age bound of 120 years rather than $\infty$, that is, $f_T(t \mid T \ge t_{K_1}) = 0$ for $t > 120$. Since we can estimate $f_T(t \mid T \ge t_{K_1})$ numerically at any value of $t$, we used numerical integration (trapezoidal rule with very narrow intervals) to calculate the integral in Equation (17), by letting age $t$ vary evenly from $t_{K_1}$ to 120, with interval lengths equal to 0.125, 0.15625, and 0.1875 for $t_{K_1} = 80, 70, 60$, respectively (320 subintervals in each case). The conditional probability of overdiagnosis among the screen-detected is

$$P(\text{Case 4} \mid \text{Case 3 or Case 4}, H_{K_1}, T \ge t_{K_1}) = \frac{P(\text{Case 4} \mid H_{K_1}, T \ge t_{K_1})}{P(\text{Case 3} \mid H_{K_1}, T \ge t_{K_1}) + P(\text{Case 4} \mid H_{K_1}, T \ge t_{K_1})}. \tag{21}$$

For the lifetime distribution, we used the conditional lifetime density for females currently aged 60, 70, and 80, using the actuarial life table from the Social Security Administration (SSA) website (see the beginning of this section). The derivation of the conditional probability density function (pdf) is given in Wu et al. (2012). The marginal probabilities of each of the four cases $P(\text{Case } i \mid H_{K_1}, T \ge t_{K_1}, \text{HIP})$ with standard errors are reported in Table 1.
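The numerical integration over the lifetime can be sketched as follows. This is an illustration only: a hypothetical Gompertz density stands in for the SSA life table, and `toy3` and `toy4` are placeholder fixed-lifetime probabilities that do not come from the model or the HIP data. The grid uses 320 equal subintervals, which reproduces the interval lengths quoted in the text.

```python
import math

def lifetime_grid(current_age, upper=120.0, n=320):
    """Evenly spaced ages from current_age to upper; returns (step, ages)."""
    h = (upper - current_age) / n
    return h, [current_age + k * h for k in range(n + 1)]

def trapezoid(values, h):
    """Composite trapezoidal rule for equally spaced values."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def gompertz_density(t, a=2e-5, b=0.1):
    """Hypothetical lifetime density, standing in for the SSA life table."""
    return a * math.exp(b * t) * math.exp(-(a / b) * (math.exp(b * t) - 1.0))

def overdiagnosis_share(current_age, joint3, joint4):
    """joint3(t), joint4(t): joint probabilities of Cases 3 and 4 given T = t."""
    h, ages = lifetime_grid(current_age)
    f = [gompertz_density(t) for t in ages]
    norm = trapezoid(f, h)                    # truncate at 120 and renormalize
    f = [v / norm for v in f]
    num3 = trapezoid([joint3(t) * v for t, v in zip(ages, f)], h)
    num4 = trapezoid([joint4(t) * v for t, v in zip(ages, f)], h)
    # The common denominator P(H) cancels in the ratio of the two cases.
    return num4 / (num3 + num4)

# Placeholder fixed-lifetime probabilities (NOT from the model or HIP data).
toy3 = lambda t: 0.03 * (1.0 - math.exp(-0.2 * (t - 70.0)))
toy4 = lambda t: 0.004 * math.exp(-0.05 * (t - 70.0))

steps = [lifetime_grid(age)[0] for age in (80, 70, 60)]
share = overdiagnosis_share(70, toy3, toy4)
print(steps, share)
```

Here `steps` evaluates to `[0.125, 0.15625, 0.1875]`. In an actual analysis the placeholder functions would be replaced by the fixed-lifetime case probabilities evaluated at each grid age, with the number of future screens recomputed for each age.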
The percentage of "Overdiagnosis" for all age groups is very small, between 0.20% and 0.34% from Table 1, but it does represent about 5%-12% of those cases that are not symptom-free for life (i.e., the last column is 5%-12% of the total of the last three columns combined). This probability decreases when a woman's current age increases, and as the future screening interval $\Delta_2$ increases; and it increases to a lesser extent when the past screening interval $\Delta_1$ increases. The percentage of "True-early-detection" is higher for the annual screening group in the future than for the biennial screening group within each age group. That is, this probability decreases as the future screening time interval increases. The probability is also lower when the current age increases from 60 to 80. Under this model, the percentage of "No-early-detection" is between 0.45% and 2.86%; it increases as the future screening interval length increases and decreases when the current age increases. The probability of "Symptom-free-life" is very high: it increases from about 93% to 98% when the current age increases from 60 to 80, and changes only slightly with the past or future screening intervals. This implies that the majority of women will live a life free of breast cancer.

Table 1. A projection of breast cancer screening outcomes using the HIP data. Initial screen age $t_0 = 50$, current age $t_{K_1}$.
NOTES: a $(\Delta_1, \Delta_2)$ are the screening intervals in years in the history and in the future, respectively; when $\Delta_1$ is "−," it refers to those who never take any screening exam until the current age, that is, those without a screening history. b The probability of each outcome, that is, $P(\text{Case } i \mid H_{K_1}, T \ge t_{K_1}, \text{HIP})$. Entries are in percentages, that is, 100 times the mean probability (with standard error).

Figure 3 shows, via boxplots, how the trends change with screening history for women currently aged $t_{K_1} = 70$. (Other age groups follow the same pattern, so they are not shown.) When other parameters are the same, the probabilities of "True-early-detection," "Overdiagnosis," and "No-early-detection" slightly increase as the historic screening interval increases. However, the probability of "Symptom-free-life" shows a reversed pattern: when the past screening interval increases, this probability decreases. The histograms and density estimates in Figure 4 show that the distribution of the probability of overdiagnosis is skewed to the right, while those of the other three probabilities are more symmetric with a single mode.
The main concern about screening is the probability of overdiagnosis among the screen-detected cases, that is, if the case is screen-detected, what is the probability that it is an overdiagnosed case? Table 2 lists the estimated mean probabilities (and their 95% highest posterior density (HPD) intervals) of "Overdiagnosis" and "True-early-detection" given that it is a screen-detected case. The percentage of "Overdiagnosis" among the screen-detected cases is roughly 7%, 10%, and 15% for those currently aged 60, 70, and 80, respectively; it increases significantly with a person's current age, as well as with the mean sojourn time (the percentages are 20%, 40%, and 55% when the mean sojourn time is 5, 10, and 15 years, respectively, in our simulation). It is much less affected by the lengths of past or future screening intervals ($\Delta_1$ and $\Delta_2$, respectively). The lengths of the 95% HPD intervals for these two probabilities (as percentages) increase with a person's current age, showing large variation at advanced ages.
The top graphs in Figure 5 show boxplots of the two conditional probabilities when the current age is 80. The conditional probabilities of "True-early-detection" and "Overdiagnosis" given that one was screen-detected change little with one's screening history. The histograms and estimated pdfs of these probabilities when $\Delta_1 = \Delta_2 = 1$ in Figure 5 show that the densities for both probabilities are heavily skewed.

Simulations
We conducted extensive simulations using the model developed in Section 2. Since the probability of each case is a function of current age, past screening history, future screening interval, human lifetime T , and three model parameters (sensitivity, sojourn time, transition density from S 0 to S p ), we explore the effects of these factors on the probability of each outcome, and also the changes in the proportions of true-early-detection and overdiagnosis among the screen-detected cases.
We selected the following scenarios for simulation: age at initial screening $t_0 = 50$, current age $t_{K_1} = 70$, past and future screening intervals $(\Delta_1, \Delta_2)$ of (1, 1), (2, 1), (1, 2), and (2, 2) in years, and screening sensitivity $\beta = 0.7, 0.9$. The transition probability density $w(t)$ was chosen to be a lognormal pdf, with a single mode around 60 years old and an upper limit of 20%, which may be relevant for various cancers including breast cancer; hence the parameters for $w(t)$ are chosen as $(\mu, \sigma^2) = (4.2, 0.1)$, for which the mode is $e^{\mu - \sigma^2} = e^{4.1} \approx 60.3$ years. (Wu, Kafadar, and Rosner (2014) showed that changing the parameters of $w(t)$, or even the form of the pdf, will not change these four probabilities by much, as long as the mode of the pdf is approximately 60 years.) The sojourn time distribution was chosen to be a loglogistic pdf, with parameters chosen so that the mean sojourn time $\mu_S$ is 2, 5, 10, or 15 years (corresponding roughly to fast, moderate, slow, and very slow-growing disease). The number of screens $K = K(T) = \lceil (T - t_{K_1})/\Delta_2 \rceil$ is a function of the random lifetime $T$, so it also is a random variable in the simulation. We base our simulation on the female lifetime distribution and the scenario of breast cancer, and report the results in Table 3.
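The parameter choices in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes the loglogistic survivor function $Q(x) = 1/(1 + (x\rho)^{\kappa})$, uses a hypothetical shape value `kappa = 2.5` (the shape used in the simulation is not stated here), and assumes that the number of future screens is rounded up; the function names are illustrative.

```python
import math

def lognormal_mode(mu, sigma2):
    """Mode of a lognormal density with log-scale mean mu and variance sigma2."""
    return math.exp(mu - sigma2)

def rho_for_mean(mean_sojourn, kappa):
    """Scale rho such that Q(x) = 1 / (1 + (x * rho)**kappa) has the given mean.
    Uses the loglogistic mean (pi / kappa) / (rho * sin(pi / kappa)), kappa > 1."""
    return (math.pi / kappa) / (mean_sojourn * math.sin(math.pi / kappa))

def numeric_mean(kappa, rho, upper=2000.0, n=200_000):
    """Midpoint-rule integral of Q on (0, upper); approximates the mean."""
    h = upper / n
    return h * sum(1.0 / (1.0 + ((k + 0.5) * h * rho) ** kappa)
                   for k in range(n))

def n_future_screens(lifetime, current_age, delta2):
    """Exams at current_age, current_age + delta2, ... strictly before death
    (assumes the count is rounded up)."""
    return math.ceil((lifetime - current_age) / delta2)

mode = lognormal_mode(4.2, 0.1)
kappa = 2.5                                   # hypothetical shape parameter
rhos = {m: rho_for_mean(m, kappa) for m in (2, 5, 10, 15)}
check = numeric_mean(kappa, rhos[2])          # should be close to 2 years
k_example = n_future_screens(85, 70, 2)       # exams at 70, 72, ..., 84
print(mode, rhos, check, k_example)
```

The numerical integral of the survivor function serves as an independent check on the closed-form mean; with a heavier tail (smaller `kappa` or larger mean) the truncation point `upper` would need to be increased.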
In Table 3, the first column, $(\Delta_1, \Delta_2)$, gives the historic and future screening intervals in the simulation. The next four columns present the conditional probabilities of each of the four cases, that is, $P(\text{Case } i \mid H_{K_1}, T \ge t_{K_1})$, $i = 1, 2, 3, 4$, corresponding to the probability of symptom-free-life, no-early-detection, true-early-detection, and overdiagnosis. The last two columns are the conditional probabilities of true-early-detection and of overdiagnosis among the screen-diagnosed cases, calculated as $P(\text{Case } i \mid H_{K_1}, T \ge t_{K_1}) / [P(\text{Case 3} \mid H_{K_1}, T \ge t_{K_1}) + P(\text{Case 4} \mid H_{K_1}, T \ge t_{K_1})]$, $i = 3, 4$. The probabilities are reported as percentages in Table 3.

Figure 4. Histogram and estimated density of the percentage for each of the four categories, when the start-screen age is $t_0 = 50$ years, the current age is $t_{K_1} = 70$ years, and the past and future screening intervals are $\Delta_1 = 1$ and $\Delta_2 = 2$ years, respectively.
The results in columns 2 to 5 in Table 3 show that the probability of symptom-free-life is very stable, roughly 95% for all cases. The probability of no-early-detection is influenced more by the future screening interval length $\Delta_2$, the sensitivity $\beta$, and the mean sojourn time $\mu_S$; it is larger when the future screening interval is longer, and smaller when the sensitivity or the mean sojourn time increases. Conversely, the probability of true-early-detection is smaller when the future screening interval is longer.
The probability of overdiagnosis is slightly smaller when the future screening interval is longer. The historic screening interval seems to have little influence on the result: the proportions in the four groups change only slightly with changes in $\Delta_1$.
The sojourn time has the greatest effect on the proportion of overdiagnosis and true-early-detection among those detected by screening, as shown in the last two columns of Table 3. The proportion of overdiagnosis could be as high as 58% among the screen-detected cases if the mean sojourn time is 15 years long; even with shorter mean sojourn times (10, 5, or 2 years), the probability of overdiagnosis remains uncomfortably high (approximately 40%, 21%, and 9%, respectively).
Screening sensitivity slightly affects the ratio of no-early-detection to true-early-detection: when sensitivity is higher, the probability of true-early-detection is higher, and the probability of no-early-detection is lower. Compared with sojourn time, screening sensitivity does not greatly affect the ratio of overdiagnosis to true-early-detection among screen-detected cases. The transition probability density $w(\cdot)$ also is important, but in this simulation, we considered only the situation where the density has a single mode at about age 60; we plan to investigate the effects of different forms for $w(\cdot)$ in future work.

Discussion
This study provides an approach to evaluating the long-term outcomes of screening programs under different historic and future screening schedules for those with a screening history. We separated asymptomatic participants in a screening program into four mutually exclusive groups: symptom-free-life, no-early-detection, true-early-detection, and overdiagnosis. The derived probability of each group shows that the proportion depends on current age, the three key parameters, the past and future screening frequency, and the lifetime distribution. Our simulations considered different scenarios when the mean sojourn time ranges from 2 years to 15 years, and the screening [...] the 30,000 women randomized to the study arm who partook of no screening at all.

More recent trials of other cancers (e.g., prostate and ovarian cancers in the PLCO trial) showed much higher proportions of overdiagnosed cases (e.g., estimates of overdiagnosis ranged approximately from 23% to 42% for prostate cancer in the population studied according to the SEER program; see Sandhu and Andriole 2012). Regardless of the extent of the overdiagnosis, our estimation using the HIP data demonstrates that the probability of overdiagnosis given a positive outcome increases with age (Table 2, Sec. 3.2), which would suggest the wisdom of reducing the number of screening examinations for older age groups. Because frequent mammography can incur high costs and side effects, such as unnecessary radiation and biopsies, our model can help policy makers to evaluate outcomes and to balance risks and costs.

Our model does depend on sufficient data to estimate the three key parameters (screening sensitivity, sojourn time distribution, and transition probability), which are then used to estimate the probabilities of true-early-detection, no-early-detection, overdiagnosis, and symptom-free-life, for different age groups with different screening histories and future screening frequencies.
The sojourn time distribution is assumed to be independent of age in this model, but, realistically, it may depend on age; that is, an older person may be more likely to transition from $S_0$, the disease-free state, to $S_p$, the preclinical state, than a younger person, and the sojourn time in $S_p$ may also be shorter. The SEER data suggest that some cancers may have bimodal sojourn time distributions, that is, a fraction of cancers are very fast growing while others grow much more slowly; and the proportions of these two fractions may well depend on age. An analysis of current incidence data from SEER may shed light on these proportions, but, because sojourn time distributions cannot be observed in practice, only research on animal models would provide quantitative information regarding the dependence of the sojourn time distribution on age. Alternatively, we can investigate the effect of this dependence by simulating different levels of dependence between the mean sojourn time and age (as was done, e.g., in Heltshe, Kafadar, and Prorok 2015; Kafadar, Prorok, and Smith 1998). We leave our investigation of these approaches to future work.