Age variations and population over-coverage: Is low mortality among migrants merely a data artefact?
Matthew Wallace and Ben Wilson

The migrant mortality advantage has been observed extensively, but its authenticity is debated. In particular, concerns persist that the advantage is an artefact of the data, generated by the problems of recording mobility among foreign-born populations. Here, we build on the intersection of two recent developments: the first showing substantial age variation in the advantage—a deep U-shaped advantage at peak migration ages—and the second showing high levels of population over-coverage, the principal source of data artefact, at the same ages. We use event history analysis of Sweden’s population registers (2010–15) to test whether this over-coverage can explain age variation in the migrant mortality advantage. We document its U-shape in Sweden and, crucially, demonstrate that large mortality differentials persist after adjusting for estimated over-coverage. Our findings contribute to ongoing debate by demonstrating that the migrant mortality advantage is real and by ruling out one of its primary mechanisms.


Introduction
Over the past few decades, many studies have documented differential mortality patterns among foreign-born populations in high-income, migrant-receiving countries. Most commonly, but not always, international immigrants experience lower mortality than native-born populations (Aldridge et al. 2018). This 'migrant mortality advantage' has garnered increased attention in recent years due to the growing share, diversification, and ageing of foreign-born populations in rich countries. The health and mortality patterns of immigrants have profound implications for overall population health and present substantial challenges for national healthcare and welfare systems in understanding, managing, and maximizing migrant health (Rechel et al. 2011; Guillot et al. 2018).
The migrant mortality advantage exhibits some common features in the literature. It is often found, paradoxically, in immigrant populations with a lower average socio-economic position than the native-born population in the country that they move to (Deboosere and Gadeyne 2005; Ruiz et al. 2013) and is less common among immigrants who move from neighbouring countries that are similar to the host country (Wallace and Kulu 2014; Juárez et al. 2018). Studies also suggest that the advantage is strongest on arrival and wears off with length of stay (Hammar et al. 2002; Harding 2003, 2004; Hajat et al. 2010; Vandenheede et al. 2015; Syse et al. 2016, 2018), and that it is weaker, or absent, among immigrants arriving as children (Guillot et al. 2018; Juárez et al. 2018; Mehta et al. 2019). These similarities have often been used to try to advance understanding of what generates the advantage, but its causes remain debated (Guillot et al. 2018).
Until recently, little was known about how immigrant mortality patterns varied by age, one of the most fundamental demographic characteristics. This gap has limited our understanding of how and why mortality among immigrants differs from that of the native born. This is because estimates produced net of age, rather than by age, assume that the size of mortality differentials, and the causes working to generate such differentials, remain proportional over the entire age range (Guillot et al. 2018). Further still, this means that policy recommendations regarding immigrant health and mortality are based on age-adjusted mortality estimates that can mask variation in the presence, scale, and even direction of migrant mortality patterns by age.
Having recognized this issue, Guillot et al. (2018) recently developed and tested a theoretical framework that predicted how migrant mortality would vary by age. Systematic age variation in the advantage was uncovered in the United States (US), United Kingdom (UK), and France: specifically, a relative excess mortality among immigrants at child ages and a U-shape of advantage among immigrants at young adult ages, the tail of which then tapered into later-life adult ages (Guillot et al. 2018). The authors surmised that the patterns were most consistent with 'in-selection effects' (the 'healthy migrant effect'): the idea that people who move are not typical members of their birth country populations, but rather that they are selected based on characteristics that result in lower mortality (Riosmena et al. 2013). The patterns were also consistent with the predicted effect of 'cultural factors': some migrants are from countries where normative behaviours are health promoting, affording them protection in host countries where normative behaviours are health eroding (Abraido-Lanza et al. 2016). By contrast, the findings were inconsistent with 'out-selection effects' (or the 'salmon bias effect'), which predict that migrants are more likely to return to their birth country when they are sick, thereby depressing the average mortality level of those migrants who remain (Wallace and Kulu 2018).
One explanation that Guillot et al. (2018) could not dismiss was data artefacts: the idea that mortality differences between the foreign born and native born are generated by issues stemming from the inability to record the greater mobility of immigrant populations in the data (Wallace 2019). Although Guillot et al. (2018) offered a comprehensive overview of all the primary issues associated with this explanation (principally, the under-coverage of deaths and over-coverage of the resident population), they could not theorize with certainty how data artefacts might cause the advantage to vary by age (Guillot et al. 2018). Thus, while the paper from Guillot et al. (2018) was crucial in extending the theoretical framework surrounding the advantage and providing new age-related evidence, it could not dispel concerns around whether the advantage is real or merely a data artefact (Wallace 2019). This is troubling because, if it is an artefact, then most of what we think we know about immigrant mortality is wrong.
These concerns about data artefacts become even more pertinent in light of recent research on the principal source of data artefacts in mortality differentials between foreign- and native-born populations: over-coverage (Monti et al. 2019). On testing three approaches for correcting for this problem in Sweden, Monti et al. (2019) found the largest levels of over-coverage (i.e. the greatest number of individuals considered to have left and no longer be living in Sweden), and therefore the largest bias in core demographic estimates among immigrants, at peak migration ages. These are the same ages at which the migrant mortality advantage is found to be largest (Guillot et al. 2018). This then calls into question how much of this U-shape of advantage is attributable to biases arising from data artefacts, as opposed to substantive causes such as selection and cultural factors.
Here, we address this concern directly, with the aim of understanding whether or not this age variation in the migrant mortality advantage is genuine. In doing so, we effectively isolate and test one of the most prominent explanations for the advantage. To achieve our aim, we pose three research questions. First, we ask whether the same U-shaped pattern of mortality advantage found in the US, UK, and France can be found among the foreign-born population in Sweden. Second, we ask whether adjusting for the principal source of data artefact, over-coverage, can account for this U-shaped mortality advantage. Finally, we ask how the intersection of age variation in the migrant mortality advantage and in over-coverage varies according to country of birth. To answer these questions, we use event history models to estimate mortality differentials for foreign- vs native-born populations by age and sex (unadjusted and adjusted for over-coverage), both for the total foreign-born population and according to specific immigrant countries of birth. We do this for Sweden in the period 2010-15, which allows us to compare our findings with previous research on the three high-income contexts just referenced.
Sweden represents an ideal context in which to conduct this research. Our analysis is based on registers of the entire population (over 9 million people in any of the years that we analyse). This permits a thorough and detailed examination of how migrant mortality varies by age, including by country of birth. Moreover, we are able to build on existing research on over-coverage in Sweden, as several approaches for adjusting for over-coverage are already available (Weitoft et al. 1999; Aradhya et al. 2017; Monti et al. 2019). The value of our work is found in the ability to contribute to ongoing debate on what causes the migrant mortality advantage and whether empirically observed mortality differences are real or merely an artefact.

Background
Over-coverage refers to a situation in which people continue to be recorded as resident in a population, even though they have left the country (Monti et al. 2019). This almost always occurs because we lack any record or proof of their departure in national data sources. If such cases are not identified in studies of mortality, these individuals become 'statistically immortal', as they continue to age in data sources, despite not being at risk of dying in the host country, and their eventual death will be registered elsewhere (Kibele et al. 2008). This generates a downward bias in mortality rates because we overestimate risk time and potentially underestimate deaths. Hypothetically, if the level of over-coverage is large enough, then the migrant mortality advantage could simply be a consequence of the presence of some unregistered emigrants in the analysis. While native-born people are also susceptible to over-coverage, immigrants are disproportionately affected due to their higher mobility, including recent diversification in forms of migration (such as repeat, onwards, and circular) (Aradhya et al. 2017). Here, we provide an overview of previous studies that have explicitly studied over-coverage in immigrant mortality rates, starting with studies on Sweden. We focus on data, methods, and the potential impact of over-coverage.
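The direction of this bias can be illustrated with a small worked example. The sketch below is not drawn from the registers: all figures are hypothetical, and it covers only the inflation of risk time, not any under-registration of deaths.

```python
def crude_rate(deaths: int, person_years: float) -> float:
    """Crude mortality rate: deaths divided by person-years at risk."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return deaths / person_years


# Hypothetical figures, invented purely for illustration.
deaths_registered = 90        # deaths recorded in the host country
resident_years = 100_000.0    # risk time of people who really were resident
phantom_years = 10_000.0      # risk time wrongly credited to unregistered emigrants

# The observed rate uses the inflated denominator; the corrected rate does not.
observed_rate = crude_rate(deaths_registered, resident_years + phantom_years)
corrected_rate = crude_rate(deaths_registered, resident_years)

# Values above 1 mean the observed rate understates mortality.
bias_ratio = corrected_rate / observed_rate

print(f"observed rate:  {observed_rate * 1000:.3f} per 1,000 person-years")
print(f"corrected rate: {corrected_rate * 1000:.3f} per 1,000 person-years")
print(f"corrected / observed: {bias_ratio:.2f}")
```

With the same number of deaths, any phantom risk time in the denominator pulls the observed rate below the corrected one, which is the downward bias described above.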
Recent Swedish research has tested three different approaches that adjust for over-coverage. The first is the zero-income approach, which uses the register relating to economic activity and is based on the logic that those without economic activity in a given year, or years, in a welfare state such as Sweden can be assumed to have left the country (Aradhya et al. 2017). The other two are called the cross-sectional and longitudinal register-trace approaches, respectively. Both assume that individuals have left Sweden if they are not correctly registered and also fail to show any traces of activity (e.g. internal migration or enrolment in education) across several of the registers (rather than just the register relating to economic activity). The former looks for evidence of residence at a single point in time, whereas the latter approach looks for evidence over time. Monti et al. (2019) estimated age-specific mortality rates for the foreign born, adjusted and unadjusted for over-coverage, and produced a ratio between the two. For each of the three approaches, Monti et al. (2019) found the largest impact of over-coverage at young adult ages (15-40). The zero-income approach provided by far the most conservative adjustment, with migrant mortality rates between 1.4 and 2.5 times higher in this age range. At ages 40-75, the three approaches all generated similar ratios. That paper showed the extent to which migrant mortality can be downwardly biased by over-coverage. However, it said little about the extent to which over-coverage can explain differentials between foreign- and native-born populations or the migrant mortality advantage. As detailed later on, we draw on these approaches in our analysis.
To the best of our knowledge, the only study to examine the role of over-coverage in explaining mortality differences between immigrants and the native born in Sweden was conducted around two decades ago (Weitoft et al. 1999). Using an approach based on income and the receipt of social benefits, the authors adjusted mortality rates among immigrants aged 20-64 for the period 1987-94. Initially, a migrant mortality advantage was found among immigrants from Southern Europe, former Yugoslavia and Turkey, Latin America, Africa, and Asia, and a group containing the rest of Europe, Canada, US, and Oceania. After removing people assumed to have emigrated if they did not receive any earnings or social benefits, the advantage decreased but persisted in these immigrant groups. However, after more possible cases of over-coverage were removed through further restrictions based on earnings only, the migrant mortality advantage was lost among most groups. The authors concluded that there was some underestimation of mortality, but its extent was difficult to assess (Weitoft et al. 1999). In a direct response to this study, the authors of a paper from Germany calculated mortality among the foreign-born population aged 15+ and 15-64 in Germany using the German Socio-Economic Panel, based on the logic that such cohort studies are less vulnerable to population over-coverage than the registers. They found immigrant mortality advantages of a similar size to those found in the German register studies (Razum et al. 1998).
In other contexts, one study of England and Wales used life event indicators from civil registers (the birth of children, deaths of respondents, migration, and other life events) and presence at decennial censuses from a linked longitudinal data source to identify probable cases of over-coverage (Wallace and Kulu 2014). Its authors adjusted for over-coverage through several different scenarios, which examined the impact on migrant mortality of these unregistered emigrants exiting two, four, and seven years after their final census. They found over-coverage to explain some, but not all, of the migrant mortality advantage among migrants aged 18+ (Wallace and Kulu 2014). In France, one paper (Khlat and Courbage 1996) used an indirect approach initially developed in an earlier paper (Courbage and Fargues 1979) to show 23 per cent over-coverage of Moroccan men from 1980 to 1990. Nevertheless, adjusted period life tables still gave Moroccan men a 2.4-year mortality advantage compared with the native born. In a study of mortality among migrants aged 25-55 in Belgium, Anson (2004) performed a simple over-coverage check by reducing the risk time for the foreign born by 194 days (the number of days, on average, that it took for those who did not record their emigration to be administratively removed from the risk set). An advantage persisted and the author argued that over-coverage could not explain their results (Anson 2004).
Rather than attempting to adjust for over-coverage explicitly, several other studies, such as Razum et al.'s (2000), have used data sources that they reason are more suited to capturing the immigrations and emigrations of foreign-born individuals. One study used the federal German statutory pension scheme (DRV) to estimate the mortality of pensioners, arguing that such data are more accurate because the survival of pensioners must be tracked carefully to give correct pension payments (Kibele et al. 2008). Pensioners living abroad must provide annual confirmation of being alive to receive a pension, reducing the likelihood that pensioners continue to be included in this database after death. On comparing their findings from the DRV with official population data, Kibele et al. (2008) found a sizeable overestimation of the foreign-born population and underestimation of deaths among male immigrants aged 65+ in the official population data. Additionally, the mortality in this group did not increase exponentially from age 65, as would be expected. Consequently, the substantial life expectancy advantages found in the official population data were not replicated in the DRV data. A subsequent study from Germany also compared official population data with another data source, the Central Register of Foreigners (AZR), which is argued to be more accurate due to its focus on immigrant populations (Kohls 2010). The author found mortality among migrants to be much lower in the official population data, notably at older ages, in comparison with the AZR. However, lower mortality rates compared with the native born were still found in the AZR among some birth country groups, notably Asian and African migrants (Kohls 2010).
Along similar lines, researchers in the US have combined pension data with data from annual questionnaires about beneficiaries living in the US and abroad (Turra and Elo 2008). The authors calculated age-adjusted and age-specific death ratios (65-90+) comparing foreign- and native-born subpopulations, both including and excluding emigration, as recorded by the questionnaires. In contrast with the German studies, however, the authors continued to find large mortality advantages among Hispanic immigrants that could not be explained by over-coverage, even after accounting for negative selection of poor emigrant health (Turra and Elo 2008).
In summary, we can identify two different approaches to addressing over-coverage in migrant mortality patterns. We have reason to be cautious about the findings from both approaches. The first approach explicitly adjusts for over-coverage by using traces of presence in available data to try to identify unregistered emigrants. This may lead to the incorrect inclusion of leavers and exclusion of stayers in risk populations. The second approach uses alternative data sources that are assumed to capture mobile migrant populations better. However, most of these data sources relate to pensionable ages, at which we might no longer even expect to find a migrant mortality advantage. Small residual advantages at older ages are also more easily explained by over-coverage. Ideally, research should focus on examining whether over-coverage can explain low migrant mortality at young adult ages, where the mortality advantage is theorized to be at its strongest. Most importantly, none of the cited studies disaggregates by age. Rather, they estimate the average impact of over-coverage across broad, open-ended, or specific age intervals (e.g. ages 65+). Thus, we set out to use high-quality registers to establish variation in the effect of over-coverage by age, including the peak migration ages at which the advantage is most pronounced.

Data and method
The Swedish registers
Our study uses the collections of Swedish register data called 'Migrant Trajectories', which are organized at Stockholm University and accessible for research under ethical approval from the regional ethics board in Stockholm. We use longitudinal individual-level data from several administrative data sets. We derive our information from four different register sources: (1) the total population register (Mikrodata för Registret över totalbefolkningen; RTB), which acts as the base register for the production of statistics on the size and composition of Sweden's population; (2) the register for labour market studies and health insurance (Longitudinell integrationsdatabas för sjukförsäkrings- och arbetsmarknadsstudier; LISA), which contains annual information on education, employment, health, and social benefits; (3) the migration register, which contains detailed information on registered immigrations and emigrations of the resident population; and (4) the death register. Swedish population data are of high quality because residents are obligated to register their address in order to work in the country and access all the benefits and social services that are available (e.g. for healthcare and social welfare benefits, or for their children to attend school).
Available data from Migrant Trajectories cover the entire resident population of Sweden annually from 1961 up to 2017. In this paper we focus on the period 2010-15, in order to compare any age variation with previous work (Guillot et al. 2018) and to permit analyses of age variation in migrant mortality according to specific birth country groups. We end the observation period in December 2015 rather than 2017, because 2015 is the latest year for which we have all of the information required to construct our indicator of over-coverage.

Measuring population over-coverage
We build on studies that have calculated over-coverage based on the absence of labour market activity and social welfare receipt (Weitoft et al. 1999;Aradhya et al. 2017). These studies considered people to be resident in Sweden if they had received income in the year(s) before or during the study period. The assumption made is that an individual must still be resident to receive an income from work or social benefits in a country with a welfare state that is as comprehensive as Sweden's. This has been referred to as the zero-income approach and has been shown to provide the most conservative estimate of over-coverage available (Monti et al. 2019).
Given that the debate about which measure of over-coverage is the most accurate remains unresolved (Monti et al. 2019), it seems logical to implement the most conservative (or strictest) approach if our aim is to examine whether over-coverage can explain the existence of the migrant mortality advantage. This is because, if the mortality advantage persists after adjusting for the most conservative estimate of population over-coverage, it is unlikely that age variation in immigrant mortality patterns can be explained by this data artefact. Moreover, the zero-income approach is attractive because the receipt of income, including any social benefits, is most likely at the ages of interest (i.e. young adulthood, which is where we might observe the U-shape of advantage).
Specifically, we assume that people are over-covered (i.e. have actually left Sweden but remain 'resident' in the data) if, in a given year, they are registered in the total population register but do not receive an income from employment, social benefits, sick pay, or pensions, and have no recorded emigration or death. Conversely, people are assumed to be resident in a given year if they are younger than 16 (as people in Sweden only enter into the LISA database at age 16) or aged 16 or older and receiving income from the sources listed. In the cases where death or emigration takes place in a given year, these life events are given precedence over the receipt or not of any income or benefits. For example, if a death is recorded in September of a given year, but there is no evidence of income or benefits, this person is still assumed to have been present from the start of the year. With this in mind, our approach falls somewhere between the zero-income approach and the longitudinal register-trace approach that is described in Monti et al. (2019), in which traces from multiple registers are used to determine presence. A variable corresponding to each year identifies individuals as being over-covered ('1') or resident ('0'). This logic is illustrated in Figures 1 and 2, which also provide descriptive information on how the absolute and relative risk time of over-coverage varies across subgroups.
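The decision rule for the yearly indicator can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the analysis: the record layout and field names are hypothetical, and every record is assumed to refer to a person listed in the total population register in that year.

```python
from dataclasses import dataclass

LISA_ENTRY_AGE = 16  # people enter the LISA database at age 16


@dataclass
class PersonYear:
    """One person in one calendar year (hypothetical layout)."""
    age: int
    has_income: bool  # any income from employment, social benefits, sick pay, or pensions
    died: bool        # death recorded in this year
    emigrated: bool   # emigration registered in this year


def over_coverage_flag(record: PersonYear) -> int:
    """Return 1 if the person-year is treated as over-covered, else 0 (resident)."""
    # A recorded death or emigration takes precedence over income information.
    if record.died or record.emigrated:
        return 0
    # Income is not observed before age 16, so children are assumed resident.
    if record.age < LISA_ENTRY_AGE:
        return 0
    # Aged 16 or older: resident only if some income is recorded.
    return 0 if record.has_income else 1


# Illustrative (invented) cases.
no_income_adult = PersonYear(age=27, has_income=False, died=False, emigrated=False)
death_without_income = PersonYear(age=27, has_income=False, died=True, emigrated=False)
child = PersonYear(age=12, has_income=False, died=False, emigrated=False)

print(over_coverage_flag(no_income_adult))       # flagged as over-covered
print(over_coverage_flag(death_without_income))  # death takes precedence
print(over_coverage_flag(child))                 # below LISA entry age
```

Checking deaths and emigrations first mirrors the precedence rule in the text: a person who dies in September without recorded income is still counted as present in that year.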

Statistical methods and study parameters
To estimate the mortality of foreign-born people (also referred to here as immigrants, i.e. those who arrived at some point earlier in their life) relative to the native born, we estimate age-specific mortality hazard ratios (HRs) for all-cause mortality using Cox proportional hazards models. Entry into the study begins in 2010 (1 January) and we follow the resident population for six years until the end of 2015 (31 December). Immigrants can also enter the study if they arrive in Sweden within the date parameters of the study period. Exit from the risk population takes place when people die, emigrate (where emigrations have been registered), or reach the end of the six-year risk window alive and resident in Sweden. Age is the clock in the models, specified using age at entry (into the study) and age at exit (from the study) of subjects when setting the data for event history analysis. We split the data into multiple episodes per person, so that each record corresponds to a single year. To each episode we assign time-invariant covariates (e.g. sex, nativity status, and birth country group), time-varying covariates (year), and our year-specific over-coverage indicator.
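As a rough illustration of this set-up, the sketch below splits one person's follow-up into calendar-year episodes with age as the time scale. It is an assumption-laden sketch rather than the procedure applied to the registers: the function and field names are hypothetical, age is approximated as days divided by 365.25, and the end of follow-up is represented by an exclusive bound of 1 January 2016.

```python
from datetime import date

STUDY_START = date(2010, 1, 1)
STUDY_END = date(2016, 1, 1)  # exclusive bound: follow-up runs to 31 December 2015


def age_in_years(birth: date, on: date) -> float:
    """Approximate age in years on a given date."""
    return (on - birth).days / 365.25


def split_into_year_episodes(birth: date, entry: date, exit_: date, died: bool) -> list:
    """Split one person's follow-up into one episode per calendar year.

    Each episode carries age at its start and end (the model clock) and an
    event flag that is 1 only on the final episode of someone who died
    within the study window.
    """
    died_in_window = died and exit_ < STUDY_END
    entry = max(entry, STUDY_START)
    exit_ = min(exit_, STUDY_END)
    episodes = []
    for year in range(entry.year, exit_.year + 1):
        start = max(entry, date(year, 1, 1))
        end = min(exit_, date(year + 1, 1, 1))
        if start >= end:
            continue  # skip zero-length episodes
        episodes.append({
            "year": year,
            "age_start": age_in_years(birth, start),
            "age_end": age_in_years(birth, end),
            "event": 0,
        })
    if episodes and died_in_window:
        episodes[-1]["event"] = 1
    return episodes


# Invented example: present from the start of the study, died on 1 July 2013.
example = split_into_year_episodes(
    birth=date(1985, 6, 15), entry=date(2010, 1, 1), exit_=date(2013, 7, 1), died=True
)
for episode in example:
    print(episode)
```

Time-invariant covariates (sex, nativity status, birth country group) and the year-specific over-coverage indicator would then be attached to each episode before model fitting.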
Our analyses are split into two sections. The first section focuses on our first two research questions. We examine the foreign-born population as a whole to determine whether: (1) the relative shape of migrant mortality by age in Sweden is consistent with recent work on the US, UK, and France from Guillot et al. (2018); and (2) the characteristic U-shape of advantage can be explained by over-coverage, one of the main explanations of the migrant mortality advantage that Guillot et al. (2018) could not test as part of their framework. The second section focuses on the third research question, by examining specific birth country groups to see, first, whether the relative shape of [...]
In both analyses, we estimate unadjusted age-specific HRs of mortality. Then we drop episodes for individuals with a value of '1' in our over-coverage variable (indicating absence from Sweden) and refit the models without these episodes in order to obtain age-specific HRs of mortality that are (conservatively) adjusted for an estimate of over-coverage. For example, we may imagine a person who is living in Sweden in 2010 and still alive and resident in Sweden at the end of 2015, but who we believe was living outside Sweden in 2012, 2013, and 2014. In the unadjusted analysis, we would include all six episodes for this individual. In the adjusted analysis, we would include episodes for only 2010, 2011, and 2015. Such an approach reflects the dynamic nature of our event history set-up, allowing for permanent exits, temporary one-off exits and returns, and non-continuous presence (i.e. those who split their time between several countries).
As a matter of interpretation, and to reflect the uncertainty in identifying people who have left the country, we consider the unadjusted HRs as a 'lower bound' estimate of relative mortality and the adjusted HRs as an 'upper bound' estimate of relative mortality, proposing that the true mortality differential for the foreign- vs native-born population lies in between. We estimate all-cause mortality for the total resident population of Sweden, but nevertheless include 95 per cent confidence intervals as some measure of population variability. In all models, we adjust for nativity status and year of study: the native born act as the reference group for nativity status and the year 2010 acts as the reference for year of study (from 2010 to 2015). We stratify the models by sex and five-year age groups ranging from 5-9 to 85+. We begin with 5-9, rather than 0-4, to maintain consistency with previous work on age variation in migrant mortality (Guillot et al. 2018), and because there are very few foreign-born children aged 0-4.
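The worked example above can be written out directly. The sketch below is illustrative only; the episode layout and the `over_covered` field name are assumptions, and each episode is treated as one full person-year for simplicity.

```python
# Six yearly episodes for the hypothetical person in the example:
# resident in 2010, 2011, and 2015; believed to be abroad in 2012-14.
YEARS_ABROAD = {2012, 2013, 2014}

episodes = [
    {"year": year, "over_covered": 1 if year in YEARS_ABROAD else 0}
    for year in range(2010, 2016)
]


def adjusted_episodes(all_episodes: list) -> list:
    """Keep only episodes with evidence of residence (indicator equal to 0)."""
    return [e for e in all_episodes if e["over_covered"] == 0]


unadjusted_years = [e["year"] for e in episodes]
adjusted_years = [e["year"] for e in adjusted_episodes(episodes)]

# Risk time contributed to each analysis, in whole person-years.
unadjusted_risk_time = len(unadjusted_years)
adjusted_risk_time = len(adjusted_years)

print("unadjusted:", unadjusted_years, unadjusted_risk_time)
print("adjusted:  ", adjusted_years, adjusted_risk_time)
```

Because the filter works episode by episode rather than person by person, someone who leaves and later returns re-enters the risk set, which is what allows temporary exits and non-continuous presence to be handled.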

Total foreign-born population
Figures 1 and 2 provide flow charts for the total native- and foreign-born populations, describing the absolute and relative shares of exposure time (in person-years; PYs) for episodes with evidence of residence in Sweden and episodes with evidence of possible over-coverage. We highlight several patterns. Exposure time relating to over-coverage is similar in absolute terms among the foreign- and native-born groups. Accordingly, given the differences in the sizes of the two populations, the relative share of exposure time relating to estimated over-coverage is higher among immigrants (9.1 per cent among foreign-born men vs 1.4 per cent among native-born men; 6.8 per cent among foreign-born women vs 1.0 per cent among native-born women). Irrespective of nativity, the absolute and relative proportions of this over-coverage are always lower among women than men. Figure 3 presents the population age structure of the native- and foreign-born populations of Sweden in 2010-15. As expected, the foreign-born population's age structure is younger than that of the native born, with a larger proportion at young adult ages. Regarding variation in age-specific levels of possible over-coverage, the general patterns for both the native born and foreign born suggest that possible over-coverage is highest at peak migration ages (20-39; especially among the foreign-born population), and then declines as age increases. For example, as a percentage of the specific age group, possible over-coverage is highest for the foreign-born population at 25-29, with 15.0 per cent of total episodes for males and 13.3 per cent for females. These patterns of over-coverage are consistent with previous research testing this approach (Monti et al. 2019) and suggest that the greatest bias is likely for mortality differentials at peak migration ages.
Figure 4 relates directly to our first two research questions: to determine whether the same U-shape of the migrant mortality advantage documented in the US, UK, and France by Guillot et al. (2018) can be found in Sweden, and to see whether adjusting (conservatively) for possible over-coverage can account for this U-shaped mortality advantage. Specifically, Figure 4 shows estimates from the event history models relating to the total foreign-born population. It displays adjusted and unadjusted mortality levels by age among Sweden's foreign-born population relative to the native-born population for the time period 2010-15, plotted (with permission) alongside similar estimates for the US, UK, and France (Guillot et al. 2018).
Addressing our first research question, the patterns unadjusted for over-coverage show, for both men and women, age variation in relative mortality among immigrants in Sweden similar to that in the three other high-income countries (Guillot et al. 2018). To elaborate, we document excess all-cause mortality in the age groups 5-9 and 10-14, followed by a deep U-shape of mortality advantage starting at ages 15-19 and ending at 35-39, and at its lowest in the age group 25-29, both for men (HR = 0.60; 0.52-0.70) and women (HR = 0.58; 0.45-0.73). From age 40, the size of the advantage tapers among foreign-born women towards and above the mortality of native-born women, revealing a small disadvantage that peaks in the age group 75-79 (HR = 1.11; 1.08-1.15). The advantage also reverses among foreign-born men, revealing a more substantial disadvantage that peaks in the age group 60-64 (HR = 1.20; 1.15-1.25). Nevertheless, there are some differences in the relative age patterns of mortality in Sweden as compared with the US, UK, and France. These include the depth of the U-shape (which is deepest in the US), the ages at which the U-shape begins (youngest in the US and oldest in France), and the ages at which it ends (youngest in Sweden and oldest in the US). A further difference is whether at old ages the mortality advantage is maintained (as among immigrant populations in the US and the UK), attenuated (as for men in France), or reversed (as for women in France, as well as women and men in Sweden).
Addressing our second research question, and the adjusted estimates, Figure 4 shows that adjusting for possible over-coverage does affect the size of the relative mortality differentials between the foreign born and the native born, particularly at peak migration ages and among men. However, the characteristic age variation, and most importantly the U-shape of advantage, in the mortality of immigrants persists after adjusting for a conservative estimate of over-coverage. To quantify the differences, in Table 1 we provide adjusted and unadjusted HRs for each age group, alongside the absolute difference between these two estimates, and the proportion of the migrant mortality advantage (where it is observed) that is attributable to possible over-coverage. Concentrating on the peak migration age groups, from 15-19 to 35-39, in which we observe large mortality advantages among immigrants, we find that adjusting for possible over-coverage explains a similar amount of the advantage in absolute terms for both women and men. In relative terms, in the 15-39 age range, this translates to 4-25 per cent of the migrant mortality advantage among women and 20-32 per cent of the migrant mortality advantage among men. This means that the majority of the mortality advantage at these ages remains unexplained after making a conservative adjustment for the data artefact that is often touted as a primary explanation. Table 1 also shows the age-adjusted HRs for the entire adult age range (15+). These estimates, when compared with the profound age variation that we have observed, demonstrate the importance of examining age variation explicitly when studying mortality differentials between the foreign born and native born. The adjusted HRs indicate a small excess mortality among foreign-born women (HR 15+ = 1.02; 1.01-1.03) and a larger excess among foreign-born men (HR 15+ = 1.09; 1.08-1.10).
This adjusted estimate masks the considerable advantages that foreign-born men and women experience at peak migration ages (Figure 4), with averages of HR 15-39 = 0.77 (0.71-0.82) among men and HR 15-39 = 0.75 (0.67-0.83) among women. Perhaps just as importantly, averaging mortality across the entire age range underestimates the magnitude of the excesses observed in the age groups 55-59 to 75-79. Previous research has devoted attention to the notion that differentials may disappear at these ages after adjusting for data artefacts. We therefore note that the magnitudes of the excess mortality ratios for age groups 55-59 and above appear to be underestimated without an adjustment for possible population over-coverage.
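As a rough illustration of how the attributable proportion in Table 1 can be read, the sketch below assumes that it equals the difference between the adjusted and unadjusted HRs divided by the unadjusted advantage (1 minus the unadjusted HR). That formula, the function name share_attributable, and the adjusted HR of 0.70 are illustrative assumptions rather than values or code from the study; only the unadjusted HR of 0.60 (men aged 25-29) is taken from the text.

```python
from typing import Optional


def share_attributable(hr_unadjusted: float, hr_adjusted: float) -> Optional[float]:
    """Share of the unadjusted advantage (1 - HR) removed by the adjustment.

    Assumed definition (not confirmed by the source):
        (HR_adjusted - HR_unadjusted) / (1 - HR_unadjusted)
    Returns None when the unadjusted HR shows no advantage (HR >= 1),
    because there is then nothing to attribute.
    """
    advantage = 1.0 - hr_unadjusted
    if advantage <= 0:
        return None
    return (hr_adjusted - hr_unadjusted) / advantage


# Unadjusted HR from the text (men aged 25-29); the adjusted HR is hypothetical.
share = share_attributable(hr_unadjusted=0.60, hr_adjusted=0.70)
print(f"Share of advantage attributable to over-coverage: {share:.0%}")

# An HR above 1 (e.g. 1.11 for women aged 75-79) gives no share to report.
print(share_attributable(hr_unadjusted=1.11, hr_adjusted=1.15))
```

Under this reading, a share of one quarter means that three quarters of the unadjusted advantage would remain after the adjustment.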

Specific birth country groups
Figure 5 bridges our analysis of foreign-born women and men overall (already shown) with our country-specific analysis (shown next). It displays the composition of the entire foreign-born population in Sweden by age in 2010-15 according to specific immigrant birth country groups. Individual birth countries are categorized into eight groups: four of European origin (Finland, Other Nordic, Other Western countries, Central and Eastern Europe) and four of non-European origin (Central and Southern America, Middle East, Asia, and sub-Saharan Africa). Table S1 (supplementary material) shows the birth country composition of these groupings according to the lowest-level country groups available in the register collection that we use. We note here that the group 'Other Western countries' largely comprises countries in Western and Southern Europe, but also includes a small percentage of migrants from the US, Canada, Australia, and New Zealand (16 per cent; see Table S1). Generally, we see that non-European countries (notably Asia and the Middle East) represent a greater share of the population at younger ages (around 60 per cent up to the end of peak migration ages, 15-39) and a smaller share at older ages. The opposite is true for European countries, particularly Finland and the Other Nordic countries. Thus, in the analysis of mortality for the entire foreign-born population (Figure 4), we might expect migrants born in non-European countries to have had a greater influence on the mortality patterns that we observed at younger ages, and migrants born in European and Nordic countries to have been more influential at older ages. Figure S1 (supplementary material) shows population age structures by countries of birth alongside the share of possible over-coverage in each age group. In line with the composition shown in Figure 5, we find an older age structure among Finnish immigrants compared with the age structure of the entire foreign-born population (in Figure 3).
Conversely, we find younger age structures (more reflective of the entire foreign-born population) for immigrants from non-European countries and from Central and Eastern Europe. All birth country groups follow the average pattern, whereby possible over-coverage is highest at peak migration ages and then declines with age. Nevertheless, we do observe variation by birth country group in the extent of possible over-coverage. Estimates of over-coverage are lowest for migrants from Finland; below average for migrants born in Central and Eastern Europe, Central and Southern America, and sub-Saharan Africa; above average for Asia and Other Western countries; and highest for migrants born in Other Nordic countries. Thus, it is for the latter birth country groups that we would expect to observe the largest bias in mortality estimates.
Table note: HR is the hazard ratio; MMA is the migrant mortality advantage; the proportion attributable to over-coverage is not reported where there is no migrant mortality advantage (hazard ratio > 1).

Figure 6 Unadjusted and possible over-coverage adjusted age-specific hazard ratios for mortality (Cox proportional hazards models), foreign-born subgroups with European origins vs native-born men and women in Sweden, 2010-15
Note: HR is the hazard ratio. Vertical axis: age-specific HR of mortality (log scale); horizontal axis: age (five-year bands). Series: native-born population (ref), foreign-born (unadjusted), foreign-born (adjusted), proportion attributable to over-coverage, and 95% CIs. Panels include Central and Eastern Europe.
Source: As for Figure 1.

Figures 6 and 7 show age-specific HRs for mortality for each birth country group relative to the native-born Swedish population for the period 2010-15. These figures therefore allow us to compare age variation in the mortality advantage on average (Figure 4) with the age variation in the advantage by country of birth for the groups defined earlier.
On average, foreign-born children exhibit excess mortality (Figure 4). By country of birth, we find that excess mortality is typical before age 15 for child migrants born in all non-European countries (Figure 7), but not among child migrants born in European countries, except those born in Central and Eastern Europe (Figure 6). The average U-shape of migrant mortality advantage at peak migration ages (15-39) found in Figure 4 can also be seen in around half of the groups in Figures 6 and 7. Notably, it can be found among men and women born in Other Western countries and Central and Eastern Europe (Figure 6), the Middle East, and men born in Asia (Figure 7), but not for women or men born in Finland, the Other Nordic countries (Figure 6), Central and Southern America, or sub-Saharan Africa, nor for women born in Asia (Figure 7). Regarding the emergence of excess mortality at older ages that we see for all foreign-born men and women (Figure 4), we observe the same pattern of relative disadvantage at older ages among men and women born in Finland and Other Nordic countries, and men born in Central and Eastern Europe. In contrast, among older men and women born in non-European countries or Other Western countries, mortality tapers towards, but almost never exceeds, the level of the native-born population.
Figures 6 and 7 also show the impact of possible over-coverage on the mortality of migrants by country of birth group. Overall, as with the entire foreign-born population estimates, we find that adjusting conservatively for over-coverage does not alter the age profiles of relative mortality substantially for any of the groups (in Figures 6 and 7), even among the groups for which estimates of over-coverage are substantial (such as Other Nordic). Additionally, for those groups with a characteristic U-shape of mortality advantage at peak migration ages, over-coverage does not explain the large advantages observed. Tables S2 (men) and S3 (women) in the supplementary material quantify the differences between the adjusted and unadjusted estimates (in the same way as Table 1). For example, a conservative adjustment for over-coverage at peak migration ages (15-39) explains only 10 per cent of the advantage among men and 6 per cent among women born in the Middle East, 16 per cent for men and 14 per cent for women born in Other Western countries, and 24 per cent among men and 16 per cent among women born in Central and Eastern Europe. Considering a wider age range (to eliminate some of the instability in the age estimates for specific origins), the same tables show that large mortality advantages can still be observed (and are not explained by adjusting for over-coverage) even in the absence of any U-shape. The most obvious example is among men (adjusted HR 15+ = 0.78; 0.72-0.84) and women (adjusted HR 15+ = 0.74; 0.68-0.80) aged 15+ born in Central and Southern America.

Discussion
Despite the topic being documented extensively over the past few decades, scholars continue to debate what causes the migrant mortality advantage and, perhaps more importantly, whether or not it is genuine. Debates about its veracity have persisted due to concerns about the accuracy of data on international immigration and a lack of large-scale data sets permitting the detailed study of mortality differentials that compare foreign- and native-born populations. Indeed, we have only recently begun to understand how such differentials vary by age, one of the most fundamental of all demographic characteristics. Here, we have built on recent work documenting age variation in migrant mortality across three high-income countries (Guillot et al. 2018) and in levels of population over-coverage among migrants in Sweden (Monti et al. 2019). Our study represents the first examination of the intersection between these areas of research. Our aim, to understand whether or not age variation in the migrant mortality advantage is real, centred on three research questions: (1) whether the same U-shaped pattern of mortality advantage found in the US, UK, and France is found in Sweden; (2) whether over-coverage can explain the U-shape; and (3) how evidence of age variation and over-coverage varies by country of birth.
Regarding the first question, we found a similar age pattern in relative mortality differentials between foreign- and native-born populations in Sweden to that found in the US, UK, and France (Guillot et al. 2018). For both women and men, we found relative excess mortality at child ages (<15) and a sizeable U-shaped mortality advantage at age groups 15-19 to 35-39. The advantage for the foreign born diminished with age, becoming a disadvantage for immigrants over 60-64, notably among men. This variation by age was hidden by the age-adjusted differential, which indicated a small disadvantage among foreign-born men and a similar level of mortality to the native-born population among foreign-born women. Our findings for Sweden further exemplify the importance of adopting an explicit age perspective in research on immigrant mortality.
With reference to our second research question, we found that while adjusting for possible over-coverage did change the magnitude of mortality differentials between the foreign born and native born to some extent, most prominently at peak migration ages, it did not alter the age profile of mortality differentials. Consequently, we found that over-coverage could explain some, but not all, of the mortality advantage documented in the peak migration age groups. Indeed, a substantial proportion of the mortality advantage between age groups 15-19 and 35-39 (around four fifths among foreign-born women and three quarters among foreign-born men) remained unaccounted for even after making this conservative adjustment for possible population over-coverage.
Considering our final question, we documented considerable heterogeneity in the age profiles of mortality advantage and in the impact of adjusting for over-coverage on mortality estimates according to country of birth groups. Concerning the age patterns, it was rare for a specific group to mirror the exact age profile of the foreign-born average, and the influence of specific birth country groups on the average was clear. For example, the U-shaped mortality advantage in the foreign-born average was driven by men and women born in Other Western countries, Central and Eastern Europe, the Middle East, and men born in Asia. Similarly, the emergence of excess mortality at older ages was driven by immigrants born in Finland, Other Nordic countries, and Central and Eastern Europe. The atypical age profiles for men and women born in Finland and Other Nordic countries show the importance of not generalizing a migrant mortality advantage to all groups. These two birth country groups experienced near universal disadvantages across the age range. We can also look to men born in Central and Eastern Europe, a group with ages of both advantage and disadvantage, to reiterate the importance of adopting an explicit age perspective in work on immigrant mortality. Adjusting for over-coverage appeared to matter more for some birth country groups (such as men and women born in Other Western countries, Asia, and Other Nordic countries) than for others. Despite this, our conservative adjustment did not profoundly alter the age variation of mortality for these groups. For groups in which a U-shape of advantage was documented, conservatively adjusting for estimated over-coverage could not explain the mortality differential, and large advantages remained. We note that immigrants born in the Middle East, a group that combines the characteristic U-shape of advantage with low over-coverage, provide the most reliable evidence to date that the migrant mortality advantage is genuine.
Our findings advance understanding of immigrant mortality in two significant aspects. First, we add to the small body of evidence that has examined the role of over-coverage in the migrant mortality advantage. In line with previous work we determine that this particular data artefact can explain some, but not all, of the advantage (when observed). Importantly, we go beyond this body of research by finding that over-coverage induces differential amounts of bias across ages and birth country. We recommend that future studies, where feasible, implement an age-specific perspective when investigating the impact of over-coverage on the migrant mortality advantage. Our results also call into question whether studies that use alternative data (such as pension data) to overcome over-coverage problems are suited to investigating the migrant mortality advantage. While such data may better capture the resident population, we in fact documented mortality excess among foreign-born women and men at ages 65+ and there was no advantage to explain. Conclusions based on these data may give the false impression that the advantage is a data artefact, whereas we demonstrate that this is not the case, or at least certainly not for Sweden.
Second, our findings contribute to ongoing debate about what causes the migrant mortality advantage. By documenting similar age variation to that found in other high-income countries (Trovato 2003;Guillot et al. 2018), but also determining that this age variation is not caused by over-coverage, we develop new understanding by ruling out over-coverage as a cause of the advantage. Since over-coverage is the data artefact most likely to bias mortality downward, this implies that the advantage must be generated by real mechanisms. Our own conclusions are in line with those of other studies that have documented similar age variation in migrant mortality (Trovato 2003;Guillot et al. 2018;Wallace and Wilson 2019): in short, that the U-shaped advantage at young adult ages is consistent with a strong and recent selection or healthy immigrant effect among recent arrivals. At older ages, the absence, reversal, or tapering of the advantage is consistent with a weaker selection effect among older arrivals, a waning of selection effects among immigrants who have lived in the host country for a long time, and higher average levels of adaptation to the health habits and lifestyles of the host society (Trovato 2003).
Of course, we cannot definitively state that age variation in immigrant mortality in the US, UK, and France is not caused by over-coverage. It is, for example, unclear whether the increased mobility of immigrant populations induces a similar amount of bias in census data (which were used to calculate the mortality estimates for these countries) as it does in register data. Additionally, the mechanisms that generate the bias are somewhat different (i.e. not being present to complete a census form at a specific time of year vs not showing any evidence of residence for the entire year). However, given that the sizes of the U-shaped advantages were similar (or larger in the US), it is clear that a large amount of over-coverage would be required to explain the differentials in these countries. We recommend that future work looks to adjust migrant mortality patterns in these countries for over-coverage to determine the extent of any bias.
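To give a sense of scale for this argument, the following back-of-the-envelope sketch treats over-coverage as registered person-time that contributes no deaths, so that removing a share c of foreign-born exposure raises the HR by a factor of 1 / (1 - c). This is a simplifying assumption for illustration (crude rates, native-born exposure taken as exact, all deaths recorded), not the Cox-model adjustment used in the study, and the function names are hypothetical.

```python
def adjusted_hr(hr_unadjusted: float, over_coverage_share: float) -> float:
    """Rescale an HR after removing over-covered foreign-born person-time.

    Assumes over-covered person-time produces no recorded deaths, so the
    foreign-born rate (and hence the HR) is divided by (1 - share).
    """
    if not 0.0 <= over_coverage_share < 1.0:
        raise ValueError("over_coverage_share must lie in [0, 1)")
    return hr_unadjusted / (1.0 - over_coverage_share)


def over_coverage_needed(hr_unadjusted: float) -> float:
    """Share of registered person-time that would need to be over-coverage
    for the adjusted HR to reach 1 (i.e. to explain the whole advantage)."""
    return max(0.0, 1.0 - hr_unadjusted)


# Unadjusted HR for men aged 25-29 reported in the text.
hr = 0.60
print(f"Over-coverage needed to remove the advantage: {over_coverage_needed(hr):.0%}")

# A hypothetical 10 per cent over-coverage share leaves a large advantage.
print(f"HR after a 10% adjustment: {adjusted_hr(hr, 0.10):.2f}")
```

Under these assumptions, an HR of 0.60 could only be produced entirely by over-coverage if two fifths of registered foreign-born person-time at those ages belonged to people who had already left the country.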
Given the growing share of immigrants in Sweden (Agafiţei and Ivan 2017) and the fact that the age variation in migrant mortality patterns observed here persists after adjusting for over-coverage, it seems plausible that migrants could affect national mortality. We refer to a paper showing that Sweden has been losing ground in life expectancy at birth compared with other countries (Drefahl et al. 2014). On decomposing life expectancy by age, its authors found that while Sweden had retained its advantage at young adult ages, it had lost its overall advantage because old-age mortality had improved more slowly than it had in other countries. We position these findings in the context of the age variation in migrant mortality found here and wonder whether the mortality advantages of young adult migrants have helped Sweden maintain its life expectancy advantage over other countries at young adult ages. Equally, we wonder whether the mortality disadvantages found here among older migrants have contributed to Sweden losing ground in older-age life expectancy advantage relative to other nations, especially given that previous work has found that migrants can retain these advantages into old age in other countries (Trovato 2003;Guillot et al. 2018). Finally, we wonder to what extent in future the changing birth country composition of migrants at older ages, from migrant groups with mortality disadvantages at old ages (Finland and Other Nordic) to migrant groups with residual mortality advantages at older ages (Other Western countries, the Middle East, Asia, and Central and Southern America), might help Sweden recoup gains in life expectancy at older ages compared with other leading countries.
A potential limitation of our study relates to the approach used to adjust for over-coverage. As yet, no consensus has been reached as to which method can be used to adjust for over-coverage most accurately. Here we based our approach around the zero-income method because previous work had shown it to be the most conservative (i.e. it provides the highest estimate of over-coverage), particularly for the ages in which we were interested. In all likelihood, we expect that over-coverage will explain less of the differential than we report here, and that the true mortality differential will fall somewhere between our unadjusted and adjusted mortality estimates. Additionally, the use of income and social benefits as indicators of residence is clearly more relevant for people of working age. As such, we expect this approach to be less effective at older ages. Finally, over-coverage is only one example of a data artefact. It is possible, albeit unlikely, that other data artefacts may be sources of bias in mortality differentials. Other artefacts include ethnic misclassification (irrelevant here because we used country of birth), under-registration of deaths (for this artefact the number of unrecorded deaths among foreign-born residents abroad would have to be large to fully explain the U-shaped advantage at ages where death is rare), and age misreporting (an issue specific to those from less developed countries, of whom there are many in Sweden; however, such an issue should generate an increasing bias with age and not the pattern that we observe here). Although we consider none of these additional data artefacts a likely source of material bias, future research could consider these other types of data artefact, including in Sweden.

Conclusion
Overall, we find that the unique mortality patterns of international immigrants, and in particular their characteristic mortality age profile, cannot be explained by population over-coverage, the data artefact that is most likely to be a source of bias.
Thus, in the age groups and origin regions for which a migrant mortality advantage is observed, we can conclude that this low mortality is genuine. Research on migrant mortality should refocus attention towards understanding what explains this advantage and which combinations of mechanisms result in the absence, reversal, or tapering of relative advantages with age. Decision makers can be reassured that the mortality patterns of immigrants are real, but should remain wary that estimates that do not adjust for over-coverage may over- or underestimate the advantage to some extent. Explicit consideration of age is needed in all analysis and policymaking focused on the health of immigrants in rich countries.