Has it Always Paid to be Rich? Income and Cause-Specific Mortality in Southern Sweden 1905-2014

Socioeconomic differences in mortality are among the most pervasive facts of contemporary societies. While the mortality gradient by income is well-established for the period after 1970, knowledge about the origins of the gradient is still rudimentary. We analyze the association between income and cause-specific adult mortality during the period 1905-2014 in an area of southern Sweden using competing-risk hazard models with individual-level longitudinal data for over 2.2 million person-years and over 35,000 deaths. The present-day income gradient in adult mortality emerged only in the post-WWII period and did so for the leading causes of death largely simultaneously.

Acknowledgements
This work was supported by the LONGPOP project “Methodologies and data mining techniques for the analysis of big data based on longitudinal population and epidemiological registers”, funded by the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie [grant agreement No 676060]. Disclaimer: This document reflects only the authors’ view, and the Agency is not responsible for any use that may be made of the information it contains. The study is also part of the research program “Landskrona Population Study”, funded by the Swedish Foundation for the Humanities and Social Sciences (RJ). We thank seminar participants at the Swedish Institute for Social Research, the University of Minnesota Demography and Aging Seminar Series, and the Departments of Economic History and Political Science, Lund University, and especially Tommy Bengtsson, Ingrid van Dijk and Luciana Quaranta for helpful comments and suggestions.


Introduction
There is a clear health and mortality gradient by socioeconomic status all over the Western world. This is true both for liberal market societies with limited public transfers and mostly private health care, such as the United States, and for Scandinavian welfare states with a high degree of income redistribution and mostly public and equal access to health care, such as Sweden. Whichever indicator of socioeconomic status is used, high status is related to better health and longer life (e.g., Cutler et al., 2012). Most research has analyzed the association between education or social class on the one hand, and mortality or self-rated health on the other (see, e.g., Cutler & Lleras-Muney, 2008; Mackenbach et al., 2016). The literature on the relationship between income and health/mortality is more limited but generally also shows a clear gradient in terms of mortality or life expectancy (Chetty et al. 2016; Hederos et al. 2018; Kondo, Rostila, and Yngwe 2014; Torssander and Erikson 2010; Wamala, Blakely, and Atkinson 2006). Most of these studies also show a widening of the mortality differences over time, although few studies analyze the period before 1990.
Most studies consider all-cause mortality, but some have also analyzed the cause-specific pattern in relation to income. From these studies, it seems as if the widening of the mortality differentials by income in recent times has been especially related to circulatory diseases, alcohol-related diseases and some forms of cancer (Hederos et al. 2018;Tarkiainen et al. 2012).
Knowledge of the mortality differentials by income and other dimensions of socioeconomic status farther back in time is more limited. While many researchers seem to assume that they were as large, or even larger, in the past (e.g., Deaton, 2016; Elo, 2009; Marmot, 2005), recent findings have indicated a rather late emergence of the social class gradient in adult mortality (Bengtsson, Dribe, and Helgertz 2020; Bengtsson and Dribe 2011; Debiasi and Dribe 2020; Edvinsson and Broström 2012; Edvinsson and Lindkvist 2011).
Research on the long-term trends (going back to the period before 1970) of socioeconomic differentials in mortality has almost exclusively focused on occupational class, and there are no previous studies considering the long-term association between income and cause-specific mortality. However, different dimensions of socioeconomic status, such as income, education, and class, are not completely overlapping, but partly reflect different aspects of social stratification (Erikson and Goldthorpe 2010; Breen, Mood, and Jonsson 2016; Blanden 2013; Mood 2017). Hence, it cannot be expected without further qualification that mortality differentials developed identically across the different dimensions, and research on contemporary Sweden demonstrates similar mortality gradients also when including multiple measures of socioeconomic status in the same model (Torssander and Erikson 2010).
We analyze the relationship between relative income and adult cause-specific mortality during the period 1905-2014 in an area of southern Sweden. The area includes an industrial town (population of 14,500 in 1900 and 27,500 in 2000) and five nearby rural and small-town parishes (population of 5,500 in 1900 and 10,000 in 2000). For this area, we have individual-level longitudinal data on cause-specific mortality, occupation and individual income from 1905 to 2014. We study the relationship between income quintiles and total and cause-specific adult and old-age mortality for different sub-periods to establish whether the present-day pattern has always existed or, if not, when it emerged.
Our findings confirm the existence of an income gradient in adult mortality in contemporary Sweden, for both men and women and for all cause-of-death groups. We also confirm a widening of the income differences in mortality since 1970. More importantly, however, we demonstrate that this modern mortality gradient is of quite recent origin and did not emerge until after 1950. When it emerged, it did so more or less simultaneously for most causes of death and for both men and women. The gradient first became visible for infectious diseases and then appeared for circulatory diseases, respiratory diseases, cancers, and external causes.
The rest of the paper is organized as follows. Section 2 reviews previous research on the income gradient and the main theoretical pathways linking SES (and income) to health and mortality. Section 3 discusses data and variables, and section 4 the empirical design. Section 5 accounts for the estimation results and section 6 provides a concluding discussion.

Previous research
While the inequality in health in contemporary societies is well documented and reasonably well understood, the same is not true for historical contexts, which means that we largely lack knowledge about when and why these differentials emerged. It is often assumed that socioeconomic differences in health and mortality were greater in the past, but the empirical evidence to support this assumption is rather scant.
Most studies on socioeconomic mortality differentials are based on occupation (status or social class) or education, while there has not been as much focus on income (Bengtsson, Dribe, and Helgertz 2020; Debiasi and Dribe 2020). The studies that have evaluated several dimensions of socioeconomic status at the same time have often found that the different dimensions only partly overlap in their associations with mortality (e.g., Geyer et al., 2006).
Income, for example, has been shown to be associated with health and mortality also when accounting for other socioeconomic dimensions, such as class or education (e.g., Smith, 2004; Torssander & Erikson, 2010).
Several previous studies of the period after 1970 have demonstrated a clear mortality gradient by income, wealth or sometimes poverty (Elo, 2009;Hederos et al., 2018;Kondo et al., 2014;Smith, 1999;Tarkiainen et al., 2012;Torssander & Erikson, 2010;Wamala et al., 2006). Higher income and/or more wealth is consistently related to lower mortality, even though the strength of the relationship depends on age and is also often muted when controlling for education.
In a recent study using Swedish register data between 1970 and 2007, Hederos et al. (2018) show a clear and consistent association between income and life expectancy at age 35. Higher income is related to longer adult life expectancy, and the income differentials have increased over time. The increasing income differential is explained by faster improvement in life expectancy in high-income groups than in low-income groups. Income differences in life expectancy also increased more for men than for women. Using survey data linked to population registers, Wamala et al. (2006) found shrinking mortality differentials by income for men but increasing differentials for women in Sweden during the 1980s and 1990s. A later register-based study, covering the period 1990-2007, found increasing mortality differentials by income for both men and women in Sweden (Kondo, Rostila, and Yngwe 2014). A similar mortality gradient by income was found in the United States between 1999 and 2014, without any noticeable threshold effects (Chetty et al. 2016). As in Sweden and Finland, the mortality inequality increased over time as improvements in life expectancy were faster for the higher-income groups than for the lower-income groups. In fact, at the bottom of the income distribution there was almost no progress at all in life expectancy. Analyzing more detailed geographical patterns, they also found that life expectancy among the poorest was correlated with health behavior but not with access to health care, physical environment, labor market factors or income inequality. Kinge et al. (2019) found similar results for Norway in an analysis of full-count register data for the period 2005 to 2015. While the absolute differentials between the top 1% and bottom 1% were somewhat lower than in the United States, they were still substantial (8.4 years for women and 13.8 years for men, compared with 10.1 years for women and 14.6 years for men in the U.S. as reported by Chetty et al. (2016)). As in both Sweden and the United States, these differentials also increased in Norway over time. Mortality from cardiovascular disease, cancer and respiratory disease contributed importantly to the increasing differentials.
These studies deal exclusively with the last 50 years, and in some cases only the last 10 years, periods for which nationwide register data are available. Our main contribution is to study how these mortality differentials by income developed over a much longer time period, in order to gain a better understanding of the origins of the mortality gradient and its underlying mechanisms. Previous studies have analyzed the long-term development of social class differences in adult mortality, both all-cause and cause-specific (Debiasi and Dribe 2020). These studies point to a late emergence of the mortality gradient. It was not until after 1970 that the social-class gradient in mortality was fully developed for both men and women. It emerged somewhat earlier for women, but in all broader cause-of-death groups at the same time. Previous research has stressed that different dimensions of socioeconomic status are related to health and mortality through different pathways and that social class, income, and education to a considerable extent are independently associated with various health outcomes (Cutler, Lleras-Muney, and Vogl 2012; Geyer et al. 2006; Torssander and Erikson 2010). Hence, income might be associated with all-cause and cause-specific mortality in different ways than social class.

Pathways
There are a number of different pathways linking income with health and mortality. In the literature on socioeconomic status and health, more generally, five broader sets of factors are often mentioned: (1) lifestyle (including nutrition), (2) access to health care, (3) physical environment, (4) psychosocial stress, and (5) reverse causality. We do not assess the relative importance of these different pathways for income differentials in mortality but focus on charting the development of the mortality differentials and briefly discuss the findings in light of previous research and theorizing on the likely pathways linking socioeconomic status, more generally, and adult mortality.
Lifestyle factors include smoking, diet, exercise, and alcohol consumption, which have well-documented effects on a number of health outcomes and diseases. Behavior and lifestyle are also connected to socioeconomic status, including income. More specifically, in modern developed societies low status is usually associated with higher smoking prevalence, higher alcohol consumption and drug use, greater inactivity, and higher obesity rates (Adler & Stewart, 2010; Case & Deaton, 2020; Cavelaars et al., 2000; Elo, 2009; Marmot, 2005). It is more uncertain if low income and economic resources cause adverse health behavior directly, as many of these behaviors are actually quite costly. Cutler et al. (2012) maintain that income does not affect health and mortality to any greater extent in adulthood, but that education and social class exert an influence on health, mainly through behavior and lifestyle factors.
In preindustrial and early industrial societies, poor nutrition has sometimes been argued to have been an important cause of premature death, and the improved nutritional standards following the agricultural and industrial transformations have been seen as a major explanation for reduced mortality and improved physical stature (e.g., McKeown 1976; Fogel 2004). Even in early-twentieth-century Sweden, nutritional status appears to have been sufficiently high given the work effort also among the lower-status workers in both rural and urban areas (Lundh 2013), following a considerable improvement in the diet and nutrition of the lower-status groups between the late nineteenth century and the first decades of the twentieth century (Hirdman 1983, ch. 1-2). Hence, we do not expect variation in nutrition by income to have been large enough to produce income differentials in adult mortality even in the early twentieth century.
Access to health care could be an explanation behind income differences in health and mortality, especially in contexts lacking universal provision of health care at low costs (Adler and Stewart 2010) but also in contexts where health care is universal or close to universal (van Doorslaer et al. 2000), which could be related to underutilization of health services by lower socioeconomic status groups (Steingrímsdóttir et al. 2012). This might also explain why there is no strong evidence that increased provision of health care actually reduces the health gradient (Smith, 1999). In any case, simply providing more universal health care is likely not enough to eliminate the health gradient in contemporary societies (Adler and Stewart 2010).
Environmental factors could be part of the explanation for income differences because income is an important determinant of residential location and exposure to work hazards. Jobs more exposed to accidents and pollution are rarely found among the best paid. Nonetheless, it remains unclear how much this factor actually contributes to the health gradients observed in the contemporary Western world (Adler and Stewart 2010), although it may have been more important in the past.
Marmot (e.g., 2005) has stressed the role of psychosocial stress for health and mortality differentials. The degree of control over one's life situation is important for psychological well-being and could also have an impact on health. Lack of such control leads to stress, which negatively affects health through different physiological mechanisms (blood pressure, susceptibility to infection, clogging of blood vessels, etc.). It is not only exposure to stress in itself that is important but also the ability to cope with it. In particular, the combination of high stress exposure and low levels of coping leads to negative health effects (Adler and Stewart 2010). Long-term exposure to stress will exact a high toll on bodily functions through frequent use of various physiological coping mechanisms, creating what is often referred to as "allostatic load" (Smith 1999; Adler and Stewart 2010). Having lower income is often associated with a lower degree of control over work and hence higher work-related stress.
A major issue in the vast literature on socioeconomic health differentials is causality. While many assume, and theorize, that economic status has a causal effect on health and mortality, others have pointed out that there is also a causal effect from health to economic status (e.g., Cutler et al., 2012; Deaton, 2003; Montez & Friedman, 2015; Smith, 1999, 2004). Poor health affects investments in education and labor supply, with important long-term effects on income, and deterioration of health status also affects income through sickness absence or reduced work hours. It is well-established in previous research that conditions early in life (e.g., in the fetal stage and during infancy) have long-lasting impacts on health and other outcomes (Barker, 1998; Bengtsson & Lindström, 2000; Bengtsson & Broström, 2009; Case & Paxson, 2009; Elo & Preston, 1992; Quaranta, 2014). Low nutrition and exposure to disease during early life can affect organ development and the onset of chronic disease in adulthood as well as cognitive ability. This in turn might affect both health and earnings later in life. Hence, parental income could have a direct effect on the health and cognitive development of children, which in turn could influence both their health and earnings in adulthood and thereby explain some of the association between income and health in adulthood (Currie 2009; Cutler, Lleras-Muney, and Vogl 2012).
There are studies using different quasi-experimental designs showing a causal effect of different dimensions of socioeconomic status on health or mortality. For example, a recent study using data on 50,000 Swedish twins estimated a three-year longer life expectancy at age 60 for those with high levels of schooling (13 years or more) compared to those with low levels of schooling (less than 10 years) (Lundborg, Lyttkens, and Nystedt 2016). Other studies have found causal effects of income on health using different empirical designs (Kawachi, Adler, and Dow 2010; Lindahl 2005), while some studies have not found any effects on health and mortality from random wealth shocks following lottery wins (Cesarini et al. 2016). We do not assess the direction of causality but focus on the long-term development of the association between income and cause-specific mortality, acknowledging that such associations might reflect causal effects in both directions.

The Scanian Economic-Demographic Database (SEDD)
We used data from the Scanian Economic-Demographic Database (SEDD) linked to nationwide official register data from Statistics Sweden and the National Board of Health and Welfare. SEDD consists of individual-level longitudinal information from five rural and semi-urban parishes and a port town in southern Sweden, covering the period from approximately 1650 to 1968 (Bengtsson, Dribe, Quaranta, et al., 2020). The database is one of very few that can follow individuals across multiple generations from preindustrial times up until the present, with detailed information on both occupation and income, as well as different demographic outcomes and information on cause of death. The parishes included are not a representative sample of Sweden in a statistical sense, but the area is not atypical and reflects conditions shared by most rural and semi-urban areas of the time studied (see Dribe et al., 2015).
Data from these registers were linked to the historical sample using unique personal identification numbers available from 1947, which allowed an extension of the database along several dimensions. First, individuals who ever lived in the area prior to 1968 and who were still alive in that year were followed until 2014, or until death or emigration, regardless of their geographic location in Sweden. Additionally, spouses, parents, grandparents, children and siblings of individuals belonging to the original population in Scania were added to the database if they were alive and living in Sweden sometime after 1967. All individuals added to the sample population were similarly followed until 2014, death, or emigration from Sweden.
The period also included the First World War, in which Sweden did not participate but by which it was indirectly affected. It was also the period of the Spanish influenza pandemic, which killed approximately 40,000 people in Sweden (out of a population of approximately 6 million). In 1913, a universal pension system was enacted, but with low levels of compensation (Edebalk 1996). The period also saw a continuation of improvements of water and sanitation systems that began in the nineteenth century, which led to dramatic improvements in living conditions, especially in urban areas (Helgertz and Önnerfors 2019; Sundin and Willner 2007).
The period 1922-1949 saw both rapid economic growth and deep economic crisis during the Great Depression of the 1930s. Overall, Sweden experienced faster economic growth than the European average (Schön 2010). It was also a period of rapidly falling income inequality, at least between the top earners and the rest of the population (Roine and Waldenström 2008), and it witnessed the introduction of early welfare reforms, such as unemployment insurance, maternity benefits, and childcare (Magnusson 2000). As a result of previous improvements in urban and rural infrastructures (e.g., housing, water, and sanitation), disease patterns shifted from infectious diseases to chronic diseases, in particular circulatory diseases. In this period, other measures were also taken, such as the rationing of alcohol consumption, which helped keep alcohol-related diseases and deaths to a minimum (Sundin & Willner, 2007, p. 159).
The period 1949-1967 fell within a longer period of record economic growth in many industrial economies, Sweden included. Between 1950 and 1975, the Swedish economy grew annually by over 3% per capita, while population grew by 0.6% (Schön 2010). Income inequality continued to decline (Roine and Waldenström 2008), and the labor force participation of married women increased rapidly, to a large extent connected to increasing demand for labor from both industry and the growing public sector of childcare, education and health and welfare (Stanfors 2014). This was also a period of large investments in housing and urban infrastructure. Overall, the living standards of the Swedish population greatly increased, and the gains were distributed across income groups through income transfers and provision of public services. As in other Western countries, it was a period of great progress in medical knowledge and technology, which contributed to the continuous increase in life expectancy, albeit with a growing gender gap.
During the last two periods, since 1970, the welfare state was further developed, covering broad areas of society regarding income security, education, child care, and health care. After 1973, many industrial economies, including Sweden, suffered from a major structural crisis and struggled with low economic growth and high unemployment. After the industrial crisis, globalization and computerization created new conditions for economic growth, but at lower levels than during the 1950s and 1960s (Schön 2010). Since the 1980s, this development has also been connected with new (neo-liberal) economic and social policy and rising economic inequality in Sweden as well as in other Western countries (Roine and Waldenström 2008).
Previous labor-related immigration from southern Europe was replaced by large-scale immigration to Sweden of refugees, many of whom came from developing countries in Africa and Asia (Bengtsson et al., 2005). The introduction of new medical drugs and treatments, including large-scale vaccinations in the period after 1950, contributed to a further decline in mortality.
However, the withdrawal of alcohol rationing measures in the mid-1950s resulted in a sharp increase in alcohol consumption and alcohol-related deaths (e.g., liver cirrhosis) in the following two decades (Norström 1987;Sundin and Willner 2007). In the last period, after 1990, the prevalence of tobacco smoking has declined significantly, albeit with pronounced socioeconomic differences. The lower-status groups have lagged behind in this development, which has produced increasing socioeconomic differentials in smoking prevalence (Nordlund 2005).

Measuring income
Throughout the period 1905-2014, we used information about income from individual tax returns, mandated from 1903 onward (see Helgertz et al., 2020). Given that we cover a period of over a century, we must account for changes in the tax system. For example, as the income-based benefits (unemployment, sickness, parental leave) were introduced, these were included in total income; this was mainly true for the post-1970 period.
Both men and women had to report their income. Sweden practiced joint taxation of married spouses until 1971, which affected the reporting of married women's earnings. Until 1947, married women were usually not included in the income and taxation registers, but their earnings were included in the tax returns of their husbands. Between 1948 and 1953, earnings of married women were included separately from their spouses in the registers, and between 1954 and 1971, both income and taxes were reported separately in the registers. Nonetheless, until the tax reform of 1971, the taxes to be paid were calculated based on the earnings of both spouses together. To have a measure as homogeneous as possible over the entire period, we calculated a family income by adding together the incomes of both spouses. In a sensitivity analysis, we also analyzed individual income for both men and women for the period after 1950, when such information became available.
There were income thresholds in the tax system, and income below the threshold did not have to be reported. The thresholds remained quite constant over time, and as a consequence, a growing share of people had to report an income over time (see Helgertz et al., 2020 for a detailed description of the sources and income measures). In the case of farmers, a taxable income was calculated based on the value of their farm since they would otherwise report no or very low income. This was particularly the case at the beginning of the period under study, when farmers to a large extent could live on what they produced at the farm. Sometimes, wealthy individuals reported no or very low income, and then a taxable income was calculated by the tax authorities as a fraction of their wealth (1/60 until 1938 and 1/100 between 1939 and 1948, see Roine & Waldenström, 2008).
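The wealth-based assessment is simple arithmetic and can be written out as a short Python sketch. The function name and the decision to return `None` after 1948 are assumptions made for illustration; the text gives the two fractions but does not say how later years were treated.

```python
def wealth_based_income(wealth, year):
    """Taxable income assessed from wealth, using the fractions quoted in the text.

    Hypothetical helper: returns None for years after 1948, for which the
    text states no fraction.
    """
    if year <= 1938:
        return wealth / 60   # 1/60 of wealth until 1938
    if year <= 1948:
        return wealth / 100  # 1/100 of wealth between 1939 and 1948
    return None


print(wealth_based_income(60000, 1930))  # 1000.0
print(wealth_based_income(60000, 1940))  # 600.0
```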
In the analyses, we used total income from sources related to labor (including self-employment) and income from capital and real estate. Before 1968, the income and taxation registers included two different total labor-related income measures: the sum of declared income and the income assessed by the taxation board (for state tax) (Helgertz, Bengtsson, and Dribe 2020). We used the higher of the two, as it should be the best approximation of actual earnings. In the period before 1948, the higher income was usually the income assessed by the taxation board for state tax, while the difference was negligible after 1948. From 1968 onward, we relied on income data from national registers maintained by Statistics Sweden. We used the total income from labor, self-employment, and various benefits relating to previous labor earnings (pensions, parental leave benefits, unemployment benefits, etc.) and net taxable capital income. For the elderly, namely, people aged 60 to 89, we used the highest income recorded between ages 50 and 59 as their income in retirement.
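The two selection rules in this paragraph can be illustrated with a minimal Python sketch. The function names and the age-to-income dictionary are hypothetical, and returning `None` when no income is observed at ages 50 to 59 is an assumption, since the text does not describe that case.

```python
def pre_1968_labor_income(declared, board_assessed):
    """Pre-1968 rule: keep the higher of the two labor-related totals."""
    return max(declared, board_assessed)


def retirement_income(income_by_age):
    """Income assigned at ages 60-89: the highest income recorded at ages 50-59.

    income_by_age maps age -> real income. Returns None when nothing is
    observed at ages 50-59 (an assumption; the text is silent on this case).
    """
    prime_years = [inc for age, inc in income_by_age.items() if 50 <= age <= 59]
    return max(prime_years) if prime_years else None


print(pre_1968_labor_income(1200, 1500))                          # 1500
print(retirement_income({48: 900, 52: 300, 57: 450, 61: 100}))    # 450
```

Note that income at age 48 is ignored even though it is the largest value, because only ages 50 to 59 enter the rule.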
We calculated income in constant prices using the consumer price index (SCB 2020). We derived quintiles using an approach adopted by Hederos et al. (2018). First, we ran fully interacted regressions of annual income on birth cohort, year, and sex, keeping the residuals for each individual-year observation. Second, we used a three-year average including the residual from the current year together with the residuals from the preceding two years (e.g., the residual in 1980 was the average of the residuals in 1978, 1979, and 1980).¹ This was done in order to mitigate possible short-term variations in reported earnings. Finally, for each individual in each year, the income measure was assigned to the corresponding quintile. When estimating the mortality differentials by income quintile, we controlled for marital status, which adjusted for the higher income assigned to married couples through the use of family income.
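The three steps described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' code: birth cohort, year, and sex are treated as categorical, so the fully interacted regression reduces to cell means; years missing for an individual are skipped when averaging; and quintiles are formed by ranking everyone observed in the same calendar year. The record layout and function name are hypothetical.

```python
from collections import defaultdict


def income_quintiles(records):
    """records: list of dicts with keys id, year, cohort, sex, income (real).

    Returns {(id, year): quintile}, where 1 is the lowest fifth.
    """
    # Step 1: a regression saturated in cohort x year x sex has the cell
    # mean as its fitted value, so the residual is income minus that mean.
    cells = defaultdict(list)
    for r in records:
        cells[(r["cohort"], r["year"], r["sex"])].append(r["income"])
    cell_mean = {key: sum(vals) / len(vals) for key, vals in cells.items()}
    resid = {}
    for r in records:
        key = (r["cohort"], r["year"], r["sex"])
        resid[(r["id"], r["year"])] = r["income"] - cell_mean[key]

    # Step 2: average the residual over the current and two preceding years.
    # Assumption: years missing for an individual are simply skipped.
    smoothed = {}
    for (pid, year) in resid:
        window = [resid[(pid, y)] for y in (year - 2, year - 1, year)
                  if (pid, y) in resid]
        smoothed[(pid, year)] = sum(window) / len(window)

    # Step 3: rank within calendar year and cut into fifths.
    # Assumption: the ranking population is everyone observed that year.
    by_year = defaultdict(list)
    for (pid, year), value in smoothed.items():
        by_year[year].append((value, pid))
    quintile = {}
    for year, rows in by_year.items():
        rows.sort()
        n = len(rows)
        for rank, (_, pid) in enumerate(rows):
            quintile[(pid, year)] = 1 + (rank * 5) // n
    return quintile
```

Working with residuals rather than raw income means that a person is ranked relative to others of the same sex and birth cohort in the same year, which keeps the quintiles comparable across a century of rising real incomes.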

Imputing missing income data
There were two main kinds of missing income data in SEDD (before 1968). The first concerns individuals who were below the fiscal threshold and did not report an income, while the second is due to years when income information was not digitized for the city of Landskrona (prior to 1947). For years in which income data were available (i.e., years in which historical registers were digitized), individuals in the registers with no income reported were assigned a value of half the fiscal threshold for the corresponding year,² while individuals who were not listed in the registers were given a value of one. For years in which income information was not digitized, we ran a set of imputations.
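For years with digitized registers, the assignment rule amounts to a small decision function. The sketch below is illustrative only; the function and parameter names are hypothetical, and the `share` parameter is included so that the 25% variant mentioned for the appendix analysis can be expressed with the same code.

```python
def register_year_income(listed, reported_income, fiscal_threshold, share=0.5):
    """Income value used in a year for which the register was digitized.

    listed: whether the person appears in the income register at all.
    reported_income: the recorded income, or None if none was reported.
    share: fraction of the fiscal threshold assigned to non-reporters
           (0.5 in the main analysis described in the text).
    """
    if not listed:
        return 1.0                       # not in the register: value of one
    if reported_income is None:
        return share * fiscal_threshold  # listed, but below the threshold
    return reported_income


print(register_year_income(True, None, 800))    # 400.0
print(register_year_income(False, None, 800))   # 1.0
```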
SEDD includes annual income records for the five parishes from 1905 to 1967, and from 1947 onward for Landskrona. Prior to 1947, it includes income data for approximately every fifth year for Landskrona. We imputed data for the missing years using available income information from Landskrona (from the closest previous and next years) and detailed occupational records (coded in HISCO) for men 18-64 years old as well as for single women³ in the same age range.
We employed predictive mean matching (PMM), which is preferred to parametric multiple imputation (MI) in order to preserve the original distribution of the empirical data when they are not normally distributed, as is usually the case with income (Kleinke 2017; Morris, White, and Royston 2014).⁴ PMM combines standard linear regression and a nearest-neighbor imputation approach. We used linear regression to obtain income predictions, which we then used as a distance measure to derive the set of the ten nearest neighbors⁵ that consisted of individuals with complete values, and then we randomly drew a value from this set.
For instance, for the year 1907 (with missing income), we imputed income from the closest years (previous and next) with existing income data, 1905 and 1909. The first step was to estimate a linear regression of income on occupation as a categorical variable and age as a continuous variable. For each income year to be imputed, we took the CPI-adjusted income for the years with existing income. After running the regression, we predicted income values based on the estimated coefficients for each cell (individuals with both missing and existing income).
Then, the last step was imputing income for individuals with missing information, using data from those individuals with the closest predicted results from the regressions. In this sense, one of the main strengths of the predictive mean matching is the use of real income data and not randomly generated values.
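The procedure described above (regression, prediction, nearest-neighbor matching, random draw) can be sketched in plain Python. This is a simplified illustration and not the Stata routine used in the study: it fits the regression once by ordinary least squares, whereas software implementations of PMM typically also add a random draw of the regression coefficients. All function and variable names are hypothetical, and every occupation among the recipients is assumed to appear among the donors.

```python
import random
from collections import defaultdict


def fit_occupation_age(donors):
    """OLS of income on occupation dummies plus a common linear age term.

    Uses the within-occupation (demeaning) form of the same regression.
    donors: list of (occupation, age, income) with observed income.
    """
    groups = defaultdict(list)
    for occ, age, inc in donors:
        groups[occ].append((age, inc))
    means = {
        occ: (sum(a for a, _ in rows) / len(rows),
              sum(i for _, i in rows) / len(rows))
        for occ, rows in groups.items()
    }
    sxy = sum((age - means[occ][0]) * (inc - means[occ][1])
              for occ, age, inc in donors)
    sxx = sum((age - means[occ][0]) ** 2 for occ, age, _ in donors)
    slope = sxy / sxx if sxx else 0.0  # no age variation: slope set to zero
    intercept = {occ: m_inc - slope * m_age
                 for occ, (m_age, m_inc) in means.items()}
    return lambda occ, age: intercept[occ] + slope * age


def pmm_impute(donors, recipients, k=10, seed=0):
    """donors: (occupation, age, income); recipients: (occupation, age).

    Returns one imputed income per recipient, always an observed donor income.
    """
    rng = random.Random(seed)
    predict = fit_occupation_age(donors)
    donor_pred = [(predict(occ, age), inc) for occ, age, inc in donors]
    imputed = []
    for occ, age in recipients:
        target = predict(occ, age)
        # The k donors whose predicted income is closest to the recipient's.
        nearest = sorted(donor_pred, key=lambda d: abs(d[0] - target))[:k]
        imputed.append(rng.choice(nearest)[1])
    return imputed
```

Because the imputed value is always copied from a donor, the imputed incomes can never fall outside the range of incomes actually observed, which is the property the text emphasizes.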
In the imputations, we also included the incomes set at 50% of the threshold value to avoid biasing the imputed incomes upward for individuals in ages and with occupations most likely to be below the threshold. In other words, using the predictive mean matching with nearest-neighbor imputation we imputed income also for individuals who most likely were below the threshold.⁶ As a robustness check, we took an alternative approach and assigned incomes below the threshold using an imputation based on income for similar individuals in the same year (age and occupation) with income above the threshold. The imputation was made using the method already described. In a second step, we used these incomes for the years when we had the digitized registers when imputing income for the years when income registers were missing, using the same approach as previously described.

³ We do not impute income for married women because their incomes are not reported in the registers before 1947 but are included in their husbands' incomes.
⁴ We used the 'pmm' command in the MI package in STATA.
⁵ Our choice of the 10 nearest neighbors is based on Morris et al. (2014), who recommended it for larger samples with thousands of observations.
⁶ In the appendix, for the analysis where we used 25% of the fiscal threshold instead of 50% for individuals below it, the imputations were redone accordingly.
For men between 18 and 64 and for single women in the same age range, for each year in Landskrona without real income data, we used the closest previous and next years with real data to draw imputations (see Appendix, Tables A1 and A2). All the income data were adjusted for inflation for each specific year to be imputed. Overall, real and imputed incomes have not only similar means but also similar standard deviations, which is an important sign showing that the imputations followed the nature of the income distributions.
To assess the accuracy of the income imputations, we estimated the yearly Gini indexes for both men and women with real data and imputed data, respectively. As seen in Figure A1 and Figure A2, the results are highly similar in both levels and trends. These results support the accuracy of the imputations and the choice of the PMM methodology for analyses that take social and economic disparities into consideration. As Modalsli (2015) pointed out, even when using real mean income or wealth data to estimate occupational incomes, as in so-called social tables, the level of inequality as measured by the Gini index is often underestimated, a pitfall that our imputations could overcome.
Figure A1 and Figure A2 here
To further evaluate the imputations, we performed a sensitivity analysis for the period for which we also have annual income information for Landskrona, comparing imputed and actual income. We used a structure similar to the one adopted for the earlier period, keeping real income only every five years and imputing the years in between. For example, we imputed income data for 1952 using real data from 1951 and 1956 from Landskrona, and then compared actual and imputed data in terms of distribution and overall mean, obtaining similar values (see Appendix, Table A3).
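The Gini comparison described above can be illustrated with a short sketch. The function name `gini` and the rank-based formula used here are choices made for this example; the source does not state how the index was computed.

```python
import numpy as np

def gini(incomes):
    """Gini index of a set of non-negative incomes (hypothetical helper).

    Uses the rank-based formula
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    with incomes x_i sorted in ascending order and ranks i = 1..n.
    """
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    if n == 0 or x.sum() == 0:
        raise ValueError("need at least one positive income")
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1) / n
```

Applying the function separately to the real and the imputed incomes of each year gives two yearly series whose levels and trends can then be compared.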
In total, in the analyzed sample, 40% of the time at risk for men in the first period and 38% in the second period has imputed income information. For single women, imputed income information accounts for 18% and 31%, respectively, in the first two periods.
Furthermore, income information is missing for 8% of the time at risk for men in the first two periods and 1% in the third period. We assigned these observations an income value of one and also tested the robustness of the results by excluding them. For single women, missing income information accounts for about 20% of the time at risk in the first two periods and 2% in the third period. In the last two periods this figure is 3% for men and 6% for single women.

All-cause and cause-specific adult mortality
We study all-cause and cause-specific mortality for men and women between 30 and 89 years of age. Causes of death in SEDD (before 1968) were reported as text strings, which have been coded into ICD10 (Hiltunen and Edvinsson 2018). For the period from 1968, we use information about cause of death from the official cause-of-death register (Dödsorsaksregistret) maintained by the National Board of Health and Welfare. In this register, causes of death are coded following the successive versions of the ICD. More specifically, the ICD codes are as follows: ICD8 between 1968 and 1986, ICD9 in the period 1987-1996, and ICD10 from 1997 until the end of the observation period.
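The coding scheme by year of death can be summarized in a small lookup. This is only an illustrative sketch; the function name `cause_of_death_coding` and the labels it returns are hypothetical.

```python
def cause_of_death_coding(death_year):
    """Return (data source, ICD revision) for a death in the given year."""
    if not 1905 <= death_year <= 2014:
        raise ValueError("year outside the 1905-2014 observation period")
    if death_year < 1968:
        # Historical text strings, coded into ICD10 after the fact.
        return ("SEDD text strings", "ICD10")
    if death_year <= 1986:
        return ("cause-of-death register", "ICD8")
    if death_year <= 1996:
        return ("cause-of-death register", "ICD9")
    return ("cause-of-death register", "ICD10")
```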
We adopted two levels of detail for cause-specific mortality, grouping ICD codes according to two different classifications. In a first step, we divided ICD codes into non-preventable and preventable mortality. This was done following the Avoidable Mortality in the European Union classification, which provides a list of preventable causes of death; all other diseases not included in this list are considered non-preventable 7 (AMIEHS, 2011; Ericsson et al., 2019). Developments in health care, medical treatments, and knowledge about risk factors certainly influenced the preventability of certain diseases over time. Hence, in a second step, we categorized causes of death following the ICD chapter, ultimately constructing a more objective and detailed measure. This resulted in seven cause-of-death groups: (1) infectious and parasitic diseases, (2) circulatory diseases, (3) respiratory diseases (including lung, larynx, trachea, bronchus, lip, oral cavity, pharynx cancers), (4) other cancers, (5) external causes, (6) other causes, and (7) missing and ill-defined causes of death (see Debiasi and Dribe 2020).
Group number six, other causes, includes individuals with an assigned ICD code that does not belong to any of the previous groups. Group number seven, missing and ill-defined causes, includes individuals with missing information and, before 1968, individuals who had a cause-of-death string reported in the historical register that was not possible to code into ICD.

Methods
We employed a competing-risk hazard model with causes of death as competing outcomes. The analysis was made separately by sex and period (1905-1921, 1922-1949, 1950-1967, 1968-1989, 1990-2014), controlling for birth year, marital status, parish of residence, and country of birth (Swedish born or foreign born). Birth year was included as a continuous variable, as we do not expect marked non-linearities within the confined time periods analyzed.
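As an illustration only, and assuming a cause-specific proportional hazards form (the exact specification is an assumption here, as are the symbols), a competing-risk hazard model of this kind can be written as:

```latex
h_k(t \mid x_i) = h_{0k}(t)\,\exp\!\left(\beta_k^{\top} x_i\right),
\qquad k = 1, \dots, K
```

Here $h_k$ is the hazard of dying from cause $k$ at age $t$, $h_{0k}$ is a cause-specific baseline hazard, and $x_i$ collects the income quintile indicators and the control variables for individual $i$. Under this form, $\exp(\beta_k)$ for an income quintile is the hazard ratio relative to the reference quintile for cause $k$.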
We also ran one set of models adjusting for social class as well as a number of sensitivity analyses to test the robustness of the findings. For married couples we used the highest class position in the family, usually, but not always, the class position of the husband. We included four separate classes: non-manual workers, manual workers, farmers, and missing information.
The analytical sample included more than 2.2 million person-years and over 35,000 deaths. Table 1 shows the descriptive statistics for men and women in the sample. The changing distributions of social class across periods reflect the social and economic transformation that occurred in Sweden during the twentieth century. There was a clear decline in the share of farmers and unskilled workers over time (from 7.5% and 22.6% to 1.6% and 3.2%, respectively, for men), and a parallel increase in the share of non-manual occupations, from approximately 20% to over 50% for both men and women. In terms of mortality, it is interesting to note the decreasing prevalence of infectious diseases in parallel to a growing share of circulatory diseases, respiratory diseases, and cancers.
Table 1 here

All-cause, non-preventable, and preventable mortality
Table 2 reports results for all-cause, non-preventable, and preventable mortality (full sets of estimates are presented in Appendix, Table A4). Looking first at men and all-cause mortality, the present-day income gradient in mortality is clearly visible. The bottom 20% of income earners (the reference category) have the highest mortality, and mortality is then progressively lower for each quintile in a more or less linear fashion. Comparing the poorest and the richest, mortality in the bottom quintile of the income distribution is more than twice as high as the mortality in the top quintile. The pattern is similar in the two preceding periods, 1968-1989 and 1950-1967, but the gradients are not as linear in these periods as in the final period. Mortality differences by income grew over time: for men in the highest income quintile relative to the lowest, the hazard ratio was 0.624 (p<0.001) in the period 1950-1967 and lower still in the period 1990-2014. This pattern of growing income differentials in mortality is consistent with previous research both for Sweden and for other Western countries, as mentioned above.
For the period 1922-1949, there is no gradient similar to the one in the later periods. Instead, the results suggest no mortality differences across income quintiles: the hazard ratios are close to one and not statistically significant. Similarly, there is no evidence of income differences in mortality in the first period.
For non-preventable mortality, in the last three periods, the pattern is similar to the one for all-cause mortality, with decreasing mortality hazards from the bottom to the top of the income distribution, although the association is somewhat lower in magnitude. During the first half of the twentieth century, hazard ratios are all close to one and not statistically significant, indicating no mortality inequalities by income quintile. The pattern for preventable mortality closely resembles the one described for all-cause mortality, including in the size of the coefficients. In the last three periods, there is a clear income gradient, which is not present in the earlier periods.
The pattern for all-cause mortality for women in the last three periods is similar to the one for men. Once again, in the first two periods, we do not find any mortality differences, with hazard ratios close to one across the different quintiles. When looking at non-preventable and preventable mortality, the results are similar to the ones for men for the entire time frame. The only exception is in the first period for preventable mortality, when we observe a difference in mortality between the two lowest quintiles.
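To make the reading of the hazard ratios concrete, the short sketch below converts a hazard ratio (top quintile relative to the bottom quintile) into two equivalent statements. The helper names are hypothetical and the calculation is plain arithmetic on the reported value of 0.624 for men in 1950-1967.

```python
def percent_lower_hazard(hazard_ratio):
    """Percent by which the hazard is lower than in the reference group."""
    return (1.0 - hazard_ratio) * 100.0

def reference_relative_to_group(hazard_ratio):
    """Hazard of the reference group expressed as a multiple of the group's hazard."""
    return 1.0 / hazard_ratio

hr_1950_1967 = 0.624  # top vs. bottom quintile, men, all-cause mortality
print(round(percent_lower_hazard(hr_1950_1967), 1))         # 37.6
print(round(reference_relative_to_group(hr_1950_1967), 2))  # 1.6
```

By the same arithmetic, the statement that mortality in the bottom quintile is more than twice that in the top quintile corresponds to a hazard ratio below 0.5 for the top quintile.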

Income-mortality associations adjusted for social class
To obtain a better understanding of the relation between income and mortality, we ran a set of models in which we adjusted for family social class (see Table 4). 8 We used the occupation of the spouse with the highest status in the family (in the order non-manual, farmer, manual). For men, all-cause, non-preventable, and preventable mortality results by income remain unchanged. In the last period, there is an independent negative association between social class and mortality in both all-cause and preventable mortality. In the period 1968-1989, being in the non-manual group is not associated with a lower hazard for any of the mortality groups. Interestingly, in the third period (1950 to 1967), the higher social class shows elevated all-cause, non-preventable, and preventable mortality. Such a positive association is present also in the first two periods, but it is significant only for preventable causes.
Similar to men, the income gradient for women does not change when adjusting for social class for any of the mortality outcomes (all-cause, non-preventable, and preventable mortality).
However, the pattern for social class is different from the one for men. For both all-cause and preventable mortality, the estimates suggest that women in the higher social class had lower mortality already from the late 1960s. Before this period, we do not find any association between social class and mortality. Overall, these results suggest that mortality differentials by income are quite independent from social class. 9
Table 4 here

Sensitivity analysis
We ran a set of sensitivity analyses in order to test for the robustness of our results to different specifications. All results are reported in the Appendix. Tables A7 and A8 show the results for separate models by age group (30-59 and 60-89), for men and women respectively.
For younger men, the income gradient emerges earlier than for older men. For all-cause and preventable mortality in the younger age group, the gradient is already pronounced in the period 1922-1949. For older men, there are reversed differences (higher mortality for higher-income groups) in all-cause and preventable mortality in the second period, which changes in the following period to a negative association and then becomes more marked with time. Women do not show strong age patterns in terms of emergence of the income gradient in mortality. The analyses divided by age groups show similar results to the main analysis presented above.
As a further robustness check, we analyze income differences in mortality using individual income instead of family income. This means that, for married people, instead of considering the sum of the two spouses' income, we use their own income. The results are reported in Table A9 and show a pattern similar to that for family income. Table A10 reports the results for models restricted to currently married individuals. They suggest that, for men, the income gradient is even stronger, and they show lower mortality at the center of the income distribution already in the second period. Married women instead show, in terms of coefficients, a pattern similar to the one described in the main analysis, but with less statistical significance.
Moreover, we estimated models where imputations of income were not used (Table A11) as well as models leaving individuals with missing income information as missing (not setting them to 1) (Table A12). We also tested whether the results changed when assigning 25% of the income threshold to those below it (instead of 50%) or when assigning them an imputed income using the alternative approach described previously (see Table A13 and Table A14). The results were similar to the main analysis, which indicates that the absence of an income gradient in mortality before 1950 is not related to the way we assign income below the threshold. Even when assuming that all individuals without a recorded income had an income similar to that of other individuals of the same age and occupation in the same year, the results were the same.
Additionally, Table A15 shows the pattern for the missing and ill-defined causes of death for the first three periods. It is similar to the overall pattern in that there is no sign of an emergence of a gradient until after 1950.
Since in- and out-migration could potentially bias the results, in Table A16 and Table A17 we report sensitivity analyses for all-cause mortality of individuals who died outside the area under study. More specifically, in Table A16 we used a sample in which people who had ever lived in the SEDD area and died elsewhere in the country were probabilistically linked to the Swedish Death Index using personal details (i.e., names, surnames, birth date, birth place). A related analysis is reported in Table A17. The results are similar to the main findings, which suggests that the income-mortality association was not driven by sample selection related to migration.

In this study, we analyzed income differences in all-cause and cause-specific adult mortality for men and women for a period spanning from 1905 until 2014. The aim was to establish the timing of the emergence of the income gradient in mortality that has been consistently found in developed countries in the period after 1970. Our results show that the present-day income gradient emerged only in the post-WWII period for both men and women, and that it did so for all cause-of-death groups almost simultaneously. More specifically, for both men and women, we observed an income gradient in all-cause and cause-specific mortality from 1950 onward. In the first half of the twentieth century, we found no evidence of an income gradient. There were two exceptions: men at the top of the income distribution showed an advantage in mortality from infectious diseases as early as the 1920s, and men in the highest quintile had higher circulatory mortality in the first half of the twentieth century, particularly in the first two decades. Further detailed analysis showed that the association between income and mortality was independent of social class. Taken together, the results considering both income and social class confirm that, across cause-of-death groups, the income gradient in mortality has been present since the 1950s, and the results show a pattern by social class in line with previous findings (Debiasi and Dribe 2020). Additionally, when splitting the analyses by age groups, we found that the income gradient emerged earlier in working ages than in old age.
Our findings for the two more recent periods are consistent with previous work related to Sweden as a whole for both all-cause mortality and cause-specific mortality (Hederos et al. 2018). Particularly in the last period, income inequalities steadily increased, as did mortality differentials. Previous studies have shown a relation between income inequalities at a macro level and mortality (see Smith, 1996). One of the explanations for this link could be related to the detrimental health effects of inequalities in relative terms; that is, the adverse psychological effect of relative deprivation from belonging to a lower social stratum in comparison to those higher on the ladder (Wilkinson 1994).
The lack of gradient in the earlier periods could be explained by richer individuals not exploiting the entire array of protective factors. In particular, higher income in the early twentieth century has been linked to a higher consumption of alcohol and tobacco as well as to a more sedentary lifestyle. Indeed, we observed a higher circulatory mortality in the top income quintile in the first period. These patterns are similar to those found when studying mortality differentials by social class, for both the same area and for Sweden as a whole, in the first half of the twentieth century (Bengtsson, Dribe, and Helgertz 2020; Debiasi and Dribe 2020; Dribe and Eriksson 2018).
The fact that the older age group had a somewhat reversed gradient in all-cause and preventable mortality in the second period could be further evidence of the importance of lifestyle factors. Higher-income groups in the early twentieth century were more exposed to unhealthy lifestyles, and this was more the case for the older age group than for the younger. In other words, among richer individuals, those in the younger age group could act on their knowledge about health risks at an earlier stage and change their behavior accordingly, while for the elderly it was too late to change bad habits or to undo the consequences of having been exposed to them earlier in adult life.
Finally, as stated above, establishing a causal relationship between income and mortality was not the aim of this study. However, we can observe that if reverse causality were behind the patterns described here, it would mean that health had an increasingly larger effect on income over time. This seems unlikely given the large expansion in welfare policies in terms of sick leave and disability that took place during the time when the income gradient in mortality emerged.
Taken together, our findings highlight the multidimensionality of socioeconomic status and the fact that different measures cannot be used interchangeably, stressing the importance of considering several indicators. At the same time, they lead to the important conclusion that the income gradient emerged only relatively recently and hence that income has not always been an important determinant of health and mortality. This conclusion is based on a regional sample in Sweden. Even though our findings for the period after 1970 are completely consistent with other research on Sweden as well as other developed countries, we cannot be certain that the findings for earlier periods are equally valid outside the study area. Therefore, it is important that further research investigate the historical income-mortality relationship in other contexts.
[Table residue: period columns 1905-1921, 1922-1949, 1950-1967, 1968-1989, 1990-2014 (listed twice); row label: Mean income (1980 SEK)]
Note: The group "Other causes" contains causes of death with an ICD10 code that does not belong to any of the analyzed chapters. The group "Missing and ill-defined causes" contains causes of death with no ICD10 code and causes of death that could not be given an ICD10 code (e.g., symptoms or "old age").
Source: See Table 1
APPENDIX
Table A1. Years with real (in bold) and imputed income for men aged 18-65 in Landskrona (1905-1946)
Table A2. Years with real (in bold) and imputed income for single women aged 18-65 in Landskrona (1905-1946)
Table A3. Years with real and imputed income compared for the sensitivity analysis, for men aged 18-64 in Landskrona (1947-1967)
Columns: Years used to impute; Real income data (SEK); Imputed income data (SEK)
Table A4. Hazard ratios of family income on all-cause, non-preventable, and preventable mortality, men and women aged 30-89.
Source: See Table A4.
Table A14. Hazard ratios of family income on all-cause, non-preventable, and preventable mortality, men and women aged 30-89. Models with imputed income for individuals below the reporting threshold.
Source: See Table A4