Theory and explanation in demography: The case of low fertility in Europe

In the 50th anniversary edition of Population Studies, John Hobcraft commented that demographers spend too little time trying to explain the phenomena they measure and describe. A quarter of a century on, this paper looks at the state of theory and explanation in contemporary demography. I ask how demographers have approached the task of explanation since Hobcraft’s comment, grounding the discussion in the mainstream literature on low fertility in Europe. Using selected examples, I critically review macro- and micro-level approaches to explanation, highlighting some of the philosophical problems that each encounters. I argue that different conceptions of what demography is, and the explanatory language fertility researchers use, lead to differences in explanatory strategies that are rarely explicitly recognized. I also consider how critical theories challenge demographers to think in new ways. Despite the increasing attention paid to theory and explanation, I conclude that more engagement with the philosophy of social sciences is needed before fertility researchers can legitimately claim their studies do as much to explain and understand as to quantify and describe.


Introduction
It is still, I believe, a fair criticism of most of the profession that we spend too little time trying to explain and to understand, rather than to quantify and to describe (Hobcraft 1996, p. 488).
In the 50th anniversary edition of Population Studies, John Hobcraft reviewed work on fertility in England and Wales and noted that most analyses were 'extremely data-bound' (Hobcraft 1996, p. 488). He concluded that demographers spend too much time describing and too little time trying to explain and understand population change. In this paper I ask whether Hobcraft's criticism remains a fair one 25 years on. I approach this task not by examining empirical findings but by considering the role of theory in demography, how demographers have approached the task of explanation, and whether more could be done to promote an explanatory agenda. The discussion is grounded in the mainstream literature on low fertility in Europe and is not a comprehensive review. Rather, I consider theoretical contributions that have most influenced thinking in predominantly quantitative fertility research and draw on various recent reviews published in demography journals (e.g. Gauthier 2007; Balbo et al. 2013; Esping-Andersen and Billari 2015; Goldscheider et al. 2015; Matysiak et al. 2021) to select examples that best illustrate both different explanatory strategies and my argument.
If we ask the question 'why is fertility currently low across Europe?', we are looking for an explanation of an aggregate characteristic of the population (fertility) which is in some way surprising or concerning. We may want to know why fertility is lower than it was only a few decades ago (the temporal trend), why some but not all European populations are experiencing lowest-low fertility (the spatial variation), or even, from an evolutionary perspective, why human populations in Europe are failing to replace themselves (below-replacement-level fertility). The last question is an interesting one because it is a reminder of how theory influences what is identified as surprising or puzzling. However, despite a small but growing interest in evolutionary demography (MacDonald 1999; Mathews and Sear 2013; Sear 2015), few demographers have addressed this question and I therefore focus on explanatory strategies directed towards understanding temporal and/or spatial differences in fertility.
'Theory', as a body of knowledge warranted by the empirical evidence, and 'theorizing', as a process of revealing connections between phenomena or events, are the backbone of explanation (Graham 2005). McDonald (2015, p. 156) suggests that 'theory in demography relates to explanation of why and when events occur to people' but theory has been interpreted in a variety of ways in demography, as we shall see. In order to be as inclusive as possible, I take theory to refer to any set of ideas that go beyond the particularities of individual cases and contribute to making certain circumstances, relationships, or events intelligible (Graham 2000). Three issues are of particular interest in the following discussion. The first concerns the assumptions that are made about what kind of theory (general/universal or context-specific, for example) is appropriate to explanations in demography; the second is about the language that fertility researchers use when talking about connections and relationships; and the third relates to how the term 'context' is understood and conceptualized. The explanatory agenda, I suggest, would benefit from greater clarity on all three.
The paper is organized as follows. First, I briefly consider the dominance of descriptive approaches in fertility research and their limitations in relation to explanation. Following previous reviews, the second and fourth sections address macro- and micro-level approaches, respectively. However, in the third section I interject a reflection on the language of explanation in order to illustrate how moving from macro to micro level brings into focus philosophical questions about the explanation of human action. I then consider recent attempts to integrate macro- and micro-level approaches in the fifth section, and the challenges of critical approaches to gender theory in the final section, before concluding with an assessment of whether or not Hobcraft's criticism still stands.

Description and analysis
In the past, demography was often regarded by its practitioners as a descriptive discipline (see Pressat 1972, English translation), focused on measurement and methods of data analysis, without attention to either the nature of theory or the role of explanation.
The accurate measurement of demographic rates and trends remains at the heart of contemporary demography. Measurement is not only vital, but improvement in measurement can be taken as a marker of progress. In fertility research, the total fertility rate (TFR) remains a commonly used measure, but researchers in the twenty-first century are now more aware of its limitations. In particular, following Bongaarts and Feeney (1998), the sensitivity of the TFR as a period measure to changes in the tempo of childbearing is better understood. More recently, other tempo-adjusted fertility indexes have been developed and are seen as improvements on past adjustments (Bongaarts and Sobotka 2012). Nevertheless, measurement, however refined, describes rather than explains demographic phenomena.
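The sensitivity of the period TFR to tempo can be illustrated with a minimal sketch of an adjustment of the kind proposed by Bongaarts and Feeney (1998), in which the observed order-specific TFR is divided by (1 - r), where r is the annual change in the mean age at childbearing for that birth order. The function names and all numerical values below are hypothetical and chosen purely for illustration. The sketch is not the TFRp* measure of Bongaarts and Sobotka (2012), which also adjusts for parity.

```python
def period_tfr(tfr_by_order):
    """Observed period TFR as the sum of order-specific TFRs."""
    return sum(tfr_by_order)


def tempo_adjusted_tfr(tfr_by_order, mean_age_change_by_order):
    """Tempo-adjusted TFR in the spirit of Bongaarts and Feeney (1998).

    Each order-specific TFR is divided by (1 - r), where r is the annual
    change (in years per year) in the mean age at childbearing for that
    birth order. The adjusted components are then summed.
    """
    total = 0.0
    for tfr_i, r_i in zip(tfr_by_order, mean_age_change_by_order):
        if r_i >= 1.0:
            raise ValueError("annual change in mean age must be below 1")
        total += tfr_i / (1.0 - r_i)
    return total


# Hypothetical order-specific TFRs for birth orders 1, 2 and 3+.
tfr_by_order = [0.70, 0.50, 0.20]
# Hypothetical annual increases in mean age at childbearing (postponement).
mean_age_change = [0.2, 0.1, 0.0]

observed_tfr = period_tfr(tfr_by_order)
adjusted_tfr = tempo_adjusted_tfr(tfr_by_order, mean_age_change)

print(f"observed TFR: {observed_tfr:.2f}")
print(f"tempo-adjusted TFR: {adjusted_tfr:.2f}")
```

Under these assumed values, postponement (r > 0) depresses the observed period TFR relative to the adjusted figure, even though nothing has been assumed about the number of children women eventually have. The adjustment sharpens the description of period fertility; it does not say why postponement occurs.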
A view of demography as essentially descriptive seems unnecessarily limiting and would not be accepted by most population researchers today. Yet many papers on European fertility published in the past 25 years are primarily descriptive, outlining temporal trends and geographical variations and identifying statistical correlates (e.g. Chandola et al. 1999; Billari and Kohler 2004; Frejka and Sardon 2004; Kulu et al. 2007; Sánchez-Barricarte and Fernández-Carro 2007; Goldstein et al. 2009; Bermúdez et al. 2012; Klüsener et al. 2013; Örsal and Goldstein 2018; Sabater and Graham 2019). Some researchers might object to this assessment by arguing that demographers do offer demographic explanations of fertility change by, for example, decomposing trends or revealing how differences in the age structure of populations affect fertility rates. Bongaarts and Sobotka's (2012) examination of the hypothesis that the rise in period total fertility in Europe was caused by the end of the postponement transition is illustrative. Their tempo- and parity-adjusted measure of period fertility (TFRp*) reveals that there was little increase in the level (quantum) of fertility between the late 1990s and 2008, and that most of the observed increase in TFR was due to the diminishing pace of the postponement of childbearing. TFRp*, they say, provides 'a straightforward demographic explanation of recent fertility trends' (Bongaarts and Sobotka 2012, p. 112). It should immediately be recognized that the authors distinguish between two levels of explanation (demographic and socio-economic) and do not limit demography's reach to the former. Of interest here, however, is what makes the demographic 'explanation' they offer an explanation at all. On the one hand, TFRp* can be said to provide a more accurate picture of trends in period fertility, which looks very much like better description. On the other hand, TFRp* is used to demonstrate the impact of tempo changes on period fertility, which plays a role in advancing explanation by clarifying the questions to be asked but does not itself explain the slowing of postponement. When it comes to explaining low fertility in Europe, others have been more ambitious.

Macro-level approaches
A much more ambitious explanatory strategy is to start from the big picture by proposing a general theory from which the particulars of population change, such as low fertility in Europe, might be explained. This strategy is often associated with the idea that demography is a science (Weeks 2005; McDonald 2015). According to Caldwell (1996, p. 311), demographers are 'inheritors of nineteenth century positivism' and this has resulted in the dominance of quantitative methods and a conviction that for demography to be a science, the edifices of theory should be quantitatively testable. Yet, the meaning of the term science is rarely specified. This matters, because different understandings of 'science' entail differences in the nature of theory and its relationship to explanation. Take two prevalent definitions of the term science. The first, the narrower definition, takes the natural sciences, especially physics and chemistry, as its blueprint and distinguishes them from other kinds of knowledge. Knowledge in the natural sciences has been built through the formulation, establishment, testing, and reformulation of laws and theories. Scientific laws are central to the explanatory enterprise in this kind of science, as they are precise statements of how the world works, identifying constant order. The expectation encapsulated in laws, such as Newton's law of universal gravitation, is that they will hold wherever and whenever they are (properly) tested. Thus, in so far as scientific theories account for relationships between different laws, they can be seen as theories without history or geography (Graham 2005).
The second definition of science is broader and involves any systematic and rigorous pursuit of knowledge. In this sense, the social sciences would qualify as science, but so would history (Collingwood 1970, p. 9). The ways that historians approach explanation are very different to those of physicists and chemists, however, not least because the timing of events in the historical record is typically of crucial importance to them. Attempts to produce general theories, such as a theory of (all) wars, run into a variety of philosophical challenges (Collingwood 1970; Rosenberg 2008). Demography, too, qualifies as a science in this broader sense as its practitioners seek knowledge in systematic and rigorous ways, but could it also be a science in the narrower sense? While those demographers who are happy to call themselves 'population scientists' may aspire to emulate the methodology of the natural sciences, the appellation risks confusion in the absence of an explicit account of the meaning of 'science'. And if we judge contemporary demography on the success of its general theory-building, then it does not fare well. In order to illustrate this, I will take two examples, the second demographic transition and gender equity, both of which make claims to be general demographic theories.

The second demographic transition
There are few proposed general theories from within demography that could be of use in explaining contemporary low fertility in Europe. Perhaps the best known relates to the second demographic transition (SDT). The 'theory', first sketched by Lesthaeghe and van de Kaa in 1986, has been widely influential but also attracted criticism (Coleman 2004; Zaidi and Morgan 2017). According to Lesthaeghe (2010), the SDT started with a multifaceted revolution (contraceptive, sexual, and gender), with these ideational changes accentuating individual autonomy as the driving force of new family arrangements and behaviours, including a reduction in the number of children couples have (Surkyn and Lesthaeghe 2004). Twenty-five years ago, Hobcraft (1996) discussed Lesthaeghe's early work at some length and welcomed its broader emphasis on changing values and ideas as an extension of the then dominant economic analysis. A large literature has now developed that uses the SDT to frame empirical studies of European fertility (e.g. Sobotka et al. 2003; Sobotka 2008; Nauck and Tabuchi 2012; Latten and Mulder 2013; Vitali et al. 2015).
Since the SDT was proposed, Lesthaeghe and colleagues, along with van de Kaa, have done much to muster supporting empirical evidence and rebut various criticisms (van de Kaa 1996, 2004; Lesthaeghe and Moors 2000; Lesthaeghe 2010, 2020). Leaving aside the inherent Eurocentrism of the SDT's global claim (Europe first, the rest of the world follows), which has largely been discredited in development studies (Greenhalgh 1996; Hettne 2009), I want to examine more closely its nature and characteristics as a theory. What kind of theory is the SDT? Three characteristics are especially pertinent. First, the SDT includes an essential temporal element. Indeed, not only does the second transition necessarily come after the first but Lesthaeghe and Moors (2000) put more precise dates to the onset of the second transition in various European and other industrialized countries.
A second characteristic of the SDT is that the proposed transition has a geographical dimension, as it happens at different times in different places/populations. According to its proponents, the ideational changes that drive the demographic shifts are spread through processes of diffusion, with some populations being leaders and others laggards. Although there is a recognition that ideational change may also diffuse across social groups (Lesthaeghe 1998), spatial diffusion at different scales is given precedence (Lesthaeghe and Neels 2002; Surkyn and Lesthaeghe 2004; Lesthaeghe 2010).
The third characteristic concerns the relationship between the SDT and the empirical evidence that might warrant its acceptance as a theory. There are numerous debates in different disciplines about how theories might be warranted, and philosophers of science have drawn attention to the 'underdetermination of theory by evidence', even in the natural sciences (Rosenberg 2008, p. 13). According to Lesthaeghe (1998, p. 12), overarching theories in demography are likely to be 'multi-causal with strong contextual variations', making testing them against empirical evidence especially challenging. The ambiguity among demographers about what the transition is and how to define it (Sobotka 2008) may be one reason why attempts to test various aspects of the SDT have shown mixed results (Kertzer et al. 2009; Perelli-Harris and Gerber 2011; Merz and Liefbroer 2012; Bystrov 2014; Brons et al. 2017; Bellani 2020).
In his latest global update, Lesthaeghe (2020) takes prediction as a prime test of the SDT. He comments 'it is reasonable to conclude that the prediction of generalized sub-replacement fertility still stands after 35 years, and that this period may be heading for the half century' (Lesthaeghe 2020, p. 14 of 38). The problem with this is that it does not fully address the 'mechanisms' that purportedly link ideational change to behaviour change to produce demographic 'outcomes' (Rotariu 2006). Further, if 'sub-replacement fertility is not of necessity a SDT feature' (Lesthaeghe 2020, p. 4 of 38, emphasis in original), then low fertility in a population cannot, on its own, provide empirical support for the theory.
What matters is whether sub-replacement fertility results from the impact of social change on individual behaviour, and it is surely essential to specify and investigate these links before the theory can be warranted. If we cannot be sure that contemporary low fertility is the consequence of the ideational changes outlined in the SDT, then Hobcraft's astute observation that 'whilst such findings are suggestive, it is still a long step to use them to explain fertility change' (Hobcraft 1996, p. 519, my emphasis) still resonates 25 years on.
As a general demographic theory, the SDT seems something of a hybrid because it offers a universal account of change with global reach, while also embracing differences across historical and social settings and allowing the possibility of exceptions. Measurement and statistical methods are used to justify its claims, yet the links between the 'conditioning societal correlates at the macro level' and demographic trends (Lesthaeghe 2020, p. 2 of 38) are underspecified, making it impossible to devise rigorous testing of the sort found in the natural sciences. This looks like a theory inspired by a scientific (in the narrow sense) model of what a theory should be but without the precision required to warrant its acceptance, which may be no more than to say that it is a social scientific theory. On the other hand, it may be that this whole approach to theory construction in demography is misconceived. Perhaps the SDT is no more (or less) than an 'anchored narrative' (van de Kaa 1996) or a 'conceptual map' (Lesthaeghe 2011), in which case it plays a rather different role in explanation to the theories found in the natural sciences.

Gender equity
Another theoretical contribution that might play a role in explaining low fertility in Europe concerns gender equity (McDonald 2000a, 2000b). McDonald (2013, p. 982) provides a convenient summary of his theory in a more recent paper:
Gender equity theory in relation to fertility argues that very low fertility is the result of incoherence in the levels of gender equity in individually oriented social institutions and family-oriented social institutions. In advanced economies today women are able to compete as equals in the individually oriented institutions of education and market employment. However, they face a dilemma if family-oriented institutions, particularly as reflected in their role within the family, constrain their capacity to fulfil their aspirations as an individual.

It is therefore the mismatch between women's experiences in different spheres of their lives, or so the argument goes, that sometimes presents them with a stark choice between work and family and leads them to have fewer children than they intended or no children at all. A crucial assumption is thus that women (or at least some women) will respond to an incoherence at the institutional level by restricting their reproduction at an individual level, with the goal of better fulfilling their (non-reproductive) aspirations. A comparison with the earlier 'new home economics' of Becker and colleagues (Becker 1960, 1981; Becker et al. 1977) is revealing. According to Becker's economic theory, the decision to have a child is part of a utility maximization process where the choice is between children (quantity or quality) and other consumer goods. Thus, both theories regard individuals as facing a trade-off between reproductive and non-reproductive aspirations or preferences. However, gender equity theory cannot be placed within an established theoretical tradition, such as utility theory in microeconomics, which might provide legitimation (Lee 2015). Its theoretical claims must therefore be judged on their own merits, according to whether they are warranted by empirical evidence. How, then, are we to judge these claims? One strategy, following the natural science model, is to generate precise, formal hypotheses which can then be tested against the available evidence. Given the imprecision in the central claims of gender equity theory, this is difficult if not impossible. Note that I am not criticizing the theory here but rather raising the issue of whether or not demographers should adopt similar strategies to the natural sciences for warranting theories.
There have been relatively few empirical tests of gender equity theory, largely due to the difficulties of specifying testable hypotheses (Gauthier 2007; Mills 2010), as McDonald (2013) recognizes. In addition to concerns about the availability of appropriate data and how levels of gender equity might be measured, the complexities of society cannot be controlled in laboratory-type experiments. An alternative might be comparative studies where the respective institutional contexts in which fertility is measured can be assumed to be constant. This is a common strategy in the fertility literature and can be applied at different scales (see Fox et al. 2019 for an example at the subnational scale). It is the strategy Mills et al. (2008) adopt in their investigation of gender (in)equity as a potential explanation for low fertility in Europe using individual-level data on fertility intentions from the Netherlands and Italy. Brinton and Lee (2016) also use a country-level comparison in their analysis of how gender inequality is related to fertility levels across 24 Organisation for Economic Co-operation and Development (OECD) countries. To understand whether this approach provides an appropriate test of gender equity theory, we need to look again at the nature of the theory.
McDonald (2015, p. 157) regards demography as 'inherently comparative' and sees comparative studies as central to the testing of the theory when he maintains that:
The purpose of gender equity theory is to provide an explanation for observed differences in fertility across countries or, less specifically, across varying institutional contexts. … Decision making about fertility is made at the individual level and it is incumbent upon any macro theory to demonstrate how the theory is played out at the micro level. However, with its emphasis on the nature of institutions, the theory can only be tested across contexts using contextual level variables (McDonald 2013, p. 985, my emphasis).
For McDonald, gender equity theory is a macro-level theory, which 'would be confirmed if a large proportion of women in these countries indicated that the gender system was unfair to them by not allowing them to combine work and family in ways that maintained their parity of participation' (McDonald 2013, p. 984, my emphasis). In other words, the theory is not so much about the behaviour of individual women or couples, but more about the aggregate outcomes of that behaviour. This brings into sharp focus a conceptual distinction that is often blurred by the different meanings attached to the word 'fertility'. In demography, 'low fertility' and 'lowest-low fertility' are measured at the population level (by, for example, calculating the TFR) and are therefore characteristic of a population group and not of individuals or couples. When we refer to the fertility of individuals and couples, we are talking about how many children they have, or intend to have, which is better described as 'family formation'. The two meanings are, of course, related, but they are also distinct. This opens up the possibility that the kinds of theory that might contribute to explaining population-level fertility are different from those that might help to explain individual-level family formation.
The importance of theory in any area of research lies both in its potential contribution to explanation and in the ways that it influences research questions and directions, not only through explicit attempts to test theoretical claims empirically but also in the added significance it gives to certain empirical observations and their interpretation. For example, an apparent reversal of the fertility decline during the late 2000s in European countries (Balbo et al. 2013) is important, not only for what it tells us about (low) fertility and its possible future trajectory but also because of its implications for the SDT theory. If ideational change, including a rise in individualism, influences women to have fewer children, any increase in fertility takes on a particular significance as it challenges a basic premise of SDT (see Bellani 2020 for a recent discussion of SDT and the educational gradient in fertility).
The SDT and gender equity theories are both macro-level theories that offer different explanatory accounts of low fertility in Europe, but they are not necessarily incompatible. Bernhardt (2004, p. 28) sees the lack of a clear gender perspective as the most problematic aspect of the SDT and suggests that there is 'plenty of room for improvements of the theory'. Taking a longer historical view, Anderson and Kohler (2015, abstract) respond to 'puzzling exceptions and outliers' by proposing 'a more all-encompassing framework' which places gender equity catch-up as the central feature of stages four and five of a six-phase demographic transition. Interestingly, they propose a 'gender-equity dividend', whereby the young adult age structure facilitates a shift towards gender equity by increasing women's bargaining power, thus adding detail to McDonald's version of the theory. Problematically, given the motivation for developing their ambitious framework, they are left with Germany and Austria as exceptional cases. Exceptions may be inevitable in demography, or any other discipline dealing with the social world, but they also raise questions about the strategy of attempting to develop ever more general theories.
There are parallels in the natural sciences to this search for an overarching theory, as when Newton's theory of mechanics came to be seen as a special case within Einstein's special theory of relativity, but there are also manifest differences. In particular, any theory of demographic transition must not only cope with the complexities of relationships in the social world but also with a temporal dimension absent from theories in the natural sciences. When demographers follow a scientific model of theory construction, they confront a tension between this approach and the importance of (historical) time to their theorizing. Moreover, they also confront fundamental questions about the appropriate scale, or scales, at which to theorize and how these might be related. Both the SDT and gender equity theories have been proposed by demographers and both posit links to individual behaviour and thus explanatory strategies that bridge different levels of analysis. For theories that address individual behaviour, however, fertility researchers have turned to other disciplines. Before we consider micro-level approaches, I want to examine in more detail the language that demographers use in their search for explanation. This is important both because linguistic imprecision introduces conceptual ambiguity and thus impedes explanatory clarity, and because the language of explanation appropriate to micro-level theories may differ from that appropriate to macro-level theories.

The language of explanation
Explanation is a complex matter, with many layers. If we, as demographers, are to understand and explain the phenomena we study, we must first know what an explanation is (Saetra 2019). The purpose of explanation in the social sciences is to render phenomena, such as low fertility, intelligible. Achieving this depends both on understanding the nature of the phenomenon in question and being able to give an account of how it came about. In other words, we need to be able to answer both what and why questions. Malnes's (2019) distinction between constitutive explanation (the what) and etiological explanation (the why) is helpful for the conceptual clarity it brings. For example, Bongaarts and Sobotka's (2012) work on TFRp* discussed earlier in the paper can be considered constitutive rather than etiological as it addresses the what but not the why. Nevertheless, it should be noted that the what and the why are fundamentally connected. Thus, constitutive and etiological explanations are not different kinds of explanation but rather different aspects of an explanatory process with internal coherence. In other words, our understanding of the nature of the phenomenon we are studying will influence how we approach the task of providing an explanation of it. The explanatory language used, though often taken for granted in demography, is crucial for ensuring the coherence needed for explanatory success.

Drivers and determinants
The language of 'drivers' and 'determinants' is pervasive in the empirical literature on low fertility. Look at almost any quantitative study on the topic and one or both of these terms will appear in the report and discussion of the findings. Yet there is a distinct lack of critical reflection on their meanings, at least in print. Are the two terms interchangeable or do they have distinct meanings? Both are typically used to describe those factors found to be significantly associated with low fertility in statistical models but, beyond that, uses of the two terms in the literature are somewhat confusing. They appear to be understood mainly as synonymous by fertility researchers, although the term driver may be preferred by those conducting cross-sectional analyses, as a less causally loaded term than determinant. In their decomposition of period fertility in Finland, Hellstrand et al. (2020, p. 11, my emphasis) argue that 'ultimate childlessness may become a strong driver of cohort fertility decline' and later add that 'it is unclear what underlying factors or socio-economic determinants are driving this fertility decline in Finland' (Hellstrand et al. 2020, p. 12, my emphasis). Here the distinction appears to be between (proximate) demographic factors (drivers) and underlying socio-economic factors (determinants), but that distinction is not evident in other usage. Sobotka (2017), to take just one example, identifies educational expansion (an underlying factor) as a 'key driver' of postponed parenthood, and thus presumably of low fertility. Balbo et al.'s (2013) review of research on fertility in advanced societies, on the other hand, is organized according to 'determinants' of fertility, which include women's education. They identify key determinants operating at three analytical levels: the macro level (e.g. economic trends; value and attitude changes), the meso level (e.g. social interaction; place of residence), and the micro level (e.g. fertility preferences; the gendered division of labour).
While this provides an excellent overview of published work and allows the authors to highlight important challenges that need to be addressed in future research, it does not include any reflection on the meaning of the organizing concept. By definition, a determinant is something that controls or affects something else but, if we are to ensure internal consistency when seeking to explain low fertility, we need to think about how the numerous determinants discussed in the literature might produce that outcome. In particular, we need to consider whether micro-level factors (such as fertility preferences) determine fertility outcomes in the same way as macro-level factors (such as economic trends).

Mechanisms and pathways
One way of thinking about the role of determinants is to conceptualize them as part of a causal mechanism. For instance, regarding studies showing earlier fertility among women in educational fields related to the more 'feminine' fields of caring, Balbo et al. (2013, p. 14, my emphasis) remark: 'The mechanism is that women either self-select themselves into educational paths that lead to jobs where they are more able to combine motherhood and employment or, the difficulty of combining career and children varies by chosen career type'. Talk of mechanisms, however, suggests a deterministic relationship, which raises all sorts of philosophical questions when the subject is human behaviour. A clock has a mechanism consisting of various parts functioning as a whole, but is this a good analogy for social phenomena? If it is, then we are committed to explaining features of society in terms of the purposes they serve, not for individuals, but for the society as a whole. Debates about functionalism and the existence of 'social facts' have occupied both philosophers and sociologists for many decades (Rosenberg 2008) and are beyond the scope of this paper. Suffice it to note that there are no similar debates in demography.
An alternative to conceptualizing society as functioning as a whole (holism) is to argue that society is no more than the sum of its parts and that macro-level explanations are ultimately reducible to explanations about the behaviour of individuals (methodological individualism). It is unclear which side of this debate the proponents of the SDT and gender equity theories are on. Lesthaeghe (2011, p. 212) emphasizes not only 'a dynamic set of value orientations' at the societal level but also recognizes that these can change at the individual level, while McDonald (2013) explicitly links the macro level of social institutions to the micro level of individual responses.
Those who take seriously the centrality of women's and men's individual decision-making and behaviour in the understanding of low fertility may hesitate to use terminology that suggests a deterministic or mechanistic relationship between the explanans and the explanandum. There seems little consistency among demographers in how explanatory sequences are conceptualized but another term sometimes used is 'pathway'. For example, in her study of lowest-low fertility in Ukraine, Perelli-Harris (2005) interprets her mixed methods analyses as showing that there is more than one pathway to lowest-low fertility, and Frejka (2017, p. 112, my emphasis), taking a long-term cohort approach to the fertility transition, concludes that 'to date there have been four distinct pathways of fertility decline'. Wood et al. (2016) also refer to the pathways through which economic cycles affect fertility. Billingsley and Ferrarini (2014) provide a conceptual framework of pathways, which illustrates a sequence of potential links through which family policies affect fertility intentions, and Snopkowski et al. (2016) use the same terminology when presenting their structural equation models of the pathways between education and fertility. Before we can identify an appropriate strategy for explaining low fertility, we need also to think about the nature of the links in any such pathway.

Causes and reasons
Many, probably most, demographers would suppose that in order to explain low fertility in Europe, we need to identify its causes. They would no doubt also agree that uncovering causality is a major methodological challenge because of endogeneity issues and the difficulties of knowing what is causing what. As Balbo et al. (2013) note, some researchers give causal interpretations of cross-sectional findings, which is a serious problem. I want to add a more fundamental challenge to the search for causation by asking whether it is ever appropriate for fertility researchers to give causal interpretations of their findings. In a paper on the limits to low fertility, Foster (2000, p. 228) comments, 'in almost all cases the pathway between genes and behaviour is far more complex than suggested by reference to a simple causal link, and is complicated by the phenomenon of consciousness'. She thus draws attention to an issue about the explanation of human action that is widely debated in the philosophy literature.
The debate centres on arguments about the nature of human action and thus how it can be appropriately explained (D'Oro and Sandis 2013). Conscious human behaviour is not, according to one account, to be explained as the effect of certain causes. Explanations of intentional actions must instead aim to render them intelligible in terms of the desires and beliefs of human agents. Take the example of how a train accident might be explained: heavy rain caused a landslip, which in turn caused several carriages to leave the tracks and roll down the embankment. We could, of course, embellish this causal sequence by giving further detail on the amount of rain and its effect on the embankment, but the example illustrates how connecting causes to effect involves showing how they 'hang together' (Malnes 2019). There is no question of conscious choice here, as it makes no sense to claim that the carriages 'chose' to leave the rails. Human action, or so the argument goes, is quite different because it can be explained only by giving reasons, not causes, as to why an individual acted the way they did. In other words, human action is what Winch (1958) in his influential book, The Idea of a Social Science, calls 'meaningful behaviour'.
If low fertility in Europe is understood to be (no more than) the aggregate outcome of the actions and choices of individuals or couples, then demographers cannot avoid engaging with debates in the philosophy of action. The two macro-level theories discussed in the previous section appear to depend on assumptions about behaviour at the individual level, whether the influence of growing individualism and secularism on women's decisions about childbearing or women's perceptions of unfairness in the gender system leading them to restrict their childbearing. In these circumstances, the difference between reasons and causes is crucial because reasons and causes play different explanatory roles (Rosenberg 2008). If a woman explains that the reason she is childless is that she is postponing childbearing because she first wants to establish her career, then her reason justifies her choice by rendering it intelligible. What is more, her explanation reveals certain beliefs and desires, such as her wish to follow a career and her belief that having a child now would make this difficult. However, suppose that she has made up this story because she does not want to talk about the cancer treatment that has left her infertile. In this case, the cancer treatment has caused her infertility, which in turn has the effect of making childbearing impossible. Her beliefs and desires play no part in this latter explanation. Her childlessness has a cause and is not the outcome of a conscious choice. Although fertility researchers might be interested in both pathways to childlessness, it is clear that for the majority of individuals and couples, childbearing is a meaningful action, explicable in terms of motives, reasons, desires, and beliefs. So we are left with a conundrum: if decisions about childbearing are generally not the effect of a causal sequence, should demographers abandon their search for the causes of low fertility and look instead for other kinds of explanation?
Within the social sciences, the debate about reasons for action is especially pertinent for psychologists interested in motivation and goal-directed behaviour. Related ideas lie at the heart of the SDT, which posits connections between ideational change at the societal level and the reproductive behaviour of individual women. Lesthaeghe (2011, p. 197) argues that there has been a cultural drift towards expressive values and roles, and 'a recursive relationship between demographic choices and values orientation' in which 'greater secularism fostered choices in favor of premarital sex and non-traditional household formation patterns, but the latter also reinforced further secularization'. The underlying assumption is that women adopting new post-materialist values and thus the general goal of self-actualization will be motivated to have no children or fewer children than in the past: hence the low and very low fertility seen in European (and other) populations. It is, I think, impossible to spell out the assumed connections between individual-level demographic choices and value orientations without reference to motivation because 'values, in general, influence us by influencing the sorts of reasons we perceive as salient and by motivating us to act on certain reasons rather than others' (Fileva 2017, p. 193). The debate about reasons and causes cannot be avoided if we want our explanations of low fertility to 'hang together' for, 'whatever the truth of the matter, such debates demonstrate that it is of vital importance that we clarify our concepts of motivation, reasons, beliefs, and explanation' (D'Oro and Sandis 2013, p. 25).

Micro-level approaches
Encouraged by the greater availability of detailed individual-level data sets, interest in micro-level approaches has increased in the last quarter century, with fertility researchers looking to other disciplines for theories to inform their empirical work. It is noteworthy that the most influential disciplines in this respect have been economics and psychology. Chatterton (2016, p. 28) observes that both position themselves as 'the most "scientific" of the social sciences', by which he means the most like the physical sciences. It appears that in the search for explanation, many demographers gravitate towards a natural science model of theory even if this is not stated explicitly.
Two examples-new home economics and the theory of planned behaviour-are illustrative. Unlike the macro-level theories discussed earlier, these two micro-level theories were not developed by demographers, but both have influenced numerous empirical studies of fertility. Both are also open to philosophical challenges rarely discussed in the demographic literature. A third example, the life-course perspective, is included because it has become highly influential in studies of population change (Falkingham et al. 2020) and is sometimes referred to as a theory that can be applied to fertility behaviour (Huinink and Kohli 2014).

New home economics
The best-known micro-level theory cited by fertility researchers is the economic theory developed by Becker and others, as mentioned earlier. Associated with new home economics, this theory conceptualizes households as utility maximizers and children as commodities. Lee (2015, pp. 69-70) provides a summary: In this theory, a couple derived utility from its own consumption, from the number of its surviving children, and from the average quality of those children who were usually treated as homogeneous for simplicity. Children were viewed as a kind of consumer durable because they yielded psychic satisfaction to their parents over a long period.
The theory is based on a number of simplifying assumptions, including that households have full information on the costs and benefits of various alternatives and that preferences regarding children are homogenous among household members (Gauthier 2007). Other scholars have criticized one or more of these assumptions, either suggesting modifications (e.g. Stanfors and Goldscheider 2017) or indicating that the theory is fatally flawed (e.g. David 1986; Robinson 1997). One interesting criticism from feminist economists is that new home economics leaves the family as a 'black box', ignoring the issue of unpaid care labour and its implications for gender equality (Hara 2016). The theory appears to have lost some of its previous influence within demography, although Lee maintains that 'Becker gave us a coherent theory built on a sound foundation of established economic principles' (Lee 2015, p. 68), as well as concepts and vocabulary for discussing fertility and the family. The related concept of 'opportunity cost' (foregone income), for example, is used extensively in discussions of women's childbearing (e.g. Sobotka et al. 2011; Hart 2015; Brini 2020). Nevertheless, at least some of the assumptions on which the theory relies are evidently unrealistic; households do not act on the basis of full information, for example. Indeed, as a paradigmatic economic theory with idealized assumptions, Becker's work arguably does not seek to explain individual reproductive behaviour at all for, as Rosenberg (2008, p. 83) argues, 'there seems no way fairly to test the theory that people always maximize utility'. This may not matter if the aim of the theory is to predict, rather than explain. However, how good it is at predicting is unclear. 
In her review of the impact of family policies on fertility, for instance, Gauthier (2007) discusses the implications of five assumptions of the economic model and comments that 'it is not easy to test this model empirically' and that the complexity of the underlying mechanisms may account for some of her 'unexpected findings' (Gauthier 2007, p. 327).

Theory of planned behaviour
Another general theory used by researchers seeking to understand reproductive decision-making at the individual level comes from psychology. The theory of planned behaviour (TPB), developed from the theory of reasoned action, was designed to predict and explain human behaviour in specific contexts (Ajzen 1988). Central to the theory is an individual's intention to perform a certain behaviour. Intention is seen to be influenced by three major, conceptually distinct factors: personal evaluation of a behaviour (attitude), socially expected mode of conduct (subjective norm), and self-efficacy with respect to the behaviour (perceived behavioural control). Actual behavioural control then mediates the relationship between intention and behaviour. Thus, the theory concerns intentional or purposeful actions-where an individual is able to exercise choice, that is, to make conscious or 'reasoned' decisions (Klobas 2011)-and the theory is general in the sense that it can be applied to all cases of such behaviour. In addition: True to its goal of explaining human behavior, not merely predicting it, the theory of planned behavior deals with the antecedents of attitudes, subjective norms, and perceived behavioral control, antecedents which in the final analysis determine intentions and actions. At the most basic level of explanation, the theory postulates that behavior is a function of salient information, or beliefs, relevant to the behaviour (Ajzen 1991, p. 189, my emphasis).
A number of empirical studies of fertility intentions or reproductive behaviour have based their analyses on this theory (e.g. Billari et al. 2009; Philipov 2009; Dommermuth et al. 2011; Fahlén and Oláh 2018).
However, operationalizing the carefully defined core concepts of the theory is challenging, and not just because of the difficulties of finding appropriate data. 'Fertility behaviour', as Ajzen and Klobas (2013) emphasize, properly refers not to a behaviour at all but to the outcome or goal of behaviours (e.g. having unprotected sexual intercourse with someone of the opposite sex). With similar precision, they point out that a control belief is not the same as a constraint or a perceived constraint. Thus, perceived behavioural control is interpreted as a product of the perceived importance of a particular factor and the expectation (belief) that the particular factor will be present. Surprisingly, Ajzen and Klobas (2013) nevertheless suggest that TPB can be applied to fertility decisions, with 'intentions to have a child' predicting 'having a child'. This is surprising because 'having a child', similar to 'fertility behaviour', looks very much like an outcome or goal rather than a behaviour per se (Philipov and Bernardi 2011). At the same time, using the TPB framework to predict or explain the behaviour of having unprotected sexual intercourse seems less central to understanding why couples are having fewer children than in the past. The point is clearer if we take the negative case where an individual or couple does not want to have children. 'Not having children' is not a behaviour, even though, as a goal or desire, it may be associated with a behaviour such as using reliable contraception. Although the TPB might help to explain contraceptive use, it is less obviously useful for explaining childlessness.
The concept of 'intention' plays a central role in the TPB and this also becomes a problem when the theory is applied to reproductive decisions. Fertility intentions change across the life course and have been found in some cases to correspond only weakly with subsequent outcomes (Morgan and Rackin 2010). Ní Bhrolcháin and Beaujouan (2015, p. 26) argue that this instability and inconsistency can better be explained by conceptualizing intentions as constructed, suggesting that 'insofar as people have a family size goal, it is recognized when reached, rather than a pre-existing target'. Further, Liefbroer (2011, pp. 56-7) maintains that the TPB is hard or even impossible to falsify, and he worries that 'if one finds that intentions are not completely determined by attitudes, norms and perceived behavioural control, the natural reaction is not to doubt that the theory is correct, but rather to assert that the central concepts have been measured suboptimally'. I agree but with a strong proviso-that the theory cannot be falsified (or warranted) by demographers. Testing a socio-psychological theory requires methods, measures, and standards established within psychology, and these are beyond the scope of demography. Demographers have little choice but to apply the theory and judge whether or not it provides useful insights into the questions they wish to answer. The perceptive study by Mencarini et al. (2015), implementing TPB to examine fertility intentions and outcomes in Italy, is illustrative. The authors use graphical models to analyse 'the complete path leading to fertility behaviour, within the explanatory framework of the TPB and considering the most common background variables (i.e. determinants) of fertility' (Mencarini et al. 2015, p. 15). Their findings are mixed. 
In line with the expectations of the TPB they find that none of its core dimensions (attitudes, norms, and perceived behavioural control) affects fertility behaviour directly but that, contrary to the theory, some background factors (such as woman's age and duration of a couple's relationship) do directly affect fertility intentions and realizations. They suggest several possible data-related reasons for the contrary findings and add a theoretical one by speculating that there might be unobserved (psychological) factors in addition to the core dimensions of TPB. It could also be, however, that the fertility outcome-having a child-is not a behaviour of the kind that is appropriately explained by the TPB. This is a reminder of how important it is for researchers to engage with more conceptual and theoretical debates.
Perhaps the best use of the TPB in fertility research is as a heuristic framework that sensitizes researchers to the complex pathways that link background 'determinants' to fertility outcomes (Liefbroer 2011). The framework could then be adjusted or extended in ways that clarify its scope and improve its ability to guide explanations of phenomena of central interest to demographers. Rossier and Bernardi (2009), for example, argue that the explanatory power of the TPB model could be enhanced by the addition of three social network 'mechanisms' (social learning, social influence, and social support). Other researchers are more critical. Morgan and Bachrach (2011) question the pre-eminence given to intentions in the TPB and propose an alternative theory-the theory of conjunctural action-which posits links between the brain's neural networks (schemas or mental structures) and behavioural events (fertility behaviour). Not only is testing this alternative theory arguably beyond the scope of demography, it also takes us into deep philosophical waters. The problem, according to philosophers, is the logical impossibility of identifying the belief or desire that any particular brain state constitutes. The argument turns on the subjectivity of beliefs and desires and how they are expressed in language. As Rosenberg (2008, p. 59) puts it, 'philosophers of psychology have expressed this point by saying that mental states are not reducible to behavior or brain states.' Some, perhaps many, cognitive neuroscientists would disagree but, to prove their point, they would have to show what is wrong with the philosophical argument. Theories, in other words, must be judged not only in relation to the relevant empirical evidence but also by the coherence of the philosophical foundations on which they rest (Graham 2000).
In his discussion of the TPB, Liefbroer (2011, p. 56) asks what kind of theory it is and remarks that 'theories can operate at very different levels of abstraction'. He then distinguishes between the kind of theory that gives rise to testable hypotheses and what he calls a 'theoretical or heuristic framework' that cannot be easily tested but rather points researchers to elements of a situation important for explanation. This understanding of theory proper as generating formally testable hypotheses appears to be based on a natural science model of what constitutes a theory. It is not the only possible conceptualization of theory and, indeed, other conceptualizations may be of greater relevance in demography. The differences between kinds of theory are, at least in part, to do with the way that theories abstract from the messiness of the world. The universality of theories such as Einstein's theory of relativity is predicated on an identification of connections abstracted from history and geography. For this kind of theory, the when and the where do not matter. The TPB aspires to be a theory in this sense. One of the criticisms of the TPB is its alleged 'inability to accommodate a process in which intentions may be made and remade over the life course' (Morgan and Bachrach 2011, p. 16). Liefbroer (2011) disagrees, but both papers assume that it is important for a micro-level theory of reproductive behaviour to accommodate change over time.

The life-course perspective
The importance of a life-course perspective within contemporary social science, including demography, is reflected in the establishment in 2000 of an interdisciplinary journal devoted to Advances in Life Course Research. Huinink and Kohli (2014) describe the life-course approach as having fundamentally changed the agenda of demographic research. Despite being called a theory by some (Elder et al. 2003; Huinink and Kohli 2014), this perspective is better described as a conceptual framework or theoretical orientation, at least in respect of its application within fertility research. It has been used extensively as a methodological guide in empirical studies of family formation. Elder et al. (2003) outline five key heuristic principles of the life-course perspective: (1) lifespan development; (2)  Thus, when transitions-such as becoming a parent -happen at different times in individuals' life courses, their antecedents as well as their consequences can also be expected to differ. This suggests that any explanation of a life-course event, such as the birth of a first child, must take into account both its historical timing and its timing within the individual life course.
The life-course perspective has fostered a more dynamic view of family formation as it unfolds over time. Yet, as interpreted by demographers, it appears to accommodate very different approaches to explanation. Three examples illustrate this point. First, Huinink and Kohli (2014) offer an interpretation within the theoretical framework of microeconomics, which shares certain features with Becker's work in new home economics. They conceptualize the life course as a complex process of welfare production, composed of interrelated life domains and time-related interdependencies, and embedded in a multilevel structure of social dynamics and personal development. This moves understanding beyond previous work on the microeconomics of fertility by incorporating many elements of the life-course perspective highlighted by Elder and colleagues. However, Huinink and Kohli's (2014) interpretation continues to rely on a version of rational choice theory, with individuals 'striving for subjective wellbeing (welfare production) as efficiently as they are able to' (Huinink and Kohli 2014, p. 1298, my emphasis). Fertility thus becomes one among a range of instrumental goals for achieving subjective wellbeing, in which the principal substitution is between family formation and labour force participation. The parallels with Becker's work are evident, despite the impressive detail and the much greater attention to time in this interpretation. While the assumption of perfect knowledge is abandoned, the concept of utility maximization is replaced by that of maximizing subjective wellbeing. Again, it is legitimate to ask what kind of evidence would count against the universal assumption of welfare maximization as a behavioural goal.
The other two examples are taken from the growing body of work that uses the life-course perspective as a framework for empirical studies. They offer less comprehensive interpretations but are of interest for the implications they hold about the nature of explanation. Mynarska et al. (2015) focus on the intersection of life domains in their study of childlessness. Using sequence analysis and data for women living in urban areas in Poland and Italy, they reconstruct the trajectories of childless women across the three life domains of employment, education, and partnership. They find both commonalities and differences between women in the two countries but, more importantly, their analyses show the diversity of pathways that lead to childlessness. They conclude: A new research perspective thus presents itself: instead of looking for the key determinants of childlessness, we should investigate how and why the different biographies of childless women emerge. Such an approach will allow us to understand the reasons why women have no children, without neglecting the diversity of the pathways which lead to childlessness (Mynarska et al. 2015, p. 44).
Where pathways into childlessness are diverse, this suggests, childlessness cannot be explained by a common set of determinants but only by understanding the diversity of women's reasons for remaining childless. To put this another way, the same demographic outcome may result from different life-course trajectories, each of which requires a different explanation; to explain individual outcomes we need to understand the reasons for those outcomes. However, understanding reasons requires a different methodological approach to the statistical modelling dominant in demography. For example, in their introduction to a special collection on partnership, Perelli-Harris and Bernardi (2015) discuss social norms and cohabitation. They argue that previous research using quantitative methods is not sufficient to understand the nuances and complexities of partnership in different contexts or 'to provide substantive interpretations of social norms, attitudes, and meanings related to partnership' (Perelli-Harris and Bernardi 2015, p. 703). Their interest is in the reasons that cohabitation occurs in a particular order relative to other life-course events (such as becoming a parent) and they show how focus group discussions can help to uncover the meanings of cohabitation in different cultural contexts. Not only is the methodological approach qualitative rather than quantitative, but both the language and strategy of explanation are different too. The central concern is to provide an interpretative understanding of the different meanings that individuals ascribe to cohabitation in different cultural contexts. As with Mynarska et al. (2015), the implication of this diversity is that no single explanation of why and when couples cohabit will suffice. 
Conceptualizing family formation as a process that unfolds over the life course has been influential in fertility research, but it does not resolve the tension between an explanatory strategy that appeals to universal causes and one that emphasizes the interpretation of difference. Perelli-Harris and Bernardi (2015, p. 723) comment that 'while we may observe universal trends that are changing human behaviour everywhere, the way in which the behaviours change in each context is still unique'. We therefore need to ask how important context is for explaining low fertility in Europe.

Integrating macro- and micro-level approaches
Several frameworks have been suggested for integrating the large volume of evidence on different aspects of fertility and family formation in Europe, although none explicitly considers the distinction between fertility (as an aggregate measure of births within a population) and family formation (as a life-course process capturing individual experiences) (see Liefbroer et al. 2015 for a partial exception). While integrative frameworks do give some attention to context by connecting macro- and micro-level approaches, their underlying assumption appears to be that explanations of (low) fertility at the macro level are ultimately reducible to explanations of family formation at the micro level. Billari (2015), for example, identifies two related stages in the study of population change-'discovery' at the macro level and 'explanation' at the micro level-which, he maintains, interact iteratively. Explanation of population change, then, must be 'rooted in models of the action and interaction of individuals, couples, and families, as embedded in their macro-level context' (Billari 2015, p. S13). This 'social mechanisms' approach involves situational mechanisms (macro-to-micro), action-formation mechanisms (micro-to-micro), and transformational mechanisms (micro-to-macro), which all contribute to the explanation of macro-level outcomes. Others have suggested different approaches. Examples from a collection of papers on explaining fertility (Huinink et al. 2015) illustrate the nature of this work. The extent to which they are compatible with Billari's social mechanisms is unclear, but they all share the view that macro-level outcomes are ultimately to be explained by processes at the micro level.

An integrated 'theory of fertility'
In their introduction to the special collection on the potential for an integrated approach to explaining fertility, Huinink et al. (2015) identify five levels of fertility analysis, from socio-biological to socio-structural. Bringing them together in an integrated approach is the multidisciplinary challenge addressed in the collection. 'The strategy is to identify general mechanisms which, under specific conditions produce "singular" social outcomes in a particular historical situation at a certain place' (Huinink et al. 2015, pp. 95-6). There is a shared assumption that theory development should start at the micro level of individual (or couple) decision-making, but one of the issues that separates contributions, or so it seems to me, is how their approaches to theory deal with macro-level context. Werding (2014, p. 260), for example, outlines an economic theory of fertility based on 'a generic logic of choice' and admits that 'taking into account its social context is clearly not one of the strengths of the economic theory of fertility'. Nauck (2014, p. 1793) does more to incorporate context in a Value of Children approach that 'combines a multilevel and action-oriented theoretical model of generative behavior based on the principles of methodological individualism with the welfare maximizing assumptions derived from social production function theory'. Nevertheless, both authors follow recognizable microeconomic practice by identifying the socio-ecological context of fertility decision-making as comprising opportunity structure, resources, and frames of welfare production.
Bernardi and Klärner (2014), in contrast, draw on social network theory to focus on the context in which family formation occurs. In this approach, individual beliefs and behaviours are interdependent and depend in part on social interactions with others. Mechanisms such as social learning and social pressure affect individuals' beliefs and norms regarding childbearing. Thus, 'the number of children an individual or a couple want to have, including the choice of being childless, is a socially embedded preference' (Bernardi and Klärner 2014, p. 656, my emphasis). The authors argue that the social network framework complements other approaches to explaining fertility-including economic approaches, the TPB, and the life-course perspective-by expanding their conceptualizations of context. However, it is far from obvious that the interdependencies they recognize could be combined with, for example, microeconomic theories without losing the central insight of the social embeddedness of childbearing. The special collection encourages dialogue, but the search for an integrated 'theory of fertility' requires further interdisciplinary exchange to identify 'an integrated system of concepts' (Huinink et al. 2015, p. 105) on which such theory might be founded.
In more recent work, Bernardi et al. (2019) propose a three-dimensional life-course 'cube' as a conceptual tool to aid the understanding of complex demographic processes. They claim that the cube 'advances explanations of demographic change by focusing on micro to macro interdependencies through time' (Bernardi et al. 2020, p. 12), but it is a heuristic framework rather than a theory. Further, as currently specified, the life-course cube blurs the distinction Elder et al. (2003) make between historical time and timing within the life course, and its treatment of socio-spatial context-time and place-is restricted to noting that changing views across the life course are simultaneously influenced by perceptions of 'environmental conditions'. This does not take geography seriously. Studies in spatial demography have shown that fertility rates and family formation patterns vary nationally, subnationally, and between places (Kulu et al. 2007;Klüsener et al. 2013;Fiori et al. 2014;Vitali and Billari 2017;Buelens 2019), and place is more than an abstract location. What is also missing, as Mayer (2019, p. 3) argues, is 'the recognition that the distinction between the individual and the supra-individual is not just one of "levels", but implies processes of the aggregation of individual life outcomes and the repercussion of such aggregation of life decisions and trajectories'. In Billari's (2015) terms, the transformational mechanisms, as well as the situational mechanisms, are underspecified. There is scope for further development, but these limitations need to be addressed before the life-course cube could usefully inform explanations of low fertility in Europe.

The challenges of critical social theory
Understanding the way that societies are organized is important to the development of satisfactory explanations of low fertility, or fertility change, in Europe. The connections between aspects of society (such as welfare systems) and fertility levels are well recognized by fertility researchers (e.g. Esping-Andersen 1999;Mills and Blossfeld 2005;Balbo et al. 2013) but are rarely viewed from the perspective of critical social theory. Postmodernism, feminism, and post-structuralism are all part of the critical theory tradition, which emphasizes the social construction of society, the situatedness of knowledges, and the consequences of differential power relations. Most importantly for the current discussion, it aims to be transformative.
With few exceptions, research on European fertility published in mainstream demography journals has hardly begun to engage with the challenges of critical social theories. (The majority of contributions to feminist demography are not published in such journals, the papers by Sigle and Nandagiri in this volume being among a small number of exceptions.) More generally, there is evidence of some resistance to serious engagement with these challenges. Burch (2001, p. 276), for example, distances himself from critical social theory when he remarks that the model-based approach to science he advocates 'does not agree with the view … that science is totally a social construction … Nor does it have anything to do with "critical theory"', but he offers no discussion of the differences. There are many possible reasons for this reluctance to engage and a full consideration is far beyond the scope of the present paper (but see Williams 2010). I will therefore focus on the example of feminist gender theories as especially pertinent to the present discussion, to illustrate the nature and extent of the challenges posed by a way of theorizing that contrasts markedly with traditions of thought in the natural sciences.

Feminist gender theories
Theoretical work on gender has made an impact in other social sciences, at least since the 1970s.
Although many researchers investigating fertility in Europe cite changing gender relations as an explanation of recent trends, few recognize the complexities of gender (Riley and McCarthy 2003). Goldscheider et al. (2015), for example, posit a two-stage gender revolution in which structural changes in women's roles in the public sphere are followed by men's increased involvement in the domestic sphere. Contrasting their approach with the ideational emphasis of the SDT, they maintain that 'the gender revolution is thoroughly structural, reshaping as it does the fundamental relationships between men and women' (Goldscheider et al. 2015, p. 213). However, unlike feminist theorists, they do not interrogate the notion of 'gender' or treat it as a social construction. As Williams (2010, p. 200) remarks, 'demographic research on gender and women's empowerment is still seriously lacking in both scope and sophistication'. Feminist gender theorists do conceptualize gender in structural terms but also insist that gender is socially constructed and dynamic, embedded in the individual, interactional, and institutional dimensions of society (Risman 2004). In sociology, this complex body of theorizing contains a number of themes that have been revised and extended over time (see Risman 2018 for an overview). My purpose here is not to give an account of the complexity but rather to show how interrogating gender can challenge existing theories and explanations of low fertility in demography. I will restrict the discussion by highlighting three of the main interrelated challenges posed by recognizing that gender is both socially constructed and dynamic.
The first challenge to consider concerns the conceptualization of women's 'choices' about whether or not to have a(nother) child. Both the SDT and gender equity theories discussed earlier assume that gender relations are changing, at least in Europe, but neither critically interrogates gender or explores how change comes about. For feminist theorists, 'power is at the heart of how gender organises societies' (Riley and McCarthy 2003, p. 112) and inequality and oppression are produced by men and women having unequal access to power and privilege. Thus, women's choices are constrained not simply by practicalities (such as the availability of childcare) but more fundamentally by the extent to which they are empowered. Gender stratification is produced and reproduced by everyday interactions with others-by 'doing gender' (West and Zimmerman 1987)-and change comes from resistance to powerful patriarchal norms. Moreover, gender cannot be understood simply as a generic social structure constraining individual choices, because this ignores 'not only internalized gender at the individual level but also both the interactional expectations that remain attached to women and men because of their gender category and the cultural logics and ideologies embedded in society-wide stereotypes' (Risman 2018, p. 25).
Understanding gender as socially constructed and dynamic reveals complexities that challenge, for example, McDonald's (2013) distinction between gender equity in individually oriented social institutions and family-oriented social institutions for, as Risman (2018, p. 32) points out, 'a gender structure has implications for individuals themselves, their identities, personalities, and therefore the choices they make'. This complexity is hardly captured by assuming that individual choices about whether or not to have a(nother) child depend on 'perceptions of fairness' (McDonald 2013). Since the micro-level approaches to fertility theory reviewed earlier ignore patterned gender inequalities and how they are produced, reproduced, and revised, they can also be seen as failing to recognize the importance of gender as a continuously changing (rather than fixed) social structure and how differential power affects fertility outcomes.
The second challenge I want to highlight relates to the understanding of context, a central issue for explanations of any social behaviour (Riley 2019). Feminist theorists have not only shown how gender is embedded in the cultural logics of our organizations and institutions (Acker 1990) but also emphasized its cultural specificity. As Risman (2004, p. 442) notes, 'how gender identities are constructed on the individual and cultural dimensions vary tremendously over time and space'. This has quite profound implications for how empirical work on fertility is conducted, implications that go beyond the recent theorization of (national) variations in gender ideologies (Brinton et al. 2018). As Williams (2010, p. 205) puts it, 'gender must be treated as contextually and culturally specific and demographers must adjust their development of and use of survey instruments'. Explanations of low fertility, then, must be culturally sensitive, which requires 'an assumption that people engage their world in terms of highly various and local systems of meaning' (Fricke 1997, p. 186). This raises a number of questions for demographers as it cannot be taken for granted that cultural contexts coincide with national borders or indeed with any of the supra- or subnational areas that are typically used to analyse spatial variations in fertility. It is also possible that two or more local systems of meaning coexist in the same geographical area. Such diversity suggests that researchers seeking to incorporate gender in explanations of low fertility in Europe need a far more nuanced concept of context than has so far been developed.
The third challenge arises from the transformational, emancipatory stance of feminist gender theorists and is the most fundamental. It requires a redefinition of what demography is for, a recognition of the need to analyse power (Presser 1997), and a rethinking of what sort of theories and explanations are possible. Feminist gender research is dedicated to revealing and opposing gender injustice. It thus aims to transform as well as inform society (Risman 2004) and is necessarily political (Riley 2019). This conflicts with the notion cherished by some that demography is a value-neutral science, offering empirically informed, objective accounts of population change (e.g. Courgeau et al. 2017). Feminist theorizing about gender and fertility therefore moves away from seeking theories and explanation modelled on the natural sciences and embraces more interpretative approaches to understanding local meanings of gender and thus women's lives, including meanings of childlessness and motherhood. Despite some recent attention by researchers using qualitative data to how gender is socially constructed (King 2018), demography continues to rely heavily on positivist methodologies and the attendant aim of objectivity. While fertility researchers have expanded the scope of their work over the last 25 years to include much greater attention to gender, such expansion, as Williams (2010, p. 198) notes, 'has not been accompanied by a change in demographic epistemology or methodologies', nor by a widespread recognition of the complex multilevel dynamics of gender as a social structure (Risman 2018). There is in demography, to use Johnson-Hanks' (2007) words, a 'missing theoretical revolution'.
In her paper 'Doing feminist-demography', Williams (2010) notes demography's general lack of engagement with feminist theory and the continuing 'epistemological tensions' between the two fields, despite the expansion of demography's subject matter to include the influence of 'the social'. She concludes:
Demographic research that uses truly mixed method designs will be more attuned to the benefits of qualitative research that go beyond validation of quantitative measures. This work makes it more likely that the epistemological assumptions underlying qualitative research will impact demographic theories and help create the intellectual space for feminist-demography (Williams 2010, p. 207).
I am less optimistic because conflicting epistemological assumptions cannot be resolved solely through study design. As Johnson-Hanks (2007, p. 5) notes, 'we need to first explicitly address how and where we disagree'. Perhaps the most fruitful strategy for doing this is to participate in discussions with those from other disciplines currently at the margins of mainstream demography, who hold very different epistemological viewpoints (Coast et al. 2007). There is an ongoing interest in demography's connections to other disciplines (Pavlík 2000), and Balbo et al. (2013, p. 3) acknowledge that the study of fertility is 'highly interdisciplinary', although they mention neither anthropology nor geography when outlining the coverage of their review. As we have seen, when it comes to micro-level approaches to explanation, fertility researchers have turned first to economics and then to psychology for theoretical inspiration. Both have limitations, it seems to me, because neither encompasses an adequate conceptualization of social context. Even otherwise valuable formulations of an integrated framework for understanding family formation fall short in this respect.
This looks like a very good moment to think more critically than in the past about how society works and how individuals may both reproduce it and change it.

Conclusion
There is now a vast literature on low fertility in Europe and I have been necessarily selective in the choice of examples that best illustrate my arguments.
In particular, I have only touched on the different ways that psychological theories might throw light on fertility motivation (McAllister et al. 2016) and have said nothing about the complex links between demography and biology (Hobcraft 2006). In my defence, and with the exception of critical theories, the examples of theoretical work I have discussed continue to exert a broad influence on empirical work in the field. Yet the underlying tensions between these various approaches to theory and explanation seem indicative of confusion rather than clarity. On the one hand, consensus has not crystallized around one dominant explanatory paradigm and, if demography is not a science in the narrow sense, perhaps we should not expect it to. On the other hand, simultaneously embracing fundamentally different explanatory strategies is not the answer because it risks fragmentation and incoherence. I argue elsewhere that the study of populations requires different layers of theory-population theories, theories of society, and philosophical theories-and that 'no academic research, however mundane, can avoid making philosophical assumptions' (Graham 2000, p. 264). Since theories have explanatory force, these different layers must 'hang together' in a way that is conceptually coherent. If disagreements are fundamentally epistemological, then they will not be resolved without engaging in philosophical debate. Relevant debates include not only that between the social constructionism of critical theorists and the positivism of those who emphasize the scientific credentials of demography but also that between holism and methodological individualism (Courgeau 2003) and related philosophical questions, such as whether or not it makes sense to regard fertility rates as Durkheimian social facts (Johnson-Hanks 2007). All hold implications for how we approach the task of explanation and, ultimately, for the coherence of our explanations.
This leaves fertility researchers still with many questions to answer and issues to settle. Demography continues to be valued, rightly, for the precision with which it measures its core phenomena, including fertility. In comparison, very little critical attention is given to the language of explanation. What is meant by drivers and determinants? Are links between the life course and demographic change appropriately conceptualized as mechanisms or should they be understood as non-deterministic pathways? How important are reasons and motives in the understanding and explanation of family formation? Does it make sense to regard reproductive behaviour as 'caused' rather than meaningful action? Greater clarity on these questions would enhance theorizing and aid the development of a coherent explanatory strategy.
One of the main weaknesses of all the theoretical contributions reviewed relates to how they treat context, which is typically incorporated as a set of macro-level opportunities and constraints on individual actions or choices. Multilevel modelling approaches may have enhanced analytical potential by making it possible to study individuals situated in multidimensional space (Courgeau 2003), but they cannot replace reflection on how context influences, and is influenced by, individual actions (Billari 2015). For human actors, space is not an empty container. Rather, places are invested with different meanings (e.g. 'home', 'safe', 'dangerous', 'wealthy', 'poor') and social identities are made in place. Population change, Leick and Glorius (2016) suggest, is not only multilevel and multi-actor but also multiscalar and thus a context-specific phenomenon (see Buelens 2019, for a fertility example).
Fertility researchers, as demographers more generally, have shown increasing interest in the theoretical grounding of their analyses over the past 25 years. Many empirical investigations are now framed in terms of a theoretical perspective, whether as an explicit test of a theory or as a justification of the research design. Nevertheless, the sheer volume of empirical work greatly outweighs the effort devoted to theorizing and explaining. While there is now a wealth of empirical evidence on the levels, trends, and spatial variations in European fertility, explanation of the findings is often partial and speculative. Explaining low fertility in Europe is a complex task and the development of a coherent explanatory strategy would benefit from greater engagement with other disciplines beyond economics and psychology. In particular, attention to debates in the philosophy of social sciences is needed before fertility researchers can legitimately claim that their studies do as much to explain and understand as they do to quantify and describe. In the meantime, it seems to me that Hobcraft's (1996) criticism still stands, because there remains more groundwork to do before the explanatory agenda in demography can move forward.

Notes and acknowledgements
1 Please address all correspondence to Elspeth Graham, School of Geography and Sustainable Development, University of St Andrews, St Andrews, Fife, KY16 9AL, UK; or by email: efg@st-andrews.ac.uk
2 I am grateful for the helpful comments of the Editor, two anonymous reviewers, and other contributors to this volume on earlier drafts of this paper.