Multistate modelling extended by behavioural rules: An application to migration

We propose to extend demographic multistate models by adding a behavioural element: behavioural rules explain intentions and thus transitions. Our framework is inspired by the Theory of Planned Behaviour. We exemplify our approach with a model of migration from Senegal to France. Model parameters are determined using empirical data where available. Parameters for which no empirical counterpart exists are determined by calibration. Age- and period-specific migration rates are used for model validation. Our approach adds to the toolkit of demographic projection by allowing for shocks and social influence, which alter behaviour in non-linear ways, while sticking to the general framework of multistate modelling. Our simulations suggest that higher income growth in Senegal leads to higher emigration rates in the medium term, while a decrease in fertility yields lower emigration rates.


Introduction
For policymakers and researchers alike, it is of paramount interest to be able to predict how people make demographically relevant decisions. In this paper, we present a novel approach to projection by incorporating decision-making and interaction between individuals in a demographic projection model. We use an individual-based model rather than a population-based model such as the popular cohort-component model. For an overview of approaches to forecasting migration, see Bijak (2011), as well as other recent contributions from Hatton and Williamson (2011), Azose and Raftery (2013), and Abel and Sander (2014). This individual-based micro perspective enables us to incorporate behavioural mechanisms and social processes that influence demographic behaviour and population change into our model. We can thus predict the effect of external shocks, such as policy changes, by making the causal mechanism through which shocks alter behaviour explicit. Comparing model predictions with empirical outcomes facilitates the subsequent drawing of conclusions on the plausibility of the assumed behavioural mechanism. In this way, the model can be improved gradually over time, by refining the parameters and functional forms of its main component, in this case the decision-making on international migration.
In the model, as in real life, demographically relevant decisions are embedded in the human life course. Choices are motivated by aspirations and preferences in different domains of life, such as work and family, and constrained by available resources (not only financial, but also cognitive and social resources, and time). We take into consideration that people do not have perfect foresight and usually lack the time and cognitive abilities to acquire the full and unbiased information necessary to make a rational choice. Preferences vary over the life course, as do both the availability of and need for resources, because of events or changing conditions (e.g., available social support). As a result, the choices people make vary with stages of life, and events that occur in one domain of life influence decisions in other domains of life.
We model the human life course, which is operationalized as a sequence of states and transitions between states. In this paper, five status variables are considered: marital status, family status (number of children), employment status, place of residence, and 'living' status (alive or dead). The sequence of states occupied at each age and the ages at which transitions occur depend on personal, social, economic, and political factors as well as on random factors. The sequences of states and transitions are thus outcomes of a stochastic process. The process used here, which is commonly considered to describe life histories, is the continuous-time Markov process and extensions of it. It is governed by transition rates, which determine both the waiting time to a transition and the state occupied after the transition (destination state). The continuous-time Markov process incorporates the theory of competing risks, which determines the destination state entered after leaving a given (origin) state. For an explicit treatment of the theory of competing risks in the context of multistate models, see Aalen et al. (2008) and Beyersmann et al. (2012).
Transition rates, all of which are estimated from actual data apart from the transition rates to migration, may depend on the systematic factors mentioned, and may vary by age and over time.
Our main data sources are censuses and surveys. To represent the various influences on transition rates, we consider transition rates as 'properties of individuals' (see, e.g., Keyfitz and Caswell 2005, p. 511). Population-level rates result from the combination of individual rates and the stocks and flows of individuals.
The innovation introduced in this paper is that behavioural mechanisms, which characterize agent-based demographic models, are used in multistate life course modelling. For an overview of other simulation models of migration, see Klabunde and Willekens (2016); other well-known agent-based models in demography include Billari et al. (2007), Aparicio Diaz et al. (2011), Fent et al. (2013), and Grow and Van Bavel (2015).
To keep the model manageable and close to the current state of the art in multistate life course modelling, all transitions are governed by rates except one: migration. We replace the migration rate with a migration decision process. The decision to migrate is endogenous: migration is embedded in the life course. The predisposition of an individual to migrate depends on their experiences in the other domains of life, such as employment and family. In addition, a person at risk of migration is embedded in a transnational social network.
The life course has become a recognized framework for the study of migration (e.g., Mulder 1993; Kley 2011; Wingens et al. 2011) and the importance of the social network has been established (see, e.g., Munshi 2003; Epstein and Gang 2006; Haug 2008; Giulietti et al. 2013; and Baizan and González-Ferrer 2014 for migrants from Senegal). We model the migration decision process as a multistage (and multistate) process. The transition rates between stages are determined by decision rules. The time a decision takes is the combined duration (waiting time) in each stage.
Individuals pursue happiness, which is assumed to depend on two factors: income (which differs between countries) and being close to family. For an overview of theoretical models of migration in economics, see, for example, Hagen-Zanker (2008) or Borjas (2014). What matters are not only the individual's preferences but also the preferences and opinions of others (social norms) and the individual's ability to mobilize the resources required to remove barriers. Some barriers (e.g., increased border control) and some opportunities are unforeseen but influence the outcome of the decision process. The impact of these unforeseen factors on migration often depends on the stage of the decision process in which they arise (Kley and Mulder 2010; Kley 2017). The explicit modelling of the decision process offers an opportunity to distinguish different types of uncertainties that enter the migration decision at different stages and affect its outcome.
Several decision theories and theories of action are available for use as a basis for the behavioural mechanism to replace the migration rate (Balke and Gilbert 2014; Klabunde and Willekens 2016). Different factors may influence the migration decision in different ways according to the decision stage. For example, immigration regulation may matter little when forming the first intention to migrate, but during travel preparation it can become a true obstacle. Furthermore, re-evaluating one's intention to perform an action in light of new life circumstances (e.g., becoming a parent) is very common for life-altering decisions such as migration. Therefore, the theory should be a process theory, in other words, a theory that distinguishes stages in the process leading to action. The theory should also allow for an explicit decision not to migrate but to stay, as argued by Coulter (2013). We use the Theory of Planned Behaviour (TPB) (Ajzen 1991; Fishbein and Ajzen 2010) because it is simple but still considerably more realistic than the utility maximization model. We adapt the TPB to turn it into a process theory.
Model parameters are estimated from empirical data on population, wealth, income, consumption, marriage, fertility, migration, and mortality in Senegal and France. Senegal is selected because it is one of the three African countries included in the survey on 'Migration between Africa and Europe' (MAFE) (Beauchemin 2015). The survey was conducted in 2008 in the Dakar region and was the main data source for parameterizing the model. Further data sources used are the Senegal Population Census 1988, the Demographic and Health Survey (DHS) of Senegal, and World Population Prospects (United Nations 2015). France is selected because it is one of the three European countries in which migrants from Senegal were interviewed in the MAFE survey.
The structure of the paper is as follows: in the next section, we describe our modelling approach. The model parametrization is then presented to show its application to migration from Senegal to France. In the 'Scenario-based projections' section we demonstrate how the model can be used for scenario analysis. Finally, we present our discussion and conclusion.

Our approach: multistate modelling enriched by behavioural mechanisms
The life course model
Our approach extends the demographic multistate model by adding behavioural rules that determine the probability of a transition. A full model description is beyond the scope of this paper; for it, we refer to Klabunde et al. (2015) and the corresponding 'openabm' project (see https://www.openabm.org/model/5146/version/6/view). In the current paper, we focus on the interactions between the demographic multistate model and the migration decision model, and on the role of the social network.
At each moment in time, individuals are at risk of experiencing multiple (exclusive) transitions such as death, marriage, childbirth, or migration. Such transitions are called competing risks. Since two transitions cannot occur at exactly the same time, the occurrence of one transition hinders the occurrence of another transition. Additionally, the occurrence of one transition changes the probability of other transitions at a later time.
Enriching the demographic multistate model with a behavioural element enables us to determine life course dynamics in a very specific and innovative manner: the relevant event (i.e., migration) is an outcome of a decision process, described by explicit behavioural rules, and the multistate model defines all circumstantial events (such as childbirth and marriage). After the decision process has been completed or interrupted, the multistate model takes over to describe the life course further.
Our decision model is inspired by the TPB, which states that intentions are the best predictors of behaviour. Intentions are determined by one's beliefs about: (a) the outcomes (benefits and costs) of the behaviour or attitude; (b) social norms; and (c) one's own ability to mobilize resources, take advantage of opportunities, and remove barriers (perceived and actual behavioural control). Background factors influence the beliefs that are formed. Figure 1 is a schematic presentation of the TPB applied to migration. The three rows in the middle show the individual beliefs (a), (b), and (c). There are several reasons for choosing the TPB: first, it is an established theory from social psychology (for a review of applications, see Armitage and Conner 2001) and is often applied to explain and predict demographic behaviour (e.g., Ajzen and Klobas 2013; Philipov et al. 2015), including migration (e.g., de Jong 2000). Second, the TPB offers a behavioural heuristic, which is apt for deliberate decisions that involve high levels of uncertainty, such as the migration decision. Third, it is possible to extend the TPB in such a way that attrition during the decision process can be modelled. Thus, we can account for interfering events, either in the form of competing risks (such as marriage or childbirth) or in the form of external events that change an individual's environment (such as a change in immigration law). Fourth, factors with a clearly defined interpretation can be included in the model as influencing attitudes, beliefs, norms, or perceived behavioural control.
Far-reaching decisions involving uncertainty take time: individuals need time to gather information about possible consequences of the decision, to evaluate positive and negative aspects, and to consult significant others. Very practical matters, such as applying for a visa or saving money, also take time. To capture this temporal aspect, we suggest describing decision-making as a continuous process. The basic assumption of our process model is that an individual makes a decision in three stages: first, they determine their attitude, social norms, and perceived behavioural control associated with a certain behaviour and form a behavioural intention. Whether an intention translates into an event is determined by the actual level of control over an action. Between the formation of an intention and the actual event (migration), a planning phase and a preparation phase occur, as shown in Figure 2. Individuals can leave the decision process at any stage and at any point in time.
Between first forming an intention and eventually migrating, the circumstances of a person's life can change drastically. This can happen either through the behaviour of other individuals (parents, spouse, employer) or because the individual's preferences or priorities change. For instance, the birth of a child or a job offer may change the desire to emigrate. Drastic external shocks such as an economic crisis can also change an individual's environment and impact on the decision to emigrate. Such shocks interfere during the period between an already formed intention and the actual behaviour and can cause an individual to give up an intention. In fact, while 14 per cent of the world's adults express a desire to emigrate, only 8 per cent of those who desire it are already planning to do so, and of those planning, only 39 per cent have started making preparations, which is only 0.5 per cent of the total (Esipova et al. 2011). The more hesitantly an individual approaches the execution of a plan, the more time passes since the first intention was formed and, with the passage of time, it becomes more likely that an interfering event occurs.
In summary, our approach is composed of two model components: a multistate model and a decision model, with the latter embedded in the former. First, we briefly describe the demographic multistate model. The decision model is presented next. In the multistate model, states are denoted by $s_j$. In the decision model, we use $\sigma_j$ to denote the states.
Multistate model. Consider an individual in state $s_j$ and the transition to state $s_k$. The rate of transition, which describes the propensity of an individual in $s_j$ to experience the transition to $s_k$, is assumed to depend on time $t$ (age) and the time $w_{s_j}$ already spent in $s_j$, and may also depend on background factors. The probability that the individual is still in state $s_j$ at $t + w_{s_j}$ depends on all competing risks (equation (1)). In that expression, $w_{s_j,s_k}$ is the waiting time in $s_j$ until moving on to the next state $s_k$, $\lambda_{s_j,s_k}$ are the corresponding transition rates, and $K$ is the number of all possible next destination states, except the current state (competing risks). Assuming constant transition rates over fixed time intervals (e.g., years) yields piecewise exponential waiting time distributions and eases the computation of further process implications. This representation implies stochastic independence of the waiting times corresponding to the distinct competing risks.
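The role of piecewise-constant rates can be illustrated with a minimal sketch. It assumes the standard competing-risks form in which the probability of remaining in the origin state is the exponential of minus the summed rates integrated over the time spent there; this is an illustration under that assumption, not a reproduction of equation (1). The function name, the one-year interval width, and all rate values are hypothetical.

```python
import math

def survival_probability(rates_by_year, duration):
    """Probability of still being in the origin state after `duration` years.

    rates_by_year: one dict per consecutive one-year interval, mapping each
    competing destination state to its (constant) transition rate per year.
    """
    cumulative_hazard = 0.0
    remaining = duration
    for rates in rates_by_year:
        if remaining <= 0:
            break
        exposure = min(1.0, remaining)  # time spent inside this interval
        cumulative_hazard += sum(rates.values()) * exposure
        remaining -= exposure
    if remaining > 1e-12:
        raise ValueError("duration exceeds the intervals supplied")
    return math.exp(-cumulative_hazard)

# Hypothetical rates for an individual exposed to three competing risks.
example_rates = [
    {"marriage": 0.10, "childbirth": 0.05, "death": 0.01},
    {"marriage": 0.12, "childbirth": 0.06, "death": 0.01},
]
p_stay = survival_probability(example_rates, 1.5)
```

Because the rates are constant within each interval, the integral reduces to a sum of rate times exposure, which is what makes the waiting times piecewise exponential.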
Decision model. In our application, migration is the relevant event whose decision-making process is described explicitly. At a certain age, each young individual forms an intention towards migration for the first time; this may be positive (inclination to migrate) or negative (inclination to stay). The age can be fixed in the model (e.g., at age 18) or can be a random variable with a given probability distribution. Following the argument in Billari (2000), we assume heterogeneous starting ages that follow a normal distribution. The intention of an individual to migrate, $I(t)$, hinges on their attitude $A(t)$, the social norms $SN(t)$, and the perceived behavioural control $PBC(t)$, each depending on time $t$. Equation (2) combines these three components using the weighting parameters $\alpha$, $\beta$, and $\gamma$, which are set through calibration, with $A(t), SN(t) > 0$, $PBC(t) < 0$, and $-\infty < I(t) < \infty$.
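As a minimal sketch of how the three components could combine, the following assumes a linear weighted sum of attitude, social norms, and perceived behavioural control. The linear form is an assumption made for illustration, and the function names and all numerical values are hypothetical rather than calibrated.

```python
def intention(attitude, social_norm, perceived_control, alpha, beta, gamma):
    """Weighted combination of the three TPB components (ASSUMED linear form)."""
    if attitude <= 0 or social_norm <= 0:
        raise ValueError("attitude and social norm are positive in the model")
    if perceived_control >= 0:
        raise ValueError("perceived behavioural control is negative in the model")
    return alpha * attitude + beta * social_norm + gamma * perceived_control

def enters_decision_process(intention_value):
    """A positive intention moves the individual to stage 1; otherwise they drop out."""
    return intention_value > 0

# Hypothetical component values and weights.
i_low_barriers = intention(2.0, 1.0, -1.0, alpha=0.5, beta=0.3, gamma=0.4)
i_high_barriers = intention(2.0, 1.0, -4.0, alpha=0.5, beta=0.3, gamma=0.4)
```

With these illustrative numbers, raising the barriers (a more negative perceived behavioural control) turns a positive intention into a negative one, so the second individual would leave the decision process.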
In the model, attitude towards migration is influenced by the probability of earning a higher income in the host country and by the number of family members that have already migrated. Social norms are determined by the number of other migrants someone knows. Perceived behavioural control is negative here because we only consider impeding, not enabling, factors, namely, migration cost and border control. The higher they are, the lower the perceived behavioural control an individual has over migration, and the lower the intention. A negative intention value causes an individual to drop out of the decision process and not to re-enter it. A positive value means that the individual forms an intention towards migration and moves to stage 1, the intention phase. Then the individual might move on to the planning phase (stage 2) and thereafter to the preparation phase before migration (stage 3). The transition rates $\lambda_{\sigma_i \sigma_h}(t)$ of passing at time $t$ from one stage $i$ to the next stage $h$ (i.e., from stage 1 to stage 2, from stage 2 to stage 3, and from stage 3 to migration) are defined in equation (3), in which $\rho$ is the baseline rate. When the intention is negative, an individual is assumed to leave the decision process. If the intention is positive, the waiting time function $S(w_{\sigma_i \sigma_h}, t)$ of moving from stage $i$ to stage $h$ is derived from these rates (equation (4)). After the waiting time in a stage has expired, the individual passes from one stage of the decision process to the next. Thus, our decision model defines the waiting time from intention formation to migration by a convolution of three waiting time distributions corresponding to stages 1-3 of decision-making. On expiry of the final stage (preparation; stage 3), the success of migration depends on two further constraints: (1) whether household capital exceeds the costs of migration, taking into consideration any children the migrant may have to take along; and (2) whether the individual manages to cross the border.
This last impediment hinges on the value $pb(t) > 0$, actual border control, which is assumed to be exogenous. The success probability $\pi(t)$ is defined in terms of $pb(t)$ and yields an average of only three out of ten persons who try to cross the border being successful. The MAFE contains a question about unsuccessful migration attempts. González-Ferrer et al. (2012) have evaluated that question and derived age-, period-, and country-specific migration success rates. They computed the rates by dividing the cumulated number of failed migration attempts (x) by the cumulated number of successful entries (y). Transforming the average success probability that we use in our model, 3/10 = y/(x + y), into this format yields an average rate of x/y = 7/3 ≈ 2.3, which matches the figures derived by González-Ferrer et al. Entering a new stage in the migration decision process and experiencing a new demographic event (such as marriage) are competing risks. In other words, once the demographic characteristics of an individual change, the value of the waiting time function in the respective stage of the decision process might also change. Therefore, a new intention value and a new waiting time are computed after every demographic event, as shown in Figure 3. Here, a married and childless individual in the planning stage experiences the transition to the preparation stage as the next event (the waiting time corresponding to the preparation stage is shorter than the waiting time to death or to having a child). Thereafter, the individual has a child, since the corresponding waiting time is now the shortest. As the demographics of the individual change with this transition (i.e., the individual is no longer childless), they are assigned a new intention value and new waiting times are computed for all competing risks (i.e., migrating and dying). In this example, this yields a shorter waiting time to death than before, shorter than the one to migration.
Consequently, as the next event the individual dies. For a more detailed description of the decision model of migration, see Klabunde et al. (2015) and Warnke et al. (2017). While the emigration decision process and consequently the propensity to emigrate are affected by transitions in other domains of life, transitions in these other domains are not affected by the migration decision stage. For instance, an individual who is planning to emigrate has the same rate of marriage or childbirth as an individual who is not. In principle, our approach can handle mutual dependencies, but they are omitted because of a lack of adequate data that would allow for plausible hypotheses on how cognitive processes in the migration domain could affect other behaviour, for example, in the fertility domain.
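The staged decision process and the final success check can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes that the stage transition rate is the baseline rate multiplied by a positive intention value (a hypothetical functional form), holds the intention fixed instead of recomputing it after each demographic event, and uses a constant success probability of 0.3 in place of the border-control function. All names are illustrative.

```python
import random

STAGES = ("intention", "planning", "preparation")

def time_to_attempt(intention_value, baseline_rate, rng):
    """Sum of three exponential stage waiting times, or None on drop-out."""
    if intention_value <= 0:
        return None  # negative intention: the individual leaves the process
    rate = baseline_rate * intention_value  # ASSUMED form of the stage rate
    return sum(rng.expovariate(rate) for _ in STAGES)

def attempt_succeeds(household_capital, migration_cost, success_probability, rng):
    """Both constraints must hold: affordability and a successful border crossing."""
    if household_capital <= migration_cost:
        return False
    return rng.random() < success_probability

def failed_per_successful(success_probability):
    """Convert a success probability y/(x+y) into the ratio x/y of failures to successes."""
    return (1.0 - success_probability) / success_probability

rng = random.Random(2017)
waiting_time = time_to_attempt(intention_value=0.9, baseline_rate=0.5, rng=rng)
ratio = failed_per_successful(0.3)  # equals 7/3, roughly 2.3
```

The last line reproduces the conversion used in the text: a success probability of 3/10 corresponds to 7 failed attempts per 3 successful entries.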

The simulation
Based on this transition model, individual life courses can be simulated. To simulate a population evolving over time, the population needs to be initialized and life courses need to be created for each individual. In addition, newborns are added to the population over time and individuals leave through death. For this purpose, the waiting time function (equation (1)) is decomposed into a set of independent waiting time functions, one for each possible destination state. These different waiting time functions and the waiting time function of the decision stage currently being processed (e.g., waiting time function for stage 1 or stage 3) are linked by the event that occurs first and censors all other events. This event might be a demographic event or the transition to the next stage of the decision process. To determine the next event, we compute random waiting times for all possible destination states and the next stage of the decision process. The shortest waiting time is the one to be realized and indicates the next event to happen. The waiting time in one state until moving to another state can be simulated by the inversion theorem (Rubinstein and Kroese 2008, pp. 51ff.). According to this theorem, a random waiting time w from the correct distribution results from inverting the waiting time distribution implied by the respective transition rates λ(t) at u, a standard uniformly distributed random number (for the sake of clarity, and without loss of generality, all indices referring to either states or stages are omitted from here onwards). The computation of the waiting time function (equation (4)) requires an evaluation of the intention function (equation (2)). Individuals with negative intention values never make a transition to the planning stage; they exit the decision process and thus stay in the first stage for an infinitely long waiting time. Let w* be the shortest of the waiting times simulated.
Then, the individual under consideration experiences the event corresponding to that waiting time at time t + w*. For each individual in the virtual population, this computation of the shortest waiting time, and (if necessary) re-evaluation of intentions, is repeated until either the individual leaves the virtual population by dying or the simulation stop time is reached. The migration decision model and the overall setting are implemented using NetLogo (Wilensky 1999) with the continuous-time extension by Sheppard and Railsback (2015). The demographic events are simulated using the R package MicSim (Zinn 2014). Connection between NetLogo and R is through a NetLogo extension (Thiele and Grimm 2010).
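The next-event logic described above can be sketched in a few lines. The sketch assumes a constant rate for each competing event over the period considered, so that inversion reduces to the exponential case; with age- and time-varying rates the inversion would be applied piecewise. Function names and rate values are hypothetical, and the sketch is independent of the NetLogo and MicSim implementation.

```python
import math
import random

def sample_waiting_time(rate, rng):
    """Inverse-transform draw of an exponential waiting time for a constant rate."""
    if rate <= 0:
        return math.inf  # the event can never occur, e.g. after a negative intention
    u = rng.random()  # standard uniform draw in [0, 1)
    return -math.log(1.0 - u) / rate

def next_event(rates, rng):
    """Draw one waiting time per competing event and keep the shortest."""
    waiting_times = {event: sample_waiting_time(rate, rng) for event, rate in rates.items()}
    event = min(waiting_times, key=waiting_times.get)
    return event, waiting_times[event]

rng = random.Random(7)
# Hypothetical yearly rates: three demographic events plus the next decision stage.
rates = {"marriage": 0.10, "childbirth": 0.08, "death": 0.01, "next_stage": 0.30}
event, w_star = next_event(rates, rng)
```

After the selected event is applied at time t + w*, the rates (and, if necessary, the intention value) would be updated and the draw repeated until death or the simulation stop time.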

Application: migration from Senegal to Europe
To illustrate the capability and potential of our model, it is applied to migration from the Dakar region in Senegal to France between 1982 and 2050. The parameters of the model, including most of those in the decision model, are estimated from empirical data. The main data source is the MAFE survey (Beauchemin 2012), conducted in 2008. This includes both a household and an individual biographic (retrospective) sample survey of Senegalese natives residing in the Dakar region and a sample of Senegalese migrants currently residing in France, Spain, or Italy (Beauchemin 2012). These data provide a unique opportunity to follow Senegalese migrants and non-migrants for an accurate estimation of transition rates for demographic events, as they include complete histories of births, unions, migrations, migration attempts, and employment for every respondent. Additional data sources include the Population Census of Senegal and the DHS.
In the model, individuals are connected with others in households, families, and social networks. Income is shared within households; a part of joint household income is consumed, and the remainder, if any, is saved and added to household capital. The migration cost is covered from household capital. Network members in the destination country provide information on wages in that country and are the source of social norms on migration.
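The household accounting rule described above can be written down as a minimal sketch. The function names and the numbers are hypothetical; only the rule itself (pooled income, consumption, and saving of any remainder into household capital) is taken from the text.

```python
def update_household_capital(capital, member_incomes, consumption):
    """Pool incomes, subtract consumption, and save the remainder, if any."""
    joint_income = sum(member_incomes)
    savings = max(0.0, joint_income - consumption)
    return capital + savings

def can_afford_migration(capital, migration_cost):
    """Migration is financed from household capital, which must exceed the cost."""
    return capital > migration_cost

# Hypothetical household with two earners.
capital_after = update_household_capital(100.0, [50.0, 30.0], consumption=60.0)
```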
The application of the model to the population of the Dakar region requires the determination of rates of marriage and marriage dissolution, parity-specific fertility rates, mortality rates, rates of labour market entry and exit, and decision rules that govern the transition between the stages in the emigration decision. The demographic data need to be age specific. In addition, network and marriage market rules have to be defined. Finally, income distributions as well as initial household wealth have to be determined. For the sake of simplicity and because of data limitations, we use data from France to represent the host country, as France is traditionally by far the most important destination country for Senegalese emigrants (Sakho et al. 2013). Because of the lack of comparable wage data in the MAFE data set (there are 36 per cent missing values in the biographic data set), we use empirical time series of gross domestic product (GDP, expressed as purchasing power parity (PPP) per capita), taken from the International Monetary Fund (http://www.imf.org/external/ns/cs.aspx?id=28), and consumption data from the World Development Indicators (http://databank.worldbank.org/data/reports.aspx?source=2&country=SEN&series=&period=#). For estimating initial wealth in Senegal in 1982, we rely on Davies et al. (2009, 2011). For the distributions of income and wealth we use log-normal distributions. Initial values of wealth and income are assigned randomly to individuals.
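Random assignment of initial income or wealth from a log-normal distribution can be sketched as follows. The target mean, the dispersion parameter, and the function names are hypothetical; the paper's estimated distribution parameters are not reproduced here.

```python
import math
import random

def lognormal_mu(mean, sigma):
    """Location parameter that gives a log-normal variable the requested mean."""
    return math.log(mean) - 0.5 * sigma ** 2

def draw_initial_values(n, mean, sigma, rng):
    """Draw n positive values (e.g. initial wealth) from a log-normal distribution."""
    mu = lognormal_mu(mean, sigma)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]

rng = random.Random(1982)
# Hypothetical mean and dispersion for a virtual population of 2,000 individuals.
initial_wealth = draw_initial_values(2000, mean=1000.0, sigma=0.5, rng=rng)
```

The log-normal form guarantees strictly positive values and a right-skewed distribution, which is the usual motivation for using it for income and wealth.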
In the corresponding 'openabm' project, we list all the data sources and the estimation methods used to obtain the model parameters and specifications of our application. In the current paper, we focus on how we determine the initial population, the rates of transition in the life course, the marriage market, and the social network.

Initial population and estimation of transition rates
We obtain an initial population of 2,000 individuals in the Dakar region in 1982 by drawing a random sample of the population from the 1988 Census in Senegal calibrated to population marginals of 1982. Selected individuals who are married are matched with a partner in the virtual population. The matching procedure is described in the next subsection. If they have children in the Census data, they have the same number of children in the virtual population.
Age- and period-specific mortality rates are taken from the World Population Prospects (United Nations 2015). These data comprise mortality probabilities for Senegal and France for all years from 1980 to 2050, grouped into five-year age intervals. Senegalese migrants living in France are assumed to have the same mortality as French citizens. From these data, mortality probabilities ($q_x$) for single ages ($x$) are computed using the Heligman-Pollard model (Pollard 1973; Kostaki 1991; Ibrahim 2008).
The DHS of Senegal is used to estimate age-specific marriage and fertility rates for people residing in Senegal. Rates for migrants are estimated using the MAFE household data set, to account for the fact that the transition rates of migrants in the destination country differ from the rates of the stayers. A two-step procedure is used to determine the rates. First, single-year-of-age and (decennial) cohort-specific occurrence-exposure rates are estimated. Because of the limited number of cases, the single-year-of-age transition rates for the migrants are assumed to be time constant. In the second step, these rates are used to fit two types of parametric models to derive smooth age profiles: the Hadwiger model is used to fit fertility rates and the Coale-McNeil model to fit marriage rates (Hadwiger 1940; Gilje 1969; Coale and McNeil 1972; Chandola et al. 1999). Age-specific rates are estimated for all years from 1982 to 2014. For the sake of simplicity, fertility and marriage rates for 2015-50 are assumed to be equal to the age-specific rates in 2014.
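The first step of the procedure, the occurrence-exposure rate, is the number of events at a given age divided by the person-years lived at risk at that age. A minimal sketch, with hypothetical counts and function name, and without the second smoothing step (Hadwiger or Coale-McNeil fit):

```python
def occurrence_exposure_rates(events, person_years):
    """Events per person-year at risk, by single year of age.

    events: dict mapping age to the number of observed transitions.
    person_years: dict mapping age to the exposure time at risk.
    Ages without exposure are skipped because no rate can be computed.
    """
    rates = {}
    for age, exposure in person_years.items():
        if exposure > 0:
            rates[age] = events.get(age, 0) / exposure
    return rates

# Hypothetical first-marriage counts and exposure for three ages.
events = {20: 5, 21: 8}
person_years = {20: 50.0, 21: 64.0, 22: 0.0}
raw_rates = occurrence_exposure_rates(events, person_years)
```

Raw rates of this kind are noisy at single ages when case numbers are small, which is why a parametric age profile is fitted to them afterwards.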

Marriage market
At the beginning of the simulation, all individuals are unlinked. To create households, couples have to be formed. For this purpose, an initial marriage market is established, similar to the one described in Zinn (2012). All individuals in the initial population who are married in the empirical sample enter this market and search for an appropriate spouse. To create matches, we use the empirical probability that two individuals are married, considering the age of the man and the age difference between husband and wife. The model is estimated using Senegalese couple data for 1986-2014 from the DHS. In the market, all individuals are queued according to their age. The youngest individual is the head of the queue and the oldest individual constitutes the tail. Starting with the head of the queue, each individual inspects all other opposite-sex members of the market and selects the one with the highest empirical probability of forming a couple with that individual. A Bernoulli trial with the success probability being the matching probability decides whether the match takes place. After the whole queue has been passed through, some individuals might still be unpaired. These individuals are now stepwise linked to those market members with the highest matching probability. By design, this marriage algorithm does not preserve the estimated matching probabilities with regard to the composition of the pool of available candidates. However, we seek to form couples whose joint characteristics resemble observed couple characteristics. For this purpose, our approach is suitable. For a detailed discussion on this topic and for a comparison of simulated vs. observed couple characteristics, see Zinn (2012).
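The queue-based matching of the initial market can be sketched as follows. The matching probability used here is a hypothetical stand-in that depends only on the age gap and peaks when the man is five years older; the paper's probability is estimated from DHS couple data and also uses the man's age. Class and function names are illustrative.

```python
import random
from dataclasses import dataclass
from typing import Optional

@dataclass
class Person:
    ident: int
    sex: str  # "m" or "f"
    age: float
    partner: Optional[int] = None

def match_probability(a, b):
    """HYPOTHETICAL stand-in for the empirical probability that a and b are married."""
    man, woman = (a, b) if a.sex == "m" else (b, a)
    age_gap = man.age - woman.age
    return max(0.0, 1.0 - abs(age_gap - 5.0) / 20.0)

def best_candidate(person, market):
    """Unpaired opposite-sex member with the highest matching probability."""
    candidates = [p for p in market if p.partner is None and p.sex != person.sex]
    if not candidates:
        return None
    return max(candidates, key=lambda c: match_probability(person, c))

def link(a, b):
    a.partner, b.partner = b.ident, a.ident

def run_initial_market(market, rng):
    queue = sorted(market, key=lambda p: p.age)  # youngest at the head
    # First pass: a Bernoulli trial decides whether the best match takes place.
    for person in queue:
        if person.partner is None:
            candidate = best_candidate(person, market)
            if candidate is not None and rng.random() < match_probability(person, candidate):
                link(person, candidate)
    # Second pass: remaining individuals are linked to their best available match.
    for person in queue:
        if person.partner is None:
            candidate = best_candidate(person, market)
            if candidate is not None:
                link(person, candidate)
```

The second pass is what makes the algorithm deviate from the estimated matching probabilities: everyone who must be married ends up paired, at the cost of some less likely couples.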
Once the initialization phase is complete and the run starts, the timing of marriage formation is determined by individual waiting times to marriage, which are derived from the age-specific marriage rates. An individual enters the market six months before a marriage event is scheduled, or immediately, if the time to the marriage event is shorter. Whether a man and woman form a couple depends on two factors (see Zinn 2012). First, in continuous-time simulations like ours, the probability that two individuals experience events separately at the same time is zero. Nonetheless, to enable individuals to marry at one specific point in time, each individual scheduled to marry is assigned a so-called marriage interval of one year. If the marriage intervals of two individuals overlap, they are considered as potential partners. If the couple gets married, the simulated marriage time is the mean of the two individually scheduled marriage times. Second, two potential partners have to be compatible, quantified by the probability that two individuals are married in the empirical data. A logit model is used to predict that probability, with the age difference of two potential spouses used as the explanatory variable. At the moment an individual enters the market, they choose the one with the highest matching probability among all opposite-sex members with overlapping marriage intervals. A Bernoulli trial decides on the success of a match. Newly formed couples leave the market. Individuals who are not able to find an appropriate partner in time (i.e., by six months after the scheduled marriage time) are matched according to the highest matching probabilities.
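The two conditions for couple formation during the run can be sketched as follows. The sketch assumes that the one-year marriage interval is centred on the scheduled marriage time (six months before and after), which is an interpretation of the text, and the logit coefficients are hypothetical rather than the estimated ones.

```python
import math

INTERVAL_LENGTH = 1.0  # years; assumed to be centred on the scheduled marriage time

def intervals_overlap(time_a, time_b, length=INTERVAL_LENGTH):
    """Two equally long centred intervals overlap if their centres are closer than one length."""
    return abs(time_a - time_b) < length

def joint_marriage_time(time_a, time_b):
    """Simulated marriage time: mean of the two individually scheduled times."""
    return 0.5 * (time_a + time_b)

def matching_probability(age_difference, intercept, slope):
    """Logit model with the spouses' age difference as the explanatory variable."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * age_difference)))

# Hypothetical scheduled marriage times (calendar years) and coefficients.
if intervals_overlap(2000.0, 2000.8):
    wedding = joint_marriage_time(2000.0, 2000.8)
    p_match = matching_probability(age_difference=5.0, intercept=-1.0, slope=0.2)
```

A Bernoulli trial with success probability `p_match` would then decide whether the two potential partners actually form a couple.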

The role of the social network: family and significant others
In our model, a married couple forms a household, while newborns join the household and remain in the household until they get married themselves. Families and households are crucial when calculating the attitude towards migration. A positive attitude towards migration is most prevalent when the prospect of higher earnings seems very attractive and when migration implies reuniting with family members who migrated previously.
One aspect of the attitude towards migration is the evaluation of an expected higher income in the host country, $y_{i,t}$ (see equation (2)). It is higher the poorer a person's household in Senegal is, in other words, the lower the household capital per household member, $c_{h,t}/A_{h,t}$, is. Here $c_{h,t}$ is the capital of the household h of the (adult) household member i at time t, $A_{h,t}$ is the number of individuals in the household, and j is a weighting parameter. We do not compare expected income (a flow concept) to household capital (a stock concept); rather, household capital measures the necessity of considering migration at all; it is a measure of poverty. The impact of a person's own income is considered when determining the probability that income will be increased through migration (see below). If the household capital is zero, the expected higher income in the host country is given the value of 1,000. Given that we use GDP (PPP per capita) as a proxy for the average wage, the evaluation of higher income in the host country is almost always positive with this parameterization. Thus, the evaluation of an increase in earnings tends to be larger when there are many children in the household. On the other hand, more children lead to lower household capital and thus an individual has lower perceived behavioural control over their ability to afford the migration cost. Which of these two effects prevails depends on the relative sizes of the parameters of attitude and perceived behavioural control, whose sizes are themselves determined by calibration (see next subsection). The attitude towards migration also depends on the prospect of family reunification through migration. In the model, the more family members (grandparents, parents, siblings, spouse, children) that have migrated, the stronger the attitude in favour of migration.
Individuals are connected through network links. At the beginning of an individual's life, they form links to their parents and older siblings (no matter where these relatives live) and to all other members of the household. In our implementation, households are arranged on a torus-shaped grid; space is interpreted to be purely social so that households that are spatially close are socially close (e.g., through shared ethnicity, religion, or origin). Individuals also form links to all other individuals on the eight patches surrounding them (Moore neighbourhood), independent of household membership. Those network links remain throughout life. Additionally, new links are formed when individuals move to a new household, such as through marriage. As long as migrants migrate alone, they remain part of their home country household. Only if other family members join them is a new household formed in the host country.
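The Moore neighbourhood on a torus-shaped grid can be computed with modular arithmetic. The sketch below is illustrative only; the function name and the coordinate convention are assumptions, not part of the model code.

```python
def moore_neighbours(x, y, width, height):
    """Coordinates of the eight patches around (x, y) on a torus-shaped grid."""
    return [
        ((x + dx) % width, (y + dy) % height)  # wrap around at the edges
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    ]
```

Because of the wrap-around, a household in a corner of the grid has the same number of neighbouring patches as one in the centre, so no household is socially peripheral by construction.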
In our model, network links have a threefold function. First, they serve to transmit information: when individuals compute the subjective probability of increasing their income through migration, which forms part of the attitude calculation, they proceed as follows. For each of the individual's network neighbours in the host country, the income after migration is compared with the individual's own income. The subjective probability of increasing one's income through migration is then the proportion of the individual's network neighbours in the host country who have a higher income than the individual's income. Thus, the lower a person's own income, the higher the probability of increasing expected income through migration. So, both the level of poverty, as measured by household capital, and the person's own income flow have an impact on the probability of migration.
Second, the subjective probability that border enforcement will hinder the individual's migration is defined to be the proportion of migration attempts of the individual's network neighbours that failed. Third, the network transmits social norms regarding migration: in our interpretation of the TPB, the 'social norms' component is defined as the proportion of network neighbours who are migrants.
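The three network-based quantities are simple proportions over an individual's network neighbours. The following sketch shows one way to compute them; the field names (`in_host`, `income`, `attempts`, `failed`) are hypothetical, and returning 0.0 when the relevant group is empty is an assumption, since the text does not specify this case.

```python
def prob_income_gain(own_income, neighbours):
    """Share of neighbours in the host country earning more than the individual."""
    abroad = [n for n in neighbours if n["in_host"]]
    if not abroad:
        return 0.0  # ASSUMPTION: no information means no perceived gain
    return sum(n["income"] > own_income for n in abroad) / len(abroad)


def prob_border_failure(neighbours):
    """Share of the neighbours' migration attempts that failed."""
    attempts = sum(n["attempts"] for n in neighbours)
    if attempts == 0:
        return 0.0  # ASSUMPTION: no observed attempts, no perceived risk
    return sum(n["failed"] for n in neighbours) / attempts


def migrant_share(neighbours):
    """Social norm component: share of neighbours who are migrants."""
    if not neighbours:
        return 0.0
    return sum(n["in_host"] for n in neighbours) / len(neighbours)
```

Note that the first function makes the link stated in the text explicit: the lower the individual's own income, the larger the share of neighbours abroad who earn more.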

Sensitivity analysis, calibration, and validation
Our model has six parameters that are free and can be used for calibration (see Table 1): two of these parameters (j and z) are part of the equation to compute individual attitude values (see Klabunde et al. 2015), three of the free parameters (α, β, and γ) are components of the intention function (equation (2)), and one parameter (ρ) is part of the transition rates function (equation (3)). All remaining quantities can be determined based on empirical data (e.g., the Senegal census or the MAFE data). To explore the impact of the six free parameters on the model output, we could try to run the model with all possible parameter configurations on a six-dimensional fine-meshed grid. However, bearing in mind that a model instance with a starting population of 2,000 individuals running over 80 years takes approximately eight hours to run on a standard desktop machine, such an approach would take much too long. A feasible alternative is the definition of a metamodel serving as a surrogate for the simulation model. Such a metamodel is denoted as an emulator (Sacks et al. 1989) and can be used for sensitivity analysis and calibration. The idea is to perform simulation runs for a feasible set of input values ('design points') and to specify ('train') the emulator by means of the corresponding input and output values. Then, for any parameter configuration not part of the original input ensemble, the emulator provides a probability distribution of the potential model outcome. A simplifying and reasonable assumption in this direction is that the model outcome follows a multivariate normal distribution resulting in a Gaussian process emulator (Kennedy and O'Hagan 2001; Oakley and O'Hagan 2002). To explore the sensitivity of the migration model to changes in the free parameters, we define a set of 729 design points (3^6), that is, three values for each of the free parameters (see Table 1).
Then, a simulation run for each of the 729 configurations is performed holding the remaining parameters fixed. To train the emulator, we regress the total number of simulated migrations on the six free parameters. Several interaction terms between the free parameters turn out to be insignificant, so we omit them in the final version. For estimation, we rely on the software of Hankin (2012). The estimated coefficients and 95 per cent confidence intervals are given in the last column of Table 1.
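The full factorial design of 3^6 = 729 points can be generated mechanically. The sketch below builds such a design and computes a crude main-effects summary (mean output per parameter level). This is a deliberately simpler technique than the Gaussian process emulator used in the paper, and both the level values and `toy_model` are hypothetical placeholders: the actual design values are those in Table 1, and each output would come from a full simulation run.

```python
from itertools import product
from statistics import mean

# Placeholder level values (three per free parameter); see Table 1 for the real ones.
LEVELS = {
    "alpha": (0.0002, 0.001, 0.002),
    "beta": (50, 500, 1000),
    "gamma": (0.0001, 0.0005, 0.001),
    "j": (0.05, 0.25, 0.5),
    "z": (1, 5, 10),
    "rho": (0.01, 0.05, 0.1),
}


def design_points(levels):
    """All combinations of the parameter levels (full factorial design)."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*(levels[n] for n in names))]


def main_effects(points, outputs, levels):
    """Mean model output at each level of each parameter."""
    return {
        name: [mean(o for p, o in zip(points, outputs) if p[name] == v) for v in values]
        for name, values in levels.items()
    }


def toy_model(p):
    """Hypothetical stand-in for one simulation run (number of migrations)."""
    return 100.0 + 5000.0 * p["alpha"] + 0.01 * p["beta"] - 20000.0 * p["gamma"] - 40.0 * p["j"]


points = design_points(LEVELS)
outputs = [toy_model(p) for p in points]
effects = main_effects(points, outputs, LEVELS)
```

In a full factorial design every level of one parameter is combined with the same set of values of all other parameters, which is why level means isolate the effect of a single parameter in this toy setting.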
We find that apart from ρ (the baseline hazard) and z (the weighting parameter for family reunification), all parameters have a significant impact on the number of simulated migrations. All in all, a 10 per cent increase in α or β, measured on the basis of its largest design point component (i.e., an increase of 0.0002 in α or an increase of 100 in β), leads to average increases of 13.66 or 15.00 per cent, respectively, in the expected number of simulated migrants. In contrast, a 10 per cent increase in γ or j (i.e., an increase of 0.0001 in γ or an increase of 0.05 in j) leads to average decreases of 29.44 or 29.20 per cent, respectively, in simulated migrants. Similarly, Beine et al. (2015) find that while networks increase migration, family reunification is not the main driver and only accounts for one-quarter of the overall effect. The weighting parameter of household capital in the evaluation of higher income (j) and the weighting parameter (γ) of the perceived behavioural control component in the intention equation show the strongest impacts. The importance of perceived behavioural control on intention to migrate is confirmed by van Dalen et al. (2005), who find that a lack of means and legal problems with emigration are among the most important reasons given for not intending to migrate. The effect of self-efficacy on [...]
For calibration, we search for a parameter combination that, if possible, minimizes both: (1) the mean squared error (MSE) of the simulated migration age profile and the migration age profile given in the MAFE data (Beauchemin 2015); and (2) the MSE of the simulated and observed (period-specific) proportions of female migrants among all migrants in 1982-89, 1990-98, and 1999-2006, taken from the French 2011 Census. One might expect that distinct minima exist for the two criteria.
However, in our case the objective function has one unique minimum for both kinds of MSE considered (see Figure 4), yielding an optimal parameter combination at α = 0.002, β = 50, γ = 0.0001, j = 0.05, z = 10, ρ = 0.05, as shown by the black dot in Figure 4. Figure 5 shows the results of the calibration. Given that we fix certain parameters, the free parameters are set to their optimal values. Of course, there is some degree of arbitrariness to the fact that we fix some parameters and not others. We make this choice based on whether or not we can identify an empirical correspondence for a parameter. Clearly, if we were to fix other parameters, then the optimal values of the remaining free parameters would most likely be different. In this sense, we face an identification problem. There are agent-based models whose main purpose is to identify the 'true' value of a real-world quantity or parameter that is immeasurable or uncertain, such as Alfarano et al. (2005) or Heard (2014); for an overview of statistical identification in agent-based modelling, especially using emulators, see also Hilton and Bijak (2017). We, on the other hand, do not claim to have identified the one 'true' value of the parameters, but just a meaningful model parametrization so that the model outcome resembles real-world phenomena.
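The two calibration criteria can be written down compactly. The sketch below scores a list of candidate parameter combinations on both MSE criteria and reports whether a single candidate minimizes both. The data layout (`params`, `age_profile`, `female_share`) is hypothetical, and the search over candidates here is a plain enumeration rather than the emulator-based search used in the paper.

```python
def mse(simulated, observed):
    """Mean squared error between two equally long sequences."""
    if len(simulated) != len(observed):
        raise ValueError("sequences must have the same length")
    return sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed)


def score(candidates, obs_age_profile, obs_female_share):
    """Return (params, mse_age_profile, mse_female_share) for each candidate."""
    return [
        (
            c["params"],
            mse(c["age_profile"], obs_age_profile),
            mse(c["female_share"], obs_female_share),
        )
        for c in candidates
    ]


def joint_minimum(scored):
    """Params minimizing both criteria, or None if the two minima differ."""
    best_age = min(scored, key=lambda s: s[1])
    best_sex = min(scored, key=lambda s: s[2])
    return best_age[0] if best_age[0] == best_sex[0] else None
```

If `joint_minimum` returned `None`, the two criteria would have to be traded off against each other, for example through a weighted sum; the text reports that this was not necessary here.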
The success of this exercise is checked by comparing the simulation output of the calibrated model with external data that have not been used so far. Concretely, we contrast the emigration rates (computed as occurrence-exposure rates) derived from the simulation with the emigration rates computed from the MAFE survey. The results are depicted in Figure 6. We find that the simulation model is capable of very closely mimicking the migration situation in Senegal's Dakar region in the years 1984-2008.
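An occurrence-exposure rate is the number of events divided by the person-years lived at risk. A minimal sketch of this computation by group (for example, age group or period) is given below; the record layout is an assumption made for illustration.

```python
from collections import defaultdict


def occurrence_exposure_rates(records):
    """records: iterable of (group, person_years_at_risk, migrated) tuples."""
    events = defaultdict(int)
    exposure = defaultdict(float)
    for group, person_years, migrated in records:
        events[group] += int(migrated)    # occurrences: migration events
        exposure[group] += person_years   # exposure: time spent at risk
    return {g: events[g] / exposure[g] for g in exposure if exposure[g] > 0}
```

The same function can be applied to simulated life histories and to survey episodes, so that both sets of rates are computed in an identical way before they are compared.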
Figure 6 Observed emigration rates (in dark grey with 95 per cent confidence intervals) from Senegal to France estimated from the MAFE survey contrasted with simulated emigration rates (light grey dots) averaged over ten simulation runs: (a) age-specific rates for the period 1984-2008 and (b) period-specific emigration rates for the age range 18-39

Scenario-based projections
Once the model is calibrated and validated, it is possible to run it under different future scenarios. We choose three different scenarios: two based on individual income growth in Senegal and one based on a different fertility rate. In the baseline scenario, real income is assumed to continue growing at the 2015 rate (Senegal: 2.4 per cent, France: 0.5 per cent) from 2015 to 2050. In scenarios A1 and A2, we assume that real income in Senegal starts to increase by 3 or 8 per cent per annum, respectively, while real income in France continues to grow at 0.5 per cent.
What we see is that in the baseline and A1 scenarios the migration rate (number of migrations over total number of persons at risk) starts to fall in 2015. In contrast, it stays relatively stable in the A2 scenario from 2015 to 2025. Then, it increases slightly until 2033 before falling. The result that an 8 per cent increase in income in the origin country yields a larger migration rate might seem counter-intuitive at first. The reason for this is that there are two effects of a higher income in Senegal, which counterbalance one another. One is that leaving becomes less attractive (see equation (6)) because a higher income in the host country is not valued as highly any more. On the other hand, more people can suddenly afford to migrate: perceived behavioural control as well as actual behavioural control is increased strongly. In our parameterization, both effects are roughly the same size, so that the overall effect of the income increase is minor. After all, even after 35 years of much larger income growth in Senegal than in France (8 per cent vs. 0.5 per cent), average predicted income in France in 2050 is, at 2,884 euros (PPP), still larger than in Senegal at 2,046 euros. Thus, migration continues to remain attractive for some, especially for the poorer Senegalese strata who can now afford it. Given that the level of inequality in Senegal (Gini coefficient of 0.4) is larger than in Southern Europe (Gini of 0.3), a considerable number of people will remain poor. The derivation of the Gini coefficient is given in the supplementary material to this paper available at the related 'openabm' project. This result is in line with Clemens (2014) who finds that increases in annual GDP per head of up to US$9,000 increase emigration; see also Faini and Venturini (1993), de Haas (2010), and Westmore (2014). The overall decline in relative migration frequencies (in all scenarios) is mainly driven by an increase in population size, itself driven by fertility constantly above replacement level; thus, our findings are not at odds with studies that predict an increase in absolute migration from sub-Saharan Africa, such as Hatton and Williamson (2011). In fact, immigration from Senegal to France as well as to the EU28 as a whole has been remarkably stable in absolute numbers since 2008 (Eurostat 2016), despite strong population growth; our results are consistent with this.
In our baseline scenario, we assume constant fertility for all cohorts born after 1990, in line with recent findings by Bongaarts (2016). In our fertility scenario B, we instead assume that the decreasing linear trend in fertility, which we observe until 2005, continues until the 2030 birth cohort and then stabilizes. This is in line with the projections by the United Nations (2015).
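The income figures quoted for 2050 can be put into perspective with a back-of-envelope compounding check. The sketch assumes constant annual compounding over the 35 years from 2015 to 2050; the implied 2015 income ratio is derived here for illustration and is not a number reported in the paper.

```python
def growth_factor(rate, years):
    """Cumulative growth factor under constant annual compounding."""
    return (1.0 + rate) ** years


YEARS = 2050 - 2015  # projection horizon of 35 years

senegal_a2 = growth_factor(0.08, YEARS)   # scenario A2: 8 per cent per annum
france = growth_factor(0.005, YEARS)      # France: 0.5 per cent per annum

ratio_2050 = 2884 / 2046                  # France / Senegal, figures from the text
implied_ratio_2015 = ratio_2050 * senegal_a2 / france  # derived, not reported
```

Under these assumptions income in Senegal grows roughly fifteenfold and income in France by about one-fifth, so a remaining 2050 ratio of about 1.4 corresponds to a starting gap of more than an order of magnitude. This is why even very high growth in Senegal does not remove the income incentive to migrate within the simulation horizon.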
We find that a stronger decrease in fertility leads to a lower proportion of people migrating, until approximately 2045 (Figure 8). Thereafter, the proportion of people migrating remains stable and thus surpasses that of the constant fertility scenario. This pattern results from there being fewer people per household, so household capital per household member is larger, making migration less attractive. This effect is counteracted by higher perceived and actual behavioural control because there is less competition, for example, with siblings, for household capital. This leads to more individuals migrating after planning and preparation have been completed. To explain the negative overall effect on migration, one should remember that the peak of migration happens at around age 25 in the sample. Thus, a higher mean age at first birth than in the baseline scenario (around age 23), as assumed in the decreasing fertility scenario, is associated with stronger competition between migrating and having children, leading to lower migration. After 2045, the cohorts born from 2031 onwards become fertile and their fertility rate is assumed not to decline further. Thus, starting from this time the proportion of migrants stays stable, in contrast to the baseline scenario. In the literature, high fertility and high population growth are usually associated with an increase in migration, both theoretically and empirically, mostly because of stronger competition for resources and jobs (Ravenstein 1885; Hugo 2011; Reher 2011) and because sustained high fertility implies a high proportion of young people more prone to migration (Mayda 2010), so our result is in line with the literature.

Summary and conclusion
We use a multistate model to simulate individual life histories, considering five domains of life represented by status variables: marital status, family status (number of children), employment status, place of residence (Senegal, France), and living status (alive or dead). Transitions are governed by transition rates, except for migration, which is the outcome of a multistage decision process. The decision model is inspired by the TPB. External shocks, such as policy change, affect the decision process and hence the likelihood of migration. Taking into account decision-making in the individual life course and the life course in a multilevel environment (household, community, country, and transnational relations) has great potential to improve the prediction of migration beyond existing approaches based on empirical regularities, time series models, and regression models.
We demonstrate an application to two income scenarios and one fertility scenario. We find that migration rates, at least during our simulation horizon until 2050, are highest when growth in Senegal is high. Higher fertility also implies more migration. Many different applications are possible within the model framework we introduce in this paper; for example, relating to a weakening of ties to other migrants over time, or to an introduction of polygamous marriages, or to changes in the labour market in either the home or host country. The effect of these external changes on migration behaviour is always through a well-defined behavioural channel. Verifying the results of model predictions with empirical data allows us to draw conclusions as to whether the suggested behavioural channel is reasonable or not. If predictions turn out to be at odds with empirical reality, the model can help to identify alternative causal mechanisms that could explain observed patterns.
For the approach to live up to its potential, we need to deal with three challenges. The first challenge is to determine which transitions in the life course can be generated by transition rates and which should be approached as outcomes of decision processes. Any event in the life course potentially affects other domains of life, hence modelling calls for the selection of key dependencies. The second challenge is to account for the different types of uncertainty, including the uncertainties in the model specification and input parameter values. Our viewpoint of empirically determining all measurable parameters and calibrating the model by choosing the non-measurable behavioural parameters, completed by a thorough sensitivity analysis, is certainly not the only option. It is advocated by Cirillo and Gallegati (2012) but criticized by Grazzini and Richiardi (2015), who argue in favour of estimating all parameters through simulation. Yet another viewpoint is that of estimating the parameters in a Bayesian way, making use of all available qualitative and quantitative information to go from distributions of input values and arrive at distributions of output values (Poole and Raftery 2000; Bijak et al. 2013). A third challenge is to obtain better individual data on attitudes, norms, resources, intentions, and behavioural outcomes, as well as data on relevant social networks and the support received. This calls for longitudinal surveys. There are a few promising examples of excellent surveys (see, e.g., the REPRO project, Philipov et al. 2015) which, we hope, could inspire further initiatives in gathering extremely useful data such as these.