Cognition and policy change: the consistency of policy learning in the advocacy coalition framework

Abstract
Policy actors involved in decision-making processes interact and gradually accumulate evidence about policy problems and solutions. As a result, they update their policy beliefs and preferences over time. This process of policy learning is consistent if policy preferences are aligned with any adaptations in beliefs about policy outcomes – a crucial condition of learning-induced policy changes. This article examines whether and when policy learning is consistent based on regression analyses conducted on data from a 2012 survey of 293 Belgian actors involved in the European liberalization policy process for the rail and electricity sectors. In line with the advocacy coalition framework, existing research has suggested that motivated modes of reasoning, such as selective exposure and biased assimilation, influence policy actors’ attitudes and behaviours. This study isolates the effect of biased assimilation on policy learning by demonstrating that when policy actors adapt their beliefs about policy outcomes, they do not necessarily align their policy preferences with those adaptations. Furthermore, biased assimilation is higher among politically curious actors, but their degree of commitment to the policy process does not appear to play a role. The theoretical and practical implications of these findings are discussed.

Keywords: advocacy coalition framework; biased assimilation; bounded rationality; motivated reasoning; policy learning


Introduction
Policy processes involve diverse types of policy actors, ranging from politicians and public officials to company and association managers. As a result of a varied set of interactions as well as the gradual accumulation of evidence regarding policy problems and solutions over time, those policy actors acquire, translate and disseminate new information and knowledge. In turn, they maintain, strengthen or revise their beliefs and preferences regarding policies. 'Policy learning' is a concept that captures this cognitive and social dynamic of belief updates (Dunlop & Radaelli, 2013; Heikkila & Gerlak, 2013).
One of the most often invoked reasons for scrutinizing policy learning is the role that it plays in policy change (e.g. McBeth, Shanahan, Arnell, & Hathaway, 2007). Even if there are doubts about the exact nature of this relation (e.g. Nohrstedt, 2005), it is recognized that human learning is a fundamental intermediate factor in change processes. Eliciting change requires actors to create or to address new information and new experiences, a process that results in the enduring acquisition or modification of cognitive constructs (Vandenbos, 2007). These alterations, in turn, transform actors' preferences, behavioural intentions and concrete behaviours (Fishbein & Ajzen, 2010). In addition to its direct influence on policy decisions, policy learning has other potential intermediate outcomes, such as developing shared understandings and mutual agreements or transforming relationships among parties to a conflict (Leach, Weible, Vince, Siddiki, & Calanni, 2014).
This article focuses on the consistency of policy learning. Public problems, policies and contexts are constantly changing. Through learning, policy actors can maintain, reinforce or revise their beliefs about the patterns and outcomes of policies. Policy learning is consistent when actors align their policy preferences with their adapted beliefs. This is an important condition of learning-induced policy changes. Hence, it is crucial to look at the consistency of policy learning.
To examine whether and when policy learning is consistent, this article relies on the Advocacy Coalition Framework (ACF). This approach is most often used to examine the role that coalitions of policy actors play in policy processes, but it also recognizes that policy change can depend on policy learning. In line with Festinger (1957), the ACF relies on a model of the individual characterized by an internally consistent system of beliefs and preferences regarding policies. According to this model, after updates in their policy beliefs (e.g. 'this policy has had more negative effects than expected'), policy actors should align their policy preferences with those updates (e.g. 'I am less favorable toward this policy than before'). At the same time, the ACF recognizes that the bounded rationality of policy actors imposes limits on their ability to process information (Sabatier & Weible, 2007; Simon, 1991). More specifically, recent policy research has demonstrated the effect of motivated modes of reasoning (Kunda, 1990; Kahan, 2013), such as selective exposure and biased assimilation, on policy actors' attitudes and behaviours. The present study is the first to isolate the role of biased assimilation in policy learning. As many policy approaches share with the ACF these types of behavioural assumptions for individual actors (Moyson, 2014; Zito & Schout, 2009), the findings of this study will speak to many researchers within and beyond the ACF community.
The test of the hypotheses is based on regression analyses of a survey conducted in 2012 among 293 Belgian policy actors who had been involved, during the last two decades, in the European liberalization policy process for two network industries: the rail and electricity sectors. Hence, this research is consistent with Sabatier's (1993) contention that policy processes should be considered 'over a decade or more' to capture the actual nature and effects of policy learning. This article follows a classical structure, which presents the theoretical expectations before the research design, the measures, the analysis and the results. Finally, the findings are discussed.

Policy learning and policy change in the ACF
The ACF (Jenkins-Smith, Nohrstedt, Weible, & Sabatier, 2014; Sabatier, 1987; Sabatier & Weible, 2007) is a social learning approach to the policy process (Zito & Schout, 2009). According to Heclo (1974), in one of the foundational formulations of social learning approaches,
politics finds its sources not only in power but also in uncertainty – men collectively wondering what to do … Governments not only 'power' … they also puzzle. Policy-making is a form of collective puzzlement on society's behalf; it entails both deciding and knowing … Much political interaction has constituted a process of social learning expressed through policy. (pp. 305, 306)
In the ACF, the policy process is conceptualized as a political struggle among (coalitions of) policy actors involved in a given policy subsystem. A policy subsystem is a set of 'actors from various public and private organizations who are actively concerned with a policy problem or issue such as air pollution control, and who regularly seek to influence public policy in that domain' (Sabatier & Jenkins-Smith, 1999, p. 119).
This study relies on two important cognitive constructs of policy actors: their beliefs about policy outcomes and their policy preferences. The ACF assumes that each policy actor holds a belief system composed of three strata. The first stratum contains 'deep core' beliefs, which are personal philosophical precepts that are very broad in scope (e.g. 'I believe that justice is an important value'). The second stratum is represented by 'policy core' beliefs that are precepts specific to one subsystem, such as the proper scope of governmental action or the identification of groups whose welfare is of greatest concern (e.g. poor people, junkies, employees vs. employers, etc.). At this level, actors also hold factual beliefs about the outcomes of policies (e.g. 'I believe that this policy option increases the degree of justice among population groups'). Those factual beliefs, in turn, determine these actors' policy core policy preferences (e.g. 'I believe that this policy option is better than others'). Policy core policy preferences (or 'policy preferences') are 'normative beliefs that project an image of how the policy subsystem ought to be, provide the vision that guides coalition strategic behaviour, and help unite allies and divide opponents' (Sabatier & Weible, 2007, p. 195). Studies looking at the interplay between factual policy beliefs and normative policy preferences are rare. At the third stratum, 'secondary' beliefs are more specific. They concern particular administrative rules, budgetary allocations, programme performance, etc. (e.g. 'I believe that this administrative decision facilitates the implementation of my preferred policy option').
One important objective of the ACF is to explain policy change, which is defined as 'fluctuations in the dominant belief systems (i.e. those incorporated into public policy)' (Sabatier, 1987, p. 682). The main objective of policy actors, the ACF assumes, is to transform their policy preferences into concrete policy decisions. Typically, policy actors maintain and defend their policy beliefs and preferences. They use their resources and coordinate their political activity within 'advocacy coalitions' to become 'dominant' and impose their understanding of policy problems and their preferred policy solutions on other coalitions.
This being said, policy change can also result from changes in policy actors' beliefs and preferences – a causal mechanism called 'policy learning'. The ACF defines policy learning as 'relatively enduring alterations of thought or behavioural intentions that result from experience and which are concerned with the attainment or revision of the precepts of the belief system of individuals or of collectivities' (Sabatier, 1993, p. 42). Beyond social interactions among policy actors and the accumulation of evidence on a policy issue, major 'shifts in the core attributes of the subsystem' or 'shocks' are typical causes of policy learning (e.g. a legal shock or a shock in the distribution of natural resources: Weible, Sabatier, & McQueen, 2009, p. 124). However, after three decades of research, the ACF shares with many other social learning approaches to the policy process a fair amount of scepticism regarding the actual role of policy learning in policy change (Weible et al., 2009).

The cognitive consistency of policy learning
This study looks at the consistency of policy learning. In line with the theory of reasoned action (Fishbein & Ajzen, 2010), the ACF considers the belief system of any human being to be composed of various cognitive propositions regarding concrete or abstract objects. These propositions are considered to be true or false and are key drivers of both behavioural intentions and behaviours. Beliefs are consistent if one belief logically follows the other. For example, 'I support this policy' is consistent with 'policies should be efficient' and 'this policy is efficient'. Policy learning, in turn, is consistent when policy actors revise their policy preferences to better align them with belief adaptations. For example, 'my opinion of this policy is more positive than before' is consistent with 'this policy change has had more positive outcomes than I initially expected'. In contrast, learning is inconsistent when policy actors maintain their preferences or modify them in the opposite direction. This study examines whether (hypothesis 1) and when (hypotheses 2.1 and 2.2) policy learning is consistent. The theoretical foundations of these hypotheses are presented in the remainder of this section.
The ACF model of the individual is based on two assumptions. According to the first assumption, 'there are strong grounds for assuming that most actors will have relatively complex and internally consistent belief systems in the policy area(s) of interest to them' (Sabatier, 1993, p. 30). With this assumption, the ACF recognizes Festinger's (1957) theory of cognitive dissonance. Festinger's (1957) basic assumption is that human beings are comfortable with cognitive consistency, whereas inconsistency provokes 'dissonance' or a state of arousal. Because dissonances indicate erroneous propositions in one's belief system, this state of arousal functions as a signal that the system should be revised to facilitate context-appropriate action (Harmon-Jones, Amodio, & Harmon-Jones, 2009). In decision-making processes, if policy actors believe that existing solutions are no longer appropriate, this theory suggests that they will revise their preferences in favour of alternative solutions (Gawronski & Strack, 2012). These cognitive efforts deployed by policy actors to adopt attitudes and behaviours that reduce dissonance serve as a core mechanism that confers concrete policy effects on policy learning. Most policy actors are experienced policy 'elites'. In addition, the opportunities to make such cognitive efforts are particularly numerous in long-term policy processes, such as those often scrutinized by ACF-based studies. This leads me to the null hypothesis of this research: in the long run, policy actors tend to align their policy preferences with the adaptations of their policy beliefs.
However, the second important assumption of the ACF is recognizing that individual rationality is 'limited rather than perfect' (Sabatier, 1993, p. 30). This assumption results from a 'behavioralist turn' (Zito & Schout, 2009) adopted by the ACF in parallel with many other approaches to the policy process (Dunlop & Radaelli, 2013) and borrowed from organizational research (Simon, 1991). Bounded rationality suggests that policy actors have a limited ability to revise their policy preferences consistent with their beliefs for two main reasons. First, the information available about policies can be of poor quality or low quantity. Second, the inherent ability of individuals to process this information is limited (Birkland, 2006; Moynihan, 2008).
Given the limits to their ability to process information, human beings must rely on heuristic-based modes of reasoning (Kahneman, 2011). Heuristics are cognitive rules that simplify information processing. For example, rather than re-assessing their entire belief system according to every new piece of information, there are strong scientific grounds to believe that human beings 'tend to conform assessments of information to some goal or end extrinsic to accuracy' (Kahan, 2013, p. 408). This tendency is called 'motivated reasoning' (Kunda, 1990) and pushes people to systematically prefer standpoint-consistent information to standpoint-inconsistent information.
Recent ACF research provides indications that, through their detrimental effect on the consistency of individuals' belief system, motivated modes of reasoning influence the attitudes and behaviour of actors and coalitions within policy subsystems. For example, Matti and Sandström (2011) show that while policy core beliefs are driving forces of coalition formation, the members of the same advocacy coalition do not always share similar deep core beliefs. To explain belief stability despite contradictory information, Pierce (2011, p. 427) relies on 'the assumption within the ACF that individuals filter new information based upon their belief systems, and that this information will be used in a biased manner to support these beliefs'. Henry (2011) also finds that patterns of agreement/disagreement are correlated with patterns of collaboration/noncollaboration. He speculates that this correlation could be attributed to the 'biased assimilation' of information and of evidence provided by policy actors with whom one disagrees. In doing so, motivated reasoning contributes to the polarization of policy coalitions (Anderson & Harbridge, 2014) and makes policy compromises more difficult to attain (Steyaert & Jiggins, 2007).
In fact, human beings' natural preference for standpoint-consistent information can operate at two levels in policy-making processes. At the first level, policy actors may be tempted to avoid or ignore standpoint-inconsistent information. Psychological research has demonstrated that such 'selective exposure' (Hart et al., 2009) to information is even stronger over the long run because of heightened commitment to beliefs and preferences (Jonas, Schulz-Hardt, Frey, & Thelen, 2001). Consistent with this idea, ACF research has demonstrated that policy actors involved in working with a policy issue for some time privilege discussion with like-minded colleagues who confirm what they already (believe to) know. Conversely, they engage in a 'dialogue of the deaf' with policy actors from adverse coalitions with whom they disagree (e.g. Mauersberger, 2016, pp. 202-205), especially because they see them as more 'evil' than they actually are (Fischer, Ingold, Sciarini, & Varone, 2016; Sabatier, Hunter, & McLaughlin, 1987). While this finding appears to be confirmed by the considerable stability of policy actors' beliefs over time (e.g. Jenkins-Smith & Sabatier, 1993; Moyson, 2016), recent evidence suggests that actors involved in policy processes are actually open to acquiring new knowledge (Leach et al., 2014). In fact, there is no reason to assume that policy actors are completely deaf towards the information they read in policy documents or hear from other policy actors, even when it comes from an adverse coalition (Montpetit, 2016; Montpetit & Lachapelle, 2015).
Motivated reasoning can operate at a second level: rather than ignoring standpoint-inconsistent information or evidence, human beings can also give more significance to those acquired beliefs that confirm their preexisting preferences. Conversely, they may find counter-arguments to decrease the significance of new information that questions preferences. As a result of such 'biased assimilation' (Munro & Ditto, 1997; Munro et al., 2002), I suggest that policy actors can adapt their policy beliefs to new information (e.g. 'I believe that new evidence demonstrates the environmental inefficiency of this policy') without aligning their policy preferences (e.g. 'I maintain my support for this policy because the economic efficiency of policies is more important than their environmental efficiency'). This suggestion is in line with existing research showing that people can be very prone to defend (policy) preferences despite contradictory evidence demonstrating their invalidity (e.g. Boysen, 2007; Corner, 2012; Lodge & Matus, 2014; Tetlock, 2005): new information is processed rather than ignored, but the learning process is biased and tends to confirm rather than to challenge preexisting preferences. This leads me to the first operational hypothesis of this study: there should be a negative relation between the amount of change in policy beliefs and the alignment of preferences with those beliefs (hypothesis 1). This effect of biased assimilation could explain why, despite policy actors' ability to 'learn' (adaptation of beliefs about policy outcomes), empirical evidence makes ACF researchers sceptical about the actual relation between policy learning and policy change (no alignment of the policy preferences that policy actors want to incorporate in concrete policies).
It is not only crucial to understand whether biased assimilation acts as an obstacle to the transformation of new evidence into new policy preferences and decisions; it is also important to understand when it happens. There is rich psychological research on the individual and situational factors fostering or limiting biased assimilation (e.g. Swami et al., 2012; for a review, see Lord & Taylor, 2009). The present study focuses on two factors specific to policy-making processes: political curiosity and policy commitment. First, policy actors have different levels of political curiosity, defined as a desire to learn or know about the principles and actors of government. In an experiment, Knobloch-Westerwick and Meng (2009) demonstrated that an interest in politics fosters participants' propensity to devote time to articles presenting views that are opposed to their preexisting political views instead of articles that confirm those views. Battaglio (2009) also speculated that this kind of curiosity increases citizens' willingness and ability to understand complex policies. These findings suggest that politically curious actors exposed to new evidence are more willing and able to overcome cognitive discrepancies related to policy issues (through adaptations of their policy preferences). At the same time, Kahneman's (2011) research on heuristics and biases and its application to decision-making processes (Leach et al., 2014) suggest that policy actors are less prone to admit they are wrong when they consider themselves to be knowledgeable or competent in a policy domain. All in all, these findings suggest, at the very least, that the alignment of policy preferences with belief change will be influenced by political curiosity. However, they are inconclusive as to whether such curiosity will have a positive or negative effect on the relation between belief change and the consistency of policy learning (hypothesis 2.1).
Second, psychological research suggests that public commitment increases resistance to persuasion. Public commitment occurs 'when a person's opinions or positions are made public or known to others' (Gopinath & Nyer, 2009). Existing studies have demonstrated that commitment locks individuals' opinions (e.g. Hollenbeck, Williams, & Klein, 1989) and that those opinions are even more resistant to change when the commitment was made publicly rather than privately (Cialdini & Trost, 1998). Consumer research has taken special interest in public commitment (e.g. Ahluwalia, Unnava, & Burnkrant, 2001; Gopinath & Nyer, 2009; Nyer & Dellande, 2010). The 'foot-in-the-door' tactic, in particular, relies on human beings' desire to be and to appear consistent over time (Cialdini, Trost, & Newsom, 1995) by gaining public commitment with a small request (e.g. signing a petition for a cause) in order to facilitate compliance with a larger one (e.g. providing financial support to this cause). The ACF suggests that most members of a policy subsystem are primarily involved in defending the consistency of their coalition's belief system (Sabatier & Weible, 2007). However, some policy actors are more central and involved than others: they have, as a result, a higher number of opportunities to develop, structure and reinforce their preferences for specific policy options. Further, they must express those preferences in public on many occasions, such as in reports or meetings. In other words, their public commitment to certain policy options is higher. In line with psychological and consumer research, I expect more resistance from the most 'committed' actors to aligning their policy preferences with new, standpoint-inconsistent information and lower consistency in policy learning. In other words, there should be a positive effect of policy commitment on the relation between belief change and the consistency of policy learning (hypothesis 2.2).
The research hypotheses are summarized in Table 1 and represented in Figure 1. In statistical terms, belief change is expected to have a direct, negative influence on the consistency of policy learning. In contrast, political curiosity and policy commitment are expected to 'moderate' the relation between belief change and learning consistency. The consistency of policy learning is an important condition of learning-induced policy change.
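One way to write the moderation structure described above is as a regression with interaction terms. The notation below is an illustrative assumption rather than the article's exact specification: $C_i$ stands for the consistency of policy learning of actor $i$, $\Delta B_i$ for the amount of belief change, $\mathit{Cur}_i$ for political curiosity, $\mathit{Com}_i$ for policy commitment, and $\varepsilon_i$ for the error term; any control variables are omitted.

$$
C_i = \beta_0 + \beta_1 \Delta B_i + \beta_2 \mathit{Cur}_i + \beta_3 \mathit{Com}_i + \beta_4 \left(\Delta B_i \times \mathit{Cur}_i\right) + \beta_5 \left(\Delta B_i \times \mathit{Com}_i\right) + \varepsilon_i
$$

Under this sketch, hypothesis 1 corresponds to $\beta_1 < 0$, hypothesis 2.1 to $\beta_4 \neq 0$ (with the sign left open), and hypothesis 2.2 to the interaction coefficient $\beta_5$.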

Research design
To examine the consistency of policy learning, a web survey was submitted to Belgian policy actors involved in policy changes related to the implementation of the European liberalization policy process for network industries within the rail and electricity sectors. Network industries 'are characterized by the delivery of products or services to final customers via a "network infrastructure" linking upstream supply with downstream customers' (European     , 1999). Network industries are typical of sectors such as telecommunications, energy, transport or postal services. Since the 1980s, many network industries have been subject to a liberalization policy process (Genoud, 2004). Gradually, network activities have been unbundled. Previously, a state-owned company (or 'incumbent') had a monopoly on the management and commercial exploitation of the network, but currently, a public 'infrastructure manager' is responsible for the maintenance and security of the infrastructure, and the incumbent competes with other private companies (or 'new entrants') for use of this infrastructure. In addition, various independent regulatory agencies have been created at the European and national levels.
This study focuses on two national subsystems of policy actors: the Belgian rail and electricity policy subsystems. In the railways, the European liberalization process began in 1991, with European directive 91/440/EEC. The implementation of this process in Belgium began with the Royal Decree of 5 February 1997 (see Dehousse & Gadisseur, 2002; Moyson & Aubin, 2011). A similar process of liberalization for the European electricity sector was initiated with Directive 96/92/EC. The implementation of this process in Belgium began with the Federal Law of 29 April 1999 (see Declercq, 2000; Declercq & Vincent, 2000a, 2000b; Glachant & Perez, 2011).
The web survey was administered via email between April and November 2012 to 1256 people holding top to middle positions within 51 public and private organizations involved in the liberalization process. Given their position, these people were regularly involved in the process of implementing the European liberalization policy: they form two policy subsystems. The identification of those policy actors was, first, based on a documentary analysis. Then, a snowballing (or 'chain referral') sampling method (Atkinson & Flint, 2001) was applied through a campaign of 33 preliminary semi-structured interviews. In the railways, 12 (75%) out of the 16 organizations participated in the survey, while in the electricity sector, there were 26 (74%) participating organizations out of the 35 that were contacted. Within the participating organizations, in the railways, 199 (35.53%) out of 560 solicited individual policy actors participated in the survey, while in the electricity sector, 214 (30.75%) out of 696 policy actors filled in the questionnaire, which is a fairly similar rate. The response rate of the survey overall was 32.88% (413 policy actors from 38 organizations). 1
1 Within each participating organization, I included in the survey all members from the highest to the lowest organizational level where, according to the interviewees, at least several actors could be identified as relevant respondents to my survey. I applied this 'hierarchical correction' (i.e. including all people at the lowest relevant hierarchical level) to compensate for the tendency of the snowball sampling procedure to over-represent 'well-connected' actors and to under-represent 'unconnected' actors (Atkinson & Flint, 2001). The following types of organizations were invited to participate in the survey within each sector: all competent public administrations, all competent regulatory agencies, the infrastructure manager, the incumbent, all new entrants, as well as the interest groups representing the workers (e.g. trade unions or associations of train drivers) and the different types of companies (e.g. association of public sector train companies or associations of green producers). The organizational and individual response rates were fairly similar for each type of organization. For more details about the liberalization survey, see Moyson (2014).
There are at least three reasons to think that the survey allows long-term policy learning to be examined in a valid way. First, most respondents had professional seniority. Indeed, an additional question of the survey demonstrates that 70.31% of the respondents had worked for more than 10 years in their sector; 12.29% between 5 and 10 years; 13.65% between 2 and 4 years; and only 3.75% had worked one year or less. Second, the implementation of the European liberalization policy is a long-term process that began long before the first Belgian-level policy decision was made (e.g. European-level consultations with Belgian actors, preparation for the implementation within each national industry, etc.). Since then, this process has progressively unfolded. Still today, there are very important decisions that are being made in each sector to implement the liberalization policy in Belgium (e.g. the introduction of competition to the national railway transport of passengers). This means that not only the most experienced policy actors but also the less experienced ones are able to compare periods before and periods after important policy changes related to the liberalization policy occurred. Third, the analyses were repeated for the 29.69% of respondents with less than 10 years of seniority. Those respondents, compared to their more experienced counterparts, reported alterations of their policy beliefs and preferences that are not significantly different. In addition, the regression analyses were repeated on this specific set of respondents, and they lead to similar results.
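The response rates and seniority shares reported above can be re-derived from the raw counts. The following is a minimal sketch for checking that arithmetic; the function name `response_rate` is hypothetical, and the counts and percentages are the ones given in the text.

```python
def response_rate(respondents: int, solicited: int) -> float:
    """Return the response rate as an unrounded percentage."""
    if solicited <= 0:
        raise ValueError("solicited must be positive")
    return 100 * respondents / solicited


# Individual-level counts as reported in the text.
rail = response_rate(199, 560)
electricity = response_rate(214, 696)
overall = response_rate(199 + 214, 560 + 696)

# Seniority shares (in percent) of the three groups with 10 years or less.
under_ten_years = round(12.29 + 13.65 + 3.75, 2)

for label, rate in [("rail", rail), ("electricity", electricity), ("overall", overall)]:
    print(f"{label}: {rate:.2f}%")
print(f"less than 10 years of seniority: {under_ten_years:.2f}%")
```

Note that 199/560 rounds to 35.54% at two decimals, whereas the text reports 35.53%, which is what truncation rather than rounding would give; the other figures match the text.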

Dependent variable: the consistency of policy learning
This study looks at the consistency of policy learning, i.e. whether policy actors align their policy preferences with changes in their beliefs about policy outcomes over time. To measure the consistency of policy learning using the liberalization survey, the respondents were asked (1) to report the evolution of their preferences towards the liberalization policy, as well as (2) to report the evolution of their beliefs about the outcomes of this policy for their industry. Finally, (3) computations were made to assess the alignment of their policy preferences with their beliefs about policy outcomes.
(1) Evolution of respondents' policy preferences – The evolution of respondents' policy preferences was measured with the 'simple gain scores' method (Allison, 1990). Given four Likert-type items ranging from 'Very unfavorable' [−2] to 'Very favorable' [+2], respondents were asked to report their preferences for the liberalization process at the beginning of this process (or when they became involved in the Belgian rail/electricity sector for the first time). Then, the respondents were invited to report their 2012 preferences using the same items. To get an idea of how the respondents' preferences evolved over time, the values for initial preferences were subtracted from the values for current preferences. This …
… (Kline, 2005) was performed with a maximum likelihood procedure. The starting values of the parameters were set to one, except for the covariance parameters, which were set to .5. This strategy is appropriate when working on standardized variables with positive covariances (Kolenikov, 2009). Factor scores were computed with the Bartlett method because this method provides unbiased scores (Hershberger, 2005). In general, good model fit is indicated by values of the root mean square error of approximation (RMSEA) lower than .06, values of the comparative fit index (CFI) higher than .90, values of the standardized root mean square residual (SRMR) lower than .08, as well as p-values of the chi square test higher than .05 (i.e. failure to reject the null hypothesis of good fit). Note, however, that RMSEA = .00 and CFI = 1.00 can indicate that χ² < df rather than a perfect fit.
To get an idea of how respondents' beliefs evolved over time, initial belief values were subtracted from current belief values. This provided a new list of items or 'gain scores' measuring change in the respondents' beliefs about the outcomes of the liberalization policy. Factor analyses were separately conducted on the list of four/five gain scores in the two sectors. The EFA suggested that all scores should be kept in each sector. The CFA validated this structure in the rail sector (χ² = 6.29, p = .04; RMSEA = .12; SRMR = .04; CFI = .97) and in the electricity sector (χ² = 8.25, p = .14; RMSEA = .07; SRMR = .04; CFI = .98). 3 The scores of the two factors were normalized to obtain one scale, common to the two sectors.

3 In the EFA, factors 1 and 2 had eigenvalues of 1.84 and .05 in the rail sector; they had eigenvalues of 2.02 and .12 in the electricity sector. After rotation, all items had loadings equal to or higher than .40.

4 In each sector, the two intermediate variables have Cronbach's alpha (α_c) coefficients equal to or higher than .71, except for the evolution of respondents' beliefs about policy outcomes in the railways (α_c = .62). Deleting change score 3 ('the unbundling of operations on, and management of, railway infrastructure') would slightly increase the α_c of this variable to .66. There are, however, two reasons to keep the four-item structure. First, α_c's are not weighted, whereas factor scores depend on the loading of each item that comprises the factor structure. In this research, the fit statistics of the CFA indicate a very good fit. This suggests that change score 3 may be kept. Second, the four-item structure is grounded in the literature on the European liberalization process of network industries, which suggests that this structure is more representative of this policy than shorter structures (Genoud, 2004; Geradin, 2006).

The consistency of policy learning was derived by subtracting the evolution of beliefs about policy outcomes (intermediate variable 2) from the evolution of policy preferences (intermediate variable 1). If the result of this subtraction equals 0, it means that the respondent perfectly aligned his policy preferences with the change (or stability) in his beliefs about policy outcomes. If the result of the subtraction is negative, then his preferences evolved less positively than his beliefs about policy outcomes; conversely, the result of the subtraction is positive if his policy preferences evolved more positively than his beliefs about policy outcomes. In other words, the more the result of the subtraction differs from 0, the more inconsistent policy learning is. However, this study does not specifically distinguish between positive and negative biases in learning. In addition, this analysis looks at consistency rather than at inconsistency. To obtain a measure of consistency, the negative of the absolute value of the result of the subtraction was computed: when this new value equals 0, policy learning is perfectly consistent; the more negative its value is, the less consistent policy learning is:

$$\text{Consistency of policy learning} = -\left|\,\text{evolution of policy preferences} - \text{evolution of beliefs about policy outcomes}\,\right|$$

This study relies on innovative measurement methods in order to empirically assess policy learning.

First, a simple gain scores method (Allison, 1990) is used to measure the evolution of policy actors' beliefs and preferences, which overcomes two possible types of systematic measurement error. On the one hand, respondents could be tempted to provide socially desirable answers, especially if they want to show that they are stable and reliable people or, on the contrary, that they are able to change their minds. On the other hand, as the survey contained professional questions submitted in a professional context, respondents could be tempted to provide professionally desirable answers. In particular, there are good reasons to suspect that respondents could be concerned about appearing more/less favourable to the liberalization process when they worked in an organization or among colleagues militating for/against this policy. Studies that directly measured policy learning are relatively scarce and relied, most often, on one set of items on preference change ('did you change your opinion on … ?'). Such an approach does not control for the types of measurement error mentioned above. In the simple gain scores method, in contrast, two sets of items (one about past beliefs/preferences and one about current beliefs/preferences) are used and compared by the researcher. On the one hand, this drastically decreases the ability of respondents to strategize around the social desirability of the reported change in their beliefs/preferences. On the other hand, the simple gain scores approach does not remove systematic error in the measurement of preferences themselves (professionally desirable answers). However, simple gain scores modelling protects regression results from the possible effects of such a measurement error: it provides unbiased results (Allison, 1990).
Second, this study addresses recollection issues. Indeed, it can be difficult to remember past preferences (Janson, 1990). However, a confident attitude towards a memory is a reasonable indicator of its accuracy (Roediger, 2012). In turn, conviction is a reliable indicator of attitude confidence/certainty (Holland, Verplanken, & van Knippenberg, 2003). Hence, respondents were also asked to report their degree of conviction about their policy preferences on a five-point Likert scale. The respondents who reported that they were 'completely unconvinced' [−2] or 'rather unconvinced' [−1] about their past or current preferences were removed from the sample (32 respondents were removed).
In addition, this study focuses on policy actors who have been involved in the European liberalization process for a long time. As this process has been a long-term policy change for network industries, there are good reasons to think that policy actors have reliable memories of their past preferences regarding this change. Indeed, research in cognitive psychology suggests that the importance of an event or process, as well as the number of opportunities to hear and discuss it, increases the accuracy of memories about past opinions of it (Kvavilashvili, Mirani, Schlagman, & Kornbrot, 2003;Neisser et al., 1996).

Independent variables
This study examines the effect of the amount of belief change on the consistency of policy learning. The second intermediate variable of the study, computed above, measures the evolution of respondents' beliefs about policy outcomes. To measure the amount of change in their beliefs about policy outcomes, I rely on the absolute value of this variable. Methodologically, it can appear intriguing to derive an independent variable from (part of) a dependent variable. However, computationally, the link between the amount of belief change and the consistency of policy learning is not at all automatic; it is not a translation. Hence, this manipulation does not raise any computational issues.
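The variables described so far can be illustrated with a minimal sketch. It assumes that a plain mean of item-level gain scores stands in for the normalized factor scores used in the study; the function names and the three respondents are hypothetical and serve only to show that the amount of belief change does not mechanically determine the consistency of policy learning.

```python
from statistics import mean


def gain_scores(initial, current):
    """Simple gain scores: current item value minus initial item value."""
    return [c - i for i, c in zip(initial, current)]


def evolution(initial, current):
    """Average gain score across items.

    Assumption: a plain mean replaces the normalized factor scores
    that the study actually relies on.
    """
    return mean(gain_scores(initial, current))


def consistency(pref_evolution, belief_evolution):
    """Negative absolute gap between preference and belief evolution.

    0 means perfectly consistent learning; more negative means less consistent.
    """
    return -abs(pref_evolution - belief_evolution)


def amount_of_belief_change(belief_evolution):
    """Absolute value of the belief evolution, ignoring its direction."""
    return abs(belief_evolution)


# Hypothetical respondents on the [-2, +2] Likert range:
# (initial preferences, current preferences, initial beliefs, current beliefs)
respondents = {
    "aligned": ([0, 0, 1, 1], [1, 1, 2, 2], [0, 0, 0, 0], [1, 1, 1, 1]),
    "beliefs_only": ([1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]),
    "prefs_only": ([1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1]),
}

# Map each respondent to (amount of belief change, consistency of learning).
results = {}
for name, (p0, p1, b0, b1) in respondents.items():
    pref_ev = evolution(p0, p1)
    belief_ev = evolution(b0, b1)
    results[name] = (amount_of_belief_change(belief_ev), consistency(pref_ev, belief_ev))
```

In this illustration, the 'aligned' and 'beliefs_only' respondents show the same amount of belief change but different levels of consistency, while the 'prefs_only' respondent is inconsistent despite no belief change at all.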
Political curiosity and policy commitment are also examined in this study. Following Battaglio and Legge (Battaglio, 2009;Battaglio & Legge, 2009), political curiosity was measured with the following five-point Likert scale item: 'I am interested in politics' . Policy commitment results from the addition of the two following five-point Likert scale items: 'The documents that I produce sometimes concern the politics and policies of the Belgian rail/electricity sector'; 'I sometimes discuss the politics and policies of the Belgian rail/ electricity sector with my colleagues' .
This research also accounts for four covariates: the gender (male = 0; female = 1), the age (from 'less than 20 years old' = 1; to 'more than 70 years old' = 12; by intervals of 5 years), the educational level (secondary education or less = 1; undergraduate = 2; graduate or more = 3) and the policy sector (or subsystem: rail sector = 0; electricity sector = 1) of the respondent. Women and younger people are expected to show higher compliance (e.g. to new policies: Petty & Wegener, 1998), which could explain differences in the consistency of policy learning (young women adapting their policy preferences towards the liberalization process more than older men). In general, educated people benefited from the access to more diverse perspectives in their scholastic life, which can explain the positive correlation between education and tolerance or its negative correlation with racism and authoritarianism (Radloff, 2007). Similarly, education could foster adaptation of policy beliefs/preferences and, in turn, influence the consistency of policy learning.

Analysis and results
The summary statistics in Table 2 show that policy actors' preferences have not evolved very much over time (intermediate variable 1). This is especially true in the rail sector, which has a mean close to 0. On average, with a mean of 1.46, policy actors' opinions regarding the liberalization policy have evolved more positively in the electricity sector. A possible explanation for this result is that the liberalization process has been deeper and, for this reason, has become more consensual in the electricity sector than in the rail sector, where the monopoly of the incumbent over the national transport of passengers was still applicable in 2012 (but discussed). 5 Furthermore, in the two sectors, the standard deviation suggests quite substantial inter-individual variation. Policy learning is less consistent in the electricity sector than in the rail sector. Further, the average of the final dependent variable is more negative in the electricity sector than in the rail sector: while policy preferences (intermediate variable 1) and beliefs about outcomes (intermediate variable 2) evolved in fairly similar ways in the rail sector, policy preferences evolved positively, whereas beliefs about outcomes evolved negatively in the electricity sector.
Turning to the independent variables, first, the amount of belief change is low in the two subsystems: on average, many policy actors maintained their beliefs about policy outcomes over time or revised them only slightly. Second, the respondents reported high levels of political curiosity and policy commitment in the two subsystems. Finally, the covariates have approximately the same means in each subsystem, except that policy actors are older in the rail sector than in the electricity sector. Gender was introduced as a numeric (dummy) variable in the regression analyses. Concretely, there are 19 female respondents in the rail sector and in the electricity sector.
5 The European Parliament and the European Council adopted the fourth railway package introducing competition to the national railway transport of passengers in December 2016 (Directive 2016/2370/EU).

Out of the 413 survey respondents, 32 were removed because they were not sufficiently convinced about their past policy preferences (see above), and 88 others were removed because they did not provide any answer to one or more of the questions used to construct the variables. The missing values of the covariates gender, age, and educational level were replaced by their mean (consistent with Allison, 2002). Hence, the final sample is composed of 293 respondents: 151 in the rail sector and 142 in the electricity sector. They come from 38 different organizations: 12 in the rail sector and 26 in the electricity sector. The data were analyzed using linear regression models (Fox, 2008) in Table 3. In these models, the distribution of studentized residuals is not perfectly normal (p-value of all Shapiro-Wilk tests <.001), and the residuals are somewhat heteroscedastic (p-value of all Cook-Weisberg tests <.001). Hence, robust standard errors were used. 6 Despite the correlation between political curiosity and policy commitment, in Table 4, the variance inflation factors are never higher than 1.50 nor higher than the model-dependent cut-off values (Craney & Surles, 2002).
Model 1 looks at the effect of covariates. There is no effect of gender, age or educational level on learning consistency, but as mentioned above, policy learning is less consistent in the electricity sector than in the rail sector at the aggregate level. This confirms the added value of this covariate in the analysis. Part of this effect can be explained by the higher average amount of belief change in the electricity sector (see Table 2), which is confirmed by the .11 decrease in the coefficient in Model 2. Beyond this effect of belief change, understanding the relation between the context (sector) and learning consistency would require further research.
6 The respondents of the survey come from 38 different organizations. Hence, robust clustered standard errors could also have been used (with clusters = organizations). However, there is no theoretical reason to think that the link between independent and dependent variables of the study could be different for each organization. To be sure, all regression analyses were repeated with clustered robust standard errors: they do not provide statistically different results.

In Models 2-4, the effect of the independent variables is examined. Model 2 shows that the amount of belief change has an important effect on the consistency of policy learning. The model is significant, and the R-squared jumps to .25. The coefficient of the amount of belief change is significantly negative, and its size is 'medium' (Cohen, 1988). 7 Compared to Model 1, the likelihood ratio test shows that introducing the amount of belief change is statistically useful to explain levels of learning consistency. Model 3 examines whether levels of political curiosity have a moderating effect on the negative relation between belief change and the consistency of policy learning. According to the results, this effect is negative and its size, though 'small' (Cohen, 1988), is significant. The BIC index highly penalizes the introduction of new variables relative to their contribution. This is probably why the value of this index slightly decreases following the introduction of two new variables in Model 3. However, the significance of the likelihood ratio test, as well as the .05 increase in the R-squared, statistically demonstrates that belief change tends to involve lower levels of learning consistency when policy actors are politically curious. In contrast, Model 4 suggests that policy commitment has no moderating effect on the relation between belief change and the consistency of policy learning.

7 According to Cohen (1988), this qualification of effect sizes is arbitrary. However, the risk inherent to such a rule of thumb is acceptable when considering 'that more is to be gained than lost by supplying a common conventional frame of reference which is recommended for use only when no better basis for estimating the effect size index is available' (p. 25).
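The logic of the moderation test in Model 3 can be sketched as an ordinary least squares regression with an interaction term. The example below is only an illustration under stated assumptions: the data are synthetic and noise-free, the coefficients are made up (they are not the estimates of Table 3), curiosity is placed on a hypothetical centred scale, and the covariates and robust standard errors used in the study are left out.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up coefficients used to generate the synthetic outcome.
true_b = {"intercept": -0.2, "change": -0.5, "curiosity": 0.1, "interaction": -0.3}

# Hypothetical grid of (amount of belief change, centred political curiosity).
data = [(ch, cu) for ch in (0.0, 0.5, 1.0, 2.0) for cu in (-1.0, 0.0, 1.0)]

# Design matrix: intercept, belief change, curiosity, belief change x curiosity.
rows = [[1.0, ch, cu, ch * cu] for ch, cu in data]
y = [
    true_b["intercept"]
    + true_b["change"] * ch
    + true_b["curiosity"] * cu
    + true_b["interaction"] * ch * cu
    for ch, cu in data
]

b0, b_change, b_curiosity, b_interaction = ols(rows, y)


def slope_of_change(curiosity):
    """Effect of belief change on consistency at a given level of curiosity."""
    return b_change + b_interaction * curiosity
```

A negative interaction coefficient means that the slope of belief change on consistency becomes steeper (more negative) as curiosity increases, which is the pattern that a negative moderating effect describes.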

Policy beliefs and preferences are stable over time
On average, policy actors do not revise their policy beliefs and preferences very much over time. Such stability is suggested by the summary statistics in Table 2 and illustrated by the kernel estimation of the frequency distribution of these two variables in Figure 2.
Recent evidence suggests that actors involved in policy processes are actually open to learning. For example, Leach et al. (2014) asked 121 participants in US partnerships focused on marine aquaculture 'whether they had acquired a better understanding' of (1) aquaculture science; (2) aquaculture economics or business; (3) aquaculture policy, law or regulation; and (4) other stakeholder perspectives. Based on the aggregated index of those four Likert-type questions, 87% of the respondents reported that they gained some new knowledge. Leach et al. (2014, p. 606) concluded that 'learning is more common than not'. Montpetit and Lachapelle's (2015) experimental results suggest that expert information on the efficacy of environmentally friendly soil decontamination techniques can convince policy actors that these techniques could be adopted. They concluded that policy actors 'can learn' (Montpetit & Lachapelle, 2015). My own findings do not contradict these results but do qualify them: policy actors do adapt their policy beliefs and preferences over time, but to a very modest extent. These findings also reveal the strong influence of methodological choices on the measurement of policy learning.

Policy learning is not consistent
According to Model 2, the policy actors who reported greater amounts of adaptation in their beliefs about policy outcomes also reported less consistent learning processes. This results from biased assimilation: Figure 3 illustrates that many policy actors adapt their beliefs about policy outcomes without aligning their policy preferences. This leads me to validate hypothesis 1 of this research: belief change has a negative effect on the consistency of policy learning because policy actors do not adapt their policy preferences according to this change. This being said, Figure 3 shows that learning inconsistency also results from policy actors' tendency to revise their policy preferences despite stability in their beliefs about policy outcomes. This result raises a question about the nature of the motivations that elicit the stability of policy preferences (despite adaptations in beliefs) or their fluctuation (despite the stability of policy beliefs). In this respect, there has been strong suspicion that policy actors' interests tend to influence their attitudes and behaviours (e.g. Hoberg, 1996; Nohrstedt, 2005). Similarly, there could be suspicion that interests overcome the effect of evidence and information on actors' belief systems. However, analyses on the basis of the same data set used in the present study have already been conducted to assess the effect of policy learning on the evolution of policy actors' preferences: these analyses have concluded that personal and organizational interests exert a 'real but limited influence' (Moyson, 2016, p. 19). This suggests that the resistance of policy actors' preferences to belief change (biased assimilation), or their adaptation despite belief stability, does not primarily result from the influence of interests. This also suggests that other factors of learning consistency should be examined, such as political curiosity and policy commitment.

Political curiosity exacerbates biased assimilation
Political curiosity refers to the desire of policy actors to learn or to know about the principles and actors of government, while policy commitment occurs when a policy actor's opinions or positions are made public or known to others. According to Model 3, political curiosity has a negative moderating effect on the relation between belief change and the consistency of policy learning, whereas according to Model 4, policy commitment has no such effect. In Figure 4 and Figure 5, the respondents of the present study were separated into two groups based on high/low political curiosity and high/low policy commitment according to the median of these two variables. Figure 4 illustrates that, when policy actors adapt their policy beliefs, the consistency of policy learning decreases more among the politically curious than among the others. Figure 5 illustrates that there is no such effect of policy commitment. In other words, hypothesis 2.1 is validated, whereas hypothesis 2.2 is not.
The moderating effect of political curiosity on the relation between belief change and learning consistency is in line with the psychological research on heuristics and biases (Kahneman, 2011). More specifically, it lends new empirical support to Leach et al.'s (2014) speculation that policy actors are reluctant to admit that they are wrong when they consider themselves to be knowledgeable and competent (here, on politics). In this manner, political curiosity exacerbates policy actors' propensity to maintain their preferences despite change in their beliefs, i.e. the form of motivated reasoning called biased assimilation.
In contrast, this study fails to demonstrate any link between policy commitment and the consistency of policy learning. Despite multiple opportunities to speak and write about the liberalization policy, policy actors do not seem more committed to their policy preferences after adaptations of their policy beliefs, which is not in line with the psychological research (Cialdini & Trost, 1998; Hollenbeck et al., 1989) or consumer research (Ahluwalia et al., 2001; Gopinath & Nyer, 2009; Nyer & Dellande, 2010) that supports hypothesis 2.2. This finding is also relatively counterintuitive with respect to the ACF, which suggests that committed and central policy elites have structured systems of beliefs that are particularly resistant to change.

Conclusion
The ACF is a theory that considers the role of policy learning in policy change processes. Policy learning is a cognitive and social dynamic in which new information and knowledge resulting from various experiences and interactions can elicit enduring alterations of policy actors' beliefs and preferences (Dunlop & Radaelli, 2013; Heikkila & Gerlak, 2013). The consistency of policy learning measures the alignment of policy actors' preferences towards a policy with the adaptations of their beliefs about the outcomes of this policy. Learning consistency is an important condition of learning-induced policy changes.
The consistency of policy learning was examined with a web survey filled in by 293 policy actors involved in the European liberalization process of the Belgian rail and electricity sectors. Many respondents reported little change in their beliefs about policy outcomes and in their policy preferences. The negative relation between the amount of belief change and the consistency of policy learning (hypothesis 1 validated) results from the lack of preference alignment after belief adaptations (biased assimilation) but also from preference change occurring despite belief stability. Belief change elicits less consistent policy learning among politically curious policy actors (hypothesis 2.1 validated). In contrast, the research failed to identify any effect of policy commitment, which occurs when a policy actor's opinions are made public, on the relation between belief change and the consistency of policy learning (hypothesis 2.2 not validated).
Previous ACF research has demonstrated the effect of motivated reasoning (Kunda, 1990;Kahan, 2013) on actors' and coalitions' behaviour within policy subsystems (e.g. Anderson & Harbridge, 2014;Henry, 2011;Matti & Sandström, 2011;Pierce, 2011;Steyaert & Jiggins, 2007). However, the specific role of motivated reasoning in policy learning has not been examined. Furthermore, the respective effects of selective exposure and biased assimilation have not been isolated. This study is the first to isolate the effect of biased assimilation on policy learning. In doing so, it has reconciled those studies noticing a strong tendency of policy actors to maintain their policy preferences despite standpoint-inconsistent information (e.g. Lodge & Matus, 2014;Pierce, 2011;Tetlock, 2005) and those that point to the ability of policy actors to adapt their policy beliefs over time (e.g. Leach et al., 2014;Montpetit & Lachapelle, 2015). The present study provides a new indication that policy actors' beliefs and preferences are highly stable over time -a stability that can be attributed to selective exposure mechanisms but also to learning processes confirming the validity of preexisting beliefs. This being said, even when policy actors adapt their policy beliefs, this study has shown that they do not revise their policy preferences to be consistent with their new beliefs -clear, empirical evidence of biased assimilation in policy learning. Because policy preferences are key drivers of coalition behaviour and policy change, this finding explains ACF researchers' scepticism about the concrete effects of policy learning, especially on policy change.
Theoretically speaking, this study has started from the assumption that subsystem members have internally consistent systems of belief to argue that after change in beliefs about policy outcomes, stability in policy preferences provides empirical evidence of biased assimilation. In fact, if this assumption is not true, stability in policy preferences can also indicate that belief adaptations allow policy actors to resolve preexisting inconsistencies. Future policy research on biased assimilation should examine the relation between the preexisting consistency of belief systems and the consistency of policy learning to assess the actual contribution of policy learning. Further research is also required about the factors fostering or impeding the consistency of policy learning. Previous research suggests that actors' interests only play a small role (Moyson, 2016), while the present study has pointed to the effect of political curiosity. Future research should look at social practices and institutional settings fostering policy actors' leaps of faith in their policy preferences (biased assimilation) or their willingness to adapt those preferences despite the stability of their policy beliefs.
Empirically speaking, the results have shown that the consistency of policy learning has been lower in the electricity sector than in the rail sector. This is surprising, considering that the implementation of the liberalization policy process is more advanced in the electricity sector, which supposedly offers more opportunities for policy actors to adapt their beliefs according to the actual effects of policy change and to align their policy preferences with those adaptations. Further research on the relation between the progress of policy implementation and policy actors' cognition is needed.
Methodologically speaking, this study has relied on self-reported, cross-sectional data. To measure policy learning, innovative methods have been implemented to overcome recollection issues and systematic measurement errors (simple gain scores). In future studies, this type of problem could be solved even more efficiently with a longitudinal design and/or vignette questions.
Practically speaking, this study confirms previous research's scepticism about policy actors' propensity to adapt their policy preferences given belief change: instead, they tend to either undermine standpoint-inconsistent information or to adapt their policy preferences despite belief stability. In addition, this tendency is higher among politically curious policy actors. If one admits that consistency is decisive in learning-induced policy changes, the findings of this study then call for the implementation of institutional settings and social practices fostering the malleability of policy preferences to belief adaptations. Fortunately, the belief system of central policy elites, who are the most committed to decision-making processes, is not necessarily more rigid than the belief system of peripheral policy actors. This suggests that they are not less able to adapt their beliefs and preferences to changing policy environments and demands.