Policy learning in Norwegian school reform: a social network analysis of the 2020 incremental reform

ABSTRACT This policy study examines how policymakers and policy experts in Norway made use of research and studies – produced in Norway, in the Nordic countries and outside the Nordic region – to explain the 2020 incremental school reform. In total, 2 White Papers, 12 Green Papers and 3438 texts cited in the White and Green Papers were used as data for the text-based social network analysis. The three major findings were the following: First, the policymakers and experts make excessive use of references (on average, 246 references per White or Green Paper). The publications they cite are highly specialized and issue-centred, with little overlap between the various papers. Second, the policy references for the 2020 reform were mainly domestic: approximately 70% of the referenced texts were published in Norway. Finally, the social network analysis enabled the authors to identify five texts that were influential and that bridged the curriculum and quality monitoring reform topics. The authors suggest that more attention should be paid to the analysis of incremental reforms such as the 2020 reform in Norway, and they identify a few of the blind spots that the more common focus on fundamental reforms tends to produce.

Researchers tend to be more interested in understanding why governments issue new policies and regulations, how they explain the need for revamping a system and whom they mobilize to carry out reforms than they are in examining incremental changes accompanied by comparatively little fanfare. There is good reason for this: dramatic changes attract notice. Nevertheless, the academic fascination with large-scale reform makes one wonder whether there is not also a great deal to learn from the small alterations that, while relatively minor in context, may tell us much about the policy process. A focus on the details leads to an important question for comparative policy studies: How can we advance our understanding of how policies are made by examining small changes, that is, the new policies that entail only minor adjustments and minimal revisions to a previous reform?
In order to address this question, we examine a school reform slated to take effect across all public schools in Norway in the year 2020. Designed as an incremental reform, the policy has been introduced in stages over a two-year period as a two-pronged effort targeting two distinct yet interrelated policy domains: curriculum and quality monitoring. The curriculum part, with a focus on teaching methods and learning content, was introduced in a 2015/2016 White Paper (WP) titled Subjects – In-depth Learning – Understanding. A Renewal of the Norwegian Knowledge Promotion Reform (Ministry of Education and Research, 2016). As the name indicates, the reform was a confirmation or 'renewal' of the major curriculum reform, the Knowledge Promotion Reform, which had been issued a decade earlier, in 2006.
The second WP, Eager to Learn – Early Intervention and Quality in Schools (Ministry of Education and Research, 2017), which dealt with monitoring the quality of teaching and learning in Norwegian public schools, was introduced one year later. This aspect of the reform was less popular because it dealt with a topic that had long been debated but never systematically addressed. In retrospect, it seems the Government initiated the more controversial move towards standards-based quality control only after the less contentious changes to curricula were already underway. In any event, the question of accountability was not unexpected; it revitalized earlier debates on the quality of the Norwegian education system that had surfaced periodically since the start of the millennium.
Nevertheless, or rather precisely because the two-pronged reform included only minor revisions to earlier policies, the 2020 reform calls for an analysis of how policymakers and experts make use of evidence to explain their adjustments. By mandate, the policy experts are supposed to evaluate past experiences, or experiences in other countries, and propose revisions to earlier reforms based on what they have learned from their reviews. Given that these revisions consolidated earlier debates on quality assessment, we should attend to how the architects of change relate and refer to each other, as well as to what bodies of knowledge they draw from when making recommendations for parliamentary decisions on launching incremental reforms.
Policy learning and lesson-drawing from past experiences and from elsewhere

In a much-cited publication from a quarter-century ago, Peter A. Hall offered a remarkable analysis of first-, second- and third-order changes (Hall, 1993). Hall's term 'policy learning' has since expanded beyond the field of comparative political science to encompass a fascinating interdisciplinary array of analytical work dealing with the actors, processes and effects of policy change. Hall frames policy change as social learning, that is, 'a deliberate attempt to adjust the goals or techniques of policy in response to past experience and new information. Learning is indicated when policy changes as the result of such a process' (Hall, 1993, p. 278). Incremental or first-order changes represent the most common type of policy learning: the instruments and goals of the policy are preserved, but the policy is pursued with greater vigour, efficiency and effectiveness. In second-order changes, the policy instruments are altered, but the policy goals are maintained. While first- and second-order changes involve a broad range of actors and organizations in the social learning process, third-order changes tend to be steered by a single individual who makes radical alterations comparable to a Kuhnian 'paradigm shift', in which all the elements of a system are reoriented around new assumptions and ideas. Known for his analyses of neoliberal thought in the 1980s and 1990s, Hall identified the reform of economic policy under British Prime Minister Margaret Thatcher as a third-order change, because the Keynesian mode of policymaking was completely revamped and replaced with a new way of thinking, monetarism. Likewise, third-order changes are often claimed to be necessitated by policy failure and, as a consequence, replace not only the instruments but also the goals of the previous practice.
Though Hall's concept of policy learning and his typology of reforms are useful, they neglect both the transnational and spatial dimensions of the policy process. Our analysis includes both because, along with reflections on past experiences in the national context, policy actors can also be affected by what has happened elsewhere and use these positive or negative references as an argument for national agenda setting or policy formulation. This interpretive framework derives from policy borrowing research, an area of research with which scholars in comparative education have long been enamoured (Steiner-Khamsi & Waldow, 2012).
Indeed, the proliferation of 'best practices', international standards and global education policies traveling at breathtaking pace around the globe has attracted an ever-increasing number of scholars to this field. Along with the reasons, processes and impacts of policy borrowing, the agencies of dissemination have also come under greater scrutiny. In recent years, numerous publications have addressed how the World Bank, the OECD, 1 Pearson and other international organizations transfer and disseminate their portfolios of 'best practices'. Loans and grants, standardized comparisons, rankings and exemplary case studies are just a few of the technologies through which influence is exerted in global education governance (Mundy, Green, Lingard, & Verger, 2016).

Case and context
This article examines an example of incremental school reform for primary and lower secondary education. A brief overview of contextual information, notably on past reform initiatives in Norway, is indispensable for understanding the research design and for interpreting the findings of the study. The following background information situates the 2020 reform against the backdrop of earlier reforms. It also helps to understand why national policy actors nowadays make great use of published reviews, reports and other knowledge products to modify existing or develop new reforms.
In Norway, where the Ministry of Education and Research initiates and steers national school reform processes, there have been three comprehensive school reforms over the course of the past three decades. They are listed below by the year in which they went, or will go, into effect:

• Reform of 1997, referred to as the Systemic School Reform
• Reform of 2006, known as the Knowledge Promotion Reform
• Reform of 2020, the Renewal and Improvement Reform

We labelled the 1997 policy change the Systemic School Reform because of the reform's primary objective of integrating all the different aspects and units of the educational system into a new organization and structure (Gundem & Sivesind, 1997; Smith & O'Day, 1990). In all regards, it was considered a comprehensive reform that attempted to increase coherence within the educational system. It did so by clearly defining national curriculum objectives and content, clarifying the role of after-school programmes, emphasizing the importance of family involvement and strengthening the partnership between school and civil society. It was a well-prepared reform initiative that paid attention to structural and cultural opportunities and challenges.

The preparation for the next major school reform, titled the Knowledge Promotion Reform, started to take shape in the first years of the new millennium. The national curriculum was completed in 2005 and formally implemented in the school year 2006/2007. The general curriculum section was adopted from an earlier version, written in 1993. However, the subject-specific sections, in particular the specification of subject-specific objectives, content and instructional time, were novel. The focus on basic skills and competency-based learning objectives, outlined in the introductory part and broken down for each subject, signalled the new focus on acquiring knowledge and competencies.
Not only what students should know, but also which competencies they have acquired became central for assessing students' learning outcomes and more broadly for determining the quality of education. As a result of this specific orientation, the authorities labelled the new policy the Knowledge Promotion Reform. The 2006 reform was considered fundamental because it replaced two earlier reforms of primary and lower secondary curricula (Ministry of Church Affairs, Education and Research, 1996) as well as upper secondary curricula (Ministry of Church Affairs, Education and Research, 1994).
Unsurprisingly, the Knowledge Promotion Reform was the most visible signpost of a new era in which measurable objectives, standardized tests and data-based planning became important policy tools in the educational system (Skedsmo, 2011). It is important to bear in mind that nearly 20 years earlier, the evaluation of the Norwegian education system carried out by the OECD in 1987 (OECD, 1988) had already endorsed the value of management by objectives and the necessity of data-informed policy. As Christensen points out, the preoccupation with measurable objectives and outcomes became a feature of New Public Management in the entire public sector of Norway (Christensen, 2005) and was thus not limited to the educational sector. As part of this strategy, state-funded research and evaluation opportunities were created to generate data and knowledge as a foundation for further planning and follow-up reform projects (Ministry of Education, Research, and Church Affairs, 1999). Furthermore, a large number of academics and scientific institutions, financed by the state, produced policy-relevant reports for the state bureaucracy in the hope that they would have an impact on national reform decisions.
It is also at this point in time that the first results from PISA, 2 TIMSS 3 and PIRLS 4 gained prominence in Norway. In the wake of these studies, a debate on the need for an assessment and quality evaluation system emerged. In a similar vein, the OECD (2002) review of Lifelong Learning in Norway suggested changing the national reform strategy from being supply-driven to becoming demand-driven, with the primary emphasis on outcomes (see Prøitz, 2015). The strategy of improving the quality of education by incorporating a feedback system based on student assessments was taken up, first in the Green Paper In the First Row. Increased Quality within a Basic Education System for Everyone (NOU 2003:16) and thereafter in the WP Culture for Learning (Ministry of Education, 2004). As explained above, the WP resulted in a new national curriculum, labelled the Curriculum for the Knowledge Promotion Reform (2006). Assessment projects and formative evaluation have since been redesigned to comply with political and public expectations about learning improvement within a lifelong perspective. Early intervention, which has constituted a political project of the national authorities for the past 10 years, is considered to be heavily dependent on this kind of assessment (Ministry of Education and Research, 2007).
A few years later, the OECD report on assessment was published (OECD, 2011). It recommended improving assessment and evaluation by specifying learning goals and quality criteria. Nowadays, the assessment system in Norway consists of both voluntary and mandatory tests, including a broad variety of instruments to assess the quality of learning in all corners of the education system. The national quality assessment system, administered by central and local authorities, has been put in place to monitor the outcomes. The results from national tests are made publicly available and periodically discussed in the media. The question of how to align the desirable outcomes with a national curriculum continues to constitute a challenge. The current reform, the Renewal and Improvement Reform (The Board of Education, 2017), aims at improving the content of the school subjects based on evidence, formative evaluation and the differentiation of learning (Ministry of Education and Research, 2016, 2017).
Several questions arise when we place the Renewal and Improvement Reform in its historical context and take into account that evidence-based educational policy analysis has been actively promoted and funded by the government since the beginning of the new millennium. Acknowledging that the government draws on expert panels to inform its policy decisions, this study examines the knowledge that policymakers (government officials) and panels of experts consider relevant when reviewing and discussing the current school reform. We consider the publications that they cite in WPs and Green Papers as indicative of their policy knowledge. In the broader context of evidence-based regulation, knowledge is used as evidence (see Maroy, 2012). Thus, we consider the references to knowledge, that is, the functional aspect of citations, relevant for the study of the policy process. Since both groups evaluate experiences and review relevant literature, we examine the publications they reference in their papers to empirically investigate the following prototypical research questions of policy borrowing research:

(i) Whose knowledge (national, regional or international) is used to justify the 2020 reform in Norway? What counts as evidence that change is necessary?
(ii) Do the references to Norwegian, Nordic (Denmark, Iceland, Finland and Sweden) or international publications represent specific policies? Can we speak of a typical 'domestic', 'Nordic' or 'global education policy'?
(iii) What kind of authorization is associated with the references? Are they supposed to prove a need for reform (agenda setting) or substantiate the proposed revisions with policy solutions (policy formulation)?

Research design and methodology
We have pursued the three research questions by analysing the citations and references made in published policy documents. This text-based network analysis enables us to examine the social structure of policy discourse and to interpret the various knowledge networks that policy actors build, based on proximity and distance, respectively. The selection of sample texts (source documents), the coding of text attributes and the types of analyses used to explore knowledge and policy networks are explained below.
As mentioned in the introductory section, the political authorization of the 2020 reform is based on two WPs in which the Ministry of Education and Research explains the incremental reform. To reiterate an important piece of contextual information, WP1 addresses curricular issues and reconfirms or 'renews' the earlier curriculum reform of 2006. It includes directions on how to revise the earlier curriculum reform. It also explicitly maintains that questions of quality assessment need to be addressed with greater urgency and that decisions on that issue will be published later. WP2 on quality monitoring, announced the previous year in WP1, deals with early intervention, the professionalization of teachers and principals, and how to advance the quality assurance system. Prior to issuing WPs, the Ministry of Education and Research solicits reviews and recommendations from government-sponsored expert panels, known as Royal Norwegian Commissions. The two WPs of the 2020 reform explicitly mention the reports (known as Green Papers) of the relevant expert panels. WP1 and WP2 identify in total 13 Green Papers as key to implementing the new policy: the Ministry identified seven relevant Green Papers for the Curriculum Renewal reform dimension and six for the Quality Monitoring dimension. However, one Green Paper, About belonging and a safe psycho-social school environment (DOC #92, NOU 2015:2; see Green Paper 7 in Appendix 1), is cited in both WPs. It proposes a series of measures to reduce bullying, harassment and discrimination in schools. As a result of this co-citation, we only had to enter 12 Green Papers into the database. A list of the 12 Green Papers cited by WP1, WP2 or both is provided in Appendix 1, along with a short summary of their content. Table 1 illustrates the relation between the WPs, the Green Papers and the references listed in the bibliography sections of both.
We entered a total of 3452 texts into the data set: 2 WPs, 12 Green Papers and 3438 references that are cited in the White and Green Papers. 6 It is important to point out that this study focuses on so-called 'official policy knowledge', that is, the White and Green Papers (labelled in our network analysis as source documents), as well as the references made in both types of official papers. We chose to consider the knowledge reflected in the Green Papers also as 'official' because the Royal Norwegian Commissions are government-appointed and government-funded.
In addition to a quantitative analysis (measuring the frequency of citations), we also coded a series of attributes for all documents to allow for better interpretation: (i) year of publication, (ii) publisher or institutional affiliation of the author/authoring organization and (iii) location of publication, author or organization. The code for location had three values: (1) Norwegian or domestic, (2) Nordic or regional and (3) international, that is, neither domestic nor regional. The disaggregation of the 'international' category allowed us to compute how often Norwegian policymakers and experts reference regional texts published in the neighbouring Nordic countries (Denmark, Iceland, Finland and Sweden), as opposed to texts produced in other parts of Europe and elsewhere.

The software programme UCINET 6.289 (Borgatti, Everett, & Freeman, 2002) was used to develop the database and generate descriptive statistics. The programme NetDraw 2.097 enabled us to visualize the relationships between the documents in the data set. All figures are based on a Multidimensional Scaling layout with node repulsion and equal edge length bias; this approach places two nodes (or documents) closer together the more similar they are. Each document in the data set was coded with a unique identification number. The initial data set consisted of two columns: the first contained the identification number of the source document and the second the identification number of the reference. Once data entry was complete, we checked for duplicates (where the same reference is coded under multiple identification numbers). We then transposed the initial data set into a balanced matrix, with equal numbers of rows and columns containing the identification numbers of all documents. A citation relationship (source document X cites reference Y) was coded 1, and no citation relationship was coded 0.
We used this matrix to analyse the network structure of policy learning for the 2020 reform.
Because documents (artefacts of knowledge) vary in their importance for the policy learning process, we calculated an 'in-degree' centrality measure. This measure is equal to the total number of incoming citations for a given document. If a given document is cited many times, it is considered important for making an argument. As a result, the distances between nodes and their direction (or location) in the network become interpretable.
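The matrix construction and centrality measure described above can be sketched in a few lines of Python. The edge list below is hypothetical, standing in for the (source document, cited reference) pairs entered into UCINET; the labels merely echo the study's document identifiers and are not real data.

```python
# Hypothetical edge list of (source document, cited reference) pairs.
# In the study, sources are the 2 White Papers and 12 Green Papers.
edges = [("WP1", "GP1"), ("WP1", "D2140"), ("WP2", "D2140"),
         ("GP3", "D2140"), ("GP3", "GP1"), ("GP4", "D3001")]

# Collect every document that appears, so the matrix is 'balanced':
# the same documents index both the rows and the columns.
docs = sorted({d for pair in edges for d in pair})
index = {doc: i for i, doc in enumerate(docs)}

# Square 0/1 citation matrix: 1 where the row document cites the column document.
n = len(docs)
matrix = [[0] * n for _ in range(n)]
for src, ref in edges:
    matrix[index[src]][index[ref]] = 1

# In-degree centrality: total incoming citations per document (column sums).
in_degree = {doc: sum(matrix[i][index[doc]] for i in range(n)) for doc in docs}
```

With these toy edges, the document "D2140" receives the highest in-degree (three incoming citations), which is the property the study uses to size the nodes in its network figures.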

Findings
The network analysis yields a host of interesting patterns that relate to the three research questions presented in the methodology section of this paper. We confine ourselves to a few major findings, presented below.
What counts as evidence in official policy knowledge?
The first research question deals with the type of knowledge that policymakers and expert panels use as evidence in their reviews, recommendations and decisions.
Three features are striking with regard to how the authors of the White and Green Papers establish credibility and expertise. First, every document draws heavily on evidence, that is, references a large body of studies and reports. Taken together, the 14 policy documents (2 WPs and 12 Green Papers) cite 3452 other texts to substantiate, juxtapose or illustrate their points. This extensive number of references (an average of 246 per source document) is at first unexpected but, upon reflection, confirms the political pressure surrounding evidence-based policy planning in different parts of the world (see Fenwick, Mangez, & Ozga, 2014; Grek, 2008; Ozga, 2009; Pizmony-Levy, 2017), Norway apparently included. It is also indicative of the larger shift from government to governance (by numbers). This is particularly discernible when we compare citation patterns over time.
The number of references used as evidence in the policy evaluations, recommendations and formulations of the Royal Norwegian Commissions has increased with each period of school reform. For example, the 1996 reform made only sparse use of secondary assessments and literature, many of which were either embedded in the text or listed as footnotes. Such citations evince a lack of concern for the authoritative status nowadays attributed to empirical studies and other analytical work. It is also noticeable that the papers associated with the Quality Monitoring policy dimension rely on 1973 references, compared with the 1091 texts cited in papers related to the Curriculum Renewal dimension of the reform. Further analyses would be needed to understand the excessive use of references or evidence in the policy domain related to quality monitoring, early intervention and learning outcome benchmarks in WP2. One explanation worth considering is that topics associated with quality monitoring are by nature more controversial and therefore in greater need of justification.
Second, only a small portion of the referenced texts are, in an academic sense, peer-reviewed publications, that is, exposed to 'a process that represents a useful and meaningful check on the veracity, validity and reliability of the research findings' (Wang & Bowers, 2016, p. 22).
On average, only 12% of the references to journal articles or books fit this criterion. This is not to suggest an absence of standards in the White and Green Papers; rather, it means that the government-sponsored commissions follow their own rules for assessing the quality of publications. Eighty-eight per cent of the studies, reviews, reports and other publications cited in the papers play by these internal rules for what constitutes validity. The policy experts of the Royal Norwegian Commissions are not alone in using non-peer-reviewed publications as sources. Wang and Bowers (2016), for example, found a similar pattern in the US educational administration research literature: the majority of citations (54.71%) constitute 'grey literature', alternative forms of publication that call into question the balance between openness to new ideas and rigorous external scrutiny of those ideas.
Third, the White and Green Papers tend to draw on highly specialized, issue-centred publications that directly relate to their objectives. Only 224 documents (6.5% of all references) are cited by more than one source. With the exception of two commissions for the Curriculum Renewal reform that consisted of the same panels (GP3 and GP4), the body of knowledge used by the various experts is highly specialized and therefore varies widely. This means that a social network analysis of the individuals serving on the Royal Norwegian Commissions would probably reveal a great diversity of loosely connected experts serving on different panels, drawing on disparate bodies of knowledge.

The spatial orientation of the two policy domains
We disaggregated the references made in the Papers into two sets, one of which deals with the policy dimension Curriculum Renewal and the other with Quality Monitoring. Next, we looked at the country of publication in order to understand the 'reference societies' (Schriewer & Martinez, 2004; see also Bendix, 1978; Crane, 1972) the expert panels used as inspiration for lesson-drawing. Since we did not analyse whether the text references were positive or negative, we do not imply any particular meaning for a reference. For example, a Green Paper may have referred negatively to a regional source such as Denmark, Finland, Iceland or Sweden to warn how the Norwegian system should not develop. The opposite (encouragement towards emulation and lesson-drawing) might also apply. In any case, the investigation of reference societies is a starting point for shedding light on the 'educational space' wherein government officials and their experts situate themselves (Nóvoa & Lawn, 2002). Figure 2 shows the distribution of domestic, international and regional publications for the two WPs and their corresponding Green Papers. Overall, regional references (marked in light grey) are minimal for both sets. An overwhelming majority of references are domestic texts: 73.7% of all the texts used by the Quality Monitoring papers were published in Norway. The proportion of domestic references is still high, at 68.0%, for the Curriculum Renewal reform dimension. This pattern is statistically significant (Chi Square = 12.57, DF = 2, p < .01).
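The reported test of association can, in principle, be reproduced with a standard chi-square test of independence on a 2 x 3 table of reference counts (policy dimension by location). The counts below are illustrative placeholders: the domestic totals follow the percentages reported in the text (68.0% of 1091 and 73.7% of 1973), but the regional/international split is invented, so the resulting statistic will not match the reported 12.57 exactly.

```python
# Hypothetical contingency table of reference counts.
# Columns: domestic, regional (Nordic), international.
observed = [[742, 60, 289],    # Curriculum Renewal (1091 references)
            [1454, 90, 429]]   # Quality Monitoring (1973 references)

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected,
# where expected counts assume independence of dimension and location.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand
        chi2 += (obs - expected) ** 2 / expected

# Degrees of freedom for an r x c table: (r - 1) * (c - 1) = 2.
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# The critical value for df = 2 at the .01 level is 9.21, so any chi2
# above that threshold is significant at p < .01, as the paper reports.
print(round(chi2, 2), dof)
```

In practice, scipy.stats.chi2_contingency performs the same computation and returns an exact p-value; the manual version above only shows the arithmetic behind the reported result.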
Strikingly, curriculum experts in Norway seem notably more interested in studies published outside Norway and outside the Nordic region (marked in white), in particular from the USA, France (especially the OECD in Paris) and the UK, than in what has been published locally. In comparative policy studies, quality monitoring reforms associated with testing and accountability are typically seen as the visible sign of a managerial reform movement that has gone global (see Verger, Novelli, & Altinyelken, 2012). Curriculum, meanwhile, is often seen as a national project that selectively borrows the global script, or sometimes only the global rhetoric, of competency-based curriculum reform but subsequently translates it extensively to suit the local context (Sivesind, Afsar, & Bachmann, 2016). Thus, the assumption is that curriculum specialists are less interested in global trends than are quality monitoring experts. What we find here, however, is the opposite: the curriculum specialists are more receptive to debates in other countries than are those experts who focus on quality monitoring.
There is a fascinating disagreement between system theorists, who argue that curriculum specialists draw on international experiences and the 'semantics of globalization' to justify national decisions, and neoinstitutionalists like Lerch, Bromley, Meyer, and Ramirez (2016), who contend that curricula around the world are becoming increasingly similar. This study also points to an unexpected finding for the quality monitoring/accountability reform, a policy domain that is often hijacked by the global accountability discourse. As we will discuss in the Conclusion section, the Quality Monitoring reform dimension includes a strong national and regional adaptation of the global accountability reforms. The Nordic variant of the global accountability reform movement seems always to include an equity aspect, in this case in the form of early intervention benefiting students from a low socio-economic background. Contrary to what one might expect, publications from Finland, the PISA league leader, are read far less than those originating in Denmark and Sweden. Of course, language barriers matter. Nevertheless, this is surprising, given that, starting in 2009, the Finnish Ministry of Education actively promoted the marketization and export of the Finnish education system (Finnish Acts of Parliament 1296/2013, 2013) and even had studies translated into English and other languages (see Seppänen, Rinne, Kauko, & Kosunen, forthcoming). Only three of the references in the Quality Monitoring policy domain were published in Finland (0.18%), as opposed to 11 papers (1.17%) cited by expert panels dealing with Curriculum Renewal.

Policy usage of references: agenda setting versus policy formulation
Finally, the last research question examines the knowledge network within each policy domain and between the two domains. As mentioned before, the WPs and the Green Papers tend to draw on a very specialized or exclusive body of knowledge or texts. After the references from GP7 (NOU [Norges Offentlige Utredninger; English: Official Norwegian Reports] 2015:2; see Appendix 1), the Green Paper shared by both WP1 and WP2, are excluded, the two policy domains have only 2% of their references in common (67 documents). Figure 2 represents the reference network coloured by policy domain (WP1: light grey; WP2: white). The references shared by both policy domains are coloured black and separated from the references of GP7 (coloured dark grey). The low presence of black nodes indicates the exclusiveness and specialization of policy knowledge, evidence or references.
To identify the most central texts in the 2020 reform, we focused on the most-cited publications. Figure 3 only includes references that received more than a single citation, that is, an in-degree measure greater than one. Our focus on the most-cited texts shows the social structure of the shared references, allowing us to examine their attributes in greater depth. The circles represent the source documents (2 WPs and 12 GPs), and the squares are the references. The size of a square represents 'in-degree centrality', indicating how often a text has been cited, with the largest square being the most cited. This measure should be read as an indication of the impact of an author or text, as assessed by the number of readers who have cited the text. In the absence of an in-depth qualitative analysis of the database, we need to acknowledge that these texts may be influential for a variety of reasons: for example, they may be used as evidence, justification or authorization for a new or controversial statement or argument, or they may serve as a foundation for the object of review, to name only a few possible reasons why some texts are cited more than others.
Based on the in-degree measures depicted in Figure 3, we identified the following five texts as the most cited by 'separate' sources of the two policy domains (threshold: in-degree measure equal to or greater than six).7 (In Figure 3, light grey relates to Curriculum Renewal, white to Quality Monitoring and dark grey to both reform dimensions; black marks co-citations, that is, texts cited in both reform dimensions or in both White Papers.) The identified texts can thus be interpreted as the bridge between the two policy domains, taking on a central position in both dimensions of the reform. These five most cited documents are described in more detail in Table 2.
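The in-degree filter and bridging threshold can likewise be sketched in plain Python. The edge list below uses hypothetical source and reference identifiers (only DOC #2140 and DOC #58 echo document numbers from the study; the edges themselves are invented for illustration). Deduplicating the edges first mirrors the counting rule used in the study, where repeated citations of the same text by one source count only once.

```python
from collections import Counter

# Illustrative citation edges: (source document, cited reference).
# IDs are hypothetical stand-ins for entries in the database.
edges = [
    ("WP1", "DOC2140"), ("WP2", "DOC2140"), ("GP1", "DOC2140"),
    ("GP2", "DOC2140"), ("GP8", "DOC2140"), ("GP9", "DOC2140"),
    ("WP1", "DOC58"), ("GP2", "DOC58"),
    ("WP2", "DOC99"),
]

# In-degree centrality: number of distinct sources citing each
# reference (set() collapses repeated citations by the same source).
in_degree = Counter(ref for _, ref in set(edges))

# Keep only references cited more than once (the Figure 3 filter) ...
cited_more_than_once = {r: d for r, d in in_degree.items() if d > 1}
# ... and flag 'bridging' candidates at the higher threshold of six.
bridges = [r for r, d in in_degree.items() if d >= 6]
print(cited_more_than_once, bridges)
```

On the toy data, only the stand-in for DOC #2140 clears the bridging threshold; applied to the full database, the same two-step filtering yields the five bridging texts described below.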
The text referenced most often is Document 2140 (Ministry of Education and Research, 2007), . . . And no one is left behind. Early Intervention for Lifelong Learning. It is cited by three expert panels associated with the Curriculum Renewal thread of the reform and by five panels or commissions that produced foundational texts for the Quality Monitoring dimension. It is important to bear in mind that the highest possible number of citations for each of the WPs is 7-8, meaning that these measures are relatively high. One of the five most influential texts listed in the table above is a Green Paper, the Official Norwegian Report (GP1; NOU [Norges Offentlige Utredninger, engl.: Official Norwegian Reports], 2003, p. 16), entitled In the First Row. Increased quality within a basic education system for everyone, prepared by the Committee for Quality in Primary and Secondary Education in Norway. Of all the Green Papers produced, the Green Paper of this particular Royal Norwegian Commission has apparently served as a bridge between the two policy domains, as indicated by its high number of citations. Produced by a commission charged with the curriculum dimension, GP1, the Official Norwegian Report, has been cited by four (out of six) other commissions in the curriculum domain and by two (out of six) commissions dealing with evaluation and quality monitoring issues.
The next question is why these five texts were so influential in the 2020 reform. DOC #2140, on early intervention for lifelong learning, created in 2006 by the Ministry of Education and Research, focuses on the urgent need to reduce social inequality. The study suggests learning from the experiences of educational systems which, according to OECD and IEA 12 reports, have been more successful in accomplishing this goal. These include problem-solving projects and measures such as the introduction of learning assessment in early childhood education, new national regulations stipulating specific qualification requirements to teach important subjects at certain grades, and the establishment of a major research programme on learning and teaching to produce knowledge about what works. (In the accompanying figure, Norwegian references are in white, international in black and regional in light grey.) The second text (DOC #58), Culture for Learning (Ministry of Education and Research, 2004), focuses on the quality and enhancement of learning from kindergarten on. This paper is the first among the official reports to recommend aligning national tests and curriculum revision to enhance skills and competences and improve learning outcomes.
The third text, (GP1) In the First Row. Increased quality within a basic education system for everyone (NOU [Norges Offentlige Utredninger, engl.: Official Norwegian Reports], 2003:16), draws on national research evaluations and OECD studies. It was written by the Committee for Quality in Primary and Secondary Education in Norway, charged with examining the content, quality and organization of lower and secondary education. This report breaks with earlier Norwegian policies by recommending national tests and a quality web-portal where results are published to increase transparency on the quality of schooling.
The Danish Clearing-house report by Nordenbo et al. (2008) (DOC #1967) is a technical paper written for the Norwegian Ministry of Education and Research, which uses the technique of systematic literature review to examine causal relationships between teacher competences and student learning. The report was written as a contribution to improving teacher effectiveness, as measured by student learning outcomes.
The fifth text, Hattie's Visible Learning (2009; DOC #1518), claims to be a synthesis of over 800 meta-analyses of scientific articles covering a range of topics. The book is widely referenced to argue that direct instruction improves learning more efficiently than project-based and other approaches.
Another network analysis worth comparing with our own covers the 2014 fundamental school reform in Denmark (Brøgger, Pizmony-Levy, Staunaes, & Steiner-Khamsi, Forthcoming). The Danish reform was inspired by an OECD country report produced 10 years earlier, which triggered an avalanche of publications expressing an urgent need for reform. The authors of the analysis labelled the 2004 report a 'crisis-generating' international text of the type that policy studies textbooks associate with agenda setting (Howlett & Ramesh, 2003). Hattie's book (2009) surfaced as another influential international text later in the Danish reform debate. The authors labelled this book 'solution-producing' because it offered a host of 'best practices' on how to fix an education system (Brøgger et al., Forthcoming).
On the continuum between crisis-generating texts and solution-producing texts, the five most influential texts in the 2020 Norwegian school reform generally fall into the latter category in that they provide solutions, prescribe 'best practices' and focus on international standards, lesson drawing, emulation or policy borrowing from other educational systems. 13

Conclusions
To locate the significance of our analysis within the larger framework of comparative policy studies, we will begin with our first research question: Whose knowledge counts as evidence? The policy expert panels commissioned to evaluate past experiences refer to a large number of specialized studies relevant to their mandate. Since different commissions are charged with separate policy dimensions and view their mission as a stocktaking enterprise in which all relevant studies are reviewed or at least mentioned in their report, we find little overlap among the Green Papers cited. Strikingly, only 12% of the citations refer to what we consider, in a strict academic sense, peer-reviewed literature. This means 88% of the knowledge used in the policy documents consists of 'grey literature' in the form of commissioned research reports, technical reports, literature reviews, trade books, newsletters and other publications that did not undergo rigorous review or external quality control. Though this is compensated for by internal quality-control mechanisms, the large volume of references raises the question of to what degree the content of these texts was actually synthesized, evaluated and understood, as opposed to simply being cited to bolster credibility. Since we counted each reference as a single citation even if the commission cited the same text several times, this is a question which must be pursued in greater detail by a qualitative follow-up study.
Our second research question addressed the issue of reference societies or the 'educational space' in which Norwegian policy experts situate themselves when discussing matters related to curriculum and assessment. We found that the policy references are primarily domestic; around 70% of the referenced literature was published in Norway. As for the outside literature, most of it derived from English-speaking countries (USA and UK) or from English-language publishers (OECD, Paris). One notable distinction is that the references used in Curriculum Renewal papers were far more international than those cited in the Quality Monitoring papers. This finding was unexpected, given that other studies show that curriculum reform is typically driven by national expertise, while quality monitoring reforms are most often the product of global accountability, saturated with international 'best practices' promoted, disseminated and funded by transnational regimes such as the OECD and the World Bank.
This surprising finding calls for interpretation and further study. It is, in fact, an invitation to learn more about the Nordic version of 'accountability'.
As evidenced by the prevalence of accountability reforms in two cases, the 2014 fundamental school reform in Denmark (Brøgger et al., Forthcoming) and the 2020 incremental curriculum reform in Norway presented in this article, these countries have not been spared from the global accountability reform discourse. In both cases, however, the two social network analyses revealed a particular pattern of translation into the respective country context, in which concepts of equity were attached to the global education policy of accountability, resulting in a pedagogical rather than a managerial approach. Thus a student-/teacher-centred, 'soft' accountability seems to prevail, emphasizing early intervention for students in the form of early diagnostics and formative evaluation, with teacher support by means of assistants and instructional leadership. In stark contrast, the managerial version of accountability implemented in certain other countries (Verger et al., 2012) expects school directors to hire and fire teachers, to use poor test results for public naming and shaming, and to keep teachers in a vulnerable position by hiring them on a contractual basis and denying them permanent employment or tenure.
Our third and final research question investigated the network structure of reference authority by identifying influential texts cited both by the Curriculum Reform and by the Quality Monitoring expert panels. We assumed that some authors and texts bridged the two topics, helping the Government to integrate them by creating a coherent argument for reform dimensions that are in principle separate, yet still related. We identified five such texts (two Norwegian, one Green Paper, one regional text and one international text) as influential and integrative. Our content analysis revealed a commonality between the five texts: all tend to report on 'best practices' in terms of global standards or provide evidence of success within a specific system. We therefore labelled these publications 'solution-producing' texts providing policy knowledge that, in the language of policy studies, contributes to policy formulation. Three of the five influential texts, published over the period 2003-2006, address the earlier reform of 2006. The cross-referencing of texts that were foundational for the 2006 Knowledge Promotion Reform signals continuity and reflects the gradual, step-by-step or incremental reform approach pursued by the Ministry of Education and Research and its expert panels (see Karseth & Sivesind, 2010, p. 106).
At the other end of the spectrum are 'crisis-generating' texts, often found in national stocktaking exercises such as OECD country reports. In European educational systems, these country reports serve as a quasi-external source of authority, because they are paid for and commissioned by the national government but written on behalf of the OECD, a transnational, external regime. The fact that the five most cited texts are more 'solution-producing' and 'retrospective' than 'crisis-generating' and future-oriented reconfirms that the 2020 reform was incremental. From a comparative policy study perspective, there was no need for the Government to generate a crisis through quasi-external sources of authority (such as the OECD) because it had not planned a fundamental reform that would have required substantial consensus and coalition building.
The study of the 2020 incremental reform tells us a great deal about the school reform process, notably the use of highly specialized, non-academic or non-peer-reviewed knowledge, and the numerous references which policymakers and experts use as evidence to justify their reviews, recommendations and decisions. This study attempts to contribute to the critical study of evidence, typically seen as the foundation for knowledge-based policy regulation. In concert with Kvernbekk (2011), we find it essential to examine in greater depth the question of what counts as evidence and how it functions. Previous literature has acknowledged the multiple types of evidence used in the policy process (Davies, Nutley, & Smith, 2000; Weiss, 1979). Evidence ranges from research findings and existing statistics to expert knowledge and secondary sources. Despite this broad definition of evidence, what constitutes 'good' evidence has been extensively debated.
Our review of the literature suggests that the definition of evidence (i) changes over time, (ii) is context specific and (iii) varies depending on the stage in the policy process. First, Hadorn and his colleagues (1996) point to hierarchies of evidence whereby some forms are perceived as more robust than others. For example, in some countries, randomized controlled trials nowadays seem to rank higher than expert opinions. If observed over a period of several decades, one would most likely find that some types of evidence come into fashion, whereas others fade out over time. Second, Hulme, Hulme, and Rauschenberger (2017) examine how the global script of evidence-based educational reform is locally adapted, recontextualized or selectively borrowed in three different policy environments of Great Britain. In all three cases, there is a commitment to learning from 'what works', but its translation into the policy contexts of Scotland, England and Wales differs greatly. For example, the choice of randomized controlled trials to identify 'best practices' is only found in the What Works Centres in England; the policy analysts in Scotland and Wales tend to use other tools for determining 'what works'. Finally, McDonnell and Weatherford (2013) expand this argument and claim that policy actors use different types of evidence at each stage of the policy process. During the problem definition and solution identification stage, non-research evidence such as anecdotes and metaphors is used, in addition to research-based evidence, to humanize the problem by appealing to policy actors' and the public's core values. In the policy design stage, evidence is less emotional and normative; it is more technical. The authors find that although the motivation for evidence-driven policy is to depoliticize ideological and controversial debates, certain policy contexts lead to the use of alternative forms of evidence.
For example, to obtain political support, evidence may be drawn from interest groups involved in the process. Furthermore, in order to seize the limited policy window, policy actors utilize non-peer-reviewed research and expert judgement in the absence of appropriate research. In the policy enactment stage, the evidence is similar to that used in the problem and solution definition stage; however, it is more targeted at individual legislators to build policy coalitions.
The typology of McDonnell and Weatherford (2013) greatly resonates with the findings of our study. We found a large number of references to technical reports, reviews, evaluations and non-academic publications, which reconfirms the assertion that the references were used not as evidence for creating problem awareness or setting a new reform agenda, but rather for consolidating or 'renewing' existing practices.
In addition, the study also helps advance the theory debate in education policy studies. As mentioned in the introductory section, most studies ignore incremental reforms that merely propose minor revisions to previous reforms (as seen in the Curriculum Renewal WP and its associated Green Papers) or follow up on previously controversial debates (as reflected in the Quality Monitoring WP and its associated Green Papers). Though the focus on large reforms is unsurprising, given the likelihood that such analyses will result in correspondingly large conclusions, the fact is that most school reforms are incremental. Sequencing between fundamental and incremental reforms is particularly relevant in educational systems where municipalities determine how they implement national policies and guidelines. Also, fundamental national reforms are much more difficult to administer in highly decentralized policy contexts like Norway. As a result, fundamental reforms (such as the 2006 curriculum reform) rely on subsequent smaller or incremental reforms to consolidate what was issued in the first place.
The predominant interest in understanding fundamental changes generates several blind spots in policy studies. For example, as a result of this narrow focus, international large-scale assessment, the OECD, the World Bank or other quasi-external sources of authority appear as influential policy tools or actors. Yet their influence may be exaggerated, given the absence of all of the above in the 2020 reform. As discussed in other publications and briefly sketched earlier in this article, 'externalization' or references to external sources are mainly found in policy contexts where there is a need for consensus and coalition building. The concept of externalization, borrowed from sociological systems theory (Luhmann, 1995), lends itself as a useful interpretive framework for explaining the receptiveness towards, or frequency of, regional and international references (see, for example, Sivesind et al., 2016; Steiner-Khamsi & Waldow, 2012).
This leaves us with an extensive agenda for further research in which social network analysis could be used to understand and theorize the policy process from a comparative perspective. This particular method of inquiry investigates relations between individual actors or institutions, a focus that only recently has drawn the attention of scholars in comparative policy studies, policy studies and globalization research (see Ball, Junemann, & Santori, 2017;Pizmony-Levy, 2016). In terms of an agenda for further research, this particular study would greatly benefit from a systematic content analysis of the 3452 texts in the database. Understanding the semantics of the policy networks is essential for determining the main arguments for or against a reform and for identifying the coalitions that were formed over time in support of, or in opposition to, a reform.
In terms of comparative policy studies, a comparison across time (across different school reform periods, including periods of fundamental change) as well as with other educational systems in the Nordic region is very much needed. It would provide important clues for a more comprehensive interpretation of the results. The latter would enable us to put the findings in perspective and to discuss how one and the same global education policy, such as the competency-based curriculum reform or the accountability reform, is interpreted and translated differently in countries of the Nordic region.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work was supported by the UTNAM Policy Transfer in Education project (Sivesind & Steiner-Khamsi), University of Oslo, ES578742/271314.

… kindergarten and schools. The link between the School and Christianity was at stake, and the Green Paper suggested strengthening the connection to human rights instead of to a particular religion.
(3) DOC #60, NOU 2014:7, Elevenes læring i fremtidens skole: Et kunnskapsgrunnlag (Students' learning in the School for the Future. A knowledge base). Entries 3 and 4 were produced by the same committee (DOC #60 + DOC #51, a 'twin'). This Green Paper is an interim report that makes references to research on learning to show the importance of in-depth learning and the tight relationship between students' learning and their social and emotional competences. The importance of a broad concept of competence is emphasized with reference to competence needs in the twenty-first century. The report also includes some comparisons with other countries.
(4) DOC #51, NOU 2015:8, Fremtidens skole: Fornyelse av fag og kompetanser (The School of the Future. Renewal of subjects and competences). This Green Paper is labelled the principal report, and it focuses on the necessity of renewing school subjects in order to meet future competence needs in working life and society. Students need to develop many different competences, and four areas are suggested.
Green Papers of the Assessment White Paper (White Paper 2, published in 2016/17; marked in the database as DOC #40):
(7) DOC #92, NOU 2015:2, Å høre til: Virkemidler for et trygt psykososialt skolemiljø (about belonging and a safe psycho-social school environment). This Green Paper proposes a series of measures to reduce the number of pupils subjected to violation, bullying, harassment and discrimination at school, and suggests changes to the regulations. Please note: this Green Paper is also identified in WP1 as a relevant GP (see number 7 above, listed under White Paper 1).
(8) DOC #54, NOU 2009:18, Rett til læring (about students' rights to learning). This Green Paper focuses on children, young people and adults with special needs and presents different measures to strengthen learning for these groups.
(9) DOC #53, NOU 2010:7, Mangfold og mestring - Flerspråklige barn, unge og vaksne i opplæringssystemet (about diversity). This Green Paper focuses on the education provided by kindergartens, schools and higher education for minority-language-speaking children, young people and adults.
(10) DOC #41, NOU 2011:14, Bedre integrering - Mål, strategier, tiltak (about integration). This Green Paper (to the Ministry of Children and Equality) focuses on challenges and opportunities in a multicultural Norway and proposes measures for inclusion and integration in working life, education and civil society.
(11) DOC #42, NOU 2012, Til barnas beste: Ny lovgivning for barnehagene (about the legal regulation of the kindergarten). This Green Paper examines the existing regulation and suggests additional quality-assurance requirements to ensure good and equal kindergartens.
(12) DOC #50, NOU 2016:14, Mer å hente - Bedre læring for elever med stort læringspotensiale (More to gain - Better learning for students with higher learning potential). This Green Paper focuses on high-achieving students and proposes measures to increase the number of students who can perform at higher and more advanced levels in primary and lower secondary education.