The (Non-)Use of Configurative Reviews in Education

ABSTRACT The push for evidence-based practice in education has led to a range of initiatives aimed at bridging the gap between research, policy and practice. Among these is the establishment of brokerage agencies with a mission to synthesise the findings of educational research. This development has been the subject of extensive controversy over the last decades. Critics emphasise that brokerage agencies in most fields prioritise experimental designs that measure the impact of interventions. However, the use of different methods for systematic reviews has increased over the last decade. In education, this development has included a growing interest in configurative reviews. Configurative approaches have been promoted as suitable for synthesising complex bodies of research and for pursuing questions that go beyond what works. This study explores the use of configurative reviews in two brokerage agencies that acknowledge the need to work with different kinds of reviews in education. However, the overall result shows that configurative reviews are rarely used. Less distinctive configurative elements can be identified in many reviews, but generally they operate within the frame of the conventional methodology and tend to be subordinated to an aggregative logic. These findings are discussed as threats to the relevance and quality of systematic reviewing in education.


Introduction
Generally, the increased focus on evidence-based policy and practice relates to several societal developments (Bhatti, Hansen, & Rieper, 2006; Bohlin & Sager, 2011; Organisation for Economic Co-operation and Development [OECD], 2007). One aspect is the increased production and availability of research, information and data in a digital and global world. Another aspect is a current strong belief in research evidence as the best foundation for policy development and professional decision-making within most fields. The increased emphasis on evidence in education is explained by a multitude of factors: a greater concern with students' achievement outcomes; the explosion of available information due to a greater emphasis on testing and assessment; more explicit and vocal dissatisfaction with education systems nationally and locally; and the increased access to research evidence via the internet and other technologies (Bohlin, 2010; Hansen & Rieper, 2009; Levinsson, 2013; OECD, 2007).
The call for evidence-based education has led to a range of initiatives aimed at bridging the gap between research, policy and practice (Biesta, 2007). Among these is the establishment of brokerage agencies with a mission to synthesise the findings of educational research. Numerous critics (e.g., Biesta, 2007; Bridges, Smeyers, & Smith, 2009; Keiner, 2004; Moos, 2006) stress that different traditions of inquiry, including qualitative research, are needed in education to provide answers not only to why something works, for whom and under what circumstances, but also, and perhaps most importantly, to address the question: "what should count as working?" Biesta (2007) calls for a broader approach in which technical questions can be addressed "in close connection with normative, educational, and political questions about what is educationally desirable" (p. 22).
The systematic review movement has been the subject of extensive controversy in a number of educational research journals since the late 1990s. 1 Unfortunately, the heated debate has tended to reinforce the dualism between quantitative and qualitative research paradigms within the educational field and also seems to have made educational researchers reluctant to deal with the idea of evidence-based practice (Bohlin, 2010; Pring, 2004a, 2004b). However, the variety and use of different methods for systematic reviews have increased over the last decade, not least when it comes to the synthesis of qualitative research (Barnett-Page & Thomas, 2009). In education, there are research communities that have a clear ambition to contribute to the development of the field, for example, the EPPI-Centre and the Danish Clearinghouse for Educational Research (DCU). In particular, the initiatives taken by the EPPI-Centre have resulted in a democratisation of the review process by emphasising stakeholder involvement (Rees & Oliver, 2012); the development of mixed-method approaches (Thomas et al., 2003), including methods for synthesising qualitative research (Oliver et al., 2005); and consequently, a growing interest in configurative reviews (Gough, Oliver, & Thomas, 2012a, 2012b). As a result, the literature on how to proceed with various types of research synthesis, including configurative reviews, has increased significantly in many fields during the last couple of years (e.g., Anderson et al., 2013; Petticrew, 2015; Snilstveit, Oliver, & Vojtkova, 2012).
Configurative reviews have been promoted as suitable for the heterogeneity of educational research, and as necessary for pursuing questions that address the complexities involved in teachers' work and that correspond to the broader aims of the educational system (Gough et al., 2012a; Levinsson, 2015; Snilstveit et al., 2012). Configurative reviews have also gained popularity among stakeholders in the systematic review movement as a tool to transform teaching into an evidence-based profession (e.g., Lillejord, Børte, Halvorsrud, Ruud, & Freyr, 2015). However, some of the early efforts made at the EPPI-Centre to develop alternative reviews appear, despite other intentions, to be subordinated to an aggregative logic. As Bohlin (2010) points out [translation by the authors]:
That the method, which has been developed for synthesising empirical research at the centre, also includes descriptive, non-experimental studies does not mean that the method has very much in common with meta-ethnography. On the contrary, most indications are that the data from the single studies that are included in the reviews, conducted by the EPPI-Centre, are extracted and aggregated according to principles which Noblit and Hare (1988) reject. (p. 175)
Moreover, according to recent studies by the authors, alternative review methods do not seem to be of the same interest among outsourcers and financers, who mainly seem to favour significant effect-sizes and studies about what works (Levinsson, 2015; Prøitz, 2015). These studies indicate that it is unclear whether configurative approaches are considered to be a tangible alternative within the field of education. Against the background of these various and contrasting perspectives on research reviews, the approach in this study recognises all forms of research synthesis as potentially productive elements in educational research.
This study goes beyond previous controversies by empirically exploring the use of configurative reviews restricted to the field of education. The main purpose is to examine how configurative reviews are manifested in practice among brokerage agencies that explicitly prioritise a wide range of approaches to systematic review. This study will address the following research questions:
• How do the agencies describe their review activities and what approaches to synthesis do they apply within the educational field?
• To what extent, and in what ways, do the agencies use configurative approaches to research synthesis?
• What promotes or impedes the agencies' use of configurative reviews within the educational field?
Aggregative and configurative reviews
In education, as well as in other areas, the use of aggregative reviews can be traced back to the development of meta-analysis by Gene Glass and Mary Lee Smith in the 1970s (Gough, 2004). Meta-analysis converts data from multiple studies into a single measure called effect-size, allowing for comparisons between individual results and a pooled effect-size. The development of meta-analysis has been regarded as a cornerstone in the rise of evidence-based medicine (Bohlin, 2011). An established hierarchy of evidence characterises the medical approach to systematic reviewing in which randomised controlled trials (RCTs) have been considered the gold standard for evaluating the effects of interventions (Petticrew & Roberts, 2006; Torgerson, 2003). The use of meta-analysis paved the way for the development of other methods of aggregation since the principles of meta-analysis were seen as promising but not considered as applicable to all kinds of research (Bohlin, 2011). According to the typology of reviews developed by Gough et al. (2012a, 2012b) at the EPPI-Centre, aggregative reviews are based on realist approaches and the assumption that the phenomena studied are relatively unambiguous in nature. Aggregative reviews are characterised by a deductive logic of synthesis that summarises, or adds up, the findings of similar studies, primarily to measure the impact of an intervention (see Figure 1). The methodology of aggregative reviews is generally set a priori. Most of the review stages are specified in advance by a predefined conceptual framework, usually derived from a description of the study's population, intervention, comparison, outcome and setting (PICOS) (e.g., O'Connor, Green, & Higgins, 2011).
Two other characteristics of aggregative reviews are that the literature search aims to be exhaustive, generally based on the use of several complementary search strategies, and that the quality assessment is primarily performed to avoid various forms of bias (e.g., Campbell Collaboration, 2014; Higgins, Altman, & Sterne, 2011). 2 The aim is to find all relevant studies and, thereby, ensure a sample that is as representative, homogeneous and trustworthy as possible (Gough & Thomas, 2012).
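To make the additive logic of aggregation concrete, the following minimal sketch shows one common way of computing a pooled effect-size: inverse-variance weighting under a fixed-effect model. This particular model is an assumption made for illustration and is not prescribed by the typology or by the agencies studied here. The function name and the three study values are hypothetical.

```python
import math


def pooled_effect_size(effects, variances):
    """Pool per-study effect sizes with inverse-variance (fixed-effect) weights.

    Assumption for illustration: every study estimates the same underlying
    effect, so more precise studies (smaller variance) simply count for more.
    """
    if not effects or len(effects) != len(variances):
        raise ValueError("need one variance per effect size")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


# Hypothetical standardised mean differences and their variances.
effects = [0.2, 0.5, 0.8]
variances = [0.04, 0.01, 0.04]

pooled, standard_error = pooled_effect_size(effects, variances)
low = pooled - 1.96 * standard_error
high = pooled + 1.96 * standard_error
print(f"pooled effect-size = {pooled:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

The sketch presupposes that the studies are similar enough to be added up, which is precisely the assumption of homogeneity that configurative approaches call into question.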
In contrast, configurative reviews have been argued to be appropriate when confronted with a complex body of research for which different ways to study the same issue open up different understandings of the phenomena (Gough et al., 2012a). According to the typology, configurative reviews are based on idealist approaches and the assumption that the phenomena studied are multifaceted in nature: what is brought to the surface by a particular study depends on the study's theoretical and methodological points of departure (cf. Dixon-Woods et al., 2006). The synthesis follows an inductive logic that arranges the findings of different studies in a way that offers a meaningful picture of what the research presents. It may use techniques such as meta-ethnography (Noblit & Hare, 1988) to synthesise different strands of research that explore or develop theory. Meta-study (Paterson, Thorne, Canam, & Jillings, 2001), meta-narrative (Greenhalgh et al., 2005) and critical interpretive synthesis (Dixon-Woods et al., 2006) are other examples of configurative approaches (cf. Barnett-Page & Thomas, 2009).
Figure 1. Approaches in aggregative and configurative reviews. This figure is a slightly modified version of the figure in Gough et al. (2012b, p. 3). Permission to reproduce the figure has been obtained from David Gough.
Compared to aggregative reviews, the methodology of configurative reviews generally is described as more exploratory and iterative (Gough et al., 2012a, 2012b). Drawing on the studies that are found, a configurative review can be adjusted continuously, and the stages might overlap and influence each other. For example, the quality assessment and the synthesis cannot be separated easily in some configurative reviews since the potential value of a study only becomes visible when related to other studies in the field. As pointed out by Gough et al. (2013), "A spread of different and unusual cases may provide greater insights than a representative sample that reveals more about typical cases" (p. 20). Further, the literature search is likely to be done repeatedly, and the principle of saturation may be used to inform the inclusion and exclusion of studies. The aim is to find sufficient studies to provide a meaningful configuration that has the potential to deepen the understanding of the phenomena (cf. Petticrew, 2015).
Even though aggregative and configurative approaches substantially differ, it is important to underscore that most reviews contain elements of both and there are reviews that can be placed in between aggregation and configuration. Moreover, aggregation of qualitative data occurs in some reviews, as well as configuration of quantitative data in others. The typology of reviews made by Gough et al. (2012a, 2012b) is described as an attempt to overcome the dichotomy between qualitative and quantitative research.

Methods and analytical framework
This study is restricted to two brokerage agencies, the EPPI-Centre and the DCU, which have an explicit focus on education and position themselves within a combined-method approach. 3 The study draws on a multitude of data from several types of sources that were collected from October 2014 to August 2016: (1) literature on methods developed by and for the agencies, including various types of how-to documents made available as PDF documents and PowerPoint files on the agencies' web pages, such as concept notes, guidelines, manuals and different types of protocol; (2) final products made by the agencies, such as systematic review reports, systematic maps, rapid reviews, technical reports, user summaries and newsletters; and (3) four expert interviews with informants in the DCU and the EPPI-Centre.
The material provides a rich set of data for the analysis of what the agencies say they do (in documents and on web pages) and what they actually do (illustrated by a selection of final products). The material also provides data through personal interviews on the expert informants' reasoning as to why things are the way they are (see Table 1). The selection of documents, interview guides and analysis strictly focused on explicitly defined methods described in the material. This focus on method places the basic logic and defined standards, procedures and ideals for investigative practice at the centre of the study of the document-based material.
Documents representing the final products of the agencies were selected from the total pool of reports presented by the agencies on their web pages. Out of a total of 185 systematic review reports documented in the list of EPPI-Centre systematic reviews, 85 reports were identified within the field of education (EPPI-Centre, 2016). 4 An initial screening of the title of the reports was conducted to identify configurative-oriented reviews in line with the aim of the study. As a result of the screening, 57 reports were excluded on the basis of being aggregative-oriented approaches. 5 The remaining 28 reports then were obtained and read in two stages: a summary and method chapter screen, followed by in-depth reading and closer analysis of the full report. The total of 27 review reports produced by the DCU at the time of data collection were all read in the same two-stage procedure. 6 The authors independently reviewed all the reports included in this study and any disagreement was resolved by discussion.
Three expert interviews were conducted as audio-recorded semi-structured dialogues, while one of the interviews was conducted via e-mail. Informants were selected based on two criteria: their expertise and working position in the field of research synthesis. The dialogues focused on methods used for reviews; development of methods for review in general and within the educational field in particular; and thoughts on future needs and developments in the field of systematic review. The interview data supplement the information provided by the documents; they also give insight into how experienced researchers and experts in the field think about methods.
The typology developed by Gough et al. (2012a, 2012b), presented earlier in this paper, clarifies differences between various systematic reviews and has been used as a lens in the analysis of the data material (cf. Figure 1). The analysis focuses on the identification of indicators of aggregative and configurative elements in described and applied methods as well as the determination of their main orientation. Five main indicators have guided the analysis: (1) the nature of the research question: to test, explore or generate theory; (2) the degree of predetermined methods: selected a priori or iteratively as the review proceeds; (3) the aim of the literature searches: to find all relevant studies (exhaustive) or a sufficient number of studies (theoretical); (4) the type of studies included: quantitative and/or qualitative data; and (5) the focus of the quality assessment: to avoid bias or to value the uniqueness of the contribution.
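As a purely illustrative aid, the five indicators can be written down as a simple coding record for a single review. The analysis in this study rests on qualitative reading of the reports, not on a scoring rule; the class name, the three labels and the majority tally below are hypothetical conveniences, and treating quantitative data as an aggregative marker is a simplification, since qualitative data can be aggregated and quantitative data configured.

```python
from collections import Counter
from dataclasses import dataclass, fields

AGGREGATIVE = "aggregative"
CONFIGURATIVE = "configurative"
MIXED = "mixed"
_LABELS = {AGGREGATIVE, CONFIGURATIVE, MIXED}


@dataclass(frozen=True)
class ReviewCoding:
    """Hypothetical coding of one review on the five indicators."""
    research_question: str   # test theory vs explore or generate theory
    methods: str             # set a priori vs chosen iteratively
    literature_search: str   # exhaustive vs sufficient (theoretical)
    study_types: str         # mainly quantitative vs mainly qualitative (simplified)
    quality_assessment: str  # avoid bias vs value uniqueness of contribution

    def __post_init__(self):
        # Reject labels outside the three allowed values.
        for f in fields(self):
            if getattr(self, f.name) not in _LABELS:
                raise ValueError(f"unknown label for {f.name}")


def main_orientation(coding):
    """Return the label held by most indicators; ties count as mixed.

    This tally is an assumption for illustration, not the authors' procedure.
    """
    counts = Counter(getattr(coding, f.name) for f in fields(coding))
    if counts[AGGREGATIVE] > counts[CONFIGURATIVE]:
        return AGGREGATIVE
    if counts[CONFIGURATIVE] > counts[AGGREGATIVE]:
        return CONFIGURATIVE
    return MIXED


# A hypothetical review with an exploratory question but conventional stages.
example = ReviewCoding(
    research_question=CONFIGURATIVE,
    methods=AGGREGATIVE,
    literature_search=AGGREGATIVE,
    study_types=AGGREGATIVE,
    quality_assessment=AGGREGATIVE,
)
print(main_orientation(example))
```

A record of this kind only makes visible which stages of a review follow which logic; judging whether configurative elements are subordinated to an aggregative frame still requires reading the report itself.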

The Danish Clearinghouse for Educational Research
Analyses of websites, method manuals and reports
When the DCU was established in 2006, it became the first agency of its kind in continental Europe. The overall assignment of the DCU is to further the development of evidence-based practice and policy in education. Another important task is to contribute to an examination of the methods used to "ensure the quality development of the Clearinghouse's own products and contribute to the international development of the field" (DCU, 2007, p. 9). According to the concept note, the DCU follows the internationally accepted standards for systematic reviews, but not primarily those adopted by the Cochrane and Campbell Collaborations:
Other organisations that work with clearing also exist, and may give significant inspiration to a Danish clearinghouse. One example is the English EPPI-Centre . . . In contrast to Campbell, EPPI works with systematic reviews that include research based on different methodological approaches. In reviews from EPPI one thus finds both quantitative and qualitative studies of effect. EPPI has also worked with reviews that examine evidence for other types of questions than effect . . . As a starting point, the Danish Clearinghouse for Educational Research will work with several methods that allow for the inclusion of both qualitative and quantitative studies. (DCU, 2007, pp. 8-9)
Meta-analysis, model-based, narrative, additive and combined syntheses are examples of methods put forward as usable by the DCU, depending on whether quantitative or qualitative studies are included. The DCU offers three different kinds of products that vary in depth and width: systematic reviews, systematic research maps and brief research maps. Tools originally developed at the EPPI-Centre support much of the practical work at the DCU. The EPPI Reviewer software is used to ensure a high degree of formalisation and transparency in the majority of the published reviews and maps (e.g., Søgaard Larsen, Brørup Dyssegaard, & Tiftikci, 2012).
The DCU's descriptions of strategies and tools used in the review process match their ambition to work with different kinds of reviews and to contribute to the development of the field. However, this picture is not confirmed in this paper's examination of the systematic reviews and maps available on their website. The majority of the 27 reports examined in this study focus on effects of various interventions and programmes in education (e.g., Dyssegaard, de Hemmer Egeberg, & Steenberg, 2013; Dyssegaard, Søgaard Larsen, & Tiftikci, 2013; Larsen, Kornbeck, Kristensen, Larsen, & Sommersel, 2013a; Nielsen, Tiftikci, & Søgaard, 2013). These reviews and maps are based on the conventional methodology and are primarily adapted to aggregation. This means that the review stages generally are separated from each other and set a priori. The literature search is exhaustive and based on the use of several complementary strategies, including searches in several databases, searches on internet resources, hand searches and snowball searching. Even DCU's brief research maps, which are characterised by a less extensive search, aim to find all relevant studies within these limitations by using predefined strategies. Primarily quantitative studies are included, 7 the quality assessment aims to avoid bias, and meta-analysis or other aggregative approaches are given priority when synthesis is performed or when the possibility for synthesis is investigated.
Configurative elements can be observed in many of the DCU's products, but they are limited primarily to subgroup descriptions and analyses, adjustments of the scope, as well as the criteria for inclusion and exclusion (e.g., Dyssegaard, de Hemmer Egeberg et al., 2013; Larsen et al., 2011; Larsen, Kornbeck, Kristensen, & Larsen, 2012; Søgaard Larsen et al., 2012). The review conducted by Dyssegaard, Søgaard Larsen et al. (2013) may serve as an example to illuminate the nature of these considerations and adjustments. In this particular review, the screening process ended up with over 400 potentially relevant studies. However, the sample was considered by the authors to be too large for the synthesis. Consequently, the scope of the review was narrowed to allow only study designs that evaluated effects and that focused on effects in different subgroups of the population. The inclusion criteria for publication year of the studies also were altered. After these adjustments, the number of studies that were passed on to the next step in the review was reduced to 65. Some reviews and maps also raise questions that may be suitable for configurative approaches. As an example, the research map performed by Larsen et al. (2012), which examines the causes and effects of dropouts in higher education, is predominantly based on aggregation, but it also seems to have a configurative ambition in the investigation of how dropout as a phenomenon is framed and defined: "What is drop-out from university studies?" (p. 14). Still, the literature search and the selection of studies follow the conventional methodology. Only quantitative studies were included.
However, the DCU also has performed research syntheses besides those available on their website, for example, in collaboration with Ramboll Management Consulting, VIA University College and the professional colleges UCC and Metropol. In 2014, the DCU contributed to six research maps commissioned by the Danish Ministry of Education to form a basis for the reform of the Danish elementary school system (see Table 2). 8 Perhaps as a result of being produced by the same authors, these maps have an identical methodology, in which a configurative approach to synthesis is described as follows:
This approach is appropriate when the collected knowledge is based on studies that have been performed in different situations and contexts. It is important to stress that this approach does not allow for comparisons of effect-sizes across studies. In this map, we have used a so-called narrative synthesis within the configurative tradition. A narrative synthesis is well suited to handle studies with different research designs and interventions implemented in many different national and local contexts. (Dyssegaard, de Hemmer Egeberg, & Steenberg, 2014b, pp. 4-5 [translation by the authors])
A narrative synthesis is claimed to be used to establish themes inductively in all six maps. However, based on the limited method description offered in the reports, it is difficult to identify how the themes actually have been constructed. It is clear that the themes are built up as subcategories and that the studies under each theme examine methods, strategies or programmes that seem to operate on the same level and/or share the same targets. Nevertheless, the formation of themes must have called for quite comprehensive analyses and interpretations, but this configurative work is not described in a transparent manner in the reports.
The other configurative elements that are visible in these maps are quite similar to the less distinctive ones identified in many of the DCU's conventional reviews. There are adjustments of the scope, as well as the criteria for inclusion and exclusion, and subgroup descriptions and analyses are made frequently.
However, when it comes to indicators such as literature search, selection of studies and quality assessment, these maps are not in accordance with how the configurative approach is presented in the literature (see Figure 1). The literature search does not aim to find sufficient studies to accomplish a meaningful configuration; the selection appears not to be governed by the degree of saturation expected within the themes; and the quality assessment primarily focuses on the studies' suitability for measuring effects rather than estimating the uniqueness of the contribution.
The use of the conventional methodology in these stages could be explained by the fact that all six maps are driven by questions that focus on effects. All the questions begin with the phrase "what methods and efforts have an effect or impact on" (e.g., Dyssegaard et al., 2014c, p. 2 [translation by the authors]). Consequently, the great majority of the included studies have a quantitative approach. Of the 380 studies included in the maps, only 3% have a qualitative approach. This proportion would drop even more if one takes into account the studies that actually are passed on to the synthesis. As pointed out by the DCU (2007), "the differences between the types of studies included in systematic reviews also have consequences for the method used to synthesise knowledge" (p. 8).
Moreover, the conclusions presented in each map appear to be drawn by simply adding up the studies within and across themes. Lines of arguments, such as "in many of the studies, it is clear that" or "results from many of the studies in the map and across the themes show", underpin the presented conclusions (e.g., Dyssegaard et al., 2014e, pp. 52-56 [translations by the authors]). This can be seen as contradictory to the DCU's own description of the configurative approach, which clearly emphasises the  (Dyssegaard et al., 2014b, pp. 4-5). The analysis shows that these maps taken together should be placed somewhere, perhaps more correctly, in between aggregation and configuration (cf. Gough & Thomas, 2012). It is also important to emphasise that a narrative synthesis could be used for both configurative and aggregative purposes (Snilstveit et al., 2012).

Analyses of interviews with expert informants
The DCU's focus on aggregative reviews and use of the conventional methodology might be explained by the influence that outsourcers and financers seem to have on their review activities. Unlike brokerage agencies in some other countries, the DCU is not publicly funded. Rather, the DCU is dependent on the availability of external project assignments. As one expert said in an interview, "You are so lucky in Norway and Sweden because you get money! I do not have it; I must get my own. We do not share a budget with the university either" (Expert 1, DCU [translation by the authors]).
The Danish Ministry of Education has commissioned many of the systematic reviews and maps that have been published by the DCU. Other clients are the Danish Evaluation Institute, which explores and develops the quality of day care centres, schools and educational programmes, and various politically bound or unbound think tanks. In many cases, as perhaps exemplified by the "configurative" research maps discussed in the previous section, the DCU must adapt to the aims and questions of external project assignments.
The DCU's adaptability to outsourcers and financers also is reflected in how it presents its products to potential users. Besides systematic reviews, the DCU offers two other formats, a systematic research map and a brief systematic research map. A systematic research map mainly describes the research activity within a given field of knowledge. A brief systematic research map has a similar purpose but is characterised by a less extensive literature search and does not always include quality-assurance procedures in the different phases of the mapping. However, the expert interviews showed that the actual performance of these two different kinds of maps could be further customised: "Finally, I want to emphasise that we do not necessarily only offer these three exact products. The exact form will always be customised to the needs and economy of the outsourcer" (Expert 2, DCU [translation by the authors]). The research maps are described as less time-consuming and resource-demanding in comparison to a full systematic review. This perhaps is appealing to outsourcers with urgent missions and limited budgets.
Outsourcers and financers also seem to influence the staffing and the competencies needed at the DCU. The availability of project assignments determines how many and which employees they have over time. One of the informants (Expert 1, DCU) pointed out that many different competencies, such as knowledge about the educational field and the subject matter, the ability to work conceptually to make adequate literature searches and the ability to write reports, are important in the making of a systematic review or map. Yet, when it comes to specific methodological competencies, certain skills noticeably are given prominence that do not match the DCU's aim to work with different kinds of reviews. The same informant emphasised that the staff have to be "methodologically skilled" and well acquainted with quantitative research, to assess levels of significance and work in various statistics software, among other things.

The EPPI-Centre
Analyses of websites, method manuals and reports
The methodological profile and method of the EPPI-Centre are described thoroughly on the organisation's web pages. The front web page presents the work of the centre in more popular ways, but the centre also has developed an extensive library of references, short summaries, various publications, reports and PowerPoint files under the label Methods and Databases. Most of the material on these pages is openly available and covers the various aspects of how to perform a systematic research synthesis. A centrepiece in this material is The EPPI-Centre Method for Conducting Systematic Reviews (2010). The document describes every step in the development of a systematic review, and it also gives advice and recommendations, as well as underscores a set of quality requirements for the EPPI review. In addition, it refers to the book Introduction to Systematic Review (Gough et al., 2012a) for updated and more elaborate discussions on the various steps of the method. The method document and the book can be characterised as the EPPI-Centre methods for conducting a systematic review.
Both documents emphasise that all systematic reviews require explicit methods for the description and synthesis of evidence, but also that methods can vary depending on the questions that are to be answered by the review:
Often, ideological resistance is linked to confusion between experimental methodologies and systematic reviews (i.e. the myth that randomised controlled trials are the only type of research evidence that is accepted, and that "what works" questions are the only ones addressed by reviews). In recent years the flexible nature of systematic research synthesis has been illustrated, with a huge variety of types of questions being answered with syntheses of a broad range of study types. (EPPI-Centre, 2010, p. 3)
The presented logic is that different types of evidence will be appropriate for answering different questions. Both documents emphasise that the majority of systematic reviews are about informing policy and practice of efficiency. This refers to studies that mainly combine numerical data from experiments in meta-analysis that aim to answer questions on what works. Both documents highlight that the logic of systematic review, to a large extent, varies in research questions, the kind of primary research that is included, methods of synthesis and procedures, as well as in terms of the kind of evidence included in the studies. They also describe how research questions can be defined broadly or narrowly and the consequences of the chosen method. These statements illustrate how the EPPI-Centre has opened the field of research synthesis to studies of more varied methods and approaches. They also reflect an aspiration to make use of multi-method approaches in systematic reviews. However, the description of the method and steps for conducting a systematic review is highly grounded in the conventional method for systematic review inspired by RCT studies in medicine and an aggregative approach.
More configurative approaches are described too scarcely in the documents to be considered as guidelines or advice for conducting configurative reviews:
"In some reviews, the question and method is not so pre-specified, so allowing for a more iterative method of review. These reviews tend to have broader questions and take a more investigative approach to examining the evidence rather than pre-specifying every aspect of the review." (EPPI-Centre, 2010, p. 4)
The 28 selected reports show various approaches to synthesis, ranging from meta-analysis (e.g., Hawkes & Ugur, 2012), realist review (e.g., Westhorp et al., 2014), narrative synthesis (e.g., Bennett, Lubben, Hogarth, Campbell, & Robinson, 2005) and explorative review (e.g., Bills et al., 2008) to systematic mapping (see Note 9). The majority of the reports, however, are based clearly on the methodological principles of the aggregative approach. This can be seen in the main use of a priori defined methods, how search strategies are designed for exhaustive searches rather than the pursuit of saturation, and how the quality assessment mainly aims to avoid bias. Yet, there are some distinct variations in how aggregative or configurative the reviews are.
Another group of reports in the sample, mainly narrative syntheses (Bennett et al., 2005; Bills et al., 2007; Francis, Skelton, & Archer, 2002; Harlen, 2004; Kingdon et al., 2014; Kyriacou & Goulding, 2006; Novelli, Higgins, Ugur, & Valiente, 2014; Powell & Tod, 2004), but also one realist review (Westhorp et al., 2014) and one explorative review (Bills et al., 2008), are based on the conventional methodology. Yet, they include more elaborated configurations, such as establishment of themes across studies (e.g., Kyriacou & Goulding, 2006) and contrasting the findings of different studies in light of the research question (e.g., Bills et al., 2007), and some reviews explicitly aim to explore and/or develop theory. The realist review conducted by Westhorp et al. (2014) may serve as an example to illustrate the nature of the latter. This review addresses the question: "Under what circumstances does enhancing community accountability and empowerment improve education outcomes, particularly for the poor?" (p. 10). A preliminary programme theory was first developed for community accountability and empowerment. Exhaustive searches were then performed to identify all outcome studies of relevance. Studies that provided support for the theoretical elaboration were also included. Evidence from the included studies was extracted and related to the initial programme theory. To identify influencing features of context, programme mechanisms and hypothetical causal pathways, the review team was involved in in-depth reading and analyses of texts, deliberations between leading researchers and repeated searches for supporting evidence. This iterative process resulted in a quite complex realist account "of the contexts in which, and mechanisms through which, community accountability and empowerment interventions may contribute to improving education outcomes" (p. 137). The example illustrates, as pointed out by Gough et al. (2012a), that a realist review can be placed in between aggregation and configuration.
The narrative syntheses in the sample that claim to explore or develop theory share some of the configurative elements that can be identified in a realist review (e.g., Kingdon et al., 2014; Novelli et al., 2014; Powell & Tod, 2004), but the most distinguishing configurative work that can be identified in the narrative syntheses at the EPPI-Centre generally is the establishment of themes. The themes, however, are mainly constructed deductively and could be characterised as predefined sub-categories in which the findings of individual studies are summarised. The themes are derived from, for example, the conceptual framework (e.g., Novelli et al., 2014), review question (Harlen, 2004), previous research (e.g., Kyriacou & Goulding, 2006), weight of evidence (e.g., Bills et al., 2007) or study characteristics (e.g., Bennett et al., 2005). It is important to underscore, however, that some of the syntheses that establish themes deductively tend to apply both aggregative and configurative logics when constructing the narrative. This can be seen in how the studies included are both summarised and contrasted in the light of the research question (e.g., Bennett et al., 2005; Bills et al., 2008; Kyriacou & Goulding, 2006).
There are five narrative syntheses in which themes mainly emerge inductively (see Table 3) and that, perhaps, can be regarded as the EPPI reports that come closest to how the configurative approach is presented in the literature. Three of these reviews focus on the experiences and perceptions of educational stakeholders (e.g., school leaders, teachers, support staff and teaching assistants) and mainly draw on qualitative data (Cajkler et al., 2006, 2007; Nixon, Gregson, Spedding, & Mearns, 2008). The two other reviews are concerned with teaching strategies that contribute to the inclusion of pupils with special needs and are based on both qualitative and quantitative studies (Rix, Hall, Nind, Sheehy, & Wearmouth, 2006; Sheehy et al., 2009).
The review conducted by Nixon et al. (2008), which examines practitioners' experiences of implementing national education policy (16-19 policy) at the local level, may serve as an example to clarify how themes emerge inductively. In this particular review, exhaustive search strategies were applied. A total of 58 studies were included in the initial mapping. The review team then introduced additional inclusion and exclusion criteria to identify a focused subset of studies that most closely addressed the review question: "What do practitioners in Further Education (FE) colleges say about the conditions, attitudes and implementation of national education policy?" (p. 19). Ten studies that investigate the attitudes, perceptions, views and beliefs of practitioners about their working context and the implementation of national policy in the local setting were selected for in-depth review. The synthesis was then performed in three distinct steps, all of which are thoroughly described in the report:
"The first step in this three stage process was to organise the direct quotations identified into clusters that illuminated one general theme ... The second step of the process was then to examine the themes, analysis and implications drawn by the study authors, and to identify comment that also reflected the emerging theme of mediation ... The final step in the process was to review the quotations, themes and the rough theme to develop a statement that encapsulated all this material and which is presented as the finding." (pp. 11-12)
This process generated five main finding statements: "policy mediation", "practitioner pragmatism", "juggling competing discourses", "constriction of pedagogic judgement and agency", and "uncertainty and insecurity". In connection to each finding statement, key terms associated with that statement are identified and the meaning of these key terms is explored with reference to the themes and quotations taken from the studies included.
Emerging themes represent a distinct configurative element in all of these five narrative syntheses. Nevertheless, much effort also seems to be made to apply procedures grounded in the aggregative approach. The literature search does not aim to find sufficient studies to accomplish a meaningful configuration; the selection appears not to be governed by the degree of saturation, which could be expected within the themes, and the quality assessment does not seem to pay much attention to the uniqueness of the contribution. Moreover, the two narrative syntheses that draw on both qualitative and quantitative data explicitly point out that "the differences in foci and emphasis across the studies, together with the fact that most used mixed methods, meant that a meta-analysis of a statistical nature was not appropriate" (Rix et al., 2006, p. 40). The quality assessment in these two reviews clearly aims to avoid bias (e.g., Sheehy et al., 2009, pp. 49-53). Overall, the analysis of the five EPPI reports shows that the approach to establish themes inductively only partly matches the characteristics of a configurative review (cf. Figure 1).

Analyses of interviews with expert informants
Even though the majority of systematic reviews are grounded in the aggregative approach, the expert informants at the EPPI-Centre pointed out that this often can be considered as a balance between elements of aggregative and more configurative approaches. One of the informants exemplified this by describing how, in situations with few studies to aggregate, they could develop broader research questions to supplement the study with elements of configuration, thus turning it into a combined study (Expert 2, EPPI). However, whether this actually would happen is dependent on the interests of the outsourcer. This informant said that there is little to be done if the client has made up his or her mind. Another possibility in such situations is to point to the results of a research mapping or initial scoping of the field to illustrate what kind of primary studies are available and the potential for development of an additional research question for the review. Another aspect discussed in the interviews by both informants was how British authorities, to a lesser degree, have commissioned systematic reviews in education over the last couple of years, while this does not seem to be the same for fields like health and medicine. One informant described how the health programme has developed a relationship with the funder and that they work more together with research questions. The informant said the funder understands that time is necessary to receive quality studies and that this also has resulted in some innovative work (Expert 2, EPPI). When asked if there are differences in method approaches in different fields, the informant said that projects are run the same way, but reviews in education take more time in general, more time is spent on defining research questions, the searches tend to be less precise and there is very often a need to perform supplementary searches.
Both informants at the EPPI-Centre agree with the argument that the field is dominated by the aggregative approach. One of the informants also underscored how the field of research synthesis has been and still is characterised by a methodological debate about whether qualitative studies can be used in systematic reviews:
"You need clear standards. You do not want too much variation, but growing awareness that effects are not enough has led to the use of mixed methods and multi-components methods. We are pleased about that. The aggregation and configural illustrate this complexity – there is technical difficulty. I use complexity to talk about mixing methods and components." (Expert 1, EPPI)
Both informants describe how the centre's developmental work on methods has aimed to open up a broader use of rigorous, quantitative and qualitative research, which has led to greater variation in types of reviews, including configurative reviews, though perhaps more so in the field of health care than in education. One of the informants also emphasised that their work on the configurative approach first and foremost is developmental and a work in progress (Expert 1, EPPI).
When asked about why the majority of reviews still are aggregative, the informants, in the same way as the informants at the Danish Clearinghouse, point in the direction of funding: "In method development, we work configurative, but the minority of funders want configurative reviews" (Expert 2, EPPI).
One of the informants explained the use of the concepts of aggregative and configurative approaches as a way of making the diversity of possibilities in research synthesis visible (Expert 1, EPPI). At the same time, this informant underscored the importance of recognising the complexity of the educational field. Systematic review is only one tool among several available to those who work for education development. As such, the informant places the research synthesis in education as one source of information in line with all other sources about the educational field. To this informant, it is important to develop a coherent system that relates the different providers and sources of information to each other as a basis for the development of education, including primary research, research synthesis and data on student learning outcomes and experience-based knowledge. Another aspect emphasised by this informant was the value of research synthesis seen in relation to development of research strategies and research funding.

Discussion
Taken together, the analysis of the reports and the interview data presented here shows that configurative reviews rarely are used in education. This study was not able to find any pure example of a configurative review or map that includes a wide range of research approaches, methods and interests within the education field. Less distinctive configurative elements, however, can be identified in many reviews, such as the narrowing of scope, refinement of the research question, adjustment of inclusion and exclusion criteria, subgroup descriptions and analyses, and deductive construction of themes. Generally, though, they operate within the frame of the conventional methodology and tend to be subordinated to an aggregative review logic (cf. Bohlin, 2010). This tendency also applies to the reviews that explicitly are based on a configurative approach, as exemplified by the DCU's systematic research maps from 2014. With the exception of some of the reviews at the EPPI-Centre, the configurative elements identified in this study have a relatively small impact on the review process and outcome. Still, both agencies underline the importance of applying a wide range of approaches in education, and they provide examples of how to proceed with configurative reviews. Yet, the majority of their reviews in education follow the conventional methodology and generally could be categorised as aggregative reviews. The main purpose of the reviews is to investigate the effects of interventions or other types of activities. First and foremost, this illustrates how the aggregative approach dominates the educational field, even in the development of alternative research synthesis. The findings of this study can be discussed in relation to a range of issues, and they have implications for the relevance and quality of systematic reviewing in the field of education.
The influence of outsourcers and financers on the focus and development within the field of systematic review can be seen as worrisome. The extent and kind of involvement of this category of "users" can hardly be regarded as steps towards a democratisation of the review process that acknowledges the different perspectives of researchers, policymakers and practitioners (cf. Rees & Oliver, 2012). This study illustrates how brokerage agencies within a market context adapt their activities to the questions and aims of external project assignments and, to a certain extent, offer products that are flexible to the request of less time- and resource-demanding processes. However, the development towards slimmed-down products, which depart from the methodology of a full systematic review, has been discussed previously as a potential threat to the quality of systematic reviews (Ganann, Ciliska, & Thomas, 2010; Harker & Kleijnen, 2012).
One might also question if brief reviews and mappings can do justice to configurative approaches and whether they are appropriate for synthesising heterogeneous bodies of research. Nevertheless, it is undoubtedly difficult to contribute to method development within the limitations that outsourcers and financers appear to place on the agencies' review activity. Yet, as emphasised by the informants, configurative reviews have been successfully put into practice within other fields. The fact that health care seems more pluralistic than education in terms of review methodology needs further investigation, especially as it is reasonable to expect education to be open to broader questions addressed by more qualitative paradigms. The informants point towards differences in client relations and fewer commissioned reviews in education over the last years. Since 2012, when Gough et al. (2012a, 2012b) published on the typology of systematic reviews, the EPPI-Centre has conducted a relatively small number of reviews that cover overall topics within the educational field. While not denying the need to strengthen the knowledge base on what works, the dominant use of conventional methods does not seem to cover the educational field sufficiently as a whole (Biesta, 2007). There is an impending risk that valuable evidence (for example, from qualitative research that potentially addresses other important assignments of schools and teachers) is thereby overlooked (Bridges, 2008; Hammersley, 2001). Education in most European countries aims for democracy, citizenship, personal development and social and communication skills, but the current orientation of systematic reviews in education does not include studies that explore or generate theories on these issues. It is obvious that the agencies studied acknowledge the need to go beyond what works in education.
At present, however, both agencies seem to be caught in between their ambition to do justice to the complexities within the educational field and the interest among outsourcers and financers, who mainly seem to be interested in significant effect-sizes.
The far-reaching consequences of a one-sided approach are, of course, difficult to anticipate. One unfortunate consequence could be that professionals feel that their everyday work in school is not supported by systematic reviews. This leads to the important question of who will contribute to the development of the field and work with different kinds of reviews if leading organisations in the educational arena fail due to their dependence on external funding. Over the last years, however, new brokerage agencies have appeared in the educational field, for example, the Knowledge Centre for Education in Norway and, most recently, the Institute for Educational Research in Sweden. These agencies have an opportunity to take on an important role in the development of the field by putting configurative approaches into practice.
One potential way forward in this matter could be to strengthen the cooperation between the organisations involved in systematic reviews. There are already established networks, such as the Evidence Informed Policy and Practice in Education in Europe Network (EIPPEE Network), which, among other things, aims to increase access to relevant educational research. The EIPPEE Network, and those like it, could open opportunities for method-development projects and for sharing experiences of any effort made to support review pluralism and enhance the use of configurative reviews in education. Collaboration on these matters may also highlight the specific competencies needed for the performance of configurative reviews. The inclusion of a wide range of research approaches, methods and interests will most likely require the expertise that belongs to the interpretative and critical research traditions (cf. Dixon-Woods et al., 2006; Levinsson, 2015). The work of Gough et al. (2012a, 2012b) at the EPPI-Centre has been important for conceptualising the field and for clarifying diversities among the reviews that are used, but the field of education and the organisations that conduct systematic reviews would most likely benefit from more practical examples of how to proceed with different kinds of configurative reviews.

Notes
1. [...] (2001, 1997) in Europe.
2. A number of different schemes and tools have been developed to assess the quality of study design types included in aggregative reviews. The purpose of applying these schemes and tools generally is to assess the risk of bias, that is, the systematic errors in the results or inferences. As an example, the Cochrane Collaboration's tool for assessing risk of bias assigns a judgement of the risk of bias in seven different domains: sequence generation, allocation concealment (selection bias), blinding of participants and personnel (performance bias), blinding of outcome assessment (detection bias), incomplete outcome data (attrition bias), selective outcome reporting (reporting bias) and other issues (e.g., Higgins et al., 2011).
3. Such agencies as the What Works Clearinghouse and the Campbell Collaboration are not considered relevant for this investigation since their work on systematic reviews is focused primarily on the synthesis of effect studies on interventions.
4. The list of EPPI-Centre systematic reviews was retrieved many times during data collection. The last check for updates of the list was made in May 2016. Since a handful of reviews span disciplines, the estimation of the total number of EPPI reports within the educational field is a bit uncertain.
5. This was based on the title containing at least one of the following keywords: impact, effect, intervention, what works and successful practice.
6. The last check for updates of review reports posted on DCU's website was made in May 2016.
7. There are a few exceptions to this description. Over the years, the DCU has published six descriptive maps that aim to include all Scandinavian research conducted in the area of preschool education in a specific year (e.g., Larsen et al., 2013b). These broad maps include qualitative studies as well. However, they do not answer a specific research question or include an explicit synthesis of the studies. The same exceptions apply for two of the DCU's brief systematic research maps that were conducted in 2014.
8. Since the time of the study, these six systematic research maps have been posted on the DCU's website.
9. Four of the reports were identified as systematic research maps (Dyson & Gallannaugh, 2008; Graham-Matheson, Connolly, Robson, & Stow, 2006; Husbands, Shreeve, & Jones, 2008; Moyles & Stuart, 2003). However, unlike the systematic research maps at the DCU, none of these maps contain a synthesis of the studies included. Consequently, these four reports were excluded on the basis of being considered not relevant for this investigation.