Policy innovation lab scholarship: past, present, and the future – Introduction to the special issue on policy innovation labs

Abstract The past decade has seen a rapid rise in the number of policy innovation labs (PILs). PILs, which are found both inside and outside of government, address a wide range of social issues. Many PILs share a few distinct common characteristics: a commitment to the design-thinking methodology, a focus on applying experimental approaches to testing and measuring the efficacy of comprehensive public policy and intervention program prototypes, and the use of user-centric techniques to engage stakeholders in the design process. In this introduction to the special issue on PILs, we begin by taking stock of the policy lab literature published to date by providing an overview of 70 related publications (peer-reviewed articles, book chapters, theses, reports, and catalogs) and the extent to which they engage the policy literature. This review demonstrates the underexplored practitioner perspective, which serves as the theme for this special issue. Next, the six articles that comprise this special issue are introduced. They are written from a practitioner perspective and include contributions from Brazil, Canada, Finland, and the United Kingdom. Finally, suggestions for future research are highlighted, including the role of PILs in policy work, PILs as street-level policy entrepreneurship settings, and the need for more rigorous inferential methods.


Introduction
In scarcely a decade, a "labification" phenomenon has taken hold on a global scale, whereby the search for innovative policy solutions for social problems is embedded within scientific experimental-like structures. Policy labs, also referred to as policy innovation labs (PILs), have been steadily growing and can be found within government agencies, universities, or not-for-profit organizations. Each seeks to address a pressing social or economic issue. In global terms, most PILs have been established since 2011, and their rapid growth has led to claims that they "are on the path to becoming a pervasive part of the social infrastructure of modern public organizations" (Carstensen and Bason 2012, 5). Policy labs are also referred to as "public innovation labs," "public sector innovation labs," "government innovation labs," "organizational innovation labs," "policy innovation labs," "innovation labs," "public policy labs," "social innovation labs," "systems change labs," "design labs," and "policy labs" (Whicher 2021; Hinrichs-Krapels et al. 2020).
Policy labs resemble well-known organizations, including think tanks, research institutes, and policy shops, with which they share the goal of providing policy solutions for problems that often arise in specific sectoral areas such as health, welfare, open or big data, and the environment. In the effort to reorganize or rationalize activities in those sectors, however, the reasons for the creation of PILs and their purposes are not as clear cut (Tõnurist, Kattel, and Lember 2017).
The term "policy lab" can include established teams (or organizations, or institutes) set up specifically for innovative activities for public policy making and physical spaces set up to conduct workshops or other stakeholder activities. Muddying the picture is also the growth of other related organizations such as living labs, research institutes, and nudge (behavioral economics) groups contributing to policy making. We estimate that there are well over 450 lab-like entities worldwide.
Despite this ambiguity, PILs tend to share three distinctive features: (1) The use of design-thinking methodology (e.g. Lee and Ma 2020; McGann et al. 2018a), which originated in industrial, product, and service design (Manzini 2015); (2) A focus on innovation through the application of experimental approaches, often as pilots or prototypes. By seeking to emulate scientific methodologies, PILs attempt to test and measure the efficacy of various public policies and programs as well as to provide evidence for evidence-based design (Bason 2017; Kimbell 2015; Lee and Ma 2020); and (3) A user-centric approach whereby target populations actively engage in the design process (Lee and Ma 2020). Indeed, many PILs coordinate efforts between public, private, and academic actors (Williamson 2014a, 2014b). Additionally, PILs are often characterized by the wide usage of digital instruments to allow public transparency (Olejniczak et al. 2020).
Therefore, an important goal of PILs is to create a collaborative space to enable participants with varied skill sets to reach a common understanding of a policy challenge and then explore, design, and test user-centered solutions for potential implementation across the system (El-Haddadeh et al. 2014; Bellefontaine 2012). Thus, PILs are understood as both a process and a particular kind of workspace that breaks down hierarchies and engages people in divergent and creative thinking (Gryszkiewicz, Lykourentzou, and Toivonen 2017; McGann et al. 2018a).
Guided by user-centric approaches and drawing on experiments as pilots, policy labs aim to address the well-documented phenomena of implementation gaps (e.g. Gassner and Gofen 2018) and noncompliance (Gofen 2014, 2015) by enhancing the notion of evidence-based design. The policy labification trend supports Lindquist and Buttazzoni's (2021) argument that these widely different manifestations are required to build on new knowledge and skills that are often recruited from other parts of an organization (in government-based policy labs) or by autonomous or semi-autonomous organizations. Thus, the adhocracy form seeks to encourage flexibility, adaptation, and creativity to deal with environments characterized by uncertainty, ambiguity, and information overload, produce innovative products, adapt quickly to new opportunities, and build emergent strategies (Lindquist and Buttazzoni 2021).
In this introductory paper to the Policy Design and Practice special issue on policy innovation labs, we first review the existing policy lab literature. By taking stock of the growing number of publications, we aim to help other scholars and practitioners better understand the available scholarship and to provide a valuable one-stop resource. Our review suggests that practitioner perspectives on policy labs are understudied. This six-article special issue brings this scholarship together to broaden the understanding of policy labs both among scholars and practitioners. We conclude by suggesting possible avenues for future PIL research.

State of the policy innovation lab scholarship
Many theoretical policy frameworks have been employed to explain the rise of policy innovation labs and policy "labification," including design thinking, experimental government, and collaborative governance (Andersen, Kelemen, and Matzdorf 2020). For example, it is argued that design thinking in policymaking may lead to improved policy design because it promotes more nuanced solutions (Brown 2008; Howlett 2014; Schön 1988, 1992). Interestingly, contemporary design thinking in policymaking reflects the technocratic policy design approach initially developed in the 1970s and 1980s, which emerged in analogy to design in engineering or architecture (Peters 2020). Experimental government is also rooted in a long-standing tradition of experimentalism, emphasizing the importance of experimenting with social change, for example, in the Musée social in Paris. Policy labs are often referred to as experimentation "islands" where the public sector can rapidly experiment with policy design by testing and scaling public-service innovations (Tõnurist, Kattel, and Lember 2017; McGann et al. 2018a). Policy labification is also rooted in collaborative governance, which manifests the well-known notion of participatory and deliberative democracy that emerged during the 1960s and 1970s as an alternative, unorthodox approach to neo-liberalism and public management, which consider citizens "customers" and "clients," thus peripheral actors of politics (Schuler and Namioka 1993; Vitale 2006). PILs, therefore, echo the well-documented co-production notion, whereby policy solutions are co-created (Nesti 2018). Co-design is also a well-established approach to creative practice within the public sector, with roots in the participatory design techniques developed in Scandinavia during the 1970s (Puttick, Baeck, and Colligan 2014).
Engaging both governmental and non-governmental actors, PILs are often studied by using network and networking theories and as intermediaries between researchers and policy actors (Ojha et al. 2020; Olejniczak et al. 2020). PILs are also considered instruments that facilitate policy knowledge transfer (Lee and Ma 2020).
In contrast to the literature examining think tanks, living labs, design thinking, and behavioral insight units (i.e. nudging), the policy lab literature is surprisingly small. In May 2021, we conducted a database search using Google Scholar, Proquest, and Scopus (see Note 1). The bibliographies of the publications were also searched for possible undetected publications. Finally, several leading policy lab scholars verified the completeness of our search results. Along with peer-reviewed articles, we also included conference papers, book chapters, reports, and theses. The search focused on policy and public sector innovation-specific labs. Other entities such as behavior/nudge units, living labs, research institutes, and think tanks were omitted. However, given the ambiguity of the literature, differentiating these entities from PILs was not always possible. In total, 70 publications, including the six papers in this special issue, were found. They are listed in chronological order, with a brief description of each, in Appendix 1. The results of a preliminary analysis are presented in Tables 1-6. The documents were uploaded into NVivo 12, a qualitative data analysis application. We acknowledge and recommend that a more rigorous approach to this literature should be undertaken.
The first known PIL paper was Lewis and Moultrie's (2005) article, which chronicles the formation of three early UK policy labs. In the past two years, there has been a significant increase in policy lab-related publications, with a shift from conference papers and reports to peer-reviewed articles (Table 1). A small majority of the publications are peer-reviewed articles (39), followed by reports (17) (Table 2). There were three lab-based master's theses, all of which investigated the Finnish Inland Design lab.
Geographically, when stated, the focus of the publications has been widespread, with the UK accounting for the highest number (8) (Table 3). Notably, there were no publications from or directly analyzing African policy labs. Table 4 provides an overview of the publications' focus or, in some cases, foci. As a new field, some of the publications provided a conceptual lens, often providing theoretical arguments explaining the rise of labs and their role in public sector reform and policymaking. There were nearly as many single case studies (17) as multiple case studies (14). Only a few publications attempt to systematically compare PILs. Examples include Lee and Ma's (2020) intercountry study and Evans and Cheng's (2021) intra-country Canadian study. Key informant interviews and workshops were the most commonly employed methods in empirical studies. There were only a handful of PIL surveys, which is not surprising given the relatively small number of labs. Regardless of the method employed, all of the studies were descriptive with no attempt to provide rigorous causal explanations. Many empirically-based publications tended to examine policy labs in a variety of sectors. Only 14 of the studies could be considered sector-specific, with "data"-based being the most frequent (7) (Table 5). Very few studies explicitly focused on national, sub-national, or municipal issues.
The policy lab field is very multidisciplinary, attracting scholars from a variety of fields. While categorizing the disciplinary backgrounds of the many authors in the 70 publications would be nearly impossible to do, many are identified with the design field. In contrast, a growing number are from the public policy and public management fields. Table 6 highlights the extent to which the public policy literature has entered the policy lab scholarship.
Most practitioners are familiar with the policy cycle (or policy stages) (Lindquist and Wellstead 2021; Cairney 2015), which is the best-known heuristic describing the policy-making process (Howlett, Ramesh, and Perl 2020; Cairney 2019). The policy cycle or policy stages concepts were highlighted in 17 publications, including Conliffe, Story, and Hsu (2018) and Pólvora and Nascimento (2021). These authors acknowledge that the concepts represent an important starting point when understanding the policy process. They also argue that policymaking is far more complex, which presents designers with an opportunity to play a critical role in the process. Whicher and Crick (2019) point out that the "policy cycle is deeply embedded in the cultures of legislatures and bureaucracies around the world, is one of the main reasons why policy processes are primarily focused on the production of documents, rather than the production of outcomes" (p. 296). Olejniczak et al. (2020) state that lab activities are embedded within the main policy cycle as they often build in a smaller loop of design-testing adaptation.
Within the policy cycle, agenda-setting was only sparingly mentioned (eight publications). Hinrichs-Krapels et al. (2020) suggest that labs could provide evidence to policymakers that a particular issue is not ready to be on the policy agenda. The role of policy labs in policy formulation received slightly more attention (13 publications). Fleischer and Carstens (2021) acknowledged that policy labs were an unconventional actor compared to the formulation process dominated by traditional and hierarchical bureaucracies. Vrabie and Ianole-Călin (2020) found that since labs promote open government and evidence-based criteria, they can encourage governments to become more transparent, participative, and collaborative during policy formulation. As with agenda-setting, policy implementation was sparingly mentioned (18 publications). Olejniczak et al. (2020) found that it was unclear if policy labs were "effectively feeding their solutions into the actual policymaking and policy implementation process" (p. 104). Komatsu et al. (2021) also made similar criticisms. Finally, while some publications highlighted the evaluation of PILs, there was very little evidence of labs playing a role in formal policy evaluations or the policy cycle. Overall, the connection to other aspects of the policy and public management literature (e.g. policy capacity, public value, policy work) was minimal. Unsurprisingly, the term "policy design" was raised in 40 publications. Upon closer inspection, this term is used in the larger context of design-based approaches rather than how policy design is understood in the policy sciences. Clarke and Craft (2019) commented on the differences between these two variants. They pointed out that, in the latter, policy design accounts for political and policy capacity constraints, policy mixes, and policy styles.

Special issue overview
This special issue focuses on the lessons learned by practitioners on various aspects of policy design in policy labs, which will broaden the on-the-ground perspectives on policy labs. The policy lab-specific papers begin in Canada with Kathy Brock's paper "Policy labs, partners and policy effectiveness in Canada," which focuses on how 'deliverology' in Canada after the 2015 election of Justin Trudeau's Liberal Party spurred the growth of policy innovation labs. Brock provides a broad overview of the Canadian experience with policy labs between 2015 and 2020 and, in particular, with Policy Development Units (PDUs) in the central machinery of government. This paper focuses on the bringing of nonprofit and private sector partners into the center of public sector decision-making through policy hubs, as well as the establishment of private labs. The study also highlights that collaborative relations with the government resulted in mixed implications for the nonprofit sector. Collaborating through policy hubs provided nonprofit organizations with new opportunities and access to impact policy decisions. However, it posed risks to the independence, legitimacy, and effectiveness of nonprofit organizations as policy advocates. Therefore, the practical insights of this study emphasize that both public and nonprofit sector partners in PILs should be cautious about their choice of partnership and recognize that their ability to influence policy change is often limited.
Jenny Lewis's paper "The limits of policy labs: characteristics, opportunities, and constraints" provides a broad overview of policy lab research that has taken place in Australia and New Zealand over the past five years. This paper offers insights and lessons learned from three empirical studies, which are generalizable and should be of interest to readers from other jurisdictions. Lewis's paper focuses on critical characteristics of policy labs, notably organizational forms, size, focus, and methods. PILs can be controlled, enabled, or led by the government, or run independently. Importantly, lessons learned regarding the opportunities and constraints are highlighted. Specifically, in practice, labs' autonomy and closeness to citizens and communities provide opportunities to broaden the scope of potential policy solutions. Practical constraints are ascribed to labs' dependency on political patronage and labs' common features, notably their small size and often short life cycles.
The evolution of policy labs' operating models is the focus of Anna Whicher's paper "Evolution of policy labs and use of design for policy in UK government." This paper draws on the growth of UK policy labs, which was precipitated by two policy agendas: open policy making and devolution. Offering a typology of four distinct financing models of labs shifts attention to the extent and the scope of a lab's dependency upon its financing source. Labs may be funded by one or multiple departments, by recovering part of the projects' costs, by charging for projects on a not-for-profit basis, by charging consultancy rates with a profit margin to expand operations, or from multiple income sources. Whicher also suggests a framework for the establishment, review, and evaluation of policy labs, which comprises four components, namely (1) Proposition: the vision, governance, and finance models; (2) Product: the offering, user needs, and tools; (3) People: the people skills, knowledge diffusion, and broader capacity building; and (4) Process: the routes to engagement, user journey, and promotion mechanism. From a practical perspective, the financing typology and the framework provide practitioners with analytical tools to plan and categorize labs.
The Inland Design lab, located within the Digital Service unit of Finland's Immigration Service (Maahanmuuttovirasto), provides the case study for Tamami Komatsu, Mariana Salgado, Alessandro Deserti, and Francesca Rizzo's paper "Policy labs challenges in the public sector: the value of design for more responsive organizations." This paper is the fourth study of the lab, an ongoing process of design experiments supported by the Finnish government (see Kantola 2019; Kokki 2018; Swan 2018 in Appendix 1). Readers have the opportunity to experience the details of the design process and the improvements made in a 2017 pilot to improve immigrant-related services. Komatsu et al. (2021) argue that design culture is essential for meaningfully transforming an organization through human-centered design, co-creation, and, more generally, increasing public sector value (see O'Flynn 2007).
Taking a deep position within the work of practitioners as a means to generate theory is the theme of Elisabete Ferrarezi, Isabella Brandalise, and Joselene Lemos's paper "Evaluating experimentation in the public sector: learning from a Brazilian innovation lab." The starting point of this paper is that practitioners and researchers alike question whether the impact of policy labs meets expectations. The focus here is on the changing political environment, which necessitated the evaluation of GNova, the Brazilian federal policy lab; the findings of this paper provide a framework for evaluating public sector innovation (PSI) labs. According to this framework, the link between theory and practice is crucial; therefore, there is a practical need to clearly articulate the values, purpose, and definition of innovation. As in Komatsu et al.'s paper, the workshops and interviews discussed here highlight the importance of creating a public sector.
All the papers in the special issue highlight the hurdles that policy labs face in meeting the common expectation that they will provide innovative, implementable policy solutions. In addition, they offer practical recommendations both for planning and designing a lab and for reviewing and evaluating a lab's impact.

Future research directions
Both Evans and Cheng (2021) and Olejniczak et al. (2020) suggest that policy labs need to be better understood within a more extensive policy work ecosystem. Labs should not be seen as an alternative to traditional practices, but instead as a promising addition. Nearly two decades ago, Mayer, Van Daalen, and Bots (2004) developed a framework that accounted for the complexities of policy analysis and that includes many of the innovative contributions made by the design community (Figure 1). Beyond the rationalist style, Mayer, Van Daalen, and Bots (2004) pointed out five other styles that define contemporary policy analysis: argumentative, client advice, participatory, process, and interactive. Recently, De Smedt and Borch (2021) applied these styles to develop a narrative framework for policy design for sustainable transitions.
Despite the prevailing criticism in the policy lab literature that hierarchical, bureaucratic structures stifle policy work in government agencies, the evidence may suggest otherwise. Several earlier empirical policy work studies demonstrate that policy work is quite dynamic and incorporates the complexity of tasks outlined in Figure 1 (see Veselý 2017; Carson and Wellstead 2015; Evans and Sapeha 2015; Howlett and Wellstead 2011). A notable exception is Timeus and Gascó's (2018) study of policy impact labs' contributions to local government innovation capacity, which suggests that they do improve innovation capacity by contributing to aspects such as idea generation and knowledge management. At the same time, this study also acknowledges that labs' isolation from the public organizations they advise limits their overall impact, and raises questions about innovation sustainability.
This special issue shifts attention to policy lab practitioners and the practice of policy labs. Moreover, we acknowledge that the expected influence of policy labs is inherently "bottom-up," and that policy labs serve as a "technology" or "instrument" to improve policy-making processes. A promising avenue of research is conceptualizing the policy lab as a source of innovation diffusion (Berry and Berry 2018). Similarly, policy labs may be considered a type of "street-level policy entrepreneurship" (SLPE). Specifically, SLPEs "seek to develop or adopt policy innovations intended to improve the implementation processes they prosecute and to entrench these innovations in the day-to-day activities of bureaucratic peers" (Arnold 2015, p. 3). SLPEs often use various strategies to influence the policy agenda, linking design and implementation (Gofen and Golan 2021; Gofen and Lotta 2021; Lavee and Cohen 2019). Street-level bureaucrats (SLBs) are often associated with low- and middle-level government officials. However, many of the challenges facing SLBs are similar to the challenges faced by those working in and leading policy labs.
An additional avenue of future research is applying more rigorous empirical methods when studying PILs. Most of the empirical studies in Appendix 1 were primarily descriptive. The exceptions are the two surveys by Tõnurist, Kattel, and Lember (2017) and McGann et al. (2018a), which had small sample sizes, making it difficult to make any statistical inferences. One alternative would be to change the unit of analysis from the organization (the lab) to the projects or the individual lab workers involved. The difficulty would be developing a population list from such a disparate group of individuals scattered across the globe.
From a methodological perspective, there are some qualitative methods that future researchers could draw from, such as process tracing (Kay and Baker 2015) or qualitative comparative analysis (QCA) (Rihoux, Rezsöhazy, and Bol 2011). However, to apply any of these methods, clearer dependent variables or outcomes would have to be established. The current literature rarely suggests how to measure the impact of a policy lab. However, we anticipate that research will shift from a focus on the internal dynamics of PILs to considering their broader social implications. Understanding PILs as social problem solvers and as a governing technique will lead to more promising research.

Note
1. The search term was based on the terms in Hinrichs-Krapels et al. (2020) and Whicher (2021); they included "public innovation lab" OR "public sector innovation lab" OR "government innovation lab" OR "organizational innovation lab" OR "policy innovation lab" OR "innovation lab" OR "public policy lab" OR "social innovation lab" OR "systems change lab" OR "policy lab."

Disclosure statement
No potential conflict of interest was reported by the author(s).