The need for robust critique of research on social and health impacts of the arts

ABSTRACT
This paper outlines the growth of interest in the UK in the social and health impacts of the arts from the late 1990s onwards. It highlights the early critiques of claims made about such impacts by Belfiore and Mirza (Mirza, 2006a). Attention is given to two recent commissioned reviews of arts and health research, by the World Health Organization (WHO) Europe, and the UK Department for Digital, Culture, Media and Sport (DCMS), which conclude that the arts can have an important role in promoting health and reducing social and health inequalities. These reports have substantial limitations, however, and the critical concerns raised by Belfiore and Mirza remain to be addressed. The paper concludes that broad scoping reviews are ill-advised as a guide for practice and policy development, and future progress should be guided by rigorous, systematic and transparent methods that ensure that review results are trustworthy. The arts and cultural engagement may have a part to play in promoting wellbeing, but whether or not they can have a substantial role in promoting population health and reducing social and health inequalities is yet to be demonstrated.


Introduction
The idea that engagement with the arts has social value and can benefit health received a substantial impetus from the programme initiated by the 1997 Labour Government to address issues of poverty and "social exclusion". The tenth Policy Action Team report on arts and sport (PAT, 1999) provided case studies of diverse community arts projects responding to local needs and circumstances. It emphasised the principle of equality of access to arts and sports and claimed that engagement can promote self-respect, self-confidence and a sense of achievement, and improve mental wellbeing. The focus on the social value of the arts prompted the then Health Education Authority to commission a review of community arts and health projects across the country (Meyrick, 2000). A survey of 90 community arts projects and 15 detailed case studies illustrated the potential of the arts in promoting health. Stress was placed on the quality of artistic work produced, with inspired creative leadership from artists. Even in the absence of research, the report identified what have subsequently emerged as the key benefits of participation in the arts for health and wellbeing: improved personal skills, friendship, community involvement and opportunities to celebrate what is created (Clift & Camic, 2015).
In response to the Labour policy initiatives, critical perspectives appeared in a book entitled "Culture Vultures", which questioned these developments (Mirza, 2006a). Belfiore (2006), in this volume, noted that several reviews on the social impacts of the arts had been completed (e.g. Jermyn, 2001; Reeves, 2002; Ruiz, 2004), but cautioned that: … the quality of the evidence on the social impacts of the arts is generally poor, and that the evaluation methodologies are still unsatisfactory. (p. 29) Since then, developments in the field have produced a larger body of evidence, and research has become more robust, with a growing number of systematic reviews. Evaluations have improved, but Belfiore offers further observations that continue to have real relevance in assessing the current evidence base on the social and health impacts of culture and the arts. These include:
• Any form of participatory activity could have "an empowering effect, whether arts-based or not".
• Existing reviews have ignored details which suggest "negative" impacts from arts and cultural engagement. Lessons from experiences of "culture-led regeneration" suggest that "the arts can actually be socially divisive".
• Little attention is given to whether cultural and arts initiatives "provide the most cost-effective means to tackling social exclusion, health problems" compared with "established practices within social and health services".
• Little attention is given to longer-term outcomes as opposed to short-term effects.
• Little attention is given to the artistic or aesthetic quality of cultural and arts engagements in assessing outcomes.
• A focus on the role of the arts and culture may serve "as a convenient means to divert attention from the real causes of today's social problems and the tough solutions that might be needed to solve them".
Mirza (2006b) extends Belfiore's critical stance to the notion that the arts and culture are "good for health". In her view: … claims that the 'arts are good for health' are so vague and inconsistent that they are at best just common sense, or at worse (sic), misleading. (p. 96) Her concerns echo those of Belfiore and include:
• An activity such as listening to music may help to relieve patient anxiety during medical treatments, but this is a function of patient preferences and other forms of non-arts distraction may be equally effective.
• Arts in healthcare environments may have benefits for staff wellbeing and job satisfaction, but such efforts may be "a distraction from other pressing concerns" that affect staff morale.
• Arts in community projects risk conceptualising "social as individual, psychological problems that require therapy" rather than addressing root causes (such as poor housing or lack of employment).
• Little attention has been given to the artistic quality and integrity of "the arts" being used in health initiatives.

Developments in arts and health since 2006
Notwithstanding the critiques mounted by Belfiore and Mirza, development of initiatives and research in arts, social issues and health continued apace in the UK from 2006 onwards. Landmarks in the field have been: the Cayton (2007) report; the Arts Council/Department of Health (2007) Prospectus for Arts and Health; the appearance of two academic journals in 2009 (Clift et al., 2009); the ESRC seminar series on arts, health and wellbeing (Stickley et al., 2017); the Oxford Public Health Textbook on Creative Arts, Health and Wellbeing (Clift & Camic, 2015); the All Party Parliamentary Group for Arts, Health and Wellbeing inquiry (APPG, 2017); the establishment of the Royal Society of Public Health Special Interest Group for Arts, Health and Wellbeing (Stickley et al., 2017); and the creation of support networks for practitioners and researchers (e.g. the national Culture, Health and Wellbeing Alliance, https://www.culturehealthandwellbeing.org.uk/, the Arts Health Early Career Research network, https://www.artshealthecrn.com/, and the MARCH project, see https://www.marchnetwork.org/). A further development of national importance was the launch in 2018 of the Social Prescribing Initiative in England, with "arts on prescription" as one of three strands of community referral (the other two being exercise and engagement with nature) (NHS England, 2020, https://www.england.nhs.uk/personalisedcare/social-prescribing/). A plethora of reviews of research and evaluation on the social and health impacts of the arts have been published in the UK from 2010 onwards (e.g. Bungay et al., 2014; Carnwath & Brown, 2014; Ings & McMahon, 2018; McLean et al., 2011; Mowlah et al., 2014; RSPH, 2013), all providing positive assessments of the growing evidence base. There has also been an increase in systematic reviews on arts and social and health outcomes (e.g. Daykin et al., 2016; Mansfield et al., 2017; Tomlinson et al., 2018). These offer cautious assessments of the evidence based on rigorous inclusion criteria, quality screening and data extraction. The conclusion given by Glew et al. (2020) from their systematic review of 13 studies on the wellbeing and psychosocial outcomes of group singing for children is typical: Conclusions about the effectiveness of group singing could not be drawn from quantitative studies, which were of low quality. Qualitative synthesis indicates group singing may support young people's wellbeing through mechanisms of 'social connectedness' and confidence. Current conclusions are limited, and additional, high quality qualitative and quantitative research is required to build on these findings. Further careful study may support the development and funding of group singing projects. (p. 1)
Finally, and most recently, a major international review of evidence on arts and health has been published by the World Health Organization (WHO) Europe (Fancourt & Finn, 2019). This report has been influential in the development of further activities by WHO (2019). A second report commissioned by the UK Department for Digital, Culture, Media and Sport (DCMS) (Fancourt et al., 2020) extends the WHO review by adding an assessment of evidence quality for policy development.
None of the reviews referred to above cites the early critiques of Belfiore (2006) and Mirza (2006b), and over the last 15 years, there has been a lack of the kind of sceptical analysis they offered. There are, however, signs of a recent resurgence of critical questions about arts and health. The APPG report has been criticised by Phillips (2019) for its lack of ideological clarity and weak treatment of the published evidence base. Clift (2020) has critically reviewed the WHO (2019) report. And Yoeli et al. (2020) question the notion of "arts as treatment". Researchers outside the field of arts and health have raised concerns about the harm that the arts, music, and culture can do. Gross and Musgrave (2020) examine the way in which the music industry can make professional musicians sick. Brook et al. (2020) have explored the consequences of employment conditions and inequalities in the cultural and creative industries, arguing that "culture is bad for you". And Pritchard et al. (2020), in a discussion of "artwashing" in contexts of "gentrification and social cleansing", show how the arts can provide "the perfect foil for the vengeful ideology of neoliberalism - a smokescreen for dispossession and displacement" (p. 179).

Aims
The present paper focuses on the claims in the recent WHO and DCMS reviews (Fancourt et al., 2020; Fancourt & Finn, 2019) that the evidence base for the role of the arts in social and health outcomes is strong, and a good guide for policy development. These reviews will be scrutinised by looking at selected examples of original research studies surveyed from specific sections in both reports. Serious reservations about all the studies considered will be raised. For the DCMS report, attention is also given to the methodology employed to grade the quality of evidence reviewed (FORM, Hillier et al., 2011), and the legitimacy of the claims made about the strength and usefulness of the existing evidence base for policy development will be questioned. In the discussion section of this paper, the issues emerging from examination of the research evidence included in these reviews will be discussed in relation to the earlier critiques mounted by Belfiore (2006) and Mirza (2006b).

Method
Two sections of the WHO report (on health inequalities and frailty) and one in the DCMS report (on social inclusion) were chosen for careful reading. Three tables were constructed for the sources cited in each of these sections with columns for lead author, date, country/ies covered, aims, design (where an empirical paper), participants, art form(s), and outcomes. A final column indicates whether the source is specifically mentioned in this paper, with some additional notes. All three tables plus an explanatory introduction can be found at: www.colouringinculture.org.uk. In the present paper, textual summaries of each section and examples of specific research reports cited are given, together with critical commentaries. It is recognised that this approach considers a sample of the sources referenced in the two reports and that conclusions drawn below may not generalise to the reports overall. However, it is not unreasonable to assume that the samples of studies considered, and the way they are presented, are a fair representation of the methods adopted in the WHO and DCMS reports, and of their limitations.

WHO report: what is the evidence on the role of the arts in improving health and wellbeing?
The publication of the WHO review of arts and health research (Fancourt & Finn, 2019) was announced in a press release from University College London with the headline: "Arts 'crucial' to reducing poor health and inequality". The release quotes the lead author as saying that the report: … highlights that engagement with the arts can affect the social determinants of health, improving social cohesion and reducing social inequalities and inequities. Crucially, the arts can support the prevention of illness and promotion of good health.
The authors of the report set out to show that these claims are supported by the existing body of evidence. Their broader ambition was to close what they see as an "awareness gap" between the research evidence and those professionals involved in healthcare policy development and practice, to promote "knowledge and technology transfer" of evidence into service provision. The intended audiences for the report include policy makers in central and local government; commissioners of health and care services; funders of research; managers of arts and cultural organisations, and institutions involved in the training of health and arts professionals. Policy makers, the authors suggest, should consider "supporting the implementation of arts interventions where a substantial evidence base exists".
In Section 2 of the report, Figure 2 lays out the many contributions the arts can make to prevention and health promotion, and to the management and treatment of health conditions. The report is very wide-ranging and brings together a considerable body of evidence. The authors describe it as "the most comprehensive survey of the literature on arts and health to date". In Section 3, they say that the report presents the conclusions drawn from over 200 "systematic reviews, quantitative meta-analyses and qualitative metasyntheses" and over 700 individual studies published between 2000 and 2019.
Nevertheless, the authors acknowledge that the report has limitations. They point out that a "detailed discussion of the strengths and limitations of different methodological approaches or individual studies" was not possible due to the requirements of Health Evidence Network reports. The authors also highlight "gaps and challenges" in the arts and health research literature. The first is "an inherent publication bias in the literature towards positive findings". The second is a need for research to give greater attention to "effect sizes" in addition to the statistical significance of the impacts of arts activities on health outcomes.
The authors suggest that readers should consult "discussions within specific studies or the reviews cited here" to make their own assessments of the quality of the evidence cited, and this is what we will do now, focusing on examples of research presented in a positive light within two sections of the review.

Social determinants of health: health inequalities
In this section of the WHO report, claims are made that the arts can help reduce "social inequalities and inequities" (pp. 10-12) and 15 sources are cited. Of these, one is a systematic review, two report randomised controlled trials and three are qualitative studies, with the remaining sources being discussion papers, economic reports, a website, and a book. It is remarkable, given the central importance of the role of social determinants in generating inequalities in health, and the attention this has been given globally by WHO, that the literature cited here is so limited in scope. It might be expected that much more could be said on the role of the arts in addressing this central challenge in public health. Be that as it may, attention will be given here to the systematic review and the two RCTs.
The systematic review (Cain et al., 2016) looked at outcomes from "participatory music programs" operating in what are described as "culturally and linguistically diverse (CALD) and at-risk communities". Six small-scale studies from Australia and the United States are considered, one of which involved young people living in detention. Fancourt & Finn report a range of health outcomes from participatory music identified by the authors of the studies included in the review (e.g. music reduced anxiety, depression, truancy, aggression and increased empathy, confidence, and healthy nutrition). However, they give no indication that Cain et al. (2016) emphasise substantial limitations which do not allow well-substantiated generalisations to be drawn from the studies they review. The main problems Cain et al. stress are:
• The definitions employed of "at-risk" young people varied widely, with few characteristics common to the studies.
• A "pronounced lack of definitions of common factors such as 'depression,' 'mental health,' 'poverty,' (…) making effective comparisons impossible".
• Only two studies involved any form of comparison group, making it "difficult to establish how the program may have impacted participant outcomes".
• Five of the studies reporting qualitative evaluation did not provide a clear account of how the data were analysed.
• All studies were short-term, so the long-term outcomes of participatory music programmes were not established.
The two RCTs are also concerned with the outcomes of "musical" interventions for children. The first is a small randomised controlled trial (RCT) of music therapy from South Korea (Kim, 2017). This evaluated a twelve-week programme of music-making for children aged 7-11 experiencing poverty and on-going "maltreatment". Only 26 children took part, and the author reports that the "effect sizes" on the outcome measures employed were very small. Kim also stresses that variability among participants was substantial, and the results could not be generalised. But these considerations are hardly the most significant concerns raised by this study. More important is the fact that music therapy is used to help support young children living in domestic circumstances highly perilous to their wellbeing. The author gives graphic examples: … two boys were victims of chronic domestic violence. They were often restless and prone to emotional outbursts and aggression towards younger and weaker children in the group (…) Two boys missed out group music therapy sessions from time to time (…) later the therapist was informed that the boys were badly beaten at home the previous day. (pp. 74-75) So, at best, music therapy was ameliorative, to a small degree, for individual children, but at worst it failed to address questions of child protection, and simply ignored tackling wider issues of poverty and social mores normalising physical violence towards children. All that Fancourt & Finn say of this study, however, is that "group music can help to prevent the development of depression, anxiety, attention problems and withdrawal", without any reference to the dire circumstances surrounding the children involved.
The Korean study is followed by an RCT of the National System of Youth and Children's Orchestras in Venezuela, El Sistema (Aleman et al., 2017). This nationwide initiative aims to enhance educational opportunities for disadvantaged children through music. The trial took place in 16 music centres in five regions of the country and involved almost 3000 children aged 6-14. The parents making an application to these centres were randomised to an offer of admission in September 2012 or a year later in September 2013. Of the participants, 16.7% were living below a household poverty level of US$4 a day, compared with 46.5% in the regions covered. It is clear, therefore, that poorer children were under-represented, and so far from addressing social inequalities, the work of the centres served to reinforce them, entirely contrary to the idea of an intervention designed to reduce social and health inequalities. Children taking part in the musical programme showed some improvement in "self-control" and some reduction in "behavioural difficulties", but the reported statistical p-value of 10% is hardly impressive.
Apart from the questionable outcomes from the Korean and Venezuelan studies, these two studies could scarcely be more different in the nature of the "music" involved. In the Korean example, children met weekly to improvise on instruments and to sing, but in the El Sistema programme, young people engaged in serious musical training, to equip them to play orchestral music.

Prevention of ill-health: frailty
In this section of the WHO report, Fancourt and Finn cite 24 sources, most of which focus on dance, dance therapy or dance-related exercise. The studies are drawn from many different countries, and most involve women. Those from Korea, Turkey, Thailand, and China make use of traditional or folk dance (see the supplementary tables for links to illustrative videos). A specific and important health focus of some studies is the idea that dance can help to prevent falls in older people, and the evidence, they say, is mixed: "dance may help to prevent falls, particularly in populations with existing health conditions, although other studies have not found benefits" (p. 25). The sources cited in support of this statement include two systematic reviews and two randomised controlled trials. A close reading of these sources, however, reveals a range of problems which cast doubt on the view that "dance" can reduce falls.
In the earlier of the systematic reviews (Fernández-Argüelles et al., 2015), none of the seven RCTs considered incidence of falls, but focused on risk factors for falling (such as balance, gait and strength). And while some studies did show some beneficial changes on such variables, the authors highlight the heterogeneity and limitations of the studies considered and conclude that the evidence does "not enable us to confirm that dance has significant benefits" in relation to falls risk factors.
A second systematic review (Veronese et al., 2017) focused on six different RCTs. Of these, four directly assessed the effects of dance on falls while three assessed "fear of falls". Within the former group, only one study (da Silva Borges et al., 2014) reports that ballroom dancing, undertaken three times a week over twelve weeks, in a care home setting, resulted in a significant reduction in falls. The "effect size" of 2.67 (Cohen's d) reported in favour of dancing is spectacular, but unfortunately there is a major flaw in the paper. Figure 2, which the text states gives the falls data, reproduces, in error, the results for a postural balance measure already given in Figure 1. Unfortunately, Veronese et al. fail to mention this major flaw in the da Silva Borges et al. paper, but the paper is rightly excluded from a subsequent Cochrane review (Cameron et al., 2018) for precisely this reason. The remaining three studies in the Veronese review did not show that a dance intervention compared with the control condition reduced fear of falls.
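To give a sense of why a standardised mean difference of this size is so striking, the short sketch below converts Cohen's d into two more intuitive quantities. It is an illustrative calculation only and is not drawn from any of the studies discussed: it assumes normally distributed outcomes with equal variance in the two groups, and the function names are hypothetical. The comparison values of 0.2, 0.5 and 0.8 are Cohen's conventional benchmarks for small, medium and large effects.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_superiority(d: float) -> float:
    """Chance that a randomly chosen intervention participant does better than
    a randomly chosen control participant (assumes normality, equal variance)."""
    return normal_cdf(d / math.sqrt(2.0))


def share_beyond_control_mean(d: float) -> float:
    """Proportion of the intervention group doing better than the control
    group's average (same assumptions as above)."""
    return normal_cdf(d)


if __name__ == "__main__":
    # 0.2, 0.5, 0.8: conventional benchmarks; 2.67: the value reported
    # for the da Silva Borges et al. (2014) study.
    for d in (0.2, 0.5, 0.8, 2.67):
        print(
            f"d = {d:4.2f}  "
            f"probability of superiority = {probability_of_superiority(d):.3f}  "
            f"share beyond control mean = {share_beyond_control_mean(d):.3f}"
        )
```

Under these assumptions, a d of 2.67 implies a probability of about 0.97 that a randomly chosen dancer fares better than a randomly chosen control participant, compared with about 0.71 for a conventionally "large" effect of 0.8. An effect of that magnitude from a twelve-week dance programme would be highly unusual, which is why the error in the source paper matters.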
In addition to the systematic reviews, two specific RCTs on falls are cited in the section on frailty. The first is a large-scale cluster randomised trial on weekly "social dancing" conducted in retirement villages in Australia over twelve months (Merom et al., 2016). This study showed no differences in falls between the dance and the waiting control arms of the trial. Indeed, post-hoc comparisons suggested that for participants with a history of falls, falling was more prevalent in the dance group, suggesting that dance, far from reducing falls, may in some cases enhance risks.
The second RCT, conducted in Canada, assessed the role of "rhythmic auditory stimulation" (RAS) in reducing falls for people with Parkinson's (Thaut et al., 2019). Fancourt and Finn note that RAS involves the use of music to provide rhythmic cues, and that this "is a core feature of dance". Nevertheless, RAS itself cannot be regarded as "dancing", which in most of the studies considered involved ballroom dancing, or traditional forms of folk dancing in a group setting. In the Thaut et al. study, in contrast, individual Parkinson's patients followed a daily programme of 30 minutes of structured walking in their own homes over 24 weeks, paced by "metronome click-embedded music". This activity was found to be very effective, and if the findings are confirmed in further studies, there might then be a strong case for wider implementation of the RAS intervention with Parkinson's patients. But to repeat, RAS is not dance as widely understood.

DCMS evidence summary for policy: the role of the arts in improving health & wellbeing
Fancourt et al. (2020) explain that this document is the result of a commission from the Department for Digital, Culture, Media and Sport (DCMS). The DCMS report draws upon the WHO report, and reviews research evidence linked to "three DCMS policy-relevant" areas: (i) social outcomes, (ii) youth development, and (iii) the prevention of mental and physical illness. In addition, evidence under each policy-relevant heading is graded according to study type and quality. This is done using a modified version of the Australian "Formulating Recommendations Matrix" (FORM) (Hillier et al., 2011), which judges health evidence in terms of quality and risk of bias, consistency, clinical impact, generalisability and applicability, as a basis for developing guidelines for clinical interventions.
There are several problems with the use made of this framework. Firstly, Fancourt et al. simply modify the framework by removing references to clinical issues and the Australian healthcare context and substitute the intention of "grading recommendations in evidence-based clinical guidelines" with the vague notion of relevance to policy. There was no attempt to consult on the amendments made or to check on the validity of employing the modified framework to arrive at recommendations for policy.
Secondly, the Australian approach explicitly relies upon the identification of specific clinical issues (Hillier et al. give the example of clinical practice guidelines for the management of melanoma) and relevant systematic reviews of the evidence. In Fancourt et al.'s use of this approach, no specific health or social issues are identified (other than DCMS defined policy-relevant areas), and the evidence they consider is not subject to systematic review. Rather they organise the studies they consider according to a "hierarchy of evidence" in which systematic reviews and randomised controlled trials are assumed to be the "gold-standard" for evidence-based practice.
Furthermore, it is unclear how the additional criteria for judging "a body of evidence" were applied, as no detail is given of their methods for arriving at overall gradings for strength, consistency, impact, generalisability and applicability (a key issue that is not addressed, for example, is whether two reviewers made independent assessments of the studies included). In relation to impact, for example, considerable store is placed on the "effect sizes" demonstrated by research. Here is the statement related to "excellent" potential impact from the Modified FORM Grading Tables given in Appendix 2 of the DCMS report: The intervention has the potential to have a large effect on the outcome as demonstrated by large statistical effect sizes or the ability to have a substantial impact on the outcome. (p. 18) However, nowhere in the body of the review are details given of the "effect sizes" reported by individual studies and, more worrying still, it is not difficult to find examples where A ratings are given for a "body of evidence" where individual studies explicitly report only "small" effects.
The same concern is raised over how judgements were made on the "applicability" of the body of evidence reviewed. All that is said in the Fancourt et al. version of FORM is that a grade of A is given where the evidence is "Directly applicable to the relevant intervention context", but this is excessively vague. In the original version of FORM, applicability is judged explicitly in relation to "the Australian healthcare context". This is undoubtedly important, as findings from clinical research in other countries may not be meaningfully and practically applied in Australia. For the purposes of the DCMS review, surely the question needs to be asked whether the findings from the studies reviewed, which could have been undertaken anywhere in the world, can be meaningfully applied in the process of policy development in the UK? This question is simply not addressed. (For a recent example of a more transparent rapid assessment of evidence on the role of the arts in health promotion in the Australian context, see Davies & Pescud, 2020.)

Evidence on arts and "social inclusion" presented in the DCMS report
Turning now to the content of the report, research work on "Arts engagement and health outcomes" is presented under the three headings noted earlier: "social outcomes", "youth development" and "prevention of mental and physical ill health". For each of these areas, there are sub-divisions. Under social outcomes, for example, three issues are addressed: "social development in infants and young children", "social cohesion" and "social inequalities". For each of these areas, "a body of evidence" is categorised according to study method: RCTs (including systematic reviews), quantitative studies and qualitative studies. After outlining the evidence, a summary grading of evidence quality is given using the modified FORM framework.
Space does not permit a detailed analysis of all sections of the DCMS report, and one example must suffice. The section on "social cohesion" (pp. 6-7) will be taken for this purpose. The authors explain that: "Social cohesion refers to the willingness of people within society to cooperate with and support one another" (p. 6). They cite 21 papers, including reviews and original studies, in this section, and they arrive at the following conclusion, using their version of the FORM framework: The evidence base on arts and aspects of social cohesion such as social interactions, behaviours and loneliness is strong (A), consistent (A), applicable (A), and has a potentially large impact (A). This provides an overall grade of recommendation of A, suggesting the evidence base on arts and social cohesion is strong and can be trusted to guide policy development. (pp. 6-7) Several concerns emerge following close examination of the sources cited in this section. What is clear is that the "body of evidence" on "social inclusion" consists of varied and unrelated studies. These come from many different national contexts, address a wide range of different activities, with a diverse array of participants, and multiple outcomes. The attempt to synthesise these studies to arrive at general policy recommendations is surely misconceived, and it is difficult to see how "A grades" for strength of evidence, consistency, generalisability, applicability and impact are warranted. Some critical observations can be given to justify this general assessment.
Firstly, there are papers in this "body of evidence" on "social inclusion" which should not be there. One example is an experimental laboratory study (Greitemeyer & Schwab, 2014) of the effects of ten-minute "exposure" to recorded songs, described as "pro-social" or "neutral", on anti-immigrant prejudice among Austrian university students. Scrutiny of the results shows that the students expressed little or no such prejudice, and the difference in response to the two kinds of songs was trivial. Also, a protocol paper for the evaluation of an Australian dance programme for people with dementia contains no empirical findings (Skinner et al., 2018) and has no place in an evidence review. Similarly, a study on a creative arts programme for ill children in the garden of a Canadian hospital (Smart et al., 2018) reports adults' perceptions of effective strategies for engaging children and offers neither accounts from the children themselves nor data on the effects of the programme.
Turning to the systematic reviews of RCTs included, the first two concern the social effects of reading literature for children and young people (Dodell-Feder & Tamir, 2018; Montgomery & Maunders, 2015). Both report only "small" or "small to moderate" effects on the social outcome measures considered. In the Montgomery and Maunders review, over half of the assessments of "bias" in the RCTs included are high or "unclear". In addition, only three out of the eight RCTs included assessed "prosocial" outcomes, and the authors note that none of the measures employed had been "extensively, externally validated". The third systematic review (Poscia et al., 2018) is also of limited interest as only three of the studies included are concerned with creative arts or music activities. The single quantitative evaluation of group singing included (Davidson et al., 2014) found no significant effects on the measures employed. Consequently, the evidence from the reviews and RCTs presented in this section of the DCMS report scarcely justifies an A rating for "impact" given that the FORM table in Appendix 2 clearly specifies "large statistical effect sizes".
Further quantitative studies are also limited in their scope. Three focus on group singing, but one examines the effects of 30 minutes of singing on mood and social bonding (Kreutz, 2014), and a second of one hour of singing on mood and biomarkers for stress and immune function (Fancourt et al., 2016). The results are interesting but scarcely address wider issues of social exclusion. A third monitors participants in singing and creative arts groups over several weeks and shows that singing appears to produce more rapid social bonding, but the outcomes from both kinds of activity are identical over the whole period of the study (Pearce et al., 2015).
Many of the qualitative studies included are fascinating, and the DCMS review does not do them justice; but they come from very different parts of the world, each has a very specific focus, and some involve very small numbers of participants. Consequently, it is difficult to see how they contribute to an A rating for "generalisability" and "applicability" for policy development in the UK. What use could DCMS make, for example, of a US study of the impact of a theatre production on police and ex-offender attitudes towards one another based on the views of 10 participants (Smigelsky et al., 2016)? Or an Australian study of a "verbatim theatre" play on audience understandings of domestic violence assessed in a small sample of audience members through an online survey (Madsen, 2018)?
Three further studies from Canada illustrate the same difficulties regarding generalisability to the UK context. How, for example, can generalisations be drawn from a study using theatre to address inter-generational understandings based on the views of 15 older people and 17 university students (Moody & Phinney, 2012)? Similarly, can generalisations be drawn from a study of the role of circus skills workshops for "street youth" provided by the world-famous Cirque du Soleil in Quebec (Spiegel & Parent, 2018) in which an unspecified number of young people gave feedback? Or equally, from a study of a multi-arts programme for Indigenous youth in the Northwest Territories to explore challenging issues facing their community, in which only four young people and five facilitators gave their views (Fanian et al., 2015)?
But beyond problems with the empirical evidence, wider questions are raised about the kinds of policies that might be formulated for the UK from such a disparate evidence base: What social or health issues might policy address and in what context or setting? What would the role of artists be and what kinds of artistic practice would be involved? What constituencies of participants might be addressed in terms of class, ethnicity, gender, sexuality, disability, and age group? And finally, what would the anticipated social and health outcomes be and how would they be assessed? Nothing in this report provides any guidance to policy makers in DCMS in response to such specific questions.

Discussion
What we see in the WHO and DCMS reviews is the application of medical-model standards of evidence in the field of arts and health. Despite the acknowledgment of the role of qualitative research designs and the valuable insights which come from the narrative testimonies of participants in creative activities, it is the outcomes of RCTs, and the findings from the quantitative assessment of key outcome variables, that are seen as providing robust evidence on the effectiveness of arts for social and health impact "interventions". At the very top of this hierarchy are not just RCTs, but the findings from sophisticated processes of amalgamating the findings from a series of such studies through processes of systematic review and the arcane techniques of meta-analysis.
In addition to considering "bodies of evidence" on arts, social issues and health, these reports argue that such evidence can provide a dependable guide to formulating policy to drive the wider dissemination and scaling-up of arts and creative initiatives in the interests of promoting social and health benefits.
Ironically, however, both reviews, when carefully examined, turn out to undermine the very case they make for a strong evidence base to justify policy development. Reference to the original papers cited shows that such "strong" evidence is simply lacking.
It is useful, finally, to look back to the more radical concerns expressed by Belfiore and Mirza in 2006 on the dangers of an over-instrumentalised view of the arts, and the limitations they saw in research conducted up to that time. Many of these limitations continue to beset the studies reviewed in the sections of the WHO and DCMS reports considered in this paper.
There is a recurrent over-inclusiveness in the use of the terms "arts", "music" and "dance" in these reports, with the danger of reification, and of losing sight of the specificities of culturally contextualised practice. This is especially clear in the pairing of two RCTs, from Korea and Venezuela, as examples of "musical" interventions when they involve such different forms of music-making for different purposes. It is also clear in the inclusion of studies concerned with "rhythmic auditory cueing" in reviewing the role that dance might play in promoting positive social and health outcomes.
The repeated references to "intervention" are also problematic as this serves to efface issues of active choice and engagement as key factors in the outcomes of any activity. If people choose to engage in activities they enjoy and value, and they show sustained commitment, is it any surprise that they will gain benefits, especially where, in the case of dance, there is a substantial component of physical exercise?
It is also the case that most of the studies examined above involve short-term programmes, with evaluation immediately on completion of the planned activity, and limited follow-up. In much of the arts for social or health impact research, in fact, there is virtually no indication that programmes were scaled up to reach more people on a sustainable basis. It may have been demonstrated, for example, that weekly dance over several months had a measurable impact on aspects of balance or muscle strength, but all the evidence on the role of exercise in health shows the importance of regular, sustained activity at moderate levels, day by day, month by month, year after year.
In this respect, it is essential to consider what kinds of interventions other than arts or creative activities might be more effective, and less costly, in achieving the same outcomes. It is instructive to consult a recent Cochrane review on "falls prevention" (Sherrington et al., 2019) which considers multiple forms of intervention. Only a small number of the trials considered were concerned with dance, and the authors conclude: "We are uncertain of the effects of … dance … on the rate of falls and the number of people who experience falls" (p. 2).
Issues of the importance of aesthetics and quality of arts engagement are also raised by Belfiore and Mirza. It is striking that the reports focused on in this paper give scant regard to aesthetic criteria, the role of professional creative artists, and the development of artistic identities among participants. No attention is given to the characteristics of practice and participation that define arts and cultural activities, as distinct from other forms of social engagement.
Finally, the reports illustrate a continued danger in the arts and health field of "psychologising" social and health issues and failing to see the larger public health picture with the central role played by underlying economic and social-structural causes of inequalities (Marmot et al., 2010). The assumption appears to be that developing creative skills and widening cultural engagement among participants from disadvantaged or marginalised sections of society will challenge the structural forces which generate social and health inequalities and inequities. But the idea that the arts can make a substantial difference in addressing such issues on a national scale, or across the WHO European region (WHO, 2019), is implausible. We should remember that despite the existence in the UK since the 1940s of a National Health Service, universal secondary education, and a benefits system to support people in financial need, British society continues to be driven by economic and health inequalities. These have increased in the last ten years (Marmot, Allen, Boyce, et al., 2020), and have been further exacerbated by the Covid-19 pandemic.

Conclusion
In scrutinising the WHO and DCMS reports, this paper has looked at the "bodies of evidence" presented with respect to three social and health issues: inequalities, inclusion, and frailty. It is notable how small and limited these bodies of evidence are. What the WHO review shows for the topic of "inequality" is that quantitative evidence on the putative effects of arts engagement, identified from a systematic search of the global peer-reviewed research literature published from 2000 onwards, amounts to one systematic review, two controlled trials, and three qualitative studies, none of which was conducted in the UK. Similarly, for "frailty", research on dance and falls prevention is limited, and no studies conducted in the UK were identified. Finally, in looking at the issue of "social inclusion" in the DCMS report, the paucity of research evidence is again striking, with most of it coming from outside the UK. The use and relevance of this body of literature for formulating policy in the UK, on the role of the arts for social and health benefits, must surely be in question.
The main conclusion to be drawn from this paper is that the wide-ranging, uncritical scoping reviews of arts and health research undertaken for the WHO and DCMS are misleading. The sections of these reports considered here do not show that a substantial, robust evidence base exists to support arguments that arts engagement can improve health and reduce social and health inequalities. And certainly, it is premature to suggest, as the WHO and DCMS reports do, that the evidence on arts and health provides a secure foundation on which to develop social and health policy. In moving research and practice forward, the field must rely on rigorous systematic reviews involving careful quality assessment of both quantitative and qualitative studies. Such rich, grounded reviews will provide nuanced conclusions which recognise the complexities of cultural context, research designs and methods, participant involvement, the role of artists and the artistic process, and finally, the nature and seriousness of the social and health issues addressed. Arts and culture can play an important part in promoting individual and community wellbeing, but the evidence does not currently show that they are "crucial" in meeting the challenge of promoting health and reducing social and health inequalities …

Notes on contributors
Stephen Clift is Professor Emeritus, Canterbury Christ Church University, and former Director of the Sidney De Haan Research Centre for Arts and Health. He is a Professorial Fellow of the Royal Society for Public Health (RSPH) and is also Visiting Professor in the International Centre for Community Music, York St John University. Stephen was one of the founding editors of the journal Arts & Health: An international journal for research, policy and practice. He was the founding Chair of the RSPH Special Interest Group for Arts, Health and Wellbeing, and a founding trustee of Arts Enterprise with a Social Purpose (AESOP). Currently, he is working on a special collection of critical papers on arts and health with Frontiers in Psychology, and a special issue of the International Journal for Community Music on the impacts of the COVID-19 pandemic.
Kate Phillips is currently a lecturer, supervisor and placement co-coordinator on the art psychotherapy Master's programme at Goldsmiths, University of London. Kate's PhD research explored art-based projects for improving refugee wellbeing. Her work experience spans health, social care and humanitarian assistance in the UK and Australia. Kate currently holds the early career researcher position on the steering committee for the Royal Society Special Interest Group for Arts Health and Wellbeing.
Stephen Pritchard is currently Business Development Manager at Helix Arts. Stephen cares about our everyday cultures and the role that art and artists can play in changing people's lives, developing communities, and enacting social change. Stephen has worked as an arts professional since 2007, having spent sixteen years working as a senior manager in the fashion business and as Director at Dot to Dot Active Arts. He has worked with Arts Council England National Portfolio Organisations, Creative People and Places programmes, and Great Place projects. He is also a community artist, creative producer, academic, published writer and filmmaker. He has a strong track record of working with people considered to be "difficult to engage" and "hard to reach". Stephen integrates his strategic and tactical business experience and his academic research to look for ways to do things differently and make things happen by focusing on cultural democracy, cultural development, and community development.