A spatial analysis of the Brexit vote in the West Midlands

Recent votes for populist parties and policies have been a focus for an increasingly significant body of academic research. In the UK this has particularly focused research on the drivers of the vote to leave the European Union (Brexit) in 2016. In spite of a growing body of work on the subject, the literature investigating the applicability of spatial econometric methods is surprisingly thin. This paper applies such methods to hitherto unused data for the West Midlands region, where we have an unusually rich set of small-area results. The work finds substantial spatial autocorrelation even after demographic differences are accounted for. Whilst focusing on a particular region, the rise of populism globally gives these findings a wider salience.


INTRODUCTION
In June 2016, a majority of voters in the UK voted to leave the European Union (EU). The result was a shock to most in the UK, where a substantial majority of voters had expected a 'Remain' victory (Ashcroft, 2016). This result will have a substantial economic impact on the UK and across Europe, particularly in certain sectors, countries and regions (Chen et al., 2018; Lawless & Morgenroth, 2019), with the overwhelming preponderance of academic work suggesting a significant deleterious effect on the UK economy (Dhingra et al., 2016a, 2016b). The ramifications for domestic politics and governance will meanwhile be profound (Gamble, 2018).
However, the vote of 2016 has a wider significance when viewed through the prism of electoral movements that have been broadly labelled as populist (Essletzbichler et al., 2018; Goodwin, 2018; Kriesi & Pappas, 2015; Mudde & Kaltwasser, 2017). As a result, academics and others have taken a keen interest in the potential drivers of the vote in the UK, both in its own right and as a case of a more general phenomenon.
Many of the factors that appear to have been important in the 2016 vote in the UK are mirrored across the EU, with votes for populist and anti-EU parties driven in part by geographical factors (Dijkstra et al., 2020). More widely, evidence suggests that this is a global phenomenon, as witnessed in the United States with the election of President Donald Trump and elsewhere (Rodríguez-Pose, 2018). Intriguingly, some analyses of Brexit find systematic similarities with the regional distribution of votes for the Front National in France (Becker et al., 2017).
Are these spatial patterns solely the result of demographic factors? Using a spatial error model and highly localized data in the West Midlands, this paper finds that even after accounting for demographic differences, clear spatial patterns remain in the 2016 vote in that region. It then discusses some of the potential drivers of these patterns and hypothesizes that a 'geography of discontent' (Los et al., 2017) could also be visible at extremely local levels. Finally, avenues for future research are considered, arguing that local spatial variation should be investigated further in an international context.

LITERATURE REVIEW
There is now a substantial literature on the 2016 referendum. Much of this makes use of aggregate data, which became available earlier than most widely used survey evidence. The earliest work analysing the drivers of the vote came as part of the political science literature, most notably looking at groups that might be labelled 'left behind' by globalization (Goodwin & Heath, 2016). There is evidence that austerity contributed to the 'Leave' vote (Fetzer, 2019): the burden of government fiscal retrenchment fell disproportionately on poorer areas (Gray & Barford, 2018; Hastings et al., 2017).
Such arguments fit with the hypothesis that it is precisely the fact that certain regions 'don't matter' that has fuelled recent 'populist' voting patterns (Rodríguez-Pose, 2018). Given evidence that 'individuals with very similar characteristics vote differently in Leave- and Remain-voting areas' (Abreu & Öner, 2019, p. 2), studying and understanding the spatial distribution of the vote is important. It is this insight that motivates the current paper.
It has long been acknowledged that voting decisions exhibit a distinct spatial pattern. This is true in the United States, amongst others, where certain states and counties can be reliably counted on to vote more intensively for a particular candidate or party than others. Crucially, these are typically clustered, with evidence that this effect has grown stronger over time (Kim et al., 2003). More broadly, it is possible to use this information to improve statistical prediction of electoral outcomes (Wing & Walker, 2010), although such prediction remains necessarily imperfect. The same is true in the UK and elsewhere in Europe. There is substantial prima facie evidence of spatial clustering in the referendum results, with a Moran's I of 0.627,¹ which is statistically significantly different from zero (p < 0.001). With regard to the 2016 EU referendum in particular, very little econometric work has explicitly taken into account the spatial pattern of results, although many have used 'multilevel' models allowing for differing regional intercepts.² Beecham et al. (2018) are a notable exception in positing an explicit spatial model that allows all model parameters to vary by region. Becker et al. (2017), in contrast, offer one of the most comprehensive overviews of demographic factors and the vote, using a best-subset model selection procedure, but do not use any spatial modelling techniques. The key findings are consistent across papers: education and age are strong predictors of how an area voted, with the young and degree-educated being much more likely to vote to remain part of the EU.

METHODOLOGY AND DATA
It is this literature that motivates the current paper. In particular, the following research questions present themselves. First, to what extent are the findings of the literature applicable at different spatial scales (specifically at the local level)? Second, and arguably more importantly, is there local spatial dependence and what are the implications of this? This paper therefore contributes to the above literature by investigating the spatial distribution of the vote to leave the EU at a local level. An explicit spatial model is posited and has considerably better predictive power than its non-spatial counterparts. Such a spatial model is a significant improvement on traditional non-spatial approaches for several key reasons. First, as is well known, spatial dependence violates the Gauss-Markov conditions.³ Second, a finding of spatial dependence after controlling for other factors suggests some deeper issues at play. In particular, it raises the intriguing possibility that the process might be inherently spatial in the sense that, after controlling for differences in their demographic characteristics, certain locations are more likely to vote in favour of Brexit than others.⁴ If applicable to the local-level data used in this paper, this would imply that some form of 'neighbourhood effect' is likely to be at play. The implications of this are explored further in the Discussion section below.
Whilst the motivation is clear, the primary limiting factor is the availability of data. Official results were published at the local authority (district) level for Great Britain⁵ (Electoral Commission, 2016), and these have been widely used in the literature. However, whereas most councils have not made data available for smaller spatial units than local authority districts, a small number have voluntarily done so.
The West Midlands is unusual (and privileged) in having several adjacent councils choose to make such data available at the 'ward' level.⁶ These are the most granular electoral units in the UK, typically comprising a few thousand people, although several in Birmingham have a population in excess of 30,000. Results by ward are available in seven contiguous local authority districts, which comprise 161 wards and a total of well in excess of 1 million voters. It is this data availability that primarily drives our choice of area. Figure 1 illustrates these local authorities and outlines the wards within them. The most egregious omission is that of Sandwell (the 'hole' in the centre of the map). Sandwell as a whole voted to leave the EU by 66.72% to 33.28%, making it very similar to adjoining local authorities. Unfortunately, votes for Sandwell were counted together, which has meant that the council were unable to provide any further breakdown.
Naturally, this is unfortunate since wards that border Sandwell are treated as having no neighbours on that side, which is clearly false. In fact, this is a broader issue since the same problem arises around the exterior border of the region as well. These areas have similar demographic profiles to bordering wards, although this is of only modest comfort since what is relevant to our analysis is the error after controlling for these factors. The upshot of this is that the results should be treated with some care and this is an inevitable limitation, although the robustness to different weighting criteria and the demographic profile of neighbouring wards gives some confidence in the overall results.
The percentage of total voters who voted Leave in the EU referendum is modelled using a spatial-autoregressive model with spatially correlated errors (SARAR; Anselin, 1988). The model takes the well-known form:

$$y = \rho W y + X\beta + u \qquad (1)$$

$$u = \lambda W u + \varepsilon \qquad (2)$$

where y represents an n × 1 vector of results; X is an n × k matrix of independent variables; β is a vector of parameters; u is an error term; W is a matrix of weights; ρ and λ are the spatial autoregressive and spatial error coefficients, respectively; and ε is a vector of i.i.d. Normal error terms. Estimation is via maximum likelihood. This approach encompasses quite a wide range of models, allowing relatively complex spatial effects whilst maintaining effective use of the entire data set.
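As a reading aid, and assuming the standard SARAR specification y = ρWy + Xβ + u with u = λWu + ε (Anselin, 1988), the model can be solved for y whenever (I − ρW) and (I − λW) are invertible. The derivation below is an illustrative sketch rather than part of the original analysis; the identity matrix I is introduced here for that purpose.

```latex
\begin{align*}
u &= (I - \lambda W)^{-1}\varepsilon, \\
y &= (I - \rho W)^{-1}\bigl(X\beta + u\bigr) \\
  &= (I - \rho W)^{-1}X\beta + (I - \rho W)^{-1}(I - \lambda W)^{-1}\varepsilon .
\end{align*}
```

Setting ρ = 0 leaves a spatial error model, y = Xβ + (I − λW)⁻¹ε; setting λ = 0 leaves a spatial lag model; and setting both to zero recovers OLS. This nesting is the sense in which the approach encompasses a range of models.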
On an econometric level, it does have two disadvantages. First, it assumes linearity in (1) even though the dependent variable is bounded. Second, it imposes Normality on the error term. The former calls for care in interpretation, although it should provide a locally accurate linear approximation (in essence it is likely to prove poor in wards where predictors are far from their average values).
Naturally, there are a variety of potential weighting strategies for W. Euclidean distance is an attractive criterion, but there are also numerous options in terms of assigning particular weights to neighbours. Whilst it is theoretically possible to use two different weighting matrices in (1) and (2), there is no obvious theoretical reason to do so. In practice, predictive accuracy was highest when contiguity was used as a criterion and weights were row-standardized.⁷ Results using a variety of weighting criteria (including inverse Euclidean distance, unstandardized contiguity and the square of inverse distance) were similar, with only minor differences in coefficient estimates.⁸ The only concern here is that wards on the exterior of our data set effectively 'double weight' their neighbours since they behave as if there were no neighbours on the other side, which is false. This is an argument in favour of using unstandardized contiguity as a criterion, although doing so makes no significant difference to the overall findings of the research.
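Row-standardization of a contiguity matrix can be illustrated with a minimal sketch. The ward labels and adjacency below are hypothetical and are not drawn from the West Midlands data; the function name `row_standardize` is likewise an assumption made for illustration.

```python
def row_standardize(neighbours):
    """Build row-standardized contiguity weights.

    neighbours: dict mapping each ward to the list of wards it borders.
    Returns a dict of dicts in which each ward's weights sum to 1.
    """
    weights = {}
    for ward, adjacent in neighbours.items():
        if not adjacent:
            # An isolated ward has no neighbours and so no weights.
            weights[ward] = {}
            continue
        share = 1.0 / len(adjacent)
        weights[ward] = {other: share for other in adjacent}
    return weights


# Hypothetical adjacency: ward C borders four wards, the others border two.
toy_neighbours = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D", "E"],
    "D": ["C", "E"],
    "E": ["C", "D"],
}

toy_weights = row_standardize(toy_neighbours)
print(toy_weights["A"])  # two neighbours, so each receives 0.5
print(toy_weights["C"])  # four neighbours, so each receives 0.25
```

The same arithmetic shows the edge effect described above: if one of a ward's true neighbours lies outside the data set, the neighbours that remain each receive a larger share than they otherwise would.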
Whilst the choice of weighting matrix was ultimately an empirical decision, there are weak grounds to prefer contiguity to distance. Specifically, in our data set travel time (and also the probable intensity of social mixing) is likely to be better represented by contiguity than distance. Sparsely populated wards might be a significant distance apart but will enjoy low travel times due to high vehicle speeds. In contrast, travel speed (irrespective of transport mode) is much lower in the densely populated towns and cities. Similarly, if there is voter clustering, this is likely to straddle ward boundaries.
A range of sociodemographic variables are included in the model, simultaneously controlling for demographic differences across wards and enabling us to investigate several important hypotheses. These variables have been selected due to their availability at ward level, but also in an attempt to build on previous work. Given evidence that white Britons voted more strongly to leave the EU than their ethnic minority counterparts (Alabrese et al., 2019), the 'white British' variable is included.
Younger people and those with a degree were much less likely to vote Leave than other groups (Ashcroft, 2016), leading to the inclusion of variables on education level and the 65+ age category. The hypothesis of certain areas being 'left behind' is explored at a very local level via the proportion of the workforce engaged in manufacturing and median property prices, alongside a proxy for social class. Car ownership and commute length are broad measures of how 'urban' a community is. Migration was a key campaign issue during the referendum and so the proportion of the population from EU states is included as an independent variable.
Data on all independent variables are sourced from the 2011 Census, apart from median property prices, which are published independently by the Office for National Statistics (ONS). In many regards, using highly localized data represents an improvement relative to using the official results for local authority districts due to the much greater level of inter-ward variation in both voter preferences and sociodemographic indicators. For example, there is a great deal of inter-ward variation in the proportion of ethnic minorities by ward, ranging from under 1.5% in Rawnsley to 89.3% in Lozells and East Handsworth.

DATA AND EMPIRICS
As can be seen from Figure 2, although all seven local authorities voted to leave the EU overall, many wards voted strongly to remain and this variation is not randomly distributed over space (wards voted similarly to their neighbours).
As can be imagined, a Moran's I statistic of 0.704 gives statistically significant evidence (p < 0.001) of spatial autocorrelation. This persists even after controlling for sociodemographic characteristics, as the results below demonstrate. Note that all variables are standardized (Table 1).
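For readers unfamiliar with the statistic, global Moran's I can be computed directly from a vector of values and a weights matrix. The sketch below uses four hypothetical wards arranged in a line with binary contiguity weights; the vote shares are invented for illustration and are not the referendum results.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all weights (often written S0).
    s0 = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between every pair of areas.
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    denom = sum(d * d for d in dev)
    return (n / s0) * cross / denom


# Hypothetical chain of four wards: 0-1, 1-2 and 2-3 are contiguous.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [70.0, 65.0, 40.0, 35.0]    # similar values sit side by side
alternating = [70.0, 35.0, 65.0, 40.0]  # high and low values alternate

print(morans_i(clustered, chain))    # positive: neighbours resemble each other
print(morans_i(alternating, chain))  # negative: neighbours differ
```

A value near zero would indicate no systematic relationship between a ward's result and those of its neighbours; assessing significance requires a reference distribution, which this sketch does not attempt.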
Note that predictive ability (represented by the Akaike information criterion, AIC) improves substantially when a spatial error term is included, and in all cases lambda is highly significant. The residuals from models (1) and (2) are spatially correlated (Moran's I being equal to 0.228 and 0.246, respectively, and highly significant in both cases). The Moran scatter plot (Figure 3) makes this point quite starkly. Ultimately, one can conclude that, within the West Midlands at least, even highly localized referendum results show significant spatial patterns after controlling for differences in demographics.
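The AIC comparison rests on the usual definition, AIC = 2k − 2 ln L, where k is the number of estimated parameters and L the maximized likelihood; lower values are preferred. The sketch below uses invented log-likelihoods and parameter counts purely to show the arithmetic; they are not the values behind Table 1.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fits: the spatial error model adds one parameter (lambda)
# but achieves a higher log-likelihood than the non-spatial model.
aic_ols = aic(log_likelihood=-420.0, n_params=10)
aic_spatial_error = aic(log_likelihood=-400.0, n_params=11)

print(aic_ols, aic_spatial_error)
print("prefer spatial error model:", aic_spatial_error < aic_ols)
```

Because each extra parameter adds 2 to the criterion, an additional spatial coefficient is only rewarded when it raises the log-likelihood by more than 1.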

Testing for statistical adequacy
Statistical testing is an important part of any modelling procedure. Although a Shapiro-Wilk test for normality in the error term rejects the null, a Q-Q plot (Figure 4) shows that this is largely driven by a single outlying ward. Removal of that individual ward, Hagley, gives a Shapiro-Wilk value of 0.992, which is not significant at any conventional significance level (p > 0.5). The density plot (Figure 5) indicates the importance of this one observation, with a residual of almost -10.

DISCUSSION
These results require considerable care in interpretation. The spatial error model (4) offers the best predictive performance and is to be preferred, minimizing AIC. This is consistent with the fact that the rho coefficient in the SARAR model (5) is not statistically significantly different from zero. In most cases, coefficient estimates are similar (but not identical) between using OLS and the model including a spatial error term.

Notes: Model selection was undertaken by minimizing the AIC. R² is not reported, but is > 0.95 for the two ordinary least squares (OLS) models. ***Significant at 1%; **significant at 5%; *significant at 10%.

The coefficients on degree-level education fit with similar conclusions from previous research (Becker et al., 2017), namely that the presence of degree-holders is strongly correlated with a higher 'remain' vote, although this appears to be due, in part, to neighbourhood composition and self-selection (Abreu & Öner, 2019). It is also clear that the size of the migrant population from post-2004 accession states is a factor. The larger the size of this population, ceteris paribus, the greater the percentage of the ward's population that voted 'leave'. This fits with broader arguments over the role of immigration in the vote (Goodwin & Milazzo, 2017), although extreme caution is warranted in interpreting this finding.
A causal relationship should not necessarily be inferred from this, and it is not clear whether it is migration per se that had an impact or the perception of excess pressure on local services due to an increase in population at the same time as austerity cut public spending in real terms. This also raises interesting questions for future research: are the spatial patterns of migration different to previous waves, and are there qualitative differences between migrants living in Remain-voting areas and those living in Leave-voting areas?
Two further questions present themselves. First, why does the proportion of manufacturing employment have such predictive power? Second, what underlying factors are causing the presence of spatial dependence, after controlling for sociodemographic factors? It is, of course, impossible to answer either question fully with the current data; it is likely that detailed survey and interview information will be needed in order to understand this. However, local manufacturing employment is frequently used as a proxy for a 'blue collar' workforce and economic decline.⁹ When combined with the sociodemographic profile of extremely 'pro-Brexit' wards (notably a large white British population and relatively few degree holders), this lends further support to the hypothesis that the Leave vote was in part driven by territorial inequalities (Rodríguez-Pose, 2018) and suggests that this was true at a very local level. Deindustrialization and economic decline are hallmarks of areas that have been 'left behind', perhaps explaining why wards where many work in manufacturing voted leave.
Insofar as such 'relative decline' is itself spatially correlated but is not fully captured by sociodemographic variables, this could also drive spatial dependence within the model. This would be akin to a 'missing variable' problem, where the missing variable is both spatially correlated and (obviously) correlated with the model residuals. This might also be a factor driving some of the neighbourhood effects identified by Abreu and Öner (2019), fitting nicely with the growing body of evidence around the challenges of 'ecological inference' (King et al., 2004). If correct, this suggests that policymakers seeking to address the grievances of 'left behind' communities also need to focus on policies that empower communities and facilitate economic development at a highly local level as well as a regional one. This is likely to be true more generally in countries and regions where populist political movements are on the rise. Future work will undoubtedly want to assess the applicability of the SARAR model to regional voting patterns across Europe. Other factors appear to be more place-specific. Ethnic composition, for example, appears to have an unusually strong effect in the West Midlands: wards with a large ethnic minority population tended to vote more strongly to 'remain' than others.

CONCLUSIONS
This paper has demonstrated that, even after controlling for demographic factors, spatial autocorrelation in the residual vote pattern remains, suggesting that demographics alone cannot account for the 2016 vote to Leave the EU in the West Midlands. These results confirm the findings of Beecham et al. (2018), suggesting failure to account for spatial autocorrelation leads to misleading predictions, implying that on a highly localized level, spatial factors were significant in the Brexit vote. These results are limited to 'prediction' and need to be interpreted with caution, both due to the ecological inference fallacy (Robinson, 1950) and the modifiable areal unit problem (Fotheringham & Wong, 1991).
The findings are of academic interest, adding further nuance to our understanding of the vote in 2016, particularly at a very local level. Although the findings relate to local communities in the West Midlands, the work sits in a broader empirical context. In particular, it suggests an avenue for future research: there is a clear rationale for conducting further work to understand the rise in populism across Europe that focuses on very local communities. More generally, it suggests an expanded research agenda considering local intra-regional inequalities.
This points to the wider significance of some of the findings. In particular, these results suggest a clear need for electoral research to do two things. First, it is necessary to explicitly account for the implications of spatial dependence when modelling. Second, whilst the current work reaffirms the importance of the 'places that don't matter' (Rodríguez-Pose, 2018) and the 'geography of discontent' (Los et al., 2017) in broad analytical terms, these results suggest that the spatial scale at which this phenomenon manifests is likely to be extremely local.
The paper, therefore, has argued that future analysis of populist movements internationally should take this into account rather than categorizing entire regions in this way. In the US case, for example, counties vary wildly in size, but many are extremely large, although in many cases results (and other statistics) are available much more locally. Similarly, for policy-makers, this research adds to the growing body of evidence that spatial factors clearly matter when assessing the dynamics of populism and, moreover, suggests that this is true at a more granular level than was hitherto realized. Policies to address this must make a difference at a highly localized level as well as at a regional and meso-scale.

NOTES
... assume that the impact of education is constant across regions (such that the gap between degree- and high-school-educated individuals is identical in every region).
3 For a fuller discussion, see, for example, LeSage and Pace (2009). As a result, estimation that does not take account of this dependence (e.g., OLS) is likely to be inefficient, meaning that coefficient estimates are likely to be less accurate than they would be were an explicit spatial model to be estimated.
4 At a much larger spatial scale, locational differences exist with Scotland, parts of northern Wales (especially Gwynedd) and certain local authorities around Liverpool being much more likely to vote 'Remain' than other areas with equivalent sociodemographic characteristics. Interestingly, in each case local media consumption patterns differ substantially from the norm in the rest of the UK. This is not the case in the West Midlands.
5 Results for Northern Ireland and Gibraltar were given in toto, although there exists some breakdown for the former. Ultimately, Northern Ireland is an interesting case deserving treatment in its own right.
6 These were the UK's lower Local Authority Units (LAU-2, formerly NUTS-5 regions) as applied in 2016, per the then nomenclature of Eurostat (2019).
7 In practical terms, this meant that weights for each ward summed to 1 (a ward with two neighbours would assign a weight of 0.5 to each, whereas one with four neighbours would assign a weight of 0.25 to each).
8 The only (minor) exception to this relates to the inclusion of the variable on the percentage of residents from elsewhere in the EU. Where inverse distance is used as a weighting matrix, the inclusion of this variable yields a very slightly lower Akaike information criterion (AIC) than when it is excluded, although it is not statistically significant (even at the 10% level). In contrast, our preferred model delivers a similarly slight reduction in the AIC by excluding said variable. Ultimately, the differences are marginal, and, in any case, a more parsimonious model is generally to be preferred.
9 Manufacturing employment in the UK has contracted rapidly over time, particularly in the West Midlands.