Devolution and geographies of education: the use of the Millennium Cohort Study for ‘home international’ comparisons across the UK

Following political devolution in the late 1990s and the establishment of the governments for Wales and Scotland, the education systems of the four home countries of the UK have significantly diverged. Consequently, education research in the UK not only has to be sensitive to such divergence, but can also treat the divergence of policy and practice as an important opportunity to undertake comparative research within the UK. Such ‘home international’ comparisons between the four home countries of the UK also provide the opportunity to undertake ‘natural experiments’ of education policy and practice across similar socio-economic contexts. By drawing specifically on the UK Millennium Cohort Study (MCS) – a recent longitudinal birth cohort study specifically designed to provide the potential for geographical analysis – the paper finds considerable variation in child development by country of the UK, with no single story of ‘success’. However, the paper finds that, on average, literacy development amongst children in England, and particularly in London, is greater than for children elsewhere. The paper concludes by arguing that ‘home international’ comparisons must take seriously issues of scale and geography when interpreting the influence of ‘national’ education systems and policies on educational outcomes.


Introduction
Educational research has been rather slow to absorb the implications of the 'spatial turn' that has been so influential across the social sciences (and the humanities) in recent years (for example, Warf and Arias 2008). To be sure, there is a long and distinguished tradition of scholarship in comparative education, which has been concerned overwhelmingly with comparisons between different states (Crossley, Broadfoot, and Schweisfurth 2007). More recently, interest has begun to grow in the analysis of educational issues at the supra-national level (for example, Ozga et al. 2011). However, the more complex geographies of spatial variation within national states, whilst implicit in much educational research, have much less frequently been analysed systematically (Taylor 2009; Thiem 2009).
This relative neglect of the regional and more local scales in the geographical analysis of educational issues is surprising. Educational outcomes (such as levels of educational attainment, for example) exhibit distinctive spatial distributions (at these scales), as well as the much better recognised differentiations between social groups (defined most frequently in terms of gender, ethnic background or social class) or distinctive institutional contexts (such as type of school attended). Indeed, these different dimensions of educational differentiation are closely interlinked. Hence, at one level, these local and regional distributions of educational phenomena themselves reflect the spatial patterning of social and economic conditions; the inequalities that are known to shape access to educational opportunities and consequent outcomes are themselves significantly differentiated between different regions and localities (for example, Butler and Hamnett 2007). Moreover, there are important questions as to the extent to which the effects of geographical area on educational outcomes amount to more than the impacts of the social and economic conditions which characterise each area (Atkinson and Kintrea 2001).
Equally, educational policies which are responsible for shaping the availability of educational opportunities also have impacts that are differentiated at the local and regional levels. It is true that many educational policies originate at the national level; indeed, there has been a significant trend towards increasing centralisation of control over education policy in the UK (and in England especially) in recent decades (for example, Ball 2008). However, some educational policies are targeted at local or, much less frequently, regional scales, in order to focus provision on geographical areas that are characterised by particular social and economic conditions (usually high levels of social disadvantage) (Power, Rees, and Taylor 2005). Moreover, the effects of national-level policies continue to be mediated significantly at the local level, whether through local education authorities or individual schools and colleges. In part, this reflects the capacities of actors at sub-national levels to shape the implementation of policies in particular ways (Ball, Maguire, and Braun 2012). Equally, however, there are complex interactions between national policy initiatives and social and economic circumstances that produce outcomes that are inevitably differentiated between different local areas and, indeed, regions. As Byrne, Williamson, and Fletcher (1975) demonstrated some time ago, the same national-level policies (in relation to educational expenditure, for example) impacted on different localities and regions in highly distinctive ways, reflecting the variations in pre-existing conditions in these diverse places.
Clearly, sorting out the effects of these various dimensions of sub-national, geographical variation in education poses complex analytical issues. One innovative approach to addressing some of these issues (at least in the context of the UK) is that proposed by Raffe et al. (1999), involving what they termed 'home international' comparison. Here, the differences in policy approaches between the constituent countries of the UK – England, Northern Ireland, Scotland and Wales – provide a means of analysing the differential impacts of these policies on educational outcomes; thereby permitting an exploration of the determinants of differences in educational outcomes at the sub-national level. Whilst there are undoubtedly some differences in social and economic conditions between the 'home countries', these are significantly less marked than is the case in fully international comparisons; thereby introducing elements of 'controlling' these exogenous factors and enabling the generation of important new insights, both theoretically and empirically. In short, then, 'home international' comparisons offer an innovative hybrid in terms of geographical scales; 'national'-level policies may be compared between the 'home countries', whilst the wider UK provides a significant element of commonality, thereby making the attribution of relationships between policies and educational outcomes somewhat clearer.
Moreover, since the advent of parliamentary devolution in 1999, the constitutional framework within which 'home international' comparisons can be made has shifted significantly. The UK government has retained responsibility for educational provision in England. However, in the other 'home countries', it has become the responsibility of devolved governments in Belfast, Cardiff and Edinburgh. This, in turn, has provided a context for the increasing divergence in policy approaches between the 'home countries'; and, more specifically, between England and the other countries (Greer 2009). In the case of education policy, the UK government has adopted an increasingly radical programme of change that has led to important differences with the other 'home countries' across the range of educational provision, from early years to higher education and lifelong learning. This is not to suggest, however, that UK-wide commonalities have wholly disappeared. Indeed, the exact extent and significance of policy divergence between the 'home countries' has become a matter of considerable debate (for example, Raffe 2006; Rees 2007). Nevertheless, it is clear that 'home international' comparisons do hold out considerable promise for policy analysis, in terms of permitting a more robust exploration of the effects of policy interventions at a geographical scale below that of the UK state than would otherwise be possible.
Given this, it is undoubtedly unfortunate that these already complex methodological waters have been muddied by the use of extremely crude 'home international' comparisons by politicians and the media, in the context of debates about the efficacy of the educational policies adopted in the different UK jurisdictions. More specifically, such comparisons between England and Wales have been especially influential. 1 Indeed, in Wales, there is now a widespread view that the educational system is in 'crisis', largely on the basis of – at best – very partial comparisons between Welsh and English levels of educational attainment; and also what is perceived to be Wales's 'under-performance' in international comparisons such as PISA, not only in relation to the other countries of the UK, but also more widely. Moreover, this popular account of the state of Welsh education has had tangible results in terms of significant shifts in educational policy, with new interventions in relation, for example, to raising 'teaching quality', evaluating the performance of schools, improving levels of literacy and numeracy, and shifting responsibility for implementing educational provision away from the local authorities (Rees 2012).
What is most immediately problematic about the use of these 'home international' comparisons in such political contexts is straightforwardly that their methodological basis is so crude. For example, the well-established limitations of PISA in providing a satisfactory basis for evaluating the effects of systems of educational provision are nowhere acknowledged (Goldstein 2008). Similarly, comparisons of levels of educational attainment are frequently made without recognising the difficulties in constructing comparable datasets for the different 'home countries' (Rees 2012); or taking adequate account of the effects of differences in social and economic conditions between them (Gorard 2000). Accordingly, if 'home international' comparisons are to be used as a method of policy evaluation – and, potentially, policy learning (Raffe et al. 1999) – it is essential that they are conducted on an appropriately robust methodological basis.
One approach – but by no means the only one – to establishing this sort of robust methodological basis is through the conduct of 'home international' comparisons as what have been termed 'natural experiments'. Here, 'naturally occurring' events or circumstances can be used to replicate some of the characteristics of an experiment in a wholly observational study (Dunning 2012). Hence, the adoption of a different policy in one 'home country' (the 'treatment' group) allows systematic comparison with another (the 'control' group), thereby enabling the delineation of the effects of the policy innovation, assuming, of course, that other differences between the two 'home countries' are limited or can, in some way, be allowed for in the analysis (generally, by means of statistical applications). It is instructive, however, that relatively few such formal 'natural experiments' have actually been carried out in the context of 'home international' comparisons, especially in the area of education policy [see, however, Burgess, Wilson, and Worth (2010) for an interesting, but flawed example]. And one problematical issue here is the availability of appropriate data to enable this kind of formal analysis.
In this paper, we address this issue of the availability of appropriate data to make stringent 'home international' comparisons by drawing upon the Millennium Cohort Study (MCS). This is a large-scale, birth cohort study of children born across the UK during 2000-2001, with data currently available up until they are seven years old. As with other birth cohort studies, it has the advantage of tracking individuals over time, thereby providing a more robust means of measuring the effects of educational changes than is provided, for example, in the repeated cross-sectional datasets that are more frequently used in analysis of this kind (as, for example, in comparing the GCSE attainments of successive cohorts of young people) (compare, for example, Goldstein 2004). In addition, the MCS is unusual in its use of geographical criteria in the selection of its sample of respondents. It therefore provides an excellent basis on which to carry out geographical comparisons of children growing up in the different 'home countries' of the UK and, indeed, in different areas within these 'home countries' (a consideration which is especially relevant for England, given its much greater population size than the other countries).
We also attempt to develop the methodological basis on which 'home international' comparisons can be made. Here, we apply novel statistical techniques – propensity score matching – which, we argue, produce much more robust comparisons than those – beloved of politicians and the media – which are based upon 'raw score' measures of educational attainment. They also have some advantages over more conventional statistical approaches (based on various forms of regression analysis), especially in relation to the clarity with which results can be presented publicly.
Finally, we explore some of the complexities that arise when 'home international' comparisons take account of regional variations within England. Considerable interest has been aroused recently in such regional variations, especially as a result of the dramatic improvements in levels of educational attainment that have been recorded over recent years in London (Cook 2012), with the inevitable speculations that these improvements are the result of specific policy interventions, such as London Challenge, or of national policy initiatives that have been particularly prevalent in the UK capital, such as academies and Teach First (Whitty and Anders forthcoming). In fact, of course, the 'London effect' is more appropriately understood as one example of a much wider phenomenon of geographical variation at a sub-national scale. As the analyses which follow demonstrate, exploring variations in educational issues at such sub-national scales, whilst methodologically complex, opens up the potential for important new insights, in relation both to the impacts of policies and to more fundamental theoretical concerns.

Methodology
A key limitation to any comparative study is the issue of 'equivalence'. For example, Gorard (2000) demonstrated that important differences in the number of pupils eligible for free school meals could account for many of the 'apparent' differences in educational achievement between England and Wales. This is an important observation to make, particularly for policy-makers who may be concerned about the extent to which differences in the education systems are responsible for differences in educational achievement. However, the issue of equivalence in comparative geographical studies extends well beyond just accounting for the number of pupils from low-income families. For example, to what extent should such comparisons take into account other factors that may help determine educational achievement?
Related to this is the availability of data or information relating to the characteristics of children. One of the benefits of 'home international' comparisons is that they reduce the number of major differences between the comparison groups that may otherwise have had to be accounted for. However, even then there is the constraint of having access to common or equivalent outcome measures. For example, Scotland has a very different qualifications system to the rest of the UK, which means it is rarely compared with elsewhere in the UK.
In order to address many of these issues we draw upon the first four sweeps of the Millennium Cohort Study (MCS) – a large-scale birth cohort study of approximately 19,000 children born across the UK over 12 months in 2000-2001. The first four sweeps of data collection were conducted when the children were around nine months, three years, five years and seven years old. 2 The MCS has a number of qualities that make geographical comparisons of the different education systems of the UK possible. First, it collects very detailed information relating to the children in the cohort, including information about their families and, more recently, about their education and schooling. Crucially, it collected information about the children from birth and as they have grown up, helping to observe changes over their life-course. This makes the analysis that follows very distinct from previous 'home international' comparisons that rely on cross-sectional data. As Goldstein (2004) argues, 'With only cross sectional data it is very difficult, if not impossible, to draw satisfactory inferences about the effects of different educational systems' (328).
Second, despite including children in different education systems of the UK it undertook the same assessments (physical, cognitive and behavioural) of the children that allow results to be compared, irrespective of where a child lives and grows up. Third, the sample design for the MCS is very different from previous national longitudinal cohort studies in that children were selected to participate in the MCS based on where they lived. Importantly, two geographical criteria were used in the selection of the sample, at (i) country-level and (ii) ward-level. At the country-level the MCS includes an over-representation of children from Wales, Scotland and Northern Ireland. This means that the MCS sampling design ensures there are enough children in the cohort for meaningful comparisons between England and these other countries, despite their relatively smaller sizes. 3 Furthermore, the MCS cohort was also sampled on the basis of the neighbourhood where they were born; children born between September 2000 and August 2001 were included in the MCS sample if they lived in selected electoral wards across the UK. These wards were selected on the basis of a local measure of child poverty, 4 which meant that the MCS was able to include an over-representation of children born in the most deprived areas of the UK. 5 The natural geographical clustering of children in the MCS cohort that arises from this sampling design has also meant that many children in the MCS share similar 'geographical' characteristics that provide a basis of further geographical analyses, particularly in relation to neighbourhood quality (Ketende, McDonald, and Joshi 2010), but also means that the cohort can be compared at a regional level (within England) and that urban-rural differences can be observed within the cohort (Joshi, Dodgeon, and Hughes 2008).
One of the best examples of the comparative use of the MCS is by Jones, Blackaby, and Murphy (2010) who compare the health and cognitive outcomes of children up to age five years in Wales with other parts of the UK. They also consider intra-regional (within-Wales) differences, comparing children from advantaged wards with children from disadvantaged wards within Wales. The analysis we present here extends the work of Jones et al. by focusing on education-related outcomes when the children were aged seven years. The analysis is divided into two parts. First, we are interested in how children from different countries of the UK compare in terms of their educational development. In this analysis we are also interested in how inequalities in child development at three years of age in each country continue to develop or reduce as the children grow up. In the second part of the analysis we compare the same outcomes at the regional level. Not only are we interested in identifying the presence of any regional 'effects' on the outcomes, such as a London 'effect', we are also interested in how different regions of England compare with other countries of the UK. Throughout both sets of analyses we identify a number of exogenous factors that are used to control for the background characteristics of the children when they were born. We also incorporate into both sets of analyses two additional geographical factors: whether the children live in urban, rural or mixed neighbourhoods and the level of neighbourhood deprivation they live in. In addition we also examine the mediating influence of the home learning environment on outcomes. Table 1 outlines the exogenous factors, the mediating geographical factors and outcome measures used in the subsequent analyses.
Although there is a great deal of commonality in the two sets of analyses (we tend to compare the same outcome measures and use the same exogenous factors to control for the children's background characteristics) we use different approaches and analytical techniques. For the comparative analysis by country of the UK we use the technique of propensity score matching in an attempt to simplify the way in which we present differences in the outcomes of children by country whilst ensuring we are comparing 'equivalent' groups of children. Propensity score matching works by matching children from one country (the 'control' group) to children from the country of interest (the 'intervention' group) based on sharing similar characteristics (Rosenbaum and Rubin 1983). In practice we are interested in identifying a sub-group of children in the MCS living in England who share similar characteristics with children in the MCS living in Wales. We then repeat this to identify a sub-group of children in the MCS living in England who share similar characteristics with children in the MCS living in Scotland. 6 Matching of children is derived from using a logistic model that estimates the probability of being in the intervention group based upon observed characteristics (the exogenous factors identified in Table 1). These probabilities are calculated for members of both the 'control' and 'intervention' groups. There are a number of ways of determining which members of the 'control' and 'intervention' groups are included in the final matched samples. 7 In the results that follow we present the results from two matching estimators that could be considered to reflect examples of a more 'liberal' approach to matching and a more 'conservative' approach, 8 but generally results do not vary significantly according to the way the matching is undertaken.
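The first step described above, estimating each child's probability of belonging to the intervention group from observed characteristics, can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the two covariates (a low-income flag and a parental-degree flag), the ten toy records and the function names `fit_logistic` and `propensity_scores` are all hypothetical, and a simple gradient-ascent fit stands in for whatever statistical software was actually used.

```python
import math

def fit_logistic(X, t, lr=0.1, epochs=2000):
    """Fit a logistic model P(t = 1 | x) by batch gradient ascent.

    X: list of covariate rows; t: 0/1 labels (1 = intervention group).
    Returns the coefficients, with the intercept first.
    """
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, label in zip(X, t):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
            err = label - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        # Step in the direction of the average log-likelihood gradient.
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def propensity_scores(X, w):
    """Predicted probability of being in the intervention group."""
    return [1.0 / (1.0 + math.exp(-(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)))))
            for row in X]

# Hypothetical toy data: [low_income, parent_has_degree] for ten children.
X = [[1, 0], [1, 0], [1, 1], [0, 0], [0, 1],
     [1, 0], [0, 1], [0, 1], [0, 0], [1, 1]]
t = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 1 = Wales (intervention), 0 = England (control)

w = fit_logistic(X, t)
scores = propensity_scores(X, w)  # computed for BOTH groups, as in the text
```

The scores are calculated for members of both groups, so that each child in the intervention group can later be compared with control children who had a similar estimated probability of being in that group.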
Notes to Table 1:
a The urban-rural classifications available in the MCS for cohort members in each country of the UK are derived slightly differently to one another. However, for the purpose of this analysis these are combined into a common classification that distinguishes between wards as urban, rural or mixed. However, in Scotland there is no equivalent category for mixed wards so this should be noted when interpreting the results.
b The Index of Multiple Deprivation is calculated for each country.
c More information about the HLE and how it is calculated from the MCS can be found in de la Rochebrochard (2012), and is based on the work of Melhuish et al. (2008).
d More information about these assessments can be found in Hansen et al. (2010). All measures are standardised and age-adjusted.
e These measures of subjective wellbeing are based on the composite responses to a series of questions in a self-completion questionnaire for cohort members of the MCS when they were aged approximately seven years. NB: These measures are not age-adjusted.
Table 2 attempts to illustrate this process based on children in the MCS for whom the necessary data were available, including valid outcome measures, in this case word reading ability scores at age seven. The results using matching estimator 1 are based on matching children with the most similar probability of being from Wales (or from Scotland), i.e. their nearest neighbour. A calliper is used to limit how similar, or not, the probabilities can be before they can be matched. So, for example, Table 2 shows that 63 children in Wales were unable to be matched because they were not similar enough to a comparable child in England. However, it also shows that only 1601 children from England were used in the final matched comparison, since this is the maximum number of children in Wales they could be matched to. In other words, those children in England who were the least like any of the children in Wales are also not included in the comparison.
The second matching estimator is slightly different. This has a smaller calliper, which means that the threshold for being matched is much stricter, hence there are more children from the two 'intervention' groups that are unmatched; 222 in Wales and 236 in Scotland. However, in contrast to the nearest neighbour technique of the other matching estimator the final selection for comparison is based on radius matching. This means that every child in the intervention group could be matched to one or more children in the control group, 9 but within the constraints set by the calliper. Although this means that more children in England are included in the final matched comparison, there are still some who were not included because they were too different to other children in Wales (or Scotland).
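The difference between the two matching estimators can be illustrated with a short sketch. It is an assumption-laden illustration rather than the procedure actually used: the child identifiers, propensity scores and calliper widths are invented, and the nearest-neighbour version is assumed here to be one-to-one without replacement (which is one reading of the text, not something it states explicitly).

```python
def nearest_neighbour_match(intervention, control, caliper):
    """One-to-one nearest-neighbour matching within a calliper.

    intervention, control: dicts mapping child id -> propensity score.
    Assumed to be without replacement: each control child is used once.
    Intervention children with no control inside the calliper are dropped.
    """
    available = dict(control)
    matches = {}
    for cid, score in sorted(intervention.items(), key=lambda kv: kv[1]):
        if not available:
            break
        nearest = min(available, key=lambda c: abs(available[c] - score))
        if abs(available[nearest] - score) <= caliper:
            matches[cid] = nearest
            del available[nearest]
    return matches

def radius_match(intervention, control, caliper):
    """Radius matching: keep every control child inside the calliper."""
    matches = {}
    for cid, score in intervention.items():
        within = [c for c, s in control.items() if abs(s - score) <= caliper]
        if within:
            matches[cid] = within
    return matches

# Hypothetical propensity scores.
wales = {"w1": 0.30, "w2": 0.52, "w3": 0.90}
england = {"e1": 0.28, "e2": 0.31, "e3": 0.50, "e4": 0.55, "e5": 0.10}

nn = nearest_neighbour_match(wales, england, caliper=0.05)
radius = radius_match(wales, england, caliper=0.025)
```

In this toy example `w3` has no sufficiently similar child in England and is left unmatched under both approaches, while under radius matching `w1` is compared with two children in England rather than one.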
For the second set of analyses, which focus more on regional comparisons, we revert to traditional general regression models to derive standardised estimates that indicate the strength of association with the outcome measures. The results from a series of models are presented to help identify what mediating influence, if any, these additional geographical characteristics have on the country 'effects'.
A final important methodological remark relates to the sampling design of the MCS. As previously noted, the way in which children were selected to be part of the MCS was slightly more complicated than other well-known birth cohort studies. Furthermore, like most other longitudinal studies the MCS suffers from attrition over time. This means that most analyses of the MCS have to take into account the resulting selection biases that exist in the MCS. Fortunately the effects of these biases can be significantly reduced by the use of sampling and attrition weights especially prepared for the MCS (Plewis 2007). For the second set of analyses all the regression models use these weighted estimates. This ensures that the results are based on a nationally representative sample. However, in the first set of analyses, using propensity score matching, it is not possible to include these sampling and attrition weights. 10 Indeed, we have already seen that matching can only occur where children from one country have similar characteristics to children in another country, leaving a number of children in each country unmatched. Clearly this means that these comparisons are not based on nationally representative samples, and any interpretation of the results must recognise this; but in presenting results from two methodological approaches we are also able to consider what impact this sampling issue has on the results.
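To show why the weights matter, the sketch below contrasts an unweighted and a weighted mean, and computes a weighted standardised slope. It is a simplified illustration under stated assumptions: the scores and weights are invented and are not MCS weights, the function names are hypothetical, and a single-predictor regression stands in for the multi-variable models referred to in the text (with one predictor, the standardised slope equals the weighted correlation).

```python
import math

def weighted_mean(values, weights):
    """Mean in which each child counts in proportion to their weight."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

def weighted_standardised_slope(x, y, weights):
    """Standardised slope from a weighted regression of y on a single x."""
    mx, my = weighted_mean(x, weights), weighted_mean(y, weights)
    sxy = sum(w * (xi - mx) * (yi - my) for xi, yi, w in zip(x, y, weights))
    sxx = sum(w * (xi - mx) ** 2 for xi, w in zip(x, weights))
    syy = sum(w * (yi - my) ** 2 for yi, w in zip(y, weights))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical example: the first two children come from over-sampled
# deprived wards, so they are weighted down; the last two are weighted up.
scores = [90.0, 95.0, 100.0, 110.0]
weights = [0.5, 0.5, 1.5, 1.5]
deprived = [1, 1, 0, 0]

unweighted = sum(scores) / len(scores)
weighted = weighted_mean(scores, weights)
beta = weighted_standardised_slope(deprived, scores, weights)
```

Because the over-sampled children score lower in this toy example, the unweighted mean understates the weighted one; the weights pull the estimate back towards what a representative sample would give.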

Country analysis
Table 3 presents the results of using matched samples of children between Wales and England and between Scotland and England. Here we focus on three assessments of literacy skills as children grow up. The average scores without matching are also provided to demonstrate the impact of using matched samples of children in the comparison. For example, without matching there are significant differences in the vocabulary skills of children in the three countries at age three. On average, children in England have the lowest scores and children in Scotland have the highest scores. But using matched comparisons the difference between the samples of children from Wales and England disappears, i.e. children sharing comparable characteristics in England and Wales are likely to get similar scores. Similarly the 'gap' between Scotland and England narrows but remains significant. However, as the children grow up this pattern begins to change. By age five children in England appear to have improved, on average, their literacy skills at a faster rate than comparable children in Wales and Scotland. So by the time the children have entered compulsory education the 'gap' between Scotland and England has nearly disappeared and a significant but relatively small 'gap' between Wales and England has appeared.
By age seven this trend appears to continue; children in England appear to continue to make greater progress in their literacy skills than comparable children in Wales and Scotland. Of particular significance is the growing 'gap' in literacy skills between Wales and England; however, a difference of 2.5 in word reading ability at age seven (the difference in average scores using matching estimator 1) is only the equivalent of around a month in vocabulary development. Although we do not directly compare the cognitive development of children in Wales with comparable children in Scotland, the results presented in Table 3 suggest any differences already exist by age three years and do not appear to worsen (or improve) as children enter the primary phase of their education.
Crucially, these country-by-country trends and patterns in assessments of children's literacy and language development are not repeated in other, equally important, areas of cognitive development. For example, whilst children in Scotland are, on average, slightly behind comparable children in England in maths ability at age seven, the average score of children in Wales is similar to the average score of equivalent children in England; and in pattern construction, Wales does significantly better than England and Scotland. Here a difference of 1.5 in pattern construction is equivalent to around three months in the development of spatial visualisation and non-verbal reasoning.
These results suggest that, although we observe some significant differences in the cognitive development of children according to the country in which they grow up, these differences are often very modest, and that the 'ranking' of countries depends on which area of cognitive development is being compared.
But following Feinstein (2003) and Blanden and Machin (2010), how do differences in the education systems and policies of different countries within the UK influence different groups of children, particularly those living in poverty? The method of propensity score matching allows us to examine the progress of two equivalently matched groups of children over time. Here we match the children in each country based on their background characteristics (as before) but compare the educational outcomes of particular sub-groups of those matched children based on levels of family income. Specifically we compare the results of children from the 'richest' and 'poorest' 25% of households in each matched sample. 11 We then compare the assessment scores of these sub-groups of children as they grow up against the same three measures of literacy skills at ages three, five and seven years as we used above. Although these sub-groups of children are not an accurate reflection of the poorest and richest quartiles of the population in each country, they provide an indication of the relative differences in cognitive development of children at contrasting ends of the income spectrum. Variations in the scores between assessments and the norms used to standardise assessment scores in the MCS make the direct comparison of average scores difficult to interpret. Therefore, Figure 1 presents proportionate differences in the corresponding mean scores. 12
In the main, the same trends and patterns in literacy development identified above can be seen amongst both the richest and poorest groups of the matched samples of children (Figure 1). Despite different levels of vocabulary development at age three, children in England tend to make the greatest improvement in literacy as they grow up, such that by the age of seven the word reading ability of children in Wales is behind that of England and Scotland, irrespective of whether they are from families with relatively low or high incomes. However, Figure 1 does illustrate some interesting income-related differences in the cognitive development of children in different countries. For example, at age five the poorest children in the matched sample from Wales were not far behind the poorest children in England in their vocabulary skills. But by age seven a significant gap between the poorest children in Wales and England has emerged. Similarly in Scotland it is the poorest children that perform less well than their low-income counterparts in England; high-income children in Scotland continue to out-perform high-income children in England throughout the first seven years.
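The sub-group comparison described above can be sketched in a few lines. This is an illustration under explicit assumptions: the records, field names and function names are hypothetical, and the 'proportionate difference' is assumed here to be the gap between two mean scores expressed as a fraction of the England mean, which may not be the exact definition used for Figure 1.

```python
def income_subgroups(children, share=0.25):
    """Return the poorest and richest `share` of a matched sample."""
    ranked = sorted(children, key=lambda c: c["income"])
    k = max(1, int(len(ranked) * share))
    return ranked[:k], ranked[-k:]

def mean_score(children):
    return sum(c["score"] for c in children) / len(children)

def proportionate_difference(mean_a, mean_b):
    """Assumed definition: (a - b) / b, with b as the reference mean."""
    return (mean_a - mean_b) / mean_b

# Hypothetical matched samples: eight children per country, income ranks 1-8.
wales = [{"income": i + 1, "score": s}
         for i, s in enumerate([90, 94, 98, 100, 102, 104, 108, 112])]
england = [{"income": i + 1, "score": s}
           for i, s in enumerate([95, 97, 99, 101, 103, 105, 107, 109])]

w_poor, w_rich = income_subgroups(wales)
e_poor, e_rich = income_subgroups(england)

gap_poor = proportionate_difference(mean_score(w_poor), mean_score(e_poor))
gap_rich = proportionate_difference(mean_score(w_rich), mean_score(e_rich))
```

A negative value means the sub-group scores below its counterpart in England. In this invented example the gap runs in opposite directions for the poorest and richest sub-groups, which is the kind of income-related contrast the comparison is designed to reveal.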
These results seem to make the comparisons between countries even more complex, since the differences in literacy skills vary according to whether we are comparing 'rich' or 'poor' children. As Dex et al. (2008) suggest, such variances could be due to 'differences in early years education provision, in pre-school education, activities in the home or in grandparent influence' (8).13 But importantly, differences in the literacy development of low-income children in England, Wales and Scotland by age seven suggest that the more 'comprehensive' and perhaps less target-driven systems of Wales and Scotland appear to be associated with greater inequalities in child development. This would seem to corroborate similar conclusions by Croxford (2010) for Scotland based on international comparisons made by the OECD (2007):
The Scottish system serves pupils from higher social class backgrounds well and produces relatively high levels of academic attainment and entry to higher education. However, the review [by the OECD] also indicates that the system does not serve pupils from less advantaged backgrounds well. (Croxford 2010, 17)
Despite the relatively low achievement of comparable 'poor' children in Wales and Scotland compared to England, such children generally report greater levels of wellbeing than comparable children in England (Figure 2). This is particularly striking given that these groups of children generally have, on average, a lower word reading ability than similar children in England. This might suggest that the possible emphasis on developing literacy skills in England could come at the expense of children's subjective wellbeing. It is also interesting to note that the home learning environment of children from relatively low-income families is better in Wales and Scotland than it is for their counterparts in England.14
Perhaps contrary to expectations, these results may be worrying to policy-makers in Wales and Scotland. This is because differences in the cognitive development of children, particularly from the 'poorest' families in the sample, appear to exist despite 'better' home learning environments and more positive dispositions to school and learning in these countries.

Regional analysis
In this second part of the analysis we focus our attention on regional comparisons of cognitive development amongst children by the age of seven years. In particular, we are interested in identifying geographical patterns to these outcomes within England, after attempting to control for differences in the background characteristics of children.
In doing this we are also able to see what impact, if any, taking a regional approach to the analysis of outcomes in England has on our interpretation of the educational outcomes in other countries of the UK. This may be particularly important if, for example, we observe a London 'effect' on child development.
As discussed earlier, we take a different analytical approach to this analysis, using multivariate regression modelling to compare outcomes. Although the results of this approach may seem slightly more complicated to many readers, it does have the benefit of allowing us to consider the way in which geographical factors may mediate the influence of other background factors. It is also important to restate that this part of the analysis uses all children in the MCS and utilises sampling and attrition weights. Consequently, although these results may be more difficult to interpret than those presented above, this part of the analysis is not susceptible to concerns about the representative nature of the results for the rest of the UK. We are also able to consider the influence of growing up in Northern Ireland on children's cognitive development.
The statistical models presented in Table 4 all use a number of control variables (the exogenous variables outlined in Table 1) to predict three measures of children's cognitive development at ages three, five and seven. For each measure of cognitive development we present three statistical models: the first only includes an indicator of the country that a child lives in (Models 1, 4 and 7); the second includes an indicator of the region that a child lives in, specifically to help distinguish between the outcomes of living in different regions of England (Models 2, 5 and 8); and the third introduces two additional geographical variables (an urban-rural indicator and a measure of the level of multiple deprivation for the ward that a child lives in) to see what influence, if any, they have on other predictors of cognitive development (Models 3, 6 and 9).
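The three-step model structure can be sketched as a sequence of nested regressions. The example below is illustrative only: it uses a small synthetic data set with known 'effects', a single hypothetical background control (ses), unweighted ordinary least squares (the sampling and attrition weights used in the paper are omitted), and a hand-written solver so that it needs no external libraries. All variable names and values are assumptions and none are taken from Table 4.

```python
from itertools import product

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X)b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for i in range(k):
            if i != col:
                factor = a[i][col] / a[col][col]
                a[i] = [x - factor * p for x, p in zip(a[i], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

NAMES = {
    "country": ["intercept", "ses", "wales", "scotland"],
    "region": ["intercept", "ses", "london", "wales", "scotland"],
    "region_plus_local": ["intercept", "ses", "london", "wales", "scotland",
                          "urban", "deprivation"],
}

def design(child, spec):
    """Build one design-matrix row for the chosen model specification."""
    row = [1.0, child["ses"]]  # intercept and background control
    if spec == "country":      # reference category: England
        row += [child["country"] == "Wales", child["country"] == "Scotland"]
    else:                      # reference category: rest of England
        row += [child["area"] == "London", child["area"] == "Wales",
                child["area"] == "Scotland"]
    if spec == "region_plus_local":
        row += [child["urban"], child["deprivation"]]
    return [float(v) for v in row]

# Synthetic children with known 'effects'; London is treated as wholly urban.
children = []
areas = ["London", "Rest of England", "Wales", "Scotland"]
for area, ses, urban, dep in product(areas, [0, 1, 2], [0, 1], [1, 2, 3]):
    if area == "London" and urban == 0:
        continue
    country = "England" if area in ("London", "Rest of England") else area
    score = (100 + 2 * ses + 3 * (area == "London") - 1 * (area == "Wales")
             + 0.5 * (area == "Scotland") + 1 * urban - 0.5 * dep)
    children.append({"area": area, "country": country, "ses": ses,
                     "urban": urban, "deprivation": dep, "score": score})

def fit(spec):
    """Fit one specification and return coefficients keyed by name."""
    rows = [design(c, spec) for c in children]
    return dict(zip(NAMES[spec], ols(rows, [c["score"] for c in children])))

model_a = fit("country")            # cf. Models 1, 4 and 7
model_b = fit("region")             # cf. Models 2, 5 and 8
model_c = fit("region_plus_local")  # cf. Models 3, 6 and 9
for name, model in [("country", model_a), ("region", model_b),
                    ("region + local", model_c)]:
    print(name, {k: round(v, 3) for k, v in model.items()})
```

In this synthetic example the estimated Wales 'effect' shrinks once London is separated from the rest of England, which mirrors the kind of comparison the regional models are intended to support.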
As Table 4 highlights, there are many significant and important relationships between the socio-economic background of the children (such as their social class background, the educational levels of their parent(s) and levels of household income) and their literacy skills throughout the first seven years of their lives. The analysis also shows how the influence of some of these factors changes as the children grow up and enter school. For example, one of the most striking points to note is the influence of ethnicity on literacy skills. Before the age of seven, children from ethnic minority backgrounds have, on average, significantly lower levels of literacy, but at age seven not only has the 'gap' between White and ethnic minority children closed, we find that children from ethnic minority backgrounds have on average a higher word reading ability score than White children, all other things being equal. This is particularly the case for Indian, Pakistani and Bangladeshi children. We also see the emerging importance of season of birth on literacy skills once children reach the age of seven and after entering school. However, it is not our purpose, nor is there scope, to discuss these results in detail. Indeed, many of the findings have been discussed elsewhere. Instead we are concerned with whether the country or region a child lives in is associated with their literacy development after controlling for these background factors.
The first thing to note from the regression analyses (Models 1, 4 and 7) is that we see exactly the same patterns and trends for the country 'effects' that we saw in the propensity score matching analysis above. Children in Scotland are associated with high vocabulary scores at age three, whereas children in England and Wales share similar scores. As they grow up, children in Wales fall significantly behind children in England in their literacy abilities, whilst children in England appear to 'catch up' with children in Scotland. Furthermore, the estimates presented in Table 4 are very similar to the 'gaps' in average scores of the matched comparison groups in Table 3. However, in contrast to the earlier analysis, Table 4 is able to show what influence growing up in Northern Ireland seems to have on cognitive development. These results are particularly noteworthy; despite being associated with significantly higher vocabulary scores at ages three and five, children in Northern Ireland have, on average, a significantly lower word reading ability by age seven compared to children in England. Again, this illustrates the relatively greater improvement of children in England as they grow up. Nevertheless, policy-makers in Northern Ireland may be particularly concerned about what happens to children between the ages of five and seven.
Another key benefit of these regression models is that we are able to see the relative influence of which country the children are from compared to other key predictors of cognitive development. It is notable that the significant estimates for the country indicators are generally smaller than the main socio-economic predictors, such as social class, ethnicity and educational levels of parent(s). Indeed, they are generally lower than other predictors of cognitive development, such as the number of older siblings and the season of birth.
We also see from Table 4 that, although we previously highlighted country-level differences in the home learning environment, it remains a significant predictor of cognitive development at all stages of a child's early development at the level of individual households. This apparent contradiction could be due to biases in the matched samples used earlier, or it could suggest there are differences in the degree of influence that the home learning environment has on children's cognitive development in different countries. Only with more detailed analysis, perhaps using multi-level statistical models, can this be explored further.
We now move on to consider whether other geographical factors mediate the influence of these main predictors of cognitive development, including the country the children are from (Models 2, 3, 5, 6, 8 and 9 in Table 4).
The first thing to note is that there appears to be significant variation in the literacy development of children by region of England. Most notably, children in London at age three are associated with low vocabulary scores, particularly when compared to children in Wales, Scotland and Northern Ireland (Model 2). We also see a significant positive association in these scores for children from the south west of England, who have, on average, the highest vocabulary scores at age three of all children in the UK. At age five these patterns shift slightly (Model 5). Although children in London appear to make greater improvement in their vocabulary ability than children in Wales and Scotland, there are other regions of England where children would seem to make even greater progress (the North West, the East Midlands and the East of England). However, by age seven the picture is much clearer; children in London are associated with significantly higher word reading ability scores, all other things being equal (Model 8), compared with children from most other regions/countries of the UK. Importantly, if we compare the estimates for Wales and Scotland with the estimates for other regions in England, we actually see that much of the 'improvement' in literacy skills in England identified above could be attributed to children in London.
Next we consider the mediating influence of whether the children live in urban or rural areas and the level of multiple deprivation of their local neighbourhood (Models 3, 6 and 9). Here we see a negative association between neighbourhood disadvantage and cognitive ability, particularly at age five. The relationship between urban-rural geography and cognitive development is less clear, with urban areas associated with low vocabulary scores at age five but high word reading ability scores at age seven. However, we do see that, in general, these two geographical factors have a small but important mediating effect on the influence of the region and country that the children live in. These geographical factors also appear to mediate, to some extent, the influence of other family-level socio-economic factors, most notably the social class background of the children; but equally it is also worth noting that some factors are unaffected by the inclusion of geographical variables, such as the sex of the child, a child's birth weight, the number of older siblings, the child's season of birth and their home learning environment.
Finally, it is useful to return to the influence of ethnicity on this discussion, particularly because of the strong connection between some of the geographical indicators and non-White ethnic groups. As noted above, the 'improvement' in the literacy skills of ethnic minority children, relative to White children, is quite significant. Importantly, here too we see the mediating influence of the geographical factors on ethnicity. However, what is not clear is whether the 'cause' of literacy improvement is the presence of non-White children in an area or the influence of the education system (and its associated policies and initiatives) in those same areas. This would seem to be of particular relevance to the apparent improvement in literacy development for children in London. However, as far as it is possible to examine such issues in this paper, it does appear that both ethnicity and geography have some role in determining cognitive outcomes, and that in combination these probably go some way in helping to understand the relative under-performance by age seven of children in other areas with limited ethnic minority populations, such as Wales and Northern Ireland.

Conclusion
In this paper we have attempted to demonstrate the potential of 'home international' comparisons as an important tool for policy analysis, embodying some of the necessary features of 'natural experiments'. However, we have also shown that simple, and often crude, 'home international' comparisons, often undertaken within the media and by politicians, can lead to simplistic, and potentially misleading, 'evaluations' of alternative policy approaches in different countries of the UK. It is very clear from this discussion that careful and detailed 'home international' comparisons are warranted. But these are often dependent on the quality of available data. This analysis has been based entirely on the Millennium Cohort Study, which has helped us to illustrate some of the complexities of 'home international' analyses, especially in relation to different geographical scales. Although we have only been able to consider the cognitive development of children up to the age of seven years, we have still been able to provide new and important insights into the relationships between divergent educational policies and cognitive development.
There would seem to be three main conclusions from this analysis. The first is that there is no single national 'success story' to suggest that one education system in the UK is 'better' than another. For a start, any ranking of countries depends on which measure of cognitive development is being considered. Also, where there are significant differences in cognitive abilities by country, these are relatively small compared to the influence of other conditions that children are born into. Furthermore, it is not entirely clear whether the apparent 'benefit' of living in England on literacy at age seven can be attributed to its distinctive national education policies or is simply a reflection of processes and influences on cognitive development that exist at a regional or local level. For example, differences found between England and the rest of the UK may in part be due to the significant improvement in cognitive development of children living in London. In turn, however, the differences between London and the rest of England may be due to the significant improvement in cognitive development of ethnic minority children who are concentrated in the capital.
This leads on to the second main conclusion, that there is still a considerable need for further 'home international' analyses that utilise genuinely comparative and longitudinal data. The MCS provides an excellent example of this, and as the MCS cohort ages and further sweeps of data are collected it is going to become increasingly valuable in undertaking 'home international' comparisons and for conducting natural experiments of particular policy initiatives in the UK.
Although detailed and complex datasets, such as the MCS, often require detailed and complex techniques for analysis, we have also attempted to demonstrate, through the use of propensity score matching, that these comparisons can often be made very clearly and efficiently. This is important, as clarity and efficiency can be the very reason why policy-makers and the media often rely on simple and crude comparisons, an issue that Goldthorpe (2012) also acknowledges when referring to the 'media hysteresis' surrounding social mobility.
This in turn leads on to our final conclusion, that much greater consideration should be given to the desire to use simple comparisons in education for the immediate purposes of policy evaluation, formation and borrowing. Comparative education researchers have, for a long time, raised concerns about this (see Crossley and Watson 2009) and queried the basis of recent international comparisons (see Sturman 2012). But despite this, policy-makers appear to be increasingly influenced by simple comparisons of a small number of, often narrowly defined, educational outcomes. Nóvoa and Yariv-Mashal (2003) and Ozga (2012) have gone further, suggesting that comparative studies of education are now essentially political tools and a form of policy technology that undermines the intellectual scholarship that they once had.
We would suggest that, in order to counter these policy moves, education policy analysis, and comparative studies in particular, must (a) make more use of 'home international' analyses within the UK, since they can demonstrate the complexities of making such comparisons despite comparing national education systems with so much in common, and (b) take issues of geography, scale and context more seriously when interpreting the influence of 'national' education systems and policies on educational outcomes.
2. The most recent fifth sweep is currently being undertaken at the time of writing, so data relating to when the children are aged 11 years old were not available for this current analysis.
3. Despite country-level boosts to the original sample size, the limited size of the cohort population in the three smaller countries of the UK is still a limitation on particular kinds of analyses, particularly those that are dependent on comparing particular sub-groups of the MCS cohort as they get older due to attrition of the original sample. However, for national comparisons, as undertaken here, the cohort sizes are large enough to still make meaningful and insightful comparisons.
4. The Child Poverty Index is defined as 'the percentage of children under 16 in an electoral ward living in families that were, in 1998, receiving at least one of the following benefits: Income Support; Jobseekers Allowance; Family Credit; or Disability Working Allowance' (Plewis 2007, 10). The measure and definition are common across all countries of the UK, although differences in the socio-economic demographics of each country meant that in some countries there were more wards and children to choose from [see Plewis (2007) for more details].
5. In England wards were also selected because they had a high concentration of ethnic minority families living there. This was to ensure that there was an over-representation of ethnic minority children in the England MCS sample. However, it is important to note that this criterion was not used in the selection of children in Scotland, Wales and Northern Ireland.
6. We do not match a sub-group of children in the MCS living in England with the MCS children living in Northern Ireland, largely because of the smaller MCS cohort size in Northern Ireland and because of the smaller emphasis on the impact of devolution in Northern Ireland. However, MCS children in Northern Ireland are included in the second set of analyses.
7. Given the different number of MCS cohort children in each country it is inevitable that not all children from one country (typically in England) will be matched to a child in another country. Hence the selection of children to be included or excluded from the control group and subsequent comparison is important.
8. The two matching estimators we report results for are: (i) Nearest Neighbour Matching, calliper 0.001, no replacement; and (ii) Radius Matching, calliper 0.0001.
9. If a child in the intervention group is matched to more than one child in the control group the average outcome scores of those children are used for comparison.
10. This is particularly important when comparing children from Wales, since the over-representation of children from areas with high levels of child poverty was greater here than in other countries (see Plewis 2007).
11. This is a derived measure in the MCS that predicts weekly net family income that has been adjusted for family size. The use of household income here is independent of the process of matching children, as it was not used in the Propensity Score Matching. The quartiles have been defined on the unweighted results for the corresponding 'intervention' country (i.e. for the sample of children in Wales and Scotland) and then applied to the corresponding matched sample from England. Therefore, these sub-groups are not strictly the richest 25% and poorest 25% of each country's population since the matched samples are not representative of their respective countries.
12. These differences are calculated proportionately, i.e. (a-b)/(a+b), and are not the raw score differences.
13. Dex et al. (2008) also suggest that there may have also been selective attrition bias in Scotland compared to the rest of the MCS sample at age three years.
14. It is only compared to children from the 'richest' families in the Scotland matched sample that children in England appear to have a greater home learning environment.
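The two matching estimators named in note 8, together with the averaging rule in note 9, can be sketched as follows. This is a simplified illustration rather than the procedure actually used: it assumes propensity scores have already been estimated by some earlier step, uses a greedy match in the order the intervention children are listed, and relies on hypothetical identifiers, scores and outcomes.

```python
from statistics import mean

def nearest_neighbour_match(treated, controls, caliper=0.001):
    """Greedy 1:1 nearest-neighbour matching on propensity score.

    Each control child is used at most once (no replacement), and a pair is
    kept only if the score distance is within the caliper.
    """
    available = dict(controls)  # control id -> propensity score
    pairs = []
    for t_id, t_score in treated:
        if not available:
            break
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # remove so it cannot be matched again
    return pairs

def radius_match(treated, controls, outcomes, radius=0.0001):
    """Match each intervention child to every control within the radius.

    Where several controls fall inside the radius, their outcome scores are
    averaged, following the rule described in note 9.
    """
    matched = {}
    for t_id, t_score in treated:
        within = [outcomes[c_id] for c_id, c_score in controls.items()
                  if abs(c_score - t_score) <= radius]
        if within:
            matched[t_id] = mean(within)
    return matched

# Hypothetical propensity scores: 'w' = intervention country, 'e' = England.
treated = [("w1", 0.3000), ("w2", 0.4505), ("w3", 0.9000)]
controls = {"e1": 0.3004, "e2": 0.3008, "e3": 0.4500, "e4": 0.7000}
outcomes = {"e1": 100, "e2": 110, "e3": 105, "e4": 90}

pairs = nearest_neighbour_match(treated, controls, caliper=0.001)
averaged = radius_match(treated, controls, outcomes, radius=0.001)
print(pairs)
print(averaged)
```

In this toy example the child "w3" has no control within the caliper and so is left unmatched, which illustrates why, as note 7 observes, not every child can be included in the comparison.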

Notes on contributors
Chris Taylor is a Professor in the School of Social Science at Cardiff University and a research director in the Wales Institute for Social and Economic Research, Data and Methods (WISERD). He has been researching the geography of education for many years, particularly in relation to school admissions, education and neighbourhoods, education markets and participation in higher education. A central focus of all his research is the impact of education policy on issues of social justice. He was recently awarded an ESRC Mid-Career Fellowship to develop skills in the spatial analysis of the Millennium Cohort Study. He currently leads the official three-year independent evaluation of the Foundation Phase in Wales, the early years flagship policy of the Welsh Government.
Gareth Rees is Professor and Director of the Wales Institute for Social and Economic Research, Data and Methods (WISERD). He has held a number of visiting positions, notably at the University of New South Wales in Sydney and at the University of British Columbia in Vancouver. Key research interests include the social impacts of higher education; lifelong learning and the Learning Society; vocational education and training and economic development; the governance of education policy; and research-capacity building in the social sciences. In addition to his academic research, he has been an adviser to a number of governmental bodies, including the OECD, the European Commission, UK government departments and the Welsh Government. He is an Academician of the UK Academy of Social Sciences and a Fellow of the Learned Society of Wales.
Rhys Davies is a senior researcher in the Wales Institute for Social and Economic Research, Data and Methods (WISERD) and has extensive experience of analysis of large-scale datasets related to employment and the labour market. This has been based on large-scale primary data collection, as well as conducting analysis on secondary and administrative data sources. Key research interests relate to occupational health and safety in relation to the economic environment and structural change; graduate labour market and destinations of leavers from HE and FE; the development and implementation of occupation and social classifications; gender differentials in labour market outcomes; and survey design and data linking.

Figure 1. Inequalities in literacy development of children from 'rich' and 'poor' families by country.

Figure 2. Inequalities in home learning environment and subjective wellbeing of children from 'rich' and 'poor' families by country.

Table 1. Variables used in comparative analyses.
*These variables are used in the propensity score matching.

Table 2. Matched samples (word reading ability at age seven).

Table 3. Matched comparisons in cognitive development to age seven.

Table 4. Regression of geographical factors on children's literacy development at ages three, five and seven.