A critical review of the advantages and limitations of using large-scale national surveys to examine childcare patterns and the ECEC workforce in Britain

Abstract
OECD countries have established statistical collections to ensure quality within Early Childhood Education and Care (ECEC). Focusing on one part of ECEC – preschool ‘childcare services’ – this paper critically reviews statistical collections specifically designed to measure childcare patterns in England alongside UK data collected for other purposes which can be used to examine childcare patterns. The paper evaluates how far these data provide a reliable basis for examining the childcare workforce, how well childcare usage and provision patterns can be analysed and the degree to which the data provide comparable geographical coverage. Results show analysis is restricted by the various ways data-sets count and classify occupations. Differences in geographical coverage make them difficult to compare. More refinement of occupation categories would make existing sources more useful. The themes discussed here are relevant for other countries seeking to understand how best to utilise their statistical collections for examining childcare patterns.


Introduction
A growing body of research recognises that Early Childhood Education and Care (ECEC) brings a wide range of benefits, including social and economic benefits; better child well-being and learning outcomes; more equitable outcomes and reduction of poverty; increased intergenerational social mobility; higher female labour market participation and gender equality; increased fertility rates; and better social and economic development for society at large (OECD 2006). Expenditure on ECEC services has increased over time for most member countries (OECD 2014a). OECD data show that public expenditure on early childhood services in the UK was 0.5% of GDP, compared to 0.7-1.1% in the Nordic countries, which have higher maternal employment levels and lower levels of child poverty (ibid.). The UK figures are likely to be inflated by the early age at which children start school in the UK compared to elsewhere in Europe (at age five compared to age six and even age seven for other parts of Europe), so that many 4-year-olds in the UK are already in formal education (often full-time) [...] below, which are collected from households and employers and which can be utilised to examine childcare patterns because they contain relevant questions as part of their general enquiry. This paper critically reviews the value of statistical collections specifically designed and established in England to measure childcare patterns, alongside other large-scale UK data which have been collected for other purposes but which can also be used to examine childcare patterns. The specific aims are to examine how far existing data provide a reliable basis for examining the childcare workforce, and to what extent existing data can provide a good picture of usage and provision. In both cases, the paper considers whether existing data can provide comparable geographical coverage.
The aims of this paper represent different aspects of childcare which are often considered in isolation but which are important to consider together for a better understanding of the complexity of the childcare system. For example, provision patterns, which provide details about supply and who provides childcare, are important to consider alongside usage patterns, which reflect parental demand for childcare. The geographical coverage can inform the extent to which the available statistical sources can be used to provide comparable information about these two aspects. It is also important to consider geographical coverage for the reasons discussed above regarding the nature of statistical collections in the UK.
The contribution of this paper is to highlight the importance of good-quality national data for monitoring ongoing childcare patterns, which is essential for assessing progress in meeting key government childcare policies. The themes discussed in this paper are relevant for other countries seeking to understand how best to utilise their own statistical collections for examining childcare patterns, particularly those interested in using their own statistical collections beyond their original purposes for monitoring and/or assessing quality early childhood services. In particular, the paper argues that despite having an integrated governance structure (where childcare and preschool education activities are integrated), the UK's statistical systems are still split between covering aspects of education and childcare, causing problems for examining the workforce as a whole. This may resonate with other countries whose governance systems have recently been integrated, but not yet fully.

Aims, data-sets and methods
This paper draws on evidence from a study examining 'The Provision and usage of preschool childcare in Britain' to illustrate and address the key aims of this paper (Simon, Owen, and Hollingworth 2015). The methodology is the secondary analysis of statistical data (Dale, Arber, and Procter 1988). Two kinds of data have been analysed: administrative data and survey data. Administrative data are complete records (except for unintended errors), such as registration data for childcare providers. This contrasts with survey data, which are collected on a sample and necessarily incorporate sampling variation, which means that survey data always have a margin of error (Owen 2017). Survey data need to be weighted to give population estimates.
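The role of weighting described above can be illustrated with a minimal sketch. The responses and weights below are hypothetical and are not drawn from any of the surveys discussed; the function name `weighted_proportion` is likewise an assumption made for illustration. The sketch only shows how a weighted estimate can differ from the raw sample proportion.

```python
def weighted_proportion(responses, weights):
    """Estimate a population proportion from 0/1 responses and survey weights."""
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Each respondent counts in proportion to the number of
    # population units they are taken to represent.
    weighted_yes = sum(r * w for r, w in zip(responses, weights))
    return weighted_yes / total_weight


# Hypothetical sample: 1 = family uses childcare, 0 = does not.
uses_childcare = [1, 1, 0, 1, 0]
# Hypothetical grossing weights (population units represented by each case).
grossing_weights = [1200.0, 800.0, 1500.0, 1000.0, 1500.0]

unweighted = sum(uses_childcare) / len(uses_childcare)
weighted = weighted_proportion(uses_childcare, grossing_weights)
print(f"unweighted: {unweighted:.2f}, weighted: {weighted:.2f}")
```

In this invented example the non-users carry larger weights, so the weighted estimate is lower than the unweighted sample proportion.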
The Family Resources Survey (FRS) and the Childcare and Early Years Survey of Parents (CEYSP) are the main data sources for analysing parents' use of childcare. The FRS, which has been running since 1992, is a continuous survey conducted on behalf of the UK Department for Work and Pensions (DWP). It is the leading household survey for the collection of household income (Simon, Owen, and Hollingworth 2015). The CEYSP, which has been running for the past 10 years, is conducted every two years by the Department for Education, and therefore only includes parents using childcare provision in England. It is used to provide information to help monitor the progress of policies and public attitudes in the area of childcare and early years education. Additionally, the study drew on Understanding Society (US), the UK household longitudinal study, to complement the results provided by the FRS and CEYSP. The FRS contains just under 4000 cases of families with children aged 0-4 for each survey year from 2006-2007 through to 2010-2011. These sample sizes were sufficient for some year-on-year analysis without needing to combine survey years. The CEYSP includes around 3000 cases per survey year for children aged 0-4.
The Labour Force Survey (LFS), running since 1992, is the largest and most comprehensive source of data on the workforce, collecting data from approximately 60,000 households per quarter across the UK. The Annual Survey of Hours and Earnings (ASHE) is based on a 1% sample of employee jobs taken from HM Revenue and Customs PAYE records. Information on earnings and hours is obtained from employers (Ormerod 2006). Both of these data-sets were examined for data on the childcare workforce for England.
In the LFS, people's jobs are classified using the four-digit 2010 Standard Occupational Classification (SOC). Using this classification system, three individual occupations ('Nursery nurses & assistants', 'Childminders & related occupations' and 'Playworker') were combined to make up the 'childcare' workforce. These were analysed, both separately and together, as childcare occupations. ASHE, running since 1997, also uses the SOC. The Childcare and Early Years Providers Survey (CEYPS) is the third main data source used for analysing the 'childcare' workforce. Like the parents' survey (CEYSP), the CEYPS, which has been running since 1998, is conducted every two years by the Department for Education. Within the UK, childcare provision is a nation-based competence, and therefore both the CEYSP and the CEYPS cover England only. The CEYPS includes group-based provision, out of school provision, childminders and early years settings in maintained schools (DfE 2013a).
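The way the three occupations are combined into a single 'childcare' group can be sketched as follows. The SOC 2010 codes and labels are those quoted in this paper (6121, 6122 and 6123); the record layout, the field name `soc2010` and the sample records are hypothetical and do not reflect the actual LFS variable names.

```python
from collections import Counter

# SOC 2010 unit groups treated as 'childcare' in this paper.
CHILDCARE_SOC2010 = {
    "6121": "Nursery nurses and assistants",
    "6122": "Childminders and related occupations",
    "6123": "Playworker",
}


def count_childcare(records):
    """Count records per childcare occupation, plus a combined total."""
    counts = Counter()
    for record in records:
        code = str(record["soc2010"])  # accept codes stored as int or str
        if code in CHILDCARE_SOC2010:
            counts[CHILDCARE_SOC2010[code]] += 1
    counts["All childcare occupations"] = sum(counts.values())
    return counts


# Hypothetical respondent records; "9999" is a placeholder for any
# occupation outside the three childcare codes.
sample = [
    {"soc2010": "6121"},
    {"soc2010": "6122"},
    {"soc2010": "9999"},
    {"soc2010": "6121"},
    {"soc2010": 6123},
]

result = count_childcare(sample)
for label, n in result.items():
    print(f"{label}: {n}")
```

Keeping the per-occupation counts alongside the combined total mirrors the analysis of these occupations both separately and together.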
The study additionally included statistics collected by the government departments of England (including data on the provision for children under the age of five in the maintained, private, voluntary and independent sectors in England), and statistics on the registration of childcare provision, including full day care, sessional day care and data on childminders collected by Ofsted, the childcare regulation body for England.
The LFS, the FRS, the UK Household Longitudinal Survey (UKHLS), the British Social Attitudes Survey (BSA) and the CEYSP are key large-scale data-sets for examining informal care of preschool children. These data-sets provide some information about the volume of informal care within and outside of the child's home. There is no single agreed definition of 'informal childcare' but a useful definition recently employed is: 'Childcare that is largely unregistered by the state for quality control, child protection and/or taxation purposes' (Rutter and Evans 2012). Using the definition above, informal childcare includes childcare offered by: grandparents, other relations of the child, older brothers and sisters of the child, and neighbours and friends of the child's parents. Most of this childcare is unpaid or provided on a reciprocal or bartered basis. Although childminders and nannies provide childcare in the 'home context' (the former in the childminder's home and the latter in the parent's home), these are paid-for services.

Results
This section presents a critical analysis of the advantages and limitations of using key statistical collections within England that are specifically designed to measure childcare patterns, alongside other large-scale UK data collected for other purposes which can be used to examine childcare patterns. The key results of this analysis are presented in Table 1. The table and the discussion below address each of the questions set out above.

How far do existing data provide a reliable basis for examining the workforce?
The data-sets available for analysing the ECEC workforce have a number of important limitations. The categories of the Standard Occupational Classification keep childcare and education staff separate. The Childcare and Related Occupations group includes most childcare staff, but excludes nursery managers and owners: these are classified with 'Teaching and other educational professionals not elsewhere classified'. This makes it difficult to get a full picture of the childcare part of the workforce. For the education workforce, nursery teachers are in a single category with 'Primary and nursery education professionals', so it is not possible to separate those teachers working with preschool children from those working exclusively with children of compulsory school age.
Another key issue in considering the reliability of these data-sets is the extent to which these classifications may have changed over time. For 2005, the LFS provides information using SOC 2000 codes (ONS 2000); for 2012-2014 it uses SOC 2010 codes (ONS 2010). Although the occupations for classifying the childcare workforce mentioned earlier have remained largely unchanged between these two sets of SOC codes, there were some subtle changes in the labelling of some of the categories which may make some difference to how people were classified between SOC 2000 and SOC 2010. For example, code '6123', labelled 'Playgroup leaders/assistants' in SOC 2000, became 'Playworker' in SOC 2010, and code '6121', 'Nursery nurses' in SOC 2000, became 'Nursery nurses and assistants' in SOC 2010 (Table 2). The change to 'Playworker' could mean a broader category, in which case this change is likely to be problematic for analysing childcare for preschool children, since not all playworkers work with children under five years of age. The label for nursery nurses now also names assistants, which suggests a broader category. However, although the job title changed, the text describing the occupation made it clear that the same people were to be included in 2000 and 2010. The SOC code '6122: Childminders and related occupations' has improved from SOC 2000 (when it was labelled 'other childcare and related occupations') because it now includes explicit reference to childminders. However, it still includes a large range of other occupations such as nannies and au pairs. It would be more useful for users of these data if the SOC were coded so that childminders became a category on their own and nannies could be distinguished from other forms of childcare. This is because nannies and au pairs differ from childminders in the nature of their work and are not legally required to be registered with Ofsted (Simon, Owen, and Hollingworth 2016).
Indeed, the LFS and Ofsted report discrepant numbers of childminders. For example, in 2014 in England, Ofsted reported 53,000 registered childminders and the LFS reported 100,916. This compares with 61,929 reported by Ofsted and 102,964 by the LFS in 2008. The decline in childminders between 2008 and 2014 is greatest in the Ofsted statistics. The difference between the sources is likely to be a result of the variation in the way 'childminders' are defined (e.g. the inclusion of 'related occupations' in the LFS); the decline in childminders between 2008 and 2014 reported in both sources could indicate a rise in unregistered (illegal) childminding (Simon, Owen, and Hollingworth 2015). There are some important implications of leaving out both nursery teachers and managers from the childcare workforce. Previous research suggests teachers and managers are likely to be better qualified and paid than other childcare workers in the sector (Simon and Owen 2007), and by not including them, average levels of pay (and possibly qualification levels too) may be underestimated. This problem is especially relevant in relation to other statistics on the childcare workforce. For example, the CEYPS reports pay for different levels of seniority of childcare staff; pay for different grades of staff is much more useful for provider organisations than just a flat level of pay for childcare workers. The ASHE similarly uses the SOC to classify and count occupations, and therefore has similar advantages and limitations to those discussed above for the LFS. As these examples show, the SOC counts some but not all childcare workers. This is a major drawback considering government policy announcements about the education and care of children combined (Childcare Act 2006).
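The size of the discrepancy in the childminder counts quoted above can be made explicit with a short calculation. The four counts are taken from the text; the helper name `percent_change` is an assumption made for illustration only.

```python
def percent_change(start, end):
    """Percentage change from start to end (negative values are declines)."""
    if start == 0:
        raise ValueError("start must be non-zero")
    return (end - start) / start * 100.0


# Childminder counts for England as quoted in the text.
counts = {
    "Ofsted": {2008: 61929, 2014: 53000},
    "LFS": {2008: 102964, 2014: 100916},
}

changes = {
    source: percent_change(by_year[2008], by_year[2014])
    for source, by_year in counts.items()
}
# How many childminders the LFS reports per Ofsted-registered childminder.
ratio_2014 = counts["LFS"][2014] / counts["Ofsted"][2014]

for source, change in changes.items():
    print(f"{source}: {change:.1f}% change, 2008 to 2014")
print(f"LFS/Ofsted ratio in 2014: {ratio_2014:.2f}")
```

On these figures the Ofsted count falls by roughly 14% while the LFS count falls by about 2%, and in 2014 the LFS reports nearly twice as many childminders as are registered with Ofsted.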
Unlike the LFS, the CEYPS collects information from employers in childcare establishments. This means it provides information only about workers employed by childcare providers and excludes some other types of childcare worker, such as nannies or au pairs. In contrast, the LFS provides data reported by employees, and the LFS does not entirely agree with the CEYPS in terms of workforce numbers, characteristics and pay (Simon, Owen, and Hollingworth 2016). For example, while the CEYPS (2013) shows an increase in childcare workers between 2008 and 2011, followed by a decrease between 2011 and 2013, the LFS suggests a decrease of 5% for the childcare workforce in Great Britain (GB) from 2005-2007 to 2012-2014 (Table 3). However, taking England alone, the LFS reports a decline of approximately 1% (Table 3).
Taken together, the information presented here shows that the LFS is the best source for providing detailed information about the characteristics, pay, qualifications and working conditions for the childcare workforce. However, the LFS does not allow education to be counted with childcare which is a major drawback for comparing to other European countries which commonly report education and care together.

To what extent do existing data provide a good picture of usage and provision?
The LFS and CEYPS provide very good up-to-date information about the qualifications, pay and working conditions of the 'childcare' workforce. For example, the LFS shows that in 2012-2014, 13% of this group had a degree-level qualification or above and 73% of childcare workers had NVQ level 3 or higher (Simon, Owen, and Hollingworth 2015). While NVQ 3 remains in use only in Wales and Northern Ireland and has been replaced by the Early Years Educator and Early Years Teacher Status qualifications in England, NVQ levels are reported here because they are the common units for qualifications used in the LFS. The pay for the childcare workforce was about 10% above the minimum wage (Simon, Owen, and Hollingworth 2015). The LFS also shows the 'childcare' workforce had very low pay, on average only 10 pence above the national minimum wage for the UK (Simon, Owen, and Hollingworth 2016). While the LFS enables examination of overall qualification levels, such as the percentage increase in qualifications to at least NVQ level 3, which remains a key government target (DfE 2013c), it does not provide any data about relevant childcare qualifications. Data on specific qualifications such as the graduate-level 'Early Years Professional Status' qualification would be beneficial for assessing increases in the quality of the workforce, because they would indicate not only whether qualification levels were rising but also to what extent people employed in the childcare workforce were gaining key target qualifications which offer specific quality-rated training in the field. The Department for Education in England also produces annual statistics on early years provision for children under five years in the maintained, private, voluntary and independent sectors in England (DfE 2014).
There are many tables in this annual publication, most of which refer to 'education' staff, who are people involved in the provision of education for young children (such as staff working in primary schools or nursery classes) rather than staff providing services for the care of children (such as nannies, childminders, etc.). However, it is possible to separate statistics relating to staff that could be included within the childcare workforce and exclude tables that report solely on 'education' staff. The data reported in these statistics are collected through the Early Years Census and are therefore likely to be an undercount of children and providers. This is because only those providers with children receiving some funded early education are required to make an Early Years Census return; the Early Years Census and the 'Provision for Children' publications do not provide a count of all children aged two, three or four in private, voluntary and independent providers. A key advantage of the Department for Education's annual statistics over the LFS is that the former source provides a useful table specifying the proportion of staff employed within different provider settings with 'Qualified Teacher Status' (QTS) or 'Early Years Professional Status' (EYPS), which DfE uses to monitor changes over time in highly qualified staff delivering early education in England. In contrast, the LFS is limited to providing information about the highest levels of qualification of people working in the childcare workforce and does not differentiate qualifications relevant to the early years workforce.
Patterns of childcare usage were examined using the CEYSP and the FRS. The analysis of childcare usage patterns facilitated measurement of the extent to which childcare was being used by families, the types of childcare being used and details about the socio-demographic characteristics of the families using and not using childcare. For example, the use of childcare is very high (the FRS shows 68% of families were using some form of childcare), with around two in five families using more than one type (the FRS shows 42% of families are using more than one type of childcare, Table 3) (Simon, Owen, and Hollingworth 2015). The proportion of families using more than two types of childcare has increased over time, with those using two or more types of childcare most likely to be combining care by grandparents with some form of formal service (Simon, Owen, and Hollingworth 2015).
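Because these usage figures are survey estimates, each carries a margin of error. The sketch below gives a rough sense of its size for the 68% figure, assuming a simple random sample of 4000 families. Both the sample size (rounded up from the 'just under 4000' FRS cases per year mentioned earlier) and the simple random sampling assumption are illustrative simplifications; the margin under the actual survey design and weighting may well be larger.

```python
import math


def srs_margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion under simple random sampling."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1.0 - p) / n)


# Illustrative inputs: 68% of families using childcare, n = 4000 (assumed).
estimate = 0.68
sample_size = 4000

moe = srs_margin_of_error(estimate, sample_size)
lower, upper = estimate - moe, estimate + moe
print(f"{estimate:.0%} +/- {moe:.1%} (approx. {lower:.1%} to {upper:.1%})")
```

Under these assumptions the margin is about 1.4 percentage points either side of the estimate.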
Other research has shown the importance of informal care for preschool children (e.g. Rutter and Evans 2012) for understanding patterns of childcare usage. However, very little is known about the characteristics of those providing informal childcare and there is no available large data source currently providing information about patterns of informal care usage (Simon, Owen, and Hollingworth 2015). The FRS does provide a wealth of data about people receiving or providing informal care within and outside of households. However, the question asks: 'And how about people not living with you: do you/(or does anyone in this household) provide any help or support for anyone not living with you who has a long-term physical or mental ill-health problem or disability, or problems relating to old age?'. The data available therefore cover only a subset of all children receiving care: those with a long-term physical or mental ill-health problem or disability. Even if one looks at informal care from the viewpoint of the children receiving it, there are varying limitations with the available data. For example, the FRS provides information about the children receiving informal care but does not provide any information about how much (if anything) parents pay for informal care. The BSA provides information about informal care as provided by grandparents through questions such as: 'Do you ever look after your grandchild or grandchildren', and 'About how many hours a week do you spend looking after your grandchild or grandchildren'. However, the BSA stopped including questions on informal care after 2009, preventing any analysis of informal care after 2009. The CEYSP provides hours of informal care but only for parents in England and so cannot be analysed for other geographical areas.
A further issue is that most of the data available for examining preschool childcare provision and use are cross-sectional, which makes it difficult to track changes in how families utilise different forms of childcare related to changes in their circumstances (such as movement in and out of work for mothers). Understanding Society (US) has been utilised by other researchers to analyse patterns of informal care (Wellard 2011) but unfortunately only includes a small number of preschool children (the most recent data wave included nearly 4000 under 5s).

To what degree do the existing data provide comparable geographic coverage?
Statistics within the UK are collected at different geographical levels. Some statistics are for the whole of the UK, some are for GB (excluding Northern Ireland) and others for the constituent nations of the UK (England, Scotland, Wales and Northern Ireland). Although Northern Ireland is not included in the boundaries of GB, it is possible to include Northern Ireland in some of the data sources, including the LFS and ASHE. While there is an annual publication produced by the 'Employers For Childcare Charitable Group' which reports on childcare costs in Northern Ireland, there are no regular surveys similar to the CEYSP available in Northern Ireland. For England, Scotland and Wales, the data sources are very varied in terms of their geographical coverage of GB. For examining childcare usage, the FRS and US data-sets provide good coverage of the UK but, as discussed earlier, are not as detailed in terms of their content on childcare usage as the CEYSP. The CEYSP remains the best source for providing a comprehensive annual picture of patterns of childcare usage by parents but, being restricted to England only, these data do not allow comparisons with other parts of GB. This is a real limitation for making important comparisons with other localities. For example, it is difficult to compare available data on childcare usage for preschool children in GB with childcare usage in other parts of Europe because statistics for and by the EU or OECD are usually given for the UK as a whole but the detailed data available in the CEYSP are for England only.
As discussed earlier, the LFS is restricted by the SOC in the examination of occupations, meaning it excludes managers. However, the LFS is still a very important data source for examining the 'childcare workforce'. Indeed, it is the only available data source to provide detailed information about the workforce in terms of its characteristics, pay and other working conditions for the whole of the UK. As the LFS also provides the same information for other workforces, it can usefully be employed to compare the childcare workforce with other occupations not counted as childcare. Indeed, comparisons with 'all other occupations' reveal that childcare workers are poorly paid compared with other occupations (Simon, Owen, and Hollingworth 2015). By covering the UK, the LFS can be compared to the ASHE, which also usefully provides information about pay for the childcare workforce for the UK. The CEYPS, however, which is the largest survey specifically of childcare providers, and which usefully provides information about childcare managers, is restricted to England and so unfortunately cannot be used to compare with the LFS for GB.
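The comparability constraint described in this section can be expressed as a simple set intersection: two sources can only be compared for the nations that both cover. The coverage table below is a simplified reading of the descriptions in this paper, and the names `COVERAGE` and `shared_coverage` are hypothetical.

```python
UK_NATIONS = frozenset({"England", "Scotland", "Wales", "Northern Ireland"})
ENGLAND_ONLY = frozenset({"England"})

# Simplified geographical coverage of each source, as described in the text.
COVERAGE = {
    "LFS": UK_NATIONS,
    "ASHE": UK_NATIONS,
    "FRS": UK_NATIONS,
    "CEYSP": ENGLAND_ONLY,
    "CEYPS": ENGLAND_ONLY,
}


def shared_coverage(*sources):
    """Return the nations covered by every one of the named sources."""
    if not sources:
        raise ValueError("at least one source is required")
    return frozenset.intersection(*(COVERAGE[name] for name in sources))


print(sorted(shared_coverage("LFS", "ASHE")))
print(sorted(shared_coverage("LFS", "CEYPS")))
```

On this reading the LFS and ASHE can be compared across all four nations, whereas any comparison involving the CEYPS or CEYSP is limited to England.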
In addition to the major longitudinal and cross-sectional data-sets, a number of statistical series collect information about childcare provision in England. The Scottish government also publishes some more detailed statistics about qualifications for the childcare workforce. However, these data were only available between 2008 and 2010, which restricts comparison of trends over time and makes it impossible to obtain an up-to-date picture of childcare-specific qualifications for childcare workers in Scotland. In contrast with England and Scotland, very few data are collected about care workers by the Welsh government apart from the numbers of childcare workers in different settings (Care and Social Services Inspectorate Wales 2014).

Discussion and conclusions
Existing data are very good for examining formal childcare provision and usage but very weak for providing information about informal childcare for preschool children; data on the characteristics of people providing informal care for preschool children are practically non-existent and there is a real need for this information. The LFS is the best source for providing information about the characteristics, pay, qualifications and working conditions for the childcare workforce, allowing comparisons to be made over time and with other occupations, which is valuable not only for placing childcare workers in the context of other workers but also for making international comparisons about childcare provision. However, there are still notable limitations in carrying out analysis of childcare provision and usage using the available sources.
Despite having a partially integrated governance structure (where childcare and preschool education activities are nominally integrated in terms of inspection or curricular requirements, but not in terms of the workforce), the UK's statistical systems are still split between covering aspects of education and childcare, causing problems for examining the workforce as a whole. The limitations of using the SOC in the LFS mean managers are excluded and workers providing early education for preschool children cannot be 'joined up' with childcare. For childcare usage, certain important features, such as the number of hours children are being cared for informally by grandparents or other family/neighbours, are limited or absent in the data sources discussed in this paper. Differences in geography between the available data sources also make them difficult to compare. Some sources have more extensive data coverage than others. For example, the CEYSP and CEYPS are very good in terms of their content but only provide coverage of England. The LFS, FRS and ASHE are more extensive in terms of their geographical coverage but are limited in terms of their content and/or sample size. Additionally, while English data sources provide good national data about childcare provision, Scottish sources, and in particular Welsh data, offer much less information, which makes it problematic to compare what is happening in terms of childcare provision and usage currently and over time.
There are some key factors that could improve existing data sources on childcare provision and usage. First, SOC categories could be developed that would better capture the work of the ECEC workforce, joining up those working within educational settings with those working in other settings, along with a specific and separate SOC code for managers working in ECEC. This would give more value to the LFS by giving those interested in childcare provision an opportunity to analyse detailed statistics about the pay and working conditions of the whole ECEC workforce, and would allow better comparison between the LFS and CEYPS than is currently possible. Second, more coherent statistics collected across the countries of GB are needed in order to allow comparable and comprehensive analysis between, within and across GB. Perhaps Scotland and Wales could consider, for example, running a survey with comparable questions to those asked in the CEYPS, which would enable a much more detailed picture of childcare provision to be obtained than is currently possible with the existing data. Third, further research, and ideally the collection of routine statistics, capturing more information about parental childcare choices would enable research to better inform childcare policy about how to match services to need. Finally, the inclusion of questions in large-scale surveys such as the FRS on the characteristics of people providing informal childcare to preschool children would provide much needed information about how informal carers support formal childcare provision. The childcare workforce is shrinking in size over time and families are relying on informal care alongside formal services (Simon, Owen, and Hollingworth 2015).
It is important to understand more about informal carers so that government can better understand the impact of this caring on people and society, and so that policies encouraging more women back to work can take account of the role played by informal carers in enabling this to happen.