The Population of Non-corporate Business Proprietors in England and Wales 1891–1911

This article uses population censuses to provide the first consistent counts of the population of business proprietors for 1891–1911. After appropriate adjustments for imperfect Census design, the article confirms the persistence of own account self-employed as the most common businesses throughout the period. However, it identifies a turning point around 1901 when business numbers decisively shifted towards larger firms, as employers with waged workers began substituting for many own account businesses. Developments were, however, multi-faceted, with important sector differences, and some fields of female business beginning to take off over the period, especially in retail and the professions.

KEYWORDS Business proprietors; self-employed; employers; sector change; gender; female entrepreneurship; Census; family firms; sole proprietors; professions; maker-dealers; retail; manufacturers; mining; farmers

Introduction
This article gives the first accurate and consistent counts of the whole population of non-corporate business proprietors in England and Wales 1891-1911, and a breakdown into employers and own account self-employed by sector and gender. Business historians have long lamented the lack of large-scale statistics for the nineteenth- and early twentieth-century British business population. There was no national registration process and consequently no contemporary official data on the whole business population. Hannah (2007: 415; 2014) referred to information on nineteenth century business numbers as a 'statistical dark age'. Jeremy (1998: 331) argued that the lack of data leads to reliance on varied sources that are so 'scarce and variable' that economy-wide understanding of business dynamics is highly restricted: 'over the long period the statistics are not comparable and therefore not to be trusted for secular comparisons'.
A useful debate has emerged on the need for greater 'development of generally accessible machine-readable datasets' (Wardley 2001: 129). However, development of such databases has chiefly focused on corporate and large businesses, with significant research into the 100 largest firms; but nothing of scale for Britain has emerged on non-corporate and smaller businesses (see Hannah, 1983; Jeremy, 1991: 568-72; and recent controversies about the 100 largest firms: Wardley, 1999; Hannah, 2014). Yet the importance of non-corporate businesses as both partnerships and individual proprietors for this period is well-established. Alfred Marshall (1919: 314) referred to non-corporate partnerships as 'the representative firm in most industries and trades', Pollard (1965: 233) called them 'the typical firm', and Clapham (1932: 112) concluded that 'masters', sole proprietors, and partnerships remained the most numerous form of enterprise into the 1930s, with Hannah (2014: 867) arguing that partnerships offered most of the advantages of 'corporate-ness' without becoming companies. While data do exist on some firms, especially larger ones, and on some sectors at some time-points as a result of official enquiries, regulatory reports and other records, there is no source with systematic national coverage aligned over time that covers all or most firms, especially small firms and self-employed sole traders. Erickson (1959: 7) noted that even for iron and steel, a relatively well-recorded industry, much business history derives from 'the proportion of the whole population which historical sources enabled us to study'.
The lack of data on the majority of businesses, which were overwhelmingly small and non-corporate, has limited analysis of business trends and proper understanding of the role of different types of firm within the size distribution. This makes assessment of distribution between entrepreneurial, corporate and other activities problematic. As Crossick (1995: 40) observed, in the nineteenth century 'the vitality of small enterprise was not sufficiently important in Britain to shape the gathering of Census statistics'; Payne (1988: 22) argued small businesses were the 'regiments of the anonymous' with a tendency for business histories to be 'inherently biased towards the successful'. Similarly, while the 1890s were observed by Marshall as the period when corporations began to take over, it has been impossible to assess the speed and extent of this in relation to the rest of the business population. This article seeks to fill some of these gaps; it reveals the persistence of non-corporate firms, individual self-employment, and the importance of sector and gender differentiation. This is a counterpoint to the dominant trend of much business history for the period before World War I that focuses on the emergence of the large firm and corporations.
A source of information on most of the business population, including almost all small enterprises, has been lurking within sight but out of reach. The population Census contained information on business proprietors through counts of employers and the self-employed, but much of this information was either never published or was reported only in partial or limited summaries. The original Census household records contained the information but were inaccessible on the scale required to yield full population estimates. However, since 2014 the full individual records of the Censuses from 1851 to 1911 have become available electronically through the Integrated Census Microdata (I-CeM) database (Higgs and Schürer, 2014). 1 This can be used to identify business proprietors in a way not previously possible.
The article uses the population Census records to construct a time series of the population of business proprietors for the first period in which the Census fully covers all non-corporate employers and self-employed: 1891-1911. Self-employment was defined at this time as individuals operating on 'own account' not employing others. The focus is on estimating the 1891-1911 business population by accurate and consistent identification of non-corporate business proprietors. As the Censuses provide data on all individuals for the whole population, this is an essential preliminary for opening up a wide range of opportunities for whole-population research by business historians at the individual level. This is a step towards analysis of the business characteristics of all individuals, their industry sector, gender, family structure, and location, opening major opportunities for 'big data'. The estimates from this article provide the starting point for such analysis by accurately identifying the population of proprietors. These are part of a UK Data Archive (UKDA) database deposit available to all business historians. 2
The next section of the article discusses the information that has been previously available from the Census and its limitations. Section 3 outlines how I-CeM can be used to identify the business proprietors. Section 4 develops the methodology. Section 5 presents the estimates of the population of business proprietors by main sector, business form, and gender. The conclusion summarises the results and assesses the implications of these new estimates for further research.

Published Census business proprietor numbers and limitations
Some information on the business population identified in Censuses of this period was published. For example, in 1891 a table of the numbers of employers and 'own account' businesses for a selection of occupations by gender aged 10 years and upwards was published (PP, 1893-4: x-xx), but this excluded over 70 occupations likely to include business proprietors, with important omissions for all agriculture, mining and quarrying, some textiles, all professions, merchants, agents, dealers in money, and insurance. As shown in Table 1 (bottom row), the published tables included only 74% of the proprietors contained in the manuscript Census records now available electronically. In 1901 the same occupations were excluded from publication with coverage reduced to 73% of all proprietors (PP, 1903: 186-201). The 1911 publication included farming, and laundry and bathing services for the first time (PP, 1913: 12-67). This markedly increased the coverage to 83% because these classes included numerous proprietors.
Publication coverage varied by enterprise type. Table 1 (rows 4 and 7) shows that, for 1891 and 1901, the coverage of employers was 67-72% compared to own account at 76%. In 1911, mainly as a result of including farmers, published employer summaries markedly improved to 91% but own account remained low at 77%. However, the greatest deficiency for business historians in using the published tables from the 1891-1911 Censuses was inaccurate identification of the business responses analysed. Table 1 illustrates some of the difficulties of using the published data. If we believed the published Census statistics (row 2), there were 113,000 or 29% more employers in 1891 than 1901. Conversely, own account (row 5) grew by 34% over the 10 years 1891-1901, before falling by 8% in 1901-11. These numbers look implausible and are not supported by any previous secondary analysis or contemporary observation. The published figures appear even more implausible when individual sectors are examined. For example, employer numbers in the large categories of lodging and boarding-house keepers, innkeepers, physicians, beer sellers, dressmakers, shopkeepers, shoe and boot makers all appeared to fall by over 40% between 1891 and 1901; other large categories such as blacksmiths, grocers, cow keepers, watch and clock makers, laundry workers, biscuit dealers and butchers appeared to decline by over 30%; wheelwrights, greengrocers, solicitors, fishmongers and masons appeared to decline by over 25%. Consolidation was occurring in some of these sectors at this time, but there is no reason to believe this scale of change, which would have occasioned major crises for these sectors; indeed, most literature suggests that most of these sectors saw absolute business numbers grow over the period at least in line with population growth.
Hence it is crucial before using these data to assess the 1891 Census for misallocation bias, a possibility that was observed by Census administrators but was not tackled (PP, 1893-4: 36). Moreover, a further set of limitations results from the fact that many householders did not respond to the Census question at all, so that the published statistics and those recorded in the electronic records shown in Table 1 are only part of the total possible population of employers and own account. Thus, non-response as well as potential misallocation biases in 1891 have to be adjusted before we can obtain reliable estimates of proprietor numbers.

The Census as a source for the non-corporate proprietor population
The data used in this article are derived from the original householder responses electronically captured in I-CeM version 2 (Higgs et al., 2015; Schürer et al., 2016). These records are transcriptions made by the commercial genealogy provider FindMyPast (FMP) in conjunction with The National Archives (TNA). For 1891 and 1901 the Census Enumerators' Books (CEBs) were transcribed, and for 1911 the original householder returns. The contribution of I-CeM has been to convert the FMP-TNA genealogy resource into a coded database that is amenable to research analysis. However, I-CeM provides only a starting point, especially for use as a business source.
I-CeM attempted to provide a standardized coding of all Census respondents. However, a number of corrections to Census responses in the original CEBs, which would have been made by Census clerks before production of published tables, cannot be captured. As a result it is important to impose on I-CeM data a number of controls to achieve consistency. The issues affect all years, but are greater in 1911 because use of the original householder returns removes any intervention by Census enumerators that would have limited variations in, for example, how occupations were described. To reduce the complexities, a significant amount of additional data cleaning, occupational coding correction, and improvement has to be undertaken to achieve consistent description of business activities.
One of the most important prerequisites is identification of industry sector. To achieve this, the occupational coding in I-CeM must be checked against the original descriptors used by Census respondents. This was undertaken in a three-stage process. First, the total proprietors in each occupation category were compared with the totals given in published Census reports (PP, 1893-4: x-xxv; 1903: 186-201; 1913: 12-25). The work undertaken by Census clerks thus guides the corrections needed. For any category where the I-CeM total was more than twice the reported total, the most common occupation descriptor strings were inspected and all strings with more than 100 proprietors or more than 5% of the total in that category were checked and, if necessary, more accurately coded. Secondly, all occupation strings with 25 or more proprietors were checked by hand and corrected if necessary. Thirdly, all proprietors with portfolios of activity, indicated by multiple business descriptors, were checked and coded by hand (about 10% of all proprietors).
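The three-stage check can be illustrated with a minimal sketch. The thresholds (twice the published total, 100 proprietors, 5% of the category, 25 proprietors) come from the description above. Everything else is an assumption made for illustration: the function name `flag_for_review`, the input layout of one `(category, descriptor)` pair per proprietor, counting the 25-proprietor threshold per descriptor string across all categories, and the crude `PORTFOLIO_PATTERN` heuristic for spotting multiple business descriptors. It only selects strings for inspection; the actual re-coding was done by hand.

```python
import re
from collections import Counter, defaultdict

# Hypothetical heuristic: a descriptor naming several activities
# ("fishmonger & grocers traveller", "farmer, coal merchant manager").
PORTFOLIO_PATTERN = re.compile(r"\s(&|and)\s|,")


def flag_for_review(records, published_totals):
    """Select occupation strings for manual checking.

    records: iterable of (category, descriptor) pairs, one per proprietor.
    published_totals: {category: total reported in the published Census tables}.
    """
    per_category = defaultdict(Counter)
    per_string = Counter()
    for category, descriptor in records:
        per_category[category][descriptor] += 1
        per_string[descriptor] += 1

    # Stage 1: categories whose coded total is more than twice the published
    # total; within them, strings with >100 proprietors or >5% of the category.
    stage1 = set()
    for category, counts in per_category.items():
        total = sum(counts.values())
        published = published_totals.get(category)
        if published is None or total <= 2 * published:
            continue
        for descriptor, n in counts.items():
            if n > 100 or n > 0.05 * total:
                stage1.add(descriptor)

    # Stage 2: every string held by 25 or more proprietors.
    stage2 = {d for d, n in per_string.items() if n >= 25}

    # Stage 3: every string that looks like a portfolio of activities.
    stage3 = {d for d in per_string if PORTFOLIO_PATTERN.search(d)}

    return {"stage1": stage1, "stage2": stage2, "stage3": stage3}
```

A string can appear in more than one stage; in practice each would be inspected once and re-coded only where the original descriptor showed the I-CeM code to be wrong.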
Overall, for each year, occupations were corrected for around 300,000 individuals representing over 1.5% of the working population; of these, about 55,000 in each year were corrections for business proprietors, around 3% of the total. These checks significantly enhanced accuracy and ensured aligned and consistent estimation. However, it is important to note that, despite these efforts, and despite the Census Instructions, a proportion of people gave unspecific occupation titles (such as 'merchant', 'manufacturer', 'general labourer' etc.). Consequently, although counts of proprietors as a whole are not affected, the industry to which they are assigned cannot be identified and this constrains efforts to calculate mean employee size using occupational codes to assign workers. Other consequences of this constraint are discussed in the conclusion.
The analysis must also correct for biases in the original Census process. The population Census was not a business Census. There was a legal obligation to reply and to provide accurate information, which ensured near-complete coverage and a high level of accuracy of what it collected. However, it was designed and administered by the General Register Office (GRO) to count the population primarily for demographic analysis and assessment of occupation-specific mortality (see Higgs, 2005), with information on industry and the economy a secondary consideration. This resulted in defects of survey design and administration.
The employment status question initiated in 1891 asked householders to put a cross in one of three columns (numbered 7, 8 and 9) headed, respectively: 'employers', 'employed', or 'neither employer or employed'. These three columns were grouped with the occupation Instruction under a heading 'Profession or occupation'. The terms used were defined more fully in the general Instructions as follows:
A cross must be made in Column 7, headed 'Employer', when a person is a master, employing under him workers in his trade or industry; in Column 8, headed 'Employed', when the person is working in a trade or industry under a master; and in Column 9, headed 'Neither Employer nor Employed', when the person neither employs other workmen in his trade or industry, nor works for a master, but works on his own account. Married women assisting their husbands in their trade or industry are to be returned as 'Employed'. 3
In recognition of difficulties with this design, the Census Instructions were modified in subsequent Censuses. In 1901 and 1911, rather than putting crosses in columns, householders had to write their employment status in a single column. For 1901 (and almost identically in 1911) this Instruction was: 4
Write opposite the name of each person engaged in any trade or industry, either (1) 'Employer' (that is, employing persons other than domestic servants) (2) 'Worker' (that is, a worker for an employer), or (3) 'Own Account' (that is, neither Employer nor working for Employer, but working on own account).
The Census Instructions set out the distinction between the three kinds of employment status and also explicitly included married as well as other women. This should have resulted in all sectors, genders and ages of employers and own account replying. The Instructions for 1901 and 1911 were generally believed by GRO to have worked well, and a similar question has been included in all subsequent Censuses. 5 For example, in 2011 the Census question was: 'In your main job, are (were) you: An employee? Self-employed or freelance without employees? Self-employed with employees?' 6
The 1891-1911 Instructions and administration introduced important constraints that have to be managed when using these Censuses to estimate proprietor numbers. First, all three years put the least common option (employer) first and the most common (worker) last, something considered a possible defect in modern survey design. Although question ordering is recognised as an imprecise science (Groves et al., 2004), when combined with the format of multiple columns in 1891 this probably increased the tendency to over-record the first option (employers) in that year.
Secondly, all three years suffered from a lack of priority given to this question by Census administrators, resulting in high levels of non-response bias. It has been previously recognised as an important constraint that the GRO included this question only reluctantly (Schürer, 1991: 20-26; Higgs, 2005: 112). This was partly because of the costs of administration and processing, which explains the reluctance to tabulate and publish the results; but it mainly arose because the GRO continued to see the Census as a primarily demographic and medical assessment and had little interest in the economy. The GRO was forced to add the question only because of sustained pressure from the Treasury, bodies such as the Royal Statistical Society, prominent economists and social statisticians such as Charles Booth and Alfred Marshall who wrote and publicised a lengthy memorandum of suggested reforms (Acland et al., 1890), and others concerned to improve the value of the Census for information about industry and the economy. Following the recommendations of a Treasury Committee enquiry into the taking of the Census, the Local Government Board directed the GRO to include a question almost exactly in the form used in 1891 (Treasury, 1890). Nevertheless, the GRO continued to resist the question, as their comments show (Brydges Henniker 1888: 120-1). As a result, as well as not fully tabulating and publishing results, the GRO gave low priority to data collection and acquiesced in high levels of non-response for the whole period 1891-1911. Responses should have been checked for completeness by the enumerators who collected the Census forms. The non-response rates to the employment status question were far higher than for any other Census question: once the non-economically active are removed (scholars, retired, those living off own means and so on), 16%, 18% and 20% of people in, respectively, 1891, 1901 and 1911 gave no answer.
Moreover, non-responses were non-random, biased by gender, position in household, age and other factors (see: Smith et al., 2017). Consequently, correction for non-response bias by Census respondent category is required.
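To make concrete what correction 'by Census respondent category' can mean, the sketch below applies a generic weighting-class adjustment. It is offered only as an illustration of the principle and is not claimed to be the procedure used for the estimates in this article. It assumes that, within each respondent category, those who gave no answer had the same mix of employment statuses as those who did. The function name `adjust_for_nonresponse`, the category label, and the counts in the usage note are all hypothetical.

```python
def adjust_for_nonresponse(cells):
    """Scale up answered counts within each respondent category.

    cells: {category: {"employer": n, "own_account": n,
                       "worker": n, "no_answer": n}}
    Assumption: non-respondents in a category share the status
    distribution of respondents in that same category.
    """
    adjusted = {}
    for category, counts in cells.items():
        answered = {k: v for k, v in counts.items() if k != "no_answer"}
        n_answered = sum(answered.values())
        if n_answered == 0:
            # Nothing to base a redistribution on; leave the cell as it is.
            adjusted[category] = dict(answered)
            continue
        weight = (n_answered + counts.get("no_answer", 0)) / n_answered
        adjusted[category] = {k: v * weight for k, v in answered.items()}
    return adjusted
```

For example, a hypothetical cell with 10 employers, 30 own account, 60 workers and 25 non-responses receives a weight of 1.25, giving 12.5, 37.5 and 75. Because the weight is computed separately for each category, groups with higher non-response are scaled up more, which is the point of correcting by category rather than applying a single national rate.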
In addition, the 1891 Census suffered from six further defects that are unique to the Instructions used in that year and led GRO to its subsequent redesign. First, the terms 'Employer' and 'Employed' were so similar that they appear to have confused respondents, who could easily misread and cross the wrong column. When combined with 'employer' being the first and least frequent category, this probably increased the number of incorrectly identified employers. Secondly, the 1891 definition of the last category 'own account' was confusing. As noted earlier, the definition was expanded in the general Instructions, as 'the person neither employs other workmen in his trade or industry, nor works for a master, but works on his own account'. This long-winded and negative phrasing was complex and easy to misread as applying to employers who worked on their own account while also employing other people, thus resulting in an over-estimate of own-account individuals. Thirdly, and similarly, it is believed that many respondents could have read 'master', which the Instructions regarded as synonymous with 'employer', to refer to own account traders who employed no-one else. This potentially resulted in misallocations that inflated the numbers of employers. Fourthly, the term 'own account' was also confused by many respondents to mean living on 'own means' through income from investments, annuities, welfare, pensions, etc. Checks of the actual occupational responses against their employment status demonstrate that this inflated the number putting a cross in the own account column by including many who were not economically active. Fifthly, the question may have encouraged respondents to inflate their importance by falsely returning themselves as employer rather than employed, or employer rather than own account. These defects created a significant danger of upward bias, misallocating some workers or own account as employers, and some workers as own account.
A sixth source of bias was the Instruction for wives who were 'assisting their husbands … to be returned as "Employed"'. Although typical of the time, this encouraged the status of some wives who were business partners or co-preneurs to be recorded as workers, leading to a small potential undercount of their own account or employer status. This is an important issue for analysis of female entrepreneurship, as observed in previous studies (Hatton and Bailey, 2001; Davidoff and Hall, 1997; Kay, 2009). However, the use of the original Census records overcomes many of the problems other researchers have encountered when using published Reports, which often edited out female roles, an omission that has led to severe criticism of the Census as a source for female proprietorship (see Higgs, 2005; Kay, 2009; Barker, 2006). Also, given the large numbers of wife and other female entrepreneurs actually recorded, the effects of gendered questions were probably small. We find much higher rates of female entrepreneurship from the Census records than any previous study (e.g. Kay, 2009; Aston and Di Martino, 2017), suggesting that, for all its problems, the Census is generally the most complete source available for the study of female entrepreneurship.
Some of the main defects of the 1891 design were recognised at the time in a highly critical Census Report. The GRO believed that the choice of columns crossed was very unreliable: 'there were often strong reasons for believing that it [crossing] was made in the wrong column'. They suggested that 'oftentimes this use of the wrong column can scarcely have been other than intentional; being dictated by the foolish but very common desire of persons to magnify the importance of their occupational condition.' This resulted in 'the otherwise unintelligible fact' that some occupations had more 'employers than employed, more masters than men', particularly for 'Builders, Provision Dealers, Coal Dealers, Road Contractors, Dealers in Hemp, etc., Dealers in Cane, Rush, etc. and others.' 7 It also suggested that in addition to the intentional use of the wrong column, there was a lack of familiarity with filling forms and a general inability to cope with the Instructions that led to many genuine mistakes in a question that GRO considered complex and had been forced to include. 'It appears scarcely reasonable to expect such an [ordinary working] man laboriously to spell out the Instructions, and, following them duly, to select out of the three columns the proper one in which to make his cross'. As a final riposte to those compelling the use of this question, GRO stated that 'although … we have not considered ourselves justified, after the instructions given to us by the Local Government Board, altogether to discard the statements as to employers and employed from the Census volumes, we hold them to be excessively untrustworthy' (PP, 1893-4: 36).
Historical Census administrators were slow to adopt insights from statistical sampling theory to help understand potential biases, and make post-survey adjustments. The first substantial research to deal with these issues began to appear in the 1950s and 1960s (US Bureau of the Census, 1950; Eckler and Hurwitz, 1958; Hansen et al., 1961; Jabine and Tepping, 1973). The GRO itself made no attempt statistically to test the validity of its claims of response biases, nor to clean and correct data, nor even report actual numbers of potentially biased responses. I-CeM now allows adjustments to be made. Modern published Census tables are weighted and adjusted for significant non-response and other biases. Occupational description, which is normally captured as open reporting, is now understood to be particularly difficult for both respondents and coders (Conrad et al., 2016: 77-80). From modern studies, by testing the effects of different question formats, it is understood that complex occupation questions should be split up (Elliot, 1983; Campanelli et al., 1997). It is also now known that responses are more complex and uncertain for own account and employers in the smallest establishments, because of their multi-attribute activity (e.g. for the 1971 and 1981 Censuses: Martin et al., 1994: Tables 1 and 2). All the indications from modern Census analysis are that the format of the 1891 question in particular, and to a lesser extent the questions in 1901 and 1911, were difficult to respond to, difficult for Census enumerators to record accurately, and for clerks to code. When combined with the reluctance of the GRO to administer the question at all it is not surprising that biases occurred.
One other aspect of the GRO criticism, however, can be set aside. GRO commented that significant problems arose from crossing multiple columns; 'in numerous instances … no cross at all was made; in many others, crosses were made in two or even in all three columns' (PP, 1893-4: 36). GRO believed this was inconsistent with the Instructions and reinforced indications of unreliability. Of course it was also a defect of GRO administration. However, the phenomenon appears to be inconsequential, as already has been suggested by Schürer (1991: 26). Checks of approximately 36,156 individuals between I-CeM codes and the original CEBs in London, Brighton, Birmingham and Oldham indicated only 33 double crosses (0.09%). Moreover, these cases appear to be entirely reasonable ways of recording the individuals concerned. For example, 'fishmonger & grocers traveller' and 'farmer, coal merchant manager' would be correctly crossed for both own account and worker. Indeed, the number of multiple crosses appears to be far lower than the probable number of multiple occupations in the population. In the rest of this article the responses of multiple ticks are dealt with in the same way as the GRO itself: by assigning individuals to their first or 'main' occupation listed. However, multiple business descriptors allow identification of an important class of portfolio businesses that are the subject of more detailed analysis (Radicic et al., 2017).

Methodology
The previous discussion has identified the two main issues that have to be addressed in order to achieve dependable estimates of business proprietor numbers: misallocation bias in estimates of employer numbers in 1891; and non-response bias for all three years. The methodology developed to manage these complexities follows a three-stage process: data editing, cleaning and definitions; correction for non-response bias; and specific 'corrections' of upward bias for 1891.

Data editing, cleaning, and definitions
Survey 'editing' is a standard method of post-response Census processing (Lyberg and Kasprzyk, 1997: 355-8). It is rarely discussed explicitly for historical Censuses, but is an essential prerequisite to remove biases. Since each respondent gave two pieces of information, their employment status and their occupation, the tendency to cross the wrong status column, inflate status, or write an incorrect status can be adjusted by inspection of the actual occupational descriptors given. For many this provides a reliable means to correct for response bias toward employer or own account status, as well as reducing I-CeM coding and other errors. Using this check, large numbers of incorrect responses to employer status were detected even after the earlier occupational cleaning. Most were uncontroversially wrong, e.g. 'labourers', 'scholars', and 'domestic servants' that had employer status. Others resulted from confusion about occupational terms; for example, many respondents 'living on own means', 'annuitant', 'living on investment income' and 'unoccupied' frequently crossed own account. All these categories were 'cleaned' from employers and own account and re-coded to worker or unoccupied.
There were four stages to this cleaning. First, all students, scholars, and pupils, individuals under the age of 15 years, all non-economically active, and all own means, annuitants and retired were re-coded from employer or own account status. Secondly, all individuals whose 'main' occupation was non-business were also re-coded (unless another business occupation was given, to which they were then coded); e.g. foreign diplomats, prisoners, reform school inmates, vagrants, MPs, ministers of the crown and peers; civil service officers and clerks; prison officers; police; poor law service, municipal, parish, and other local or county officers; army, militia & yeomanry, navy; clergy, monks, nuns, sisters of charity, and church, chapel, and cemetery or charity officers; clerks; and nurses. Thirdly, all definitively worker categories were re-coded, such as domestic servants, all types of labourer, farm servants, navvies, and so on. Fourthly, also re-coded were all categories of those with working titles that defined an employee status, such as apprentices, journeymen, assistants, attendants, mechanics, artisans, or machinists.
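The logic of these cleaning stages can be sketched as a simple rule-based check. This is a minimal illustration under stated assumptions, not the coding scheme actually applied: the term lists in `STAGES` are a small hypothetical sample of the categories named above, the function names are invented, matching is by whole word with an optional plural 's', and a person with no occupation is treated as non-economically active. The function reports which stage would remove a claimed proprietor rather than deciding whether the new status is worker or unoccupied.

```python
import re

# Hypothetical, abbreviated term lists keyed by cleaning stage.
STAGES = {
    1: ("student", "scholar", "pupil", "own means", "annuitant",
        "retired", "unoccupied"),
    2: ("clergyman", "police", "prison officer", "clerk", "nurse"),
    3: ("domestic servant", "labourer", "farm servant", "navvy"),
    4: ("apprentice", "journeyman", "assistant", "attendant",
        "mechanic", "artisan", "machinist"),
}


def _stage_of(descriptor):
    """Return the first stage whose terms match the descriptor, else None."""
    text = descriptor.lower()
    for stage, terms in STAGES.items():
        for term in terms:
            # Whole-word match so that 'nurse' does not catch 'nurseryman'.
            if re.search(r"\b" + re.escape(term) + r"s?\b", text):
                return stage
    return None


def cleaning_stage(status, occupations, age):
    """Stage (1-4) at which a claimed proprietor is re-coded, or None to keep.

    status: 'employer', 'own account' or 'worker' as returned.
    occupations: list of descriptors, main occupation first.
    """
    if status not in ("employer", "own account"):
        return None  # only claimed proprietors are cleaned
    if age < 15 or not occupations:
        return 1
    stage = _stage_of(occupations[0])
    # Stage 2 exception: a non-business main occupation is overridden when
    # another listed occupation is a business one.
    if stage == 2 and any(_stage_of(o) is None for o in occupations[1:]):
        return None
    return stage
```

Under these assumptions a 'general labourer' returned as employer is removed at stage 3, while a 'clergyman' who also lists 'farmer' is kept and would be coded to farming.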
This resulted in a substantial reallocation which it is believed solved almost all of the misattribution of workers as employers or own account. In all, the numbers re-coded to worker status were approximately 70,000 in 1891, 69,000 in 1901, and 113,000 in 1911. The occupational descriptors cleaned in this way generally reaffirmed the original Census Instructions to count people to their main or primary occupational status and exclude those who crossed a column or termed themselves something that was infeasible as a business, or whose business income was a small element available as a by-product of their employment. A major category of the latter was the clergy who often described themselves as employers or own account, because they took some personal fees, even though they were office holders, or because they perceived themselves as the employer of a curate or let their position to another incumbent.
Other decisions on who were validly proprietors are detailed in Smith et al. (2017: Table 24), which gives the full definitions used. Key decisions were to regard all managers, branch managers, and company agents as employees of larger enterprises. This accords in almost all cases with their self-identification in their responses. Company directors, who were identifiable in the Census in a few cases, were excluded as the focus of this article is on non-corporate activity. 8 Partners are partially identified in the Census responses, but this is a small proportion of all partners and was not explicitly required by Census Instructions (Bennett, 2016). With a focus on proprietors this ensures that all should be captured, i.e. all partners are recorded, but they cannot be fully attributed to their firms. This results in a lack of alignment with business numbers, which will be smaller than the number of proprietors reported. As far as can be determined there was no trend in partner response rates since the proportion identifiable remained approximately constant over the three Census years. Nor is there any reason to believe that size of partnerships changed significantly, e.g. for tax purposes the Inland Revenue assumed a constant number of partners per firm throughout the period (PC, 1906; Stamp, 1916: 245; as also assumed by Feinstein, 1972).
A more complex issue is internal labour markets. We define those that had to accept conditions and prices set by others as waged. Hence, we consider teams, gang sub-contracting, putting out and sweating as generally not sufficiently autonomous to be included as employers or own account. However, we are reliant on Census self-reporting, and make the assumption that individuals understood their level of autonomy when they responded (other than making mistakes or non-responses). We use the distinction between direct and indirect control, as adopted by Pollard (1968) and Littler (1982). Littler viewed as employers the heads of work-groups such as gang leaders and claimed that this was important into our period in iron foundries, shipyards, the building industry, glass production, potteries, and though kinship networks in textiles such as through the cotton minder-piecer system. However, the self-reporting in the Census rarely identifies these groups. For example, contrary to Littler (1982: 65-72), they are not self-reported in shipyards, and individuals such as 'ironmasters' were not team leaders and subcontractors within larger businesses. In the Census, Mineral Statistics, and trade directories coal masters, ironmasters and steel masters were almost all large scale business proprietors. 9 'Gang masters' in the Census were almost always stable proprietors who let out horses. Hence, it appears that internal contracts are largely excluded in Census self-reporting, and hence are excluded here. The exception is mainly in the building trades, where we accept self-identification as own account as accurate, as it would be today. Also we accept self-identification in 'putting out' manufactures of gloves, hosiery, boots and shoes, and straw plait and similar industries. These were mostly highly concentrated mainly in Bedfordshire, Buckinghamshire, Northants, Leicester, and a few centres in SW England. 
The self-reporting as own account or employers in these groups and locations is relatively high, especially among females, which we accept as accurate. They fit with the concept of 'master craftsmen' who possessed their own tools, often assembled their own raw materials, operated from their own premises, and sometimes trained apprentices and/or managed one or more journeymen to whom they paid wages and hence could be employers (Woodward, 1995). These groups are not among those that are affected by our re-allocations for over-reporting (as detailed below)

Non-response bias
Once the data had been cleaned, as described above, non-responses were reduced substantially, to 4.6%, 4.8% and 5.3% for 1891, 1901 and 1911 respectively. However, the remaining non-respondents were not randomly distributed: position within the household (RELA code in I-CeM) was a significant element in non-response, and it interacted with gender.
Non-response was much higher for individuals other than the 'head' (the person who filled in the Census form), such as adult sons, daughters, other relatives, lodgers, etc. (Smith et al., 2017: Tables 15-19). The non-response was particularly high for female relatives who were not heads. Non-responses were also high in sectors such as clothing and dress dealing (drapers, hosiers, haberdashers); domestic and service staff and cooks; cotton and silk manufacture (including ribbon, weaving, dyeing, bleaching etc.); personal services such as washing and bathing, hairdressing; professions such as barristers, solicitors, doctors, dentists, artists, performers, education; food sales (butchers, fishmongers, cheesemongers, milksellers, grocers); and woollen manufacture including carpets, blanket, flannel. Non-responses were lowest among professions following scientific pursuits ('analytical chemists', 'inventors', 'botanists' and similar occupations); ironmongers; chemists and druggists; and blacksmiths.
The standard way to deal with non-response bias is to estimate weights which can be used to adjust actual response numbers or proportions. There are competing methods to deal with this bias: some over-represent the non-respondents using weights equal to the inverse of the response rate for each group in the sample; others randomly allocate the non-respondents to any of the possible responses according to their proportions in the respondent set (Kish, 1967). Because it is of crucial importance to maintain the integrity of the dataset without incorporating extraneous information, the method used here is weighting derived from the individual's characteristics that influenced non-response. The weights were calculated using a logit regression which estimates the probability that an individual responded to the employment status question, using a range of independent variables to control for those features that principally correlate with the observed non-response bias in the data: gender, relationship within household, and occupational sector. 10 In addition, because some individuals have non-responses or unclear answers to other questions (including gender or RELA), they cannot be included in the regression estimate. These have to be built back as a second stage in order to calculate a total adjusted estimate of proprietor numbers.
The logit regression estimates are shown in Table 2. Respondents in each 'non-response class' are given a weight greater than one, to compensate for others in the same sub-population who did not respond. For example, if a class has a response rate of 0.75, the average weight for that class should be 1/0.75 (about 1.33), and so on. At the individual level, the logit regression calculates the probability of being a respondent, and the inverse of this probability becomes the weight used for each observation. The weighting method does not assign non-respondent individuals to response categories, but instead gives respondents additional weight using the inverse of the response rate. For the use of other researchers the full set of weights is available and should be used in conjunction with the database deposit. 11 Table 3 gives estimates of total proprietor numbers for 1901-1911 in rows 2 and 4, after weighting for non-response bias. For 1891 the corrections for non-response bias have to be undertaken after correction for upward bias (row 7), discussed later below. The estimates for 1901-11 increase estimated employer and own account numbers by 7-13%. Worker numbers are also increased, so that the estimated total economically active is 5% higher in 1901 and 6% higher in 1911 after taking account of non-respondents. The Table also reports upper and lower bounds for the estimates, as discussed later below.
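The inverse-probability weighting described above can be illustrated with a minimal sketch. It is not the estimation code used for Table 2: the function names, the coefficient dictionary and the feature names are hypothetical, and only the rule that a respondent's weight is the inverse of the estimated response probability is taken from the text.

```python
import math

def response_probability(intercept, coefs, features):
    """Logistic probability that an individual answered the status question.

    `intercept` and `coefs` stand in for fitted logit estimates; the values
    passed in below are hypothetical, not those reported in Table 2.
    """
    z = intercept + sum(coefs[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def inverse_probability_weight(p_response):
    """Weight for a respondent: the inverse of the response probability."""
    if not 0.0 < p_response <= 1.0:
        raise ValueError("response probability must be in (0, 1]")
    return 1.0 / p_response

# Class-level check using the example in the text: a response rate of 0.75.
class_weight = inverse_probability_weight(0.75)

# 75 respondents, each weighted 1/0.75, stand in for a class of 100 people.
weighted_total = 75 * class_weight
```

With a response rate of 0.75 each respondent carries a weight of about 1.33, so the weighted respondents sum back to the full size of their class without any non-respondent being assigned to a status category.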

Correction of 1891 misallocation biases
As noted above, an upward bias in the responses for the status question in 1891 tended to inflate the numbers of employers. The GRO stated this as an important issue, and comparisons of the numbers of each employment status over time in Table 1 have already indicated that it was substantial.
Table 2. Estimates of the weighted logit regression 1891-1911 used to reallocate non-respondents. The estimates give the probability of a response (non-blank employment status); I-CeM RELA relationship codes are simplified to nine categories (CFU is a member of the continuous family unit), working titles relate to assistants or employees of the business head, unknown are those where no RELA code is given (which includes visitors on Census night). *** indicates that estimates are significant at the 99.9% significance level.
However, correcting misallocations is complex. A comparator source of information is needed to indicate the level of true response. No comparable data are available for the earlier year of 1881 as the Census did not include the employment status question; consequently the only sources available on a similar basis are for 1901 and subsequent Censuses. Clearly 1901 is preferable as the closest comparable source. The approach adopted is to use the 1901 Census as the main comparator, supplemented with any trends from 1911, and then to test possible corrections against known secondary sources for each of the 629 of the 797 occupation categories in I-CeM that contained employers or own account. Four methods were used:
(1) The preferred method is a robust logit regression model based on 1901 Census responses to allocate between employer and own account, taking account of the most significant explanatory variables for employer status in 1901. The variables used were the 629 occupation categories, gender interacted with marital status, population density of the Registration Sub-District, number of domestic servants, and weights based on household relationship codes (RELA). These variables were determined after a range of experiments with alternatives and relate to the main sources of bias previously detected. After estimation with the 1901 data the coefficients were applied to the 1891 data to give the probability of being an employer for 1891 using 1901 coefficients. Thus, the 'correct' employer attribution was calculated using the 1891 independent variable values but with the estimated 1901 coefficients. This method has two outputs: (1A), which is the summation of the individual mass density (the estimated probabilities) to an aggregate level; and (1B), which is the summation of the rounded numbers. (1A) is the preferred extrapolation of numbers because it has no bias. However, (1B) is the only method that identifies individuals.
(2) A secondary method is a simple linear extrapolation of the change in ratio between employers and own account between 1901 and 1911 applied to 1891; this is appropriate where a sector was experiencing continuous growth or decline at the same rate over the period which method (1) could not assess. It was particularly relevant to a few large-scale sectors experiencing structural change, such as farming and blacksmiths.
(3) A possible alternative method is the average of the ratio between employers and own account for 1901 and 1911; this is most appropriate where a sector was relatively static and may be preferred because it averages out any random variation in the individual Censuses.
(4) Accepting the actual Census responses in 1891; this is to be preferred where the occupational category is clear and unambiguous as to employer, own account or worker status and respondents are believed to have been accurate.
Checks of this allocation were made against the same occupations in 1901 and 1911 for those occupational codes where actual responses were judged to be reliable.
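The arithmetic behind methods (2) and (3) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually coded for the article: the ratio is taken here as employers divided by own account, the three Censuses are treated as equally spaced, and all function names and figures are hypothetical.

```python
def extrapolated_ratio_1891(ratio_1901, ratio_1911):
    """Method (2): step the 1901-1911 change in ratio back one decade.

    Assumes equal ten-year gaps, so the 1891 ratio is the 1901 ratio minus
    the 1901-1911 change. Floored at zero to avoid a negative ratio.
    """
    return max(0.0, ratio_1901 - (ratio_1911 - ratio_1901))

def averaged_ratio(ratio_1901, ratio_1911):
    """Method (3): the simple average of the 1901 and 1911 ratios."""
    return (ratio_1901 + ratio_1911) / 2.0

def split_proprietors(total_proprietors, ratio):
    """Divide a proprietor total into (employers, own account) for a given
    employers-to-own-account ratio."""
    employers = total_proprietors * ratio / (1.0 + ratio)
    return employers, total_proprietors - employers

# Hypothetical sector: 400 employers per 1,000 own account in 1901,
# 500 per 1,000 in 1911, and 1,300 proprietors recorded in 1891.
r_1891 = extrapolated_ratio_1891(400 / 1000, 500 / 1000)
employers_1891, own_account_1891 = split_proprietors(1300, r_1891)
```

In this made-up sector the extrapolated 1891 ratio is 0.3, which allocates about 300 of the 1,300 proprietors to employer status; method (3) would instead apply a ratio of 0.45.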
The estimates of the logit regression (Method 1) are given in Table 4, shown for 1901. This gives the probability of the binary employment status variable having a value of 1 if the individual was an employer, or 0 if own account, using estimates for 629 occupation categories, and covariates. Only five representative occupational categories are shown of the 629 estimated. Using these estimates, the datasets are then exchanged so that the allocation of employer and own account status is made using 1891 data with the 1901 coefficients.
The first three methods provide alternative estimates of the numbers of employers that could be reassigned to own account to compensate for upward biases towards employers; the fourth method accepts the original responses unchanged. For each of the 629 occupations these estimates were compared for trends and absolute numbers against the published Census responses, the GRO Census commentary, patterns known from the secondary literature, and a calculation of the ratio of workers to employers and own account to indicate a simple measure of mean firm size (although this is a very imprecise indicator; see comments in conclusion). The actual responses were accepted where the employment status of that occupation was unambiguous (method (4) above) as for pig iron or steel manufacturers, or where the numbers in the occupational category were too small to make reliable estimates (such as resin manufactures, which had only five employers in 1891). In all other cases an estimate was derived from one of the three other methods. In general method (1) was preferred since it draws on the widest range of information at the individual level. It estimates the relationship between individual attributes and status from 1901 and, assuming this remained the same, uses the attributes in 1891 to allocate status for the non-respondents. However, if method (1) were used in all cases the downward correction of employer numbers would be very high compared with other information sources: far greater than shown in Table 3. As a result a mix of estimation methods is preferred. In the final outcome, 430 occupation categories were corrected using the regression method (1), 11 used linear extrapolation method (2), 186 used actual 1891 responses, and two had such fractionally low estimates of potential employers derived from any method that they were assigned wholly to own account. No comparisons justified the choice of the average ratios method (3).
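The exchange of coefficients and data, and the difference between outputs (1A) and (1B), can be shown in a small sketch. The coefficients, covariate names and four individuals below are invented for illustration and are not taken from Table 4. The sketch also assumes that 'rounded numbers' means rounding each individual probability to 0 or 1 at a threshold of 0.5.

```python
import math

def employer_probability(intercept, coefs, person):
    """Probability of employer status from logit coefficients."""
    z = intercept + sum(coefs.get(name, 0.0) * value for name, value in person.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients standing in for those estimated on 1901 data.
intercept_1901 = -1.0
coefs_1901 = {"servants": 0.8, "urban": -0.3}

# Hypothetical 1891 individuals described by the same covariates.
people_1891 = [
    {"servants": 2, "urban": 1},
    {"servants": 0, "urban": 1},
    {"servants": 1, "urban": 0},
    {"servants": 0, "urban": 0},
]

# 1891 covariate values combined with 1901 coefficients.
probs = [employer_probability(intercept_1901, coefs_1901, p) for p in people_1891]

# (1A): sum of individual probabilities, giving an aggregate count only.
estimate_1a = sum(probs)

# (1B): round each probability, which identifies individuals as employers.
flags_1b = [p >= 0.5 for p in probs]
estimate_1b = sum(flags_1b)
```

Here (1A) gives roughly 1.5 expected employers while (1B) flags exactly one individual. The gap between the two illustrates why the aggregate is preferred for counts and the rounded version is needed only to identify individuals.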
Table 3 provides final estimates for the 1891 counts of total numbers, after corrections for non-response and 1891 misallocation biases. However, to identify the individuals who should most likely be assigned to employer status, a further step is required. After choice of estimation methods, the actual individuals can be assigned using their individual characteristics from the logit regression in most cases (method (1)) or their actual responses (method (4)). However, for the 11 cases where method (2) is used the logit regression still identifies individuals to reassign, with any difference between the regression estimate and the linear extrapolation made up by randomly assigning individuals from own account to employer if the linear estimate was higher, or from employers to own account, if the linear estimate was smaller than the regression estimate.
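The reconciliation step for the 11 method (2) categories can be sketched as below. The function name, the status labels and the fixed random seed are assumptions made for illustration. Only the rule itself comes from the text: individuals are moved at random from own account to employer, or the reverse, until the count matches the linear estimate.

```python
import random

def reconcile_to_target(statuses, target_employers, seed=0):
    """Randomly reassign individuals so the employer count meets a target.

    `statuses` holds the regression-based allocation for one occupation,
    each entry being 'employer' or 'own_account'. `target_employers` is the
    count implied by the linear extrapolation. A copy is returned.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    result = list(statuses)
    gap = target_employers - result.count("employer")
    if gap > 0:
        source, destination = "own_account", "employer"
    else:
        source, destination = "employer", "own_account"
    pool = [i for i, status in enumerate(result) if status == source]
    for i in rng.sample(pool, min(abs(gap), len(pool))):
        result[i] = destination
    return result

# Hypothetical occupation: the regression allocates 3 employers out of 10,
# while the linear estimate implies 5.
allocated = ["employer"] * 3 + ["own_account"] * 7
adjusted = reconcile_to_target(allocated, 5)
```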
The final estimates for 1891 after all corrections, shown in row 8 of Table 3, reassigned 132,000 individuals from employer to own account to correct for upward biases. A further 43,000 were identified as employers after accounting for non-response bias. Most corrections to 1891 derive from rectifying upward bias, followed by the data cleaning stages; non-response bias re-weighting provided the fewest alterations; whereas for 1901 and 1911 data cleaning and weighting for non-respondents are more equal in effect.
The numbers given are point 'estimates', but a large proportion are the actual responses. Where the respondent replies to the question all responses are accepted as accurate. It is only non-respondents and over-estimates in 1891 that are adjusted. This means that ranges of the estimates that are calculated are narrow, as most of the respondents are identified precisely. In addition, where individuals are estimated using method (1) the very large data size ensures small standard errors. Hence confidence bounds given in Table 3 and subsequent tables show a very narrow range in all cases. 12 For example, in 1891 the range for workers is only 0.05% of the total economically active, for own account 0.39%, and for employers 0.52%; the estimates at the midpoints are ± half these percentages. These differences in ranges mostly reflect different population sizes, indicating that we have greatest confidence in identifying workers and least confidence for employers. But even for employers the range of 0.5% (± 0.25%) is very narrow, as is to be expected from a Census where the data are generally accurate. Table 3 summarises the estimates of proprietor numbers after all corrections to the Census responses. The penultimate column can be compared with the estimates of total proprietors in the published Census and raw I-CeM data (Table 1). Imposing the sort of corrections that modern Censuses would apply to published tables increases the number of proprietors by about 200,000 for each year. For the other columns, Table 3 also indicates that the dip in employer numbers for 1901, shown in both published tables and I-CeM, did not actually occur: its appearance was an artefact of the upward bias of the 1891 published figures. However, Table 3 confirms that a true peak of own account numbers occurred in 1901, with a decline beginning thereafter; though this is less marked than in the published or I-CeM data.
The new estimates indicate the unreliability of the published Census Report tables even for what they reported and the need to use the alternative estimates given here.

Overview of corrections 1891−1911
The peak in numbers in 1901 is not quite reflected in a peak in the rate of total noncorporate entrepreneurship. As shown in Table 5, proprietors reached a high point of 13.8% of the working population in 1891, before beginning to fall back. This reduction was the result mainly of decreases in own account, since employer proportions of the population declined only slightly by 1911. The pattern of non-corporate proprietor numbers also reflects total business numbers, since corporate numbers were still very small even by 1911. Even though corporate numbers had been growing rapidly, they represented only about 2% of the total business population in 1911. There was a beginning of increasing corporate numerical influence, but the main impact of business changes for the period was the diminution of own account. The interrelation with corporate growth may account for the shift of sole traders into waged employment, but overall the main implications from the data are the continued and growing significance of non-corporate employers up to World War 1. However, the trends in level of output or profits were very different. Feinstein's (1972) estimates indicate that, while we show non-corporate employer numbers continued to grow until 1911, the corporate sector was moving towards dominating total output. 13 The continued growth of the non-corporate sector but with a lower share of total output indicates growing competitive pressures which in the period was mainly reflected in reduced own account numbers; this in turn also reflects increases of wages that made own account less attractive compared to employee status. That general trend has been previously well known, but our estimates now date its onset definitely to the period after 1901.
Going beyond the aggregate numbers, the corrections we have made to the Census for each occupational category are mostly small, with only few categories accounting for most reassignments. Within the 629 occupations containing employers and own account, the largest 37 that contain 1000 or more employers reassigned, account for 80% of the corrections; within these the largest five categories account for 41%, the largest 10 for 53%, and the largest 20 for 67% of reassignments. It is important to evaluate these main categories in more detail to assess the quality of the estimates by occupation.
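The concentration figures quoted above are cumulative shares of reassignments held by the largest categories. A minimal sketch of that calculation follows, using invented counts rather than the article's data; the function name is hypothetical.

```python
def top_k_share(reassigned_counts, k):
    """Share of all reassignments accounted for by the k largest categories."""
    ordered = sorted(reassigned_counts, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("no reassignments to summarise")
    return sum(ordered[:k]) / total

# Hypothetical reassignment counts for five occupation categories.
counts = [50, 20, 15, 10, 5]
shares = [top_k_share(counts, k) for k in (1, 2, 3)]
```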
The largest sector is farming, where 30,000 were reassigned from employer to own account (15% of all reassignments). Comparison over time in the number of farmers and their employment status was recognised by GRO as complex because of changes in early Censuses to how the contributions of wives and farmers' children were counted (Hatton and Bailey, 2001; Higgs, 2005: 63-8). The GRO made a special tabulation in 1911 that sought to construct a continuous time series of farm employment on a standardised basis (PP, 1911). This provides a guide to real trends which indicates that there was a small increase of total farmers (employers, own account and workers) over 1891-1901 (GRO did not break out the categories separately). In comparison, the original raw Census count of farm employers showed a large decline. Our estimates indicate that the correct picture was a continuous increase in farm employers and own account 1891-1911, but an important relative decline of own account. Additional confirmation of this trend is provided by the GRO partial Census publication of farm employers for 1871 for 17 'representative counties'. When scaled up this indicates that for the whole of England and Wales farm employers were about 102,511 in 1871 (PP 1871). 14 Although some consolidation of own account farming occurred after this time, it is implausible that employer numbers increased by 55% to the 1891 raw Census figure over the intervening 20 years at a time when agricultural employment fell by 16%. Our reassignment also fits with previous accounts of late nineteenth-century agriculture in which it has been argued that farmers decreased the size of their workforces and land holdings in response to falling prices and rising wages, after the 1873 agricultural depression (Grigg, 1987; Afton and Turner, 2000; Daunton, 2007: 47; Montebruno et al., 2018).
The second largest group is innkeepers, where 18,000 are reassigned. In 1891, 32,515 returned themselves as employers compared with 14,461 in 1901, a highly unlikely shift. There is complexity here because of the unknown extent to which family members were involved in businesses and might have been perceived by respondents as employees. The logit regression suggests that the true number of employers was 13,974, which is in line with a linear extrapolation from 1901-1911 and bears a similar ratio to the total population change. With this correction innkeepers were 0.1% of the economically active in 1891 and 0.1% in 1901 compared to the unlikely 1891 figure of 2.6% without correction. The measurement base is not comparable, but the ratio of publicans' excise licenses to the economically active in 1891 was 5.4%, and 4.8% in 1901, which proportionally shows a similar trend (IR, 1892: xiii-xv; 1902: 48-9). Our estimates also fit with the acknowledged decline in the number and profitability of public houses across the nineteenth century and the gradual emergence of tied pubs run by managers rather than tenants (Jennings, 2016: 48-9, 53; Knox, 1958).
The third largest reassignment was of grocers and tea dealers (12,000), and the fifth largest butchers (7000). These were the two largest retailers affected; again the extent of family employment is unknown. The logit regression correction estimates grocers as 0.1% of the economically active population and butchers 0.14%, in line with the 1901 ratios of 0.1% and 0.12%, and in contrast to the implausibly high uncorrected 1891 figures of 0.15% and 0.23%. There was a periodic Directory of Grocery, Oil and Colour Trades from 1877 onwards (but nothing for butchers). This offers a guide to trends, although this is only available at a general level as it combined grocers (employers and own account) with many other trades. Nevertheless, the total given is a useful indicator of general trends, which indicates a rapid increase in numbers after the abolition of tea and coffee excise licenses in 1870: 'reduction of licenses … has given large impetus to the sale' of coffee and tea, and hence to 'an expansion of grocers' (Kelly, 1877, v.). The successive directory listings continued to increase in size with no indication of the reduction 1891-1905 shown in the uncorrected Census. While there is no comparable directory for butchers their trend can be expected to have been similar. The late nineteenth century witnessed a growth in demand for meat which allowed butchery to remain profitable by shifting away from butchers who slaughtered animals towards butchers as just retailers (Winstanley, 1983: 140-45; Perren, 2006: 3). Our estimates of a small increase in these two sectors between 1891 and 1901 thus fit with other sources. The fourth largest reassignment, of 10,500 dressmakers, draws in a large proportion who were home-based proprietors, predominantly female. The 1891 Census return of 19,911 dressmakers is almost twice the 1901 return of 10,212.
After correction using the logit regression, dressmakers were 0.08% of the economically active population, compared to the initial figure of 0.16% and similar to the figure of 0.07% for 1901. The womenswear industry in nineteenth-century England was, broadly, stable. Working-class women tended to make their own clothes, buying supplies from drapers; dressmakers, meanwhile, generally catered for middle-class women. While there was a growing demand for ready-made clothing, little was produced in Britain; instead, ready-made women's clothing was mostly imported from Germany. It was only in the twentieth century that British production of ready-made womenswear took off at greater scale (Kershen, 1997: 39-41;Jeffreys, 1954: 321−3). Consequently our estimates confirm the slow increase of employers in dressmaking between 1891 and 1901.
The next largest reassignments of over 5,000 employers to own account were shoe and boot makers, then laundry services, and then blacksmiths; over 3,000 were reassigned for lodging houses, bakers, greengrocers, and biscuit and confectionery dealers. In all these cases our logit regressions indicate small increases 1891-1901 compared to the rapid decreases in employers shown by the raw Census numbers. Several of these are in line with the estimates for grocers; the Directory of Grocery, Oil and Colour Trades covered confectionery, fruit, and Italian warehouse trades (pasta, jams, oil, etc.), and again indicated an upward not downward trend in numbers of employers and own account combined. For lodging houses there is evidence from the Inhabited House Duty, which was extended to these in the 1890s. Inland Revenue Reports (IR, 1892: 27; 1912: 75) indicate that numbers increased rapidly over the period, rather than declining rapidly as the uncorrected Census suggested, albeit the actual numbers are not comparable as the Duty included a wide range of other types of establishment, such as hotels, which are categorised elsewhere in I-CeM.
The next 25 occupations, which had reassignments of over 1000, were almost all retail dealers or maker-dealers. There were, however, a few professional occupations in this group (physicians, solicitors, and auctioneers), some agricultural occupations (cow keepers, and market gardeners), and some building trades (builders, carpenters, and painters). Various directory comparisons for these and the rest of the 629 occupations estimated were made, 15 which again indicate upward trends similar to those for grocery, excise licenses, and Inhabited House duties, and hence support the scale of corrections estimated. Indeed, if anything, our corrections perhaps leave too many employers in the tabulations for 1891 which may actually be own account. On balance, however, the estimates are probably the best that can be created with the information that now is available. It is to be noted that the sector preponderance of misidentified employers claimed by GRO quoted earlier is thus not supported by this analysis. While 'builders and provision dealers' fit the GRO claim quoted earlier, 'coal dealers, road contractors, dealers in hemp, etc., dealers in cane, rush, etc.' do not match the GRO claim, as only small corrections are estimated for these sectors.

Estimates of proprietor numbers 1891−1911
The purpose of this article is to establish reliable estimates of the number of business proprietors over 1891-1911. These estimates are for aggregates, but are also available for every individual for each sector. From these estimates it is possible to begin assessing the implications for the business history of this period.

Aggregate sectors
Tables 6 and 7 summarise the evolution, by sectors and gender, of employer and own account numbers. The sector classification used translates the Census occupational codes into 13 business sectors. This is an aggregation of the I-CeM occupational codes that contained business proprietors. As noted by Charles Booth in 1886, the Census codes did not identify the relative size of different branches of industry, but instead the types of occupation of people across all industries (Booth, 1886). Hence, for example, a blacksmith would be recorded in the same way whether they had a small independent rural workshop or worked as an employee in a shipyard or cotton mill. Occupational categories cut across industries. The GRO did not introduce an industry coding until 1911, which was even then too imperfect to be of much use here. The industry aggregation we have used is a modification of Booth's classification as suggested by Armstrong (1972), but at a more aggregate level and with a stronger focus on differentiating sectors with complex structures, notably those characterised by maker-dealers. The aggregation circumvents most of the deficiencies of Census categories such as blacksmiths, and generalised Census classifications such as 'manufacturer', which cannot be allocated to detailed industrial sectors. However, we acknowledge that this is imperfect and a simplification of a complex structure.
For employers, in all but five sectors, there was continuous growth of male employer numbers; it was particularly rapid in retail, professions (law, accountancy etc.), personal services (laundry, hairdressing), and mining, where employer numbers all grew by over 50%. Importantly, despite the emerging and increasingly dominant role of incorporated enterprises in manufacturing and transport, this only partially affected non-corporate proprietor numbers which experienced steady growth in both sectors. For females, growth was even more spectacular: over 100% increases in refreshment, transport, professions (mainly as school proprietors), commerce, and mining, though the absolute numbers were often very small. An important element was also the 95% increase of female employers in farming, although from a very low base. As noted above, these estimates of female-headed businesses revise previous estimates of female employer participation markedly upwards and also counter most of the previous criticisms of the Census as a source for female proprietorship. In contrast, male farming employers fell in number 1891-1901, and then grew 1901-11. There was a similar dip in male food sales, refreshments, and agricultural produce. Some of these changes may result from the South African War, which has been noted as reducing numbers in some worker categories in 1901. 16 Female employers also experienced a dip in 1901 for personal services, refreshments, maker-dealing, manufacturing, agricultural produce, and food sales. Maker-dealers showed small declines 1891-1901 before rising again in 1911. These changes combined to cause a decline in the total number of female employers between 1891 and 1901.
These shifts in sector evolution have not been fully observed in previous literature. While the period was generally one of buoyant economic activity which provided opportunities for proprietors in many sectors, there were also consolidations underway that limited the potential for smaller businesses. In the agricultural produce industry this took the form of a shift to large-scale milling and other food processing; in manufacturing, traditional maker-dealer industries, finance and commerce, larger and corporate enterprises were beginning to make major inroads. For farming (immune at this time from pressures from incorporation), male employer numbers grew modestly, while female employers nearly doubled. This, and the often related activities of quarrying and transport, appears to suggest a start of shifts in proprietorship towards women as men pursued other opportunities in the waged sector, reflecting a growing gender division of labour in the management of farms, as also identified by . It is striking that this change was developing before World War I, which has been previously argued as the main watershed that accelerated female involvement (Gail, 1989: Braybon, 1989Grayzel, 2002). Indeed, female business involvement before as well as during the war was lauded by De Beck (1916). For the own account, sector developments, shown in Table 7, mirrored many of the changes for employers. The 1901 peak of own account for males occurred in most sectors (farming, manufacturing, maker-dealers, transport, agricultural processing, food sales, refreshment, and finance and commerce). These indicate a turning point in the fortunes for self-employment in sectors that were beginning to develop as large employer industries, in most cases (manufacturing, transport and agricultural processing) as either partnerships and large proprietorships, or through incorporation. 
In contrast, continuous growth of own account was most rapid in construction, retail, professions, and personal services where sole traders remained viable. For women, growth was rapid in retail, and less rapid in construction, manufacturing, professions, food sales, refreshment, and finance. Self-employment for females declined rapidly in farming, maker-dealing, personal service, and agricultural produce; and for food sales declined in 1911 after rising 1891-1901. But even after these declines female business participation as own account is radically higher than previously estimated. Between 1901 and 1911 a shift appears to have occurred for both men and women in farming from own account to employer status. This was a significant reversal of earlier trends. A similar shift occurred for both genders in personal services and refreshment, and to a lesser extent for females in maker-dealing. However, the most notable relative shift for women involving the greatest numbers was in refreshments where growth of employer numbers was twice the decline in own account: an already strongly feminised sector began to see an important shift towards larger scale female-headed businesses that employed others, rather than individuals trading on a small scale often from the home. This has been observed previously, but only in a more limited way (Crossick, 1984).
An important contrast of evolution by status and gender is also evident by comparing the general trends between the Tables. The aggregate estimates show a steady increase in employer numbers and in the total of all business proprietors, but a rise and then decline of own account. The sector distribution of this trend can now be interpreted from Tables 6 and 7. The rise and then decline of own account is most striking for male farming, food sales and refreshment, and to a numerically lesser extent in manufacturing, transport, and finance. This is echoed in female own account in only four sectors: maker-dealing, personal services, food sales and refreshment; of these, the decline for females 1901-11 in maker-dealing and personal services is numerically largest, accounting for 58,000 own account. The upward trend over 1891-1901 was a continuation of previous expansion of own account, as confirmed from the earlier Censuses .
The 1901 Census date thus appears to have marked a turning point in the development of self-employment, with a previous period of long-term growth going into reverse. This was particularly a phenomenon for own-account women, where absolute numbers of females were lower in 1911 than they had been in 1891. Conversely, aggregate male own account decreased by a lesser amount 1901-11, and remained in 1911 well above 1891 levels; with sector shifts seeing large declines in maker-dealing and smaller declines in other sectors, offset by some large increases especially in retail. For employers, in contrast, female numbers began to become more significant, experiencing generally more rapid growth than males, especially in maker-dealing, personal services and refreshment where female business proprietors were increasingly prominent. However, male employers remained the majority in almost all sectors. This turning point for own account and female participation has not attracted previous commentary for this period and is the subject of further investigation by the authors elsewhere.

Detailed sub-sector change
It is important also to assess sub-sectors. These give greater detail to interpretation, but also allow more direct comparison with secondary sources (which are usually for specific occupations). The ten most frequent employer and own account sub-sectors for the 629 occupations are assessed in Tables 8−11. For male employers (Table 8) the main sub-sectors remained almost identical over time, though with minor shifts in ranks. Farming was always by far the single largest category, followed by building. Retail sub-sectors (grocers, butchers and drapers), innkeeping, maker-dealers such as tailors, shoe makers and bakers, and building trades were the dominant other groups. The professions, through solicitors, appear for the first time in the top ten in 1911. For own account (Table 9) the sub-sectors were similar. Farmers were still the largest group, though increasingly less dominant than for employers. Maker-dealers, retail and innkeeping remain the other main groups, though there was a different balance from employers: towards greengrocers, general shopkeepers and hawkers/costermongers. Physicians and market gardeners were common sources of own account not evident in the top ten of employers. Builders do not appear in the top ten, and the building trades are also less prominent than among employers.
For female employers (Table 10) farming, school proprietresses and lodging houses grew significantly in importance, while dressmakers, grocers, and tailoresses declined. However, these remained the most important groups, together with innkeepers, school proprietresses, milliners, and confectioners. These have mostly been previously recognised as important fields of female entrepreneurship (Davidoff and Hall, 1997;Kay, 2009;Aston and Di Martino, 2017), but the numerical prominence of female farmers and school proprietresses has not been given the prominence it clearly deserves. For female own account (Table 11), the top three categories remained the same, reflecting the strong female development in dressmaking, laundry and lodgings, often exploiting their dwelling house as business premises. Beyond these, many of the maker-dealing and retail categories prominent among female employers were also common sources of female own account, with the addition of shirtmaking and seamstresses, hawking, charwomen, and more general shopkeeping. Musicians and music teachers were also common, reflecting another rapidly growing field of female professional opportunity. 17 Across the subsectors shown in these tables the rising and then declining trend is not always as apparent as for aggregate sectors. Clearly at the more detailed level for men a range of different shifts occurred presenting a complex picture. However, for women the story is plainer: a rapid growth of participation as proprietors in sectors such as farming, lodging-houses, professions and retail; but the beginning of a downward trend for female own account as dressmakers and laundresses. 
Thus, while the 1901 Census does appear to have been a marked turning point at the micro-sector level as well as in aggregate, the development was complex and multi-faceted, playing out in contrasted ways between different sectors and genders, and in different ways between the more prominent small businesses that were employers, and those that were own account.

Assessment and conclusion
This article uses the 1891-1911 population Censuses to provide accurate and consistent counts of business proprietors, and the breakdown of these into employers and own account self-employed by sector and gender. This provides an entry point for many areas of subsequent research for business historians. The Census allows new analysis of trends over time and opens the way for inclusive analysis of all individuals from across the whole population, permitting their business characteristics, industry sector, gender, family structure, and location to be examined and compared with each other. For example, female and male entrepreneurs can be compared at the level of the whole population; within-household characteristics can be examined to identify different types of family firm; employer and own-account proprietors can be compared across each sector. The scope for large-scale conclusions about business proprietorship across the whole non-corporate business population for this period is opened up for the first time; this represents a major opportunity to utilise the historic Censuses for 'big data' analysis. However, fundamental as a preliminary to such research is developing an understanding of what the individual Census responses do and do not show. Hence the primary purpose of this article has been to create an estimate of proprietor numbers for the 1891-1911 period. This article has considered how the Census processes of data collection affect the quality of the estimates that can be made. As noted at the outset, the population Census was not designed as a business Census, which constrains the information available. It is also accepted that the estimates are limited in various respects: they cannot include the corporate sector which the Census made no attempt to cover; and there is no information on employee numbers or the output of the businesses. 
The article has also demonstrated that the published and 'raw' responses in the CEBs encoded in I-CeM cannot be used without adjustment. The article has addressed two main issues that have to be managed before it is possible to construct dependable estimates of proprietor numbers: first, non-response bias across all years; and second, misallocation bias in 1891. While these issues have been previously known to exist, neither the Census office nor subsequent scholars have made any attempt to correct for them. Using the e-census it has now been possible to make the adjustments required. The final estimates of aggregate proprietor numbers, and the identified individuals, are part of a UKDA database deposit; with the weights used to make the different adjustments to proprietor numbers provided as supplementary material and in a separate web-based resource (Montebruno, 2018). A major contribution of this article is to confirm the persistence of small businesses as the most common type of firm throughout the period covered, and to measure their extent. While the persistence of the family firm, partnerships and sole proprietors into the 1930s has been understood since the seminal studies by Marshall and Clapham, we now know much more about non-corporate business numbers and dynamics. A major shift was taking place from own account towards employers of waged labour. As shown in Table 3, the number of employers continued to increase steadily, with the proportion increasing from 35% (1891) to 39.5% (1911) of all proprietors. Many of these will have been the smallest firms, but if trends from 1851-1881 found by Bennett et al. (2018) continued there was a mix of expansion of the 3-10 employee firms, as well as increasing emergence of large firms (over 500 employees), which was reflected in a small overall increase in mean firm size from about 24 to 25.5 employees. 
However, these calculations should be taken as only a guide, since they rely on attributing workers to sectors through their occupational status, which as noted earlier is imprecise.
In contrast, own account proprietors declined from 65% of all proprietors in 1891 to 61% in 1911. The single operator was beginning to be under significant pressure, which suggests that the period covered was a turning point. As shown in Table 3, the own account proportion actually increased 1891-1901 to 67.6% of all proprietors in 1901, but then rapidly declined as the number of employers steadily increased. The number of own account fell 1901-11 by over 98,000 despite a rapidly growing population. This trend was to some extent apparent in the uncorrected published numbers shown in Table 1. However, from the adjusted estimates in Table 3 we can now be confident this trend was real, and we know its magnitude. It is also reflected in the overall proportion that proprietors formed of the working population, which reached a high point of 13.8% in 1891, before beginning to fall back. This change was entirely a result of decreases in own account; employer proportions of the population declined only slightly by 1911 and then numbers continued to increase.
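The share arithmetic in the paragraph above can be checked with a short sketch. It uses only the percentages quoted in the text (employers at 35% of all proprietors in 1891 and 39.5% in 1911, own account at 67.6% in 1901) and assumes that employers and own account together make up all proprietors, so that each share is the complement of the other. The function and variable names are illustrative only; they are not part of the authors' method or of the deposited data.

```python
# Shares quoted in the text, as percentages of all proprietors.
EMPLOYER_SHARE = {1891: 35.0, 1911: 39.5}
OWN_ACCOUNT_SHARE = {1901: 67.6}


def complement(share: float) -> float:
    """Return the other status's share, assuming the two sum to 100%."""
    return round(100.0 - share, 1)


def own_account_series() -> dict[int, float]:
    """Own account share by Census year, derived from the quoted figures."""
    series = {year: complement(s) for year, s in EMPLOYER_SHARE.items()}
    series.update(OWN_ACCOUNT_SHARE)
    return dict(sorted(series.items()))


def point_changes(series: dict[int, float]) -> dict[tuple[int, int], float]:
    """Percentage-point change between consecutive Census years."""
    years = list(series)
    return {
        (a, b): round(series[b] - series[a], 1)
        for a, b in zip(years, years[1:])
    }


if __name__ == "__main__":
    series = own_account_series()
    print(series)
    print(point_changes(series))
```

On these figures the own account share rises by 2.6 percentage points over 1891-1901 and falls by 7.1 points over 1901-11. The derived 1911 value of 60.5% corresponds to the rounded 61% quoted in the text.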
We can be confident about the scale of the 'estimates' constructed, as indicated by narrow confidence bounds. This reflects the benefits of a Census that includes the whole population: the data size is large (giving narrow standard errors for regression estimators), most responses were accurate, and while corrections for non-response bias and misallocation biases are essential, the resulting estimates are generally in a narrow range of possibilities. Hence the scale of female entrepreneurship identified, which is much greater than previous estimates, appears real and may even under-estimate. Also the turning point identified around the 1901 Census seems to have been truly a time where the growth of proprietorship in the British economy was decisively shifting towards large firms, or larger small firms (both non-corporate and corporate), and hence where wage employment was substituting for own account.
We have also presented estimates of proprietor numbers for aggregate sectors and the most common sub-sectors by gender. Only a brief overview has been possible, and the actual data are being made available for any researcher to analyse at the level of individual, or for whatever sector is desired. The main trend evident is again the strong contrast between the steady increase in employer numbers, and rise and then decline of own account. Although this story has strong gender and sector diversity, the sector distribution shows that the own account rise and then decline was mainly driven by numerical changes in males in maker-dealing, refreshment, and food sales, and to a lesser extent in farming, manufacturing, transport, and finance. Changes in female own account were generally smaller numerically, but larger relatively: particularly large for maker-dealing, personal services, and food sales. However, some sectors of female own account began to take off over the period, especially in retail, and in professions such as school proprietresses, musicians and singers. Female employer numbers also became more significant and grew more rapidly than for males, although male employers remained the dominant category in all sectors. Thus, while the 1901 Census was a marked turning point, at the detailed level the development was complex and multi-faceted. Within these patterns the strength of female entrepreneurship throws new light on the high level of female participation both as employers and own account, at far higher levels than previously identified. The high level of female participation evident from the original Census records demonstrates them to be generally a superior source of information to the trade directories, insurance, bankruptcy and other records that have been previously used, and counters many of the criticisms that have been levelled against the value of using published Census records (e.g. Barker, 2006; Kay, 2009; Aston and Di Martino, 2017). 
While defects certainly remain, the Census records appear to be a more complete coverage than has ever been previously envisaged. The proprietor population estimates in this article open up important avenues for further research. First, a wide range of research is now possible at the individual level. As a Census resource, data on individuals for the whole population is now provided that identifies almost all non-corporate business proprietors, allowing for more inclusive approaches of business history and opening new avenues for 'big data' analysis by business historians. This allows the demographic and other information on individuals included in the Census to be used. For example, the relationship between migration and entrepreneurship can be examined on a larger scale than previously possible (Godley, 2001; Smith et al., 2018). Other long-standing narratives about England and Wales entrepreneurship can also now be considered in the light of these data. The very large number of own-account proprietors we find supports existing historiography on the dominance of personal capitalism; while the fall in own-account proprietors by 1911 confirms the historiography of a shift to dominance of waged employment. Further nuance to the traditional accounts of business history in this period is also suggested by sector- and gender-specific changes. Secondly, establishing the numbers of non-corporate business proprietors allows comparisons to be made with the corporate sector. Various adjustments of coverage are needed, but reasonably accurate comparisons between employers, own account, and corporate businesses are possible which allow assessment of how business ownerships evolved for this important period. We have indicated the beginning of a shift towards growing corporate dominance of output in 1911, which appears to be mainly a result of own account shifting into waged employment. 
These are preliminary comparisons based on Feinstein (1972) on which further research is being developed, but they begin to open new approaches to debates about productivity and levels of business concentration by mode of organisation.
Thirdly, the locational coding can be used to open up tracking of locational change over time. Each individual is located in a household within a parish in I-CeM. Although there are some errors in attribution of parish location acknowledged in the I-CeM codes (Higgs et al., 2015: 114-15), much analysis can with care be undertaken at that level. Moreover, at the more aggregate level of Registration Sub-Districts the data are more locationally accurate, with the potential for spatial analysis to reveal urban-rural and other locational differences. Again, further research is being devoted to these patterns and their relation to structural changes in the economy as a whole. This reveals interesting geographies of entrepreneurship which highlight the importance of services and retail, rather than heavy industry, to England and Wales entrepreneurship, modifying older accounts of the economic geography of nineteenth-century England and Wales (e.g. Lee, 1981); as well as allowing assessment of the relationship between transport and entrepreneurship. Fourthly, the occupational descriptor strings allow proprietors with multiple businesses to be identified. This opens scope to assess portfolio development. Portfolios have been recognised as an important means by which businesses diversify and grow. Preliminary analysis of farm business portfolios in 1881 demonstrates the potential of the e-census data to be used to identify different types of business strategies: as responses to necessity or locational opportunity (Radicic et al., 2017). Fifthly, the estimates of proprietor numbers in 1891-1911 provide a means to start investigating the continuities over longer periods of time: by attempting to join up with later twentieth-century Censuses, and by linking to earlier Censuses which followed a different format. 
Finally, an important aspect of joining up data at the individual level also opens scope to track individual business proprietors over time, which offers opportunities to engage with research questions about the determinants of business growth and decline. Both forward and backward joining up of data to create panels are challenging, and are the subject of ongoing research by the authors.