Data note: Spatializing South African agricultural censuses, 1918–2017

ABSTRACT Agriculture is an intrinsically spatial production process. Where agriculture occurs on the landscape determines the environmental factors (e.g., soil, water, climate) that have large consequences for output and production risk. The location of agriculture also has substantial logistic, policy and market performance implications. To facilitate analysis of the spatial dynamics of agriculture, we developed a collection of new ADM 2 boundary files whose geographical dimensions and naming standards map directly to the 18 agricultural censuses that report farm inputs, outputs and related statistics for South African agriculture over the period 1918–2017. The statistical aggregates, representing Magisterial and Municipal Districts, changed in number, area and boundaries over time. Cross-referencing these changing statistical aggregates to our newly digitised census boundaries is an essential step for any geospatial assessment of the causes and (productivity and environmental) consequences associated with the changing physical footprint of South African agriculture over the past century.


Introduction
Statistics South Africa recently published the results of the 2017 census of commercial agriculture. While the release may not seem extraordinary, it marks the 100-year anniversary of the country's agricultural census, which began in 1918. Over the century, agricultural censuses have occurred at irregular intervals (but usually once or twice per decade), with results published down to the ADM 2 (i.e., administrative unit 2) level. Although most censuses featured accompanying printed maps, past publishers probably never imagined a future where all these boundaries would be digitised and the corresponding data joined to them for mapping and geostatistical analysis. By digitising and cross-referencing boundary files to agricultural statistics for each of the census years, we have now made this possible for the first time in a century.
However, this undertaking faced numerous challenges due to the significant changes in administrative boundaries over the past 100 years. These changes vary in nature, from simple name alterations of administrative units to shifts in their shape and area (even while retaining their original names), along with the formation of new administrative units from existing ones. Additionally, beginning in 2017, Statistics South Africa transitioned from publishing the Agricultural Census based on Magisterial ADM 2 boundaries to Municipal ADM 2 boundaries. Magisterial and Municipal boundaries are distinct non-concordant spatial entities, presenting a further set of challenges when seeking to standardise the spatial representation of South African agricultural production statistics.
This note addresses the measurement challenges posed by these evolving South African administrative boundaries by developing a standardised set of 18 digital maps that join accurately with the tabulated agricultural census data. We detail the alignment and digitisation process, which resulted in a collection of maps that can be used to represent the nation's agricultural census data over the past century. Agriculture is an intrinsically spatial production process. Thus, locating that production on the geographical landscape is critical for understanding the environmental (soil, water, climate) linkages with agriculture (e.g., Greyling, Pardey, and Senay 2023), the logistics that connect agricultural production to where it is processed and consumed (e.g., Joglekar and Pardey 2016), and a host of other spatially explicit factors, such as exposure to pests and diseases (Senay et al. 2022), that affect the ecological and economic performance of the sector.
This data note is structured as follows: Section 2 outlines the methods employed in this study, specifically the study area, map sources, and digitisation process. Section 3 presents the study's results, and Section 4 offers concluding remarks. Additional information regarding each of the census years is included in the Supplementary Material.

Study area and map sources
This study encompasses the entire geographical extent of the Republic of South Africa (RSA). The country boundaries were administratively enshrined in the South Africa Act (1909) when the Union of South Africa was formed in May 1910. The Union, which comprised the former territories of the Cape, Transvaal, and Natal colonies and the Orange Free State, became a fully sovereign Republic in May 1961 by way of a national referendum (see Note 1). Table 1 provides a summary of the sources of historical maps used to construct the digitised series of maps presented in this paper. A total of 13 printed sources and one digital source were the primary reference material we used to construct this map compilation, augmented by additional information gleaned from Law (1999).

Boundary digitisation and census harmonisation
To accurately represent South Africa's Magisterial District boundaries between 1918 and 2017, we utilised historical maps from sources such as the David Rumsey Map Collection (2023), Bartholomew (1922), Touring Club Italiano (1929), Statistics South Africa (1993), FAO (2014), Law (1999) and others. These maps were accessed in digital form (for years 2002, 2007, and 2017) or digitised from scanned maps (for all other years), then transformed and georectified to align with the country's digital national border sourced from FAO (2014). The georectifying process, which assigns geographic coordinates (latitude and longitude) to a scanned map image, used the World Geodetic System 1984 reference system (Kumar 1988; see Supplementary Material, p. 1, for more technical details). Depending on the map and the desired fit, various polynomial transformations were applied to optimise the alignment of the outer national boundaries and minimise the root mean square error of control point deviations. Modifications were made to identify, merge, or add specific administrative units as required for each of the corresponding agricultural census years delineated in Table 1. The resulting maps offer a comprehensive and consistent spatial representation of South Africa's administrative districts that concords with the data reported in each of the agricultural census years.
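The control-point fitting described above can be illustrated with a minimal sketch. It assumes a first-order (affine) polynomial, which is only one of the transformation orders the text alludes to, and fits it by ordinary least squares before reporting the root mean square error (RMSE) of the control-point residuals. The pixel and coordinate values are hypothetical, and the function names (`fit_affine`, `apply_affine`, `rmse`) are illustrative rather than taken from the software actually used.

```python
from math import sqrt

def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def fit_affine(pixels, coords):
    """Least-squares fit of lon = a0 + a1*col + a2*row (and likewise for lat)."""
    design = [(1.0, c, r) for c, r in pixels]
    # Normal equations: (X^T X) beta = X^T y, solved once per output coordinate.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):  # k = 0 for longitude, k = 1 for latitude
        xty = [sum(d[i] * xy[k] for d, xy in zip(design, coords)) for i in range(3)]
        params.append(solve3(xtx, xty))
    return params

def apply_affine(params, pixel):
    """Map a (col, row) pixel position to (lon, lat) with the fitted coefficients."""
    c, r = pixel
    return tuple(p[0] + p[1] * c + p[2] * r for p in params)

def rmse(params, pixels, coords):
    """Root mean square of the control-point residual distances, in degrees."""
    total = 0.0
    for pixel, (x, y) in zip(pixels, coords):
        fx, fy = apply_affine(params, pixel)
        total += (fx - x) ** 2 + (fy - y) ** 2
    return sqrt(total / len(pixels))

# Hypothetical control points: scanned-map pixels and their WGS 84 (lon, lat) positions.
control_pixels = [(0, 0), (1000, 0), (0, 1000), (1000, 1000), (500, 500)]
control_coords = [(16.0, -22.0), (26.0, -22.0), (16.0, -32.0), (26.0, -32.0), (21.01, -27.0)]

affine_params = fit_affine(control_pixels, control_coords)
control_rmse = rmse(affine_params, control_pixels, control_coords)
print(f"RMSE of control points: {control_rmse:.4f} degrees")
```

In practice, higher-order polynomials add terms such as col², col·row and row² to the design matrix; they reduce the RMSE at the control points but can distort areas far from them, which is why the choice depends on the map and the desired fit.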
The ADM 2 units reported in the 2017 agricultural census were Municipal boundaries, different in concept and geography from the Magisterial Districts reported in all prior censuses. Digitised boundaries for this latest census year were obtained directly from ISCGM (2016), and those for the 2002 and 2007 censuses were accessed from the FAO GAUL administrative boundaries dataset series published by FAO (2014). The administrative boundaries for 1993, 2002 and 2007 required additional processing to address the issue of multipart polygons, that is, geographically distinct but linked mapped polygons that are assigned the same district name. For census year 1993 we observed 17 multipart polygons, and for 2002 and 2007 there were 28 (see Note 2). These multipart polygons were created through the integration of the former homeland areas into the national Magisterial District boundaries. For our application, we exploded the multipart polygons and assigned unique sub-IDs to the resulting same-named polygons to ensure the subsequent concordance between the tabulated statistical data and its mapped geographical representation.
Note to Table 1: Boundaries for 2002 and 2007 include multipart polygons, which were exploded and numbered so that the associated data could be apportioned according to their respective areas. In 2017 the reporting of agricultural census data switched from Magisterial to Municipal District boundaries. For 1993, the shapes and positions of districts were used only as a visual anchor and boundary guide to improve the 1993 census map, as the printed map did not include a scale bar.
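The explode-and-number step, together with the area-based apportioning of district data, can be sketched as follows. This is a simplified illustration that works on part areas alone instead of real geometries; the district names, areas and census values are hypothetical, and the `name_1`, `name_2` sub-ID scheme is an assumption, not the numbering used in the published boundary files.

```python
def explode_with_sub_ids(districts):
    """Turn {district: [part areas]} into a flat list of (sub_id, district, area).

    Single-part districts keep their name; multipart districts get numbered sub-IDs.
    """
    parts = []
    for name, areas in districts.items():
        for i, area in enumerate(areas, start=1):
            sub_id = f"{name}_{i}" if len(areas) > 1 else name
            parts.append((sub_id, name, area))
    return parts

def apportion_by_area(parts, district_values):
    """Split each district-level value across its parts in proportion to part area."""
    totals = {}
    for _, name, area in parts:
        totals[name] = totals.get(name, 0.0) + area
    return {
        sub_id: district_values[name] * area / totals[name]
        for sub_id, name, area in parts
    }

# Hypothetical example: District A has two disjoint parts, District B has one.
district_parts = {"District A": [300.0, 100.0], "District B": [250.0]}
census_values = {"District A": 800.0, "District B": 500.0}

exploded = explode_with_sub_ids(district_parts)
apportioned = apportion_by_area(exploded, census_values)
for sub_id, value in apportioned.items():
    print(sub_id, value)
```

Apportioning by area preserves district totals, but it implicitly assumes that the reported quantity is spread evenly across the parts of a district.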
Digitising ADM 2 boundaries is a means to an end. The overriding objective is to align these digitised boundaries with the statistical divisions reported for all the published agricultural censuses spanning the period 1918-2017. This spatial alignment process required harmonising and adjusting all the district names recorded in the boundary files with the district nomenclature used in each of the corresponding census reports. For certain census years, specifically 1988 and 2017, selected data for some districts were reported as the sum of two or more districts. In these instances, the join between the boundary data and the statistical data had to be implemented in a way that allowed adjustments on a case-by-case basis. More complete details of our digitising procedures are reported in the Supplementary Material.
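A minimal sketch of such a name-harmonised join is given below. It assumes that names can be matched after simple normalisation (case, accents, hyphens, full stops and spacing) and that census rows reporting the sum of several districts are handled through an explicit, manually compiled lookup. All district labels, values and the `combined_rows` lookup are hypothetical, and the procedure actually used may differ.

```python
import unicodedata

def normalise(name):
    """Lower-case a name, strip accents, and collapse hyphens, full stops and spaces."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    return " ".join(ascii_name.lower().replace("-", " ").replace(".", " ").split())

def join_census_to_boundaries(census, boundary_names, combined=None):
    """Match census rows to boundary names.

    `combined` maps a census label that reports several districts together
    to the list of boundary names it covers (compiled case by case).
    Returns (joined rows, unmatched census labels, unused boundary names).
    """
    combined = combined or {}
    lookup = {normalise(b): b for b in boundary_names}
    joined, unmatched = {}, []
    for label, value in census.items():
        members = combined.get(label, [label])
        matched = [lookup.get(normalise(m)) for m in members]
        if None in matched:
            unmatched.append(label)
        else:
            joined[label] = {"boundaries": matched, "value": value}
    used = {b for record in joined.values() for b in record["boundaries"]}
    unused = [b for b in boundary_names if b not in used]
    return joined, unmatched, unused

# Hypothetical census table and boundary-file names.
census_rows = {"Alpha": 10.0, "Beta & Gamma": 30.0, "Epsilon": 5.0}
boundary_file_names = ["ALPHA", "Beta", "Gamma", "Delta"]
combined_rows = {"Beta & Gamma": ["Beta", "Gamma"]}

joined_rows, unmatched_census, unused_boundaries = join_census_to_boundaries(
    census_rows, boundary_file_names, combined_rows
)
print(joined_rows)
print("Census labels without a boundary:", unmatched_census)
print("Boundaries without census data:", unused_boundaries)
```

Listing the unmatched labels and unused boundaries explicitly is what makes case-by-case review possible: each leftover entry points to a spelling variant, a renamed district, or a combined reporting unit that still needs a lookup entry.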

Change in the district area
Setting aside the census year 2017, where the data are reported according to Municipal boundaries, over the period 1918-2007 the number of Magisterial Districts in South Africa increased from 207 to 411 (Table 1). During this period, the districts were also subject to numerous name and boundary (and thus, most likely, geographical area) changes, all of which we recorded as a "change" in an ADM 2 unit when comparing one census to the next. These census-to-census changes in the Magisterial boundaries are summarised in the three right-hand columns of Table 1. Thus, for example, while the net change in ADM 2 units between the 1918 and 1922 agricultural censuses amounted to a single district, this masks the addition of five new districts and the removal of four, underscoring the complexity of the changing administrative landscape. In most instances, the reported change represents a change in the boundary rather than the name of the district. Mapped indications of these census-to-census boundary changes are included in the Supplementary Material.
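The distinction between net and gross change can be made concrete with a small sketch. It compares only the sets of district identifiers in two consecutive censuses, so it captures additions and removals but not boundary changes to districts that keep their name, which would require comparing geometries. The identifiers below are hypothetical placeholders chosen to reproduce the 1918 to 1922 pattern of five additions and four removals.

```python
def district_changes(previous, current):
    """Summarise additions, removals and net change between two lists of district IDs."""
    prev_set, curr_set = set(previous), set(current)
    added = sorted(curr_set - prev_set)
    removed = sorted(prev_set - curr_set)
    return {"added": added, "removed": removed, "net": len(added) - len(removed)}

# Hypothetical identifiers: ten districts in the earlier census ...
earlier = [f"D{i:03d}" for i in range(1, 11)]
# ... of which four disappear and five new ones appear in the later census.
later = earlier[4:] + [f"N{i:03d}" for i in range(1, 6)]

changes = district_changes(earlier, later)
print(len(changes["added"]), "added,", len(changes["removed"]), "removed, net", changes["net"])
```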
Table 2 summarises the changing areal extent of the districts over time. The average district size halved from 0.592 million hectares in 1918 to 0.298 million hectares in 2007, while the median district size declined from 0.317 to 0.166 million hectares. Figure 1a presents an area density plot of district sizes for the 1918, 2007, and 2017 census years. All three distributions are right-skewed, with means well above their respective medians (Table 2). Notably, the degree of skewness almost doubled from 3.3 in 1918 to 6.1 in 2007. Over this period, the maximum district area changed little (ranging from 5.291 to 5.696 million hectares), while the smallest district size shrank markedly from 0.019 million hectares in 1918 to just 0.001 million hectares in 2007. The prevalence of smaller districts increased over time. By 2007, 202 (49%) of the districts were less than 0.162 million hectares in size, compared with 52 (25%) in 1918. Consequently, districts smaller than one million hectares accounted for almost 70% of the country's total area in 2007, well up on their 50% share in 1918 (Figure 1b). In contrast, Gordonia in the Northern Cape Province, the largest district, grew from 5.332 to 5.696 million hectares, such that this single district constituted 4.7% of South Africa's total area in 2007.
Changing the spatial resolution of the data from Magisterial District to Municipal boundaries in the 2017 census represents a significant spatial discontinuity in South Africa's agricultural census series. The number of spatial units reported in 2017 (213) is about half the 2007 total (411), returning the ADM 2 resolution of the data to where it was a century ago, in 1918, when there were 207 ADM 2 spatial units (Table 1). The distributional structure of district size in 2017 aligns with neither the 1918 nor the 2007 distribution (Figure 1). For example, the smallest district in 2017 (Mandeni in the province of KwaZulu-Natal) was 0.055 million hectares, almost three times larger than the 0.019 million hectares of the smallest district in 1918 (Boksburg in Gauteng province). In contrast, at the other end of the size distribution, the largest district in 2017 (Dawid Kruiper in Northern Cape province), at 4.441 million hectares, is smaller than its 1918 counterpart (5.332 million hectares for Gordonia district). The 2017 size distribution is still right-skewed, but the degree of skewness is now 3.048, making it the least skewed during the entire analysis period. As Figure 1b reveals, the cumulative area share of districts below one million hectares is 60%, compared with 70% in 2007 and 50% in 1918. In 2017, districts smaller than 3 million hectares accounted for more than 90% of the country's total land area, higher than the corresponding 2007 share (85%) and 1918 share (75%).
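The summary measures used in this section (mean, median, skewness and the cumulative area share of districts below a size threshold) can be computed as in the sketch below. The skewness formula shown is the moment coefficient (third central moment divided by the 1.5 power of the second), which is an assumption because the text does not state which estimator was used. The district areas are hypothetical and are not drawn from Table 2.

```python
from statistics import mean, median

def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (positive for a long right tail)."""
    n = len(values)
    mu = mean(values)
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def cumulative_area_share(values, threshold):
    """Share of total area held by districts strictly smaller than `threshold`."""
    return sum(v for v in values if v < threshold) / sum(values)

# Hypothetical district areas in million hectares: many small units, one very large.
areas = [0.1, 0.2, 0.3, 0.4, 4.0]

area_mean = mean(areas)
area_median = median(areas)
area_skew = skewness(areas)
share_below_one = cumulative_area_share(areas, 1.0)

print(f"mean={area_mean:.3f}, median={area_median:.3f}, skewness={area_skew:.3f}")
print(f"area share of districts below 1 million ha: {share_below_one:.0%}")
```

For a right-skewed distribution such as this one, the mean exceeds the median because a few very large districts pull the average upward.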

Spatial representation of the changes
Figure 2 provides a geographical representation of the location and timing of changes in the number, shape, and configuration of Magisterial Districts. Figure 2a reveals the changes that occurred between agricultural census years 1918 and 1922, while Figure 2b shows the cumulative boundary changes that occurred between census years 1918 and 1993. As the boundary change statistics reported in Table 1 indicate, most of the changes were clustered in the censuses from 1930 to 1956 and in those of 1965, 1993, and 2002.

Comparison with other sources
The only other compilation of historical South African subnational boundary files known to us is Giraut and Vacchiani-Marcuzzo (2009). Their compilation consists of nine datasets containing South African provincial, homeland, and Magisterial District boundaries spanning the period 1911-2001, primarily intended for use with the country's population censuses. A notable difference between the two compilations is that we used 11 primary maps, compared with the four maps used for the Giraut and Vacchiani-Marcuzzo (2009) collection. Table S1 in the Supplementary Material summarises details for each of the years in the respective compilations, including the number of Magisterial Districts that were digitised for each of these years.

Discussion and conclusion
Simple counts of the number of provinces or districts, and their changes over time (Table 1), fail to fully reveal the nature and extent of changes in the statistical representation of the South African agricultural landscape over the past century. For example, while a comparison of the 1918 and 2017 ADM 2 counts might suggest the spatial delineation of the agricultural statistics has changed little over the past century, our analysis reveals this is not the case (see, for example, Figure 2a and b). Moreover, functionally joining polygon boundary files to their respective tabulated statistics also requires an alignment and standardisation of district nomenclature, as we have done for this study. This study's harmonised and standardised polygon data set makes a spatially consistent representation of South African agricultural census data over the past century possible. Using these new data, Greyling, Pardey, and Senay (2023), for example, examined the impact of changes in South African agricultural policy over the past century on the location of the country's maize production and the spatially explicit productivity and climate risk implications of that changing physical footprint. Maize is the predominant crop within South Africa, accounting for half the country's entire cropped area from 1948 to 2007 (Liebenberg 2012; Directorate of Agricultural Statistics 2020; Greyling and Pardey 2019) and 27 percent of daily calorie consumption (FAO 2022). Thus, spatially associated changes in the yield performance and climate risk exposure of maize have direct and potentially profound economic, livelihood, and food security consequences. We expect the set of boundary files we produced for this study will unlock the potential for further research into the changing historical agricultural and land use patterns in South Africa and their various causes and consequences.

Data
The boundary files contributed by this data note are publicly available via the Data Repository for U of M (DRUM, https://doi.org/10.13020/fh35-7m54). Note that the file names in this collection correspond to the respective agricultural census year. Details of the concordance between the census and mapped boundary year are listed in Table 1.

Notes
1. Namibia, formerly known as South West Africa, gained independence from South Africa on March 21, 1990, after being administered by South Africa since the end of World War I (Encyclopedia Britannica 2021).
2. Geographically small, multipart polygons were evident in other years through digitisation errors or represented non-agriculturally relevant islands (e.g., Robben Island off the coast of Cape Town). In these instances, the polygons were deleted from our boundary files given they were inconsequential for the subsequent joins with the agricultural statistics data.

Figure 1. Feature area distribution: (a) density plots and (b) cumulative area share.

Figure 2. Temporal variation in the number, shape, and configuration of Magisterial Districts. Disclaimer: This map is a component of the University of Minnesota's GEMS Informatics digital historical administrative map collection. In constructing this digital collection, extensive efforts were made to adhere to the highest geodetic standards; nonetheless, there may remain certain discrepancies, omissions, and inconsistencies within the digitised maps. These inaccuracies can stem from limitations in the original data sources or issues that arise during the data pre-processing stage. Should you identify any errors, we encourage you to assist in their rectification by reaching out to us via the contact information available on our website: http://www.gems.umn.edu/. The depicted boundaries are not authoritative.

Table 1. Mapping boundaries to agricultural censuses: sources, scale and district change summary.
Note: The 2017 census was reported according to Municipal Districts, not Magisterial Districts as in all prior censuses.