Cars and socio-economics: understanding neighbourhood variations in car characteristics from administrative data

Abstract There were 30.7 million registered cars in Great Britain in 2011, outnumbering the total number of households recorded by the census. Despite this, the Driver and Vehicle Licensing Agency’s (DVLA) database of car model registrations remains underexplored as an indicator of socio-economic characteristics. In the past, car ownership itself has frequently been considered as a census proxy variable for affluence. However, this is an increasingly dated interpretation, as ownership has become more widespread across society and the value of cars varies considerably. The geography of different car types is likely to be more informative of local population characteristics, as the choice of model depends on several factors, notably the cost and the purpose of the vehicle. In partnership with the Department for Transport (DfT), a car segmentation was produced that grouped every car model registered in England and Wales in 2011 into 10 distinctive categories based on the vehicle’s key characteristics. Data representing the total number of registered cars for each car segment and three age groups were made available at a small-area geography (lower layer super output areas, LSOAs) to be analysed for this study. The analysis revealed that each car segment is uniquely distributed across London and the rest of England and Wales. The patterns were then compared with key 2011 Census variables on socio-economics to understand the extent to which spatial patterns of broad car characteristics correspond with variations in indicators of social make-up.


Regional Studies, Regional Science

INTRODUCTION
The 2011 UK Census confirmed that almost 75% of households had access to at least one car or van, and that 42.6% of these had access to multiple vehicles. Because cars are a commodity, variations in car ownership between neighbourhoods can tell us a lot about the consumption behaviour of residents. Much previous research has considered car ownership as an indicator of affluence due to the costs associated with purchase and maintenance (Galobardes, Shaw, Lawlor, Lynch, & Davey Smith, 2006). Today, however, even some of the poorest households have access to a car due to the falling costs of cars of certain characteristics, while many households in the most affluent postcodes are car-less due to their urban setting. However, the make-up or segmentation of car ownership in terms of the types of car or model owned is likely to be associated with the affluence or socio-economic characteristics of the owners. Considering the types of cars could therefore be a more useful indicator of consumption practices. Cars can be distinguished or segmented by their physical characteristics, and cars are known to range in value considerably. Therefore, each segment is inherently likely to be purchased by a particular socio-economic group due to prices and marketing strategies (Evans, 1959).
There is an increasing need for more proficient and reliable small-area population data to be made available for years between censuses. There has been growing pressure to harness data from alternative sources such as administrative records held by government departments. Administrative data could provide a unique insight into consumption practices in England and Wales. The Driver and Vehicle Licensing Agency (DVLA) makes recordings of every vehicle in Great Britain as soon as it is registered (Department for Transport, 2015), although currently the DVLA releases no open data about cars at the neighbourhood level. As the records were originally collected for administrative purposes, data about purchasers' characteristics are absent. Similarly, details about the vehicles are relatively limited and prices are not recorded. However, the data do include the model of the vehicle, and these can be easily segmented into distinctive categories.
Whilst there have been studies that have sought to establish a link between consumers' characteristics and choice of vehicle type based on small survey samples (Choo & Mokhtarian, 2004), no prior research has attempted to identify an association between local population characteristics and car segments using a near 100% sample of vehicle data which are available from government administrative records. This paper, therefore, presents a car segmentation of every registered car in 2011 in England and Wales recorded in the DVLA databases. The segmentation is based on the structure and style of vehicles and comprises 10 key groups, and it was necessary to reduce the data so they could be disclosed by the DVLA whilst at the same time retaining key variances useful for analysis. Data on the number of cars per segment have been subsequently released at a small-area level for this study by aggregating the registered addresses of every vehicle; cars were also aggregated into three age groups. The segmentation provided an invaluable insight into variances in car consumption across England and Wales and within cities. The research, therefore, aims to use the segmented car data to understand the general trends in the access to car types across England and Wales, focusing on their registered locations as aggregated into LSOAs. Furthermore, the research will also run tests to identify how socio-economics are associated with local car characteristics.

BACKGROUND
There is a wealth of literature on modelling car ownership (de Jong, Fox, Pieters, Vonk, & Daly, 2004). The choice to purchase a vehicle is often determined by a range of factors, notably the distance needed to be travelled on a daily basis, the provision of alternative transportation (such as public transport), and the associated costs, including purchase, maintenance and parking facilities. Therefore, consumption of cars is likely to vary by both socio-economics and location.
Car ownership has frequently been considered as a census proxy for affluence or social standing. There is still a relationship between median income and cars per household at a neighbourhood level within England and Wales (Yeboah et al., 2015). Not having access to a car was included as one of the four census variables within the once widely used Townsend Deprivation Index (Townsend, Phillimore, & Beattie, 1988). However, using this variable as a proxy has several shortcomings. The number of cars registered in the UK is increasing every year and there are now more cars than there are occupied households. There are claims that the UK has now reached 'peak car', as the growth in car ownership has levelled off in recent years and congestion across the road network has peaked (Goodwin & Van Dender, 2013; Headicar, 2013). Cars are no longer a product restricted to the most affluent. The value of cars ranges considerably; older vehicles can be purchased second hand very cheaply, whilst new models of luxury vehicles often cost over twice the average annual salary. The extreme range in the value of cars registered in the UK may mean that, while the rate of car ownership may no longer vary as much across the country with affluence or socio-economics, variations in the type and age of cars between neighbourhoods may be substantial.
Car ownership, however, is known to vary across space due to urban density (Headicar, 2013). Car use in city centres is often actively discouraged by councils through parking restrictions, while London has, in addition, a central congestion charge. City centres also often get very congested during rush hours, which may deter local workers from driving in favour of walking, cycling or taking public transport (Paulley et al., 2006). In addition, there may be less of a need for a personal vehicle due to the density of services and social networks in central areas.
Location is, therefore, still an important influence on car consumption. The congested streets and tight parking spaces have made a new generation of smaller cars popular amongst city dwellers. Consequently, the UK car market has seen a distinctive shift toward smaller car segments since 2008 (SMMT, 2014). It is therefore reasonable to assume that the composition of cars is likely to vary between different urban settings.
Household characteristics are also likely to influence the decision to purchase a particular type of car. The disposable income of the buyer influences the budget available for a personal vehicle. Wealthier individuals may be more inclined to purchase a new vehicle, and to do so more frequently (Pearman & Button, 1976). In contrast, those with lower disposable incomes are more likely to purchase previously owned vehicles, and also more likely to keep an individual vehicle for a longer period of time.
The choice of model is also influenced by the intended purpose of the car, for example, family use or business trips. The relationship between household types and car model consumption is also reinforced by how vehicles are marketed differently (Evans, 1959). For instance, many manufacturers target advertising at families by emphasizing practicality and cost benefits of particular cars, whilst some vehicles are idolized as status symbols and can represent a form of conspicuous consumption (Mason, 1998). Consequently, the composition of cars varies heavily between neighbourhoods, and cars have become a commodity as much as they are a means of transport (Belk, Mayer, & Bahn, 1982;Ferber, 1962). To many people, vehicles are seen as a reflection of themselves and their social status.
There exists, therefore, a relationship between self-image and brand preference (Ross, 1971). For example, one early study found individuals who are more cautious and conservative to be more likely to consider a small car as a practical and economical convenience (Jacobson & Kossoff, 1963). The degree of association between cars and socio-economic groups is also grounded in society (Rainwater, 1974), and the use of popular terminology such as 'Chelsea tractors' links the importance of neighbourhood to this phenomenon (Jackson, 2006). Consequently, the above features would lead one to assume that the composition of cars in terms of their type and age is likely to vary with the characteristics of the local residential population.
Much research has already sought to establish the key determinants of car model choice from multinomial logit models and nested logit models using survey data. Much of it has considered driver and household characteristics as explanatory variables (Berkovec & Rust, 1985; Lave & Train, 1979). Kitamura, Golob, Yamamoto, and Wu (2000) also considered local geographic characteristics such as population density and accessibility in addition to the former. The typical owner of different car segments is therefore known to vary. For instance, Choo and Mokhtarian (2004) used a survey of 8000 inhabitants of the San Francisco Bay Area to profile the owners of American car segments based on mobility, lifestyle, attitudes and demographics. However, the previous studies have been based on individual-level survey data of small sample sizes relative to the number of cars registered. This research will instead establish the utility of aggregate car registration data which are representative of every car registered within a country.
Currently, the census is the only source of easily available open data on vehicle use at the small-area level. However, the census in England and Wales only asked two questions related to car use: 'How many cars or vans are available for personal use in the household?' and 'What is the most commonly used method of transport?' It did not consider any characteristics of the car, such as the segment, price or age. It also only described access, so it cannot account for actual ownership or the registration of company cars (Carr-Hill & Chalmers-Dixon, 2005). Therefore, a new dataset produced from DVLA data would be useful for understanding car ownership trends in England and Wales at a small-area level, especially as the data could be updated annually.
Some studies have taken advantage of car registration data produced from administrative records as a means of researching population trends through consumption practices. Li, Dodson, and Sipe (2015) studied the intra-urban patterns of the fuel efficiency of vehicles in two Australian cities using data from administrative records, and subsequently identified how these trends link to socio-economics. Another example is Winkelmann and Winkelmann's (2010) study of inequality in Switzerland which considered the number of luxury car registrations per 1000 persons as an indicator of conspicuous consumption. Their study took advantage of administrative data collected by the Federal Roads Office. However, this indicator was only available at a regional level. Prior to this research, no other study has explored UK car registration data at a small-area level through car segmentation. Neither has any research used a near 100% sample of car registration data to establish how the popularity of car segments is associated with local socio-economic characteristics.

DATA
Aggregated car information for the study was provided by the DVLA following a request made to the Department for Transport (DfT). Due to the sensitivity of the data, the DfT could only disclose information at a small-area geography if it met their minimum aggregation requirements for data-protection reasons. In order to retain general information about the cars whilst complying with the data-protection requirements, a classification of vehicle type was employed to segment vehicles into distinctive groups. As there is no formal and universally accepted segmentation of cars registered in the UK by model type, one had to be produced for this study.
The DVLA records data on a range of variables for every car registered in the country, including age and model (Department for Transport, 2013). Upon the first registration, the 'type of body/ vehicle' is recorded, and this is most commonly inputted via the automated first registration and licensing system (Driver & Vehicle Licensing Agency, 2015). However, guidelines are not given to support a typology. There is likely to be little consistency between individual manufacturers and also the registrations manually entered. The DfT also aggregates vehicles into much broader categories; cars, motor cycles, light goods, heavy goods, and buses and coaches (Department for Transport, 2015). Despite various car classifications existing previously, none has been applied to DVLA data in the public domain. Additionally, the number of unique car models on the market is also increasing and cross-over vehicles are also growing in popularity making classifications harder to maintain (Holweg, Davies, & Podpolny, 2009). A segmentation, therefore, was devised for this study with the aim of segmenting all cars registered in England and Wales into distinctive groups based on their size and typical market.
Initially, data on the generic models and age of vehicles across Great Britain were obtained from the DfT. These data included a list of all generic models in the car registration database as of 31 March 2011 by their total frequency. The frequency was also disaggregated into three age bands, which were recommended by the DfT. The first age band is mostly represented by vehicles up to three years of age to denote the youngest cars; many of these vehicles would not yet have had their first MOT. The second group mostly represents cars between three and 11 years old. The final group represents all vehicles that are 11 years old or older. The most popular generic models in the dataset are shown in Table 1.
Various vehicle and car segmentations already exist, and these were considered to assist the classification devised for this study. Classifications of vehicle type were devised for the models and brands of American vehicles (Bureau of Transportation Statistics, 1999; Curry, 2000; Golob, Bunch, & Brownstone, 1997), and also for Europe (European Commission, 2011). However, the UK market is unique in many respects, and consequently the composition of vehicles is different. Therefore, a UK-based segmentation is required to secure an accurate summary of the types of cars registered within England and Wales.
The car segmentation created for this research is largely based on the structure of the vehicle, much like the Euro Car Segment (European Commission, 2011). It had to be tailored for this research by using the names of vehicles registered in the UK as revealed from a dataset provided by the DfT in order to merge it with the DVLA database. This led to a 10-group car classification using the rough guidelines underlying the Euro Car Segment. However, a noticeable difference is that this classification distinguishes between large family cars and compact executive cars, a distinction more common in British markets.
To allocate cars to the segmentation, manufacturers' websites were trawled for their car descriptions and specs. The majority of cars clearly met the criterion for their assigned car segment, whilst some required a more detailed consideration of the size of the vehicle. A minority of car models had editions that varied substantially from their most common variant, and unfortunately the classification could only consider the generic model due to the data available. These were assigned, therefore, on the basis of their most common variant. There were also cars that fell comfortably into two groups. For instance, some family cars have sports variants or may vary in structure. The 10 car segments are described in Table 2.
The DfT successfully used the car segmentation look-up file to produce counts of cars falling into each car segment at the lower layer super output area level (LSOA) across England and Wales. LSOAs are an aggregate geography that represent between 1000 and 3000 individuals, and were recently updated to account for population changes as recorded from the 2011 Census (ONS, 2015). With just over 34,750 units in England and Wales, LSOAs represent an excellent geography that will allow the research to capture local variances in car characteristics at regional and intra-urban levels. LSOAs have sometimes been termed 'neighbourhoods' under the assumption that neighbourhoods refer to local areas.
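The look-up step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DfT's actual procedure: the model-to-segment assignments, the record layout, the LSOA codes and the function name `count_by_lsoa_segment` are all hypothetical, and the minimum aggregation rules applied before disclosure are not modelled.

```python
from collections import Counter

# Hypothetical look-up from generic model name to car segment.
# These assignments are illustrative; the study's own look-up file is not reproduced here.
SEGMENT_LOOKUP = {
    "FORD FIESTA": "supermini",
    "FORD FOCUS": "small family car",
    "LAND ROVER DISCOVERY": "SUV",
}


def count_by_lsoa_segment(records, lookup):
    """Count cars per (LSOA code, segment); models missing from the look-up are tallied separately."""
    counts = Counter()
    unmatched = Counter()
    for lsoa_code, model in records:
        segment = lookup.get(model.strip().upper())
        if segment is None:
            unmatched[model] += 1
        else:
            counts[(lsoa_code, segment)] += 1
    return counts, unmatched


# Made-up registration records: (LSOA code, generic model name).
records = [
    ("E01000001", "Ford Fiesta"),
    ("E01000001", "Ford Focus"),
    ("E01000001", "ford fiesta "),
    ("W01000002", "Land Rover Discovery"),
    ("W01000002", "Unknown Model"),
]

counts, unmatched = count_by_lsoa_segment(records, SEGMENT_LOOKUP)
for (lsoa_code, segment), n in sorted(counts.items()):
    print(lsoa_code, segment, n)
```

Normalising the model name before the look-up is one way of coping with inconsistently entered registrations; keeping a separate tally of unmatched models makes it easy to see which names still need a segment.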
The DVLA data were also extracted from the 31 March 2011 edition of the DVLA car registration database, so they align well with the LSOA-level census data which were recorded on 27 March 2011. The age band of the vehicle was also included in the data release so it would be possible to analyse local variations in car ages across England and Wales as a possible proxy for the quality of the vehicle. Additionally, the DfT included estimated counts of company cars for each LSOA by identifying vehicles registered to keepers recorded as companies. However, the models of these vehicles were not disclosed. The classification segmented 25.6 million cars into car segments at the LSOA level in England and Wales. In addition, the data also identified 2.18 million company cars, although it was not possible to segment these vehicles.
Although the classification is based entirely on the physical structure of the cars, each car segment can be associated with particular vehicle user characteristics and the lifestyles of owners. For instance, city cars and superminis are the smallest vehicles and, therefore, are stereotypically associated with urbanites in areas where parking space is likely to be restricted. There are also segments that describe family cars, including MPVs, which are often suitable for larger households. Executive cars are most commonly associated with business use; it is probable that a large proportion of these could be company cars too. Sports cars and luxury cars are both typically high-end vehicles and more expensive. SUVs, however, are typically marketed to two very different groups. As practical 4x4 vehicles necessary for off-road driving, they are prevalent in rural areas. However, luxury variants have become very popular; for instance, there were over 150,000 Land Rover Discoverys in the dataset. Within each car segment the vehicles are likely to be in relatively similar value bands due to their common intrinsic properties, with the executive, SUV, luxury and sports segments typically representing the most expensive cars within the segmentation (Baltas & Saridakis, 2009).
As with all classifications, this car segmentation suffers from some limitations. Discrete boundaries between segments needed to be defined; some vehicles had to be subjectively grouped as they were appropriate to multiple types. Even within each car segment, the structure of vehicles, their prices and their typical clientele may still range considerably, as illustrated by the SUVs. Although the classification could be disaggregated by the three age groups, the value of newly available cars can still vary considerably within each segment. However, despite some shortfalls, the segmentation is still the most efficient means of grouping vehicles so that the data could be accessed and easily interpreted.
Due to the nature of the UK car market, the number of cars from different car segments is very unbalanced (Table 3). High-end car segments are far less common than the other segments. There are more Ford Focuses, Ford Fiestas, Vauxhall Corsas and Vauxhall Astras in the dataset individually than there are executive cars and luxury cars combined. However, the SUV category is quite large with just over 1.76 million registered vehicles in the dataset.
There are also variations between the segments in terms of the proportions in the three age samples (Table 4). The middle age sample of cars which are 4-10 years of age is by far the largest of the three, representing 14.77 million cars. The data revealed that smaller cars have become more popular. The city car category has grown in proportion from 3.7% of all cars aged 10 years and over to 14.5% of cars aged up to three years old (the proportion of city cars within the youngest age category is over twice the average proportion of city cars across the whole sample). The supermini category has also grown. The SUV is the only other category to have increased in the youngest age sample. In contrast, luxury cars have declined the most proportionally and represent just 0.5% of cars aged less than three years old. It is possible that part of this is due to competition from the rise in popularity of luxury variants of SUVs.

ANALYSING VARIANCES IN CAR CHARACTERISTICS AT A SMALL-AREA LEVEL
The car data from the DVLA can be analysed across the 34,753 LSOAs for England and Wales. A very small proportion of cars could not be joined to LSOAs due to incompatible postcodes or uncompleted records, so the total number of vehicles in the LSOA dataset stood at 25.6 million cars, an average of 736.6 cars per LSOA. The LSOA with the most registered cars had a total of over 181,000. This figure is unreflective of the number of vehicles owned by local residents and is most likely due to the activities of a large car leasing company which is currently located there. A limitation of using administrative records to analyse trends in society is that as the data are not recorded for this purpose, data quality checks are not guaranteed. Table 5 displays the descriptive statistics for the LSOA car data with the company cars removed from the sample. The maximum of 7401 still appears somewhat unrealistic considering LSOAs have an average population of 1600 persons. The 2011 Census recorded that the total number of cars and vans available to residents of this particular LSOA was just 709. Although attempts have been made to remove vehicles that are registered to businesses, the data still contain some vehicles that are not registered to household addresses. This specific LSOA hosts the DVLA's headquarters, so it can be considered to be a unique case. Fortunately, it is likely that only a minority of cars not recorded as company vehicles are recorded to non-residential locations. Only six LSOAs contained 2000 or more cars. Figure 1 displays a frequency histogram for the total number of cars per LSOA excluding company cars.
By linking the car data to census data, the ratio of cars per occupied household was mapped across England and Wales. From observing the data at a regional level, it is apparent that the number of cars per household is inversely related to population density (Figure 2). London's high urban density makes it the only region where cars do not outnumber households.
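The cars-per-household ratio and the per-LSOA average quoted in this section can be reproduced with simple arithmetic. The following Python sketch is illustrative only: the region names, car counts and household counts are made up, the helper `cars_per_household` is a hypothetical name, and pooling regional totals (rather than averaging LSOA ratios) is an assumption about how a regional figure might be derived. Only the 25.6 million cars and 34,753 LSOAs are taken from the text.

```python
def cars_per_household(cars, households):
    """Return registered cars per occupied household, or None if there are no households."""
    if households <= 0:
        return None
    return cars / households


# Made-up LSOA records for illustration only: (region, registered cars, occupied households).
lsoa_records = [
    ("Region A", 450, 600),
    ("Region A", 300, 600),
    ("Region B", 900, 700),
    ("Region B", 780, 700),
]

# Ratio for each LSOA, as would be mapped at the small-area level.
lsoa_ratios = [cars_per_household(cars, households) for _, cars, households in lsoa_records]

# Regional ratio from pooled totals (an assumption; averaging LSOA ratios would weight areas equally instead).
totals = {}
for region, cars, households in lsoa_records:
    region_cars, region_households = totals.get(region, (0, 0))
    totals[region] = (region_cars + cars, region_households + households)
regional_ratios = {
    region: cars_per_household(cars, households)
    for region, (cars, households) in totals.items()
}

# Figures stated in the text: 25.6 million cars joined to 34,753 LSOAs.
mean_cars_per_lsoa = 25_600_000 / 34_753

print(regional_ratios)
print(round(mean_cars_per_lsoa, 1))
```

In this toy data Region A has fewer cars than households (a ratio below 1), which is the situation the text reports for London.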

Company cars
The vast majority of the 2.37 million cars registered to companies are new or nearly new. Just over 75.5% of them are under three years old. As it is unlikely that there was a sudden surge in sales of company cars leading up to 2011, it is probable that this high figure is due to cars being sold off and replaced prior to their first MOT. There is a positive statistical association between the size of the workplace population as recorded from the 2011 Census and the percentage of cars that are identified as being registered to a company. The pairing returned a Pearson's coefficient (r) of .453 (p < 0.01), despite the fact that the distribution of company cars is unlikely to be evenly balanced between employers and places of work. The rest of the data presented in this paper, therefore, only account for cars that are not registered to companies because company car traits are less likely to be reflective of typical car consumption choices by local residents (Prevedouros & Schofer, 1992).

Figure 1. Histogram of the number of cars registered to households in each lower layer super output area (LSOA) (the outlier has been excluded).

Figure 2. Map of the number of cars per household as recorded from the LSOA-level car segmentation data.

The age of cars
There are notable regional variations in the age of registered vehicles. Table 6 displays the mean percentage of cars of each age banding per region. The data are represented as location quotients, whereby the percentage for each age band in each region is divided by the percentage of that respective band across the entire dataset. Therefore, values of 1 identify the same rates as the national average, values above 1 identify an over-representation in the region, and values below 1 identify an under-representation. The data reveal that London has far greater proportions of older cars than the national average, whilst the two most northerly regions have higher proportions of newer models. However, across the whole of England and Wales, local variations in car consumption behaviour are most striking within cities due to urban spatial inequalities (Department for Communities & Local Governments, 2011). Consequently, many of the following figures focus on Greater London to exemplify intra-urban patterns. London has a population of 8.4 million and a distinctive geodemographic distribution. The distribution of the proportion of cars from each of the three age bands is mapped individually in Figure 3. As the numerical range of each variable varies, the choropleth maps have all been displayed as quantiles.
There is a clear spatial disassociation between the distribution of newer and older cars within London, and this is experienced throughout metropolitan areas across the rest of England and Wales too. Newer cars, especially those sold brand new, are usually more expensive and, therefore, are more likely to be purchased by those with a higher disposable income (Pearman & Button, 1976). Consequently, these patterns may be associated with local socio-economic segregation, as persons of similar traits are known to cluster due to land prices and also preferences associated with their other characteristics (Harris, Sleight, & Webber, 2005). The association between the ages of cars and local socio-economics will therefore be tested below. However, the variations in the types of car may also exhibit distinctive patterns across England and Wales.
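The location quotient used in this section is a ratio of two shares. A minimal Python sketch follows, assuming pooled counts per region and age band; the study reports mean LSOA percentages per region, so this is a simplification, and the regions, band labels and counts below are made up for illustration.

```python
def location_quotients(region_counts):
    """Map {region: {band: count}} to {region: {band: location quotient}}.

    LQ = (band share within the region) / (band share across all regions).
    """
    # National count per band, pooled over all regions.
    national = {}
    for bands in region_counts.values():
        for band, n in bands.items():
            national[band] = national.get(band, 0) + n
    national_total = sum(national.values())

    quotients = {}
    for region, bands in region_counts.items():
        region_total = sum(bands.values())
        quotients[region] = {
            band: (n / region_total) / (national[band] / national_total)
            for band, n in bands.items()
        }
    return quotients


# Made-up counts of cars by age band (illustrative only).
example = {
    "Region A": {"under 3": 20, "3-10": 50, "over 10": 30},
    "Region B": {"under 3": 30, "3-10": 50, "over 10": 20},
}

lq = location_quotients(example)
for region, bands in lq.items():
    print(region, {band: round(value, 2) for band, value in bands.items()})
```

In this toy example Region A has a quotient above 1 for the oldest band and below 1 for the youngest, which is how an over-representation of older cars would appear in a table of location quotients.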

Car segments
The distributions of the proportion of cars from each individual segment were also compared and analysed across the whole of England and Wales. Table 7 identifies regional differences in the mean proportion of each of the 10 car segments.

Table 6. Location quotients displaying regional variations in the proportion of each of the car age bands across England and Wales. Columns: Region; Under 3 years old; 3-10 years old; Over 10 years old.

At the regional level there are only subtle variations in the composition of car segments. However, London remains distinctive due to notably high proportions of luxury, executive and compact executive cars. This is possibly due to the concentration of finance-related occupations and the high median salaries in London, which may have encouraged conspicuous consumption practices (Stewart, 2011). The smaller-sized car segments are relatively under-represented. London is also noted as being distinctive from the other regions in terms of car use. Previous research has identified that the average car miles driven per head in the capital were half the national average (Headicar, 2013). This is largely a consequence of London's urban density and its congested road network, accompanied by the regularization of public transport provision (Paulley et al., 2006). It is also probable that the types of individuals most likely to favour smaller cars are those most likely to opt out of owning a vehicle in London for these reasons.
As observed above, the variations within cities are more pronounced, presumably due to the proximal socio-spatial inequalities found within most urban areas in England and Wales (Department for Communities & Local Governments, 2011). Figure 4 displays comparisons between each car segment across Greater London and identifies notable distinctions. The data are again represented as five quantiles (quintiles) for each segment.
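The five-class quantile display used for the maps can be illustrated with a rank-based classifier. This Python sketch is an assumption about one simple way to form the classes, not the mapping software's method; the function name `quantile_classes` and the proportions are hypothetical, and tied values are split by their order in the input rather than kept together.

```python
def quantile_classes(values, n_classes=5):
    """Assign each value a class from 1 to n_classes so that classes hold near-equal numbers of areas."""
    if not values:
        return []
    # Indices of the values sorted from smallest to largest.
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, index in enumerate(order):
        classes[index] = rank * n_classes // len(values) + 1
    return classes


# Made-up proportions of one car segment in ten areas (illustrative only).
proportions = [0.5, 0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 1.0]
classes = quantile_classes(proportions)
print(classes)
```

Because the classes are formed from ranks, each map shows the same number of areas per class whatever the numerical range of the variable, which is why quantiles suit comparisons between segments of very different sizes.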
The spatial distributions identified distinctive associations and disassociations between car types, as exemplified in London. Each segment is also uniquely distributed. Luxury cars are most concentrated in central-west London where house prices are the highest, and their concentration is so strong that they are actually the majority car type in some central LSOAs despite being the smallest car segment (0.82% on average). Whilst SUVs and sports cars also cluster in affluent neighbourhoods, SUVs also have a tendency to be prevalent in more rural parts of the country. In contrast, Figure 4 reveals that large family cars are more abundant proportionally in areas of outer London known to be more deprived and more ethnically diverse. Households from these neighbourhoods are also more likely to have young children (Longley & Singleton, 2014). To test the associations between car segments at the neighbourhood level, the proportion of each of the segments was compared against each other in a Pearson's correlation matrix (Table 8).
As would be expected, there are correlations between the presence of cars of similar purposes and sizes at the neighbourhood level. For instance, the smallest car type (the city car) correlates with the next smallest, the supermini. The supermini then correlates with the small family car and so on. Interestingly, however, the city car does not share a positive association with small family cars, presumably due to distinctive customer bases. The only other segment with which city cars share a strong correlation is sports cars, a more expensive segment. This is possibly a consequence of city cars being purchased by young professionals, as the vehicles are typically targeted at these cosmopolitan groups. The strongest associations are between executive and compact executive cars. In contrast, the greatest negative correlation is between executive cars and superminis. The high-end segments are all positively associated with one another, whilst the MPV segment only positively correlates with the two 'family car' types.
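A Pearson's correlation matrix of the kind summarised above can be sketched as follows. The segment proportions are invented so that the arithmetic is easy to check by hand, and the function names `pearson_r` and `correlation_matrix` are hypothetical; the sketch shows the technique only and does not reproduce any value in Table 8.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return a nested dict of pairwise correlations between named columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Made-up percentages of cars per segment in four areas (illustrative only).
segment_shares = {
    "executive": [1.0, 2.0, 3.0, 4.0],
    "compact executive": [2.0, 4.0, 6.0, 8.0],
    "supermini": [40.0, 30.0, 20.0, 10.0],
}

matrix = correlation_matrix(segment_shares)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

The toy columns are exact linear functions of one another, so the coefficients come out at the extremes of +1 and -1; real segment proportions would give intermediate values.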

COMPARISON WITH NEIGHBOURHOOD STATISTICS
To achieve an understanding of how car consumption is associated with neighbourhood characteristics, the DVLA data were joined to LSOA-level census records. The 2011 Census recorded some 27.3 million cars and vans in England and Wales to which the population had access, 785.4 per LSOA on average. However, census statistics on car access cannot be validly compared with the DVLA data as the 2011 questionnaire did not require respondents to distinguish between cars and vans. Additionally, the census recorded access to vehicles rather than ownership, so it cannot be used to evaluate the DVLA data. Despite this, there is a strong relationship between the number of registered cars per household and the availability of cars and vans per household across England and Wales. The indicators share a statistically significant Pearson's correlation coefficient of r = .958.
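A join of this kind, followed by the per-household rates that feed the correlation, might look like the sketch below. It is an assumption-laden illustration only: the record layout, the field names, the helper `join_rates` and all counts are hypothetical, and only areas present in both sources are kept.

```python
# Hypothetical inputs keyed by LSOA code; every count here is invented.
dvla_cars = {"E01000001": 410, "E01000002": 655, "E01000003": 890}
census = {
    "E01000001": {"households": 700, "cars_or_vans": 455},
    "E01000002": {"households": 640, "cars_or_vans": 704},
    "E01000004": {"households": 600, "cars_or_vans": 660},
}

def join_rates(dvla, census_rows):
    """Inner-join on LSOA code and return per-household rates for each area."""
    joined = {}
    for code in sorted(dvla.keys() & census_rows.keys()):
        households = census_rows[code]["households"]
        if households == 0:
            continue  # skip areas with no households to avoid dividing by zero
        joined[code] = {
            "registered_per_household": dvla[code] / households,
            "access_per_household": census_rows[code]["cars_or_vans"] / households,
        }
    return joined

rates = join_rates(dvla_cars, census)
for code, row in rates.items():
    print(code,
          round(row["registered_per_household"], 3),
          round(row["access_per_household"], 3))
```

The two rate columns produced this way are the paired series on which a Pearson coefficient would then be calculated.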
Census data are a very useful measure of LSOA-level population characteristics, especially given that the 2011 Census itself was recorded just a couple of days prior to the DVLA database used for this research. It is also the only open record of small area attributes such as socio-economics, demographics and housing which has a near 100% coverage. It would therefore be insightful to observe how household characteristics from the census are associated with the types of cars registered locally from the DVLA data.

Socio-economics
Affluence and socio-economics have been widely considered as the most substantial influences on car model choice amongst consumers. It is therefore expected that a socio-economic indicator would associate with the composition of car model segments across England and Wales. To test this, the car data were compared with the National Statistics Socio-Economic Classification (NS-SEC) (first described by Rose & Pevalin, 2001). Table 9 displays the associations of car age and car segment with NS-SEC groups. The correlations between NS-SEC groups and car segments reveal distinctive associations between traits in car ownership and social standing. Car segments that are most commonly seen as high-end purchases are more prevalent amongst neighbourhoods with higher proportions of NS-SEC groups 1 and 2. There is also a small positive correlation between these groups and city cars, presumably due to the high concentration of these areas in central, more affluent locations. Cars more commonly associated with family use are more prevalent in lower socio-economic neighbourhoods. Large family cars and MPVs have the strongest associations with the lowest socio-economic group. This is possibly because more deprived households tend to have larger families (Land, 2004).
As the proportion of luxury cars which are in the 10 years or older age category is far larger than the average for the rest of the car segments, it is possible that older and therefore cheaper variants are more evenly distributed across cities. The positive correlation between these cars and higher socio-economic groups is slightly weaker than that for sports cars and executive cars. In contrast, city cars have higher proportions in their youngest age group. This may also contribute to their popularity amongst groups with higher disposable incomes. The correlations with NS-SEC group 4, small employers and own account workers, do not fit in so linearly with the results obtained from the other socio-economic groups. Due to the nature of the occupations described in the group, it is most likely to be more prevalent in rural areas where primary industries are more predominant (Rose & Pevalin, 2001). Consequently, its high association with SUV registrations is unsurprising.

Multivariate analysis
In order to explore the more complex multivariate relationships between variables from the car and the socio-economic data, a canonical correlation analysis (CCA) was implemented. The CCA is a means of identifying and measuring the most substantive linear relationships between combinations of variables from two separate datasets (Thompson, 1984). The method accounts for the fact that all variables in both data can associate with each other, and can therefore identify statistically significant trends which are not observable from univariate analysis. The advantage of the technique is that it can summarize the key trends across complex data in the form of a small number of linear models; it is therefore a very useful method for exploratory statistical analysis.
The CCA works by attempting to identify the general associations between multiple variables from two different sources by correlating canonical variates, which refer to linear combinations of weighted variables from both datasets (Stewart & Love, 1968). The total number of variates produced matches the number of variables from the smaller variable set. Each variate (or function as they are sometimes termed when paired) is representative of a distinctive linear relationship between the variable sets, although their R² measures will vary substantially so usually only some of the significant variates are considered.
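For reference, the standard textbook formulation of the first canonical pair can be written as below. The symbols (X, Y, a, b, Σ, p, q) are notation introduced here for illustration and are not taken from the original analysis.

```latex
% X: n x p matrix of car variables; Y: n x q matrix of NS-SEC variables (columns centred).
% \Sigma_{XX}, \Sigma_{YY}: within-set covariance matrices;
% \Sigma_{XY} = \Sigma_{YX}^{\top}: cross-covariance matrix.
\[
  \rho_1 \;=\; \max_{a \in \mathbb{R}^{p},\; b \in \mathbb{R}^{q}}
  \frac{a^{\top} \Sigma_{XY}\, b}
       {\sqrt{a^{\top} \Sigma_{XX}\, a}\;\sqrt{b^{\top} \Sigma_{YY}\, b}},
  \qquad U_1 = X a_1, \quad V_1 = Y b_1 .
\]
% Later pairs (U_k, V_k) maximise the same ratio subject to being
% uncorrelated with all earlier pairs. The squared canonical correlations
% \rho_k^2 are the \min(p, q) largest eigenvalues of
\[
  \Sigma_{XX}^{-1}\, \Sigma_{XY}\, \Sigma_{YY}^{-1}\, \Sigma_{YX},
  \qquad k = 1, \dots, \min(p, q).
\]
```

Under this formulation the number of variates equals min(p, q), the size of the smaller variable set, and the R² reported for a variate would presumably correspond to the squared canonical correlation of that pair.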
The following analysis was undertaken using the CCA package for R (González & Déjean, 2015). The car and NS-SEC variables were input as the X and Y parameters respectively. Table 10 displays the results of the CCA. The CCA demonstrates that there are several statistically significant multivariate relationships between car traits and socio-economics across England and Wales. The first canonical variate has R² = 0.752 and also represents 58.3% of variance in the analysis. The other variates are less explanatory, but they represent linear associations that the first function does not account for.
To interpret the relationships the canonical correlations represent, canonical loadings for the four most explanatory variates are presented in Table 11. Canonical loadings represent the correlations between the observed variables individually and their respective canonical variates. Canonical cross loadings refer to correlations between the variables and the variates from the opposite dataset.
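The two kinds of loading can be stated compactly as follows. The notation is introduced here for illustration only: x_j is one observed variable, U_k is the k-th variate built from that variable's own set, V_k is the paired variate from the opposite set, and ρ_k is their canonical correlation.

```latex
% x_j: the j-th observed variable of one set.
% U_k: k-th canonical variate of the same set; V_k: its pair from the other set.
\[
  \text{loading}_{jk} = \operatorname{corr}(x_j,\, U_k),
  \qquad
  \text{cross-loading}_{jk} = \operatorname{corr}(x_j,\, V_k)
  = \rho_k \,\operatorname{corr}(x_j,\, U_k).
\]
```

A cross loading is therefore the corresponding loading scaled by the canonical correlation, so it can never exceed the loading in magnitude.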
The canonical correlations reveal several interesting multivariate relationships between the variables from the two datasets. The first canonical variate is largely reflective of the major linear associations between socio-economics and the car model segments and age groups which could be implied from the above analysis. It is strongly positively associated with NS-SEC group 1, and the loadings from the car variables are, therefore, very similar to the correlation coefficients between these variables and the aforementioned NS-SEC group (as seen in Table 9). The remaining variates detect other patterns that are more subtle and not observable from bivariate analysis. The second variate reveals an association between small car model types, cars under 11 years old and socio-economic groups 1-3. This could be indicative of (often young) professional urbanites who favour newer, small hatchback cars more suitable to city streets. By plotting the loadings for the first two variates, it is possible to visualize the associations between the variables across the two primary dimensions of the CCA (Figure 5). As expected, the chart illustrates that there is a notable association between higher social status (NS-SEC groups 1 and 2) and sports, compact executive, executive and luxury car model types. Moreover, such variables are clearly disassociated from the lower socio-economic groups and the family car model types. Figure 5 also identifies the affiliation between NS-SEC group 4 and SUVs, presumably due to their shared rural ties.
As seen in Table 11, the third variate identifies an association between older vehicles, both executive car model types and NS-SEC group 8 (Never worked and long-term unemployed). As executive cars are typically associated with higher socio-economic groups, this finding suggests that cross-tabulating the car segmentation by the three age groups for each LSOA would expose more intricate patterns. Whilst looking at age and model separately has been very insightful, the value and quality of cars range considerably with age, as does the popularity of certain car model types (as demonstrated in Table 4). Furthermore, including additional variables on alternative geodemographic attributes (such as household composition and urban density) may also help one identify and understand further variances in car compositions across England and Wales.
Figure 5. Distribution of variables across the first two canonical variates.

CONCLUSIONS
There is a distinctive geography of registered cars in England and Wales due to variances in the popularity and affordability of different vehicles. Through aggregating the DVLA car registration data to form a segmentation of 10 categories, it was observed that each car segment is distinctively clustered across the LSOAs. Likewise, there is a disassociation between the registered locations of cars of distinctively different segments. Generally, car segments spatially cluster within cities and form the basis of two main groups: family cars and prestige cars. The latter refers to the more expensive car segments that were found to be positively associated with more affluent neighbourhoods, largely due to the higher costs of purchasing these vehicles. Not all cars fit this pattern, however. City cars are typically the smallest vehicles and well within the cheapest range of vehicles sold in the country, yet many of them are very young and they are marketed at urbanites; consequently, these cars are typically clustered near city centres. In contrast, SUVs are more common in rural areas, but are also found in the suburbs. There is also an association between the ages of vehicles within neighbourhoods, presumably due to similarities in costs and quality of cars within age bands.
Much of the spatial variances in car consumption between neighbourhoods can be accounted for by the distribution of residential socio-economic characteristics. This is largely due to the presumed associated variances in affluence and cultural preferences. Although the association between social class and the types of cars driven is discussed widely in the popular media, the extent of this relationship has never been thoroughly researched using a complete dataset of registered vehicles. No prior research has comprehensively considered variances between the types of models of cars, their ages and socio-economic characteristics at a small-area level. This research has gone beyond considering car ownership as an indicator of socio-economic status and demonstrated the value in exploring the characteristics of vehicles instead.
The results of the canonical correlation analysis revealed several interesting relationships between the car and NS-SEC variables. The findings also suggest that there are many opportunities for further research into car registration data, including modelling car traits by wider geodemographic characteristics, as socio-economics are not the sole determinant of car model choice. In addition, cross-tabulating the car segments by age groups for each small area would reveal far more intricate trends in car ownership.
Overall, this research has confirmed that there is indeed a strong relationship between socio-economics and car characteristics at a small-area level. It has also demonstrated the potential value of harnessing administrative data to achieve a greater insight into local consumption practices. As a comprehensive dataset representing 100% of registered cars, which are distributed across 75% of households, annual releases of car registrations by segment could contribute to filling the data void which exists between census years in the UK.