Bivariate choropleth map documenting land cover intensity and population growth in Poland 2006–2018

ABSTRACT Bivariate choropleth mappings are proposed in this paper as an effective GIS-based approach that supports the monitoring of Sustainable Development Goals, listed in the Agenda 2030, namely build-up area expansion and population dynamics, the Tier II indicator for SDG 11 (11.3.1). This paper focuses on the application of a multidimensional approach to mapping land use efficiency in Poland between 2006 and 2018. The resulting data are assigned to counties and visualized as choropleth maps categorizing the counties based on the strength of land cover intensity and population growth. The proprietary class interdependence index (CII) based on the relationship between these two variables is used to evaluate class ranges of the bivariate choropleth map. The proposed cartographic presentation might be a useful and important tool in SDG monitoring as it provides information about the relation between population and land consumption rate, portraying the most correlated data in classes along the diagonal.


Introduction
The existing subject literature on cartography focuses mainly on univariate maps that portray only one attribute of the geographic information set (Brewer, 1994;Hai & Guo, 2009;Pokonieczny et al., 2020). Meanwhile, bivariate choropleth maps are a powerful way to convey information about related geographic phenomena. Their functional purpose is to reveal spatial relationships and patterns between two variables (Dunn, 1989;Tyner, 2009). Visualizing geographic relationship on a single map is more effective than using two side-by-side univariate maps and provides an insight into understanding the mapped phenomena (Carstensen, 1984). This is an important advantage, especially since the correct visual comparison of maps is a task that often exceeds the perceptual capabilities of the average user (Leonowicz, 2002a). However, a bivariate map is more visually complex than a univariate map, which makes it quite difficult for the user to process information. Additionally, designing bivariate choropleth maps successfully is challenging due to the density of added information (Leonowicz, 2006;Weiner & Francolini, 1980). A common method of creating bivariate choropleth maps is the superimposition of two choropleth maps (Dunn, 1989). The first maps developed with this method were published in the 1970s by the US Bureau of the Census, using techniques called overlay, crisscross, merge, or crossing to combine two choropleth maps along with their colour schemes. As a result, a bivariate map with a new set of colours was produced (Monmonier, 1989;Olson, 1981). However, such twodimensional maps produced ambiguous representations that conveyed information poorly, being too difficult to interpret. In 1980, Wainer and Francolini paid particular attention to the difficulties in interpreting their legends. Indeed, the legend of these choropleth maps contained as many as 16 (4 × 4) value classes, and the adoption of the colour system was controversial (Dunn, 1989;Fienberg, 1979). The interpretation of bivariate maps was more difficult than of univariate ones, but not because of the adopted method of cartographic presentation, but because incorrect graphic solutions were used. These doubts, therefore, led many researchers to undertake studies on the readability of multivariate cartograms (Leonowicz, 2002a). Cuff and Mattson (1982) and Trumbo (1981) analysed the nature of a bivariate colour map legend and compared it to the easily understandable scatterplot diagram. The next study focused on the effectiveness of the used colours schemes and proved that they could significantly improve the readability of maps (Eyton, 1984;MacEachren et al., 1999;Roth et al., 2010). According to Kraak and Ormeling (2011), colour is the main visual variable. The most standard way of colouring choropleth maps is the approach of Brewer et al. (2015), which includes two-dimensional colour map strategies. Leonowicz (2002b) claimed that there must be a relationship between the variables selected for analysis, about which the map reader will be informed. Since the essence of the method is the simultaneous presentation of two related geographical phenomena, their choice cannot be accidental. This relationship should be visible in the image of the spatial distribution of both variables. The first type of dependency that can be represented on a bivariate choropleth map is the coexistence of certain phenomena, and a special case here is a cause-and-effect relationship (Nusrat et al., 2018). As Leonowicz (2002b) points out, there is also a second type of dependency, which occurs when two independent variables describe one phenomenon that is presented on a bivariate choropleth map. However, when choosing variables, one should not be guided by the size of the correlation between the phenomena. Despite a lack of purely statistical dependence, their presentation on a multivariate cartogram may contain information quite relevant to the reader (Carstensen, 1984;Stevens, 2015). Thus, even when the overall correlation is low, there may be a strong relationship between phenomena in individual regions. Trumbo (1981) claimed that most correlated data are in classes along the diagonal, while non-correlated data are presented outside the diagonal.
What information the map provides depends on the method of presentation and the colour scales (Brewer et al., 2015;Calka & Cahan, 2016;Strode et al., 2020;Trumbo, 1981). Bivariate maps can make use of a variety of symbol strategies such as shaded proportional symbols, shaded cartograms, split symbols, shaded isolines, or star plots (Friendly, 2008;Kimball & Kostelnick, 2017). Colour combinations are the most popular method, which is also used in this article (Cybulski et al., 2020). Combining two sets of data requires the appropriate selection of classes. So far, literature has provided several methods for defining bivariate map class divisions, such as quantiles, equal intervals, standard deviation, maximum breaks, optimal classification, the Jenks natural breaks method, or regression analysis (Calka, 2018;Leonowicz, 2002b). These methods are based on classifying both datasets separately. Nevertheless, there is still a lack of analyses that would study the relationship between data and assess the selection of cartogram classes based on this relationship.
The essence of the bivariate cartogram method is to show the values of two geographical phenomena within the boundaries of the spatial division units on the map. Although they can represent any pairing of thematic variables, they are typically employed to examine the relationships between socioeconomic phenomena. With growing urbanization, the sustainability of cities has become increasingly important. Pursuing a universal approach, Agenda 2030 identifies European countries that need transformation towards sustainability (United Nations General Assembly, 2015). Although cities have been using indicators for a long time, it has only been in recent decades that attempts have been made to compile these indicators into sets that reflect many different aspects required to assess a city's sustainability (Melchiorri et al., 2019;Venables, 2017). One of the most important aspects of analysis is the expansion of built-up areas and population dynamics, i.e. land use efficiency, listed as a Tier II indicator for SDG 11 (11.3.1) (Anderson et al., 2017;Cai et al., 2020;Paganini & Petiteville, 2018). (2019) used area cartograms and tile maps to present SDG indicator data also visualized as world choropleth maps (Roser, 2018). The capacity to monitor the SDGs requires a wealth of open, reliable, and globally comparable data Nowak Da Costa et al., 2017;Walford & Kurek, 2016). When designing land use efficiency maps, one needs to focus on several factors, including the clarity of the information and spatial relationships as well as patterns between the two variables. Nevertheless, the cartographic presentation of data relationships is still not sophisticated enough.
The aim of the paper is to develop a choropleth map of land use efficiency in Poland for 2006-2018. A bivariate choropleth map was designed to present the relationship between changes in population number and built-up areas in a clear and legible way. In addition, a class interdependence index (CII) was developed to evaluate the class ranges that were used to create the bivariate choropleth. The developed map can help policymakers and planners ensure that cities remain economically productive and environmentally sustainable. This map can also be extremely useful in monitoring land resources in urban planning or disaster management.

Study area
The research covers Poland, a country located in Eastern Europe (Figure 1). It is the 10th most populated country in Europe and the 63rd in the world. Recently, the population number has slightly decreased to 38,268,000 at the end of 2020. It reached its maximum so far of 38,538,000 at the end of 2011. Since then, the number has been declining because of economic, demographic and cultural reasons. The Central Statistical Office (GUS) reports that this decline, in addition to the general trends observed before (e.g. a decrease in the number of births) was significantly influenced by an increase in the number of deaths, also caused by the COVID epidemic (GUS, 2020).
A significant decline in the population can be observed in the outlying areas of towns, municipalities, regions, and in the borderlands. According to the Central Statistical Office, between 2005 and 2019 the population growth was negative in 7 out of 16 provinces of Poland. In 2019, an increase in population was recorded in only four provinces. An analysis of the 2002-2018 statistics shows that there was population decline in the majority of outlying and peripheral counties of north-east, east, south-west, west, and central Poland. On the other hand, real population growth almost invariably occurs mainly in suburban areas of large cities.
The size of built-up areas is constantly on an upward trend. According to GUS data, over the last 10 years, the total number of flats and houses in Poland has increased from 13.3 million to 14.8 million, but this increase was very uneven. In some areas, there was a ten-year increase of more than 30-40%. The number of flats and houses grew the fastest in larger cities and their neighbouring counties. In many cities, the phenomenon of suburbanization can be observed, which means a decisive development of suburban regions (GUS, 2020).

Data
The research used data of the Coordination of Information on the Environment (CORINE) Land Cover, obtained from the Chief Inspectorate of Environmental Protection (https://clc.gios.gov.pl). Land cover is defined as the physical state of the land surface, and, as noted by Fisher et al. (2005), is determined by direct observation. CORINE Land Cover data are available for most countries in Europe for 1990Europe for , 2000Europe for , 2006Europe for , 2012Europe for , and 2018. Land cover data are widely used both at the European Union level, in its environmental, agricultural, and regional development policies, as well as at the level of individual countries and even regions. The CORINE Land Cover is the only source of geospatial data on the state of land cover and its changes that took place in the European territory between 1990 and 2012 (Bielecka & Jenerowicz, 2019). The methodology for land cover inventory and updating was based on visual interpretation of satellite images. These   Figure 2(a)) and 2018 (Figure 2(b)) were used in the analyses. The population number data related to counties were obtained from the Central Statistical Office.

Methods
The methodology for SDG 11.3.1 is established and referenced in the SDG indicators Metadata Repository managed by UNDESA (https://unstats.un.org/sdgs/ metadata). The indicator is classified as a Tier II indicator, meaning that SDG 11.3.1 is conceptually clear with a methodology for its monitoring. However, data for calculating this indicator are not regularly produced or available, presenting challenges for monitoring despite the established methodology. To estimate land use efficiency (LUE), first, it is necessary to calculate the land consumption rate (LCR) and the population growth rate (PGR), according to Equations (1) and (2). To perform the experiments, the established methodology was applied using the CORINE Land Cover as input data. LUE monitors the ratio of land consumption rate to population growth rate. It is used to quantify the sustainable land at the time of urban growth, both in demographic and economic terms.
where Pop ttotal population in the initial year for a spatial unit; Pop t+ntotal population in the final year for a spatial unit; tthe number of years between the two measurement periods.
where Urb ttotal area extent of the built-up area for a spatial unit for the initial year using CORINE Land Cover data [km 2 ]; Urb t+ntotal area extent of the built-up area for a spatial unit for the final year using CORINE Land Cover data [km 2 ]; tthe number of years between the two measurement periods.
To develop a land use efficiency bivariate choropleth map a diagonal model was used. The model provided information on the relationship between builtup area changes and population number changes. A bivariate map with 9 classes (3 × 3) was proposed, as suggested by Brewer (1994), because a smaller number of classes is more suitable. Using a sequential colour scheme at a 45-degree angle, this model shows the progression of correlated data (Slocum et al., 2005). To present the differences between the areas on either side of the main diagonal, different colour schemes are used on the opposite sides of the diagonal (Trumbo, 1981). The proposed model attempts to show data correlation along the diagonal using a grey colour scheme, while non-correlated data are shown in complementary colours outside the diagonal. This diagonal model provides information on land use efficiency, i.e. the relationship between the rate of land consumption and the population growth rate. Different shades of blue represent different population growth rates (e.g. bigger changes in populations between 2006 and 2018 are marked darker than others) and different shades of red represent different land consumption rates. The use of shades from light gray to dark gray along the diagonal in the proposed colour schemes allows for a clear presentation of classes related to the sustainable development of the population and the growth of built-up areas. The darker the blue colour, the more uneven increase in the population growth rate is in relation to the growth of built-up areas. The darker the red colour, the greater the growth of built-up areas is in relation to an increase in population. These are areas diverging from sustainable development and requiring improved planning.
Establishing the class ranges is an important step in the development of bivariate choropleth maps. For better visualization of areas with negative ratios of land consumption rate to population growth rate, a range of values of up to one was adopted for both variables to show areas for which there was a decrease in population and built-up areas between 2006 and 2018. The remaining class ranges were selected on the basis of the value of the developed class interdependence index (CII) calculated iteratively using the Python script. The values of the class ranges with the maximum value of the index was adopted for the development of bivariate choropleth maps. It reflects the situation when the data on diagonals are the most correlated, and the data in other classes have the lowest correlation coefficient. This index is based on Pearson's correlation coefficient (C nn ) calculated for the data in each class and can be applied to an array of any size n × n (Figure 3).
The class interdependence index was computed according to the Equation (3): where C 11 -Pearson correlation coefficient value for class with PGR<1 and LCR<1; C ii -Pearson correlation coefficient value for other classes on diagonals; C ij -Pearson correlation value for other classes outside the diagonals; ndimensional matrix; irow number in correlation matrix; jcolumn number in correlation matrix.
The class interdependence index assumes values from <0,n>. A higher index value indicates a greater correlation of the PGR and LCR values in diagonal and a smaller value in other classes. The developed method is universal, and the index developed here can be used for arrays with any number of rows and columns.

Results
Three thematically related maps for the 2006-2018 period are presented in this paper. The main map depicts land use efficiency across Polish counties. This map is meant to illustrate the relationship between the rate of land consumption (LCR) and the population growth rate (PGR). It provides a generalpurpose and easy-to-understand overview of the changes that took place in Poland between 2006 and 2018. The first class, for the values of LCR<1 and PGR<1, contains only 7 objects. These are counties with a similar decrease in the number of people and in built-up areas over time. Counties marked in blue, indicating population growth, make up only 7% and are mainly concentrated near large cities. Classes marked with shades of grey apply to 106 counties, accounting for almost 27%, where an increase in the number of people and buildings is fairly similar. Counties marked with shades of red with an increase in the number of buildings and a decrease in the number of people constitute a majority of 55%.
The first class of the choropleth map (PGR<1 LCR<1) was determined arbitrarily. The highest CII index of 2.33 was obtained for the cartogram with the class values of PGR<1.05 and LCR<1.15. For the sake of comparison, the index was also calculated using standard classification methods. For the Quantile method, the CII index was 1.21, while for the Natural Breaks Method it was 0.41. The univariate maps located below the bivariate map present LCR and PGR information using the same colour schemes and class ranges.

Conclusions
The use of a bivariate choropleth map makes it possible to show the relationship between the rate of population change and land consumption rate in Poland in 2006-2018. The map makes it possible to display these two variables simultaneously with the use of a single colour scheme. Additionally, the resulting two-dimensional map allows the users to assess the land use efficiency in Poland and could be useful in Sustainable Development Goals monitoring. Due to the forecasts that predict that the population of inhabitants of urban areas will grow to 5 billion by 2030, current trends focus on implementing effective town planning and urban area management practices that will help us overcome the challenges resulting from urbanization. The developed map allows for an easy identification of areas characterized by an even increase in the population and built-up areas, but it also enables the user to  identify problem areas that require improvements in spatial planning quickly and easily. These are mainly areas that are subject to rapid urbanization processes, where the number of inhabitants increases dramatically, but also those with strongly expanding built-up areas yet a relatively low increase in population.
However, this method is not without limitations. Firstly, the method presented here requires the availability of reliable data on the population and size of built-up areas. An important aspect is that these data must refer to the area units for which a bivariate choropleth map will be developed. An appropriate data preparation process should, therefore, be carried out. Additionally, for the map to be legible, the number of classes should be limited and an appropriate colour scheme should be selected. The selection of class ranges is an essential issue in developing a bivariate colour scheme. The CII index proposed in this article makes it possible to develop the class ranges of bivariate choropleth maps in an innovative way, considering both variables at the same time. The inclusion of the index in the process of developing bivariate choropleth maps can speed it up and improve it, and at the same time showing the relationship of the presented variables in a more effective way.
To develop a map the functional aspects should be considered. The resulting maps can be interpreted to provide information about land use efficiency. The application of bivariate choropleth maps for the evaluation of land use may support decision-making for promoting the sustainability of land management. Further research will focus on conducting an experiment with the participation of map users that will then be analysed in order to compare univariate and bivariate choropleth maps in terms of the clarity of information in the monitoring of Sustainable Development Goals.

Software
Correlation analysis was calculated using the STATIS-TICA 13.1. and Python 3.9.2. The map was generated in ArcMap 10.6.1 for Desktop by ESRI. Final map sheet design was performed using CorelDRAW 2018 and Adobe Photoshop CS6.

Disclosure statement
No potential conflict of interest was reported by the author(s).