A standardized dataset of built-up areas of China’s cities with populations over 300,000 for the period 1990–2015

ABSTRACT China’s urbanization has attracted considerable attention due to its unprecedented pace and intensity in terms of land, population, and economic impact. However, due to the lack of consistent and harmonized data, little is known about the patterns and dynamics of the interaction between these different aspects over the past few decades. Along with the implementation of the 2030 Agenda for Sustainable Development, a standardized dataset for assessing the sustainability of urbanization in China is needed. In this paper, we used remote sensing data from multiple sources (time-series of Landsat and Sentinel images) to map the impervious surface area (ISA) at five-year intervals from 1990 to 2015 and then converted the results into a standardized dataset of the built-up area for 433 Chinese cities with 300,000 inhabitants or more. This dataset was produced following the well-established rules adopted by the United Nations (UN). Validation of the ISA maps in urban areas based on the visual interpretation of Google Earth images showed that the average overall accuracy (OA), producer’s accuracy (PA) and user’s accuracy (UA) were 91.24%, 92.58% and 89.65%, respectively. Comparisons with other existing urban built-up area datasets derived from the National Bureau of Statistics of China, the World Bank and UN-Habitat indicated that our dataset, namely the standardized urban built-up area dataset for China (SUBAD–China), provides an improved description of the spatiotemporal characteristics of the urbanization process and is especially applicable to a combined analysis of the spatial and socio-economic domains in urban areas. Potential applications of this dataset include combining the spatial expansion and demographic information provided by the UN to calculate sustainable development indicators such as SDG 11.3.1. The dataset could also be used in other multidimensional syntheses related to the study of urbanization in China.
The published dataset is available at http://www.doi.org/10.11922/sciencedb.j00076.00004.


Introduction
Urbanization is a complex socio-economic process that transforms the built environment and shifts the spatial distribution of a population from rural to urban areas. A major consequence is an increase in the land area and population of urban settlements, as well as in the proportion of urban residents compared to rural dwellers (Chen, Zhang, Wu, & Chen, 2010). According to the latest statistics from the United Nations (UN) Department of Economic and Social Affairs (DESA), in 2018, 55% of the world's population (4.2 billion people) lived in urban settlements, compared with 3.4 billion in rural areas. Towards the end of the Agenda for Sustainable Development in 2030, the share of the world's population living in urban areas is expected to reach 60% (UN DESA, 2018). Globally, Africa and Asia are urbanizing faster than other regions of the world. The urbanization level in China first exceeded 60% in 2019, suggesting that the process has now entered its middle and later stages. Excessive urbanization can lead to various urban problems, including air pollution, water shortages, traffic congestion and a worsening security situation, problems which definitely hinder the realization of urban sustainable development goals (UN, 2015).
Detailed contemporary datasets that accurately describe urban settlements can support the measurement of the impacts of the expansion of urban areas, the monitoring of environmental changes and the planning of interventions. However, varying standards currently apply to datasets of urban areas (Zhou & Shi, 1995). The basic terminology used for urban planning standards in China defines urban built-up areas as connected areas that have been constructed within urban administrative regions and that have municipal and public facilities (Ministry of Housing and Urban-Rural Development of the People's Republic of China, 1999). The United States Census Bureau considers population density as the main indicator for defining an urban area (Xu & Hua, 2005), while planners in Britain use satellite-derived geospatial data and the density of real estate to delineate urban agglomeration areas (Li, Sun, et al., 2012). At present, many cities at all levels in China adopt their own methods to identify the boundaries of urban built-up areas because a unified standard has so far been lacking (Li, Sun, et al., 2012).
Before the advent of freely accessible remotely sensed data, some research exploring the urbanization process in China relied on census data (Chen, Lu, & Zhang, 2009; Li, Chen, & Xu, 2009; Tan & Lv, 2003). However, the diverse methodologies used for measurements, and in particular inconsistencies in definitions, distorted any comprehensive comparison of the spatial and socio-economic aspects of urbanization in most cases (Seto, Fragkias, Güneralp, & Reilly, 2011). For example, there are relatively few research results on SDG 11.3.1, i.e. the ratio of the land consumption rate to the population growth rate (LCRPGR), due to the lack of harmonized ISA and population distribution data in urban areas (UN, 2015). The main reason for this is that calculating SDG 11.3.1 depends on the production of urban built-up area data and the spatial decomposition of the corresponding demographic data (Buettner, 2015); however, the concept of built-up area lacks a well-recognized definition, and the spatial disaggregation of socio-economic data still involves many uncertainties and has limited overall accuracy (Gallego, Batista, Rocha, & Mubareka, 2011). At present, there are not many international open data sources that can achieve a harmonized association and coupling between built-up area and population distribution in urban areas. Typically, the Global Human Settlement Layer (GHSL) product is intensively used in the calculation of this indicator at various scales (Nicolau, David, Caetano, & Pereira, 2019; Schiavina et al., 2019).
Notwithstanding the huge challenges that stand in the way of translating Earth observation data into sustainable development indicators, multisource satellite data will continue to play an important role in the service of the 2030 Agenda for Sustainable Development (Anderson, Ryan, Sonntag, Kavvada, & Friedl, 2017;Andries et al., 2019). Furthermore, the emerging Big Earth Data technologies make it possible to achieve a scientific synthesis by combining Earth observation and census data (Guo, 2017;Guo et al., 2020). In this study, we used the UN World Urbanization Prospects (WUP) database that provides population counts for cities with 300,000 inhabitants or more as a reference (UN DESA, 2018). After generating large-area ISA maps of China's major urban agglomerations, we converted the ISA results to urban built-up area data for 433 cities at five-year intervals for the period 1990 to 2015. This was done in order to minimize the discrepancy between the expansion of urban areas and population growth in these areas (Güneralp, Reba, Hales, Wentz, & Seto, 2020). In this paper, the data sources and methodology used to create large-area ISA maps and to generate built-up area products are described. We also describe how we compared our results with existing built-up area datasets and produced aggregated estimates and an analysis of trends from the perspective of urban agglomeration. Finally, we discuss the implications of these estimates and trends that we identified in terms of the link between land and population urbanization in the context of the assessment of urban sustainability in China.

Data sources
According to the WUP database (https://population.un.org/wup/), there were 433 county-level or higher cities with 300,000 inhabitants or more in China in 2018 (UN DESA, 2018), most of which fell within urban agglomerations. All of the cities identified as lying within these different urban agglomerations are shown in Figure 1 along with their populations. In order to map the ISA distribution for individual urban areas, we used Landsat 5/7 TM/ETM+ surface reflectance data, Sentinel-1 SAR backscatter data (2015) and Sentinel-2 surface reflectance (2015) time-series, as well as auxiliary data including Shuttle Radar Topography Mission (SRTM) digital elevation data, OpenStreetMap crowdsourced geospatial data, high-resolution Google Earth data, urban extent polygons and administrative areas.
The remote sensing data included Landsat data (Landsat 5/7 TM/ETM+ surface reflectance images) with a spatial resolution of 30 × 30 meters covering the period 1 January 1990 to 31 December 2015 (https://earthexplorer.usgs.gov/) and data from the Sentinel series (Sentinel-1 SAR Ground Range Detected (GRD) scenes and Sentinel-2 surface reflectance images) with a spatial resolution of 10 × 10 meters covering the period 1 January 2015 to 30 June 2016 (https://scihub.copernicus.eu/). The SRTM digital elevation data provided by the National Aeronautics and Space Administration (NASA) were used to calculate slope images and generate mountain masks (https://www2.jpl.nasa.gov/srtm/). The OpenStreetMap crowdsourced geospatial data (https://www.openstreetmap.org/relation/270056) and high-resolution Google Earth images (https://earth.google.com/web/) were used to facilitate the segmentation of peri-urban and rural areas. The urban extent polygons and administrative areas were derived from the Global Rural-Urban Mapping Project, Version 1 (GRUMPv1) (http://sedac.ciesin.columbia.edu/) and the Database of Global Administrative Areas, version 3.6 (GADMv3.6) (http://www.gadm.org/), respectively, both of which were used to determine the boundaries of urban areas for the 433 Chinese cities. All of the data except for those obtained from OpenStreetMap and Google Earth were available and processed on the Google Earth Engine (GEE) platform. Table 1 shows the different sources of remote sensing imagery, auxiliary data and demographic information used in this study.
In this study, we aimed to create a standardized urban built-up area dataset using the same definition of urban agglomeration area (UAA) adopted by the WUP database for cities in China (UN DESA, 2018). To this end, we produced large-area ISA maps in urban areas and then converted these into standardized built-up area products for these 433 cities. We employed a range of spectral indices to generate the 1990-2015 ISA maps based on remotely sensed data acquired from multiple sources. Subsequently, various types of auxiliary data were used to create the desired products for urban areas through manual segmentation of peri-urban and rural areas, with reference to several freely available urban extent products derived from ISA data using automated urban-rural segmentation methods. Following the well-established rules adopted by the UN, we converted the 1990-2015 ISA maps in urban areas into standardized built-up area products that conform to the definition of UAAs. Finally, we post-processed the data to ensure the spatial accuracy and temporal consistency of the final product.

Extraction of large-area ISA maps of urban areas
We used the same approach as in our previous studies to create large-area ISA maps based on multisource remotely sensed data that included Landsat and Sentinel time-series, and developed an automatic mapping procedure on the GEE platform to support SDG 11.3.1 monitoring and assessment (Jiang et al., 2021; Sun et al., 2019). Because the available types of remote sensing imagery differ over time, we used different data sources and extraction methods to produce ISA maps for the two periods of 1990-2010 and 2015. To produce the 1990-2010 ISA maps, we generated time-series of the Normalized Difference Building Index (NDBI) (Zhang, Odeh, & Han, 2009), Perpendicular Impervious Index (PII) (Tian, Xu, & Yang, 2017), Modified Normalized Difference Water Index (MNDWI) (Xu & Hua, 2005), Normalized Difference Vegetation Index (NDVI) (Chen et al., 2004) and slope images from the Landsat 5/7 and SRTM DEM data. Based on these intermediate products, we adopted a range of thresholding strategies to derive the potential impervious surface area (PISA) and the PISA in arid and semi-arid regions (PISA_bare) as well as a water mask, vegetation mask and mountain mask. After masking the water, vegetation and mountain regions, the PISA and PISA_bare information were then utilized to produce a target impervious surface area (TISA) map. For the 2015 ISA map, we also applied customized thresholds to generate water, vegetation and mountain masks based on the MNDWI and NDVI time-series derived from the Sentinel-2 and SRTM DEM data. At the same time, we employed a stepwise logical operator to create high-brightness impervious surface area (HISA) and PISA maps using the Sentinel-1 data. In a similar way as for the 1990-2010 ISA maps, the HISA data were then utilized to produce the TISA map after masking the water, vegetation and mountain regions.
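The index-and-mask logic described above can be illustrated per pixel with a minimal sketch. It uses the standard normalized-difference forms of NDVI, MNDWI and NDBI; the threshold values, the function names and the final NDBI test are illustrative assumptions and are not the thresholds or the decision rules used to build the ISA maps. PII is omitted because its coefficients are not given here.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def spectral_indices(green, red, nir, swir1):
    """Compute NDVI, MNDWI and NDBI from surface reflectance values (0-1)."""
    return {
        "ndvi": normalized_difference(nir, red),      # vegetation
        "mndwi": normalized_difference(green, swir1),  # open water
        "ndbi": normalized_difference(swir1, nir),     # built-up surfaces
    }


# Hypothetical thresholds, chosen only for illustration.
WATER_MNDWI_MIN = 0.0
VEGETATION_NDVI_MIN = 0.4
MOUNTAIN_SLOPE_MIN_DEG = 15.0


def is_candidate_impervious(green, red, nir, swir1, slope_deg):
    """Mask water, vegetation and steep terrain, then test NDBI."""
    idx = spectral_indices(green, red, nir, swir1)
    if idx["mndwi"] > WATER_MNDWI_MIN:
        return False  # water mask
    if idx["ndvi"] > VEGETATION_NDVI_MIN:
        return False  # vegetation mask
    if slope_deg > MOUNTAIN_SLOPE_MIN_DEG:
        return False  # mountain mask
    return idx["ndbi"] > 0.0


built = is_candidate_impervious(0.12, 0.14, 0.20, 0.26, slope_deg=2.0)
forest = is_candidate_impervious(0.05, 0.04, 0.40, 0.20, slope_deg=2.0)
```

In practice the same tests would be applied to image time-series on GEE rather than to single reflectance tuples.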
For the urban areas, we used OpenStreetMap crowdsourced geospatial data and Google Earth high-resolution images in combination with GRUMP urban extent polygons and GADM administrative areas to determine the urban impervious surface area (UISA) based on the TISA data through manual segmentation of the peri-urban and rural areas. We also referred to several freely available urban extent products derived from the ISA using automated urban-rural segmentation methods (Kuang et al., 2021; Zhou et al., 2018). It should be noted that we slightly extended the extent of the mapping to ensure full coverage of urban areas; some rural settlements that were incorrectly included as a result of this were excluded during the conversion from the ISA to built-up area maps. The final UISA maps were obtained by applying a 3 × 3 majority filter to the UISA time-series.
[Figure caption fragment: The proposed methodological framework for ISA mapping; ISA mapping (2015).]
Significantly, bare land belongs to neither the UISA nor the built-up area, whereas areas of open public space are considered not to belong to the UISA but are usually considered to be part of the built-up area. In addition, it is quite difficult to judge whether a peri-urban settlement should be aggregated into the built-up area of a multi-city agglomeration. Hence, attention should be paid to the conversion from UISA to built-up area rather than simply generating the former as a surrogate of the latter. Further investigation of the conversion from UISA to built-up area is clearly warranted, especially when conducting a combined analysis for use in urbanization studies.

Conversion from UISA to built-up area
According to the 2018 revision of the WUP, the term "urban agglomeration" is defined by "the de facto population contained within the contours of a contiguous territory inhabited at urban density levels without regard to administrative boundaries" and it usually incorporates "the population in a city or town plus that in the suburban areas lying outside of, but being adjacent to, the city boundaries" (UN DESA, 2018). Although the concept of "urban agglomeration" might seem a little abstract, the UN has provided several practical criteria for carrying out spatial data processing to ensure that any product of built-up area using other concepts can be adjusted to conform to the definition of UAA (UN-Habitat, 2019).
Following the rules recommended by the UN, we conducted a conversion from UISA to built-up area for the 433 cities in our study so as to produce a combined analysis based on Earth observation and census information and avoiding the spatial disaggregation of population data. First, we linked areas of urban patches that had an area of 200,000 square meters or more and that were less than 200 meters apart to form a continuous main urban area. Any polygons with an area of less than 57,600 square meters were eliminated from the vector products derived from the UISA map; "holes" with areas of less than 200,000 square meters lying inside the main urban area were filled to preserve the continuous distribution pattern. We also preserved the peri-urban patches that had close functional relations with the main urban area to improve the accuracy of the extent of built-up area. Then, aided by the high-resolution Google Earth images, we chose to locate manually the park and greenbelt patches that contained visible artificial structures and merged these newly added polygons into the main urban area to generate a built-up area that satisfied certain criteria. As the ISA product contained no pervious surface areas such as public open spaces (e.g. park and greenbelt patches), this last processing step was essential for the generation of built-up area from the ISA map. According to the minimum mapping unit for urban land adopted by the National Geomatics Center of China (Chen et al., 2015), artificial surfaces should occupy an area of at least 8 × 8 pixels in satellite images with a spatial resolution of 30 × 30 meters. This is why we eliminated polygons with an area of less than 57,600 square meters before the preservation and supplementation steps. A flowchart describing the conversion of the UISA map to the built-up area product for cities in China is shown in Figure 3.
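The area and distance rules in this conversion can be sketched as follows. This is a simplified illustration only: patches are represented as axis-aligned rectangles in metres (a stand-in for real polygons), small patches are dropped before linking (an assumed ordering), the largest linked cluster is taken as the main urban area (an assumption), and hole filling and the manual park and greenbelt step are omitted. All function names are hypothetical.

```python
PIXEL_SIZE_M = 30      # Landsat pixel size
MMU_PIXELS = 8         # minimum mapping unit: 8 x 8 pixels
MIN_PATCH_AREA_M2 = (MMU_PIXELS * PIXEL_SIZE_M) ** 2  # 240 m x 240 m = 57,600 m2
LINK_AREA_M2 = 200_000  # patches at least this large may be linked
LINK_GAP_M = 200        # ...when they are less than this far apart


def area(rect):
    """Area of a rectangle given as (xmin, ymin, xmax, ymax) in metres."""
    xmin, ymin, xmax, ymax = rect
    return (xmax - xmin) * (ymax - ymin)


def gap(a, b):
    """Shortest distance in metres between two axis-aligned rectangles."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return (dx ** 2 + dy ** 2) ** 0.5


def main_urban_area(rects):
    """Drop sub-MMU patches, link large nearby patches, return the largest cluster."""
    kept = [r for r in rects if area(r) >= MIN_PATCH_AREA_M2]
    large = [r for r in kept if area(r) >= LINK_AREA_M2]
    parent = list(range(len(large)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(large)):
        for j in range(i + 1, len(large)):
            if gap(large[i], large[j]) < LINK_GAP_M:
                parent[find(i)] = find(j)

    clusters = {}
    for i, rect in enumerate(large):
        clusters.setdefault(find(i), []).append(rect)
    if not clusters:
        return []
    return max(clusters.values(), key=lambda c: sum(area(r) for r in c))


patches = [
    (0, 0, 1000, 1000),     # 1,000,000 m2 core
    (1100, 0, 1700, 600),   # 360,000 m2, 100 m from the core: linked
    (3000, 0, 3500, 500),   # 250,000 m2 but 1,300 m away: not linked
    (0, 1050, 100, 1150),   # 10,000 m2: below the minimum mapping unit
]
core = main_urban_area(patches)
```

With real data the same rules would be applied to vector polygons, for example with a GIS library, rather than to rectangles.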
Furthermore, we post-processed the layers by overlaying them to ensure temporal consistency: any urban land identified as built-up area at a particular time remained built-up at all later times, and any non-urban land excluded from the built-up area at a particular time was also excluded at all earlier times. In this paper, we gave the final product the name "standardized urban built-up area dataset for China" (SUBAD-China). The extraction results included a high-resolution boundary of the UISA, and SUBAD-China retained the continuous patterns of urban built-up area that were consistent with the visual interpretation based on Google Earth images. As an example, the spatial extents of the UISA and built-up area in 2015, compared with Google Earth images, for Shenyang City (Liaoning Province), Chengdu City (Sichuan Province) and Changji City (Xinjiang Uygur Autonomous Region) are shown in Figure 4.
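One possible reading of this layer-overlap rule is a cumulative union over time, sketched below. The representation of each layer as a set of cell identifiers and the function name are assumptions made for illustration; the actual post-processing may resolve conflicting years differently (for example, by removing isolated early detections rather than carrying them forward).

```python
def enforce_monotonic_growth(masks_by_year):
    """Make built-up area non-decreasing over time.

    masks_by_year maps a year to the set of cell ids flagged as built-up.
    Each output year contains every cell flagged in that year or earlier.
    """
    result = {}
    accumulated = set()
    for year in sorted(masks_by_year):
        accumulated = accumulated | masks_by_year[year]
        result[year] = set(accumulated)
    return result


raw = {1990: {1, 2}, 1995: {1, 3}, 2000: {1, 2, 3, 4}}
consistent = enforce_monotonic_growth(raw)
# Cell 2 was missed in 1995 in the raw layers and is restored by the overlap.
```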

Validation of UISA products
As far as we know, a large majority of the validation efforts made at different scales are based on reference point datasets collected through visual interpretation of high- and very high-resolution satellite imagery (Sabo et al., 2018). Since we used different remote sensing datasets and processing steps to create the UISA maps for different time periods, the validation processes differed slightly, although randomly sampled points were used for the validation of both the 1990-2010 and 2015 UISA products. In the case of the 1990-2010 UISA products, 4000 validation points, each with a size of 30 m by 30 m, were randomly generated for each year to achieve a dense and even distribution. It should be mentioned that we generated the random validation points based on the areal extent of each urban agglomeration, which means that a larger city has proportionally more validation points than a smaller one, with the total number being constrained. The distribution of validation points for the 1990-2010 UISA product is shown in Figure 5 and, as examples, we also show the validation points for Chengdu City (Sichuan Province) and Shenyang City (Liaoning Province) for five-year intervals during the period 1990-2010 within the corresponding urban extent polygons. For the 2015 UISA product, 224,000 validation points, each with a size of 10 m by 10 m, were generated to ensure full coverage of China's major socioeconomic zones; details of the selection and labeling of these validation points can be found in Sun et al. (2019).
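Allocating a fixed total of validation points in proportion to areal extent can be sketched as below. The largest-remainder rounding, the function name and the example areas are assumptions for illustration; the source only states that larger agglomerations received proportionally more points under a constrained total.

```python
def allocate_points(areas_km2, total_points):
    """Split total_points across regions in proportion to their area.

    Uses largest-remainder rounding so the counts sum exactly to total_points.
    """
    total_area = sum(areas_km2.values())
    quotas = {name: total_points * a / total_area for name, a in areas_km2.items()}
    counts = {name: int(q) for name, q in quotas.items()}
    leftover = total_points - sum(counts.values())
    # Give the remaining points to the regions with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda n: quotas[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical areas, not values from the dataset.
example = allocate_points({"A": 600.0, "B": 300.0, "C": 100.0}, 4000)
```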
In the next step, we formulated a unified interpretation standard with reference to the visual interpretation of Google Earth images to determine how to judge whether a certain surface in an urban area was impervious or pervious (Sun et al., 2019). Each random point, whether it was a 30 m by 30 m point in the 1990-2010 UISA product or a 10 m by 10 m point in the 2015 UISA product, was coarser than a single pixel in the high-resolution Google Earth images (which have a size of 0.6 m by 0.6 m). Each validation pixel in the UISA maps belonged to only one type (either urban or non-urban); however, when we superimposed one of these pixels on the Google Earth imagery, it was possible that there was some overlap with other land-cover types. If the amount of urban land in the area of overlap occupied more than a predefined threshold of 50%, we considered this area to be urban; if it did not, it was considered to be non-urban. During the validation process, image interpreters recorded a value of '1' when the visual interpretation results were consistent with the extraction results and a value of '0' when they were inconsistent. Using this as a basis, we generated confusion matrices between the classified data and reference data and then calculated the OA, PA and UA to validate the UISA maps for different time periods.
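The 50% labeling rule and the three accuracy measures can be written out as a short sketch. The function names and the sample counts are illustrative assumptions; PA and UA are computed here for the urban class using their standard confusion-matrix definitions.

```python
def label_reference(urban_fraction, threshold=0.5):
    """Label a validation point urban when urban land covers more than 50% of it."""
    return urban_fraction > threshold


def ratio(numerator, denominator):
    """Safe division that returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def accuracy_metrics(pairs):
    """Compute OA, PA and UA from (mapped_urban, reference_urban) booleans."""
    tp = fp = fn = tn = 0
    for mapped, reference in pairs:
        if mapped and reference:
            tp += 1   # mapped urban, reference urban
        elif mapped and not reference:
            fp += 1   # commission error
        elif not mapped and reference:
            fn += 1   # omission error
        else:
            tn += 1
    return {
        "OA": ratio(tp + tn, tp + fp + fn + tn),  # share of all points that agree
        "PA": ratio(tp, tp + fn),                 # share of reference urban that was mapped
        "UA": ratio(tp, tp + fp),                 # share of mapped urban that is correct
    }


# Hypothetical sample: 45 true positives, 5 omissions, 3 commissions, 47 true negatives.
sample = ([(True, True)] * 45 + [(False, True)] * 5
          + [(True, False)] * 3 + [(False, False)] * 47)
metrics = accuracy_metrics(sample)
```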
The OA, PA and UA results for the 1990-2015 UISA products for every five years during this period are shown in Figure 6. The overall averages for the OA, PA and UA are 91.24%, 92.58% and 89.65%, respectively. In the case of the 1990-2010 UISA product, the average OA was 89.65% with a peak of 92.88% in 2000 and a lowest value of 89.25% in 1995. For the 2015 UISA product, the average OA was 88.03% with the lowest accuracies occurring in semi-arid and arid regions (Sun et al., 2019). It can be seen that the accuracy indicators were slightly lower for the 2015 UISA product than for the 1990-2010 UISA maps as a whole. This difference probably occurred because the two products were validated with sample units of different sizes and with different numbers of validation points: the 1990-2010 UISA product used larger sample units (30 m by 30 m) and fewer validation points than the 2015 UISA product (10 m by 10 m). Typically, a random point in the 1990-2010 UISA product included more land-cover types than a point in the 2015 UISA product when it was superimposed on the high-resolution Google Earth imagery. In addition, the accuracy indicators for the 2000, 2005 and 2010 UISA products were higher than for the 1990 and 1995 UISA products because high-resolution Google Earth imagery is not available for the period before 2000; for this period, visual interpretation had to be carried out using the Landsat imagery itself as a reference. The validation results for different periods therefore differed considerably because of the different reference data, sample unit sizes and numbers of validation points used during the validation process. Notwithstanding all of this, in most cases the UISA products were validated as having an accuracy of over 85% or close to 90%, which meets the accuracy requirements of the conversion from UISA maps to built-up area products.

Comparison with other built-up area datasets
The existing built-up area datasets come from two main sources: the China Urban and Rural Construction Statistical Yearbook published by the National Bureau of Statistics of China since the 1980s and the online geospatial software tool "Platform for Urban Management and Analysis (PUMA)" sponsored by the World Bank (Dong, Wang, & Zhao, 2017). PUMA produces built-up area datasets for China as well as for other Asian countries (https://puma.worldbank.org/). However, the online tool currently only allows downloads of built-up area datasets for 2000 and 2010, and these datasets do not cover all cities in China. To make it easy to compare the different built-up area datasets, we chose the 433 Chinese cities with 300,000 or more inhabitants listed in the WUP database as a reference and carried out a regression between SUBAD-China and the datasets from the National Bureau of Statistics and the World Bank. Because of mismatches in the lists of urban areas and temporal inconsistencies, there were 413 matching cities between our dataset and that from the National Bureau of Statistics and 254 matching cities between our dataset and that from the World Bank. Scatter plots for these matching cities showing the built-up area estimates derived from the different datasets plotted on the x- and y-axes are shown in Figure 7. For both 2000 and 2010, there is a good fit between the built-up area estimates derived from our dataset and those derived from the national statistics; however, overall, SUBAD-China gives lower estimates than the national statistics, particularly for 2000. We also compared the estimates using our dataset and that from the World Bank. It should be mentioned that using the World Bank dataset as a substitute for the national statistics clearly produces an overestimate.
Despite the temporal inconsistency caused by adjustments to administrative divisions and changes in the statistical methods used, the built-up area dataset produced by the National Bureau of Statistics of China can be considered as a reliable reference for certain time periods. Fortunately, such built-up area datasets derived from remote sensing data (e.g. SUBAD-China and the World Bank dataset) can be used to characterize the long-term trends based on time-series as long as the standardization and automation of the data processing can be guaranteed.
The comparison results between the estimated built-up area derived from our dataset and the World Bank for Lanzhou City (Gansu Province), Yuxi City (Yunnan Province) and Nantong City (Jiangsu Province) are shown in Figure 8 in a spatially explicit manner. It is clear that both the 2000 and 2010 results from the World Bank give a larger built-up area than SUBAD-China and that, for 2010, the boundary of the built-up area in our dataset nearly matches that in the referenced Google Earth images. These results indicate that, because of this overestimation problem, the built-up area dataset downloaded from PUMA is not suitable for quantifying the built-up area for cities in China. In the case of Lanzhou, PUMA classified many patches of bare land as built-up area, and in the case of Yuxi some free-standing rural settlements were identified as built-up area. In addition, a large water body was improperly included in the built-up area in Nantong. It could be concluded that SUBAD-China provides an alternative means of accurately delineating the spatial expansion of built-up areas by solving the problems encountered with the national statistics and the World Bank dataset that were described above. Combined with the WUP database, the built-up area dataset that was produced in this study can be used to explore China's urbanization process in terms of LCRPGR.
Figure 9 compares the built-up area estimates from SUBAD-China, the national statistics and the UN-Habitat dataset. It can easily be seen that the SUBAD-China estimates fit the national statistics well in 1990, 2000 and 2015 whereas the UN-Habitat dataset gives significant overestimates, especially for the megacities of Beijing, Shanghai and Guangzhou. In the absence of local knowledge, relying on geographical proximity and ISA density to delineate the urban extent would tend to produce overestimates of the built-up area, with a large proportion of rural settlements being incorrectly included. In conclusion, the above comparisons with existing built-up area datasets provided by the National Bureau of Statistics of China, the World Bank and UN-Habitat show that SUBAD-China can be used as a reliable data source to determine urban expansion in China associated with population growth.

Calculation of SDG 11.3.1 for 433 Chinese cities
As we stated earlier, SDG 11.3.1, LCRPGR, which denotes the ratio of the land consumption rate to the population growth rate, can be used to quantify the relation between expansion and population growth in urban areas. Here, we combined the built-up area product with the WUP database to calculate this indicator for 433 Chinese cities. To simplify the analysis, we clustered the 433 sample cities into four classes based on the LCRPGR distribution to quantitatively identify the relation between land consumption and population growth, as shown in Table 2.
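The indicator can be computed for a single city and interval as in the sketch below. It assumes the logarithmic rate definitions commonly given in the UN metadata for SDG 11.3.1 (the formulas are not spelled out in this paper), and the function names and input values are hypothetical. The class boundaries follow the four LCRPGR ranges used in this section.

```python
import math


def lcrpgr(area_start, area_end, pop_start, pop_end, years):
    """Ratio of land consumption rate to population growth rate.

    LCR = ln(area_end / area_start) / years
    PGR = ln(pop_end / pop_start) / years
    """
    lcr = math.log(area_end / area_start) / years
    pgr = math.log(pop_end / pop_start) / years
    if pgr == 0:
        raise ValueError("population growth rate is zero; LCRPGR is undefined")
    return lcr / pgr


def classify(value):
    """Assign an LCRPGR value to one of the four ranges used for the 433 cities."""
    if value <= 0:
        return "LCRPGR <= 0"
    if value <= 1:
        return "0 < LCRPGR <= 1"
    if value <= 2:
        return "1 < LCRPGR <= 2"
    return "LCRPGR > 2"


# Hypothetical city: built-up area 100 -> 150 km2, population 1.0 -> 1.2 million in 5 years.
value = lcrpgr(100.0, 150.0, 1.0e6, 1.2e6, years=5)
label = classify(value)
```

Because both rates are divided by the same number of years, the interval length cancels in the ratio; it is kept in the function so that the two rates can also be reported separately.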
The urbanization process in China measured in terms of changes in LCRPGR for 433 county-level or higher cities is shown in Figure 10. Overall, it can be seen that the value of LCRPGR rose from 1.34 during the period 1990-1995 to 2.15 during the period 2010-2015, and the land consumption rate exceeded the population growth rate throughout the whole period 1990-2015 except for the period 1995-2000, indicating that, for many Chinese cities, there are no restraints on the consumption of land and that this should be the focus of strengthened regulations. In detail, the proportions of cities with 0 < LCRPGR ≤ 1 (47.11%), 1 < LCRPGR ≤ 2 (38.34%) and LCRPGR > 2 (14.09%) occupied the top three places for the period 1990-1995, whereas for 2010-2015 these places were taken by LCRPGR > 2 (51.50%), 1 < LCRPGR ≤ 2 (29.56%) and 0 < LCRPGR ≤ 1 (18.01%). It is notable that the proportion of cities with LCRPGR > 2 experienced a significant upward trend, whereas the proportion with LCRPGR ≤ 0 remained below 1% throughout the whole time. Moreover, over the whole period 1990-2015, the proportions of cities with LCRPGR > 2, 1 < LCRPGR ≤ 2 and 0 < LCRPGR ≤ 1 were 15.94%, 60.05% and 23.56%, respectively. Based on the estimates and analyses for 2015, it seems that some limits should be put on the expansion of urban built-up areas and that the coordinated development of urbanization in China faces huge challenges (Kuang, 2020b).

Spatiotemporal evolution of cities in different urban agglomerations
We also considered the patterns and dynamics of the LCRPGR from the perspective of urban agglomerations at national, regional and local levels (Fang, Ma, & Wang, 2015). Table 3 lists 19 urban agglomerations that correspond to each of these three levels. Figure 11 shows the LCRPGR distribution for different urban agglomerations in China from 1990 to 2015. During the period 1990-1995, the urban agglomerations with 0 < LCRPGR < 1 were Hu-Bao-E-Yu (0.93), the Middle Reaches of the Yangtze River (0.92), the Shandong Peninsula (0.90), Lanzhou-Xining (0.86), Central Guizhou (0.67) and the Central Shanxi Plain (0.45); the average values of the LCRPGR in the other urban agglomerations ranged from 1 to 3. During the period 2010-2015, the urban agglomerations with LCRPGR > 3 were Central-Southern Liaoning (3.93), the Greater Bay Area (3.51), the Beibu Gulf (3.48), Harbin-Changchun (3.19) and Central Guizhou (3.13); the average values of the LCRPGR in the other urban agglomerations still ranged from 1 to 3. It is also obvious that the LCRPGR of cities in different urban agglomerations was higher after 2000 relative to the corresponding average values for 1990-2015. It can be concluded that the relation between land urbanization and population urbanization was significantly weakened due to rapid urban expansion combined with a lasting slowdown in population growth, especially in the cases of Central-Southern Liaoning and Harbin-Changchun. The LCRPGR for the cities in Beijing-Tianjin-Hebei, the Yangtze River Delta and the West Coast of Taiwan was relatively stable and remained close to 1, indicating that the cities in these urban agglomerations had achieved a better coordination between land urbanization and population urbanization. 
Description | LCRPGR range
Cities where population densification is taking place | 0 < LCRPGR ≤ 1
Cities where the rate of spatial expansion is greater than the demographic growth | 1 < LCRPGR ≤ 2
Cities where spatial expansion is taking place at a rate that is at least double the rate of demographic growth | LCRPGR > 2

Usage notes
To our knowledge, the standardized urban built-up area dataset introduced in this study is the first product using the same definition of UAA adopted by the WUP database for 433 cities in China (UN DESA, 2018). In addition, the comparisons made with contemporary data produced by the National Bureau of Statistics of China, the World Bank and UN-Habitat indicate that our results have a high spatial accuracy and good temporal consistency and thus can be used to characterize the process of urban expansion in China. Users can directly combine the information on spatial expansion derived from this dataset with the population counts provided by the UN to conduct comprehensive studies of urbanization in China without resorting to other global high-resolution gridded population distribution datasets (Freire ...; Lloyd et al., 2019). SUBAD-China contains 2,598 vector files in shapefile format, which include data for all of the 433 Chinese cities listed in the WUP database that have different urban sizes and income levels with populations over 300,000. The dataset covers the period from 1990 to 2015 at five-year intervals in order to depict the spatiotemporal evolution of urbanization in China over recent decades. Furthermore, the UISA maps and the corresponding built-up area products for these cities will be consistently updated and refined to ensure the quality of their spatiotemporal coverage and accuracy. The production of this dataset will close some of the data gaps in the calculation of SDG 11.3.1 and benefit other downstream applications relevant to the multidimensional analysis of the coordination of land consumption, population growth and economic development in urban areas.
Figure 11. Distribution of the ratio of the land consumption rate to the population growth rate (LCRPGR) for different urban agglomerations grouped by time period.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Open Scholarship
This article has earned the Center for Open Science badge for Open Data. The data are openly accessible at http://www.doi.org/10.11922/sciencedb.j00076.00004.