Identifying discrepant regions in urban mapping from historical and projected global urban extents

ABSTRACT Although several global urban extent products with fine resolutions (e.g. 30 m-38 m) have been developed, quantitative evaluations of these products across space and time are still missing, even though such evaluations are crucial to future urban growth modelling. Here, we analysed the discrepancy among six global fine-resolution urban extent products across space and time. First, we measured the area variations of urban extent among these products in each 0.25° grid used in the Land-Use Harmonisation (LUH2), a commonly used product with future projections in Earth system modelling. Then, we analysed the potential urban growth within each 0.25° grid using the LUH2 data under eight scenarios shaped by shared socioeconomic pathways (SSPs) and representative concentration pathways (RCPs). Finally, we identified regions with a large discrepancy and noticeable growth of urban extent under historical and SSP-RCP scenarios. We found that the discrepancy among the six products occurs in either highly developed (e.g. the United States (US) and Europe) or rapidly developing (e.g. China) regions. Moreover, the eastern US, Europe, and West Africa deserve more attention in the future due to their distinct urban growth and relatively large discrepancy of urban areas. The derived results are crucial to future global urban sprawl modelling.


Introduction
The dynamics of urban areas are an important indicator in evaluating human activities and global environmental change (Gong et al., 2019; Zhou et al., 2015). Maps of the global urban area with fine resolutions can support numerous studies of anthropogenic activities, such as human footprint, adaptation to and mitigation of climate change, and sustainable development (Mu et al., 2022b; Zhang et al., 2020). Rapid worldwide urbanisation has significantly impacted vegetation, biodiversity, the urban thermal environment, and energy consumption (Li et al., 2019b; Meng et al., 2020; Mu et al., 2021). Urban size is also associated with resource consumption and alters terrestrial carbon sequestration and the intact habitat landscape (Milesi et al., 2003; Mu et al., 2022a). Besides, there is a consensus that the global urban extent is likely to expand continuously, particularly in rapidly developing regions such as Africa and East Asia, resulting in a noticeable increase in energy consumption and carbon emissions. Given that global cities account for 75% of CO2 emissions (Edenhofer, 2015), accurate and timely mapping of the global urban extent is critical to adapting to and mitigating future risks caused by climate change and global urbanisation (Li et al., 2019a).
With the advent of freely accessible satellite observations and cloud-based planetary-scale geospatial analysis platforms (Gorelick et al., 2017; Xu et al., 2021, 2022), many regional and global urban extent products have been developed using moderate- and fine-resolution data such as Landsat and Sentinel (L. Liu et al., 2021). Representative products include the Global Human Settlement Layer (GHSL; Pesaresi et al., 2016), the Global Artificial Impervious Area (GAIA; Gong et al., 2020), and the Global Annual Urban Dynamics (GAUD; X. Liu et al., 2020). These global urban extent products cover long temporal spans, with annual or decadal mapping frequencies. Besides, many products were produced globally using machine learning approaches (e.g. support vector machine and random forests) (Du et al., 2020) at moderate resolutions (e.g. 30 m), while some advanced mapping techniques such as deep learning were recently employed to generate the Global Land Cover product in 2020 at fine resolution (e.g. 10 m; Karra et al., 2021). Although these global urban extent products (e.g. 30 m) have been significantly improved over the past decades, their mapping performance differs noticeably across space (L. Liu et al., 2021).
The spatiotemporal discrepancy among recently developed global urban extent products is still unknown, especially in regions with considerable growth of urban areas in historical and future scenarios. Understanding which regions show large discrepancies among the existing global urban extent products is thus of great importance to developing advanced mapping algorithms in the future, particularly in areas that would experience a distinct increase in population and economy. In the near future, the expansion of global urban areas could be simulated under both socioeconomic and climate scenarios, which can be characterised by diverse narratives using the shared socioeconomic pathways (SSPs) and representative concentration pathways (RCPs), respectively. The SSP scenarios provide multiple socioeconomic development pathways using different narratives of population and gross domestic product (GDP) change throughout this century (Rosa et al., 2017). The RCP scenarios present future climate change (e.g. temperature and precipitation) under different CO2 concentrations and have been widely used in Earth System Models and the Intergovernmental Panel on Climate Change (IPCC) report. Future land use/cover change under different scenarios (i.e. SSPs and RCPs), as well as the historical land cover data, were integrated into the Land-Use Harmonisation (LUH2) data (Hurtt et al., 2020), which have been widely used in interdisciplinary fields such as land science and climate change. Combining historical and future scenarios makes it possible to comprehensively assess discrepancies in existing global urban extent products and to identify those discrepant regions with a noticeable increment in the future.
In this study, we developed an analytical framework for evaluating the discrepancy of global urban extent products from aspects of historical maps and future projections. This study provides a comprehensive analysis to improve the understanding of global urban extent mapping, because it considers the discrepancy between the existing products and future SSP-RCPs scenarios. First, we measured the discrepancy of urban extent mapping characterised by the variation of six existing global urban extent products in 2015 on the global scale. Then, we analysed the spatial heterogeneity of urban area growth within each 0.25° grid using the LUH2 data under diverse scenarios determined by SSPs and RCPs. Finally, we identified those discrepant regions by integrating the discrepancy map of existing products and the future urban growth scenarios. The remainder of this paper is organised as follows: Section 2 presents the dataset we used in this study; Section 3 introduces the methodology of the discrepancy analysis; Section 4 provides results and discussion, and Section 5 concludes this study.

Datasets
We evaluated the discrepancy of global urban extents mapped from products with multiple epochs (e.g. annual) and fine resolutions (e.g. 30 m) (Table S1), including GAIA, GAUD (X. Liu et al., 2020), GISA, GHSL (Pesaresi et al., 2016), MSMT_IS30, and FromGLC (Gong et al., 2013a). Although these products use different terminologies (e.g. impervious surface, urban, and human settlement), most of them define urban extent as areas dominated (i.e. commonly above 50%) by artificial surfaces (e.g. roads and roofs; Gong et al., 2020). Therefore, we refer to them as urban extent hereafter, given that this term is commonly used in urban remote sensing. We chose 2015 as the representative year for comparison because this year is available in all six products (Table S1), and we selected the period 1985-2015 from three sets of annual urban extent products to characterise the historical urban land use change. Moreover, we collected the LUH2 data to characterise future urban expansion scenarios. As a crucial dataset in the Coupled Model Intercomparison Project Phase 6 (CMIP6; Liao et al., 2020), the LUH2 data provide diverse scenarios of urban area growth under future climate change and socioeconomic development scenarios (Hurtt et al., 2020), spanning from 2015 to 2100 with a spatial resolution of 0.25°. In this study, we used the LUH2 data to identify those regions that may experience distinct growth of urban areas in the future.

Method
We implemented a comprehensive comparison of global products of historical and projected urban extents under future SSP-RCP scenarios (Figure 1). The proposed evaluation framework includes three steps. First, we calculated three indicators (i.e. deviation, standard deviation, and variation coefficient) of urban extent products (Figure 1(a)). Then, we investigated the urban area growth within each 0.25° grid using the LUH2 data under eight scenarios shaped by SSPs and RCPs (Figure 1(b)). Finally, we defined those discrepant regions by considering the combined discrepancy of historical urban extent products and the growth potential under future SSP-RCP scenarios in each 0.25° grid (Figure 1(c)).

The discrepancy in historical urban extent
To identify the discrepancy of these existing urban extent products, we calculated their urban area percentages within each grid and measured their deviations quantitatively. First, we identified the product closest to the mean of these six datasets by exploring the deviation of urban areas in each product locally (i.e. within each grid). After that, we quantified the discrepancy of mapped urban areas among different products within the 0.25° grid using their mean, minimum, maximum, and variation coefficient (CV), defined as

$$\mathrm{CV} = \frac{\sigma}{\mu}$$

where σ and μ represent the standard deviation and the mean of urban areas among these products, respectively.
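The per-grid statistics described above can be sketched as follows. This is a minimal illustration rather than the processing chain used in the study: the function name `grid_discrepancy`, the input layout (one urban area value per product for a single 0.25° grid), the example numbers, and the use of the population standard deviation are all assumptions.

```python
from statistics import mean, pstdev


def grid_discrepancy(areas):
    """Summarise urban area estimates for one 0.25° grid.

    areas: dict mapping product name -> urban area in that grid (e.g. km2).
    """
    values = list(areas.values())
    mu = mean(values)
    # Assumption: population standard deviation across the products.
    sigma = pstdev(values)
    # Variation coefficient sigma / mu; undefined where no product maps urban land.
    cv = sigma / mu if mu > 0 else float("nan")
    # Product whose area is closest to the multi-product mean.
    closest = min(areas, key=lambda p: abs(areas[p] - mu))
    return {
        "mean": mu,
        "min": min(values),
        "max": max(values),
        "std": sigma,
        "cv": cv,
        "closest": closest,
    }


# Made-up areas for a single grid, for illustration only.
example = {
    "GAIA": 120.0,
    "GAUD": 100.0,
    "GISA": 110.0,
    "GHSL": 90.0,
    "MSMT_IS30": 150.0,
    "FromGLC": 90.0,
}
stats = grid_discrepancy(example)
print(stats)
```

Applying such a function to every grid would yield maps of the mean, range, standard deviation, and variation coefficient, plus a map of the product closest to the mean.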

The projected urban extent under future scenarios
We used the LUH2 data to characterise the urban area growth by 2100 under various climate change and socioeconomic development pathways. First, we analysed the distribution of potential regions that may experience distinct growth of urban areas in the near future, along with the latitude and longitude profiles. The SSP narratives provide the pathways of socioeconomic development (e.g. population and gross domestic product; GDP) across different countries and regions (Table 1), while the RCPs represent climate change scenarios under different CO2 concentrations by 2100 (Table 2). A combined scenario, such as SSP2-RCP4.5, represents the integrated effect on urban land use change of the socioeconomic narrative of SSP2 (i.e. middle of the road) and the climate change scenario of RCP4.5 (i.e. a pathway for stabilisation of radiative forcing at 4.5 W/m² by 2100; Rogelj et al., 2018). The SSP2-RCP4.5 scenario indicates that the future development of human society follows a pathway in which demographic, economic, and technological trends do not shift markedly from the historical pattern (Fricko et al., 2017).
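As a rough sketch of how growth under each scenario might be compared for one 0.25° grid, the snippet below computes the change in urban fraction between 2015 and 2100 and its difference from the SSP2-RCP4.5 base scenario. The function names, the three scenario labels chosen, and all fraction values are illustrative assumptions, not values taken from LUH2.

```python
def growth_by_scenario(frac_2015, frac_2100):
    """Change in urban fraction (2100 minus 2015) for each scenario."""
    return {name: frac - frac_2015 for name, frac in frac_2100.items()}


def relative_to_base(growth, base="SSP2-RCP4.5"):
    """Difference in growth between each scenario and the base scenario."""
    return {name: g - growth[base] for name, g in growth.items()}


# Made-up urban fractions for one grid, for illustration only.
frac_2015 = 0.20
frac_2100 = {
    "SSP1-RCP1.9": 0.26,
    "SSP2-RCP4.5": 0.30,
    "SSP5-RCP8.5": 0.45,
}

growth = growth_by_scenario(frac_2015, frac_2100)
anomaly = relative_to_base(growth)
# Scenario with the largest growth in this grid.
max_scenario = max(growth, key=growth.get)
print(growth, anomaly, max_scenario)
```

Expressing growth relative to a base scenario is one simple way to make the spatial differences between pathways comparable across grids.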

The identification of discrepant regions
We identified discrepant regions as those with a large discrepancy among the six global products in 2015 and, at the same time, a noticeable growth of urban extent under future SSP-RCP scenarios (2015-2100). First, we quantified the urban area growth (i.e. the fraction within the 0.25° grid) from 1985 to 2015 using the average of three annual urban extent products (i.e. GAIA, GAUD, and GISA). Then, we measured the maximum area growth of urban extent within each 0.25° grid across the future growth scenarios. In conjunction with the standard deviation of the six existing global urban extent products in 2015, we identified those discrepant regions associated with a relatively large discrepancy in global urban extent products and a high potential for area growth under future SSP-RCP scenarios. These regions may deserve attention in future urban sprawl modelling and even in Earth system science studies.
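The combination of the three ingredients above (historical growth, maximum future growth, and the 2015 standard deviation) can be sketched as below. The text does not state numeric cut-offs, so `std_threshold` and `growth_threshold` are hypothetical parameters, and the function names and sample values are likewise assumptions for illustration.

```python
from statistics import mean

ANNUAL_PRODUCTS = ("GAIA", "GAUD", "GISA")


def historical_growth(frac_1985, frac_2015):
    """Mean 1985-2015 change in urban fraction over the three annual products."""
    return mean(frac_2015[p] - frac_1985[p] for p in ANNUAL_PRODUCTS)


def is_discrepant(std_2015, future_growth, std_threshold, growth_threshold):
    """Flag a grid with both a large product discrepancy and high future growth.

    future_growth: dict mapping scenario name -> growth in urban fraction.
    Both thresholds are hypothetical; the study does not report fixed values.
    """
    max_growth = max(future_growth.values())
    return std_2015 >= std_threshold and max_growth >= growth_threshold


# Made-up values for one grid, for illustration only.
frac_1985 = {"GAIA": 0.10, "GAUD": 0.12, "GISA": 0.08}
frac_2015 = {"GAIA": 0.30, "GAUD": 0.28, "GISA": 0.32}
hist = historical_growth(frac_1985, frac_2015)

future = {"SSP1-RCP1.9": 0.06, "SSP2-RCP4.5": 0.10}
flag_high = is_discrepant(0.08, future, std_threshold=0.05, growth_threshold=0.08)
flag_low = is_discrepant(0.01, future, std_threshold=0.05, growth_threshold=0.08)
print(hist, flag_high, flag_low)
```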

The discrepancy of urban extent among global products
Globally, regions with relatively large discrepancies among these six global urban extent products are mainly the US, Europe, and East and Southeast Asia (Figure 2). These regions are either highly developed (e.g. the US and Europe) or have been rapidly developing over the past years (e.g. China). For example, there are larger deviations of urban areas in the eastern US among these existing global urban extent products, i.e. the minimum area difference to their mean is more than 1,500 km² in eastern grids, where GHSL, GAIA, and GISA have smaller area biases than the other products (Figure 2(a)). In Europe, regions with relatively large area biases are mainly distributed in western Europe, where the products with the minimum area difference are diverse. China experienced a noticeable urban expansion over the past decades, and most urbanised lands are in the eastern part of the country. The products with the minimum area difference in North and South China are GAUD and GAIA, respectively. In addition, the discrepancy in tropical areas (e.g. Indonesia) is related to the limited availability of optical satellite images due to cloud coverage. Overall, regions with the largest variation coefficient and standard deviation have higher levels of uncertainty, such as Southeast Asia, Central Asia, and the Midwest US (Figure 2(b)). Those regions with a relatively large discrepancy of urban extent among products lie latitudinally between 20°N and 60°N, where most global urbanised areas are located, such as the United States (US), Europe, India, and China (Figure 3). Here, the over- and under-estimation of urban areas within each grid was defined as the area difference of the product with the largest area gap to the reference area (i.e. the mean of the six products), as illustrated in Figure 3. In general, the urban areas of GAIA, GISA, and MSMT_IS30 in China are similar. In contrast, MSMT_IS30 has more urban areas than other products in the remaining regions, such as the US, Europe, and India (Figure 3(a)). A similar result was also observed along the longitudinal profile (Figure 3(b) and Figure S1). Multiple remotely sensed observations, including Landsat and Sentinel-1 data, have been used in MSMT_IS30. In addition, we found that the variation coefficient was relatively low in regions with a large amount of urban area. This phenomenon is likely related to the datasets and methods adopted in the classification.
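One way to read the over- and under-estimation definition used here (the area difference of the product with the largest gap to the six-product mean) is sketched below. The function name `largest_gap` and the example areas are assumptions for illustration; a positive value indicates over-estimation relative to the mean and a negative value under-estimation.

```python
from statistics import mean


def largest_gap(areas):
    """Return the product furthest from the multi-product mean and its signed gap.

    areas: dict mapping product name -> urban area in one 0.25° grid.
    """
    mu = mean(areas.values())
    # Product with the largest absolute difference to the mean.
    product = max(areas, key=lambda p: abs(areas[p] - mu))
    # Signed difference: positive = over-estimation, negative = under-estimation.
    return product, areas[product] - mu


# Made-up areas for a single grid, for illustration only.
example_areas = {
    "GAIA": 120.0,
    "GAUD": 100.0,
    "GISA": 110.0,
    "GHSL": 90.0,
    "MSMT_IS30": 150.0,
    "FromGLC": 90.0,
}
product, gap = largest_gap(example_areas)
print(product, gap)
```

Aggregating these signed gaps along latitude or longitude bands would give profiles comparable in spirit to those in Figure 3.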
The mapped urban extents from different products show a noticeable spatial discrepancy across different regions (Figure 4). Overall, products such as GAIA, GAUD, GISA, and MSMT_IS30 capture the extent of urban areas in the suburbs of metropolitan cities well (Figure 4). For instance, many small artificial areas around the city fringe (e.g. Wuhan, China) were extracted by GAIA and GISA. Similar results were found for MSMT_IS30 in cities such as Warsaw (Europe) and Chicago (US). Meanwhile, in tropical cities (e.g. Jakarta, Indonesia), there is a distinct overestimation of extracted urban lands in GHSL and FromGLC. GHSL and FromGLC are therefore not recommended for Southeast Asia. We found that MSMT_IS30 extracts more continuous roads than other data in Chicago. Among the annual urban extent products, GAIA identifies part of the bare soil as urban in Wuhan and New Delhi, whereas GISA extracts less urban extent than other products in Chicago. These representative regions can help different user groups choose suitable products and can inform improved methods of global urban extent mapping in the future.
The considerable differences in mapped urban extents are likely attributable to the definitions and mapping approaches employed in different products (Table S1). In general, the urban extent is commonly defined in remote sensing as pixels that are dominated (i.e. more than 50%) by artificial elements (e.g. roads, buildings), as in GHSL (Pesaresi et al., 2016), GAIA, and FromGLC (Gong et al., 2013a). However, for GAUD (X. Liu et al., 2020), GISA, and MSMT_IS30, the samples were derived from existing urban products with different resolutions, e.g. from GlobeLand30 (30 m) to the ESA-CCI product (300 m) and night-time light data (1 km). Besides, although Landsat observations are the primary datasets used in these six fine-resolution (30 m) products, ancillary datasets with finer resolution (e.g. Sentinel-1) also play a crucial role in the mapped results. For instance, small urban extents (e.g. roads) in the US were extracted in MSMT_IS30 due to the inclusion of Sentinel-1 data. In addition, there are two primary approaches for generating long-term and annual records of urban extent (i.e. GAIA, GAUD, and GISA). One, used for GAIA and GISA, is the annual classification of urban extent using machine learning approaches (e.g. random forest) followed by post-processing. The other, used for GAUD, fuses the urban extents in 1985 and 2015 from other global urban extent products and derives the annual dynamics using a temporal segmentation approach.
The merits of these global urban extent products determine their applications in specific fields. For instance, the temporal information revealed by the urban extent time series products (e.g. GAIA, GAUD, and GISA) is helpful to urban growth modelling because of their long-term and annual urban extent dynamics. Thus, the difference in urban growth during different urbanisation stages can be well characterised (Li et al., 2017). Also, for landscape and natural resources managers, fine-resolution data such as FromGLC, which includes urban and other natural land cover types, are recommended since the interactions between human and natural systems can be reflected in relevant studies.

Projected urban extent under scenarios
The future growth of urban areas is influenced more by SSPs than by RCP scenarios (Figure 5). Overall, regions with distinct urban growth in the future are mainly distributed in the US, Europe, and West Africa (Figure 5(a)), represented by relatively low and high growth rates of urban areas in each 0.25° grid. Besides, of the two most populated countries globally, the increment of urban areas in China is anticipated to be distinctly lower than that in India in the near future. This phenomenon is closely related to China's relatively low (or declining) population growth after 2030 (Song et al., 2021; Van Ruijven et al., 2019). By contrast, India's growth rate in urban areas is relatively high under future scenarios (e.g. SSP4) due to its continuously increasing population (Figure 5(b)). Meanwhile, the global urban expansion under SSP3 is distinctly lower than under other scenarios (e.g. SSP1 and SSP5), probably due to the relatively low pace of economic growth caused by regional rivalry (Riahi et al., 2017). The fossil-fuelled development scenario (SSP5) suggests a significant future urban expansion in developed regions such as the US and Europe, i.e. most urban area increments under the SSP5 scenario were observed latitudinally from 30°N to 50°N (Figure S2). Notably, the inequality of urban expansion between developed (e.g. US and Europe) and developing regions (e.g. Africa and India) is amplified in this scenario.
There is a distinctive spatial difference in future urban area growth across scenarios and regions compared to the base scenario of SSP2-RCP4.5 (Figure 6). The SSP2-RCP4.5 scenario follows the pathway of historical development, given its socioeconomic narrative of 'middle of the road' and its moderate CO2 concentration pathway with a radiative forcing of 4.5 W/m² by 2100 (Fricko et al., 2017). Overall, future urban expansion under SSP1-RCP1.9 (Sustainability) is similar to the base scenario but lower, due to sustainable usage of resources and production (Figure 6(a); Hurtt et al., 2020; Van Vuuren et al., 2017). For SSP3 (Regional rivalry) and SSP4 (Inequality), most increased urban areas would likely occupy lands in South America, Africa, and Middle Asia, whereas urban areas decline in developed regions such as the US and Europe (Riahi et al., 2017; Figure 6(b-c)), compared with the base scenario. By contrast, the pattern of global urban expansion under SSP5 is the opposite of that under SSP3 and SSP4 (Figure 6(d)). It is worth noting that the urban expansion in China is low and consistent across these scenarios. By contrast, the historical increment of urban areas in China from 1985 to 2015 is the most significant globally due to the rapid pace of urbanisation.

Identified discrepant regions
Both developed regions (e.g. US and Europe) and developing regions in Africa and middle Asia deserve more attention in future global urban extent mapping projects. Geographically, the eastern US is associated with discrepancy due to the difference in existing global urban extents and the divergent growth patterns reflected by the SSP and RCP scenarios in the future (Figure 7(a)). For regions in Canada, Eastern Europe, and Russia, the discrepancy of existing global urban products is significant relative to urban areas, while future and historical growth are not significantly different. For regions in Africa, South America, and Central Asia, urban areas are estimated to grow in future scenarios. We further examined the urban fraction of these 0.25° grids and analysed their distributions along the above three dimensions in Figure 7(a). We found that regions with relatively large discrepancies, including the future growth scenarios and existing urban extent products, mainly occurred in urban areas with fractions greater than 20% (Figure 7(b); Fry et al., 2011). However, the discrepancy is relatively high for high urban fractions (i.e. greater than 80%), and future global urban expansion is likely to occur in regions with moderate urban fractions (e.g. 40%~60%; Figure 7(b)). In addition, the urban growth rate from 1985 to 2015 is significantly greater than the prediction in the climate change scenario. These historical increases mainly occurred in China, India, and Europe.
Note: the discrepancy of existing global urban extent products was characterised by their standard deviations, and the future growth rates were calculated using the SSP-RCP scenario with the maximum urban area growth.
We identified four discrepant regions by considering historical urban growth and the SSP and RCP scenarios (Figure 8). The selected regions clearly show discrepancies across different urban fraction levels. There is a distinctive growth of urban extent in the US regarding the future growth scenarios and the discrepancy of existing urban extent products. Similarly, moderately urbanised grids (i.e. 40%~60%) are likely to experience rapid growth under future scenarios (Figure 8(a)). Under future scenarios, the growth rates of urban areas in West Africa are higher than in Eastern Europe and China but lower than in the US (Figure 8(b)). Although the discrepancy of urban extent among products is distinct, the urban area growth of Eastern Europe has been relatively slow historically and remains so under future scenarios (Figure 8(c)). For instance, the historical urban growth in terms of urban fraction in the US and Eastern Europe does not exceed 0.25 because the urbanisation process was completed at an early stage (i.e. before 2015) in most developed regions (Zhou et al., 2018). The historical growth and discrepancy of urban extent in China are significant, especially in areas with urban fractions greater than 40% (Figure 8(d)).

Conclusions
In this study, we implemented a comprehensive evaluation of the discrepancy of urban extent in historical and future scenarios. First, we evaluated the discrepancy of six global urban extent products. Then, we analysed the future growth of urban extent under different SSP-RCP scenarios. Finally, we identified discrepant areas for global urban extent mapping by linking the existing global urban extent maps and future growth under SSP-RCP scenarios. We found that the discrepancy among the global urban extent products occurs latitudinally from 20°N to 60°N. These regions are either highly developed (e.g. the US and Europe) or have been rapidly developing (e.g. China) over the past years. Under the future scenarios, the urban growth rates in the US, Europe, and West Africa are more distinct than those in other regions; however, the future urban growth rate in China is relatively stable, although a distinct urban expansion occurred in China over past decades. Meanwhile, we found that SSPs are more important to future urban growth than RCPs. The urban growth under the future scenarios is mainly in the eastern US and Europe, while historical growth is mainly distributed in eastern China. This study is the first to evaluate the global urban extent mapping discrepancy by combining historical mapping results and future growth scenarios. The identification of discrepant regions is helpful to global urban extent mapping studies, since difficulties in those regions deserve attention in the future with improved mapping approaches and satellite observations. In addition, those discrepant regions also show great potential for future global urban sprawl modelling under diverse socioeconomic and climate change scenarios.