New urban map of Eurasia using MODIS and multi-source geospatial data

Abstract Urban areas are of paramount significance to both the individuals and communities at local and regional scales. However, the rapid growth of urban areas exerts effects on climate, biodiversity, hydrology, and natural ecosystems worldwide. Therefore, regular and up-to-date information related to urban extent is necessary to monitor the impacts of urban areas at local, regional, and potentially global scales. This study presents a new urban map of Eurasia at 500 m resolution using multi-source geospatial data, including Moderate Resolution Imaging Spectroradiometer (MODIS) data of 2013, population density of 2012, the Defense Meteorological Satellite Program’s Operational Linescan System (DMSP-OLS) nighttime lights of 2012, and constructed Impervious Surface Area (ISA) data of 2010. The Eurasian urban map was created using the threshold method for these data, combined with references of fine resolution Landsat and Google Earth imagery. The resultant map was compared with nine global urban maps and was validated using random sampling method. Results of the accuracy assessment showed high overall accuracy of the new urban map of 94%. This urban map is one product of the 20 land cover classes of the next version of Global Land Cover by National Mapping Organizations.


Introduction
Urbanization processes are universal phenomena, taking place worldwide wherever humans reside. Urban areas are among the most rapidly changing and expanding elements of the landscape (Taubenböck et al. 2010). They constitute a major trend particularly in recent years. From an environmental perspective, urbanization has emerged as an extremely important environmental issue in many parts of the Earth (Schneider, Friedl, and Potere 2010). New estimates show that more than half of the world population lives in urban areas. Furthermore, most of the population growth expected to occur in urban areas will be concentrated in cities and towns of less developed regions by 2050 (UN 2011). Urbanization processes have been characterized not only by increasingly built-up areas, but also by remarkable industrial expansion, economic and social developmental activities, and rapid consumption of natural resources. Urban areas affect the global environment through urban heat effects, conversion and fragmentation of natural ecosystems, pollution of air, soil, and water, and reduction of biodiversity (Zhou et al. 2014), all occurring in the main source regions of carbon emissions (Svirejeva-Hopkins, Schellnhuber, and Pomaz 2004). Therefore, high-quality mapping with regularly updated urban land use information is necessary to elucidate the impacts of the urban areas on global climate and natural resources. Thus, remote sensing becomes an essential field for up-to-date and area-wide data acquisition, especially in explosively sprawling cities of developing countries (Taubenböck et al. 2010).
Several studies have specifically addressed mapping the urban extent on a local scale using fine-resolution satellite data. For example, (Xiao et al. 2006) detected land use/cover change and urbanization trends using annual urban growth rate supported by two scenes of satellite multi-spectral images, (Batisani and Yarnal 2009) assessed urban landscape pattern change from 1993 to 2002 using Landsat TM images, (Taubenböck et al. 2012) the German Aerospace Center (DLR) monitored the growth of different megacities using spaceborne Earth observation data. However, these studies of urban areas have consistently relied on fine resolution data related to analyses of the spatial dynamics of the urbanization process and the detected annual urban growth rate for small areas at specific regions. Therefore, the need for global urban maps encouraged different international groups from academia, government, and OPEN ACCESS industry to develop various global-scale maps that are useful to identify urban areas as a primary class among different land-cover classes ).
The current trends of global land cover mapping are to produce a new map with higher accuracy and better resolution (Chen et al. 2015) and to use multiple existing maps for improvement and to publish and share training data (Tateishi et al. 2014). Table 1 provides a summary of the urban definitions, data, and methods used for each map of the most widely used global urban products.
This work of the new urban map was supported based on previous urban mapping efforts of GLCNMO 2008 using up-to-date multi-source geospatial dataset (Phong et al. 2013;Tateishi et al. 2014). The urban area definition is based on Land Cover Classification System (LCCS) given in an earlier report (Tateishi et al. 2011), where urban and built-up land must dominate greater than 50 percent of an image pixel for that pixel to be labeled urban. Green fields and water bodies not directly related to human development activities were not classified as urban. When vegetation (such as big park, settlement area with much green or golf course) and water bodies dominated a pixel, these areas were not considered urban, even though in terms of land use, they may function as urban space.
The aim of this research is to produce the newest urban map covering Eurasia at 500 m spatial resolution based on the newly data-set of Moderate Resolution Imaging Spectroradiometer (MODIS) data of 2013, population density of 2012, DMSP-OLS nighttime lights of 2012, and ISA data of 2010. The specific objectives are the following: (1) to analyze and share a newly developed urban map of Eurasia of 2012-2013; and (2) to compare the accuracy of coarse-resolution urban maps with Landsat and high-spatial resolution Google Earth imagery.

Data and pre-processing
Urban mapping of Eurasia 2012-2013 was conducted mainly using the data described below.

MODIS NDVI data of 2013
The sources of MODIS data are the MODIS/Terra + Aqua Nadir BRDF −Adjusted Reflectance 16-day L3 Global 500 m SIN Grid V005 (MCD43A4). These are 16-day composite, seven-band, 500 m, and 10-degree tile data. The source MODIS data were reprojected into a latitude/longitude coordinate system, and were mosaicked to produce the data of Eurasia continent. Mosaicking and reprojection were done using MODIS Reprojection Tool (MRT) software distributed from US Geological Survey. Resampled pixels have 15.0012 arc second resolution, which is nominally called 500 m resolution. Cloud removal from MODIS data is extremely important to obtain cloud-free images over the entire land area of the earth for various land studies. Cloud cover is the major limitation of using MODIS data for various applications, particularly in tropical regions. Therefore, cloud-contaminated pixels were replaced through linear interpolation of two cloud-free pixels before and after the cloud pixel in cases where the cloud-contaminated period is equal to or less than six 16-day periods. The cloud-contaminated pixel was replaced by the average of 2012 and 2014 MODIS data at the same location and the same time of the year if the cloud-contaminated period is longer than the six 16-day periods. The interpolation process is applied for predictor needs of areas to be known over the land mass area. In other words, daily values of estimated area over the domain of interpolation area at 500 m spatial resolution are provided to extract the actual area of urban extent and other land cover of the earth away from overestimation or deduction of the actual area. Preprocessed MODIS data consist of 23 periods of 16-day composite in 2013 (Tateishi et al. 2014). MODIS data were used to calculate the Normalized Difference Vegetation Index (NDVI) by the following equation (Rouse et al. 1974).
where NIR is the near infrared band and R is the red band, corresponding, respectively, to bands 2 and 1.
A Maximum Value Composition was then applied to all NDVI images to select pixels that are less affected by clouds and other atmospheric perturbations (Holben 1986). Here, maximum MODIS NDVI data of 23 periods was generated, as expressed in Equation (2). where NDVI_1, NDVI_2, …, NDVI_23 are the multi-temporal MODIS NDVI images acquired from 2013/01/01 to 2014/01/01.
In fact, the aim of using Max 16 day MODIS NDVI is to exclude large green areas, such as parks and to minimize non-vegetated areas in the urban areas. The final Max MODIS NDVI image was used to reproject and resample the other used data using commercial software ENVI and PCI.

Global population data
The LandScan 2012TM Global Population Database was developed by the Oak Ridge National Laboratory for the United States Department of Defense. 1 LandScan data are the finest resolution global population distribution data available, representing an ambient population (average over 24 h) (Dobson et al. 2000).
The global population distribution data is ESRI Grid format at approximately 1 km resolution (30″ × 30″). The population density calculation for 1 km pixel was derived mainly from the population data and area of pixels using ArcMap 10.2.

DMSP-OLS nighttime lights data
DMSP-OLS collects imagery at night and has a number of unique features that meet the needs of wide-scale, frequently repeated surveys of urban growth. The human settlements product represents stable lights in which clouds, gas flares, lightning, and other ephemeral and extraneous signals have been screened out, leaving only "city lights" (Christopher et al. 2001).
DMSP-OLS data of 2012 (Version 4 DMSP-OLS Nighttime Lights Time Series, 2012) 2 used for this study are cloud-free composites produced using all available archived DMSP-OLS smooth resolution data for calendar years. The products are 30 arc second grids, spanning −180° to 180° longitude and −65° to 75° latitude.

Global Distribution and Density of ISA data, 2010
These data include roads, parking lots, driveways, sidewalks, and other manmade surfaces. While high spatial resolution is necessary to observe these features, the new product reports the estimated density of ISA on a 1 km 2 grid based on two coarse resolution indicators of ISAthe brightness of satellite observed nighttime lights and population count (Christopher et al. 2007). Because no ISA data are available for 2013, data for 2010 were used. 3

Gross Domestic Product (GDP) per capita
By one estimate, 80% of the world's GDP is generated by urban areas (UN 2011). Because cities attract businesses and jobs, they bring together both the human and the entrepreneurial resources to generate new ideas, innovations, and increasingly productive uses of technology. Consequently, four threshold values were decided based on GDP and used as a basic index for mapping urban areas of Eurasia.
The GDP per capita data is provided as excel file. 4 The data is used to divide the countries into four groups according to the level of the development (Table 2), then the GDP per capita data was joined with the population density data, which is in ESRI Grid format, using ArcMap 10.2.

Other data
Other data used for this study for validation and threshold purposes are Google Earth and Landsat images. Landsat images (30 m) downloaded from The United States Geological Survey (USGS). integrated together using EASI modeling language of PCI 9.1 software.
Overall, the final urban map of Eurasia 2012-2013 includes three land cover classes: urban areas, water bodies, and land, as shown in Figure 1.

Comparative analyses
Three comparative analyses were performed to evaluate the performance and accuracy of the method used. Several images were used from Landsat and high-resolution imagery in Google Earth that are reasonably close to 2012-2013 to evaluate nine existing global urban maps to reveal the dynamics and changing of built-up areas and urban growths over time in the study area.

Visual comparison
Global urban maps of the GLCNMO 2003 and 2008 were produced and distributed previously by the Global Mapping Project coordinated by the International Steering Committee for Global Mapping (ISCGM). 5 Continuously for contribution to global environmental conservation, this study is a part of the next version of GLCNMO ver. 3, which includes urban class as one of the 20 land cover classes. Therefore, an urban map produced in this study was compared with the present GLCNMO urban maps of 2003 and 2008, respectively, as a first step in the comparison. The urban map of GLCNMO 2003 is accurate in continental and national scales. More specifically, the misclassified pixels of urban areas occurred in developed regions, such as Europe. Some developing countries were not represented, such as Phra Phutthabat city-Saraburi province in Thailand. This result is caused by the threshold value decided based on only one level for all continents. The urban map of this study showed good agreement in terms of the regional and spatial distribution of urban extent of GLCNMO 2008. Furthermore, this new version of urban map showed expanded urban areas compared with GLCNMO 2008, as shown in Figure 2.
These results affirm that the objective of this study is to provide newly available information about the extent, growth, and distribution patterns of urban areas at a continental scale. Using Landsat 8 images and other urban maps, a visual comparison shows that the method used for this study mapped the urban areas well (Figure 2).

Threshold processing and integration
In order to derive optimal thresholds specific to different cities, 60 Landsat images (25 different cities in Europe, 5 different cities for each China and India, 8 cities in Southeast Asia, and 17 cities for other countries in Eurasia continent) were selected based on population, geographic region, and income that were acquired from 2010 to 2013. The urban mapping of Eurasia of 2012−2013 was carried out by the following steps; Step 1: Candidate of urban area is derived from population density data by the threshold which is decided by visual interpretation of Landsat ETM + images and Google Earth images, as mentioned above. Four threshold values were decided based on GDP per capita (Tateishi et al. 2014). That is, higher GDP countries have lower threshold values of population density to extract candidate of urban areas, as shown in Table 2.
Step 2: In order to exclude less nighttime light area from urban, DMSP-OLS data were used. The excluded areas were decided by the threshold that is decided by visual interpretation, as mentioned in Step 1.The higher GDP countries have higher threshold value of DMSP-OLS data, due to their high levels of energy consumption (Welch 1980), while lowest threshold values were given to a developing cites, because these cities are substantially less bright compared to industrialized cities.
Step 3: In order to exclude less ISA from urban areas, ISA data were used. The excluded areas were decided by the threshold which is decided by the same way as the Step 1. The higher GDP countries have higher threshold value of ISA.
Step 4: In order to exclude more vegetation areas from urban, MODIS NDVI data were used. Generally, the cities of developing regions exhibit the least complex, most compact, least porous, and densest urban forms. Cities of developed regions display diametrically opposed tendencies (Huang, Lu, and Sellers 2007). The excluded areas were decided by the threshold which is decided by the same way as the Step 1. The higher GDP countries have higher threshold value of MODIS NDVI data ( Table 2).
The resultant urban map was produced by excluding areas of less nighttime light (Step 2), areas of less impervious surface (Step 3), and more vegetated areas (Step 4) from candidate of urban areas (Step 1).
The four parameters; (population density data, DMSP-OLS data, ISA, and MODIS NDVI data) were

Areal comparison
The results presented in Figure 3 show  note: Urban areas were extracted from a landsat image 8 (2014) using maximum likelihood classification system, where urban areas produced by other products were overlaid on a color composite image of landsat 8 bands (2, 4, and 7) exposed, respectively, through red, green, and blue.
GLC2000, MOD12Q1, GLCNMO v1, and GlobCover v2.3 were combined to produce a new map showing reliability of the mapped result for urban/non-urban areas. These maps were resampled to match the spatial resolution of each global urban map. The resultant map, designated as the "potential urban map" in this study, was created to collect validation data and to analyze the accuracy of the urban maps of Eurasia as shown in Figure 4 (Schneider, Friedl, and Potere 2009;Tateishi et al. 2014).
Here, approximately 350 random points were taken from agreement areas of the "potential urban map" using random sampling to obtain the confusion matrix. These interpreted validation points were double-checked by three image interpreters who were selected based on their outstanding skills in image interpretation. The final quality control confirmation was conducted by only one interpreter, the most experienced, who reviewed and ensured all validation points. In all, 271 points were selected after checking and interpretation with reference 2004-2006 and 2009, the estimated urban areas were very low compared with Landsat and other products, as presented in Figure 3. This limitation can be referred for several factors, such as misclassified urban areas using the classification system, definition of urban extent and data used, and contribution to the differences between products (Zhou et al. 2015).
Overall, despite the different spatial resolutions, methodologies, and data from UMD LC, GLC 2000, and GlobCover (v.2.2 and v.2.3), they showed nearly equal estimations of urban areas, whereas other products showed either overestimation or underestimation of the real extent of urban areas, which are explainable by the fact that a group of contiguous pixels might represent urban areas mixed with other land cover classes, such as vegetated areas or water bodies.

Accuracy assessment
Six existing global urban maps, including Global Land Cover characterization (GLCC) v2.0, 10 UMD LC,  Examination of the urban class producer's accuracy and non-urban class user's accuracy revealed that five urban maps had the highest values of more than 91%: GLCNMO 2013, GLCNMO 2008, GLCNMO 2003, and MCD12Q1. That fact indicates that where pixels were classified as urban, the majority were urban in the potential urban map. The non-urban potential urban map pixels mostly concurred with non-urban classified pixels in the five urban maps. The lower non-urban class producer's accuracy and urban class user's accuracy were exhibited by UMD LC and GLC2000, presenting how both of their classification systems were unsuccessful at mapping numerous urban areas mapped by the potential urban map.
For accurate and updated urban mapping, the results revealed that the method used for this study is efficient for producing urban maps on a continental scale, with high accuracy and good agreement with validation data. The results of the accuracy assessment will serve as a guideline for future users, elucidating the differences among these products, enabling choices for appropriate applications at different fields of interest and facilitating the consideration of global urban maps that were examined here for urban area usage. Figure 6 presents two visual sensitivity analyses conducted to assess the impact of the optimal threshold values on different cities for extracting the actual urban areas of the Eurasian continent. In the first analysis, the threshold values of DMSP-OLS ( Figure 6(A)) were examined in Tokyo city, the results of this sensitivity analysis indicate that the choice of the optimum value of the DMSP-OLS threshold has a clear impact on extracting the urban extent in a good way. As the figure shows, the optimum threshold of DMSP-OLS of 20 agrees well with the source image of DMSP-OLS. This value is used for this study ( Table 2) for group 4, which are present high-income cities. The other two values of thresholds (5 and 50) do not match well with the DMSP-OLS image, respectively, showing overestimated or shortened urban areas.

Sensitivity analysis of threshold method
to Google Earth images based on sufficient interpretation experience, including 118 points urban and 153 points non-urban areas.
Related to the number of these validation points, as described in one earlier report of the literature , no previous systematic sampling was conducted for the entire world using such numerous independent units. Furthermore, for global or continental-scale land-cover maps, reference samples are based mostly on image interpretation. Most global land-cover maps are either cross-validated from training samples or are estimated with a few samples, on the order of a few hundred to thousands.
These validation points were used to create urban/ non-urban matrices to analyze the level of agreement and accuracy between the potential urban map and the other 10 urban maps of Eurasia. Figure 5 presents results of users, producers, and overall accuracies for both urban and non-urban classes that were derived from the error matrices of the existing urban maps used for this study. These results show the highest overall accuracy of 94% and kappa coefficient of 0.87 produced by the new urban map of Eurasia 2012-2013. In addition to that, this urban map has the largest user and producer accuracies for each urban and non-urban class among the other nine urban maps, which indicates that an accurate approach was used to extract both urban and non-urban areas.
The results indicated a similar level of overall classification accuracy of GLCNMO 2008, GLCNMO 2003, and MODIS 500 m (collection 5) of 83-89%. Furthermore, a similar level of kappa coefficient, where GLCNMO 2008 has 0.75 and MCD12Q1 has 0.78. GLCNMO 2003 has 0.67 and MODIS 500 m (collection 5) has 0.64. Conversely, UMD LC has the lowest overall accuracy and kappa coefficient of 58% and 0.1, respectively, with low user and producer accuracies because of missing urban areas in the classification system. Very similar levels of overall accuracy of 72%-75% were exhibited, respectively, by GlobCover v.2.3, GlobCover v.2.2, MOD12Q1 V004, and GLC2000, as well as kappa coefficients of 0.47, 0.44, 0.43, and 0.40. Figure 5. results of accuracy assessment of urban maps produced from 10 products.
Three comparative analyses applied in this study showed large variations among the 10 existing urban maps. However, the accuracy assessment revealed that the urban map of Eurasia produced in this study has the highest overall accuracy of 94% and a kappa coefficient of 0.87 among the other existing urban maps for the Eurasian continent.
This urban map is a part of the next version of GLCNMO Ver. 3. Consequently, the future global urban map of GLCNMO ver. 3 will set different threshold values for other continents, depending on GDP, population density, and the urbanization level of regions.
Continuous mapping of regional urban areas can provide users, the scientific community, and global organizations with up-to-date information of urban extents, especially demonstrating that urban areas are rapidly expanding worldwide, with more than half of the world's population living in urban areas, and accommodating another two billion people by 2030 (UN 2011). Therefore, a great need exists for timely information related to the location, growth, and extent of urban areas using earth observation satellite-based techniques.
The ranges of GDP thresholds values ( Table 2) have examined in London city ( Figure 6(B)), the highest threshold values (>20,000 USD) which presents the high income of GDP derived from high-resolution imagery clearly concentrated in the centre of London. The upper middle income of GDP, which has a threshold value of (5000-20,000) USD, appears on the outskirts of London. This analysis indicates that the sensitivity of the optimal thresholds is high and that it performs well in extracting the appropriate urban extent of Eurasian urban areas.

Conclusions
In this study, a new urban map of Eurasia of 2012-2013 was produced by assigning different threshold values to delineate the extent of urban areas from coarse spatial resolution satellite imagery. Specifically, four threshold values were decided based on GDP per capita to extract the candidates of urban areas at a continental scale by setting higher threshold values of DMSP, ISA, and MODIS-NDVI to higher GDP countries, and by assigning a higher threshold value of population density to lower GDP countries. Nguyen Thanh Hoan received the PhD degree in Remote Sensing from Chiba University, Japan in 2010. Currently, he is Head of Department of Environmental Study and Analysis at Institute of Geography, Vietnamese Academy of Science and Technology. His research interests are forest, land cover monitoring, and environmental change.
Ahmad Al-Hanbali received his PhD of remote sensing from Chiba University in 2007. Currently, he is a consultant at Geosphere Environmental Technology Corp. His research interest is in remote sensing and geographic information system (GIS) and their applications in water resources, land use/ cover analysis, spatial environmental modeling, spatial decision support systems, and multi-criteria decision-making.
Bai Xiulian is a PhD candidate at the Center for Environmental Remote Sensing, Chiba University. Her research interests are land use/cover change detection and bare area mapping using remote sensing data.