Open land-use map: a regional land-use mapping strategy for incorporating OpenStreetMap with earth observations

Abstract Producing a land-use map at the regional scale is a computationally heavy task, yet such maps are critical to most landowners, researchers, and decision-makers, enabling them to make informed decisions for varying objectives. There are two major difficulties in generating land classification maps at the regional scale: the need for large data-sets of training points and the high computation cost in terms of both money and time. Volunteered Geographic Information (VGI) opens a new era in mapping and visualizing the physical world by providing an open-access database of valuable georeferenced information collected by volunteer citizens. As one of the most well-known VGI initiatives, OpenStreetMap (OSM) contributes not only road network distribution information but also the potential for using these data to justify and delineate land patterns. Whereas most large-scale mapping approaches, including those at regional and national scales, confuse “land cover” and “land-use”, or build the land-use database from modeled land cover data-sets, in this study we clearly distinguished land-use from land cover. By focusing on our prime objective of mapping land-use and management practices, a robust regional land-use mapping approach was developed by integrating OSM data with earth observation remote sensing imagery. Our approach incorporates a vital temporal component into large-scale land-use mapping while effectively eliminating the typically burdensome computation and time/money demands of such work. Furthermore, the approach produced robust results in our study area: the overall internal accuracy of the classifier was 95.2% and the external accuracy of the classifier was 74.8%.


Land-use mapping at regional scale
Land-use maps derived from remote sensing imagery play a vital role in monitoring human-environmental interactions such as landscape changes, ecological services (conservation), and urban planning and management (Lambin et al. 2001; Agarwal et al. 2002; Weng 2002; Abdullah and Nakagoshi 2007; Sumarga and Hein 2014; Hegazy and Kaloop 2015). Whereas land cover maps represent the observed biophysical cover of the Earth's surface, land-use maps describe the arrangements, activities, and inputs people undertake within a particular land cover type to produce, modify, or maintain it. These specific aspects of land-use patterns indicate the challenge of establishing distinctive use attributes and accurately mapping them. Ground-validated data are therefore essential to verify remotely sensed data used to infer land-use characteristics. In practice, there is a gap between remote sensing earth observations and their translation into subjective mapping products depicting how the land is used and affected by human activity. Remotely sensed data-sets can be used to monitor land cover dynamics, but are insufficient on their own for deriving land-use characteristics, that is, the manner in which people utilize, and thus modify, the land on the ground. The latter is the concept of "land-use", which is distinct from, and often confused with, the term "land cover". Land-use reflects and results from anthropogenic activities on the Earth's surface. The distinction between land cover and land-use poses a challenge, and calls for a strategy, for mapping and monitoring landscape changes and processes.
In modern history, ecosystems have been strongly affected by anthropogenic factors (Folke, Holling, and Perrings 1996; Vitousek et al. 1997), including agriculture, building construction and urban expansion, forest timber extraction, and preservation systems (national and state parks) (Matson et al. 1997; Hartley 2002). Macrosystems of forests, croplands, and waterways are driven by human need and associated activities (Foley et al. 2005). The interactions between, and the effects of, different land-use types may be forces of landscape-wide importance. For example, in the Southeastern United States, significant expansions of urban areas may convert forested land to urban uses, and pine plantation (forestry) may come from cropland Greis 2002, 2013). These kinds of landscape conversions could represent macrosystem changes, depending on scale and extent, and can have immediate and local implications for landowners and management practices.
Although land-use mapping techniques are well studied and have been applied to mapping most developed areas from different perspectives and at local scales (Lambin et al. 2001; Bryan, Barry, and Marvanek 2009; Bateman et al. 2013; Lawler et al. 2014), it is challenging to map heterogeneous land-use at a regional scale with high spatial resolution. At the macrosystems scale, it becomes more difficult to map land-use due to the lack of training samples and validation points and the need for heavy computing. Despite these obstacles, successful regional scale land-use mapping projects have been accomplished. The Australian Collaborative Land-use Mapping Program mapped land-use in the 1.73 million km² of Queensland using a Markov chain Monte Carlo machine learning technique (Lesslie, Barson, and Smith 2006; Bryan, Barry, and Marvanek 2009). The integration of remote sensing data with Volunteered Geographic Information (VGI) data platforms such as OpenStreetMap (OSM) and cloud computing platforms such as Google Earth Engine (GEE) provides a significant resource for land cover and land-use mapping and related research. Moreover, combining remote sensing data and OSM constitutes a powerful tool to monitor, characterize, and quantify the landscape. The product will be an important source of information for researchers and policy-makers investigating land parcels.
Supervised classification has proven to be an efficient tool for mapping land patterns and land changes accompanying the spatial and temporal configuration of landscape heterogeneity (Congalton 1991). However, it becomes ineffective or inapplicable at the regional scale and for fine-resolution mapping because the collection of training data (i.e. ground-validated data) is costly, time-consuming, and difficult to automate. The time needed to collect and process in situ training sites significantly slows the use of such methods for mapping regional land patterns.

Volunteered Geographic Information as a source of training data
VGI marks a new era in mapping and visualizing our world, and its data and applications have grown significantly during the last decade (Elwood 2008;Haklay, Singleton, and Parker 2008;Zook et al. 2010). OSM is one of the well-supported VGIs and has been studied and applied in multiple disciplines, but there remain large quantities of information that need to be investigated.
OSM has unique advantages that make it suitable for a wider range of applications than the official geographic databases. OSM has the capability to create superior maps when considering temporal change trajectories by providing "up-to-date" data (Estima and Painho 2013). From a geographic perspective, this is significant because scaling has always been a major issue in the mapping and monitoring of the Earth. For the research undertaken in this study, OSM is an attractive choice to achieve the research objectives because of the growing coverage of big spatial data and cloud computing capabilities (Zook et al. 2010).
OSM contains not only road network data but also land-use information, which might be derived from a combination of other ground-level features. Previous studies have shown that OSM is a valuable and structured data source for mapping land cover and land-use by providing high positional accuracy in comparison with the corresponding commercial data-sets Painho 2013, 2015; Jokar et al. 2013; See et al. 2013, 2015; Johnson and Iizuka 2016). To improve the value and potential of data collection and mapping, OSM provides a VGI platform to which many volunteers can contribute and collaborate. Moreover, the archived OSM data-sets enable the study of past land changes based on multi-temporal trajectories (Neis, Zielstra, and Zipf 2012).
Recent studies have investigated the spatial distribution balance of OSM-trained classification maps. An agreement of 76% was found in a study based in Portugal, and agreements of 64% and 77% were found in two cities in Germany, as compared with the corresponding official land-use/cover databases (Estima and Painho 2013; Jokar, Helbich, et al. 2015; Jokar, Mooney, et al. 2015). However, most OSM land cover and land-use mapping applications have focused on urban areas at the local scale, without using earth observation databases for cross-referencing. Moreover, to produce more accurate and reliable maps of land-use patterns, the temporal features of the land should also be considered. No previous study has considered and analyzed temporal land changes, which are critical for reflecting human activities. Different land-use types have distinctive time series signals that can be derived from earth observation images. For example, residential areas typically are not characterized by rapid or seasonal changes, whereas cropland and managed forestland have a high temporal variation, which can be observed in their respective time series signals.
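As a rough illustration of how such time series signals might separate land-use types, the sketch below compares the temporal variability of a vegetation index at two sites. The monthly values and the `temporal_std` helper are hypothetical and chosen only for illustration; they are not data or code from this study.

```python
from statistics import mean, pstdev


def temporal_std(series):
    """Population standard deviation of one site's time series."""
    return pstdev(series)


# Hypothetical monthly vegetation-index values (illustrative only).
residential = [0.30, 0.31, 0.30, 0.32, 0.31, 0.30,
               0.31, 0.32, 0.31, 0.30, 0.30, 0.31]
cropland = [0.15, 0.18, 0.30, 0.50, 0.70, 0.80,
            0.75, 0.55, 0.30, 0.18, 0.15, 0.14]

for name, series in (("residential", residential), ("cropland", cropland)):
    print(f"{name}: mean={mean(series):.2f}, std={temporal_std(series):.3f}")
```

Under these assumed values, the cropland series has a much larger standard deviation than the residential one, which is the kind of contrast a classifier could exploit when temporal covariates are included.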

Incorporating OpenStreetMap with earth observations
Remote sensing techniques have been widely used for observing and monitoring landscape changes and terrestrial features, and can greatly decrease the time and cost of large-scale mapping. Moreover, Web 2.0 has brought a revolution in how people collect, map, and analyze geo-located data (Haklay, Singleton, and Parker 2008). For earth observation data-sets, GEE is a newly developed mapping and analysis platform that enables large-scale spatial analysis through its special infrastructure and automatically parallelized computation techniques (Johansen, Phinn, and Taylor 2015; Padarian, Minasny, and McBratney 2015; Patel et al. 2015). By storing and bringing together large amounts of earth observation data on Earth Engine's servers, GEE can analyze big data "on-the-fly" (Yu and Gong 2012). Personal databases can also be uploaded to the server as additional assets for time series generation, zonal statistics, spectral analysis, and many machine learning classification techniques.
GEE has been studied for its strong ability to map landscapes at the regional scale in both time and space. Yang generated a Mayan forest land cover map by building OSM-derived training samples into a high-dimensional Landsat-based Random Forest classifier of the whole Yucatan Peninsula, and then delineated land patterns with road networks (Yang 2017). GEE was also applied to monitoring land changes at the regional scale based on Landsat 30-m spatial resolution (Soulard et al. 2016; Azzari and Lobell 2017). The major objective of this research is to incorporate OSM into regional land-use mapping using GEE to investigate and quantify land-use types in the Southeastern United States. The broader aim of this study is to develop an integrative and innovative regional scale mapping strategy that can be replicated for similar research purposes. Overall, the methodology presented here seeks to improve land-use mapping strategies by reducing economic and temporal burdens, and to contribute to the science of monitoring human-environment interactions.

Study area
This study focuses on the land-use patterns of the heterogeneous Southeastern United States. The Southeastern United States has high landscape heterogeneity, with heavily managed forestlands, highly developed agricultural lands, and multiple metropolitan areas. Human activities are transforming and altering land patterns and structures in both negative and positive manners. Based on EPA ecoregion descriptions (Omernik and Griffith 2014), land cover in the Southeastern United States is a mosaic of cropland, pasture, woodland and forests, and wetlands. For this research, the study area consists of one Landsat scene of the Worldwide Reference System II (WRS-2), Path 17 Row 39 (Figure 1). This Landsat scene, which covers an area of approximately 185 km², is an illustrative example of the land patterns of the Southeastern United States. The heterogeneous landscape in the study area consists of a mixture of highly developed cropland, plantation forests, multiple metropolitan areas, and industrial and commercial lands. Human land-use activities and management practices are transforming landscape patterns and processes in the region, resulting in both negative and positive consequences for stakeholders and ecosystems.

Methodology
This research proposes a methodology to produce a regional scale land-use map by extracting training data from OSM. Random Forest is chosen as the classifier because it has become a widely used algorithm for remote sensing image classification (Breiman 2001). Random Forest has the ability to handle high-dimensional raster databases (Senf, Hostert, and van der Linden 2012), which suits the framework of this research.
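Random Forest (Breiman 2001) trains each tree on a bootstrap sample of the training data; the samples a given tree never sees are its out-of-bag samples, which provide an internal error estimate. The sketch below illustrates only this resampling step, not a full classifier and not the GEE implementation used in this study. The function name, seed, and sample count are hypothetical.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw a bootstrap sample (with replacement) of sample indices.

    Returns the in-bag index list and the sorted out-of-bag indices,
    i.e. the samples that were never drawn.
    """
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_samples = 1000        # hypothetical training-set size
in_bag, out_of_bag = bootstrap_split(n_samples, rng)
oob_fraction = len(out_of_bag) / n_samples
print(f"out-of-bag fraction: {oob_fraction:.3f}")
```

For large training sets the out-of-bag fraction tends toward roughly one third of the samples per tree, so every tree can be evaluated on data it was not trained on without setting aside a separate validation set.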

Data sources
OpenStreetMap data structure
OSM is a fully free and openly accessible map of road network data. All OSM data were downloaded from the Geofabrik website (http://download.geofabrik.de). Community volunteers collect and submit geographic information to the global OSM database (Ciepluch et al. 2009). OSM data quality has been broadly assessed with consistently positive reviews. Girres and Touya (2010) performed a spatial analysis of the quality of OSM street network representations in the UK and France, respectively, through a comparison to ground-truth data obtained from the corresponding national mapping agency. Both case studies found that, on average, the quality of the data was reasonably good but exhibited significant spatial heterogeneity. Neis, Zielstra, and Zipf (2012) analyzed how the quality of the OSM street network in different regions of Germany had changed between the years 2007-2011. This multi-temporal study demonstrated that the quality and accuracy of OSM data have improved over time, and more recent studies confirm this assessment (Neis, Zielstra, and Zipf 2012). The OSM database contains a full coverage of the area of interest for this research, with the following subdata-sets: OSM-Places, OSM-Railways, OSM-Roads, OSM-Waterways, OSM-Buildings, OSM-Landuse, and OSM-Nature. In this study, the OSM feature classes applied for mapping land-use patterns include OSM-Places, OSM-Landuse, and OSM-Nature.
Underlying map shows the EPA Level III ecoregions (EPA). This map was created using ArcGIS® software by Esri. ArcGIS® and ArcMap™ are the intellectual property of Esri and are used herein under license. Copyright © Esri. All rights reserved. For more information about Esri® software, please visit http://www.esri.com
The OSM-Landuse feature class is used to describe the human use of land, presented as a polygon class consisting of forests, residential areas, and some industrial areas. Figure 2 shows the composition of OSM-Landuse labels. In the study area of Landsat scene Path 17 Row 39, there are 26 different types of land-use labels on the land surface, which illustrate how human activities affect landscapes. Labels of "residential" contribute 43% (675 sites), followed by "commercial" (14%, 214 sites) and "farm" (11%, 175 sites). OSM-Nature is used to describe natural physical land features, including ones that have been modified by humans, and covers most waterbody and conservation areas. In OSM-Nature, all the waterbodies, parks, and recreation areas are represented as polygons. As shown in Figure 3, labels of "water" contribute 75% (2259 sites), followed by "forest" (17%, 503 sites) and "park" (8%, 228 sites). The OSM-Places feature class includes urban and suburb landmarks and attractions, with the following place labels for the study area shown in Figure 4: labels of "hamlet" contribute 66% (585 sites), followed by "island" (24%, 211 sites) and "village" (3%, 30 sites). Based on the general statistics of the data structure of OSM labels, it was determined that OSM-Nature, OSM-Places, and OSM-Landuse represent a variety of human-made land-use types, and the quantity of data points is large enough to suffice for the necessary inputs for large-scale classification mapping.
Figure 2. Breakdown of OSM-Landuse labels of study area.
The OSM-Landuse, OSM-Nature, and OSM-Places feature classes were reclassified and converted into five major land-use types: "residential", "forestry", "cropland", "commercial/industrial", and "waterbody". Although "waterbody" may not be a land-use type, it is included because it accounts for a significant proportion of Florida land cover. Furthermore, the inclusion of the "waterbody" class in the land-use map serves to avoid classification bias. Users may also decide to apply different classification frameworks based on their specific research questions and understanding of the study area attributes.

Earth observations
Earth observation is one of the most essential tools for monitoring the earth's surface and its dynamics at regional to global scales. Although earth observation data and remote sensing techniques allow for extensive mapping of characteristics of the land surface, mapping at large scale in both time and space can be financially burdensome and time-consuming. GEE provides online access to worldwide coverage of a vast collection of remote sensing data-sets. The GEE API is currently available upon request to a number of groups for testing and applying large-scale mapping on the cloud server.
The spatial and temporal covariates used in the land-use classification for this research include multi-temporal land reflectance, forest canopy height, DEM, mean-EVI, and land ownership. The spatial and temporal covariates are important indicators of land-use and ecosystems, and the remotely sensed inputs are consistently and accurately derived. All the data used for the land-use map produced in this research were up-to-date, subset to the boundaries of the study area, and resampled to 30-m spatial resolution. Additionally, all the data used as spatial and temporal covariates cover the entire study area and can be extended to the global scale, thereby making the research methods reproducible for other regions of interest, given similar research objectives. The earth observation images used here consist of single-time-step Landsat images and multi-temporal composite satellite images. The temporal heterogeneity of the data-sets combines to reflect the characteristics of the landscape of the study area. Furthermore, the heterogeneous data allow a supervised algorithmic model (i.e. Random Forest) to be trained in order to extract relevant thematic classes of land-use/land cover from the satellite imagery. Classification is performed on a data-set consisting of the following five major sources (Table 1).
nongovernment agencies are integrated for land ownership type mapping. Landowners are classified as public and private. There are six sub-types of public ownership: federally protected, federal, state-protected, state, military, and local. In addition, there are four sub-types of private ownership: nongovernment organization, private, family, and corporate. The ownership classification strategy is based on different management objectives, as well as landowner skills, budgets, and interests. The U.S. Protected Areas (PADUS) is the primary data source to identify public ownership.
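One of the covariates named in this section is mean-EVI. As a minimal sketch, the code below computes the Enhanced Vegetation Index with its commonly used coefficients and averages it over several dates for one pixel. Whether this exact formulation was used in the study is an assumption, and the reflectance values and function names are hypothetical.

```python
def evi(nir, red, blue):
    """Enhanced Vegetation Index from surface reflectance values in [0, 1].

    Uses the commonly cited coefficients: gain 2.5, C1 = 6, C2 = 7.5, L = 1.
    """
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def mean_evi(observations):
    """Average EVI over (nir, red, blue) observations of a single pixel."""
    values = [evi(nir, red, blue) for nir, red, blue in observations]
    return sum(values) / len(values)


# Hypothetical reflectance triples for one pixel on three dates.
pixel_series = [(0.45, 0.05, 0.03), (0.40, 0.06, 0.04), (0.30, 0.10, 0.05)]
print(f"mean EVI: {mean_evi(pixel_series):.3f}")
```

Averaging over dates gives a single value per pixel that can be stacked with the other 30-m covariates as one classifier input band.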
To summarize, Figure 5 shows the examples of covariates used to generate land-use maps: including Shuttle Radar Topography Mission (SRTM) digital elevation at 30-m resolution (Figure 5 (b)). We therefore added the layer of OSM-Places. Another challenge arose from the imbalanced class distribution of different extracted featured from OSM. Thus, a layer of managed forest from Google Earth visual interpretation was added. Google Earth was used because it provides higher spatial resolution with a time span of more than 20 years. Therefore, more spatial information can be acquired about what types of land practices occur during the spatio-temporal validation of this study. Further details about the training sample setting strategy is depicted in Figure 8, which shows the workflow reflecting the conversion of OSM features to land-use labels. In this study, for our study area, there are 3068 training samples collected from OSM features, and 380 from visual interpretation from Google Earth over

Training samples setup
The randomly selected training points were converted into a Keyhole Markup Language file. Each chosen point centered at a 30 × 30 m polygon, one-pixel image size. The chosen points and associated polygons were overlaid in Google Earth virtual globe for visualization. The selected points were identified and assigned classes based on the OSM feature types and their surrounding areas. The points were then split into training and testing data-sets for model validation. To avoid the imbalance caused by polygon weight within the training sample (Visconti et al. 2013), OSM-Nature and OSM-Landuse polygons feature classes were converted into points features based on their respective geographic centroids. In Figure 6, the spatial distribution of the training samples of each feature including OSM-Nature, OSM-Landuse, OSM-Places, and Google Earth visualization samples are shown, and Figure 7 shows the visualization of all • To implement advanced pixel-based supervised classifier with high accuracy. • To estimate and measure the variable importance, and • To compare out-of-bag error to independent validation to avoid classification bias.
All data used in this study as inputs for the classifier are completely embedded in GEE. In other words, the GEE IDs of each database used in this study are accessible by via GEE platform, and can be used for to reproduce the methodology presented here and for other study regions. Moreover, the use of GEE enables regional land cover mapping to be accomplished with greater ease than other platforms, as many spatial and temporal covariates are already available in the GEE server and the calculation itself is parallelized automatically to optimize the computing time cost. In addition, the spatial resolution can be resampled easily to match or fit time (3448 in total). The training samples extracted from OSM were re-classified into five classes: (1) residential, (2) managed forest, (3) cropland, (4) commercial/industrial, and (5) waterbody. The training sample set of this study is visualized in Figure 7, which represents the spatial distribution of the training points geographically and illustrates the degree of urbanization.

Supervised classification
The classifier of Random Forest is considered for classification of multi-source remote sensing data mapping in this study because it was proved to perform relatively well for the integration of imagery data and collaborate the environmental covariates into classifier to improve its accuracy in such a complex landscape (Rodriguez-Galiano et al. 2012). For classifying land-use at the regional scale, Random Forest has several advantages over the others and the most notable ones (Breiman 2001) are: Figure 6. spatial distribution of rf training from osm and Google earth engine: (a) centroid center of osm-landuse feature; (b) centroid center of osm-nature feature; (c) managed forest sites from Google earth visualization; (d) centroid center of osm-place feature. the base map was developed by esri using Here data, Delorme basemap layers, openstreetmap contributors, esri basemap data, and select data from the Gis user community. in north america coverage is provided from level 14 (1: 36k scale) through level 16 (1:9k scale). http://goto.arcgisonline.com/maps/World_light_Gray_Base. this map was created using arcGis ® software by esri. arcGis ® and arcmap™ are the intellectual property of esri and are used herein under license. copyright © esri. all rights reserved. for more information about esri ® software, please visit http://www.esri.com (2) Validate with Google Earth on both time and space for the training data-set. (3) Collaborate the spatial and temporal covariates as inputs for Random Forest classifier, run Random Forest classifier using GEE API, and analyze the classification accuracy for each classification type. (4) Extract the forest land from the resulting map at Step (3), superimpose rasterized OSM road networks, and run cluster analysis to generate land-use map.

Land-use classification results
The results of the Random Forest classifier show land mosaics that describe the proportion of the study area that were classified into each land-use type. The landscape of the Landsat scene Path 17 Row 39, consisting of 185 km 2 , is comprised of 62% forestland (2.41 × 10 6 ha), followed by 16% cropland, 3% commercial/industrial lands, and 8% residential areas. Using GEE, the total processing time was 126 s for the entire study area. This process could be extended extensively to be applied at a larger scale, such as continental and even global. The map of the dominant land-use types in the study area is shown in Figure 9 and the internal validation error matrix is provided as Table 2. Figure 9 depicts the land-use patterns of the study area. Managed forest dominates much of the study region, surrounded by with other databases. To convert the uploaded training samples from OSM to GEE, a fusion table was generated representing the geometry and class types of the training samples extracted from OSM. Therefore, the OSM training sample used for this study are embedded in GEE for use by other users. The four-step procedure for mapping land-use is summarized below: (1) Explore the data-sets of OSM-Nature, OSM-Landuse, and OSM-Places, set the land-use classification strategy based on research objective, and reclassify and generate the training sample data-set. Figure 7. spatial distribution of training sites over study area. the base map was developed by esri using Here data, Delorme basemap layers, openstreetmap contributors, esri basemap data, and select data from the Gis user community. in north america coverage is provided from level 14 (1: 36k scale) through level 16 (1:9k scale). http://goto.arcgisonline. com/maps/World_light_Gray_Base. this map was created using arcGis® software by esri. arcGis ® and arcmap™ are the intellectual property of esri and are used herein under license. copyright © esri. all rights reserved. 
for more information about esri ® software, http://www.esri.com   representable zoomed-in view of land-use mosaics located in Alachua County, Florida. Figure 10(a) is the Google Earth base map and Figure 10(b) is the land-use map of the same area, showing the detailed differentiation of land-use types. The land-use mosaics in Figure 10 demonstrates the lack of detailed accuracy of land-use classifications in the absence of "ground-level" training samples, such as those that were derived from OSM for this study. This highlights the significance and applicability of the integration of OSM and earth observation remote sensing data-sets in achieving highly accurate land-use maps.
In this research, the out-of-bag error is 4.8%, yielding a total classification accuracy of 95.2% for the Random Forest classifier building technique. In comparison to the Google Earth base map, the results of this study indicate that the Random Forest classifier performed well in distinguishing residential and commercial areas, which often proves challenging. Furthermore, this landuse classification and mapping strategy is able to detect residential areas that are under heavy canopy cover. In the forthcoming steps, landscape ecology tools are used to analysis the land-use patterns of the study area and investigate the drivers of the land-use changes.

Accuracy assessment
To perform accuracy assessment on our open land-use maps, we also performed an accuracy assessment for the OSM-derived training samples as an out-of-bag validation to get estimates of class noise levels in the training data (see Table 2). We constructed an error matrix from the results of our land practice map, and calculated the commission, omission, and the overall errors (Table 2). Table 3 shows the confusion matrices for both internal out-of-bag accuracy and external accuracy. Producer's residential areas and with some residential land penetrating forest cover. Urban build-up areas, including commercial/industrial, are seen scattered among other land-use types. Cropland/agriculture land-use is seen interwoven with residential land mainly in the western region of the image. However, smaller patches of agricultural land-use are notably evidenced to be encroaching on denser forest land cover. Figure 10 depicts a human land-use needs and associated activities. Land ownership is an important link between human and environmental factors, especially at the regional scale. At the macrosystems scale, forest ownership patterns explain different types of land management practices and trajectories of land-use change (Turner 1989). In Southeastern United States, forest management is the main driving force of the forest structure, which affects forest ecosystem services (Becknell et al. 2015), and alters forest properties and processes. Southeastern forest system is a fire-dominated system with native trees adapted to short-period stand-clearing events. We classify forest management into four categories: production management ecological management, passive management, and preservation management. The major forest management types are production and passive management due to the dominant ownership of private owners, logging companies, and investment institutions (e.g. 
Real Estate Investment Trusts (REITs) and Timber Investment Management Organizations (TIMOs)) (Zhang, Butler, and Nagubadi 2012). The land-use map produced at the regional scale will help landowners, governors, and decision-makers for better understanding the land patterns, processes, and consequences produced by diverse land management practices.

Conclusions and perspectives
In this study, we proposed a strategy to mapping landuse and management practices using cloud-computing and combining earth observations with OSM-derived training samples. OSM-Nature, OSM-Landuse, and OSM-Places were reclassified and converted into different major land-use class types. The mapping strategy can be designed based on your understanding of the land of your study of interest. Moreover, the map has more scientific sound, as it is ready for all those types of analysis from landscape ecology, biogeography. It can also be a base map for designated study as well.
OSM is an interesting new platform, which could be implemented in land cover mapping and assessing, because of its geo-located features and labels. The combination of earth observations with OSM features will be a tool for large-scale mapping and will present great opportunities in the creation ideas because of its unique workflow to fulfill specific mapping needs. The advantages of incorporating and deriving training samples from OSM can significantly enlarge the amount of training samples. The Random Forest was used for classification of a multi-source remote sensing based on geo-located supervised training data-set derived from OSM. In experiments, Random Forest performed well with the out-of-bag validation accuracy of 95.2% and external validation accuracy of 74.8%. Results also show that the Random Forest algorithm yields accurate landuse classifications at a regional scale, with 74.8% overall accuracy and a Kappa index of 69.7%. The time cost of accuracy (omission error) and user's accuracy (commission error) were calculated for each land-use class. The internal validation produced the out-of-bag error during the training of Random Forest classifier. Rows represent the reference data and columns represent the classification results. Producer's accuracy and user's accuracy are listed in the last row and column, respectively. The overall accuracy of the thematic map was 95.2%, and overall 95.1% of the forestry pixels were correctly identified as forest. Waterbody shows the lowest user's accuracy of 93.3%, which is because of the light weight in the training samples, and VGI contributors define waterbody differently in OSM-Nature feature layer.
Accuracy assessment was carried out to evaluate the classification results. The reference test pixels were randomly distributed across the study area, and a total of 250 test points were used for evaluation. The external validation was conducted with around 50 randomly sampled points per land-use class derived from Google Earth. The assessment result is shown in Table 3, which presents the error matrix and the conditional coefficients for the five classes. Table 3 also reports the Kappa index together with the per-class analysis. According to Table 3, across the major land-use types the Kappa index is 69.7% and the overall accuracy is 74.8%. This means that the Open Land-use map and the Google Earth reference data agree at a moderate level. Class-based analysis reveals user's accuracies of 77.6%, 78.2%, 71.4%, and 87.5% for cropland, forestry, residential, and waterbody, respectively. Therefore, it was concluded that these major land-use classes can be mapped with a relatively good level of reliability. Commercial land had a low user's accuracy of 57.4%, because the many mills located in the study area in the Southeastern United States are confused with the forestry land-use type.
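The statistics reported from an error matrix (overall accuracy, producer's and user's accuracy, and the Kappa index) can be computed as in the following sketch. The function name `accuracy_metrics` and the three-class `toy_matrix` are illustrative assumptions; the numbers are made up for demonstration and are not the values in Table 3. Rows are taken as reference data and columns as classification results.

```python
def accuracy_metrics(matrix):
    """Compute accuracy statistics from a square error matrix.

    matrix[i][j] is the number of test points whose reference class is i
    (row) and whose mapped class is j (column).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diagonal = [matrix[k][k] for k in range(n)]

    # Overall accuracy: share of test points on the diagonal.
    overall = sum(diagonal) / total
    # Producer's accuracy: correct / reference total (1 - omission error).
    producers = [diagonal[k] / row_totals[k] for k in range(n)]
    # User's accuracy: correct / mapped total (1 - commission error).
    users = [diagonal[k] / col_totals[k] for k in range(n)]
    # Cohen's Kappa: agreement beyond what the marginals give by chance.
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - chance) / (1 - chance)
    return {"overall": overall, "producers": producers,
            "users": users, "kappa": kappa}


# Made-up three-class matrix with 50 reference points per class.
toy_matrix = [
    [40, 5, 5],
    [4, 45, 1],
    [6, 0, 44],
]
metrics = accuracy_metrics(toy_matrix)
print(round(metrics["overall"], 3), round(metrics["kappa"], 3))
```

Because Kappa discounts chance agreement, it is always lower than the overall accuracy whenever the classification is imperfect, which matches the relationship between the two values reported in the text.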
The mapping strategy presented in this study is recommended mainly for mapping relatively broad land-use classes at large scales. Given the noise tolerance of the Random Forest classifier, this mapping strategy also has the potential to generate historical multi-temporal land-use maps for change detection and monitoring.

Implications for land management
Based on the results of this study, the area of forestry, cropland, and the other major land-use types can be estimated from the land-use pattern map. Such quantitative estimates of the landscape are important for evaluating the impact of current land-use and may aid in assessing human impact on natural resources. Moreover, incorporating earth observation imagery provides a structured and valuable source of data for monitoring land change at the regional scale. Land management practices, which are usually driven by land ownership types and vice versa, can reflect the objectives of different ownership types, including varied applications of 3S technology in resource and environmental management, landscape ecology, and land-use simulations.
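As a rough illustration of the area estimate mentioned above, per-class area can be derived from pixel counts on the classified map. The sketch assumes a 30 m pixel (the nominal Landsat multispectral resolution); the helper `class_areas_ha` and the pixel counts are hypothetical and are not results from this study.

```python
PIXEL_SIZE_M = 30.0        # assumed pixel edge length in metres (Landsat)
M2_PER_HECTARE = 10_000.0


def class_areas_ha(pixel_counts, pixel_size_m=PIXEL_SIZE_M):
    """Convert per-class pixel counts into areas in hectares."""
    pixel_area_ha = pixel_size_m ** 2 / M2_PER_HECTARE
    return {name: count * pixel_area_ha
            for name, count in pixel_counts.items()}


def class_shares(pixel_counts):
    """Return each class's share of all classified pixels."""
    total = sum(pixel_counts.values())
    return {name: count / total for name, count in pixel_counts.items()}


# Made-up pixel counts for demonstration only.
counts = {"forestry": 1_000_000, "cropland": 250_000}
areas = class_areas_ha(counts)
shares = class_shares(counts)
print(areas, shares)
```

Simple pixel counting ignores classification error; where unbiased area estimates are needed, the counts would normally be adjusted using the error matrix.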

Di Yang http://orcid.org/0000-0002-4010-6163
Chiung-Shiuan Fu http://orcid.org/0000-0003-0528-6231

The time cost of mapping a single Landsat scene, which covers 1.08 × 10^9 pixels, is 126 s on the GEE platform. This mapping strategy can easily be extended to larger scales such as the state, continental, and even global level. Given the temporal availability of our covariates, the up-to-date land-use pattern mapping strategy can be extended to a multi-temporal change database. It also has the capability to make predictions based on spatio-temporal land-use pattern changes and patch size distributions. There is still a strong and urgent need for a land-use map with continental coverage and higher temporal resolution, which would reflect land-use transition processes at shorter intervals. In addition, in order to abate the ecological impacts caused by the rapidly increasing construction of road networks, better road network designs are needed that combine top-down and bottom-up processes, retaining large forest patches and decelerating the parcellation rate. The use of open data sources enables exploration of areas that do not have an official classification database. Researchers and practitioners can design their own mapping strategies based on their mapping objectives. The open land-use map can also use single or multiple classifiers to fulfill the mapping goals.
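The reported per-scene timing can be turned into a back-of-the-envelope throughput and scaling estimate. The two constants are the figures quoted in the text; the function names are hypothetical, and the assumption that processing time grows linearly with the number of scenes is ours and may not hold on a shared cloud platform.

```python
SECONDS_PER_SCENE = 126.0   # reported time to map one Landsat scene
PIXELS_PER_SCENE = 1.08e9   # reported number of pixels covered


def pixels_per_second():
    """Throughput implied by the reported single-scene timing."""
    return PIXELS_PER_SCENE / SECONDS_PER_SCENE


def estimated_seconds(n_scenes):
    """Estimated time for n_scenes, assuming strictly linear scaling."""
    return n_scenes * SECONDS_PER_SCENE


throughput = pixels_per_second()
print(f"{throughput:.3e} pixels per second")
print(f"{estimated_seconds(100) / 60:.0f} minutes for 100 scenes")
```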