FROM-GLC Plus: toward near real-time and multi-resolution land cover mapping

ABSTRACT Global land cover has undergone extensive and rapid changes as a result of human activities and climate change. These changes have had a significant impact on biodiversity, the surface energy balance, and sustainable development. Global land cover data underpins research on the development of earth system models, resource management, and evaluation of the ecological environment. However, there are limitations in the classification detail, spatial resolution, and rapid change monitoring capability of global land cover change data. Building on the earlier Global Land Cover Mapping (Finer Resolution Observation and Monitoring – Global Land Cover, FROM-GLC), we developed the improved Global Land Cover Change Monitoring Platform (FROM-GLC Plus) using methods such as multi-season sample space-time migration, multi-source data time series reconstruction, and machine learning. The FROM-GLC Plus system provides a capacity for producing global land cover change data set from the 1980s with flexibility in spatio–temporal details. The preliminary results show that FROM-GLC Plus provides a framework for near real-time land cover mapping at multi-temporal (annual to daily) and multi-resolution (30 m to sub-meter) levels.


Introduction
Land cover reflects the biophysical surface of the Earth (Anderson 1976;Fisher, Comber, and Wadsworth 2005;Gong et al. 2013;Yu et al. 2014a). It provides fundamental information on human-nature interactions and is a key component of the Earth system Running 2008). Land cover informs all nine societal benefit areas of the Group on Earth Observations (Herold et al. 2008), and the Global Climate and Global Terrestrial Observing Systems has recognized it as an important climate variable (Hansen et al. 2014). Changes in land cover have modified and reshaped the land surface over large areas, profoundly impacting global environments (Foley et al. 2005;Song et al. 2018), including biosphere-atmosphere interactions, land surface energy balances, hydrological and biogeochemical cycles, biodiversity, and ecosystem services (Grimm et al. 2008;Pielke 2005;Reyers et al. 2009;Sterling, Ducharne, and Polcher 2013). In recent years, there has been a growing demand for timely, refined, and updated land cover data. Such data recording land cover distribution and dynamics, not only contributes to earth system modeling and environmental change studies (Buchanan et al. 2009;Feddema et al. 2005;Jung et al. 2006;Yang et al. 2013) but also plays an important role in international policy-making and sustainable development research such as urban planning, public health, and food security Burke et al. 2021;GEO 2017;Gong et al. 2012;Liang et al. 2010;Seto, Parnell, and Elmqvist 2013;Xu et al. 2004).
Satellite remotely sensed data has been widely used for mapping large-scale land cover given its fullcoverage, all-weather, high-frequency, rapid, and dynamic advantages (Friedl et al. 2002;Gong et al. 2013;Yu et al. 2014b;Zhang et al. 2021). The past few decades have witnessed a series of global land cover (GLC) mapping efforts (Ban, Gong, and Giri 2015;Gómez, White, and Wulder 2016). As shown in Table  1, coarse spatial resolution GLC products included the 1-km International Geosphere-Biosphere Programme Data and Information System Cover map (DISCover) circa 1992 (Loveland et al. 2000); the 1-km University of Maryland land cover map (UMD) circa 1992 (Hansen et al. 2000); the 1-km Global Land Cover 2000 map (GLC2000) (Bartholome and Belward 2005); the 300-m European Space Agency Climate Change Initiative (ESA-CCI) land cover product from 1992 to 2020 (ESA 2017); the 500-m Moderate Resolution Imaging Spectroradiometer (MODIS) land cover product (MOD12Q1) from 2001 to 2020 (Friedl et al. 2010;Sulla-Menashe et al. 2019); and the 5-km annual dynamics of global land cover dataset (GLASS-GLC) (Liu et al. 2020b). In 2008, the United States Geological Survey (USGS) provided free access to the whole Landsat archive Wulder et al. 2012). This pioneering policy, along with advances in image processing and machine learning techniques, revolutionized the use of remote sensing data to map land cover. Since 2010, finer spatial resolution GLC products have been successfully generated and publicly released, including the 10-m and 30-m Finer Resolution Observation and Monitoring of Global Land Cover (FROM-GLC) products (Gong et al. , 2019, the 30-m global land cover data (Globeland30) (Chen et al. 2015), and the 30-m fine classification system global land cover product (GLC_FCS30) . In addition to these multi-category GLC datasets, thematic mapping has improved the classification accuracy of individual land cover categories, such as cropland (Fritz et al. 2015;Lu et al. 2020;Potapov et al. 2021), forests (Hansen et al. 2013;Shimada et al. 2014), water bodies (Ji et al. 2018;Pekel et al. 2016;Pickens et al. 2020), impervious surfaces He et al. 2019;Liu et al. 2020a;Pesaresi et al. 2013), and wetlands Murray et al. 2019).
However, the relatively low spatio-temporal details and poor consistency hinder the application of existing GLC products for fine-scale and continuous land cover analysis. At present, 10 m is the finest granularity achieved by GLC mapping, yet this is not sufficient for more sophisticated studies (Rioux et al. 2019;Verburg, Neumann, and Nol 2011). The highest temporal frequency for existing GLC products is usually one year (annual), and seasonal/monthly datasets are rare. Even high spatial resolution data are often accompanied by low temporal frequency, which reduces the applicability of remote sensing-based land cover mapping. Because of technical barriers and budget considerations, there is a tradeoff between spatial resolution and swath width in remote sensing instruments (Zhu et al. 2010). As a result, although constellations, such as Planet's, are the current way to overcome this problem, there are still few sensors that can provide both high spatial resolution and high temporal frequency . For instance, it takes 16 days to acquire two 30-m Landsat images of the same site but the 500-m MODIS sensor can monitor the same site twice a day. To compensate for this defect, algorithms have been developed to reconstruct time-series remote sensing data with missing information (Hilker et al. 2009;Zhu et al. 2010Zhu et al. , 2018. Data fusion is another promising approach that integrates temporal, spectral, and angular features from multiple sensors to build a seamless data cube with both high spatial and temporal resolution . The advances in spatio-temporal methods for remote sensing data fusion have been reviewed by Chen et al (2015) and Zhu et al. (2018). In brief, they may achieve satisfactory accuracy but they are highly reliant on reference image quality and integrity. In consequence, they have restricted applicability for largescale and long-term land cover mapping. Globeland30 (Chen et al. 2015) 2000/2010/2020 30 m Landsat, HJ-1 Globeland30 10. GLC_FCS30  1985-2020 (five-year interval) 30 m Landsat FAO-LCCS a Annual GLC products. IGBP: International Geosphere-Biosphere Programme; FAO-LCCS: Food and Agriculture Organization of the United Nations-Land Cover Classification System.
Sampling is also a neglected issue in GLC mapping because most existing campaigns adopt a supervised classification strategy that is highly dependent on input reference data of varying quality and quantities (Stumpf and Kerle 2011;Tuia, Persello, and Bruzzone 2016). Generally, an increase in the number and variety of training samples leads to more robust classification results (Gong et al. 2019). Nevertheless, sample collection is thought to be the most time-consuming and labor-intensive process in producing land cover maps (Li et al. 2019b;Li and Gong 2016a). Moreover, because samples are usually collected for a specific mapping task, the task must be repeated each time mapping is required for other regions or times. In general, there have not been sufficient attempts to accumulate, share, and migrate samples across projects. Recently, Gong et al. (2019) used a seasonal sample set collected from 30-m Landsat 8 images in 2015 to classify 10-m Sentinel-2 images in 2017. They found that the overall classification accuracy was up to 72.1%. Huang et al. (2020) developed an automatic training sample detection method by measuring the spectral similarity and spectral distance between the reference spectra and image spectra. Li et al. (2021) built an automatic phenology learning (APL) method which generated training samples based on similar temporal profiles of the same land cover type. These pioneering studies provide insights for the establishment of multi-temporal stable sample libraries that will assist with dynamic GLC mapping.
Another limitation with current GLC data is mapping flexibility. Cartography is a demand-driven process, in which the goal is to provide accurate and targeted maps for different users. Currently, most GLC datasets are produced using a predetermined classification system with fixed spatio-temporal resolutions. This may hamper the flexible use of data for multiple tasks. For example, climate change researchers normally require high-frequency data updates, while urban planners are more concerned with refined details on land cover/land use distributions, configurations, and structures. It is challenging to build a flexible and adaptable mapping framework to meet the needs of different research and practitioner groups.
In terms of the data update frequency, most GLC data with a 30 m spatial resolution are generated every five to ten years (Chen et al. 2015;Gong et al. 2013), but land surfaces are highly dynamic both within and across years. For example, impervious surfaces increase sharply in regions with rapid urbanization Li and Gong 2016b), but water bodies show inter-annual variation because of the dry and wet seasons (Pekel et al. 2016). Therefore, there is an increasing requirement to have near realtime updates for land cover data applications. To fulfill this requirement, the FAO developed decadal updated land cover data for 10 small regions in Africa and the Near-East to support cropland monitoring using satellite data with 30 m resolution, but these data have not been generated at a global level because of data availability and computing costs (https://wapor.apps. fao.org/home/WAPOR_2/1). However, near real-time remote sensing data processing has become possible recently, thanks to the rapid development of highperformance computing platforms. For example, the Microsoft Planetary Computer Land Cover Mapping platform is designed to develop instant land cover maps of the U.S. (Microsoft. 2021b). Among the different high-performance platforms, field-programmable gate arrays (FPGAs) show a better balance between power consumption and computing performance compared with central processing units (CPUs) and graphics processing units (GPUs) (González et al. 2016;Mohammadnia and Shannon 2016). This makes them popular in the application of high computational, on-board, and near real-time processing of remote sensing images, such as object detection (Li et al. 2019), scene classification , and land cover mapping (Garcia et al. 2020;Microsoft. 2021a).
Given the above discussion, a paradigm shift is needed to guide future GLC mapping. Recently, Gong (2021) introduced the concept of intelligent mapping with remote sensing (iMap) to promote multi-class, multi-scale, and more accurate mapping services. In contrast to previous efforts, iMap has the advantage of flexible purposes, transferable knowledge, multi-source big data, integrated algorithms, and user-friendly platforms. Inspired by the iMap concept, the current study aimed to leverage multiseason sample space-time migration, multi-source data time series reconstruction, time-series breakpoint detection, machine learning approaches, and high-performance computing platforms to map near real-time, multi-temporal (from annual to daily) and multi-resolution (from 30 m to meter/submeter) GLC (FROM-GLC Plus). Under the FROM-GLC Plus framework, newly acquired remote sensing data can be immediately processed and rapidly classified to achieve land cover monitoring at any given place on the planet in near real-time.

Materials and methods
FROM-GLC Plus is intended to realize near real-time, multi-temporal, and multi-resolution land cover mapping. To achieve this, a new framework was developed in the current study ( Figure 1). First, Advanced Very High-Resolution Radiometer (AVHRR), Moderate Resolution Imaging Spectroradiometer (MODIS), Landsat, and Sentinel 2 data were collected and preprocessed to generate initial spectral time series. Then, global 10-m and 30-m multi-spectral collections were reconstructed using spatio-temporal fusion and data smoothing methods. The multi-temporal feature space was further developed using topography, location, and climate datasets. Second, the feature space was used as the input for sample migration and land cover classification. Based on the first all-season sample set (FAST) collected in 2015 , multi-temporal land cover maps were automatically produced with different sample migration strategies and machine learning methods. Third, post-processing strategies, including a time consistency check (Du et al. 2021;Li, Gong, and Liang 2015) and spatial filtering (Yu et al. 2014b), were used for certain mapping tasks. Refined high spatial resolution land cover maps were developed in this step using super-resolution algorithms with Google Earth images. Finally, optimized classification maps were produced by FROM-GLC Plus, and accuracy assessment and product inter-comparison were performed. The workflow was implemented mainly on Google Earth Engine (GEE), with unsupervised classification processing at the National Supercomputing Center of China. Using these data and the computing resources of both platforms, we built FROM-GLC Plus to be a stable, automatic, flexible, and efficient system for producing global near realtime, multi-temporal, and multi-resolution land cover mapping.

Satellite datasets
The satellite data used for land cover classification included Landsat, Sentinel 2, MODIS, and AVHRR ( Table 2). All Landsat Tier 1 surface reflectance images available in GEE, including data collected by the Landsat Multispectral Scanner (MSS), Thematic  , and Operational Land Imager (OLI) sensors between the 1970s and 2020, were used in this study. For finer resolution land cover mapping, Sentinel 2 data provided by GEE were used to develop 10-m global land cover maps after 2017. Considering the sparsity of the temporal distribution of Landsat data in some regions, the 500-m MODIS MCD43A4 Version 6 NBAR product was used to fill the gaps after 2000, and AVHRR was used to fill the gaps before 2000. For Sentinel 2 images, Landsat data were used for gap filling owing to the relatively low resolution of MCD43A4.

Global sample set
The global sample set used for sample migration and land cover mapping in this study was the FAST dataset ) from the FROM-GLC project . FAST includes more than 91,000 training sample locations and 36,000 validation sample locations. The training sample units were selected based on the dominant spectral cluster of each Landsat image, while the validation sample units were selected at random from global equal-area hexagons Zhao et al. 2014). Taking advantage of Global Mapper, each sample unit was expanded from a single date to four seasons mainly by interpreting Landsat 8 OLI images, MODIS Enhanced Vegetation Index (EVI) time series, and other auxiliary datasets including monthly temperature and precipitation and high-resolution Google Earth images. All the sample units were interpreted and cross-checked by experienced image interpreters. Two classification schemes were provided by FAST ), including 10 level-1 classes and 29 level-2 classes for the FROM-GLC scheme , and 8 level-1 and 33 level-2 classes for the land cover classification system (LCCS) scheme (Bartholome and Belward 2005). Specifically, the "size" of sample units in FAST was classified into 1, 3 × 3, 9 × 9, 17 × 17, and 33 × 33 pixels to match land cover mapping from various image resolutions. The time of image acquisition (DOY, month and year) was also recorded, which made it possible to migrate FAST to different temporal scales. We further analyzed the temporal distribution of sample units of FAST ( Figure S1). Training and validation sample units showed a very similar distribution, both with notably larger numbers of samples in February, May and November, and relatively high numbers in June and August. The relatively even distribution of FAST in all seasons and the large number of labeled samples in the vegetation growing season provided an excellent basis for the sample migration study in FROM-GLC Plus. In addition to the FAST validation samples, FLUXNET site data (Gong 2009) was used as a third-party validation dataset to further evaluate the mapping accuracy.

Image preprocessing and spectral time-series generation
All Landsat MSS Tier 1 raw scenes and TM, ETM+, and OLI Tier 1 surface reflectance imagery provided by GEE from the 1970s to 2021 were used in this study. All data had been atmospherically corrected using the Landsat Ecosystem Disturbance Adaptive Processing System (Masek et al. 2006) and Landsat Surface Reflectance Code ) by the USGS. We then used the C Function of Mask (Foga et al. 2017) algorithm for the cloud and cloud shadow mask. To improve the continuity of observations, the coefficients provided in Roy's paper were used (Roy et al. 2016). Sentinel-2 L2A imagery during 2017-2021 from GEE was also used in this study. All data had been preprocessed using the Framework for Operational Radiometric Correction for Environmental Monitoring (Frantz 2019) by ESA. The poor-quality observations induced by clouds were detected and removed using the QA60 bitmask band which contains cloud mask information (Li et al., 2019a).
Considering that valid Landsat observations were relatively sparse in the time series, high-frequency MODIS and AVHRR data were used to fill the temporal gaps. The MODIS and AVHRR images were quality screened with the Quality Assurance band, and pixels with clouds or shadows were removed. The construction procedure for a daily Normalized Difference Vegetation Index (NDVI) time series, using the combination of Landsat and MODIS as an example, was as follows. First, using a bicubic interpolation algorithm, the 500-m MODIS images were resized and reprojected to 30 m. Then, for days with missing Landsat observations, random points were generated in the overlap area of valid MODIS and Landsat images for the same day. Pairs of observations were used to develop a linear function for NDVI adjustment.
where NDVI Landsat and NDVI Modis are the original NDVI values of Landsat and MODIS; a and b are the slope and intercept calculated with the least-squares method. For cases where linear regression is not possible (i.e. null valid MODIS or Landsat pixels), the mean value of two nearest days (one before and one after) was used as their assigned pixel value. Finally, the Savitzky-Golay filter (Chen et al. 2004) was implemented to smooth the reconstructed daily time series. Spatio-temporal fusion of Landsat and Sentinel-2 also followed the above procedure to generate daily spectral time series (Figure 2).

Sample migration
A global training sample dataset with high accuracy and long time series is one of the main constraints for global dynamic land cover mapping studies (Friedl et al. 2010). In this study, to overcome the difficulties of sample collection, an automatic training sample migration method was utilized. Overall, the sample migration of this study followed the method introduced in Huang's research (Huang 2020). First, for each sample in FAST, the spectral angle distance (SAD) and Euclidean distance (ED) were calculated to measure the difference between the two spectral features in reference time and migration target time. Then, the unchanged samples would further be determined and migrated to the target time by comparing SAD and ED with given thresholds. Considering the multi-temporal scales of classification tasks in this study and computational efficiency, different sample migration strategies were used for daily, ten-day, monthly, and annual land cover mapping. For high time density land cover monitoring (i.e. daily mapping), the spectra of reference DOY were compared with that of the target DOY, based on the daily spectral time series constructed in Section 2.2. For medium time density land cover monitoring (i.e. ten-day and monthly mapping), the mean spectra of the reference and target interval were used. For coarse time density land cover monitoring (i.e. annual mapping), the mosaic 16-day spectral time series curves were used in the sample migration for the target year.
Taking the sample migration of annual land cover mapping as an example, we measured the number of migrations in each training and validation sample during 2000-2021. Their spatial distribution patterns and the proportion of each level-1 land cover category in different migration frequencies are shown in Figure 3. Various sample migration frequencies can be observed for different regions and land cover classes. The obvious lowfrequency area located in the north-central Eurasian continent was mainly caused by the difficulty of migrating snow/ice samples, which was also indicated in the inset donut charts. In contrast, the migration frequency of impervious samples was high. This may be caused by the distinguishable spectral features and the stability of the impervious layer.

Classification
Supervised and unsupervised classification methods are the two main types of methods for land cover mapping. To encompass relatively comprehensive mapping tools, supervised Random Forest (Section 2.4.1) and unsupervised K-means (Section 2.4.2) are both encapsulated in the FROM-GLC Plus framework. To improve the spatial resolution of land cover maps developed with Random Forest and K-means, a super-resolution method for post-processing is introduced in Section 2.4.3. Table 3 summarizes the classification scenarios implemented in this study using the three classification and post-processing methods described above.

Supervised classification with Random Forest
A feature matrix was constructed for the land cover classification based on the generated multi-spectral time series. The feature matrix reflected the distribution of different spectral features over a period, which is conducive to enhancing the precision of land cover mapping results. In this study, we used 11 bands, including Blue, Green, Red, Near Infrared (NIR), Shortwave Infrared 1 (SWIR1), Shortwave Infrared 2 (SWIR2), NDVI, EVI, Normalized Burn Ratio (NBR), Normalized Difference Built-up Index (NDBI), and Modified Normalized Difference Water Index (MNDWI). The per-pixel maximum, mean, standard deviation, and 25, 50, 75 percentiles of each band were calculated to develop the feature matrix for annual land cover mapping. The spectral features of the target DOY were directly used for the daily land cover mapping, and the mean values of each band within the interval were used for the ten-day and monthly mapping. The supervised Random Forest classifier was trained based on the FROM-GLC classification system ). The number of trees was specified as 200, with the other parameters set as the default values provided by GEE.

Unsupervised classification with SW K-means
Sunway TaihuLight is a world-leading supercomputer that currently ranks fourth in the TOP500 list (Media Kit 2021) and achieves a peak performance of 93 petaflops (Fu et al. 2016). Owing to its use of the SW26010 many-core processor, Sunway (SW) TaihuLight provides high-efficiency and highperformance computing. Each processor contains four core groups (CGs). There are 65 cores in each CG, including a managing processing element and 64 computing processing elements (CPEs), which are organized as an 8 × 8 mesh. We adopted SW Kmeans ) to achieve fast and robust land cover mapping. SW K-means was developed from the commonly used clustering algorithm, K-means. SW Kmeans used the hierarchical hardware support of the Sunway TaihuLight, partitioning both by dataflow and centroid. As can be seen in Figure 4, SW K-means used multiple (up to 64) CPEs in one CG to partition the set of centroids and to scale the number of K for cluster centroids C. M group denotes the number of CPEs that were grouped to partition the centroids. SW K-means provides a more feasible solution for complex and large-scale scenarios and applications. Taking advantage of its ultra-high computational efficiency and relative autonomy, SW K-means was used in FROM-GLC Plus to carry out the unsupervised land cover classification. Sentinel 2 imagery (with Blue, Green, Red, NIR, SWIR1, and SWIR2 bands) was analyzed with SW K-means and the number of clustering categories was set to 20. Based on the clustering results, the final land cover maps were then obtained by visual interpretation with reference to high resolution Google Earth images ) using FROM-GLC classification system ).

Super-resolution post-processing
To evaluate the potential of the FROM-GLC Plus for high-resolution applications, we adopted a simple but practical label super-resolution method, using the paired high-resolution image and lowresolution map for land cover mapping with high resolution ( Figure 5). The inputs are the lowresolution map (i.e. 10-m/30-m land cover maps) and the high-resolution image (e.g. Planet and Google Earth images), and the output is the land cover map at the same resolution as the highresolution image. First, the low-resolution map was upsampled to match the high-resolution image. Then, we used a sparse sampling method to collect training data from the paired image and map, where feature X was the RGB value of a randomly selected pixel from the high-resolution image and the label Y was the land cover type of the corresponding pixel from the upsampled lowresolution map. To balance the samples of different types, we set a maximum value K for the number of each type, which depended on the image size. We trained the RF model using the collected training data. To preserve local information and decrease the training difficulty of the RF model, we trained an RF model for each highresolution image patch with a tile size of approximately 2000 pixels × 2000 pixels. Finally, we test each pixel of the high-resolution image using the trained RF model to produce a super-resolution land cover map. Land cover map (10 m)

Super-resolution postprocessing
Based on the input Google Earth images and low resolution land cover map developed with Random Forest or K-means High resolution land cover maps (up to the resolution of Google Earth images) a Spectral time series generated with method in Section 2.2, including Blue, Green, Red, NIR, SWIR1, SWIR2, NDVI, EVI, NBR, NDBI, MNDWI.

Accuracy assessment
We compared our results with other land cover products: (1) 500-m MOD12Q1 (Friedl et al. 2010 The global annual migrated validation samples for 2000-2021 were utilized to independently assess the accuracy of our annual land cover maps. We used a classic confusion matrix to calculate the Overall Accuracy (OA), Kappa, producer's accuracy, user's accuracy, and F1 score. In addition, OA with uncertainty was evaluated using an error matrix-based calibration estimator with a 95% confidence interval (Olofsson et al. 2014). In order to indicate the significance of the difference in mapping accuracy between  other products and our results, Z-score was calculated with the method of Congalton (2019) (when |Z-score| ≥ 1.96, at a 95% confidence interval is significant). Meanwhile, to quantify the impact of the updated remote sensing images on the near real-time land cover classification accuracy, we compared the accuracies of the global land cover maps with monthly migrated validation samples for 2021.
Owing to the heterogeneity of global geographic space, the accuracy of global land cover maps generally differs in space (Yu et al. 2014b). Identifying areas with low classification accuracy may help to improve the overall accuracy of the global classification results. Therefore, the terrestrial ecoregions dataset (Dinerstein et al. 2017), which provided boundaries for 846 ecoregions, was used to generate error maps of global land cover mapping results. For near-realtime land cover mapping in 2021, the impact of the remote sensing data updates on various regions of the world was also quantified using ecoregions.

Results and discussion
The FROM-GLC Plus framework consists of the spectral time-series generation, sample migration (Huang 2020), supervised Random Forest, unsupervised SW Kmeans ) and post processing methods including time consistency check (Du et al. 2021;Li, Gong, and Liang 2015), spatial filtering (Yu et al. 2014b) and super-resolution. In this section, taking advantage of the framework, multi-temporal and multi-resolution land cover maps were developed in Sections 3.1 and 3.2, respectively. Furthermore, to illustrate the generality of FROM-GLC Plus with other data sources than Table 1, daily time series of NDVI were reconstructed based on the GaoFen 1 and Planet imagery in Section 3.3. In addition, examples of near real-time land cover mapping were provided in Sections 3.3 and 3.4, including daily change monitoring of water bodies during 2021.07.15-2021.08.20 in Weishi, Henan, China, and global land cover mapping at 10 m and 30 m in 2021. We further developed annual global land cover classifiers , demonstrating the effectiveness and stability of our method across the years. Results show that FROM-GLC Plus provides a framework for near real-time land cover mapping at multi-temporal (annual to daily) and multi-resolution (30 m to submeter) levels.

Multi-temporal land cover change detection
The flexibility of the spectral time-series generation and sample migration methods in this study allowed FROM-GLC Plus to perform temporally-dense land cover change detection. Three representative examples are shown in Figure 6. Taking the construction of Zhangjiakou Olympic Village in 2019 as an example, the monthly expansion of impervious areas can be seen in Figure 6(a). The ten-day change detection product was suitable for crop monitoring. By comparing the spectral index of the current and previous years, the crop growth situation could be quickly and automatically captured Figure 6(b). Assuming that the target ten-day NDVI value is NDVI d , and the average NDVI value for the same ten-day period over the past five years is NDVI avg . "Higher," "Similar" and "Lower" in the figure mean " Daily land cover detection can be used to monitor a changing rapidly target. As shown in Figure 6(c), the rise and fall of floods in Wangjiaba during the flood discharge period were observable. Furthermore, our method can be used to detect land cover changes in near real-time during emergencies, such as the daily monitoring of water from a rainstorm in Henan in 2021 (Figure 7). The ability of FROM-GLC Plus to facilitate the monitoring of frequently changing land surface features may support the analysis of several international areas of research interest. Figure 8, Figures S4 and S5 show the monthly change of cultivated land between January 2020 and November 2021 in Balkh Province, Afghanistan. The green parcels in the figures denote the cultivated land. This product captures well the dramatic shrinkage of conflicts-induced cultivated land in Afghanistan between 2020 and 2021.

Multi-resolution land cover classification
Using RF, SW K-means, and super-resolution algorithms, FROM-GLC Plus was able to provide multiresolution land cover maps ranging from 30 m to 1 m or even sub-meter level. To prove the effectiveness of the method in producing refined high spatial resolution land cover maps, four subareas located in urban China (Beijing), rural China (Taiyuan), urban U. S. (New York), and rural U.S. (Chicago) were selected. Using MODIS, Landsat, and Sentinel −2 images from 2020, 30-m and 10-m land cover maps were generated with RF. SW K-means was carried out using one single scene from Sentinel-2 in the growing season of 2020. Super-resolution algorithms took Google Earth images and 10-m land cover maps from RF as the input. Because of limitations from the resolution of the available Google Earth images, subareas in China were upscaled 16 times to approximately 0.5 m, and subareas in the U.S. were upscaled 8 times to approximately 1 m. We compared our results with other products, including 10-m GLC products (FROM-GLC10 in 2017 (Gong et al. 2019), ESRI land cover map in 2020 (Karra et al. 2021), ESA WorldCover in 2020; Planet 3-m land cover maps for China in 2018 (Dong et al. 2021a); and EPA 1-m EnviroAtlas maps for the U.S. in the 2010s (EPA 2021) (Figure 9 and Figure  S6-S8).
For the subarea in Beijing ( Figure S6), the ESRI land cover map and ESA WorldCover map confused some grassland and cropland. As shown in the enlarged closeup view of the subarea, FROM-GLC Plus provided Figure 7. An example of near real-time land cover change detection: the daily monitoring of water from a rainstorm in Weishi, Henan, China, 2021. The blue shaded areas in the plot represent the pixel that was classified as water from 25 th , July to 8 th , August. (For the GIF of land cover change, please see Figure S3 in the supplementary material). a more refined and detailed classification result. The shape and boundaries of the playgrounds were accurately depicted. For the subarea in Taiyuan (Figure S7), the greenhouse was also clearly captured using our method. Figure S8 and Figure 9 show the effectiveness of FROM-GLC Plus in extracting buildings and trees. Although the boundaries of features in our maps are not as smooth as those of the EPA's product, FROM-GLC Plus was more automated and more accurate in the classification of some land cover classes, such as trees along the road shown in the zoomed subarea in Chicago (Figure 9).

Fusion with multi-source remote sensing data
In this study, FROM-GLC Plus was implemented using a combination of AVHRR, MODIS, Landsat, and Sentinel-2 images. However, because FROM-GLC Plus can fuse data from multiple sources, other satellite imagery, such as GaoFen 1 (https:// www.cnsageo.com/#/) and Planet (https://www.pla net.com/) imagery, can also be used to reconstruct daily time series. Taking the construction of a cropland daily NDVI time series in Beijing, China, and Jabalpur, India as an example, we fused the time series of GaoFen 1 (Zhong et al. 2021) and Planet data with Landsat and Sentinel-2 images ( Figure 10). GaoFen 1 imagery used here is the analysis ready data (ARD) generated from the archived level 1A Wide Field Viewing (WFV) 16 m data of GaoFen1 from China Center for Resources Satellite Data and Application (CRESDA), and Planet imagery utilized here is the PlanetScope 3.7 m data. Taking the construction in Jabalpur, India as an example, the Landsat and Sentinel-2 images were resized and reprojected to 3.7 m, and then used to fill the temporal gaps in Planet imagery utilizing the method introduced in Figure 2. When compared with the original data, the reconstructed daily time series provided a more detailed and comprehensive description of crop phenology, which is useful for agricultural monitoring and crop classification. The results show the flexibility and inclusiveness of FROM-GLC Plus in terms of data sources, as well as its effectiveness in spatiotemporal fusion. In addition, FROM-GLC Plus can combine the multitemporal and multi-resolution land cover mapping strategies to monitor near real-time land cover change using remote sensing image superresolution algorithms (Dong, Zhang, and Fu 2021b). An example of near real-time land cover change monitoring with super-resolution is given in Figure 11. Compared with the 10-m monitoring results shown in Figure 7, the refined 2.5-m monitoring results show more detailed land cover changes. The rise and fall of the water in the ditch are observable, and flooding among fields is also captured. With the near real-time monitoring and detailed delineation of feature edges and shapes, FROM-GLC Plus may provide the ability to monitor unexpected disasters or other targets from an information-rich spatio-temporal perspective. Our approach thus can provide products for rapid disaster response and field-level crop monitoring, as well as for other tasks that require real-time and detailed land cover change detection.

Global land cover mapping at 10 m and 30 m in 2021
Using MODIS, Landsat, and Sentinel-2 images from January to November in 2021 (as of the experiment of this study, remote sensing imagery for 2021 on GEE was updated to the end of November), the 10-m and 30-m global land cover maps for 2021 were developed in this study ( Figure 12). The confusion matrices with the validation samples migrated to 2021 are shown in Table S1 and S2. Although only images before November were used, the overall accuracy (OA) of the 30-m and 10-m maps reached 74.57% and 72.24%, respectively. Consistent with other research on global land cover mapping, the accurate classification of wetlands was the most difficult in our study. Wetland environments are very diverse, ranging from open water to emergent low vegetation to trees. The accuracy of wetland was reduced due to complex mixed spectral features and insufficient samples. In addition, the rarity of wetlands further increases its mapping difficulty. We also assessed   the accuracy using the independent FLUXNET site data (https://fluxnet.org/data/fluxnet2015-dataset/). Based on the FLUXNET samples migrated to 2021, the OA of the 30-m and 10-m land cover maps in 2021 was 81.43% and 76.30% (Table S3 and S4). Similarly, the relatively low accuracy of wetland and bare land was probably due to the insufficient sample size from FLUXNET.
We further compared the accuracy of current GLC products with our result following the classification system of FROM-GLC ) using the migrated validation samples (Table 4). The OA with uncertainty and Z-score were assessed using the method introduced in Olofsson et al. (2014) and Congalton and Green (2019), respectively. For GLC products with a resolution coarser than 30 m, Zscore was obtained based on the FROM-GLC Plus 30 m. For GLC products with a resolution of 10 m, Z-score was obtained based on the FROM-GLC Plus 10 m. As shown in Table 4, GLC maps developed with FROM-GLC Plus achieved significantly higher accuracy.
We further explored the impact of the increase in the number of images on the accuracy of near realtime global land cover classification in 2021. The accuracy assessment of different sample sizes suggested that the overall accuracy of the global classification results increased with the accumulation of images, and the uncertainty of the accuracy obtained with different sample sizes consistently decreased Figures 13(a,b). However, this did not apply in all land cover classes. For instance, the F1 score of cropland was highest for January to July, followed by a small drop for January to August. This indicated that the use of images in critical periods could be more effective in the identification of some classes. Some land cover classes were sensitive to the accumulation of images. The F1 score of tundra showed an increase for January to May compared to January to April Figures 13(c,d). In contrast, the accuracy of water and bare land fluctuated slightly with image accumulation.
Figures S10 and S11 show the spatial distribution of overall accuracy for the global 10-m land cover classification results. The vast majority of the regions have an overall accuracy exceeding 50%, and in some regions exceeded 75%. Transitional zones with complex and diverse land cover classes were the most difficult regions to classify. Relatively higher accuracies occurred in the areas covered with rainforest and  desert, such as the Amazon rainforest and the Sahara Desert. Compared with the land cover map developed with images in only January, the map with images that covered January-November was more accurate in most areas ( Figure S11). Most of the declines in accuracy occurred in areas least affected by seasonal variations, i.e. low latitudes where forest, grass, and shrubs are mixed, middle latitudes where bare land and snow/ice are mixed, and high latitudes where tundra and snow/ice are mixed.
Annual global land cover classifiers were developed using annual migrated training samples and RF, and the overall accuracies were obtained using the annual migrated validation samples ( Figure 14). Overall, the annual accuracy was relatively stable, with a slight increase in recent years. The overall accuracies ranged from 78.63% to 83.58%, with an average value of 80.60%, which proves the effectiveness and stability of our method across years.

Advancement and limitations
In this study, a flexible land cover mapping system, FROM-GLC Plus, was developed based on the earlier FROM-GLC. Taking advantage of the sample migration, spectral time series reconstruction and machine learning, FROM-GLC Plus offers solutions for near real-time land cover mapping at multi-temporal (annual to daily) and multi-resolution (30 m to sub-meter) levels. Compared with the existing GLC products (Table 1), FROM-GLC Plus shows advancement in the following aspects. First, the spatial resolution of land cover maps was improved from 10 m to meter/sub-meter. Using super-resolution post-processing method, FROM-GLC Plus can automatically generate land cover maps with meter/sub-meter spatial resolution without additional sampling. Second, the temporal resolution of land cover maps was improved from annual to daily. Thanks to the high spatial and temporal resolution spectral time series generated in this study, FROM-GLC Plus can perform temporally-dense land cover change detection. Third, the availability of near realtime land cover maps has been improved. Most of the existing GLC products have a lag of at least one year, which makes the real-time availability of remote sensing data long neglected. In this case, through the use of the latest remote sensing images and flexible sample migration strategies, FROM-GLC Plus provides a framework to enhance the timeliness of land cover maps. The results in Figures 13(a,b) demonstrate that our system can develop land cover maps with roughly satisfactory accuracy at the beginning of the year.
However, this study still has a number of limitations. First, our system is still limited by the accessibility of cloud-free valid remote sensing imagery. Although data fusion methods introduced in Section 2.2 can help with this issue, the few observations in areas highly disturbed by clouds may increase the uncertainty of the mapping results. Introducing Synthetic Aperture Radar (SAR) data may help improve these uncertainties. Second, although our system is designed for global land cover mapping, some of the functions of our system are still difficult to be implemented globally due to the limitation of computing resources of GEE. However, we are working on migrating FROM-GLC Plus to supercomputing platform introduced in Section 2.4.2, which will increase the processing efficiency of the system, thus making it possible to provide near real-time multi-resolution global land cover maps.

Conclusions
FROM-GLC Plus was developed using methods such as multi-season sample space-time migration, multisource data time series reconstruction, and machine learning. This system provides a capacity for producing global land cover change data set from the 1980s. It includes methods for monitoring change in areas that are difficult to map with global mapping using supercomputing platforms and machine learning, as well as change hotspots, with the ability to update land cover changes in near real-time. The preliminary results in this paper show that FROM-GLC Plus provides a framework for multi-temporal (annual to daily) and multi-resolution (30 m to submeter) land cover mapping. In addition, FROM-GLC Plus can perform near real-time land cover mapping, which allows for the application (e.g. agriculture monitoring) of real-time availability of remote sensing data. The preliminary 30-m and 10-m results of FROM-GLC Plus demonstrate its effectiveness.
We also hope that our dataset will serve several needs in the broader research community. Some platforms (e.g. FAO Map Catalog) have started to include our previously released FROM-GLC to further make it available to other research institutions for their research and development activities. The current flexible mapping procedure powered FROM-GLC Plus with high spatio-temporal accuracy classification can help support several ongoing national and international efforts at related thematic issues. For example, the produced outputs may offer the state-of-the-art tools and data foundations for annual national land change survey, monitoring of sudden disasters and achievements of Sustainable Development Goals.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This research was funded by the National Key R&D Program of China (grant number: 2017YFA0604401), Tsinghua University Initiative Scientific Research Program (grant number: 2021Z11GHX002, 20223080017), and the National Key Scientific and Technological Infrastructure project "Earth System Science Numerical Simulator Facility" (EarthLab).

Data availability
The data that support the findings of this study is available at https://figshare.com/articles/dataset/FROM-GLC_Plus_Towards_near_real-time_and_multi-resolution_land_cover_mapping/18497600.