Dataset of the mountain green cover index (SDG15.4.2) over the economic corridors of the Belt and Road Initiative for 2010-2019

ABSTRACT Mountains are undergoing widespread changes caused by human activities and climate change. Given the importance of mountains, the protection and sustainable development of mountain ecosystems have been listed among the goals of the United Nations 2030 Sustainable Development Agenda. As one of the indicators, the Mountain Green Cover Index (MGCI) provides a consistent and comparable measure of the status of green vegetation in mountainous areas, which can support the mapping of heterogeneous mountain ecosystem health and the monitoring of changes over time. Spatially explicit, high-resolution MGCI datasets are therefore urgently needed to support protection measures at subnational and multitemporal scales. In this paper, MGCI datasets with a 500-m spatial resolution, covering the economic corridors of the Belt and Road Initiative (BRI), were developed for 2010 to 2019 based on all available Landsat-5/8 data and the Google Earth Engine cloud computing platform. Validation of the green vegetation cover against ground-truth samples indicated that the datasets achieve an overall accuracy of 94.06% and capture detailed spatial and temporal variations. The archived datasets include the MGCI of each BRI economic corridor, matched to a geospatial layer denoting the economic corridor boundaries. The essential information about the datasets and their limitations, along with the production flow, is described in this paper. The published geospatial datasets are available at http://www.doi.org/10.11922/sciencedb.1005.


Introduction
Mountains are the water towers of the world and have rich bioclimatic vertical zone spectra (Immerzeel et al., 2020). Mountain ecosystems provide essential ecosystem services, such as maintaining biological diversity, regulating the regional climate and conserving water resources (Gleeson et al., 2016; Price, 1998). According to the mountain classification data provided by the United Nations Environment Programme (UNEP)-World Conservation Monitoring Centre (WCMC), mountainous areas account for approximately 24% of the world's land surface (Kapos, Rhind, Edwards, Price, & Ravilious, 2000). However, mountains are also highly sensitive to climate change. Under the influence of both climate change and human activities, mountain ecosystems are undergoing rapid changes. Monitoring the health of mountain ecosystems is thus of great importance for the conservation of mountain environments and terrestrial ecosystems.
The Belt and Road Initiative (BRI) involves many mountainous countries, and its six economic corridors cross many mountain ranges. For example, the China-Pakistan Economic Corridor (CPEC) crosses the Karakoram Mountains, the Hindu Kush Mountains, the Pamirs, and the western end of the Himalayas, where the terrain is very complex and the ecological environment is highly fragile (Maqsoom et al., 2020). Developing high-quality thematic datasets for monitoring and evaluating mountain ecosystem health in the BRI corridors is therefore important for eco-environmental protection and green sustainable development.
Sustainable mountain development has been a recognized policy priority since Chapter 13 of Agenda 21 (Ives, Messerli, & E, 1997). Ensuring the conservation of mountain ecosystems by 2030 was listed as a target in the United Nations (UN) 2030 Sustainable Development Agenda (Target 15.4) in Goal 15, "Life on Land" (UN, 2015). The Mountain Green Cover Index (MGCI), one of the indicators of Sustainable Development Goal (SDG) Target 15.4, is defined as the ratio of the area of all green vegetation in mountain areas, including forests, shrubs, woodlands, pastures, and farmland, to the total area of mountains. It was selected by the International Mountain Science Committee as an important indicator reflecting the state of environmental protection in mountainous areas and was included in the SDG 15 indicator system (FAO, 2017). The MGCI provides information on changes in the cover of vegetation (e.g. forests, shrubs, trees, pastureland and cropland) in mountainous areas, and its changes are generally linked to driving factors such as overgrazing, land clearing, urbanization, forest exploitation, timber extraction, land restoration and reforestation.
Remote sensing and big Earth data technology have become the primary methods for ecosystem monitoring at global scales and play vital roles in the evaluation of SDGs (Guo, 2017; Hansen et al., 2013; Zhu et al., 2016). With the development of the Google Earth Engine (GEE), state-of-the-art cloud computing platforms that host the free Landsat archive have attracted much attention due to their strong computing and storage capabilities. For the monitoring of large mountainous areas, these technologies are especially important given the constraints of transport and accessibility. According to the MGCI metadata, the index is calculated by overlaying land cover data, interpreted from satellite data with the Collect Earth tool of the Food and Agriculture Organization (FAO) of the UN, on the global map of mountains produced by the FAO/Mountain Partnership Secretariat (MPS) in 2015 (Bey et al., 2016; FAO, 2017; Saah et al., 2019). Although MGCI data have been released at the national, regional, and global scales, the current MGCI calculation schemes have several limitations. For example, the available MGCI data were calculated at the country level, which makes it difficult to reveal the spatial and temporal trends of mountain vegetation at the local scale. As mountains have apparent three-dimensional characteristics, mountain environmental gradients are highly concentrated due to the unique vertical zone spectra of climate and soil in mountainous areas (Grabherr, Gottfried, & Pauli, 1994). Detailed MGCI mapping at high spatial resolution is therefore needed to support decisions on sustainable development and protection measures in different regions.
In this study, we produce a dataset of MGCI over the six economic corridors of the BRI during 2010-2019 at a spatial resolution of 500 m, which corresponds to the spatial resolution of the mountain classification datasets. All available Landsat-5/8 TM/OLI images over the study area archived in the GEE cloud computing platform were used for the calculation of green vegetation cover. Spectral and temporal information was used for the green vegetation cover extraction in mountainous areas. As the Normalized Difference Vegetation Index (NDVI) has been commonly used in previous studies for the identification of green vegetation, we mainly focus on the use of frequency and phenological information and therefore different cloud cover conditions were considered for the development of the criteria. Furthermore, the three-dimensional characteristics of mountain surfaces were considered in the datasets by introducing the real surface area instead of the projected area into the calculation of the MGCI. Users can use the datasets for different corridors of the BRI according to their specific research and application purposes.

Study area
The BRI involves jointly building the Silk Road Economic Belt and the 21st-Century Maritime Silk Road. The Silk Road Economic Belt currently consists of six terrestrial economic corridors, i.e. the CPEC, China-Mongolia-Russia Economic Corridor (CMREC), China-Central Asia-West Asia Economic Corridor (CCAWAEC), New Eurasia Land Bridge Economic Corridor (NELBEC), Bangladesh-China-India-Myanmar Economic Corridor (BCIMEC) and China-Indochina Peninsula Economic Corridor (CIPEC) (Figure 1). These corridors run through Asia, Europe and the African continent, connecting the vibrant East Asian economic circle at one end and the developed European economic circle at the other. As a concept of economic geography, an economic corridor is an economic cooperation mechanism that uses transport infrastructure as the carrier to connect different regions. Therefore, the core route of each of the six corridors was selected, and a 100-km buffer around it was defined as the economic corridor region. The total selected economic corridor area was 8.82 million square kilometers, of which mountains accounted for approximately 27.69%.

Data sources
The 30-m Landsat-5/8 standard level 2 surface reflectance (SR) orthorectified TM/OLI images were used in this study for the green vegetation cover extraction. Because of atmospheric conditions, Landsat images are usually contaminated by varying percentages of cloud cover. To maximize the use of valid surface observations, all available images of this type from 2010 to 2019 were used. These images are freely shared and archived in the GEE platform as image collections of the United States Geological Survey (USGS). Poor-quality observations in the Landsat imagery, including those contaminated by clouds and shadows, were identified by the CFmask band from the SR collection and the Fmask band from the top-of-atmosphere (TOA) reflectance with Fmask collection (Zhu, Wang, & Woodcock, 2015). The time series of Landsat SR image collections was then used to obtain the NDVI values.
The year 2015 was selected as the baseline year because the SDGs kicked off in 2015. The years 2010 and 2019 were also selected for comparison with the baseline year, so that the ten-year MGCI change could be estimated from the MGCI time series.
The ASTER GDEM V2 data, which cover 99% of the global land area (83°N to 83°S), were chosen as the Digital Elevation Model (DEM) data source to cover as much of the land surface of the study area as possible. The V2 data, an improvement on V1, have a 30-m spatial resolution and a 20-m vertical accuracy and were downloaded from USGS (Tachikawa, Hato, Kaku, & Iwasaki, 2011).
The global mountain classification data were provided by the FAO/MPS in 2015 (Kapos et al., 2000). The data are based on the UNEP-WCMC mountain classification at a 500-m spatial resolution derived from GTOPO30 (Global 30 Arc-Second Elevation). The definition of each mountain class is given in Table 1.

MGCI calculation algorithm
The MGCI calculation follows the new algorithm proposed by Bian, Li, Lei, Zhang, and Nan (2020), which considers three aspects: (1) high-spatial-resolution green vegetation cover extraction based on the GEE platform and time series Landsat images, (2) real surface area calculation for rugged mountain surfaces, and (3) MGCI calculation based on a grid-based MGCI model.
The GEE cloud computing platform archives a multipetabyte catalog of remote sensing images and geospatial datasets and gives users access to Google's computational infrastructure, which is optimized for parallel computing and big data processing. In the new MGCI calculation algorithm, the green vegetation cover at 30 m for each year in each economic corridor area was extracted on the GEE platform because of its strong computing capability and complete Landsat data archive. Because vegetation absorbs strongly in the red wavelengths and reflects strongly in the near-infrared wavelengths due to the chlorophyll in green leaves, NDVI values derived from the Landsat red and near-infrared bands were used in the algorithm to discriminate vegetation from nonvegetation classes. The observation frequency of each Landsat pixel was first checked to determine whether the pixel had at least one cloud-free observation every 90 days in a given year (Ernakovich et al., 2014). Then, the NDVI threshold was tested over the whole corridor area using ground samples and was set to 0.2. Previous studies used similar thresholds for vegetation identification (Bian, Li, Huang, Zhang, & Zhan, 2018; Huang et al., 2017).
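The per-pixel logic described above can be illustrated with a minimal Python sketch. It is not the GEE implementation used to produce the datasets. Two points are assumptions made only for illustration: the 90-day frequency check is read as "no gap longer than 90 days between consecutive cloud-free observations or to either end of the year", and the annual maximum NDVI is the value compared with the 0.2 threshold. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

NDVI_THRESHOLD = 0.2  # threshold reported in the text
MAX_GAP_DAYS = 90     # "at least one cloud-free observation every 90 days"


def ndvi(red: float, nir: float) -> float:
    """NDVI = (NIR - red) / (NIR + red); returns 0.0 when both bands are zero."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def observed_every_90_days(days_of_year: List[int], year_length: int = 365) -> bool:
    """Assumed reading of the frequency check: no gap between consecutive
    cloud-free observations (or to either end of the year) exceeds 90 days."""
    if not days_of_year:
        return False
    edges = [0] + sorted(set(days_of_year)) + [year_length]
    return all(later - earlier <= MAX_GAP_DAYS
               for earlier, later in zip(edges, edges[1:]))


def is_green_vegetation(
    observations: List[Tuple[int, float, float]]
) -> Optional[bool]:
    """Classify one pixel-year from cloud-free (day_of_year, red, nir) tuples.

    Returns None when the frequency check fails, because the handling of
    sparsely observed pixels is not specified here. Comparing the annual
    maximum NDVI with the threshold is an assumption of this sketch.
    """
    if not observed_every_90_days([day for day, _, _ in observations]):
        return None
    peak = max(ndvi(red, nir) for _, red, nir in observations)
    return peak >= NDVI_THRESHOLD


# Hypothetical reflectance values for a single pixel in one year.
pixel = [(40, 0.10, 0.12), (120, 0.06, 0.30), (200, 0.05, 0.45), (290, 0.08, 0.20)]
print(is_green_vegetation(pixel))
```

Returning None rather than False for sparsely observed pixels keeps "not vegetated" separate from "not enough cloud-free data", which matters in persistently cloudy mountain regions.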
Because mountain terrain is three-dimensional, the true surface area is always larger than or equal to the planimetric area. Therefore, when discrete grid data are used to describe the Earth's surface, area-related parameters can differ greatly depending on whether the surface area or the planimetric area of the grid is used. Because the MGCI is by definition an area-related index for measuring mountain ecosystem health, the true mountain surface area was calculated and used in the new MGCI model. Many methods have been previously proposed for measuring terrain irregularity (Hodgson, 1995; Jenness, 2004; Zhang & Li, 2014). In this paper, a straightforward method of calculating surface area grids directly from gridded DEM data was used for the true mountain surface area calculation (Jenness, 2004). The method generates eight three-dimensional triangles connecting each cell center point with the center points of the eight surrounding cells. Then, the portion of each triangle that lies within the cell boundary is calculated, and these portions are summed to give the surface area of the cell (Figure 2).
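The eight-triangle construction can be sketched for a single cell as follows. This is a minimal illustration of the approach described by Jenness (2004), not the code used to produce the datasets. It assumes a square-cell DEM in projected units and a complete 3 x 3 elevation window, so edge cells are not handled. Halving every triangle edge is the step that keeps only the part of each triangle inside the cell boundary. The function names are hypothetical.

```python
import math
from typing import List

# The eight neighbours of the centre cell as (row, col), in clockwise order.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def _triangle_area(a: float, b: float, c: float) -> float:
    """Heron's formula; tiny negative products from rounding are clamped to 0."""
    s = (a + b + c) / 2.0
    return math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))


def cell_surface_area(window: List[List[float]], cell_size: float) -> float:
    """Surface area of the centre cell of a 3 x 3 elevation window.

    window: 3 x 3 elevations in the same unit as cell_size.
    """
    centre = window[1][1]
    total = 0.0
    for i, (r1, c1) in enumerate(RING):
        r2, c2 = RING[(i + 1) % len(RING)]
        # Horizontal distances: centre to each neighbour, and between neighbours.
        h1 = cell_size * math.hypot(r1 - 1, c1 - 1)
        h2 = cell_size * math.hypot(r2 - 1, c2 - 1)
        h12 = cell_size * math.hypot(r1 - r2, c1 - c2)
        # 3D edge lengths, halved so the triangle stays inside the cell.
        e1 = math.hypot(h1, window[r1][c1] - centre) / 2.0
        e2 = math.hypot(h2, window[r2][c2] - centre) / 2.0
        e3 = math.hypot(h12, window[r1][c1] - window[r2][c2]) / 2.0
        total += _triangle_area(e1, e2, e3)
    return total


flat = [[100.0] * 3 for _ in range(3)]
tilted = [[0.0, 30.0, 60.0] for _ in range(3)]  # 45-degree slope at 30-m cells
print(cell_surface_area(flat, 30.0), cell_surface_area(tilted, 30.0))
```

For a flat window the result equals the planimetric cell area, and for a uniformly tilted plane it equals the planimetric area divided by the cosine of the slope, which gives a quick sanity check for any implementation.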
The proportion of vegetated mountain surface area within each UNEP 500-m mountain pixel was finally calculated by the new MGCI model. In the model, the surface areas of the vegetated 30-m pixels and of all 30-m pixels (all surface types) within the 500-m mountain pixel were summed separately. The ratio of the vegetated surface area to the total mountain surface area of each 500-m pixel was then calculated. The final MGCI data were exported at 500 m.
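As a minimal sketch of this grid-based ratio, the following Python function computes the MGCI of one 500-m mountain pixel from the surface areas and vegetation flags of the 30-m pixels it contains. The function name and the input layout (two parallel sequences) are assumptions made for illustration, and the area values in the example are hypothetical.

```python
from typing import Sequence


def mgci_for_mountain_pixel(
    surface_areas: Sequence[float], vegetated: Sequence[bool]
) -> float:
    """MGCI (%) of one 500-m pixel.

    surface_areas: true surface area of each 30-m pixel inside the 500-m pixel.
    vegetated: True where the 30-m pixel was classified as green vegetation.
    """
    if len(surface_areas) != len(vegetated):
        raise ValueError("surface_areas and vegetated must have the same length")
    total_area = sum(surface_areas)
    if total_area <= 0:
        raise ValueError("total surface area must be positive")
    green_area = sum(a for a, is_green in zip(surface_areas, vegetated) if is_green)
    return 100.0 * green_area / total_area


# Hypothetical example: three 30-m pixels, two of them vegetated.
print(mgci_for_mountain_pixel([900.0, 1000.0, 1100.0], [True, False, True]))
```

Because steep pixels carry larger surface areas, they weigh more in the ratio than they would in a simple count of vegetated pixels.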

Data records
In this paper, the MGCI datasets covering mountainous areas within the six economic corridors, produced with the aim of evaluating SDG 15.4.2, are released openly as reference data for user applications. All the MGCI values contained in the open data are shown in Figure 3. The domain of the datasets ranges from 3.01°E to 127.89°E and from 0.53°N to 61.98°N at a spatial resolution of 500 m. The dataset reflects changes in the MGCI from 2010 to 2019, and each raster file corresponds to 1 year. Figure 3 shows the detailed spatial patterns of the gridded MGCI values in each economic corridor from 2010 to 2019. The white regions are nonmountainous areas, and the gray regions are mountainous areas from the FAO mountain classification. The red buffer regions are the economic corridors where the MGCI was calculated. The gridded MGCI depicts the spatial pattern of mountain vegetation cover within each corridor and can be easily aggregated to administrative units or watersheds.
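One possible way to aggregate the gridded values to larger units is sketched below. It assumes that each 500-m mountain pixel carries a unit identifier (for example, an administrative code or watershed ID), its MGCI in percent and its mountain surface area, and that the unit-level MGCI is the area-weighted mean, which keeps the result consistent with the ratio definition of the index. The record layout and function name are hypothetical, and the example values are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def aggregate_mgci(pixels: Iterable[Tuple[str, float, float]]) -> Dict[str, float]:
    """Area-weighted MGCI (%) per unit.

    pixels: (unit_id, mgci_percent, mountain_surface_area) for each 500-m pixel.
    """
    green_area: Dict[str, float] = defaultdict(float)
    total_area: Dict[str, float] = defaultdict(float)
    for unit_id, mgci_percent, area in pixels:
        green_area[unit_id] += mgci_percent / 100.0 * area
        total_area[unit_id] += area
    return {
        unit_id: 100.0 * green_area[unit_id] / total_area[unit_id]
        for unit_id in total_area
        if total_area[unit_id] > 0
    }


# Hypothetical pixels in two units, "A" and "B".
example = [("A", 100.0, 1.0), ("A", 50.0, 3.0), ("B", 20.0, 2.0)]
print(aggregate_mgci(example))
```

A plain average of pixel values would give the same answer only if all pixels had equal surface area, which is generally not the case in rugged terrain.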
Mountains were the major ecosystem type in CCAWAEC, CPEC, and CIPEC, accounting for 51.44%, 46.39% and 41.89% of the total corridor areas, respectively. For BCIMEC, NELBEC, and CMREC, the mountainous areas accounted for 36.75%, 19.85% and 15.59% of the corridor area, respectively. From 2010 to 2019, the mean MGCIs of BCIMEC, CIPEC and CMREC were higher than 90%, much higher than the world's average (73.22%). The MGCI of CPEC (33.58%) was far lower than the world's average. Figure 4 further shows the variation in the MGCI in each corridor from 2010 to 2019. On the whole, for CMREC, BCIMEC and CIPEC, the MGCI remained very high throughout the ten years. There was no obvious increasing trend for these three corridors because their vegetation cover was already good. For CPEC, CCAWAEC and NELBEC, the MGCI increased slightly, with absolute increases of 6.61%, 5.79% and 4.67%, respectively.
To understand the MGCI variations in different mountain types, the MGCIs of six mountain types were calculated based on the mountain type data. The area ratio of high mountains (>2500 m, only elevation was used) was 13.21%, and these high mountains were mainly located in CPEC, CCAWAEC, and NELBEC, accounting for 16.88%, 6.30% and 5.02% of the total corridor mountain area, respectively. From 2010 to 2019, the MGCI for high mountains increased by 6.22%, with a rate of change of 1.28% from 2010 to 2015 and −0.03% from 2015 to 2019. Since human activities were less intense in high mountains than in low mountains, the change in MGCI in high mountains was mainly caused by climate change. Mountains with an elevation lower than 2500 m account for 82.79% of the total mountain area in the corridors. These mountains were defined by elevation combined with terrain slope or local elevation range. In general, the MGCI for the lower mountains increased by 3.79% from 2010 to 2019. The increase in MGCI observed in the lower mountains was mainly caused by the combined effects of climate change and human activities such as mining, deforestation, and ecological restoration.

Technical validation
Since no direct ground measurement of the MGCI is currently available and the green vegetation cover is the key intermediate product of the MGCI calculation, we validated the accuracy of the extracted green vegetation cover as an indirect validation. The global sample sets for land cover classification with Landsat-8 shared by Gong et al. (2019) were used to extract and validate the green vegetation. As only land cover samples for 2015 were freely available, the 2015 green vegetation results were compared with the FROM-GLC global land cover samples over the whole economic corridor area to evaluate the performance of the developed dataset. There were 1994 samples in 11 classes in the economic corridor regions, and 30% of those samples were finally used for independent accuracy validation.
An error matrix was generated and is presented in Table 2. Overall, both the vegetated and nonvegetated classes showed high accuracy when validated against the 2015 validation samples. They shared almost the same user's and producer's accuracy values, 96.55% and 93.96%, respectively.
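For readers less familiar with these measures, the following minimal sketch shows how overall, user's and producer's accuracy are derived from an error matrix. It assumes the common convention that rows hold the mapped (classified) classes and columns hold the reference classes. The counts in the example are invented for illustration and are not the values in Table 2; the function name is hypothetical.

```python
from typing import Dict, List, Union

Metrics = Dict[str, Union[float, List[float]]]


def accuracy_metrics(matrix: List[List[int]]) -> Metrics:
    """Overall, user's and producer's accuracy from a square error matrix.

    matrix[i][j]: number of samples mapped as class i whose reference class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    nan = float("nan")
    return {
        "overall": correct / total if total else nan,
        # User's accuracy: share of mapped samples that are correct (by row).
        "users": [matrix[i][i] / row_sums[i] if row_sums[i] else nan
                  for i in range(n)],
        # Producer's accuracy: share of reference samples found (by column).
        "producers": [matrix[i][i] / col_sums[i] if col_sums[i] else nan
                      for i in range(n)],
    }


# Invented counts: index 0 = vegetated, index 1 = nonvegetated.
example = [[50, 5], [10, 35]]
print(accuracy_metrics(example))
```

User's accuracy reflects commission errors and producer's accuracy reflects omission errors, so reporting both shows whether the vegetation map tends to over- or under-estimate green cover.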

Usage notes
The MGCI datasets produced between 2010 and 2019 comprise three raster files in GeoTIFF format and one vector file. Each raster file corresponds to one period, is 1.21 GB in size and is stored as 32-bit floating-point values. The vector file contains the boundaries of the economic corridors that form the study area of this research, together with the projection area.
To fully understand the variation in the MGCI and to devise corresponding strategies for the areas requiring protection, the MGCI urgently needs to be provided at the subcountry level. The MGCI datasets produced in this study can be used to evaluate the spatiotemporal patterns and various factors influencing mountain health at local to regional scales in the economic corridors of the BRI. For example, for the economic corridors with lower MGCIs, such as CPEC, CCAWAEC and NELBEC, water shortages or low temperatures are the main constraints on the growth of vegetation, and barren land was observed in the mountainous areas. It is necessary to develop protection and restoration measures for mountain vegetation in these corridors. For economic corridors with high MGCIs, such as CMREC, CIPEC and BCIMEC, more attention should be given to biodiversity conservation measures. Since the MGCI is extracted directly from Landsat observations, it can be updated annually by using multisource satellite images such as Landsat and Sentinel-2. However, validation with high-spatial-resolution images or in-situ observations is recommended when this dataset is used at a local scale.
Note to Table 2: Overall accuracy = 94.06%; CR = cropland; FR = forest; GR = grassland; SR = shrub land; WE = wetland; WB = water body; TU = tundra; IA = impervious area; BL = bare land; SI = snow/ice; UA = user's accuracy; PA = producer's accuracy.

Disclosure Statement
No potential conflict of interest was reported by the author(s).

Open Scholarship
This article has earned the Center for Open Science badge for Open Data. The data are openly accessible at http://www.doi.org/10.11922/sciencedb.1005.