Fine-resolution mapping and assessment of artificial surfaces in the northern hemisphere permafrost environments

Permafrost degradation has strong and long-lasting effects on anthropogenic land-use activities, contradicting the goal of sustainable development in polar and high-elevation regions. The artificial surface (AS) plays a central role in determining human-environment relationships in permafrost environments. Despite recent progress in monitoring land surfaces, attempts to map permafrost AS with satellite remote sensing have been limited. In this study, we propose an operational framework for fine-resolution mapping and assessment of permafrost AS across the entire Northern Hemisphere landmass. The proposed framework was designed to take advantage of prior knowledge obtained from existing global-scale land-cover products. As a result, a 10 m resolution permafrost AS map for 2016–2017 was created using a locally adaptive classification strategy. We found that the created map exhibited an overall accuracy of 91.7 ± 2.1% with minimum accuracies > 70%. We estimated that the total area of permafrost AS in the Northern Hemisphere was approximately 9,000 km 2 , most notably in the Russian Arctic. Future projections indicate that there will be over one-seventh of the permafrost AS area at high geohazard risk by the end of the twenty-first century. Our study provides new perspectives on the ‘permafrost-human-climate’ nexus, which can advance our understanding of the terrestrial system.


Introduction
Permafrost, accounting for one fifth of the Northern Hemisphere landmass (Obu 2021), is an essential component of the cryosphere (Biskaborn et al. 2019;Smith et al. 2022).Owing to climate change and disturbance regime shifts, the current world near-surface permafrost of approximately 18 million km 2 (IPCC 2019; Ran et al. 2022a) is likely to thaw by 24-69% in the near future (IPCC 2019).In polar and high-elevation regions, rapid permafrost degradation can trigger numerous environmental hazards, including ground subsidence, mass wasting, and thermal erosion, all of which hinder the sustainable development of human communities (Teufel and Sushama 2019).As temperature continues to increase, consequent damage of permafrost thaw-induced disasters is expected to persist (Hjort et al. 2018;Schneider von Deimling et al. 2021;Streletskiy et al. 2023).This, together with the long-standing economic costs of risk adaptation and mitigation (Ran et al. 2022b), necessitates a comprehensive evaluation of the anthropogenic footprint in the permafrost environment (Hjort et al. 2022;Runge, Nitze, and Grosse 2022).
An essential indicator of the anthropogenic footprint is the artificial surface (termed AS hereafter), which includes various non-vegetated surfaces transformed by human beings (Zhao and Zhu 2022).Within the permafrost environment, AS is explicitly defined as man-made built-up structures (e.g.roofs, road pavements and parking lots) or surfaces undergoing mineral extraction (Bartsch et al. 2021).Before the advent of satellite remote sensing, our knowledge of permafrost AS could only be obtained from field inventories (Raynolds et al. 2014) or geographic databases (Hjort et al. 2018).Despite their importance, the uncertainty and amount of labor required are high.Satellite remote sensing has enhanced our capability to gather information on human activities.Scholars have attempted to extract the AS extent as well as its dynamics based on earth-observing imagery, primarily through supervised classification (Gong et al. 2020a;Sabo et al. 2018;Zhang et al. 2020) or thresholding of specific spectral indices (Deng and Wu 2012;Zhang, Schaaf, and Seto 2013).However, previous research has emphasized only warm and temperate zones, leaving permafrost AS information less well-identified.Simultaneously, the world's permafrost environments have experienced unprecedented AS expansion over the past decades, especially in the pan-Arctic (Liu et al. 2022) and several mountainous countries (Luan and Li 2021;Shi et al. 2023).Given the high landscape fragmentation and spectral heterogeneity, large-area fine-resolution mapping of permafrost AS stands out as an imperative task from nearly all aspects of remote sensing, such as data obtaining, classification modeling, and computing capacity.The open archives of fine-resolution satellite imagery and recently available cloud-based platforms, such as Google Earth Engine (GEE) (Gorelick et al. 2017) substantially promote the development of algorithms specifically for permafrost AS identification.For example, Luan and Li (2021) estimated the built-up cover distribution in the ice-rich third-pole region by combining multiple satellite datasets.Similar approaches have also been adopted for characterizing man-made infrastructure along the permafrost affected Arctic coasts (Bartsch et al. 2020;2021).These regional-scale studies have enhanced our understanding of anthropogenic impacts on permafrost, but they did not offer a spatially complete estimation of the entire Earth's terrestrial surface.To date, a consensus on the worldwide permafrost AS estimation and its distribution pattern is still elusive.
Apart from considerable mapping efforts at local and regional scales, pre-existing global land products provide an alternative to the acquisition of large-area AS knowledge, thus supporting a wide range of end-users.Among these products, the most well-known is Landsat-derived globalscale artificial impervious areas (GAIA) (Gong et al. 2020b), which have been extensively employed for quantifying AS dynamics and its ecological impacts (Chen et al. 2022;Shi et al. 2023).Sentinel-2 Multi-Spectral Instruments (MSI) is referred to as the extension of Landsat observations with advanced spectral, spatial, and temporal characteristics.Thus far, Sentinel-2 based land-cover datasets have become available from continental (Liu et al. 2023) to planetary (Brown et al. 2022;Gong et al. 2019) scales, thus making fine-resolution AS mapping feasible.However, pre-existing global land-cover maps are inconsistent owing to differences in the classification scheme, variation in pixel size, and selection of analysis periods.Essentially, the representation of complex landscapes with any single state-of-the-practice global land-cover product substantially simplifies real-world composition (Hermosilla et al. 2022).This paradox is true for permafrost AS covers, as they are composed of various artificial materials and for land use purposes.For example, considerable attempts have been made in recent years to map global built-up areas in the past years (Huang et al. 2022b;Marconcini et al. 2020).However, a much smaller body of literature deals with mining activities, which occupy a considerable share of the permafrost AS and are closely associated with a series of ecological consequences (Hu et al. 2021).In contrast, although the recently released global mining polygon dataset (Maus et al. 2020;2022) can be used to delineate the extent of mineral extraction worldwide, it lacks built-up information; thus, it is incomplete for permafrost AS identification worldwide.
A promising approach for combining the strength of each land-cover product is dataset fusion (Pérez-Hoyos, Udías, and Rembold 2020), which integrates multi-source data sources for optimal prediction.For example, Liu et al. (2020) derived cropland masks by including different pre-existing datasets.Jung et al. (2006) harmonized three land-cover products for carbon cycle simulations using global fuzzy agreement.Reconciling these previous endeavors, this study attempts to investigate the potential of coupling multiple sources of global land-cover archives and satellite images for fine-resolution and up-to-date AS mapping in permafrost environments.Specifically, we focused on the following: (1) Developing an operational framework that can automatically migrate prior knowledge from pre-existing land-cover maps.
(2) Producing a new spatially explicit permafrost AS mapping product for the entire Northern Hemisphere at 10 m resolution.(3) Revealing the most updated spatial distribution and composition (mining and built-up area) of permafrost AS to advance our understanding of human-Earth interactions.

Study area
Our study aimed to map and analyze the AS across permafrost regions in the Northern Hemisphere.To accomplish this objective, we used the Circum-Arctic permafrost and ground ice map (Brown et al. 2002) (https://nsidc.org/data/ggd318/versions/2),which defines the scope of the study.This map was compiled based on multiple national and regional datasets, and provides the distribution and properties of permafrost and ground ice in the Northern Hemisphere (25-90°N) at a spatial resolution of approximately 12.5 km (Brown et al. 2002).Here, permafrost is defined as Earth's land that remains at or below zero degree Celsius for at least two years (Brown et al. 1997).The value range of the indicator for permafrost is 1-20 plus the categories of ice cap/glaciers, inland lakes, and oceans.Due to the limited permafrost occurrence probability (less than 50%), sporadic and isolated permafrost patches were excluded in this study, and we only selected regions featured by continuous and discontinuous permafrost types, together encompassing approximately 15 million km 2 or 15.63% of the total land surface area in the Northern Hemisphere (Figure 1).The mapping years were 2016-2017 to maximize the temporal overlap among the land-cover products used (see Section 2.3).

Satellite imagery and topography data
Satellite images captured by Sentinel-1 & 2 were the main data sources for mapping the permafrost AS.Sentinel-1 consists of two satellites (S-1A and S-1B) that operate in tandem to provide dual-polarization C-band SAR imagery.Among various Sentinel-1 images, scenes of Ground Range Detected (GRD) in the Interferometric Wide (IW) mode can provide calibrated, orthocorrected observations in dual-polarization at 10 m resolution (Veci et al. 2014).To address discrepancies related to coverage heterogeneity and viewing angle, we utilized the normalized Sentinel-1 Global Backscatter Model product (S1GBM), which offers the first worldwide, 'ready for use' Sentinel-1 backscatter mosaics in VV-and VH-polarizations (Bauer-Marschallinger et al.

2021
).The Sentinel-2 Multi-Spectral Instrument (MSI) has been observing the Earth's land surface since 2015.This optical sensor has a spatial resolution ranging from to 10-60 m, varying with the wavelength.We collected all Sentinel-2 Level-2A images that were available within our study area and study period.Each Sentinel-2 image includes 12 surface reflectance bands, one band of the scene classification layer (SCL), and one band of quality assessment (QA60).Additionally, we accessed the Copernicus Digital Elevation Model (GLO-30 DEM, https:// spacedata.copernicus.eu/collections/copernicus-digital-elevation-model)to identify topography information.The GLO-30 DEM dataset is based on radar satellite imagery obtained for the Tan-DEM-X Mission and captures the land terrain of the Earth.We pan-sharpened all satellite and topography data to 10 m based on the bicubic interpolation (Liu et al. 2020;Liu et al. 2023).

Land-cover maps
We assembled pre-existing land-cover products to generate the 'best-estimate' permafrost AS mapping output.Three well-established global datasets, including ESA WorldCover v100 for 2020 (termed WorldCover hereafter) (Zanaga et al. 2021), Dynamic World V1 (DWV1) (Brown et al. 2022), and Global Impervious Surface Area (GISA10 m) for 2016 (Huang et al. 2022a, b) were selected because: (1) they are publicly accessible or freely available upon request; (2) they include AS (or homologous classes) as an independent land-cover type in the classification schemes; and (3) they have a relatively fine spatial resolution (i.e. 10 m) compared with other existing maps.Although these land-cover datasets attempted to map the terrestrial surface worldwide, most did not fully cover the northern high latitudes, where permafrost is ubiquitous, but lacks valid land-cover classification records (Bartsch et al. 2016).Therefore, in this study, we further included the CALC2020 dataset (Liu et al. 2023;Xu et al. 2022), which is a new baseline product specifically covering the entire land surface in the Arctic for 2020.Because DWV1 is a near-real-time product that offers a land-cover time series for the full Sentinel-2 image archives, a multi-year map composite was processed by calculating the pixel-level mode of all valid classification records during the study years (2016)(2017).

Global mining polygon dataset
The global mining polygon dataset (Maus et al. 2020)

Methods
The basic principle of the developed framework is to migrate prior knowledge from pre-existing global land-cover products so that both permafrost AS mapping accuracy and efficiency can be improved.Figure 2 shows the workflow for predicting the existence of per-pixel permafrost AS and characterizing the spatial patterns and composition across the entire Northern Hemisphere.Within the study area, we defined a pixel with a percentage of built-up/mining area greater than 50% as a permafrost AS pixel.We generated two separate permafrost AS maps: built-up and mining land-cover distributions.The proposed framework comprises three modules.Within the first module, we harmonized multiple pre-existing high-resolution land-cover products, including World-Cover, GISA10 m, DWV1, and CALC2020, to generate a pixel-based map composite for identifying consensus and undefined pixels.Consensus pixels were then used to construct a training sample pool.The second module is committed to classifying undefined pixels by implementing a locally adaptive procedure separately for mining and non-mining regions at a 200 × 200 km tile scale.Finally, we adopted an independent validation sample to evaluate the resultant outputs and to estimate the permafrost AS area.The final mapping outputs were aggregated at different levels for spatial pattern and statistical analysis.

Co-registering and legend homogenization
Different land-cover datasets are inconsistent in terms of map projection.To overcome the multi-product mismatch, we adopted the Lambert azimuthal equal-area projection because it can optimize the areal accuracy of our final permafrost AS maps (Brown et al. 2002).Based on this premise, all collected land-cover products were re-projected and co-registered using the nearest-neighbor strategy when necessary.To mitigate the impact of the legend discrepancy on dataset fusion, we defined a harmonized land-cover scheme with three classes: built-up, mining, and non-artificial.For the binary GISA10 m product, the pervious and impervious classes were simply treated as equivalents of non-artificial and built-up covers, respectively.For the other three raster land-cover products (WorldCover, DWV1 and CALC2020), class lumping was conducted to account for the diversity of natural land surface classes.The mining cover type represents a typical anthropogenic disturbance consequence in permafrost environments and was therefore considered an independent class, despite its absence in all collected land-cover products.

Multi-layer compositing
We stacked four harmonized land-cover datasets (GISA10 m, WorldCover, DWV1, and CALC2020) to generate a pixel-based composite.By doing so, the entire mapped extent includes two parts: consensus land-cover pixels and undefined pixels, depending on the compositing result.Specifically, we regarded pixel i as a potential consensus land-cover pixel candidate if it exhibited an identical class label encompassing all the composited land-cover datasets.Due to the limited coverage of CALC2020, the above-mentioned procedure was conducted separately for terrestrial Arctic (latitudinal treeline northward) and non-Arctic regions.Mathematically, the equation for pixel- based compositing is as follows (Equation .(1)).Where AS i is the composite result, AS GISA10m , AS WorldCover , AS DWV1 and AS CALC2020 are harmonized class labels from GISA10 m, ESA World-Cover, DWV1, and CALC2020, respectively.
On this basis, a neighborhood processing was subsequently conducted to eliminate uncertainty of outlier noise and uncorrected misregistration (Dannenberg, Song, and Hakkenberg 2018;Gray and Song 2013).Neighborhood processing was constrained within a 3 × 3 pixel searching window (i.e.900 m 2 search area), and pixels that failed to represent the major land-cover type in its neighborhood (i.e. less than 50 % of the area) were excluded.Pixels must meet all these criteria to be included in the final consensus land-cover pixel pool.Otherwise, pixels remained 'undefined,' which required secondary classification.

Training sample acquisition
The resultant consensus land-cover pixel pool was used for training sample acquisition.Previous studies have suggested that a single universal training sample set would be problematic for largearea AS mapping (Huang et al. 2022b;Liu et al. 2023).Therefore, we developed a novel algorithm that automatically determines the optimal consensus pixels for a local adaptive training sample derivation.The first step of this algorithm was the division of the entire study area into 832 non-overlapping tiles with a 200 × 200 km grid system (see Figure 1).The grid size was determined as a compromise between the local representative and computational efficiency.In each tile, the subset consensus land-cover pixel pool was further grouped into two broad types: mining and non-mining regions, according to the mining polygon dataset.The second step was to allocate the training sample pixels using a stratified random strategy.For mining regions, the two strata used were mining and non-artificial, each of which was allocated 600 pixels.For non-mining regions, we used built-up and non-artificial strata for the sampling design, and a total of 4,000 randomly selected sample pixels were allocated evenly between both strata.Note that there may be inadequate consensus pixels in the target tile.To address this issue, spatially nearest consensus pixels from adjacent tiles were included to ensure training sample quantity and representativeness.Although the reference class label for each sampled pixel can be determined from the pixelbased composite, it is not error free.Therefore, we employed Google Earth very high-resolution imagery, Landsat satellite data, and the Sentinel-2 SCL band time series (since 2015) to refine the initial acquired training sample pixels.After removing pixels with low-level confidence, a tile-specific, locally adaptive training sample set was constructed to perform secondary classification.

Feature extraction
For the secondary classification model development, we extracted four groups of feature metrics from satellite remote sensing and topographic data.The first feature group was per-band values representing the growing season (June, July, and August) Sentinel-2 surface reflectance using the median compositing method.The second feature group consists of four Sentinel-2 spectral band derived indices: the Normalized Difference Vegetation Index (NDVI) (Tucker 1979), Normalized Burn Ratio (NBR) (García and Caselles 1991), Normalized Difference Snow Index (NDSI) (Hall, Riggs, and Salomonson 1995) and Normalized Difference Water Index (NDWI) (Gao 1996).We also included Sentinel-1 backscatter coefficients in the VV and VH bands as the third feature group.These polarization metrics are expected to provide valuable land surface information, especially when valid optical observations are insufficient owing to clouds or high solar zenith angles (Bartsch et al. 2020;Liu et al. 2023;Xu et al. 2022).The fourth feature group included a single topographic metric, the elevation value extracted from the GLO-30 DEM dataset.

Locally adaptive random forest classifier
We used the Random Forest (Breiman 2001) classifier to relabel undefined pixels because of its ability to handle high-dimensional feature spaces and its robustness over large areas (Belgiu and Drăguţ 2016).To balance the prediction accuracy and computational cost, we parameterized each Random Forest model with 500 decision trees and the square root of the total number of input feature metrics as the number of features to split each node (Liu et al. 2019).The Random Forest classification model was adaptively calibrated using a tile-specific training sample set.This approach enabled us to account for broad-scale variability in permafrost landscapes while maintaining the high precision of supervised classification across the entire Northern Hemisphere.A distinct characteristic of Random Forest is that it returns the feature importance, which is used to quantify the contributions of the metrics in classifying undefined pixels at the tile scale.Following the suggestions of X Huang et al. (2022a, b), we used a Gini-based normalization approach to calculate feature importance.The tile-specific Random Forest model training and feature importance calculation were implemented separately for mining and non-mining regions.

Algorithm implementation on the GEE platform
The proposed permafrost AS mapping algorithm framework was implemented on GEE, a cloudbased platform that enables global-scale geospatial data processing and analysis (Gorelick et al. 2017).Among the numerous Sentinel-2 products available in GEE, we used the HARMONIZED surface reflectance collection with the identification index of 'COPERNICUS/S2_SR_HARMO-NIZED'.We also uploaded the mining polygon dataset to the GEE as a data asset.Other datasets, including Sentinel-1 backscatter mosaics, land-cover products, and the GLO-30 DEM, were accessed from the open-source data archive on GEE.Given the inherently local nature of our methods, discontinuous mapping outputs may exist in the borderlands of neighboring tiles.To account for this issue, we enlarged each tile in the GEE by extending it 10 km from the tile perimeter.This operation resulted in 'overlapping' pixels that were mapped more than once by different tiles, and the best estimates were then obtained using the majority voting criterion.Because our algorithm framework was performed at the pixel level in GEE, salt-and-pepper noise inevitably remains in the classification result (Yang et al. 2023).Therefore, we used a sieving filter to eliminate isolated object clusters (less than five pixels).Finally, the maps of built-up and mining extents were merged and exported at 10 m resolution using the Lambert azimuthal equal-area projection.

Accuracy assessment
Two approaches were employed to evaluate the permafrost AS map.First, we evaluated map accuracy and quantify its uncertainty based on a stratified random sampling.The sample-based validation was performed separately for Arctic and non-Arctic tiles to test the effectiveness of the algorithm in different geographical environments.Consistent with the land-cover legend, the three strata were built-up, mining, and non-artificial, each of which was allocated 300 validation pixel locations by assuming that the standard error of a target for overall accuracy is 0.5 % (Olofsson et al. 2014).The reference class label of each sample pixel location was determined based on the interpretation of multisource satellite images by experts (see Section 3.1.3).Based on the validated sample, a confusion matrix was derived for calculating diagnostic measures, including overall accuracy (OA), producer's accuracy (PA), user's accuracy (UA), and their uncertainties represented by 95% confidence intervals (CI 95 ).We also derived the minimum accuracy (MA, defined as the minimum of PA and UA) at the class level to examine the balance between omission and commission errors (Zhu et al. 2016).The second evaluation approach compared our resultant map with the three fuzed maps (GISA10 m, WorldCover, DWV1) and one independent land cover product: GISD30 (Zhang et al. 2022).It is worth noting that none of the products should be regarded as ground truth because they have limited AS mapping accuracies in permafrost environments (Bartsch et al. 2016).Instead, cross-comparison offers insight into the agreement between these products and our map.We performed a comparison at the tile scale, and maps of the difference were acquired for visual interpretation and statistical analysis.

Area estimation
We performed the area estimation at two stages to ensure the validity of all statistics reported throughout this study.First, we followed the approach proposed by Olofsson et al. (2014) to perform error-adjusted area estimations for each of the three strata/classes (built-up, mining, and nonartificial).Based on the acquired confusion matrix, the error-adjusted area estimates and associated uncertainties were calculated using Equations ( 2)-( 6).Where Â is the unbiased area estimate, A is the map-based area estimate, H is the number of sampled strata, ̅ p h is the mean proportion of samples in stratum h.The standard error of the area (s.e.) was estimated based on the variances of p u for the sample pixels in each stratum.s 2 ph is the variance of the sample in stratum h.The 95% confidence interval (CI 95 ) was calculated by multiplying s.e. by 1.96.
Despite the potential of correcting area estimation errors (Friedl et al. 2022;Liu et al. 2018), the sample-based strategy is dependent on sample allocation and stratification, which may limit its practicality (Liu et al. 2022).Therefore, at the second stage, we employed the conventional pixel counting method (Gong et al. 2020b) to calculate area statistics directly from our permafrost AS output.Similar to the sample-based strategy, the gross areas of the three strata were calculated.Then, we focused on the classes of built-up and mining, and disaggregated its area estimate at multiple levels (see Section 3.4.3)for more specific analyses.

Spatial pattern analysis
To understand the spatial patterns of permafrost AS across the Northern Hemisphere, we aggregated per-pixel mapping results to a set of spatial levels, including country, continent, biophysical zone, and permafrost type (continuous/discontinuous).The country and continent level aggregations were conducted based on the global administrative country boundary shapefile released by the National Geomatics Center of China.We used the biophysical permafrost zonation dataset (Ran et al. 2021) to obtain statistical results for five main permafrost stability groups: climate-driven (CD), climate-driven/ecosystem-modified (CDEM), climate-driven/ecosystem-protected (CDEP), ecosystem-driven (ED), and ecosystem-protected (EP).Readers can refer to (Ran et al. 2021) for more technical details of the biophysical permafrost zonation.To project the impact of permafrost thaw-induced disasters on AS in the near future, we linked our map to a list of geohazard potential index layers (Karjalainen et al. 2018), within which the per-pixel geohazard risk level is identified as one of three categories: low, moderate, and high.The three scenarios include representative concentration pathway 2.6 (RCP 2.6), RCP 4.5, and RCP 8.5, which belong to two periods:2041-2060 and 2061-2080.

Sample-based evaluation
The sample-based evaluation shows that our mapping results have high accuracies (Table 1), with an achieved OA of 91.7+2.1% for the entire study area.At the class level, all mapped land-cover types performed reasonable and overall balanced classifications, with MAs of at least 72%.In general, the two permafrost AS classes had larger values of omission errors than commission errors (UA > PA), indicating a higher probability of underestimation for permafrost built-up and mining covers in the Northern Hemisphere.On the contrary, the opposite outcome (UA < PA) was found for the non-artificial class.Based on the statistics, we further found that the majority of error metrics are within a 5% difference between Arctic and non-Arctic regions, confirming reasonable mapping robustness across varied geographical environments.

Inter-comparison with contemporary global datasets
Figure 3 shows the spatial patterns and summarizes the statistics of the tile-level permafrost AS area differences between our estimations and the four global land-cover datasets.For all inter-comparison cases, the majority of mapped tiles exhibited good consistency (within +25 km 2 area difference), indicating the reasonable mapping performance of our results using pre-existing products as the baselines.According to the statistics, our results show slightly higher permafrost AS estimation than WorldCover (Figure 3a) and GISA10 m (Figure 3c) with mean area difference values of 6.19 and 0.25 km 2 per tile, respectively.On the contrary, lower permafrost AS estimations by our output are found compared to DWV1(Figure 3b) and GISD30 (Figure 3d) with mean difference values of −33.37 km 2 and −0.21 km 2 per tile.Spatially, the inter-comparison results were heterogeneous, with clusters of both positive and negative values.Consistent across all comparison cases, notable clusters of positive values appeared mostly in southern and southeastern Siberia.For the rest of the terrestrial Arctic (e.g.Russia's Yamalo-Nenets district, Alaska, and Canadian Arctic), the opposite tendencies can be observed.Compared with the Arctic regions, the difference distributions between our map and baseline datasets have a higher degree of agreement in non-Arctic regions.Specifically, vast consistent areas in the Qinghai-Tibet Plateau from the WorldCover case were also observed using DWV1, GISA10 m and GISD30 as references.Among all non-Arctic tiles, the largest disagreement stands out in Central Asia and the European Alps, where our mapping results show observed overestimation and underestimation referring to WorldCover/GISA10 m/GISD30 and DWVI derived results, respectively.We compared our permafrost AS mapping results with contemporary datasets by selecting five regional subsets.These subsets cover all continents in the Northern Hemisphere and are representative of climate, land use, and social conditions.Through visual interpretation of Google Earth images, we show that our resultant maps are overall comparable with those from references, while providing more accurate information that is not fully captured by fuzed land-cover products.Similar to the tile-level evaluation, WorldCover and GISA10 m exhibited an under-predicted extent of permafrost AS in most subsets.DWV1, however, showed detectable commission errors (Figure 4b, d, e).The greatest mapping difference is found in the two sub-regions of industry, where our outputs are the only product that correctly identifies both built-up and mining covers (Figure 4a, c).The robustness of our results is further confirmed by comparing with an independent global land cover product (GISD30) (Zhang et al. 2022), which displayed underestimations of permafrost AS in sub-regions a, b, and d.

Feature importance
Figure 5 and 6 show the importance of feature metrics in classifying permafrost built-up and mining cover, respectively.For the built-up output produced, we found that contributions from all feature groups varied substantially over space.Comparable performances by Sentinel-2 surface reflectance bands and spectral indices were observed, although the latter seemed more helpful in a few Russian Arctic tiles (Figure 5a, b).Our results also confirm the usefulness of Sentinel-1 derived backscatter coefficients, especially in the Yamal-Gyda Peninsula, Laptev Sea coast, and North Alaska (Figure 5c).Among the four feature groups, topography was the most beneficial for explaining the spatial distribution in the classified permafrost built-up cover.Spatially, there are over 56% valid tiles (containing undefined pixels), where topography plays a major role (Figure 5d).We further aggregated the per-tile feature importance statistics to the entire study extent and ranked the mean contribution of each metric (Figure 5e).As expected, the most helpful metric was elevation (mean importance value of 11.9), followed by Sentinel-2 surface reflectance B1 (mean importance value of 8.1) and Sentinel-1 VH-polarization band (mean importance value of 6.4).For mining cover classification, feature importance variability among tiles was less detectable (Figure 6a-d).Relatively balanced contributions from different feature groups were also observed through the bar plot, and the top three contributors were NDVI, Sentinel-2 surface reflectance B1, and elevation, with mean importance values of 6.5, 6.4, and 6.0, respectively (Figure 6e).

Spatial patterns and composition
Based on the mapping results and probability sample data, we estimated that the area of permafrost AS in the Northern Hemisphere is 8,953+345 km 2 (CI 95 ), representing an approximately 1.2% share relative to contemporary global artificial surfaces (Gong et al. 2020b).We also implemented a pixel counting-based area estimation, which exhibited a consistent magnitude of permafrost AS extent (8643 km 2 ).Tile-based aggregation was performed on a spatially continuous map of permafrost AS to better characterize its spatial patterns and composition (Figure 7).Among 832 tiles covering the study area, we found that 478 were occupied with detectable permafrost AS distributions (more than 1 km 2 ), which resulted in a mean area value of 18.1 km 2 per tile.Spatially, permafrost AS covers are typically concentrated in Western and Southern Siberia, where intensive human activities have been reported by previous research (Liu et al. 2022;Walker et al. 2009).This also mirrors the limited permafrost AS occupation elsewhere on Earth, despite observations in several mountainous regions (Figure 7a).The overall permafrost AS distribution was a result of integrating built-up and mining mapping outputs, with area contributions of 71.4% and 28.6%, respectively.The spatial patterns of permafrost built-up and mining covers generally match well with each other (Figure 7b, c), but the latter exhibits a slightly larger mean area (18.0 km 2 per tile) than the former (13.8 km 2 per tile).Permafrost AS land-covers are unevenly distributed across continents, biophysical conditions, and permafrost types (Figure 7d).Of the three continents involved in this study, Europe leads the tile-level mean area of permafrost AS and is mostly located along the Alps.In the biophysical domain, the largest mean area of tile-specific permafrost AS was observed in the ED zone, tightly followed by EP and CDEM regions.We also found that AS covers constructed in discontinuous permafrost environments exhibited a slightly larger tile-scale area estimation than those in continuous permafrost environments.
At the country level, Russia is the world leader in permafrost AS areas, accounting for approximately 55% of the total Northern Hemisphere.Canada is another country with permafrost AS areas greater than 500 km 2 .These two countries, together with the United States, China, Norway, Afghanistan, Kyrgyzstan, Pakistan, Mongolia, and Switzerland, are the top ten contributors ranked by their estimated permafrost AS area shares (Figure 8a).Within these countries, some notable permafrost AS clusters have been previously recorded in studies at the local and regional scales.Our estimates are generally in line with these estimates and offer much more spatially explicit details.For example, we found that our map captures the AS distributions and compositions of two Russian Arctic cities (Vorkuta and Yakutsk) where built-up and mining covers coexist (Figure 8b, c).The fine spatial resolution of our map becomes even more essential for small-scale permafrost AS identification, as illustrated by the case of the Diavik Diamond Mine located in the Northwest Territories, Canada (Figure 8d).In non-Arctic regions, permafrost AS areas are typically constructed as built-up covers, but their patterns vary substantially because of the joint influences of both natural environments and anthropogenic drivers (Figure 8e-f).

Projected geohazard potential on permafrost AS
By combining our permafrost AS map with the spatially explicit dataset of near-future geohazard potential (Hjort et al. 2018), we predicted the impacts of permafrost degradation on artificial infrastructures under model-based climate scenarios for 2041-2060 and 2061-2080 (Figure 9).The statistical results show that 14.4∼36.4% of permafrost AS will be located in areas with high geohazard potential, depending on the modeled time period and scenario (Figure 9a).In other words, there are more than one-seventh permafrost areas in the Northern Hemisphere, where artificial construction will become extremely unsafe.For permafrost built-up covers, the largest percentage of high geohazard potential was detected under RCP 8.5, which was closely followed by RCP 4.5, and RCP 2.6, respectively (Figure 9b).We found a similar trend for permafrost mining cover, but with smaller magnitudes of high geohazard potential proportions (Figure 9c).The difference between the two time periods is small, except for RCP 8.5, in which permafrost AS areas featured by high geohazard potential nearly doubled during 2061-2080.
Discernible spatial imbalances of high geohazard potential on permafrost AS were also revealed across all modeled climate scenarios (Figure 10).For permafrost built-up covers, the shifted climatic conditions under RCP 2.6, during 2041-2060, resulted in four major hotspots of high geohazard risk, including the Kanin-Pechora tundra, South Siberia, Southwest Alaska, and East Qinghai-Tibet Plateau (Figure 10a).These hotspots are regions where permafrost infrastructure is densely located and are largely associated with industrial/traffic development (Ji et al. 2019;Kumpula et al. 2011;Raynolds et al. 2014;Zhang et al. 2022).The model application of RCP 4.5 and RCP 8.5, during 2041-2060, will expect 10 and 18 more tiles to change from low/moderate risks to high geohazard potential, most of which are spatially adjacent to pre-existing high-risk tiles.Comparably, the 2061-2080 period is likely to witness more permafrost built-up covers identified as having high geohazard potential, especially under two scenarios forced by more extreme climate parameters (Figure 10b).In the worst scenario (RCP 8.5) for the 2061-2080 period, there will be 68 and 52 more tiles suffering from high geohazard potential when compared to RCP 2.6 and RCP 4.5, respectively.For permafrost mining covers, low/moderate risk types always prevail, although a few tiles with high geohazard potential are spread across regions at high latitudes (Figure c, d).As expected, the largest increase in geohazard potential will occur during 2061-2080 under RCP 8.5, in which the number of high-risk tiles nearly doubles.

Full use of existing land-cover products for permafrost AS mapping
Human disturbance is a consequence and cause of global permafrost degradation (Biskaborn et al. 2019;Hjort et al. 2022).Thus, accurate mapping of permafrost AS is important for understanding the interaction between socioeconomic development and cryosphere dynamics.Despite numerous attempts at AS monitoring using satellite remote sensing, the identification of permafrost AS remains a challenging task, primarily because of its spectral heterogeneity and small patch size (Liu et al. 2023).Over large areas, permafrost AS mapping is typically implemented by classifying all pixels within the study domain, which is laborious and vulnerable to inter-region variability.
Empowered by the open source of Earth observation satellite images and the availability of cloud computing platforms, recent years have witnessed a burgeon of planetary-scale high-resolution land-cover products, some of which encompass information that is highly relevant to permafrost AS.To date, however, no global land-cover dataset has been specifically used for permafrost environments, and they are inconsistent with each other.As a result, no single previously created global land-cover product is directly applicable to large-area permafrost AS mapping.To address this challenge, we developed a new operational framework that takes full advantage of existing global land-cover datasets for permafrost AS mapping.By implementing three sequentially integrated algorithm modules, the proposed framework can leverage the strength of land-cover datasets while overcoming their drawbacks.We evaluated the proposed framework using validation sample points and contemporary AS maps, both of which exhibited satisfactory performances under varying environmental settings.
The proposed framework provides a new paradigm for migrating prior knowledge from multiple pre-existing land-cover datasets.Rather than simply extracting class information from previously generated maps (e.g.Huang et al. 2020;Liu et al. 2023), the randomly stratified training sample in this study was derived from the consensus land-cover pixel pool, so the impacts of land-cover imbalance and misclassification error can be greatly reduced or eliminated (Li et al. 2021).This benefit is substantial because training sample representativeness can significantly influence the supervised classification output regardless of the mapping data and algorithm employed (Foody and Arora 1997;Radoux et al. 2014;Yang et al. 2023;Zhu et al. 2016).Apart from the improvement in training sample collection, identifying consensus and undefined pixels further benefits permafrost AS mapping in a timely and cost-effective manner.We directly used consensus pixel labels for AS mapping so that secondary classification can focus only on undefined pixels that exhibit inconsistent class types using reference land-cover datasets.This 'exclusion-inclusion' method not only speeds up the entire mapping process but also minimizes the uncertainty incurred by secondary classification.
Permafrost AS areas are typically distributed with heterogeneous patterns and compositions (Liu et al. 2022).Therefore, the constructed classification model should reflect the local properties of classes (Bartalev et al. 2014;Belgiu and Drăguţ 2016;Deng and Wu 2013).Instead of building a generalizable model over large areas, some studies have sought to develop regionally specific algorithms that consider spatial variability when mapping land-cover.Gong et al. (2020b) proposed different artificial impervious area mapping approaches for arid and non-arid regions.Similar strategies have also been adopted for global-scale cropland identification (Xiong et al. 2017;Zhang et al. 2021).Reconciling these successful efforts, we took a step forward by applying a locally adaptive procedure to classify undefined pixels.In this study, based on multi-source feature metrics, a random forest classifier was automatically calibrated at the tile level using the spatially nearest training sample.Consequently, comparable classification accuracies were achieved under various geographical conditions (Table 1).The superiority of the locally adaptive classification is further confirmed by the remarkable feature importance discrepancy at the tile level.Sentinel-1 backscatter coefficients, for example, have been proven to be more important for permafrost AS identification in the Russian Arctic than in the Qinghai-Tibet Plateau (Figure 5 and 6).In summary, the aforementioned results elucidate the potential of large-area, self-adjusted permafrost AS mapping without significant user intervention.

Environmental and policy-related implications
We provide a new Northern Hemisphere-wide permafrost AS map at 10 m resolution, which is of theoretical importance to human-cryosphere relationships and practical relevance to sustainable development goals.According to our estimation, the total permafrost AS area in the Northern Hemisphere is 8953+345 km 2 (CI 95 ), which is higher than that in previous studies at regional or continental scales (Bartsch et al. 2021;Chen, Pandey, and Seto 2023;Xu et al. 2022).Our generated map can benefit ongoing efforts to understand the environmental impacts of human-induced land-cover changes on permafrost.For example, converting vegetated land surfaces to AS covers dramatically alters the Earth's radiation budget, potentially accelerating the permafrost degradation process (Hjort et al. 2022).Spatially explicit information on permafrost AS is also helpful in improving permafrost hydrology (Bring et al. 2016) and biogeochemical estimates (Miner et al. 2022), both of which in turn aid policymaking for better management and planning of man-made infrastructures underlain by permafrost degradation.
Based on the resultant map, we noted some observable permafrost AS hotspots, mostly appearing in the Russian Arctic, the United States (Alaska), and Northern Canada (Figure 7 and 8).These patterns are consistent with previous research (Bartsch et al. 2021;Liu et al. 2022;Melvin et al. 2017;Raynolds et al. 2014) and confirm the pivotal role of the aforementioned countries in leading infrastructure construction on permafrost.With improved spatial resolution, the generated map can uncover the hidden distribution of small-scale permafrost AS areas that would otherwise not be observed via coarse-resolution datasets.This improvement is particularly beneficial for monitoring permafrost settlements holding indigenous communities, which constitute the majority of the population in polar and high-elevation regions that are extremely vulnerable to permafrost degradation-induced hazards (Huntington et al. 2019).Based on our study, not only the pattern of permafrost AS, but also the composition information was identified.We created individual mapping results for built-up and mining land use.Such an advantage is critical because a permafrost environment characterized by built-up covers is fundamentally different from that experienced by mining activities, and this will affect a great number of biotic and abiotic processes (Tang and Werner 2023;Zhao and Zhu 2022).
As the climate is constantly changing with shifts in temperature globally, a vast array of studies have predicted the intensification and extensification of permafrost degradation in the future (AMAP 2017;Hjort et al. 2018;Miner et al. 2021;Teufel and Sushama 2019), making efforts to maintain safe infrastructure elusive or even in vain.Using projected climate scenarios, we estimated that high geohazard potential will cover 14.4∼36.4% of permafrost AS areas in the Northern Hemisphere, a magnitude generally comparable to previously modeled results (Hjort et al. 2022;Streletskiy et al. 2023).Given the continuation of permafrost AS growth associated with natural resource extraction (Liu et al. 2022), we may expect greater geohazard potential that was not fully informed in the current model predictions.The combination of the projected geohazard potential and our map also provides the possibility of mitigation initiatives targeting and prioritizing the least resilient permafrost AS areas.Our study shows an overall greater geohazard potential for built-up covers than for mining covers, and pre-existing high-risk permafrost AS areas will continue to expand to spatially adjacent regions (Figure 9 and 10).Consequently, policymakers and planners can use more focused engineering operations to enhance the structural resistance of buildings to permafrost degradation.

Limitations and future work
Although the generated permafrost AS map shows promising performance, there is a need for future research on the main data sources and modules of the proposed framework.In this study, we focus on terrestrial environments with more than 50% permafrost coverage, which could ignore AS areas distributed in sporadic or isolated patches.We acknowledge that there exists a scale mismatch between our generated AS product and the used permafrost extent map, thus adding extra uncertainty into the final area estimation.For example, in the Alps where permafrost distribution is typically fragmented, some AS clusters outside the permafrost extent may be incorrectly included.This situation can be improved by using higher-quality global permafrost extent data, especially those associated with finer spatial resolutions (e.g.Obu et al. 2019).Additionally, the supervised classification of land-cover considering permafrost environments requires accurate prior knowledge to be commensurate with the size of the pixel.In this study, the accuracy of the acquired training sample relies on the quality of the fuzed global maps and the mining polygon dataset.If these products offer incorrect information, then a biased training sample set is expected.Another critical step for accurate permafrost AS mapping is the extraction of the feature metrics.For this study, all geospatial metrics feeding the machine learning model were derived from global free-accessible Sentinel-1&2 and GLO-30 DEM data.Despite the overall high accuracy, confusion (e.g.bare soil incorrectly identified as AS) still exists in the final map.To further reduce classification errors, many other types of geospatial layers, such as OpenStreetMap (Chen et al. 2021), climatic variables (Crowther et al. 2015), and nighttime lights (Gong et al. 2020a;Levin et al. 2020) can be taken advantage of.
The basic assumption for locally adaptive mapping is the existence of sufficient spatially close sample points for classifier calibration (X Huang et al. 2022b).However, this assumption may be violated because of imbalanced training sample availability among classes and across regions.In this study, we applied a locally adaptive procedure by dividing the entire study area into a 200 × 200 km grid system, which worked well for capturing the patterns of permafrost AS in the Northern Hemisphere.Nevertheless, a lack of permafrost AS training samples was still observed for a few tiles, causing the proposed framework to be less useful in these particular areas.Therefore, further efforts should be emphasized on the tradeoff between the local representativeness of the training sample and its availability over space.In addition, the heterogeneity of permafrost landscapes leads to the common existence of mixed pixels, which may add uncertainties to the final output (Chen, Pandey, and Seto 2023).The implementation of a sub-pixel algorithm enhances the reliability of the proposed framework.Finally, it should be noted that permafrost AS may encompass some small-scale built-up or mining structures (e.g.pipelines) which are not reflected in our study, and this issue needs to be further addressed using higher resolution satellite remote sensing data such as the PlanetScope.
The ongoing land use policy transition for degraded permafrost adaptation warrants not only the current land-cover distribution information, but its long-term, high-frequency dynamics.For example, recent studies documented several hotspots of remarkable infrastructure development in the Northern Hemisphere permafrost environments, including the Yamal Peninsula, Russia (Bartsch et al. 2021), northern Canada (Liu et al. 2022), and the Qinghai-Tibet Plateau, China (Zheng et al. 2023).Although the subject in this study is focused on permafrost AS mapping for 2016-2017, our product can be used as the baseline for tracking the spatial-temporal changes of global permafrost AS over the past decades, thus carrying the potential for better management and planning of permafrost infrastructure aiming at achieving sustainable development goals.

Conclusions
In this study, we developed a novel framework for fine-resolution mapping and assessment of permafrost AS across the entire Northern Hemisphere.This framework was implemented by taking advantage of pre-existing land-cover products and classifying large-area land surfaces in a locally adaptive manner.The evaluation results obtained using the independent validation sample set and contemporary datasets support the feasibility of the algorithm.Following the proposed framework, we estimated that the total area of permafrost AS in the Northern Hemisphere reaches 8,953+345 km 2 , mostly located in regions characterized by intensive human activities.At the class level, built-up cover, rather than mining areas, dominated the permafrost AS pattern.According to future projections, high geohazard potential will threaten 14.4-36.4% of permafrost AS areas towards the end of this century.Collectively, the new framework provided improved estimates of large-area permafrost AS and enhanced our understanding of permafrost environmental dynamics under human disturbance and climate change.

Figure 1 .
Figure 1.Study area with 832 tiles of the 200 × 200 km grids.Continuous and discontinuous permafrost regions are labeled by green and brown colors, respectively.The background terrain map is provided by ESRI.

Figure 2 .
Figure 2. Flow chart of the developed framework for fine-resolution permafrost AS mapping in the Northern Hemisphere.

Figure 3 .
Figure 3. Tile-level permafrost AS area difference (km 2 ) between this study and WorldCover (a), DWV1(b), GISA10 m (c) and GISD30 (d).The insert histograms show the frequency distributions of the permafrost AS area difference, with the red line indicating the mean value.

Figure 4 .
Figure 4. Regional subsets of permafrost AS mapping comparison.(a) Komi Republic with the center of 67.5°N, 64.2°E;(b) Prudhoe Bay with the center of 70.2°N, 148.4°W;(c) Fort Smith with the center of 64.7°N, 110.6°W;(d) Changzhu with the center of 29.2°N, 91.8°E; (e) Matrei in Osttirol with the center of 47.0°N, 12.5°E.All subsets are mapped with WGS84 latitude/longitude coordinate system.Built-up, mining and non-artificial covers are labeled by red, yellow and green colors, respectively.

Figure 5 .
Figure 5. Feature importance in identifying permafrost built-up covers.The spatial pattern of feature importance is quantified and shown at the tile scale for each of four groups including surface reflectance (a), spectral index (b), backscatter coefficient (c), and topography (d).The gray color indicates no undefined pixel within the corresponding tile.All valid tiles are further aggregated to obtain the mean importance value of each used metric across the entire study extent (e).

Figure 6 .
Figure 6.Feature importance in identifying permafrost mining covers.The spatial pattern of feature importance is quantified and shown at the tile scale for each of four groups including surface reflectance (a), spectral index (b), backscatter coefficient (c), and topography (d).The gray color indicates no undefined pixel within the corresponding tile.All valid tiles are further aggregated to obtain the mean importance value of each used metric across the entire study extent (e).

Figure 7 .
Figure 7.The map of permafrost AS for the Northern Hemisphere.At the tile level, not only the overall permafrost AS area (a), but contributions from built-up (b) and mining cover areas (c) are displayed.The insert histograms show the tile frequency distributions.Boxplots (d) show permafrost AS area statistics by biophysical zone (CD, CDEM, CDEP, ED and EP), by continent (ASI, EU and NA represent Asia, Europe and North America), and by permafrost type (C and D represent continuous permafrost and discontinuous permafrost).

Figure 8 .
Figure 8.(a) Country-level statistics of total permafrost AS area estimation (km 2 ), accompanied with spatially-explicit mapping results of five selected subsets including: (b) Vorkuta, Russia, with the center of 67.5°N, 64.1°E; (c) Yakutsk, Russia, with the center of 62.0°N, 129.7°E;(d) Diavik diamond mine with the center of 64.5°N, 110.3°W;(e) Tuqiang town, China, centered at 52.9°N, 122.8°E; (f) Targar, Kyrgyzstan, centered at 43.3°N, 77.2°E.All subsets are mapped with WGS84 latitude/longitude coordinate system.Built-up and mining covers are labeled by red and yellow colors, respectively.

Figure 9 .
Figure 9. Projections of high geohazard potential percentage under different climate scenarios for the 2041-2060 (blue bar) and 2061-2080 (yellow bar) time periods.(a)-(c) are statistical results for overall permafrost AS, permafrost built-up covers, and permafrost mining covers, respectively.

Figure 10 .
Figure 10.The map of high geohazard potential under different climate scenarios.(a) permafrost built-up cover for the 2041-2060 period; (b) permafrost built-up cover for the 2061-2080 time period.(c) permafrost mining cover for the 2041-2060 period; (d) permafrost mining cover for the 2061-2080 period.Tiles of no permafrost mapped AS are labeled by gray color.

Table 1 .
Accuracy metrics and associated CI 95 estimates for the created permafrost AS map using validation sample.UA, PA, MA and OA are user's accuracy, producer's accuracy, minimum accuracy, and overall accuracy, respectively.