The superiority of the Adjusted Normalized Difference Snow Index (ANDSI) for mapping glaciers using Sentinel-2 multispectral satellite imagery

ABSTRACT Accurate monitoring of glacier extents and their dynamics is essential for improving our understanding of the impacts of climate and environmental changes in cold regions. The satellite-based Normalized Difference Snow Index (NDSI) has been widely used for mapping snow cover and glaciers around the globe. However, mapping glaciers in snow-covered areas using existing indices remains challenging because these indices cannot reliably separate snow, glaciers, and water. This study aimed to evaluate a new satellite-based index and apply machine learning algorithms to improve the accuracy of glacier mapping. We tested a new index based on Sentinel-2 data, which we call the Adjusted Normalized Difference Snow Index (ANDSI). ANDSI and NDSI were each used with five machine learning algorithms, namely Artificial Neural Network, C5.0 Decision Tree Algorithm, Naive Bayes classifier, Support Vector Machine, and Extreme Gradient Boosting, to map glaciers, and their performance was evaluated against ground reference data. Four glacierized regions in different countries (Canada, China, Sweden, and Switzerland-Italy) were selected as study sites to evaluate the performance of the proposed ANDSI. Results showed that the proposed ANDSI outperformed the original NDSI, and the C5.0 classifier achieved the best overall accuracy and Kappa among the five machine learning classifiers in the majority of cases. Averaged across all models and study regions, the original NDSI yielded an overall accuracy of around 91% and the proposed ANDSI around 95% for glacier mapping. This study demonstrates that the proposed ANDSI is an improved method for accurately mapping glaciers in cold regions.


Introduction
Glaciers are an essential component of the cryosphere. They provide water resources to communities in cold regions (Parajka et al. 2010), critically affect the Earth's energy balance, and are key contributors to modern global sea level rise. Glaciers can play an important role in regulating the radiation balance of the Earth's surface, thereby affecting the climate (Bengtsson 2014). As large, frozen masses of ice, glaciers reflect sunlight, which helps to cool the Earth's surface. Their melting also releases fresh water into the ocean, which influences ocean currents and, in turn, regional climate patterns. According to Marzeion et al. (2018), glaciers and ice caps contributed 21% to the observed sea-level rise from 1900 to 2015. This finding demonstrates the critical role that glaciers play in the Earth system, particularly in relation to the water cycle and climate change. Glaciers also serve as a crucial source of freshwater for many regions around the world (Reznichenko et al. 2010). They store water during the winter and release it slowly during the summer months, providing a consistent supply of water for ecosystems and human populations. According to Kaser, Großhauser, and Marzeion (2010), more than one-sixth of the global population relies on water from glaciers, particularly in high-altitude regions like the Himalayas. This highlights the significance of glaciers in supporting freshwater resources and the potential impact of their decline on global water security. Together, these findings underscore the need for continued research and conservation efforts to ensure the sustainability of these critical natural resources.
Timely and accurate monitoring of glacier extents and their spatiotemporal variability is critical for many applications (Hill, Rachel Carr, and Stokes 2017; Kneib et al. 2021; Paul, Huggel, and Kääb 2004; Redpath et al. 2013), such as water resources management (Bolch and Marchenko 2009; Huss 2011; Rangecroft, Harrison, and Anderson 2015), hydrological modeling, and impact assessments of climate and environmental changes in cold regions (Bindschadler et al. 2001; Bishop et al. 2004; Che et al. 2008; Kargel et al. 2005). Since the 1970s, many studies have relied on band ratios and indices, developed for a number of satellite imagery platforms, that leverage the distinct spectral signatures of snow and ice in the visible, near-infrared, and shortwave-infrared wavelengths (Foster et al. 2009; Wang et al. 2019). Previous studies have mapped glacier and snow-covered areas by combining satellite bands such as the shortwave infrared (SWIR) and near-infrared (NIR) (Hall, Riggs, and Salomonson 1995; Dozier 1989). Band ratios and indices are commonly used for this purpose because they differentiate surface features based on their reflectance in different spectral bands. The Normalized Difference Snow Index (NDSI) (Hall, Riggs, and Salomonson 1995) has been applied to map glacier and snow cover extent using a number of multispectral satellite platforms, including MODIS, Landsat, and Sentinel-2 (Cayo et al. 2022; Singh et al. 2021; Xie et al. 2020; Zhang et al. 2019).
In addition to band ratios and indices, supervised classification algorithms have been used to map snow and ice cover from satellite data. These techniques use statistical and machine learning algorithms to classify pixels in satellite imagery into land cover classes based on their spectral characteristics and other features. Band ratios and indices, together with classification algorithms, have proven valuable for mapping snow and ice cover and have been widely used in research and applied settings, including for monitoring and predicting the impact of climate change on glaciers and snow cover extent. Various NDSI thresholds have been explored for classifying glaciers using different satellite platforms, ranging from 0.3 to 0.8, with overall accuracies between 75% and 96% (Singh et al. 2021; Zhang et al. 2019). The most accurate performance was achieved by Cayo et al. (2022) when including a minimum NDSI value for detecting glacier pixels.
Classifying surface features such as water is an important issue for water resource management because the extents of surface meltwater ponds (e.g. Salerno et al. 2016; Sneed and Hamilton 2007) and ice-marginal lakes (e.g. Brianna et al. 2022; Zhang et al. 2022) change over time. However, the NDSI has some limitations for glacier mapping. For instance, the index may not be able to distinguish between snow and ice or identify water pixels. Choubin et al. (2019) recommended removing water bodies from images before classifying with the NDSI, which greatly limits the scalability of this approach. To address these limitations, the current study proposes a new index called the Adjusted Normalized Difference Snow Index (ANDSI) and assesses the efficacy of combining machine learning algorithms with satellite indices for accurately mapping glaciers.
Machine learning methods are increasingly used to improve glacier and snow mapping. Classifiers such as support vector machines, k-nearest neighbors, gradient boosting-based models, decision tree-based models, and artificial neural network-based models have been applied successfully in different regions for glacier mapping (Alifu et al. 2020; Qiu et al. 2022; Xie et al. 2020; Khan et al. 2020).
Glacier identification can be affected by the spectral resemblance between glaciers and the surrounding materials (Robson et al. 2020). Satellite imagery, particularly Sentinel-2 data, has proven capable of glacier mapping because it is freely available to the public, offers high temporal, spatial, and radiometric resolution, and has low noise (Alifu et al. 2020; Paul et al. 2016; Yan et al. 2021) compared to other satellite products, such as MODIS and Landsat. Paul et al. (2016) employed Sentinel-2 data to map glacier extents and surface facies and compared the results to those from Landsat 8. Wangchuk and Bolch (2020) used Sentinel-1 and Sentinel-2 data to map glacial lakes and assess their changes over time. However, previous studies identified several key limitations of NDSI for mapping snow and glaciers (Choubin et al. 2019; Sibandze et al. 2014), such as weak separation between snow/glacier pixels and background pixels, misdetection of water pixels as snow/glacier pixels, and subjective threshold values that vary between regions (Kulkarni, Rathore, and Singh 2010; Sood et al. 2020a; Xiao et al. 2010). Removing water bodies from satellite images does not solve the underlying problem of the NDSI and is inefficient for glacier mapping, because water bodies must first be identified and then removed from the original image in an additional pre-processing step. We are therefore motivated to address these limitations by proposing a new satellite-based index that adjusts the NDSI with a soil-based satellite index, the Char Soil Index (CSI), to improve the ability of NDSI to distinguish between land surface features.
This study therefore proposes a new methodology for glacier mapping using Sentinel-2 multispectral satellite imagery. To address the limitations of previous glacier mapping approaches in complex snow-covered environments, we propose the ANDSI. This index aims to improve classification accuracy relative to the NDSI by incorporating the near-infrared, green, and shortwave-infrared bands. This additional spectral information has the potential to improve the differentiation between glaciers and water bodies, which is critical for accurate glacier mapping in snow-covered regions. Furthermore, integrating machine learning algorithms with ANDSI provides an opportunity to enhance classification accuracy and overcome the limitations of traditional threshold methods. The ANDSI combines the advantages of NDSI and CSI and covers a wider spectral response range (barycenters from 560 to 2190 nm) with a wider span of bandwidths (35 to 180 nm), which allows more glacier detail to be extracted and inland water pixels to be better separated from glacier pixels. Specifically, the objectives of this study are to: i) develop and evaluate an alternative remote sensing index (ANDSI) that overcomes the limitations of the classical NDSI for glacier detection while keeping its advantages; ii) test and analyze the ability of the proposed ANDSI in different regions around the globe using Sentinel-2 imagery; and iii) explore the application of different machine learning classifiers for mapping glaciers from satellite-based indices.

Study area and satellite data
Because the proposed satellite index (ANDSI) is intended for water resources management in cold regions, four glacierized regions in the Northern Hemisphere, located in Canada, China, Sweden, and Switzerland-Italy, were selected as study areas. Glaciers are crucial to the ecosystems and water resources of these four regions, which represent different environmental conditions for testing our methods. Glaciers in the Northern Hemisphere are generally more closely monitored and studied because of their proximity to large population centers and their impact on water resources in areas such as the Himalayas, the Alps, and the Rocky Mountains. The retreat of these glaciers can have significant consequences for local water supplies, agriculture, and hydropower generation.
The selection of these four study regions aims to provide a diverse representation of glacierized areas with variations in geographic location, climatic conditions, and glacier types. By including regions with distinct characteristics, this study can evaluate the performance of the proposed ANDSI in different settings and assess its generalizability across a range of glacierized environments. Table 1 lists the locations of these study areas and the satellite images used. We only selected Sentinel-2 images taken between August and October because snow cover is generally at its minimum and high-quality images are easier to obtain due to lower cloud coverage during these months. Snow does not contribute to glacier mass balance from August to October (because snow events are extremely rare), so this period was considered the ideal season for separating glaciers from snow cover using satellite imagery.
Sentinel-2 is an optical satellite mission within the Copernicus Programme, launched by the European Space Agency (ESA) to better understand the Earth system and monitor global natural resources. The Sentinel-2 mission provides free multispectral images at several resolutions (10 m, 20 m, and 60 m), and the current study mapped glacier regions at a 10 m spatial resolution. Sentinel-2 offers clear advantages over other multispectral imaging satellites for glacier studies. First, glacier studies benefit from the high spatial resolution and suitable dynamic range of Sentinel-2 data (Paul et al. 2016). Second, the wide swath of the Sentinel-2 sensor provides a view of a larger region. Third, owing to its shorter revisit period, Sentinel-2 imagery can detect changes in glacier areal extent more accurately (Pandžić et al. 2016). The MSI Level-1C data of Sentinel-2, which are radiometrically and geometrically corrected, have been used for glacier mapping (Alifu et al. 2020; Veettil 2018; Yan et al. 2021; Zhang et al. 2019), so the current study also used these data (https://scihub.copernicus.eu/). Figure 1 shows the top-of-atmosphere (TOA; calculated based on Ranghetti et al. 2020) images from Sentinel-2 for each study region. We used the four Sentinel-2 bands that enter NDSI and CSI in order to exploit the advantages of both indices for mapping glaciers. These are the Green band (central wavelength (CW) = 560 nm), the near-infrared (NIR) band (CW = 842 nm), the first shortwave-infrared band (SWIR1; CW = 1610 nm), and the second shortwave-infrared band (SWIR2; CW = 2190 nm). The Green and NIR bands have a spatial resolution of 10 m, whereas SWIR1 and SWIR2 have a spatial resolution of 20 m. All Sentinel-2 data processing, including resampling to 10 m (with the bilinear method), was carried out with the "sen2r" package (Ranghetti et al. 2020) in the R programming language.

Remote sensing indices
The Normalized-Difference Snow Index (NDSI) (Hall and Riggs 2014) has been used extensively to map snow and glacier extents using Sentinel-2 imagery (Ali et al. 2020; Kuter, Akyurek, and Wilhelm Weber 2018; Luo et al. 2022; Mityók et al. 2018; Sood et al. 2020b; Stojković, Marković, and Durlević 2023; Wang et al. 2022) and is calculated as Equation 1:

$$\mathrm{NDSI} = \frac{\mathrm{Green} - \mathrm{SWIR1}}{\mathrm{Green} + \mathrm{SWIR1}} \tag{1}$$

The current study also used the CSI, a satellite index that is effective in detecting surface features, to develop the ANDSI (Equation 2). The CSI was initially designed to estimate the degree of soil charredness or blackness resulting from wildfires. It is a relatively new index, developed in response to the need for an accurate and quantitative method of assessing the severity of wildfire effects on soils (Sparks et al. 2014).

$$\mathrm{CSI} = \frac{\mathrm{NIR}}{\mathrm{SWIR2}} \tag{2}$$

The current study tested a new satellite-based index derived from Sentinel-2 imagery to overcome the limitations of NDSI for mapping glaciers. This proposed index (ANDSI) can be calculated by Equation 3.
In these equations, Green represents band 3, NIR (near infrared) represents band 8, SWIR1 (shortwave infrared) refers to band 11, and SWIR2 refers to band 12 of Sentinel-2 data.
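As a minimal illustration of how the two component indices can be obtained per pixel, the following sketch computes NDSI and CSI from band reflectances. It is written in Python for illustration only and is not the R workflow used in this study. The function names are hypothetical, CSI is assumed here to be the NIR to SWIR2 ratio, and ANDSI itself is deliberately not implemented because it should follow Equation 3.

```python
import math


def ndsi(green, swir1):
    """NDSI from Green (band 3) and SWIR1 (band 11) reflectance."""
    total = green + swir1
    if total == 0:
        return math.nan  # undefined when both reflectances are zero
    return (green - swir1) / total


def csi(nir, swir2):
    """CSI, assumed here to be the NIR (band 8) / SWIR2 (band 12) ratio."""
    if swir2 == 0:
        return math.nan  # undefined when SWIR2 reflectance is zero
    return nir / swir2


# Hypothetical reflectances for a single bright, snow-like pixel.
print(ndsi(0.8, 0.2))
print(csi(0.6, 0.1))
```

Returning NaN for undefined ratios is one possible design choice; it keeps such pixels distinguishable from valid index values so they can later be assigned to a no-data class.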
The ANDSI includes both the NDSI and CSI in order to cover a wider spectral response range. It can better distinguish inland water pixels from glacier-covered pixels in satellite images, eliminating the need to remove water-covered pixels that arises in NDSI-based classifications.
Glacier mapping via thresholds was implemented in the current study to compare ANDSI with the original NDSI. For NDSI, a threshold of NDSI ≥ 0.42 was adopted based on the literature. To explore a thresholding approach for ANDSI, we tested the range −0.25 ≤ Ln(ANDSI) < 0, which gave the maximum differentiation between glacier and non-glacier regions. This range was defined by trial and error based on how well it separated the glacier and non-glacier boundary in the studied regions.
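The two threshold rules can be sketched as simple per-pixel tests. The Python snippet below is an illustration under stated assumptions, not the implementation used in this study: the function names are hypothetical, the ANDSI values are assumed to have been computed beforehand, and treating non-positive ANDSI values as non-glacier is an assumption made here because their logarithm is undefined.

```python
import math

NDSI_THRESHOLD = 0.42          # NDSI >= 0.42 is mapped as glacier
LN_ANDSI_RANGE = (-0.25, 0.0)  # -0.25 <= ln(ANDSI) < 0 is mapped as glacier


def glacier_by_ndsi(ndsi_value):
    """Apply the NDSI threshold rule to one pixel."""
    return ndsi_value >= NDSI_THRESHOLD


def glacier_by_andsi(andsi_value):
    """Apply the ln(ANDSI) range rule to one pixel."""
    if andsi_value <= 0:
        # Assumption: pixels without a defined logarithm are non-glacier.
        return False
    lower, upper = LN_ANDSI_RANGE
    return lower <= math.log(andsi_value) < upper


# Hypothetical index values for three pixels.
ndsi_values = [0.10, 0.42, 0.75]
andsi_values = [0.50, 0.90, 1.20]
print([glacier_by_ndsi(v) for v in ndsi_values])
print([glacier_by_andsi(v) for v in andsi_values])
```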

Machine learning classifiers
Machine learning techniques have demonstrated their efficacy in various environmental studies, even when employing a single input variable (Fathian et al. 2019; Moazenzadeh and Mohammadi 2019; Olyaie, Zare Abyaneh, and Danandeh Mehr 2017). Five types of classifier algorithms, namely Artificial Neural Network (ANN), C5.0 Decision Tree Algorithm (C5.0), Naive Bayes Classifier (NBC), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGB), were implemented based on (i) glacier mapping = f(NDSI) and (ii) glacier mapping = f(ANDSI) to evaluate their performance in mapping glacier regions. Using a single input variable simplifies the classification process by focusing on one specific feature, which matches the goal of the current study: evaluating the effect of NDSI and ANDSI as inputs to machine learning algorithms. A single input feature also streamlines the interpretation of the machine learning outputs. Therefore, the current study used the two satellite-based indices (NDSI and ANDSI) individually as input features of the machine learning algorithms.
The machine learning classifiers were implemented in the R programming language (R Development Core Team 2019). The process of developing machine learning algorithms for glacier mapping involved the following steps. (1) Data preparation: pre-processing satellite imagery, calculating NDSI and ANDSI, and gathering ground reference samples. (2) Machine learning algorithm selection: ANN, C5.0, NBC, SVM, and XGB, which are commonly used in land cover image classification. (3) Model training: the algorithms were trained using labeled training data. Each pixel in the training data was labeled as glacier or non-glacier according to the four classes defined in section 3.3, and the algorithms learned to identify patterns in the input data associated with glacier pixels. (4) Model testing: the trained models were evaluated on a separate set of data to determine their overall accuracy and Kappa coefficient values. Information regarding the testing data and assessment metrics can be found in section 3.3.

Artificial neural network
An Artificial Neural Network (ANN) is a network of linked nodes that processes data using mathematical techniques. It is a self-adaptive system that may modify its structure in response to internal or external stimuli. The Feed-Forward Neural Network (FFNN) is one of the most common and simplest types of ANN (Schmidhuber 2015), and it was used in the current study as one of the classifiers.

C5.0 decision tree
The C5.0 Decision Tree Algorithm (C5.0) is a widely used decision tree technique for classifying instances. The decision tree is constructed following a multistage, hierarchical decision-making procedure (tree structure) (Quinlan 1986), and C5.0 is intended to construct a tree for a binary dependent variable. Each node in the C5.0 decision tree makes a binary choice that separates one or more classes from the remaining classes.

Naive Bayes classifier
The Naive Bayes Classifier (NBC) belongs to the family of probabilistic classifiers that apply Bayes' theorem with (naive) independence assumptions between the features. NBC is the simplest Bayesian network model and has been shown to be capable of achieving high levels of accuracy (Piryonesi and Tamer 2020).

Support vector machine
Support Vector Machine (SVM) is a supervised machine learning model with associated learning algorithms (Vapnik 2000). SVM maps multidimensional data into a higher-dimensional feature space and finds a hyperplane that linearly separates the mapped data while maximizing the margin between classes (Huang, Davis, and Townshend 2002). The SVM model was implemented using a radial basis function (RBF) as the kernel function in the current study.

Extreme gradient boosting
Extreme Gradient Boosting (XGB) is a relatively novel implementation of gradient-boosted decision trees (Chen and Guestrin 2016). In this method, a group of weak learners is combined additively to create a strong learner. A learner is initially fitted to the whole dataset, and a second learner is then added to fit the residual errors of the first learner. This procedure is repeated during the learning process (Chen and Guestrin 2016). We used the XGB algorithm through its xgbTree implementation within the caret package in R (Kuhn 2019).
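The additive, residual-fitting idea behind gradient boosting can be illustrated with a deliberately simplified sketch. The Python code below is plain gradient boosting with squared-error loss and one-split stumps on a single feature; it is not XGBoost, which adds a regularized objective and many other refinements, and it is not the xgbTree model used in this study. All names, data values, and parameter settings are hypothetical.

```python
def fit_stump(x, residuals):
    """Fit a one-split regression stump to the residuals of a single feature."""
    values = sorted(set(x))
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        if not left or not right:
            continue  # skip degenerate splits
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return None if best is None else best[1:]


def boost(x, y, rounds=10, learning_rate=0.5):
    """Add stumps one at a time, each fitted to the current residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        if stump is None:
            break  # no valid split exists (all feature values identical)
        threshold, left_mean, right_mean = stump
        stumps.append(stump)
        predictions = [
            pi + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, pi in zip(x, predictions)
        ]
    return base, stumps, learning_rate


def predict(model, xi):
    """Sum the base value and the scaled contribution of every stump."""
    base, stumps, learning_rate = model
    score = base
    for threshold, left_mean, right_mean in stumps:
        score += learning_rate * (left_mean if xi <= threshold else right_mean)
    return score


# Hypothetical single-feature samples: index value -> 1 (glacier) or 0 (other).
x = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
y = [0, 0, 0, 1, 1, 1]
model = boost(x, y)
print(predict(model, 0.15), predict(model, 0.85))
```

Scores above 0.5 can be read as the glacier class in this toy setting. The learning rate shrinks each stump's contribution so that later learners still have residual error to correct.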

Ground reference samples and performance assessment
Ground reference samples are required to evaluate the performance of the remote sensing indices and machine learning algorithms in this study. The ground reference data were collected from satellite images with minimum cloud cover for each region, acquired between August and October 2021 to better separate glacier areas from snow-covered areas. The samples were selected based on visual interpretation of the Sentinel-2 imagery and expert knowledge of the study areas. Stratified random sampling is a well-established statistical sampling technique that has been accepted in land cover classification (Abdi 2020; Costa et al. 2022) and particularly in glacier mapping (Mitkari et al. 2022; Sood et al. 2022). It is often preferred over other sampling techniques because it helps to ensure that the sample accurately represents the population being studied. Therefore, a stratified random sampling method (de Vries and Pieter 1986) was used to obtain sample points from the studied regions. For this purpose, each satellite image was classified into four categories: (i) glacier; (ii) non-glacier (neither glacier nor water body); (iii) no data (and cloud); (iv) water body. Ground reference samples were then selected randomly from each class based on the high-quality TOA images. Figure 3 shows the ground reference samples used for training and testing the machine learning classifiers. After implementing the stratified random sampling strategy and considering the complexity of each satellite image, the following numbers of reference samples were selected: 3700 (in 502.24 km²) for CAN, 5600 (in 3941.14 km²) for CHI, 7500 (in 618.31 km²) for SWE, and 2000 (in 15.81 km²) for SWIT. Of these samples, 80% were used for training the machine learning classifiers and the remaining 20% for testing model performance in glacier mapping.
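The 80/20 partition of the reference samples can be sketched as follows. This Python snippet is an illustration only: the function name, seed, and example labels are hypothetical, and performing the split separately within each class (so that class proportions are preserved) is an assumption made here rather than a documented detail of this study.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=42):
    """Split sample indices into train and test sets within each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # randomize order within the class
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Hypothetical labels for 100 reference points across the four classes.
labels = (
    ["glacier"] * 50
    + ["non-glacier"] * 30
    + ["water body"] * 10
    + ["no data"] * 10
)
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```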
Two metrics were used to evaluate the performance of the classifiers against the ground reference samples: overall accuracy (OA) and the Kappa coefficient (K) (Karl, Painter, and Dozier 2013). The metrics are calculated as Equations 4 and 5:

$$\mathrm{OA} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4}$$

$$K = \frac{P_o - P_e}{1 - P_e} \tag{5}$$

where $P_o$ represents the relative observed agreement among raters and $P_e$ represents the hypothetical probability of chance agreement. Assuming two classes (glacier and other, where other comprises non-glacier, inland water, and no data in Figure 4), TP indicates that the model correctly predicts glacier, FN is an outcome where the model predicts a glacier pixel as other, FP indicates that the model predicts an other pixel as glacier, and TN indicates that the model correctly predicts other. Figure 4 illustrates the concepts of True Positive (TP), False Negative (FN), False Positive (FP), and True Negative (TN) in this study.
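For the two-class case described above, both metrics can be computed directly from the four confusion-matrix counts. The Python sketch below is an illustration with hypothetical function names and counts; it uses the standard definition of chance agreement for Cohen's Kappa and does not reproduce any result of this study.

```python
def overall_accuracy(tp, fn, fp, tn):
    """Share of samples that were classified correctly."""
    return (tp + tn) / (tp + fn + fp + tn)


def kappa(tp, fn, fp, tn):
    """Cohen's Kappa for a two-class confusion matrix."""
    n = tp + fn + fp + tn
    p_o = (tp + tn) / n  # observed agreement
    # Chance agreement: product of reference and predicted totals per class.
    p_e = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    return (p_o - p_e) / (1 - p_e)  # undefined if p_e equals 1


# Hypothetical counts for 100 test samples.
tp, fn, fp, tn = 40, 10, 5, 45
print(overall_accuracy(tp, fn, fp, tn))
print(kappa(tp, fn, fp, tn))
```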

Holdout cross-validation strategy
The ground reference samples were carefully selected to ensure a representative distribution across the study regions. Using the stratified random sampling method, the samples were determined based on visual interpretation and specialist knowledge of the study areas. The samples were then split into training and testing sets for the machine learning classifiers, which allows an unbiased evaluation of classifier performance: 80% of the ground reference samples were used for training and the remaining 20% for testing model performance in glacier mapping. K-fold cross-validation was employed to assess the performance of the machine learning models during the training phase. This strategy separates the training data into k equal-sized subsamples and trains and evaluates the model k times, using a different subsample as the validation set in each iteration (Ringrose and Hand 1997). Here, we used 10 folds (k = 10) to train the machine learning models. In Figure 5, the blue subsamples are used to train the model and the green subsamples are used to validate model performance in each fold.

In terms of overall accuracy for glacier mapping in all studied regions, ANDSI outperformed NDSI, with an average accuracy of around 95% across all models and regions, compared to around 91% for NDSI. This underlines the superior ability of ANDSI to distinguish glacier-covered areas, resulting in enhanced mapping accuracy. In terms of Kappa coefficients, ANDSI achieved an average of 0.92 across all models and regions, surpassing the 0.85 obtained with NDSI. These results highlight the enhanced accuracy and reliability of ANDSI over NDSI, suggesting its potential to improve glacier mapping accuracy across the studied regions.

Performance of five machine learning classifiers with NDSI and ANDSI
Figure 6 shows the classification results for visual assessment in the four studied regions (CAN, CHI, SWE, and SWIT) from the five machine learning classifiers (ANN, C5.0, NBC, SVM, and XGB). The areas between inland water and glaciers are not well classified because of the complex pattern of inland water and glaciers (e.g. the intricate scattering of glaciers across the image). Nevertheless, the proposed ANDSI is able to correct these noise effects by providing different values for inland water and glaciers in all four studied regions (Figure 6). All machine learning classifiers show acceptable accuracies for classifying NDSI. However, the maps classified by C5.0 and SVM show more "no data" pixels, which indicates that these two models are more sensitive to outliers or to any kind of "no data" in satellite images.
Figure 7 shows the ANDSI calculated from Sentinel-2 data and the ANDSI maps classified using the five machine learning algorithms. The proposed ANDSI separates inland water from glacier regions better than NDSI in the four studied regions. As shown in the TOA images, there are inland water bodies in the SWIT study region that NDSI cannot recognize; NDSI treats inland water pixels the same as glacier pixels (Dixit, Goswami, and Jain 2019; Gaur et al. 2022; Kulkarni et al. 2006; Sharma, Tateishi, and Hara 2016). The ANDSI demonstrates a high ability to separate water pixels from glacier regions, and it outperforms the original NDSI in the studied regions. Regarding the performance of the machine learning classifiers, the NBC model cannot classify inland water bodies well in some cases (e.g. SWE), whereas the other classifiers classify all four classes well. In general, ANDSI assigns lower values to inland water pixels and higher values to glacier pixels, so inland water bodies and glaciers can be separated more clearly than with the original NDSI.
The ANDSI can detect inland water bodies as well as small, scattered glacier patches within a complex glacier distribution. In other words, the proposed ANDSI enables the processing of complex images in which glacier and inland water pixels are scattered near or far from each other without any specific pattern. This feature can be applied to identify glaciers that are connected to lakes or rivers. Lakes and glaciers are clearly visible in CHI, so the original NDSI and ANDSI performed similarly in this less complex case. The CAN, SWE, and SWIT images are more complex, and in these cases the ANDSI handles the complexity better, so it can be applied more efficiently than NDSI. Because of the large range of ANDSI values in CAN, CHI, and SWE, a natural logarithm (Ln) function was applied when producing these maps so that all defined classes are visible.
Given that all classifiers demonstrate comparable performance in distinguishing inland water from glaciers, we focus our discussion on comparing the NDSI and ANDSI, using the ANN model as an illustrative example. Overall, the ANDSI distinguishes between inland water pixels and glacier pixels much more reliably than the NDSI, which eliminates the need to pre-filter water pixels as proposed by Choubin et al. (2019) for NDSI applications. In Figure 8, the red circle in the SWIT map marks inland water that NDSI cannot detect, whereas ANDSI detects it clearly. In other words, ANDSI is sensitive to the difference between water pixels and glacier pixels and can separate them. With ANDSI, errors from misdetected water pixels are limited, and water bodies do not have to be removed from the study area beforehand. Other clear cases can be found in the SWE map, where several inland water bodies lie inside a large red rectangle. NDSI is unable to detect them as inland water and identifies them as glaciers, which is a source of error in remote sensing studies, whereas the proposed ANDSI detects them as inland water. The red circle in the SWE map is the clearest case demonstrating the capability of ANDSI to distinguish between water bodies and glaciers. Inside this red circle, water and glacier pixels directly adjoin each other; NDSI detects both as glaciers and thus cannot resolve the difference between pixels in such a sensitive case. ANDSI, however, distinguishes between inland water and glacier here, and its result is close to the TOA image. The other red rectangles in the SWE map also show inland water bodies that NDSI is unable to detect, while ANDSI correctly detects them as inland water bodies. This shows that ANDSI more closely reproduces the surface features visible in the TOA image.
The TOA image of SWIT from Sentinel-2 (a false color composite of the shortwave infrared, near-infrared, and red bands, generated by the "sen2r" package in R) is shown in Figure 9, along with images of the calculated indices. Figure 9 shows that NDSI cannot identify the difference between water bodies and glacier regions because it assigns the same value to pixels of inland water bodies and glaciers. The maps of SWE, comprising the indices calculated from Sentinel-2 and the maps produced by the classifiers, are shown in Figure 10. This is a sensitive case because water bodies lie between the glacier and non-glacier regions on the map. The original NDSI identifies glacier and non-glacier parts well, but it also detects water bodies as glacier pixels, which means that NDSI cannot distinguish between water body pixels and glacier pixels in this case. In contrast, the proposed ANDSI detects all water body pixels with acceptable accuracy and even detects narrow lines of land change (which is clear when comparing ANDSI with the Sentinel-2 images in Figure 10).

Comparison of the machine learning based and threshold based classifications
To improve the accuracy of glacier detection using NDSI, several previous studies have used threshold values on NDSI based on the Otsu (1979) method. For instance, Hall et al. (1998) suggested an NDSI threshold of 0.4 for optimal snow/glacier detection, which was also used in many other studies (Xiao et al. 2001; Kulkarni, Rathore, and Singh 2010). Similarly, Lu et al. (2022) detected glacier regions in the Tibetan Plateau using Sentinel-2 images and a threshold of NDSI > 0.4. Burns and Nolin (2014) used Landsat, IKONOS-2, and QuickBird images to map glaciers in Peru, with a threshold value of NDSI ≥ 0.42 for detecting glaciers. The threshold values reported in the literature, including 0.4, 0.42, 0.5, and 0.6, were evaluated and compared to determine their effectiveness in mapping snow-covered/glacier regions in different study areas (Burns and Nolin 2014).
Based on the previous studies, we selected a threshold value of 0.42 for NDSI to compare the ability of machine learning classifiers and threshold-based approaches for glacier mapping. Thus, we evaluated the ability of NDSI with thresholds and with machine learning classifiers for glacier detection in the four study regions. Figure 11 shows the comparisons of detected glaciers from the different methods. The limitations of using NDSI with a threshold are apparent from Figure 11: many glaciers remain undetected in CAN when a threshold is applied, and water bodies are misclassified as glaciers in CHI. Applying NDSI with a threshold (NDSI ≥ 0.42) could not separate water bodies from glacier regions, and it recognized water pixels as glacier pixels. In contrast, applying a threshold (−0.25 ≤ Ln (ANDSI) < 0) on ANDSI (based on the differentiation of the glacier and non-glacier boundary on the images) can detect glacier pixels well, and it yielded much better results than using NDSI with a threshold. Using a machine learning classifier (C5.0) with NDSI is more accurate than using NDSI with a threshold. Also, applying the proposed ANDSI with a threshold is more reliable than NDSI with a threshold, and using a machine learning classifier (C5.0) with ANDSI is even better than using NDSI with machine learning classifiers.
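The two threshold rules compared above can be sketched in a few lines of Python. This is only an illustrative sketch, not the processing chain used in this study: it assumes the standard NDSI definition from green and SWIR1 reflectance, it takes ANDSI values as given rather than computing them, and the function names and sample reflectance values are hypothetical.

```python
import math


def ndsi(green, swir1):
    """Standard NDSI from green and SWIR1 reflectance (None if undefined)."""
    total = green + swir1
    if total == 0:
        return None
    return (green - swir1) / total


def glacier_by_ndsi(green, swir1, threshold=0.42):
    """Threshold rule NDSI >= 0.42 (value following Burns and Nolin 2014)."""
    value = ndsi(green, swir1)
    return value is not None and value >= threshold


def glacier_by_andsi(andsi, lower=-0.25, upper=0.0):
    """Threshold rule lower <= ln(ANDSI) < upper on a precomputed ANDSI value."""
    if andsi <= 0:
        # The logarithm is undefined here, so the pixel is treated as non-glacier.
        return False
    return lower <= math.log(andsi) < upper


# Hypothetical sample pixels, for illustration only.
print(glacier_by_ndsi(0.8, 0.1))   # bright in green, dark in SWIR1
print(glacier_by_ndsi(0.3, 0.2))   # weak green/SWIR1 contrast
print(glacier_by_andsi(0.95))      # ANDSI slightly below 1
print(glacier_by_andsi(0.09))      # ANDSI far below 1
```

With these hypothetical inputs, the first and third pixels satisfy their respective rules and the other two do not. Any pixel with a high NDSI passes the NDSI rule whatever its land cover, which is why water can be labeled as glacier under that rule.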
The use of ANDSI for mapping glaciers can provide several advantages over well-established approaches, such as NDSI. Firstly, as discussed earlier, ANDSI can better discriminate between glacier and water pixels by minimizing the spectral similarity between snow and ice-free areas, resulting in a more accurate mapping of glacier extents. This has been demonstrated in the current study by comparing the accuracy of ANDSI with NDSI and several other approaches, showing that ANDSI consistently outperformed NDSI in terms of glacier mapping accuracy. Additionally, the use of Sentinel-2 imagery in combination with ANDSI allows for higher spatial resolution mapping of glaciers compared to other widely used satellite imagery. This higher resolution can provide more detailed information on glacier facies and morphology, which can be particularly useful for monitoring glacier changes over time. Furthermore, the use of machine learning algorithms in conjunction with ANDSI can provide a more automated and efficient approach to glacier mapping, which can be beneficial for large-scale studies or monitoring programs. Overall, the use of ANDSI for mapping glaciers can provide a more accurate, high-resolution, and automated approach compared to well-established approaches (e.g. NDSI), making it a useful tool for glacier monitoring and research.

Advantages and disadvantages of ANDSI and NDSI, and outlook for future studies
NDSI is one of the most widely used indices for mapping snow, glacier and ice cover using satellite imagery. NDSI has been found to be a useful tool for mapping glaciers due to the distinctive spectral signature of snow and ice cover in the green and shortwave-infrared bands. However, there are both advantages and disadvantages to using NDSI for glacier mapping, and therefore the current study suggests an alternative index (ANDSI), which may offer improved performance in certain conditions. This study sought to address some limitations of previous studies by proposing a new satellite-based index and applying different machine learning algorithms for glacier mapping. The better performance of the proposed index (ANDSI) compared with the widely used NDSI was demonstrated and discussed. ANDSI is a modification of NDSI that uses a transformation of NDSI to adjust for variations in the background reflectance, which can improve its sensitivity to thin snow cover and reduce its sensitivity to atmospheric effects. There are some advantages of ANDSI over NDSI, including: (I) Improved sensitivity to thin snow cover: ANDSI has been shown to be more sensitive to thin snow cover than NDSI, which can be particularly important in regions with variable snow cover depths. (II) Separation of water and glacier pixels: One limitation of NDSI for glacier mapping is that it cannot distinguish between water and glacier pixels, as both have a similar spectral response in the green and shortwave-infrared bands. This can be problematic in regions where glacial lakes are common, as these lakes can pose a significant hazard if the ice dam fails and causes a glacial lake outburst flood (GLOF). In contrast, ANDSI includes additional spectral bands in the red and near-infrared regions, which can help to distinguish between water and glacier pixels. Therefore, using ANDSI instead of NDSI may be more appropriate for studies in regions with a high prevalence of glacial lakes or other bodies of water. (III) Greater robustness to variations in the background reflectance: The adjustment used in ANDSI helps to account for variations in the background reflectance, which can improve its accuracy in regions with variable land cover types or topography.
Disadvantages of ANDSI compared to NDSI can be listed as follows: (I) Increased complexity: The transformation used in ANDSI adds an additional level of complexity to the index calculation, which can make it more difficult to interpret and use for some applications. (II) Limited availability: While many satellite sensors provide the necessary bands for calculating NDSI (green and SWIR1), ANDSI requires additional spectral bands in the NIR and SWIR2. This can limit its use in applications where these bands are not available.
Both NDSI and ANDSI offer advantages and disadvantages for glacier mapping using satellite data. NDSI is a simple and widely available index that can be highly accurate in certain conditions, but it is sensitive to water pixel effects and has limited sensitivity to thin snow cover. ANDSI is a modification of NDSI that offers improved sensitivity to thin snow cover and reduced sensitivity to water pixel effects, but it is more complex to calculate and has limited availability on some satellite sensors. The choice of index will depend on the specific application and the conditions of the study area. In general, NDSI may be a good choice for studies where water pixels are not a major issue (regions with fewer water bodies) and the snow cover is deep and extensive, while ANDSI may be a better choice for studies where thin snow cover is common or water pixel effects are a concern.
In addition, one limitation of this study is that the quality of satellite images, being the main data source, can play an important role. Low temporal or spatial resolution and gaps in the images can complicate glacier mapping. The interpretation of snow/glaciers from satellite imagery requires certain skills in remote sensing, Geographic Information System (GIS), and glacier-hydrological knowledge. Due to the effect of local cloudiness in satellite images (Lee and Tariq Mahmood 2015), using radar remote sensing products can be recommended for future studies. In addition, this study used a single satellite image for glacier mapping. However, a better separation of snow and glaciers using multi-temporal satellite images is recommended as a possible solution for increasing the accuracy of glacier mapping. Furthermore, the current study used the top-of-atmosphere (TOA) reflectance data from Sentinel-2 (radiometrically and geometrically corrected Sentinel MSI Level-1C data) following previous studies (Alifu et al. 2020; Veettil 2018; Yan et al. 2021; Zhang et al. 2019). It would be interesting to evaluate in future studies whether using atmospherically corrected (surface) reflectance data could further improve the accuracy of mapping glaciers (Rumora, Miler, and Medak 2020).
The types of classifiers and algorithms tested in this study are limited. However, to test and ensure the quality of the classification, this study employed five different kinds of classifiers: ANN, C5.0, NBC, SVM, and XGB. It is recommended to test machine learning models coupled with optimization algorithms to increase the ability of glacier mapping.

Conclusions
Accurate monitoring of glaciers and their dynamics is essential for many applications in cold regions. Previous studies reported several limitations of NDSI, such as its inability to differentiate glacier pixels from water pixels. This study proposed a new satellite-based index, namely the ANDSI, to compare with NDSI and further improve the accuracy of mapping glaciers using Sentinel-2 satellite imagery data. ANDSI improves NDSI (which uses green and SWIR1 bands) by adding the NIR and SWIR2 bands. Furthermore, this study evaluated the performance of five machine learning algorithms (ANN, C5.0, NBC, SVM, and XGB) in glacier mapping with the satellite-based indices NDSI and ANDSI as individual input features. This study evaluated the proposed index in four different glacier areas in Canada, China, Sweden, and Switzerland-Italy. The results showed that the proposed ANDSI outperformed the widely used NDSI overall in accurately delineating glacier extents and overcoming limitations associated with differentiating glaciers from water bodies. Among the five selected machine learning classifiers, the C5.0 Decision Tree Algorithm showed the best overall accuracy and Kappa in the majority of cases, but the differences in performance among these classifiers were not significant. This study demonstrated that the proposed ANDSI can serve as a superior and improved method for accurately mapping glaciers in cold regions. The use of ANDSI will thus benefit many relevant studies in cold regions, such as those on hydrology, glaciology and landscape changes. It would be interesting to further test ANDSI with other satellite imagery data (e.g. Landsat and MODIS) and in other glacier regions (e.g. in the Southern Hemisphere). Further studies could investigate the use of deep learning techniques in combination with the proposed ANDSI and a wider range of satellite imagery sources.

Figure 1. The top of atmosphere (TOA) reflectance images from Sentinel-2 for each study region. They are generated by the "sen2r" package in R (Ranghetti et al. 2020) based on false color composite (short-wave infrared, near-infrared, and red bands).

Figure 2 shows the flowchart of procedures and methods used in this study.The procedures and methods can be divided into four main steps: (1) preprocessing of Sentinel-2 images and development of sample sets, (2) proposing a new remote sensing index for glacier detection in different regions, (3) training and selection of classifiers, and (4) glacier mapping, evaluation of the performance of our proposed remote sensing indices, and comparison with NDSI.

Figure 2. The flowchart of the current study.

Figure 4. Concept of true positive (TP), false negative (FN), false positive (FP), and true negative (TN) in calculating the metrics in this study.
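
As a hedged illustration of how the four counts feed the reported metrics, the sketch below computes the overall accuracy and Cohen's Kappa for a binary glacier/non-glacier confusion matrix. It assumes the standard textbook definitions of both metrics; the function name and the example counts are hypothetical and are not results from this study.

```python
def accuracy_and_kappa(tp, fn, fp, tn):
    """Return (overall accuracy, Cohen's Kappa) from binary confusion counts."""
    n = tp + fn + fp + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: fraction of correctly classified samples.
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1:
        # Kappa is undefined when chance agreement is already perfect.
        return observed, float("nan")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts, for illustration only.
oa, kappa = accuracy_and_kappa(tp=45, fn=5, fp=5, tn=45)
print(round(oa, 3), round(kappa, 3))
```

Kappa discounts the agreement expected by chance, so it is never higher than the overall accuracy. This is one reason to report the two metrics together.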

Figure 3. The ground reference samples for training and testing of machine learning classifiers.

Figure 5. Implementing five machine learning classifiers by k-fold cross-validation for the current study.
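
The splitting step behind k-fold cross-validation can be sketched as follows. This is a minimal illustration only: the number of folds (k = 5), the random seed, and the function name are assumptions made for the example and are not taken from this study, which trained its classifiers with its own tooling.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # Shuffle once so that each fold is a random subset of the samples.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Each sample is used for testing exactly once across the k rounds.
for train_idx, test_idx in k_fold_indices(10, k=5):
    print(len(train_idx), len(test_idx))
```

In each round a classifier would be trained on the training indices and evaluated on the held-out fold, and the per-fold scores would then be averaged.
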
Figure 9 also focuses on a sensitive area between inland water and glaciers (where water bodies and glacier areas are almost spatially connected), and the details of each index are visible in these areas. Figure 9 also shows the classified maps of NDSI and ANDSI by ANN based on the zoomed map. The original NDSI has the same value for both inland water and glaciers, which is marked in blue on the NDSI map. NDSI assigns a value of approximately 0.97 to both inland water and glaciers, while ANDSI assigns approximately 0.09 to inland water (orange color) and approximately 1 to glaciers (yellow color).

Figure 6. Maps of calculated NDSI from Sentinel-2 versus the classified maps using NDSI and five machine learning classifiers in four study regions.

Figure 7. Maps of calculated ANDSI from Sentinel-2 versus the classified maps using ANDSI and five machine learning classifiers in four study regions.

Figure 9. (Left panel) NDSI and ANDSI values and (right panel) classified images resulting from each index for the Sentinel-2 TOA image (the false color composite: shortwave infrared, near-infrared, and red bands, generated by the "sen2r" package in R) captured on 23rd September 2021 in SWIT.

Figure 8. Glacial pixel identification via the original NDSI and proposed ANDSI using the ANN classifier. The red shapes are the areas where the ANDSI distinguishes glacier pixels from water and the NDSI does not.

Figure 10. (Left panel) NDSI and ANDSI values and (right panel) classified images resulting from each index for the Sentinel-2 TOA image (the false color composite: shortwave infrared, near-infrared, and red bands, generated by the "sen2r" package in R) captured on 23rd September 2021 in SWE.

Table 1. Location of the four study areas and the dates of the Sentinel-2 satellite imagery data used.
Table 2 presents the overall accuracy and the Kappa coefficient of NDSI and the proposed ANDSI for glacier