Mapping of lakes in the Qinghai-Tibet Plateau from 2016 to 2021: trend and potential regularity

ABSTRACT The Qinghai-Tibet Plateau (TP) hosts a large number of lakes covering a large total area. As an important component of fragile plateau ecosystems, these lakes have attracted increasing attention. However, owing to the limitations of technology and methods, changes in smaller lakes on the TP have received less attention. In this study, we used Google Earth Engine (GEE) with the Analysis Ready Data (ARD) preparation framework to obtain preprocessed Sentinel-1 data covering the plateau. The D-LinkNet framework was introduced to extract lakes, and an annual lake dataset was produced for 2016 to 2021. The lake dataset showed an area accuracy of 86.49% and an Intersection over Union (IoU) of 0.72-0.99 in different regions. The findings were as follows: during the study period, the TP lakes first increased and then tended to stabilize, with overall increases in area and number of +7.6% and +14.8%, respectively. Except for the northwest TP, the other regions showed the same general trend. In particular, the Tarim Basin exhibited a lake variation pattern independent of the TP. Notably, we found frequent lake activity in the Kunlun Mountains, Qaidam Basin, Mount Qogir, and other areas. Effects of the ongoing La Niña event on the TP lakes may emerge in the next few years.


Introduction
The Qinghai-Tibet Plateau (TP) is the highest and youngest plateau in the world and one of the areas most sensitive to global climate change (Liu and Chen 2000; Kuang and Jiao 2016). The lakes in the TP account for more than half of China's total lake area (Zhang et al. 2019; Wan et al. 2019). Previous studies have indicated that most lakes in the TP have expanded over the past few decades, although lakes in the southern TP have been shrinking (Qiao, Zhu, and Yang 2019). Precipitation is the dominant factor driving lake changes in the TP (approximately 70%) (Song et al. 2014), and the trend of lake expansion may be driven by the wetting of the TP (Zhang et al. 2020). Moreover, variations in lakes on the TP may have been driven by changes in large-scale atmospheric circulation, and lakes may respond to strong climate change. For example, there were inflection points in 1997 and 2015, when lake changes were affected by the El Niño phenomenon, after which the lakes continued to expand (Zhang et al. 2019). The numerous lakes distributed on the TP provide feedback on global climate change, which may further influence the regional environment and climate.
Small lakes play an important role in the global ecosystem. Emerging research has shown that small lakes are more active than large lakes and terrestrial and marine ecosystems in almost every process (Downing 2010). The number of small lakes distributed on the TP is much larger than the number of larger lakes. In particular, glacial lakes developed in the Himalayas may have higher climate sensitivity and risk levels (Veh et al. 2019; Zheng et al. 2021; Zhang et al. 2015). Existing TP lake research generally pays little attention to small lakes, instead focusing on the change and climate response of lakes > 1 km² (Zhang et al. 2020; Zhang et al. 2018). Furthermore, new lake extraction methods are often applied to limited areas (Liu et al. 2021; Wang et al. 2020).
Because of the extreme geographical and environmental conditions of the TP, satellite remote sensing is considered one of the few practical methods for monitoring its lakes. Monitoring lake changes using multispectral remote sensing (Landsat, MODIS, etc.) is currently a common method (Nie et al. 2017; Veh et al. 2019), but multispectral data are restricted by their temporal and spatial resolutions and especially by clouds (approximately 45% of the TP has high daily cloud cover (Yu et al. 2016)). Lake mapping on the TP is therefore difficult. In contrast, synthetic aperture radar (SAR) sensors can capture images under all weather conditions. Many studies have demonstrated the potential of satellite SAR data (Sentinel-1) for mapping water bodies and lakes (Malenovsky et al. 2012; Twele et al. 2016; Zeng et al. 2017; Zhang, Zhang, and Zhu 2020). The GEE platform has enabled the execution of large-scale and long-term geospatial analyses (Gorelick et al. 2017). Among its large store of geospatial datasets, GEE houses a complete and continually updated archive of Sentinel-1 Ground Range Detected (GRD) data. The provision of analysis-ready Sentinel-1 data on GEE removes the barrier posed by the complexity of SAR preprocessing and allows easy acquisition of available Sentinel-1 images. As GEE has received increasing attention, research using the Sentinel-1 dataset through the GEE platform has appeared frequently (DeVries et al. 2020; Mahdianpari et al. 2019; Zhou et al. 2019).
Highly automated water classification is essential for large-scale lake mapping. Many studies have reported various methods based on Sentinel-1, such as image segmentation (Zhang et al. 2020; Huth et al. 2020) and machine learning (Wangchuk and Bolch 2020; Chatziantoniou, Petropoulos, and Psomiadis 2017). In recent years, there have been remarkable achievements in deep learning technology for target recognition (He et al. 2017; Ronneberger, Fischer, and Brox 2015). Some scholars have attempted to apply deep learning technology to automatic water extraction from SAR data, for example PA-UNet, which introduces an attention block and a pyramid module to UNet, and a compact convolutional neural network trained on multi-source data (Scarpa et al. 2018).
This study aims to map lakes across the entire TP with high accuracy and to analyze variations in lake area based on the Sentinel-1 dataset. The work comprises three parts: (1) we mapped all TP lakes from 2016 to 2021 from Sentinel-1 data using D-LinkNet; (2) we analyzed the variation in TP lakes to discuss their characteristics and regularity in combination with the local environment; and (3) we offer observations on frequent lake changes for reference.

Study area
The TP, located between 26°00′-39°47′N and 73°19′-104°47′E, is approximately 2,800 km long from east to west and 300-1,500 km wide from north to south. It has a total area of approximately 2.5 million km² and an average altitude of approximately 4,500 m (Baumann et al. 2009). The geographic location and altitude of the TP are shown in Figure 1. The climate of the TP results from the combined effect of the East Asian and South Asian monsoons and the westerlies (Schiemann, Luethi, and Schaer 2009). Its unique environmental conditions produce a special plateau climate, with strong solar radiation, low air temperatures, large daily temperature variations, and small differences between annual mean temperatures (Yao et al. 2012). The annual average temperature is 1.6°C, with a minimum temperature of −1 to −7°C in January and a maximum temperature of 7-15°C in July. The annual cumulative precipitation is approximately 413.6 mm.

Satellite data
Ideally, satellite images from the same month (e.g. October for the TP) should be used for lake extraction (Zhang et al. 2019). Nevertheless, the lake area is usually stable in September, October, and November, and images from these months can be used as a supplement (Zhang, Li, and Zheng 2017). Sentinel-1 images contain random inherent noise, which affects lake extraction. Therefore, the image selection period should be chosen to ensure the stability of the images obtained through GEE processing. Based on practical operations and previous studies, we restricted the acquisition period of the data. The freezing of TP lakes starts between mid-November and early January, whereas the end of freezing varies from early December to early February. The thawing of lakes continues until late June (Guo et al. 2018; Qiu et al. 2019). In addition, we found that many lakes retained ice cover until July; we therefore set the data acquisition period to September and October.
Sentinel-1 images were obtained through GEE and stored on the cloud platform in the form of an ImageCollection. The GEE ImageCollection includes S1 GRD scenes, which are processed using the Sentinel-1 Toolbox to generate a calibrated and ortho-corrected product. VV polarization provides the best separation between water and the background (Horitt et al. 2003). Adding other bands may reduce the accuracy of lake extraction; therefore, we used only a single-band image with VV polarization (Possa and Maillard 2018). Both the ascending and descending orbit modes of Sentinel-1 were used in this study. Sentinel-1 images from September and October of 2016-2021 were used as the basis for lake extraction. The images from 2014 and 2015 were excluded because of their insufficient quantity and unstable quality. The resulting ImageCollection used in the GEE calculation contained 778, 1,180, 1,288, 1,249, 1,343, and 1,359 Sentinel-1 images for 2016-2021, respectively.
Landsat-8 images on GEE are atmospherically corrected surface reflectances from the Landsat-8 OLI sensor. These data were atmospherically corrected using LaSRC and include a cloud, shadow, water, and snow mask produced using CFMASK, as well as a per-pixel saturation mask. Landsat-8 images have outstanding spatial and temporal resolutions. These images were used to verify the extraction accuracy for a great lake (Selin Co) by manual image inspection. We obtained an ImageCollection on GEE by keeping scenes with cloud cover below 5%, and a clear image was derived from the ImageCollection through a median operation on GEE (ee.ImageCollection.median). We chose a 2017 Landsat-8 image for the accuracy assessment to better test the performance of the lake extraction results in different years.
Gaofen (GF) images with a spatial resolution of 8 m were used to evaluate the accuracy of lake extraction from Sentinel-1 images. The specific satellite data used for lake extraction and accuracy assessment are presented in Table 1. Owing to the scarcity of GF images, images from November and December were also used.

Auxiliary data
The dataset 'The lakes larger than 1km2 in Tibetan Plateau (V3.0) (1970-2021s)' was downloaded from the National Tibetan Plateau Data Center (TPDC) website. The TPDC lake dataset was generated using Landsat series data. It is available annually after 2010. In this study, the TPDC lake dataset was used as a reference dataset to compare with our lake extraction results (Zhang et al. 2019; Guoqing 2019b).
A vector boundary of the TP with a total area of 3 × 10⁶ km² (above the 2,500 m contour) and a mean elevation of approximately 4,000 m a.s.l. was downloaded from the TPDC website (Guoqing 2019a; Zhang et al. 2013). Although the specific scope of the TP has not yet been clearly delineated, this boundary is likely the most extensively used. Most of the area within the boundary lies in China, but it also includes parts of countries such as Nepal, India, and Pakistan. This boundary contains 12 watershed reference ranges, including the Amu Darya, Brahmaputra, and Ganges.
The NASA SRTM Digital Elevation 30 m data (Farr et al. 2007) were used to calculate the slope for the subsequent terrain filter on GEE. Sentinel-1 has a high spatial resolution and requires a digital elevation model (DEM) of matching accuracy. SRTM provides 30 m DEM data, which may be among the most suitable data for this purpose.

Data preprocessing
Figure 2 illustrates the lake extraction process. Generally, the elevation at a lake's location varies little, and areas with large slopes can be treated as non-lakes by default. Thus, the Sentinel-1 ImageCollection was screened by filtering out slopes ≥ 15° using SRTM data. This step can significantly reduce data redundancy and partly remove the impact of mountain shadows. After additional preprocessing by Analysis Ready Data (ARD), an ImageCollection suitable for TP lake extraction is obtained. We then performed a median operation on the ImageCollection to obtain a sufficiently preprocessed Sentinel-1 image. To simplify local processing when Sentinel-1 images are exported from GEE, we designed a data format conversion equation. The backscattering coefficient of the preprocessed image lies between −47 and 18 dB; this step therefore converts pixel values to the unsigned 8-bit (Uint8) range used in computer vision (CV).
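The slope screening, median compositing, and dB-to-Uint8 conversion described for the GEE preprocessing can be illustrated on small in-memory grids. The following is a minimal sketch under stated assumptions, not the processing chain used in this study: it assumes that a slope raster in degrees is already available, it applies the static slope mask after the median (the order does not matter for a fixed mask), and it assumes a linear stretch of the stated −47 to 18 dB range, because the conversion equation itself is not reproduced in this text. All function names and the fill value `NODATA_UINT8` are hypothetical.

```python
from statistics import median

DB_MIN, DB_MAX = -47.0, 18.0   # backscatter range stated in the text (dB)
SLOPE_LIMIT_DEG = 15.0         # slopes >= 15 degrees are screened out
NODATA_UINT8 = 255             # hypothetical fill value for screened pixels


def median_composite(stack):
    """Per-pixel median over equally sized 2-D grids; None marks missing data."""
    rows, cols = len(stack[0]), len(stack[0][0])
    composite = []
    for r in range(rows):
        row = []
        for c in range(cols):
            values = [img[r][c] for img in stack if img[r][c] is not None]
            row.append(median(values) if values else None)
        composite.append(row)
    return composite


def mask_steep(image, slope_deg):
    """Replace pixels whose slope is >= SLOPE_LIMIT_DEG with None."""
    return [
        [v if s < SLOPE_LIMIT_DEG else None for v, s in zip(img_row, slope_row)]
        for img_row, slope_row in zip(image, slope_deg)
    ]


def db_to_uint8(value_db):
    """Assumed linear stretch of [DB_MIN, DB_MAX] dB onto 0..255."""
    if value_db is None:
        return NODATA_UINT8
    clipped = min(max(value_db, DB_MIN), DB_MAX)
    return round((clipped - DB_MIN) / (DB_MAX - DB_MIN) * 255)


# Hypothetical 2 x 2 scene observed three times (values in dB).
stack = [
    [[-20.0, -8.0], [-22.0, -10.0]],
    [[-24.0, -9.0], [-21.0, None]],
    [[-22.0, -7.0], [-23.0, -12.0]],
]
slope = [[2.0, 30.0], [1.0, 5.0]]  # degrees; the upper-right pixel is steep

composite = mask_steep(median_composite(stack), slope)
scaled = [[db_to_uint8(v) for v in row] for row in composite]
```

Clipping before scaling keeps occasional out-of-range values from wrapping around in the 8-bit output; whether the original equation clips in the same way is not known.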
Image clipping adapts the images for deep learning. We cropped the TIF image of the whole TP into many PNG images of 1024 × 1024 pixels so that they can be accepted by the D-LinkNet model. The output results of D-LinkNet are binary images, which are divided into true values (lakes) and false values (background). Geographic coordinates were assigned to the output results according to the correspondence between their row and column numbers and the original image. In addition, converting the resulting georeferenced raster image into a vector layer yields consecutive annual lake datasets, which makes follow-up large-scale lake change research and accuracy verification possible.
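The clipping and re-georeferencing step can be sketched as follows. This is an illustrative sketch, not the code used in this study: it assumes a north-up raster with square pixels whose upper-left corner is at (`origin_x`, `origin_y`), and the function names are hypothetical. Edge tiles come out smaller than 1024 pixels here; how the study handled them (for example by padding) is not stated.

```python
TILE = 1024  # tile edge in pixels, as stated in the text


def tile_windows(height, width, tile=TILE):
    """Yield (row_off, col_off, n_rows, n_cols) windows covering the raster."""
    for row_off in range(0, height, tile):
        for col_off in range(0, width, tile):
            yield (row_off, col_off,
                   min(tile, height - row_off), min(tile, width - col_off))


def pixel_to_geo(row_off, col_off, row, col, origin_x, origin_y, pixel_size):
    """Map a pixel inside a tile to the coordinates of its upper-left corner.

    The tile offsets restore the position in the full raster; x grows with
    the column index and y decreases with the row index (north-up raster).
    """
    x = origin_x + (col_off + col) * pixel_size
    y = origin_y - (row_off + row) * pixel_size
    return x, y


# Hypothetical raster of 2500 x 3000 pixels with 10 m pixels.
windows = list(tile_windows(2500, 3000))
corner = pixel_to_geo(1024, 2048, 10, 20,
                      origin_x=500000.0, origin_y=4000000.0, pixel_size=10.0)
```

Keeping the row and column offsets of every tile is what allows the binary model output to be written back with geographic coordinates before vectorization.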

ARD
In this study, ARD refers to an API on GEE (Mullissa et al. 2021). The processing of the Sentinel-1 SAR backscatter data ingested in GEE is limited to thermal noise removal, data calibration, multi-looking, and range-Doppler terrain correction (Lewis et al. 2018; Siqueira et al. 2019). To support additional Sentinel-1 SAR data preprocessing in GEE, Mullissa et al. (2021) presented a framework for preparing Sentinel-1 SAR backscatter ARD, which includes additional border noise correction, speckle filtering, and radiometric terrain normalization.
The parameters shown in Table 2 were determined by testing in some areas of the TP. In general, a larger kernel size gives a stronger denoising effect but blurs the edges of lakes. In contrast, a smaller kernel size retains more image detail but removes less noise (Mullissa et al. 2021). In our tests, the MAP GAMMA filter (Lopes et al. 1990) and the sigma LEE filter (Lee et al. 2009) performed best. We selected the MAP GAMMA filter because of its faster processing speed (Mullissa et al. 2021). It should be noted that radiometric terrain normalization (parameter: Terrain flattening) is important for mitigating the effect of topography on SAR backscatter. However, because our workflow screens out complex terrain by slope and focuses on lakes, this module was not applied.
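The trade-off between kernel size, noise removal, and edge blurring can be illustrated with a much simpler filter than those tested here. The sketch below uses a plain one-dimensional boxcar (moving average) filter as a stand-in; it is not the MAP GAMMA or sigma LEE filter, and the profiles are hypothetical backscatter values chosen only to make the effect visible.

```python
def boxcar_1d(signal, kernel):
    """Moving average with an odd window; edges use the part that fits."""
    half = kernel // 2
    smoothed = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def edge_width(profile, low, high):
    """Number of samples strictly between the two plateau levels."""
    return sum(1 for v in profile if low < v < high)


# Hypothetical profiles in dB: water near -22, land near -8.
flat_noisy = [-22.0 + (2.0 if i % 2 else -2.0) for i in range(21)]
step = [-22.0] * 10 + [-8.0] * 10

residual, blur = {}, {}
for kernel in (3, 7):
    interior = boxcar_1d(flat_noisy, kernel)[kernel:-kernel]
    residual[kernel] = max(interior) - min(interior)   # remaining fluctuation
    blur[kernel] = edge_width(boxcar_1d(step, kernel), -22.0, -8.0)
```

In this toy case the larger window leaves a smaller residual fluctuation over the water surface but spreads the shoreline over more samples, which is the behaviour that motivates choosing the kernel size by testing.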

Lake extraction model training
D-LinkNet, the semantic segmentation neural network used in this study, achieved the best performance in the CVPR DeepGlobe 2018 Road Extraction Challenge. D-LinkNet adopts an encoder-decoder structure, dilated convolution, and a pretrained encoder. The network is built on the LinkNet architecture (Chaurasia and Culurciello 2017) and has dilated convolution layers in its center part; for more details, refer to the original D-LinkNet article.
In the training phase, we used cropped TP Sentinel-1 images and their corresponding label images obtained by visual interpretation as our training sets. These images, including the subsequent input and output images, were uniformly stored at a size of 1024 × 1024 with a single band. To avoid overfitting on the training set, we performed data augmentation in the same manner as in the original D-LinkNet article, including horizontal flip, vertical flip, diagonal flip, color jittering, image shifting, and scaling. For this binary classification problem, we chose binary cross-entropy plus Dice loss, which may be the best combination of loss functions, and Adam as the optimizer. The batch size was set to 4, and the initial learning rate was set to 2 × 10⁻⁴. When the loss no longer decreased for more than 19 epochs during training, the learning rate was reduced by a factor of 5. By iteratively adjusting the number and features of the images in the training set, a stable network was obtained after 171 epochs, with a loss of 0.06134.
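The loss and the learning-rate rule described above can be written out in plain Python. This is a minimal sketch for illustration, not the training code used in this study: the soft Dice formulation, the smoothing constant `smooth`, and the clipping constant `eps` are assumptions, and the function names are hypothetical.

```python
import math


def bce_dice_loss(pred, target, eps=1e-7, smooth=1.0):
    """Binary cross-entropy plus Dice loss for flat lists.

    pred holds probabilities in [0, 1]; target holds 0/1 labels.
    """
    clipped = [min(max(p, eps), 1.0 - eps) for p in pred]
    bce = -sum(
        t * math.log(p) + (1 - t) * math.log(1.0 - p)
        for p, t in zip(clipped, target)
    ) / len(pred)
    intersection = sum(p * t for p, t in zip(pred, target))
    dice = (2.0 * intersection + smooth) / (sum(pred) + sum(target) + smooth)
    return bce + (1.0 - dice)


def reduce_on_plateau(lr, stalled_epochs, patience=19, factor=5.0):
    """Divide the learning rate by `factor` once the loss has stalled
    for more than `patience` epochs; otherwise leave it unchanged."""
    return lr / factor if stalled_epochs > patience else lr


target = [1, 0, 1, 0]
good = bce_dice_loss([1.0, 0.0, 1.0, 0.0], target)   # perfect prediction
bad = bce_dice_loss([0.0, 1.0, 0.0, 1.0], target)    # inverted prediction
unsure = bce_dice_loss([0.5, 0.5, 0.5, 0.5], target)
next_lr = reduce_on_plateau(2e-4, stalled_epochs=20)
```

The cross-entropy term penalizes each pixel independently, whereas the Dice term scores the overlap of the whole predicted lake mask, which helps when lake pixels are a small minority of each tile.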

Accuracy assessment
To evaluate the accuracy of the lake extraction, we obtained the accurate area of the lake (ground truth) by visually interpreting the validation satellite dataset (Landsat-8, GF-1, and GF-6) to reflect the most realistic accuracy.
IoU is a common indicator in target detection, where it is often used to measure the accuracy of predicted results. The results of lake extraction can be regarded as a binary image with location information; therefore, IoU is suitable for evaluating them. The formula used is as follows:

$$\mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|}$$

where A is the area of the lake extraction, and B is the area of the ground truth. We used the area ratio as the area accuracy of lake extraction, which is obtained by dividing the area of the lake extraction by the area of the ground truth.

Spatial distribution of lake extraction results
Figure 3 shows the spatial distribution of the TP lakes and land cover in 2021. At present, the TP has 42,721 lakes with a total area of 54,007 km² (Figure 8). Although hard-to-observe small lakes (less than 1 km²) account for only 5.93% of the total area of TP lakes, they account for 96.16% of the total number. The spatial distribution of lakes in the TP shows clear regional differences. The central TP has the largest number and area of lakes, and almost all the large lakes occur in this region. The outlines of lakes are clear in the northeastern TP, which includes Qinghai Lake, the largest inland lake in China. In contrast, the outlines of lakes in the northwest, south, and southeast TP are seldom observed. This does not imply that these regions have no lakes; rather, factors such as topography may lead to smaller and more dispersed lakes, especially in the southeastern and southern TP (e.g. the Hengduan Mountains and Himalayas).

Accuracy assessment of lake extraction results
Six sub-regions, R1-R6, with different locations and sizes were selected to evaluate the accuracy of the lake extraction results. We ensured that the validation regions were spatially dispersed, and their environments included permafrost (R1), swamps and wetlands (R2), grasslands (R3), glaciers (R4, R5), and forests (R6). The lake types included glacial lakes, dammed lakes, and inland lakes. Moreover, we used Landsat-8 images to verify the accuracy for Selin Co, which is a typical great lake on the TP (Figure 3, R3).
Figure 4 shows a comparison of the lake extraction results in 2016-2021 with the TPDC lake dataset in sub-regions R1-R6. Compared with the TPDC lake dataset, our lake extraction results included lakes with areas smaller than 1 km² in addition to lakes with areas larger than 1 km². The comparison of the areas of the extracted lakes and the TPDC lake dataset with the ground truth is shown in Table 3. Since there are few lakes smaller than 0.5 km² in the TPDC lake dataset, only lakes larger than 0.5 km² were compared. The extracted lakes are more accurate than the TPDC lake dataset, with an absolute deviation of 2.54% and an overall deviation of −2.17%, whereas the TPDC lakes have an absolute deviation of 4.08% and an overall deviation of 3.84%.
The IoUs of the inland lakes in the three sub-regions (R1-R3; Figure 5) were 0.9851, 0.9842, and 0.9986, respectively. In areas with flat terrain, the lake extraction can detect extremely small lakes, which are often difficult to distinguish through visual interpretation of multispectral images, as shown in Figure 5 (b). The extraction responds to wetlands in nearly the same way as to water, which may cause misclassification. In Figure 5 (c), owing to the discontinuity of the contours of the lake and the wetlands around it, parts of Selin Co are omitted. A river with a large volume that is connected to a lake may be classified as part of the lake. Furthermore, there may be areas with a slope greater than 15° in the interior of a large lake, which form hollows in the extraction results.
Figure 6 shows the comparison of false color composite images, visual interpretation results, and automatic extraction results of glacial lakes in sub-regions R4-R6. The IoUs of glacial lakes in the three sub-regions were 0.82, 0.72, and 0.86, respectively. Compared with the inland lakes in Figure 5, the glacial lakes over mountainous and hilly areas were more dispersed. Screening out steep terrain has the side effect of leaving the extracted lake shapes incomplete. Lakes attached to the end of a glacier may have different characteristics from lakes far from the glacier, resulting in inaccurate extraction results (Figure 6 (a)). Our lake dataset performed poorly in extracting glacial lakes, as shown in Figure 6 (b) and (c), and the extracted lake shapes were generally incomplete. Owing to the combined influence of snow cover, terrain, and lake size, there are many false-positive and false-negative errors for tiny lakes and glaciers.
Figure caption: The background images in the second and third columns are the corresponding Sentinel-1 SAR images. The red circle marks a discrepancy between the lake extraction results and the real lake shape. The green circle marks where the lake extraction performs better than visual interpretation.
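The IoU and area-accuracy measures used in this assessment can be sketched for small binary masks as follows. This is an illustrative sketch with hypothetical masks and function names; it only shows how the two measures are computed and how they differ.

```python
def iou(extracted, truth):
    """Intersection over Union of two binary masks (2-D lists of 0/1)."""
    intersection = union = 0
    for row_e, row_t in zip(extracted, truth):
        for e, t in zip(row_e, row_t):
            intersection += 1 if (e and t) else 0
            union += 1 if (e or t) else 0
    return intersection / union if union else 1.0


def area_accuracy(extracted_area, truth_area):
    """Area ratio: extracted lake area divided by ground-truth area."""
    return extracted_area / truth_area


# Hypothetical masks: the extraction misses one lake pixel and adds one
# background pixel, so the two areas are equal but the shapes differ.
truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
extracted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]

score = iou(extracted, truth)
ratio = area_accuracy(sum(map(sum, extracted)), sum(map(sum, truth)))
```

In this example the area ratio is 1 although the IoU is below 1, because an omission and a commission error cancel in the area total. This is why the two measures are complementary: the area ratio describes totals, and the IoU describes positional agreement.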
As shown in Figure 7, the overall average area accuracy of the extracted lake dataset was 86.49%. To better analyze the extraction accuracy for lakes of different sizes, we divided lakes into three categories according to their area. Lakes smaller than 1 km² account for the majority of the total number, and these lakes are the most important part of our extraction process. Underestimation of lake area occurs across lakes of all sizes. Because smaller lakes are represented by few pixels in the image, this underestimation is most obvious for lakes of approximately 0.01-0.1 km², for which the area accuracy is 81.35%. The accuracy increases with lake area: 92.98% for lakes of approximately 0.1-1 km², and 98.54% for lakes > 1 km².
Figure 8 shows the changing trend of the lakes in the TP. Although smaller water bodies were extracted, only lakes with an area larger than 0.01 km² were counted in this study. The area and number of lakes were counted by size. The area and number of TP lakes have similar trends, both increasing first and then decreasing, but they peaked in 2018 and 2019, respectively, at an area of 54,154 km² and a number of 45,653. The maximum change occurred during 2016-2017, with +4.5% in area and +16.1% in number. After 2019, the changes in TP lakes tended to stabilize. In addition, lakes of different sizes changed in different ways. Large lakes occupy a larger proportion of the area and thus dominate the variation in total area, whereas small lakes drive the change in total number. Lakes larger than 500 km² contributed 35.3% of the total area change, and lakes smaller than 0.1 km² contributed 85% of the total number change. From 2016 to 2021, the area and number of lakes increased by +7.6% and +14.8%, respectively. In line with the overall trend, the area and number of lakes of different sizes increased. However, this consistency between parts and the whole may not hold for interannual variation. For example, from 2019 to 2020, lakes larger than 100 km² showed an expansion inconsistent with the total trend.
To better analyze the regional differences in TP lakes, we counted the area and number of lakes in different regions according to the watershed. Figure 9 shows the area ratios and changes in lakes in each basin, which provide a quantitative reference for lake characteristics. The Inner Plateau Basin has the largest lake area and number and is also the area where lakes are most densely distributed. In 2021, it had approximately 40,000 lakes with a total area of 35,000 km². It is worth noting that the number of lakes there decreased considerably while the total area remained stable. In the northeast TP, the Hexi Corridor and Yellow Basin have the second largest number and area of large lakes, after the Inner Plateau Basin. Nevertheless, the lake features in the Qaidam Basin seem out of place in the northeastern TP, which may be due to the combined effect of topographic factors and climate in this region. In contrast to the northeastern TP, the lakes in the southeastern TP, including the Yangtze, Mekong, and Salween Basins, appear very fragmented. Possibly limited by steep terrain, barely any large lakes have developed there. The average lake area in this region was the smallest, as low as 0.025 km². A similar phenomenon continues in the Ganges and Brahmaputra basins in the southern TP. Unexpectedly, the Tarim Basin, which is situated on the far northern TP, has the same lake characteristics as the southern and southeastern TP. The interannual variation of lakes in the Tarim Basin is large and irregular, and is independent of the TP.
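The size-class statistics and relative changes reported in this section can be derived from a list of per-lake areas along the following lines. This is a minimal sketch: the handling of lakes that fall exactly on a class boundary is an assumption, the input areas are hypothetical, and the function names are not taken from this study.

```python
def size_class(area_km2):
    """Assign a lake to a size class; lakes of 0.01 km² or less are not counted."""
    if area_km2 <= 0.01:
        return None
    if area_km2 < 0.1:
        return "0.01-0.1"
    if area_km2 <= 1.0:
        return "0.1-1"
    return ">1"


def summarize(areas_km2):
    """Return {size class: (number of lakes, total area in km²)}."""
    summary = {}
    for area in areas_km2:
        label = size_class(area)
        if label is None:
            continue
        count, total = summary.get(label, (0, 0.0))
        summary[label] = (count + 1, total + area)
    return summary


def percent_change(start, end):
    """Relative change from start to end, in percent."""
    return (end - start) / start * 100.0


# Hypothetical lake areas in km² for one year.
areas = [0.005, 0.02, 0.05, 0.3, 2.5, 120.0]
summary = summarize(areas)
change = percent_change(50.0, 53.8)   # hypothetical totals for two years
```

Applying `summarize` to each annual layer and `percent_change` to the resulting totals gives both the contribution of each size class and the interannual trend.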

Violent variations of the TP lakes
Our extracted lake dataset records all small lakes (0.01-1 km²) in the TP. These small lakes experienced more intense changes, manifested as the appearance/disappearance (A/D) of lakes. A density map of lake A/D, produced using the ArcGIS Spatial Analyst tool, was used to represent violent lake changes, as shown in Figure 10. Statistical analysis showed that almost no A/D occurred in larger lakes. Only eight A/D lakes were larger than 10 km², the largest of which was 25 km². The average area of the A/D lakes is approximately 0.015 km². Overall, lake appearance and disappearance occurred in roughly the same areas. These locations are often at the junction of different watersheds, such as the southwest Tarim Basin, the north of the Inner Plateau Basin, and the northwest and northeast Yangtze Basin. The most prominent is the Qaidam Basin, where the number of A/D lakes accounts for 1/5 of the total. After the Qaidam Basin, the Yellow Basin also has a non-negligible number of A/D lakes. Except for the southern TP, the frequency of lake A/D was lowest at the TP margin, including the Amu Darya, Hexi Corridor, and Yangtze basins. The area and number of A/D lakes were approximately 1,200 km² and 100,000, respectively. Consistent with our earlier results (Figure 8), these lakes appeared more frequently than they disappeared, with a number ratio of 22.2% and an area ratio of 54.7%. The A/D of lakes has temporal and spatial regularities. The appearance of lakes tends to expand from the inside to the outside (Figure 10 (a)), and the disappearance of lakes tends to shift from west to east (Figure 10 (b)). It is noteworthy that in the strip extending from the southern edge of the TP to Tarim (corresponding to the Himalayas and Karakoram), although there are few lakes, lake A/D is frequent. This trend is clearly observed in 2021, when there were more new lakes than extinct lakes (approximately 53.7%).
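The appearance/disappearance bookkeeping and a coarse density count can be sketched as follows. This is a simplified illustration, not the procedure used in this study: it assumes that each lake carries an identifier that is stable between years, it replaces the ArcGIS density map with a simple count per grid cell, and the lake records are hypothetical.

```python
import math
from collections import Counter


def appeared_disappeared(prev, curr):
    """Return the ids found only in the later year and only in the earlier year."""
    appeared = set(curr) - set(prev)
    disappeared = set(prev) - set(curr)
    return appeared, disappeared


def density(lake_ids, centroids, cell_deg=1.0):
    """Count events per grid cell of cell_deg degrees (stand-in for a density map)."""
    counts = Counter()
    for lake_id in lake_ids:
        lon, lat = centroids[lake_id]
        counts[(math.floor(lon / cell_deg), math.floor(lat / cell_deg))] += 1
    return counts


# Hypothetical lake centroids (longitude, latitude) for two consecutive years.
lakes_2020 = {"a": (90.2, 35.1), "b": (90.7, 35.4), "c": (95.3, 37.0)}
lakes_2021 = {"a": (90.2, 35.1), "d": (90.9, 35.8),
              "e": (95.6, 37.2), "f": (95.1, 37.9)}

appeared, disappeared = appeared_disappeared(lakes_2020, lakes_2021)
appear_density = density(appeared, lakes_2021)
vanish_density = density(disappeared, lakes_2020)
```

Counting appearances and disappearances separately, as in Figure 10 (a) and (b), keeps the two processes distinguishable where they overlap in space.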

Feasibility of experimental program
The Qinghai-Tibet Plateau has a vast territory with numerous lakes. Possibly because of limitations in instruments and technology, few studies have attempted to map a new, complete TP lake dataset. Some scholars have attempted to provide more accurate lake datasets and have achieved good results. For example, the TPDC lake dataset and 'A lake data set for the Tibetan Plateau from the 1960s, 2005, and 2014' (Wan et al. 2019) have excellent time spans (from 1960 to 2021). Nonetheless, these studies still have deficiencies. Threshold-based extraction, the most widely used approach, has the advantage of speed but may have poor accuracy, which is less than 95% for larger (> 500 km²) lakes (Huth et al. 2020; Zhang, Li, and Zheng 2017). Methods based on machine learning, e.g. SVM, random forest, and U-Net, improve the extraction accuracy (Wangchuk and Bolch 2020; Zhang et al. 2020; Zhang, Zhang, and Zhu 2020). However, these methods have generally been applied over limited areas, and their portability is uncertain. The present study was designed to overcome these shortcomings and to complete a sufficiently comprehensive TP lake dataset.
To overcome the high cloud cover of the TP, we selected Sentinel-1, which carries a SAR sensor, to extract the lakes. Sentinel-1 data also have other advantages that make them comparatively ideal for our purpose: VV polarization, which is sensitive to water, and a spatial resolution of 10 m, which suits high-precision lake extraction. Moreover, the long ice period of the TP is another obstacle to large-scale lake extraction. Ice and water have quite different pixel characteristics in Sentinel-1 images, so we restricted the data acquisition period to avoid misclassification. Nevertheless, the limited time span of Sentinel-1 cannot be overcome; therefore, mapping a long-term lake dataset requires supplementary data. Using the GEE platform, we completed the preprocessing of Sentinel-1 data using ARD. The additional preprocessing helps to remove some of the inherent noise of Sentinel-1, but noise caused by image overlap remains. Therefore, more preprocessing must be considered when using Sentinel-1 images on a large scale (Fu et al. 2022). The most critical part is the automatic lake extraction method (D-LinkNet). Roads have characteristics similar to those of lakes, with connectivity, discrete distribution, and unique contours, which inspired us to introduce D-LinkNet into lake extraction. U-Net, a traditional encoder-decoder model, has been widely used as a mature lake extraction method (Feng et al. 2019; He et al. 2021). As an improved encoder-decoder model, D-LinkNet performs reliably. In addition, D-LinkNet adopts a pretrained ResNet34 as its encoder, which improves running speed and supports large-scale remote sensing image processing. The details are presented in the next section.

Details of lake extraction
The performance of lake extraction differs not only with a lake's location but also with its size. Lakes with an area of less than 0.1 km² are represented by fewer than 1,000 pixels in a Sentinel-1 image. These lakes usually hold less water and do not show clear lake features in the Sentinel-1 image. In addition, to minimize the misclassification of background objects, we adopted a conservative strategy for model training: in the training set, only pure water pixels are treated as lakes, and mixed pixels are labeled as background. These factors may reduce the extraction accuracy for small lakes.
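The pixel counts behind this argument follow from simple arithmetic, sketched below. The 10 m pixel size is the Sentinel-1 resolution stated in the discussion; the square lake shape and the assumption that every edge pixel is a mixed pixel are idealizations introduced only for this illustration, and the function names are hypothetical.

```python
def pixels_for_area(area_km2, pixel_size_m=10.0):
    """Number of square pixels needed to cover an area."""
    return area_km2 * 1_000_000 / (pixel_size_m ** 2)


def boundary_fraction(side_px):
    """Share of pixels on the outer ring of a side_px x side_px square lake."""
    total = side_px * side_px
    inner = max(side_px - 2, 0) ** 2
    return (total - inner) / total


small = pixels_for_area(0.01)      # lower counting threshold, 0.01 km²
medium = pixels_for_area(0.1)      # 0.1 km²
ring_small = boundary_fraction(10)     # 10 x 10 pixels = 0.01 km²
ring_large = boundary_fraction(100)    # 100 x 100 pixels = 1 km²
```

Under these idealized assumptions, about a third of the pixels of a 0.01 km² lake lie on its edge, compared with a few percent for a 1 km² lake, so labeling mixed edge pixels as background removes a much larger share of a small lake's area. This is consistent with the stronger underestimation reported for the smallest size class.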
Spatial differences in lake extraction performance may be mainly due to geomorphological differences. Here, we combined the topography and land cover type to discuss their impact on lake extraction (Figure 3). The main land cover types on the TP are grasslands, bare lands, woodlands, glaciers, deserts, saline-alkali lands, and swamps. Grassland and bare land cover most of the TP and coincide with its flattest region. This region covers the inner flow area, which has the largest number and area of lakes and the highest extraction accuracy in our lake dataset. Forests are mainly distributed in the eastern and southeastern TP and gradually decrease southward. The distribution of glaciers and forests is similar in the southeastern TP, and glaciers are extensively distributed in the northwestern TP. The mountains of the TP are usually covered with snow, glaciers, and forests, which makes lake extraction difficult. Few large lakes develop on these mountains, and the small lakes are affected by high altitudes and have perennial ice cover. Owing to the influence of slope, lake area, and ice cover, it is difficult to extract lakes in the mountains. Fortunately, Wang et al. (2020) extracted glacial lakes, which provides a comparison, with average relative area errors of ±13.2% in the high-mountain Asia region.
The huge Kunlun Mountains block the Indian Monsoon and guide the westerlies northward, isolating the Tarim Basin from the TP. This may be the reason why the variation in lakes here shows a regular yearly fluctuation. Deserts, saline-alkali land, and swamps are concentrated in the Qaidam Basin and its southern area and the Kunlun Mountains. As indicated in the results, in the northeastern TP, our dataset requires manual classification with prior knowledge to distinguish between lake-like areas (e.g. wetlands and swamps) and real lakes. Finding methods for more accurate lake extraction in the northeastern TP is also a challenge that needs to be addressed, which may require sufficient a priori knowledge and visual interpretation.

Potential characteristics of TP lakes
Although there have been many studies on lake changes, their regularity, and their mechanisms on the TP, we believe that detailed patterns of lake change in recent years are still lacking. With the extracted lake dataset, we hope to provide a relatively specific overview of lake changes and highlight some areas worthy of attention. Compared with the TPDC lake dataset, the trend for lakes larger than 1 km² is consistent with that in our lake dataset, and the trend for all lakes differs only in 2020-2021. In contrast to the conclusion that lake volume in the southern TP decreased over a long period (Qiao, Zhu, and Yang 2019; Lei et al. 2014), our study shows that, except in the western TP, lakes on the TP expanded before 2019 and stabilized thereafter. The variation of larger lakes was closer to the trend of the total area, and the variation of smaller lakes was closer to the trend of the total number. In particular, the change in the smaller lakes is more dramatic than that of the entire TP and of the larger lakes. The total area of small lakes is determined largely by their number, and these lakes frequently undergo A/D. This may be caused by the local environment and climate, as discussed in greater detail below.
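The comparison of number and area trends described above can be illustrated with a minimal sketch. It assumes that each yearly inventory is available as a mapping from a lake identifier to its area, that A/D denotes the appearance and disappearance of lakes between two years, and that "small" means no larger than 1 km². The lake identifiers, the areas, and the function names are hypothetical and are not taken from the dataset or its processing chain.

```python
# Illustrative sketch only: lake IDs and areas are made up, not dataset values.

def percent_change(start, end):
    """Relative change from start to end, in percent."""
    if start == 0:
        raise ValueError("start value must be non-zero")
    return (end - start) / start * 100.0


def appearance_disappearance(prev, curr):
    """Return (appeared, disappeared) lake IDs between two yearly inventories.

    prev, curr: dicts mapping a lake ID to its area in km^2.
    """
    appeared = set(curr) - set(prev)
    disappeared = set(prev) - set(curr)
    return appeared, disappeared


def summarize(prev, curr, size_threshold_km2=1.0):
    """Summarize number and area change, overall and for small lakes.

    The 1 km^2 threshold for "small" lakes is an assumption of this sketch.
    """
    small_prev = {k: a for k, a in prev.items() if a <= size_threshold_km2}
    small_curr = {k: a for k, a in curr.items() if a <= size_threshold_km2}
    appeared, disappeared = appearance_disappearance(prev, curr)
    return {
        "number_change_pct": percent_change(len(prev), len(curr)),
        "area_change_pct": percent_change(sum(prev.values()), sum(curr.values())),
        "small_number_change_pct": percent_change(len(small_prev), len(small_curr)),
        "appeared": appeared,
        "disappeared": disappeared,
    }


# Hypothetical inventories for two consecutive years (areas in km^2).
lakes_2016 = {"L1": 120.0, "L2": 0.4, "L3": 0.2}
lakes_2017 = {"L1": 126.0, "L2": 0.5, "L4": 0.3, "L5": 0.1}
summary = summarize(lakes_2016, lakes_2017)
```

In this toy case the single large lake dominates the total area, whereas the count is driven by the appearance and disappearance of small lakes, which mirrors the pattern described in the text.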
There are many fragmented lakes along the middle Kunlun Mountains, especially in eastern Hoh Xil (Figure 11(a)). Widespread glaciers and Gobi may have caused these lakes to form peculiar shapes. Clear contours and uncomplicated topography make changes in these lakes easy to capture, and a past study shows that the lakes are expanding (Fang et al. 2016). The saline-alkali land and swamps of the Qaidam Basin are mixed with many water bodies (Figure 11(b)), causing our model to misclassify some of these areas as lakes. However, the frequent A/D of lakes indicates that there may be strong hydrological processes. Some studies have suggested that permafrost degradation has contributed to the expansion of desertification and the enhancement of soil erosion in the Qaidam Basin (Cheng and Wu 2007; Jin et al. 2009). Others consider the climate of the basin to have transformed from warm and dry to warm and humid (Yafeng et al. 2003; Jiang et al. 2012; Chen et al. 2014). Our results show that the lake area in the Qaidam Basin has increased since 2016 (Figure 9(d)), which may support the view that the basin is becoming wetter. The frequent A/D of lakes in Zoige (Figure 11(c)) may be a result of the mutual transformation between lakes, swamps, and wetlands (Bai et al. 2013). In contrast to the worsening desertification and wetland degradation of the past few decades, lakes in the Yellow River Basin have changed little in recent years (Figure 9(c)). Almost all studies have attributed this environmental improvement to human activities (Qiu et al. 2009; Zhang, Wang, and Wang 2011; Hu et al. 2015).
Glacial lakes have become the focus of lake research on the TP because of their higher climate sensitivity and the potential risk of glacial lake outburst floods (GLOFs). Our lake dataset records approximately 7,000 lakes in the Hindu Kush-Karakoram-Himalayas-Nyainqentanglha-Hengduan region. This result appears consistent with the more than 5,000 glacial lakes reported in earlier studies (Zhang et al. 2015; Veh, Korup, and Walz 2020). The Karakoram Mountains have the world's largest glaciers in middle and low latitudes. The stabilization of the lake area in the Amu Darya and Indus basins (Figure 9(k), (l)) may be attributed to the positive glacier mass balance of the Karakoram (Kaab et al. 2015; Neckel et al. 2014). Surprisingly, we detected lake activity on Qogir (Figure 11(d)), which may be due to the steep terrain and highly dynamic glacier behavior (Iturrizaga 2011). High climate sensitivity and debris cover on glaciers make the Himalayas the most active region of glacial lake change on the TP (Scherler, Bookhagen, and Strecker 2011). Glacial lakes in the Himalayas are reported to be more than three times as hazardous as those in other regions (Veh et al. 2019; Zheng et al. 2021). Consistent with previous studies (Zhang et al. 2015; Qu et al. 2022), Himalayan glacial lakes are increasing, which was more evident in 2021 (Figure 10(a)). Furthermore, we found the most intensive lake activity in the Himalayan piedmont near the Shiquan River (Figure 11(e)). Because these are proglacial lakes, their activity may have a greater impact on the downstream Brahmaputra Basin (Song et al. 2016). Unlike the extensively studied Himalayas, lakes in the Hengduan Mountains have received less attention. In fact, the unique topography there creates distinct climate patterns with more pronounced effects on glacier retreat and lake expansion (Zongxing et al. 2009; Wang, Yao, and Yang 2011; Pan et al. 2012). The lakes in the Hengduan Mountains show great spatial differences, with the most significant lake changes in the Boshula Mountain in the west (Figure 11(f)). The Boshula Mountain also contains half of the glaciers in the entire Hengduan Mountains, and the retreat of the glaciers here may have led to the expansion of the lakes (Wang et al. 2017). However, for glacial lakes, stronger lake activity does not imply a higher GLOF likelihood; instead, calving glaciers and ice avalanches may be the driving factors (Veh et al. 2019).
Changes in the environment and climate have an important impact on changes in TP lakes. The TP possesses the largest areas of permafrost in the mid- and low-latitude regions of the world (Yang et al. 2010; Zhao et al. 2004). 'A new map of permafrost distribution on the Tibetan Plateau' (Zou et al. 2017) shows that the TP has 1.06 × 10⁶ km² of permafrost and 1.46 × 10⁶ km² of seasonally frozen ground. Permafrost is mainly distributed in the northern Inner Plateau Basin and the Indus Basin, while almost all of the rest is seasonally frozen ground. The higher temperature in the southern TP may indicate that the water contribution from permafrost is already limited there, while increased permafrost degradation will cause rapid lake expansion in the central and northern TP (Kang et al. 2010). In the Indus Basin, the number of lakes has increased significantly, which indicates that lakes there may be greatly affected by permafrost degradation. In the Inner Plateau Basin, however, the number of lakes has decreased. This may suggest that the driving factor for lakes in the Inner Plateau Basin is not permafrost, and their retreat is more pronounced. Precipitation contributes the majority of the water supply for lake expansion (74%), followed by glacier mass loss (13%) and permafrost degradation (12%). The strong El Niño event in 2015/2016 caused a decrease in precipitation on the TP, which may have led to a decrease in TP lakes in 2016 (Figure 8). The subsequent La Niña restored the area and number of the lakes (Lei et al. 2019). Similarly, the 2019 El Niño caused a decrease in TP lakes, especially small lakes. However, the subsequent La Niña did not increase the lakes. As of 2022, the ongoing La Niña event is in its third year. Unexpectedly, the TP lakes, which would be expected to increase, have remained stable. Given the current unusual climate background and lake change patterns, we believe that the TP lakes may undergo dramatic changes in the following years. In the future, TP lakes may reach a new turning point, but they are expected to maintain their expanding trend (Zhang et al. 2020).

Conclusions
The aim of the present research was to map the lakes on the Qinghai-Tibet Plateau and to analyze lake variation based on this dataset in support of further research. The TP lake dataset that we mapped had high accuracy, with an average area accuracy of 86.49% overall and 98.54% for lakes >1 km². The IoU values for the interior TP and for glacial lakes were 0.991 and 0.802, respectively. In general, our dataset had relatively good results for different lake types, but the extraction of wetlands and glacial lakes was not satisfactory. Furthermore, it is feasible to obtain preprocessed Sentinel-1 images based on GEE, but additional datasets are needed for long-term studies. Modifying the D-LinkNet framework to make it more suitable for special regions is a worthwhile direction for future work.
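For readers who wish to reproduce this kind of evaluation, the two metrics can be sketched as follows. IoU is the standard ratio of intersection to union of the predicted and reference lake masks. The area accuracy formula used here, 1 − |A_pred − A_ref| / A_ref, is an assumption of this sketch, since the exact definition is not restated in this section. The masks and function names are hypothetical.

```python
# Illustrative sketch only: masks are toy examples, and the area accuracy
# definition is an assumed one (1 - relative area error).

def iou(pred, ref):
    """Intersection over Union of two equally sized binary masks.

    pred, ref: lists of rows, each row a list of 0/1 pixel labels (1 = lake).
    """
    intersection = 0
    union = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                intersection += 1
            if p or r:
                union += 1
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0


def area_accuracy(pred_area, ref_area):
    """Assumed definition: one minus the relative area error."""
    if ref_area <= 0:
        raise ValueError("reference area must be positive")
    return 1.0 - abs(pred_area - ref_area) / ref_area


# Toy 3 x 3 masks: the prediction misses one reference lake pixel.
pred_mask = [[1, 1, 0],
             [1, 1, 0],
             [0, 0, 0]]
ref_mask = [[1, 1, 0],
            [1, 1, 1],
            [0, 0, 0]]

pred_pixels = sum(map(sum, pred_mask))  # 4 lake pixels
ref_pixels = sum(map(sum, ref_mask))    # 5 lake pixels

mask_iou = iou(pred_mask, ref_mask)                   # 4 / 5
mask_area_acc = area_accuracy(pred_pixels, ref_pixels)  # 1 - 1 / 5
```

Note that the two metrics need not agree in general: a prediction with the correct total area but a shifted outline would score a high area accuracy and a low IoU, which is one reason to report both.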
The TP lakes tended to stabilize after expanding during 2016-2021. Although TP lakes are expanding, smaller lakes show patterns of change distinct from those of larger lakes and may be more responsive to local environmental characteristics and climate change. Except for the northwestern TP, the lakes in each basin show a growing trend. Owing to the positive glacier mass balance of the Karakoram, lakes in the Amu Darya and Indus basins are stable. The Tarim Basin is separated from the TP by the Kunlun Mountains, giving it a climate regime different from that of the TP. Notably, we observed significant changes in the lakes around Hoh Xil, the Qaidam Basin, Zoige, Mountain Qogir, the Shiquan River, and Boshula Mountain. Hoh Xil has a large and dense group of small lakes that can effectively reflect climate change in the Kunlun Mountains. The unique land cover types from the Qaidam Basin to Zoige have created complex hydrological processes. Although the mechanism is unknown, these hydrological processes presumably occur during the mutual transformation between wetlands, swamps, lakes, and rivers. They are of great significance for soil erosion and ecological management in the downstream Yellow River Basin region. There are huge glaciers and ice avalanches on Qogir, which make the glacial lakes there highly hazardous and warrant continuous attention in the future. The middle Himalayas has the highest GLOF risk, with thousands of lakes. The collapse of glacial lakes near the Shiquan River may have incalculable impacts in China, India, and Nepal, where the Brahmaputra flows downstream. Boshula Mountain, which has the most glaciers in the Hengduan Mountains, is also experiencing glacier retreat and lake expansion, and related research remains limited. Under the current special climate background, the TP lakes may undergo drastic changes in the next few years.

Data availability statement
The data used in this study are available by contacting the corresponding author.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported by the National Natural Science Foundation of China [grant number 42171362].