Winter remote sensing images are more suitable for forest mapping in Jiangxi Province

ABSTRACT Jiangxi Province boasts the second-highest forest coverage in China. Its forests play a crucial role in providing essential ecosystem services and maintaining the ecological health of the region. High-resolution and high-precision forest mapping is essential for the timely and accurate monitoring of dynamic forest changes to support sustainable forest management. This study used Sentinel-2 images from four seasons in the Google Earth Engine (GEE) platform to map forest distribution. The classification results obtained with different classification algorithms and feature-variable combinations were compared and analyzed. Based on the overall accuracy, the optimal image season, feature combination and classification algorithm were selected, and forest maps of Jiangxi Province were produced for 2019 to 2021. The accuracy evaluation showed that the winter image classification results had the highest accuracy (above 0.88). The red edge bands carried by Sentinel-2 could effectively improve the classification accuracy. The Random Forest classifier was the optimal classification algorithm for forest mapping in Jiangxi Province. The resulting forest maps can be used to assess ecological health and ecosystem function. The study provides a scientific basis for accurate and timely extraction of forest cover and can serve as a valuable resource for forest management planning and future research.


Introduction
Forests are crucial in providing various ecosystem services vital for human economies and well-being, such as timber and non-timber products, carbon sequestration, watershed protection, habitat diversity and even recreation (Alamgir et al., 2016; Brockerhoff et al., 2017). Yet, forests have changed significantly and are now more endangered by climate change, fire and fragmentation (Bonan, 2008; Tariq, Mumtaz, Majeed, et al., 2022; Y. Wang et al., 2022). As a result, timely and accurate data on forest cover are critically needed to enable sustainable natural resource management, carbon storage and biodiversity research, as well as to provide a direct means of monitoring forest changes (Gómez et al., 2016; Häyhä et al., 2015; Qin et al., 2021; Tariq, Mumtaz, Majeed, et al., 2022).
Forest mapping has been extensively studied and analyzed from various perspectives. However, traditional field measurement techniques are time-consuming and cannot provide large-scale and continuous forest distribution information. Therefore, remote sensing technology has become a valuable tool for identifying the size, types and changes in forests due to its high temporal frequency and spatial resolution (Tariq & Mumtaz, 2022; Tariq, Mumtaz, Majeed, et al., 2022, 2022; Y. Wang et al., 2020). Many remote sensing-based products have been developed to cover a wide area of forest distribution and types (Giri et al., 2011; Gong et al., 2012; Hansen et al., 2013). However, these products may not always be up-to-date due to insufficient input data and a lengthy production process (Y. Wang et al., 2022). Additionally, their precision at finer scales is often unsatisfactory. A common concern is the accuracy of global maps, which are often less accurate than regional maps due to the limited sample size used (Barrett et al., 2016).
Sentinel-2 imagery is widely used to extract forest information due to its high spatial resolution, frequent revisits and rich spectral bands (Adjognon et al., 2019; Barakat et al., 2018; Ganz et al., 2020; Ma et al., 2021; Norovsuren et al., 2019; Ottosen et al., 2020; Szostak et al., 2017). Compared to other commonly used images, Sentinel-2 has red-edge spectral bands located between the red and NIR parts of the electromagnetic spectrum, allowing for more accurate chlorophyll content measurement for vegetation monitoring (Filella & Penuelas, 2007). The red edge bands have been shown in multiple studies to improve classification accuracy by enhancing separability between different objects (Forkuor et al., 2017; Qiu et al., 2017). Additionally, red edge indices significantly enrich the feature space for land object classification and improve classification accuracy (Gorelick et al., 2017; S. Y. Huang et al., 2018; Kim & Yeom, 2014; Ustuner et al., 2014).
Utilizing multi-season remote sensing images is advantageous for forest mapping as it enables the detection of seasonal changes in vegetation and provides a comprehensive landscape view (Feng et al., 2016; Gao et al., 2015; R. Li et al., 2022; Zhu et al., 2012). However, most classification studies based on multi-seasonal imagery only consider the dry and rainy seasons, with four-season studies being rare (Fundisi et al., 2021; Kaszta et al., 2016; Pu et al., 2018). Given the study area's primary forest type of evergreen coniferous forest, we hypothesized that using four-season remote sensing imagery would improve forest mapping accuracy. This is because, in winter, the forest remains evergreen while other easily confused vegetation, such as grasses and shrubs, is dormant, and agricultural land has been harvested.
Machine learning algorithms have developed rapidly in recent years. Classification algorithms such as random forest (RF), support vector machine (SVM) and classification and regression tree (CART) have been widely used for forest mapping (Talukdar et al., 2020). RF can perform feature selection, while SVM is effective for small sample data (Pelletier et al., 2016; Y. Yang, D. Yang, X. F. Wang, et al., 2021). CART is suitable for large-scale datasets (Shao & Lunetta, 2012). Many studies on forest cover mapping have employed various machine learning algorithms and compared their accuracy (Camargo et al., 2019; Jamali, 2019; X. Li et al., 2016; Rogan et al., 2008).
Jiangxi Province has a forest coverage rate of 61.16%, the second highest in China. However, due to intensive management and activities such as logging, shifts in forest growth and loss have significant implications for forest structure and function (Ahrends et al., 2017). The Jiangxi Forestry Bureau conducts forest surveys every 5 years and issues resource reports, but dynamic forest changes are not accurately and promptly monitored due to the scattered and short-term impacts of disturbance (Hua et al., 2021). Therefore, more efficient and comprehensive methods, such as remote sensing techniques, must be adopted to monitor forest changes promptly and accurately. This study used RF, SVM and CART algorithms on the Google Earth Engine (GEE) platform to integrate four-season image data from Sentinel-2. Our study aimed to (1) assess the role of four-season remote sensing imagery in improving the accuracy of forest mapping, (2) analyze the impact of different feature combinations, especially the red edge bands and red edge indices, on the classification results, (3) evaluate the accuracy of each classification algorithm and (4) select the optimal combination to map the forest distribution of Jiangxi Province from 2019 to 2021. Overall, our study provides a scientific basis for the accurate and timely extraction of forest distribution, identifies the most suitable seasonal images for forest mapping in Jiangxi Province and offers a research reference for predicting future forest cover changes.

Study area
Jiangxi Province, located in southeastern China, covers an area of 166,900 km² between 24°29'14" and 30°04'43" north latitude and 113°34'18" and 118°28'56" east longitude (Figure 1). It lies in a subtropical warm and humid monsoon climate zone, with an average annual temperature of 16.3-19.5 °C and a yearly precipitation of 1341-1943 mm, making it one of the rainiest provinces in China. Hills and mountains dominate the area and the terrain slopes from south to north. The forest coverage rate has significantly increased from 36.70% in 1973-1976 to 61.16% in 2014-2018, ranking second in the country (Hua et al., 2021). Despite having abundant forest resources, including evergreen broad-leaved, evergreen coniferous and mixed forests, these forests are extensively affected by various natural and human-induced disturbances, such as logging, insects, disease and fire (Deng et al., 2016).

Sentinel-2 image data
Sentinel-2A carries a multispectral imager covering 13 spectral bands, including visible light, red-edge, near-infrared and short-wave infrared. The spatial resolution of the visible light bands and the near-infrared band is 10 m, while that of the red-edge and short-wave infrared bands is 20 m. To ensure a consistent resolution, we interpolated the red-edge bands to match the 10 m resolution of the visible and near-infrared bands. We used Sentinel-2 Level-2A products from 2019 to 2020 covering all four seasons. To ensure high-quality imagery, we filtered the image collections in GEE using three criteria: (1) the study area within Jiangxi Province, (2) the time interval from March 2019 to February 2020 and (3) a cloud percentage threshold of 20%. By applying these filters, we selected 1088 Sentinel-2 images in Jiangxi, each containing 12 spectral bands and three quality assessment (QA) bands. We used the QA60 band to create a cloud mask and remove clouds, resulting in cloud-free images.
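The QA60-based cloud masking step can be illustrated outside GEE. The following is a minimal sketch that assumes the standard Sentinel-2 QA60 encoding, in which bit 10 flags opaque clouds and bit 11 flags cirrus. The paper does not state which bits were tested, and the function names and sample values are hypothetical.

```python
# Bit positions in the Sentinel-2 QA60 band (assumed standard encoding).
OPAQUE_CLOUD_BIT = 10
CIRRUS_BIT = 11

# Combined bitmask covering both cloud flags.
CLOUD_MASK = (1 << OPAQUE_CLOUD_BIT) | (1 << CIRRUS_BIT)


def is_clear(qa60_value: int) -> bool:
    """Return True when neither the opaque-cloud nor the cirrus bit is set."""
    return (qa60_value & CLOUD_MASK) == 0


def mask_clouds(pixels, qa60):
    """Replace cloudy pixels with None and keep clear pixels unchanged."""
    return [p if is_clear(q) else None for p, q in zip(pixels, qa60)]


# Hypothetical reflectance values with their QA60 codes
# (1024 = opaque cloud bit set, 2048 = cirrus bit set).
reflectance = [0.12, 0.45, 0.30, 0.08]
qa = [0, 1024, 2048, 0]
masked = mask_clouds(reflectance, qa)
```

In this toy example only the first and last pixels survive the mask; on real imagery the same test would be applied per pixel before compositing.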

SRTM data
The study utilized highly accurate elevation data from the SRTM dataset, which was measured using synthetic aperture radar by NASA and NIMA (Su et al., 2021). We accessed and processed the digital elevation data from the SRTMGL1_003 products on the GEE platform for efficient geospatial data analysis.

Training and validation sample data
The accuracy and sufficiency of the training and validation samples significantly affect the accuracy of forest mapping (C. C. Li et al., 2014). To manage and analyze land cover and land use data, the Food and Agriculture Organization of the United Nations (FAO) developed Collect Earth, an open-source land monitoring software (Bey et al., 2016; Tzamtzis et al., 2019). We imported 200 random sample points from ArcGIS into Collect Earth and visually classified land use types as forest or other using Mapbox Satellite as the base map image (Figure 1(d)). Finally, we randomly divided the sample points into 70% training and 30% validation sets to assess the classification performance.
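The random 70/30 division of sample points can be sketched as follows. This is an illustration only: the fixed seed, the function name and the placeholder labels are assumptions, and the paper does not describe how the split was implemented.

```python
import random


def split_samples(samples, train_fraction=0.7, seed=42):
    """Shuffle the samples reproducibly and split them into two sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 200 hypothetical sample points as (id, label) pairs; the labels are placeholders.
points = [(i, "forest" if i % 2 == 0 else "other") for i in range(200)]
train, validation = split_samples(points)
```

With 200 points this yields 140 training and 60 validation samples, and fixing the seed keeps the split repeatable across runs.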

Research methodology
This study aims to map the forest distribution of Jiangxi Province from 2019 to 2021 using the classification algorithms of the GEE cloud platform and the four-season image data from Sentinel-2. The methodology consists of four steps: image preprocessing, feature construction, image classification and accuracy evaluation, as illustrated in Figure 2.

Feature selection
Spectral bands. To investigate the contribution of red-edge bands in forest mapping, we selected seven relevant Sentinel-2 image bands (B2, B3, B4, B5, B6, B7 and B8) (Table 1). These spectral bands were then divided into two groups: one with no red-edge participation (B2, B3, B4 and B8) and the other with red-edge involvement (B2, B3, B4, B5, B6, B7 and B8). This approach allowed us to assess the impact of adding red-edge bands on the accuracy of forest cover identification.

Spectral indices. This study also aimed to explore the impact of red-edge indices on forest distribution mapping accuracy. Two commonly used vegetation indices, the Normalized Difference Vegetation Index (NDVI) and Soil Adjusted Vegetation Index (SAVI), were added to the experimental group without red-edge participation (Jordan, 1969; Rouse et al., 1974). In the experimental group with red-edge involvement, NDVI was replaced with Normalized Difference Vegetation Index Red Edge (NDVIRE), and Sentinel 2 Red Edge Position (S2REP) was added (Frampton et al., 2013; Rouse et al., 1974). All the spectral indices were calculated from their respective formulas on the GEE cloud platform. In these formulas, NIR represents the Near Infrared band, RED represents the Red band, and RED EDGE1-3 represent the Red Edge bands. Adding these indices is expected to improve the accuracy of forest distribution mapping by providing additional information on vegetation and land cover characteristics.
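The four indices can be written compactly in code. The sketch below uses the commonly published forms of NDVI, SAVI, a red-edge NDVI and S2REP. The soil adjustment factor of 0.5 and the choice of RED EDGE1 (band B5) in NDVIRE are assumptions rather than values stated in this paper, and the function names and sample reflectances are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - RED) / (NIR + RED)."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index; soil_factor = 0.5 is an assumed default."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def ndvire(nir, red_edge1):
    """Red-edge NDVI, assuming RED EDGE1 (B5) takes the place of RED."""
    return (nir - red_edge1) / (nir + red_edge1)


def s2rep(red, red_edge1, red_edge2, red_edge3):
    """Sentinel-2 Red Edge Position in nm (commonly published form)."""
    midpoint = (red + red_edge3) / 2
    return 705 + 35 * (midpoint - red_edge1) / (red_edge2 - red_edge1)


# Hypothetical surface reflectances for a vegetated pixel.
pixel = {"red": 0.04, "re1": 0.10, "re2": 0.25, "re3": 0.32, "nir": 0.36}
features = {
    "NDVI": ndvi(pixel["nir"], pixel["red"]),
    "SAVI": savi(pixel["nir"], pixel["red"]),
    "NDVIRE": ndvire(pixel["nir"], pixel["re1"]),
    "S2REP": s2rep(pixel["red"], pixel["re1"], pixel["re2"], pixel["re3"]),
}
```

NDVI, SAVI and NDVIRE are dimensionless ratios, whereas S2REP is a wavelength in nanometres, so the features sit on very different numeric scales.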

Terrain features. Adding terrain features has been shown to improve classification accuracy (Y. Yang, D. Yang, X. F. Wang, et al., 2021). In this study, we used the SRTMGL1_003 digital elevation data product in GEE to derive elevation and slope, which were included as independent variables in the classification. We added the selected feature variables to the combination of features with and without red-edge involvement to explore their contribution to the classification, as outlined in Table 2.
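As an illustration of how slope can be derived from gridded elevation, the sketch below applies a central-difference gradient to a small elevation grid. It is not the routine used by GEE; the 30 m cell size, the function name and the toy grid are assumptions.

```python
import math


def slope_degrees(dem, row, col, cell_size=30.0):
    """Slope at an interior cell from central differences, in degrees."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


# Hypothetical 3 x 3 elevation grid (metres) rising 30 m per 30 m cell eastward.
dem = [
    [0.0, 30.0, 60.0],
    [0.0, 30.0, 60.0],
    [0.0, 30.0, 60.0],
]
center_slope = slope_degrees(dem, 1, 1)
```

A surface that rises one metre per metre of horizontal distance has a slope of 45 degrees, which is what the toy grid encodes.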

Classification algorithms
Random forest. Breiman (2001) proposed the random forest (RF) algorithm, an ensemble machine learning method that builds many classification trees on bootstrap samples and combines their votes. RF has been widely used in various fields, including ground object extraction, image classification and regression analysis (Belgiu & Drăguţ, 2016; Cánovas-García et al., 2017; Jin et al., 2018; Kelley et al., 2018; Maxwell et al., 2019; Millard & Richardson, 2015; Pelletier et al., 2016; Teluguntla et al., 2018). In RF-based remote sensing image classification, the primary factors affecting accuracy are the number of decision trees and the minimum number of leaf nodes. Previous research suggests that the values of the RF parameters have a limited effect on the accuracy (Pelletier et al., 2016). In this study, we set the number of decision trees to 40 and used the default value for the minimum leaf population. The explain function in GEE was used to calculate the importance scores of the selected features, which are relative and vary with the features and sample data involved in the classification (Q. Y. Li et al., 2020; C. Liu et al., 2020).

Support vector machine. Vapnik introduced the support vector machine (SVM) classification algorithm in 1995 (Shao & Lunetta, 2012). SVM is a popular choice for remote sensing image classification due to its ability to handle high-dimensional data, prevent overfitting and achieve high accuracy (C. Huang et al., 2010; Praticò et al., 2021). Previous research has shown that SVM performs exceptionally well with small samples (Shi & Yang, 2015; Y. Yang, D. Yang, X. F. Wang, et al., 2021). Although selecting appropriate parameters is crucial for SVM classification, default values are generally suitable for many classification problems. Given the research scope, we used the default parameter values in this study.

Classification and regression tree. The classification and regression tree (CART) is a popular single-tree decision classifier used for MODIS and TM image classification due to its simplicity, fast computation and ease of interpretation (Shao & Lunetta, 2012; Y. Yang, D. Yang, X. F. Wang, et al., 2021). Two key parameters must be considered to optimize CART performance for remote sensing image classification in GEE: the minimum leaf population and the maximum number of nodes. This study used the default parameter values for classification calculations.

Accuracy assessment
The confusion matrix is a widely used method for evaluating the accuracy of image classifications (Rana et al., 2020; Z. H. Wang et al., 2017). In this study, we utilized the confusion matrix in GEE to calculate two evaluation indices: the overall accuracy and the kappa coefficient. Together, these metrics assess the accuracy of the classification results. The overall accuracy is the proportion of validation samples that are correctly classified. The kappa coefficient provides insights into the model's performance by measuring the agreement between the sample data and the predicted values (Mahdianpari et al., 2018).
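Both metrics follow directly from the confusion matrix, as the sketch below shows using the standard definitions of overall accuracy and Cohen's kappa. The 60 validation labels are invented for illustration and are not results from this study; the function names are hypothetical.

```python
def confusion_matrix(actual, predicted, classes):
    """Rows are reference classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def overall_accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa(matrix):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix)
        for i in range(len(matrix))
    ) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical validation set: 30 forest and 30 other reference samples.
actual = ["forest"] * 30 + ["other"] * 30
predicted = ["forest"] * 25 + ["other"] * 5 + ["forest"] * 3 + ["other"] * 27
matrix = confusion_matrix(actual, predicted, ["forest", "other"])
oa = overall_accuracy(matrix)
k = kappa(matrix)
```

Because kappa subtracts the agreement expected by chance, it is lower than the overall accuracy for the same matrix, which is why the two are usually reported together.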

Four-season images comparison
Table 3 displays the accuracy of the SVM, CART and RF algorithms for forest mapping using different feature combinations of Sentinel-2 four-season images. Winter images attained the highest average accuracy of 0.88. The highest classification accuracy across the experimental combinations was also obtained with winter images, with an accuracy of 0.92. Spring imagery had the second-highest average accuracy, while summer images had the poorest classification accuracy. These findings suggest that winter images are the most suitable for forest mapping in the study area.

Feature importance
Table 3 shows that adding red edge bands to the feature combination improved classification performance, especially for the RF classifier. The red edge bands increased the accuracy of spring and summer images by 0.02 in the RF classifier but did not affect autumn and winter images. However, the advantage of red edge bands was less significant in the CART classifier, with only a slight improvement in the accuracy of summer images and a reduction in the accuracy of spring, autumn and winter images.
To determine the importance of each feature variable for classification accuracy, we used the importance scores provided by the RF classifier, in which variables with higher scores are more critical (Sun et al., 2020). We selected the experimental group with the highest classification accuracy in Table 3 (winter images with red edge bands using the RF classification algorithm) for importance score calculation. Figure 3 shows the distribution of importance scores for the feature variables used in this experimental group.
According to Figure 3, the red edge indices have the highest importance scores, followed by the spectral bands, whereas the topographic features have relatively low importance scores. Notably, NDVIRE has the highest importance score (14.8233), significantly higher than that of the second-ranked feature variable, B4 (8.5921). However, the importance score of S2REP was unexpectedly low at 4.7946. The remaining feature variable importance scores are relatively similar, ranging from 6.1892 to 8.5921. Specifically, B4, B8 and elevation have importance scores above 8, while slope, B2 and B6 have relatively low importance scores ranging from 6 to 7.

Classifier comparison
Table 3 shows that RF achieved the highest classification accuracy in the region, with an overall accuracy above 0.85 and a maximum of 0.92. SVM is the second-best classifier, with a classification accuracy above 0.8, while CART has the lowest classification accuracy among the three, with an accuracy above 0.67.
Figure 4 illustrates the forest distribution obtained from the winter images with red edge bands using different classification algorithms. The forest maps generated by the RF and SVM classifiers exhibit highly consistent distribution patterns, while the forest map generated using the CART classifier presents a more fragmented forest area than the other two classifiers.

Forest map
Based on the overall accuracy results, we found that the winter images with red edge bands using the RF classification algorithm were the best combination. Therefore, we used this feature combination to map the forest in 2019, 2020 and 2021 and obtained the overall accuracy for each year. As shown in Table 4, the maps had an overall accuracy above 0.88 for each of the 3 years.
The forest maps produced for the three years were used to track forest distribution changes from 2019 to 2021 (Figure 5). Significant land cover changes were concentrated in the study area's northern, western and southeastern regions. The northern area experienced an increase in forest cover, likely due to reforestation efforts and the conversion of agricultural land to forest during this period. Conversely, both the western and southeastern areas experienced forest loss followed by regrowth, possibly due to logging and forest fires.
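Change patterns of this kind can be read from the per-pixel sequence of yearly classes. The sketch below is one possible way to label such sequences, not the procedure used in this study; the category names, the function name and the five-pixel example are all hypothetical.

```python
from collections import Counter


def describe_trajectory(states):
    """Label a per-pixel sequence of yearly classes (1 = forest, 0 = other)."""
    first, last = states[0], states[-1]
    if all(s == 1 for s in states):
        return "stable forest"
    if all(s == 0 for s in states):
        return "stable non-forest"
    if first == 1 and last == 1:
        return "loss followed by regrowth"
    if first == 0 and last == 0:
        return "temporary gain"
    return "gain" if last == 1 else "loss"


# Hypothetical binary forest maps, flattened to five pixels per year.
maps = {
    2019: [1, 1, 0, 1, 0],
    2020: [1, 0, 0, 1, 1],
    2021: [1, 1, 1, 0, 1],
}
trajectories = [
    describe_trajectory(pixel)
    for pixel in zip(maps[2019], maps[2020], maps[2021])
]
summary = Counter(trajectories)
```

Counting the labels gives a simple area summary per change type, provided all pixels cover the same ground area.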

Comparison of the performances of multispectral images in four seasons
Our study highlights that winter remote sensing images are more suitable for forest mapping in our study area. The result contrasts with findings in other regions where the application of growing season imagery for mapping forest distribution was deemed more precise (Gao et al., 2015; Q. S. Liu et al., 2019; Townshend et al., 2012; Tucker et al., 2004). This difference is mainly due to the prevalence of evergreen forest types in the study area. The presence of dormant vegetation during the winter helps reduce confusion with forests in feature recognition. Deciduous vegetation exhibits reflectance characteristics that differ significantly from those of evergreen species: it lacks a green peak and a red edge effect and has lower reflectance in the NIR band.
Figure 6 displays the annual and seasonal average values of NDVIRE, the feature with the highest importance score in Figure 3. Both land use categories exhibit higher NDVIRE values during summer but lower values in winter compared to the year-round average. The differences in winter NDVIRE values are more consistent with the forest distribution results in Figure 4. Furthermore, Figure 7 shows that the two land use categories differ primarily in winter regarding NDVIRE values. During the growing season, the non-forest vegetation experiences significant NDVIRE fluctuations that could cause feature misclassification. These findings support our hypothesis that winter remote sensing images, selected on the basis of prior knowledge, are more suitable for mapping forests in Jiangxi Province.

The effects of feature variables on classification performance
The red-edge bands and indices are significant in the classification process, as evidenced by the accuracy of the various experimental combinations and the importance scores. Red-edge bands are particularly useful for differentiating morphologically similar plants because they capture the sharp increase in vegetation reflectance (Filella & Penuelas, 2007; Fundisi et al., 2021). In forest mapping, red-edge bands have also been emphasized in previous studies. Son et al. used RapidEye imagery to map mangrove density in Central America, and Ottosen et al. used Sentinel-2 satellite imagery to map tree cover in Europe; both found that combinations including red edge bands had the highest accuracy (Ottosen et al., 2020; Son et al., 2017). Additionally, some studies confirm that red-edge indices based on the red-edge bands can effectively improve classification accuracy by expanding the feature space of the object (Kim & Yeom, 2014; Shamsoddini & Raval, 2018). Therefore, our study further highlights the value of Sentinel-2's red-edge bands and indices in forest mapping.

Comparison of the performances of the different classifiers
This study compared RF, SVM and CART for forest mapping in Jiangxi Province, with the highest overall accuracy achieved by RF, followed by SVM and CART. Previous studies have reported both similar and different results (Eskandari & Ali Mahmoudi, 2022; Na et al., 2010; Senf et al., 2020; Shi & Yang, 2017). The variations in classification effectiveness may depend on the study area, topography and land cover types (Y. Yang, D. Yang, X. F. Wang, et al., 2021). Therefore, the finding that RF is the optimal classifier for forest mapping in Jiangxi Province only applies to this region.

Limitations and research prospects
Like other studies, this study has several limitations. Although the 3-year forest map is accurate, with an overall accuracy above 0.88, it only considers two land cover types. Furthermore, selecting samples through visual interpretation with Collect Earth is labor-intensive and may not be suitable for studying complex land cover types at a large scale. To overcome this limitation, we will employ a technique that efficiently and automatically transfers reference-year ground-truth samples based on Sentinel-2 images (Ghorbanian et al., 2020). In future studies, we plan to improve forest mapping accuracy by integrating the vegetation identification ability of Sentinel-2 red edge bands with Sentinel-1 multi-temporal data. With the increasing popularity of remote sensing data in object recognition, we hope to enhance our ability to classify land cover types accurately in more complex environments.

Conclusion
This study tested the hypothesis that using four-season remote sensing imagery would improve forest mapping accuracy. To achieve this, we employed three algorithms (RF, SVM and CART) to classify four-season images of Sentinel-2. Winter images achieved the highest accuracy for forest mapping (0.88), likely due to the distinction in winter between the evergreen vegetation dominating the study area and other vegetation. Furthermore, our findings indicated that the red edge bands and indices carried by Sentinel-2 significantly improved classification accuracy. RF was the most suitable algorithm for mapping forest distribution in Jiangxi Province, followed by SVM and CART. We generated forest maps of Jiangxi Province from 2019 to 2021 using the highest accuracy feature combinations, with a classification accuracy of >0.88. High-resolution satellite images provided crucial insights into the spatiotemporal dynamics of forests in Jiangxi Province and helped assess the sustainability of forest resources and climate change. Our study provides a scientific basis for accurate and timely extraction of forest distribution and is a valuable resource for forest management planning and future research.

Figure 1. The location of Jiangxi Province (a), elevation (b), the total number of Sentinel-2 observations during the study period (c) and sample distribution (d).

Figure 2. The technical roadmap of this study.

Figure 3. Importance score distribution for feature variables used in the experimental group with the highest classification accuracy (winter images with red edge bands using the RF classification algorithm).

Figure 4. Forest distribution obtained from winter images with red edge bands using the RF classification algorithm (a), forest distribution obtained from winter images with red edge bands using the SVM classification algorithm (b), forest distribution obtained from winter images with red edge bands using the CART classification algorithm (c).

Figure 5. Forest cover change 2019-2021. Map of forest distribution from 2019 to 2021 for three areas of high change (a-c).

Figure 6.The average NDVIRE values for forest and the other category throughout the year (a), the average NDVIRE values for forest and the other category throughout the spring (b), the average NDVIRE values for forest and the other category throughout the summer (c), the average NDVIRE values for forest and the other category throughout the autumn (d), the average NDVIRE values for forest and the other category throughout the winter (e).

Figure 7. Monthly averages of NDVIRE for each land class from 2019 to 2020.

Table 1. Characteristics of Sentinel-2 MSI bands used in this study.

Table 3. The classification accuracy of different experimental combinations.

Table 4. Classification accuracy of forest maps in 2019, 2020 and 2021.