Multi-year mapping of flood autumn irrigation extent and timing in harvested croplands of arid irrigation district

ABSTRACT Flood irrigation after crop harvest, e.g. autumn irrigation (AI), is a common irrigation practice in arid and semi-arid regions like Hetao Irrigation District (HID) in Northwest China to increase soil moisture and leach soil salt. Detailed information about the extent, timing, and amount of AI is imperative for modeling agro-hydrological processes and irrigation management. However, little attention is given to the identification of the above AI factors. There are basically three major difficulties in estimating the annual changes in AI, including a suitable index to identify AI, temporal instability of thresholds, and an effective validation method for irrigation timing. Therefore, this study proposes a simple and effective threshold-based method to extract the extent and timing of AI in the HID using MODIS water indices at a daily timescale. The Multi-Band Water Index (MBWI) time series is first reconstructed using an adaptive weighted Savitzky-Golay filter and then used to identify the AI extent and time. The proposed model has a stronger generalization capability both in time and space due to robust thresholds selected from the Z-score normalized feature variable. The model is validated both at pixels generated by the segmentation of Sentinel-derived MBWI using a threshold-based model and at sampling points from the field survey. Results show that the model performed well with an overall accuracy of more than 90.0% for the irrigation area. The overall accuracies of irrigation timing are 76.4% and 91.7% based on the middle-to-late and whole irrigation periods, respectively. We found a decreasing trend in the AI area and a gradual delay in the starting time of AI in the HID, mainly due to changes in cropping patterns, climate, and irrigation fees. Overall, the model is promising in identifying flood irrigation extent and timing in large irrigation districts and is helpful for irrigation scheduling. GRAPHICAL ABSTRACT


Introduction
Flood irrigation in harvested croplands is beneficial in increasing soil water storage, regulating soil temperature, reducing pests in cropland, and/or leaching salt (Xiong et al. 2021). Flood irrigation has been adopted in many regions of the world (Jing et al. 2020;Negri et al. 2020), especially in the irrigation districts of arid and semi-arid regions in Northwest China. It is also known as autumn or winter irrigation (AI) because it is generally applied in late autumn to early winter after crop harvest. AI is quite different from irrigation during the crop growing season. The surface water inundated by AI often reaches several decimeters and takes several days to fully infiltrate. AI, therefore, consumes a large amount of water, e.g. AI accounts for about 1/3 of the annual irrigation water diversion to the Hetao Irrigation District (HID) from Mid-September to early November (Lu et al. 2019).
Irrigation during the crop growing season requires rational management, and so does AI. On the one hand, improper time and amount of AI may lead to secondary soil salinization issues and harm crop production (Mao et al. 2017). On the other hand, large irrigation districts in the arid regions of China are subjected to water scarcity problems due to the limited available water for irrigation. Therefore, agricultural water conservation and salinity control have become major issues for the sustainable development of irrigation districts in arid regions (Xu et al. 2010). Under the current circumstances of limited water availability, the efficient use of available water to address the soil salinity and its leaching effect is a major concern in these regions (Wen et al. 2020). In addition, the decrease in water supply for irrigation has led to changes in the spatiotemporal allocation of irrigation water and cropping patterns, which then have significant impacts on the hydrological processes, agricultural production, and the environment at a regional scale (Shi et al. 2020). Detailed irrigation information is an indispensable input to quantifying these changes in the regional agro-hydrological simulation models (Wen et al. 2022). Therefore, timely and accurate AI information is necessary for irrigation and agriculture administrations as well as for researchers.
Recently, the applications of remote sensing techniques (satellite data) to investigate irrigation during the crop growing season are significantly increased (Karthikeyan, Chawla, and Mishra 2020;Ozdogan et al. 2010). The investigation is performed at different spatial scales by adopting optical data (Deines et al. 2019;Salmon et al. 2015), microwave data Gao et al. 2018), or both (Bazzi et al. 2019;Elwan et al. 2022). Moreover, the investigation of irrigation during the growing season is mainly focused on the differences in the greenness index (Xie and Lark 2021) and/ or soil moisture between rain-fed and irrigated crops. However, very few studies have investigated AI to date because the above-mentioned methods may not be able to detect AI. Therefore, there is a significant research gap in identifying AI on a large scale. AI identification is almost similar to the identification of water bodies because water accumulates to the depth of several decimeters after the AI application. Water index (WI) can be used for the identification of AI. However, the identification of AI is more difficult than that of water bodies because the surface water ponding as a result of AI lasts only for several days. Therefore, the identification of AI requires high temporal resolution data, where Moderate Resolution Imaging Spectroradiometer (MODIS) with the daily temporal resolution is the most appropriate remote sensing data for studying short-term inundation (Dao, Mong, and Chan 2019).
Meanwhile, differentiation between water, ice, and snow is also an important aspect in the identification of AI because the inundated irrigation water after AI can freeze in some meteorological conditions, where ice and snow may have WI values close to or higher than water (Klein and Barnett 2003). Although freezing of water manifests the condition of being irrigated, the increase in WI due to freezing can lead to significant inconsistencies in feature values for lands irrigated at different times, which hinders the precise identification of AI extent and time. Therefore, snow and ice should be handled carefully while dealing with AI. Although many studies have compared the performances of different WIs for water body detection (Ji, Zhang, and Wylie 2009), little is known about their sensitivity to surface water change and freezing. Therefore, a robust index is required for AI identification which should be sensitive to changes in surface water but not to freezing.
Threshold-based models are one of the main approaches for identifying irrigated land or water bodies based on the selected index (Pervez and Brown 2010). The threshold-based model requires the selection of appropriate feature variables, which can be divided into two categories. The first category is based on the feature value in a specified day or the maximum/minimum values in the time series (L. Wang et al. 2021). The second category is based on the dynamic time series, where the combined index of different periods in a specific time profile is selected, or the time variation of the index is considered (Yeom et al. 2021). It has been proved that the time profile is more suitable to identify irrigation than the static spectral signatures at a one-time point (Y. Chen et al. 2018), but it emphasizes the processing of feature time series. Another drawback of the threshold-based method is that the thresholds cannot be compared temporally (Xie and Lark 2021). Therefore, pre-classification (e.g. wet and dry years classification (Pervez, Budde, and Rowland 2014) or data normalization techniques (Xie and Lark 2021)) were used to obtain thresholds available for multiple years to track annual changes in irrigation extent and timing.
The Z-score normalization method is one of the most promising techniques for normalizing temporal trajectories (Santana et al. 2018). It provides a statistical measure of deviations from the mean after taking into account the natural variability (Kreyszig 2010), and facilitates the user to define thresholds without considering the variations in detected objects due to disturbing factors such as satellite view angle (DeVries et al. 2020). Z-score has been successfully applied so far for fire detection (De Carvalho Júnior et al. 2015), tree planting detection (Chen, Jin, and Brown 2019), flood detection (DeVries et al. 2020), etc. Therefore, the Z-scorebased threshold model was considered in this study to identify AI. Similar to the identification of irrigated paddy rice fields and irrigation timing (Jeong et al. 2012), the AI threshold can be determined by statistical data for a particular study area, whereas AI time refers to the first time at which the threshold is reached.
In addition, a sufficient number of widely distributed and representative reference data is critical to identify and assess the accuracy of remote sensing-based AI extent and time. Currently, irrigation products are only available during the growing season and therefore cannot be used to validate AI identification. Visual interpretation of the high spatial resolution image or Google Earth is another common method of obtaining validation samples for the detection of irrigation areas (Jin et al. 2016). However, it is difficult to provide validation for irrigation timing due to the requirements of short satellite revisit periods and sufficiently accurate auxiliary information. Therefore, a simple and fast method that can extract irrigation distribution and irrigation time from high-resolution images on a large scale could be useful for the validation of AI mapping.
To summarize, there are still several key challenges in detecting the multi-year variations in AI area and timing, such as appropriate feature extraction, temporal incomparability of thresholds, and limited data available for accuracy assessment for the irrigation time. In order to address these problems, the main objective of this study is to develop a threshold-based model capable of extracting multi-year AI area and timing using remote sensing-based daily WI time series data, where a hierarchical decision tree classifier is used to determine the irrigation time for all irrigated pixels. The model is applied to the HID and validated from multiple perspectives using agricultural statistics, field survey data, and information extracted from high-resolution Sentinel-2 images on Google Earth Engine (GEE).

Study area
Hetao Irrigation District (HID), located in Inner Mongolia, is one of the three largest irrigation districts in China and the largest in the arid regions of Northwest China. It is a representative irrigation district that relies on AI for storing water and leaching salt. It consists of five sub-irrigation districts, including Wulanbuhe (WLBH), Jiefangzha (JFZ), Yongji (YJ), Yichang (YC), and Wulate (WLT) (Figure 1). The whole HID covers an area of about 1.189 million ha, where 0.733 million ha (62% of the total) is irrigated cropland that is mainly sown with corn, sunflower, and wheat. Natural land is another major land-use type and covers approximately 19% of the total area of HID, which consists of sparse grassland with scattered shrubs and abandoned cropland due to high soil salinity. The groundwater table depth in HID is generally shallow (Wen et al. 2020), with large annual cyclic variations from 0.5 to 3.0 m.
Agricultural production in HID is highly dependent on irrigation due to the arid and semi-arid climate conditions (low precipitation and strong evaporation potential). As reported in the Bayannur water resources bulletin, irrigation water is primarily diverted from the Yellow River, accounting for about 1/7 of the total water diversion in the Yellow River basin. The total amount of water diverted from the Yellow River is approximately 4.0-4.8 billion m 3 per year from 2010 to 2020. Water available for irrigation has decreased in recent years as a result of decreased runoff and integrated water resources management in the Yellow River basin. A proper irrigation management strategy is therefore essential for high-efficient use of limited irrigation water, particularly for AI, which accounts for more than one-third of the total diversion of irrigation water each year (Lu et al. 2019).

Materials and methods
The flowchart ( Figure 2) shows that the methodology for this study can be divided into four major steps: (a) data collection and pre-processing, (b) feature extraction, (c) development of the algorithm for retrieving irrigation information (extent and time), and (d) accuracy assessment.

Remote sensing data and preprocessing
Remote sensing data used in this study include the daily land surface reflectance of the MODIS (MOD09GA) from 2010 to 2020 and Sentinel-2 from 2016 to 2020.
The MOD09GA has 500 m spatial and daily temporal resolutions, which can be downloaded from https://ladsweb.modaps.eosdis.nasa.gov/. Sentinel-2 has a spatial resolution of 10-60 m and a temporal resolution of 2-5 days. We used Sentinel-2ʹs Top of Atmosphere (TOA) reflectance product processed  using GEE because the surface reflectance data is available for a very limited time on GEE. Several studies have reported a promising accuracy for TOA data in detecting surface water inundation (Seaton, Dube, and Mazvimavi 2020). The remote sensing data preprocessing included re-projection, clipping to the study area, and cloud and snow removal. For MOD09GA, the daily State Quality Assurance (QA) flags were used to mask pixels affected by cloud and snow. On the other hand, the QA60 bitmask band was used for Sentinel-2 to remove the cloud.
We also evaluated the potential of another MODIS product, MCD43A4, as a data source for AI identification. The MCD43A4 product corrects the problem of MOD09GA being affected by sensor viewing angle and solar zenith angle (Thompson and Paull 2017). However, it has a higher percentage of missing data in areas with high precipitation, aerosol concentration, or snow cover (Colditz et al. 2018).
The study period of September 15 to November 30 of each year was selected based on the historical statistics of starting and ending dates of AI. The comparison of the proportions of valid observations of the above-mentioned three products in the study period (given in Appendix A1, Figure 13) shows that MOD09GA is more suitable for AI identification, while Sentinel-2 can be used as a supplement to the validation dataset.

Field sampling data
For validation purposes, sampling points were collected in two field surveys from 19th to 24 October 2019 and 21st to 25 October 2020. The surveyed fields were divided into irrigation fields and non-irrigation fields. The information of each sampling field including the longitude and latitude of points along the field boundary and azimuth is recorded in detail. The longitude and latitude are recorded with a Juno SA Handheld Computer with GNSS (Global Navigation Satellite System) (Trimble, USA). The field-scale sampling data were then aggregated within a spatial window of 500 m to determine whether the corresponding MODIS pixel has been irrigated. If more than 50% of the pixel area is irrigated, it is taken as an AI sample. If the irrigated area of the pixel is less than 50% for the whole irrigation period, then it is considered an NAI sample (Hao et al. 2022;Tsela et al. 2014). The numbers of irrigated and non-irrigated pixels extracted from the field survey in 2019 and 2020 is given in Table 1.

Other auxiliary data
Land-use maps of the study area (with 30 m spatial resolution) for the years 2010 ( Figure 3a) and 2020 ( Figure 3b) are obtained from GlobeLand30 data sets (http://globallandcover.com/).
Globeland30 is a global land cover product developed by the Ministry of Natural Resources of the People's Republic of China. The overall accuracies (Kappa coefficients) of the GlobeLand30 datasets in 2010 and 2020 are 83.50% (0.78) and 85.72% (0.82), respectively (http://globallandcover.com/). In addition, the classification accuracy of cropland was 92% based on the data collected from an agricultural sample survey in HID in 2020, where the remaining 8% of the cropland is mainly classified as grassland.
The soil texture data ( Figure 3d) obtained from the Harmonized World Soil Database within the territory of China in 1:1,000,000, was provided by the Cold and Arid Regions Sciences Data Center at Lanzhou, China (http://westdc.westgis.ac.cn/). Statistical data of the AI area of each sub-irrigation district from 2010 to 2020 (where the data for WLBH and WLT are temporarily unavailable from 2019 to 2020) were provided by the Bayannur Institute of Water Conservancy Research. Furthermore, the statistical data of the AI area in the WLT after 2014 was not used for model calibration and validation, as the data for this period included only the Yellow River irrigation area and was incongruent with other areas.

Masking of potentially irrigated area
In order to be consistent with MODIS data, the landuse maps (Figures 3(a and 3b) were aggregated to 500-m pixel size using the nearest-neighbor approach. All the croplands and grasslands north of the flood bank in 2010 and 2020 land-use maps were used as a mask of potentially irrigated areas for further irrigation events detection. This is because croplands and grasslands are often confused in land cover classification (Shafizadeh-Moghadam et al. 2021) and they are often converted to each other in HID. Furthermore, croplands between the Yellow River and the flood bank ( Figure 1) generally do not irrigate in autumn. However, these croplands exhibit similar spectral characteristics to irrigated croplands as a result of shallow groundwater levels and high soil moisture due to the Yellow River and General Main Canal close to the flood bank. Therefore, these croplands were not considered to avoid misjudgment of the AI location.

Reconstruction of WI time series and WI selection
A daily continuous time series of each index was first generated by linear interpolation based on cloudless observation data and then filtered using an adaptive weighted Savitzky-Golay (S-G) filter, a noise reduction process proposed by Hu et al. (2020). The filter originally proposed for the MODIS-based Normalized Difference Vegetation Index (NDVI) is applied to the MOD09GA. This filter eliminates the effect of negative bias and improves the quality of the index time series.
In the filtering process, weights of indices on different days were determined from the quality control parameters of the MOD09GA products. The four-bit range of each reflectance band provided by the QC band has five situations in the study area, i.e. 0000, 1000, 1011, 1101, and 1110. According to MODIS Surface Reflectance User's Guide Collection 6 (Vermote, Roger, and Ray 2015), they represent different quality control aspects. In short, 0000 and 1000 are of high quality; 1101 are of average quality; 1011 and 1110 are of low quality. Combined with the cloud and snow conditions provided by the State band, the weight (ω i ) corresponding to the i-th point, is specified in Table 2. Initially, the filter window size is set to 7. In order to reduce the computation complexity and reserve the original high-quality data, the index values for points with weight ω i = 1 remain unchanged in the fitting process. The filter window will expand adaptively depending on the number of valid values (values Table 2. Weights for the adaptive weighted Savitzky-Golay filter determined from quality flags of the MOD09GA product. with weights greater than 0.1). The maximum filter window size should not exceed 15, because data beyond seven days before and after the current date has little influence on the central value.
We selected seven indices that could potentially be used for AI identification, including Normalize Difference Water Index (NDWI) using bands 4 and 6 (NDWI46) and bands 1 and 7 (NDWI17) (Boschetti et al. 2014), Multi-Band Water Index (MBWI) (X. Wang et al. 2018), the 2015 water index (WI2015) (Fisher, Flood, and Danaher 2016), Automated Water Extraction Index for images with shadows (AWEIsh) and without shadows (AWEInsh) (Feyisa et al. 2014), and Soil Moisture Monitoring of Remote Sensing index (SMMRS) (Zhan et al. 2006). We compared the sensitivity of these indices to changes in water surface and ice by analyzing all the field samples from 2019 and 2020. Overall, MBWI is the most suitable index for AI identification (see Appendix A2 for more details).

Selection of classification features
The feature variable derived from the MBWI curve is selected based on the following two hypotheses: 1) the peak MBWI value of irrigated land is higher than that of non-irrigated land in the same sub-irrigation district; 2) the fluctuation of MBWI caused by rainfall/ snowfall, ice, and other factors during the study period is unlikely to cause a larger MBWI increase than AI under normal circumstances.
However, we did not directly find the maximum MBWI value in the filtered MBWI time series in some cases, which is mainly due to two reasons. First, some of the peaks are not effective even after filtering the MBWI because some fluctuations may be caused by external disturbances, e.g. subpixel clouds and incorrect flags in MODIS quality information (Y. Chen et al. 2018). Second, since the weighted S-G filter was mainly employed to improve the quality of the index time series without considering the smoothing of curves, the filtered series may still have significant fluctuations, making it difficult to extract the temporal information of AI. The water surface in HID after AI can be generally kept for 3-14 days depending mainly on the applied irrigation depth and soil texture. Therefore, a pixel having surface water for more than three consecutive days should be identified as an AI pixel. We further calculated the three-day moving average of the filtered MBWI time series, which can make the time series smoother with a more obvious changing trend, and thus help to determine the AI time more precisely. Finally, we selected the maximum value of the three-day moving average of the filtered MBWI time series (MMAVE-MBWI) as the feature variable.

Algorithm for identifying autumn irrigation land
The key point of the threshold-based method is to determine a threshold that can be used for multiple years. The variation of reflectance values of MOD09GA products with sensor view-angle and solar zenith angle and the application of S-G filter on MBWI time series may lead to inconsistent inter-class separability across years (Shao et al. 2016). Therefore, processing the data is necessary to stabilize the thresholds. Several studies have shown that the Z-score normalization method can be used to find outliers within the year, where the resulting maps developed from Z-score normalization are better than those derived from simple empirical threshold methods (B. Chen, Jin, and Brown 2019; DeVries et al. 2020). Therefore, we used the Z-score normalization method to eliminate the abnormal interannual fluctuations in each sub-irrigation district. The Z-score can be calculated using the following equation: where a and Z are the initial and normalized values of the selected feature; μ and δ are the average and standard derivation of the feature variable for a reference land-use type in the study region. As an objective baseline, the reference land-use type should be able to reflect inter-annual variations in feature values due to factors unrelated to the study objectives, and it should be available for each sub-irrigation district. We compared three possible reference land-use types, i.e. water bodies, all the croplands and grasslands, and part of croplands and grasslands with higher MMAVE-MBWI, to find an appropriate reference landuse type (see Appendix A3 for more details). After Z-score normalization, the threshold value of the feature variable (MBWI Te ) was calibrated using the statistical data for AI areas of each sub-irrigation district.

Algorithm for determining autumn irrigation time
As a result of fragmented croplands owned by different farmers, a cropland pixel of 500 m by 500 m would usually contain several pieces of cropland that may or may not be irrigated at the same time. More and more cropland in one pixel gets inundated with the application of irrigation, which allows the coarseresolution sensors to start detecting cropland with water cover. The change of MBWI in an AI pixel is comprised of the following stages ( Figure 4). First, before the application of AI, a cropland pixel changes from crop cover to bare soil immediately after crop harvest. MBWI curve shows a valley point shortly before AI (Point A in Figure 4) because the MBWI values for cropland are generally higher than those of bare soil (X. Wang et al. 2018). The MBWI at point A increases when the cropland in the pixel is irrigated. After reaching a specified threshold (Point B in Figure 4), a certain percentage of the cropland in the pixel is irrigated and thus the entire pixel can be considered to reach the level of AI. Therefore, the irrigation time is determined as the date of point B. The specified threshold (MBWI T ) corresponding to B is obtained by inverse normalization using the calibrated MBWI Te , which is MBWI rises continuously with AI until the irrigation stops. At this specific point, the MBWI reaches its peak (Point C in Figure 4) and the water depth of the entire pixel reaches its maximum level. After that, MBWI begins to decrease and reaches the threshold for the last time (Point D in Figure 4). At this specific threshold, it can be considered that the whole pixel is no longer inundated, i.e. no longer in the AI state. MBWI continues to decline and reaches the first valley point (Point E in Figure 4).
According to the characteristic points A, B, C, D, and E of the MBWI time series, characteristic periods can be defined, including peak range and initial, middle, and late irrigation stages. Peak range (points A to E in Figure 4) is defined as the time range to the peak between two valleys. If the ending valley is greater than the threshold, the time range should extend to the next valley below the threshold. Initial irrigation stage (points A to B in Figure 4) illustrates that the cropland in a pixel starts to be irrigated until the pixel is in the state of AI with a certain proportion of irrigated cropland. Middle irrigation stage (points B to C in Figure 4) is the period when the water content/ depth of the entire pixel keeps increasing with continued irrigation until it reaches the peak and the irrigation stops. Late irrigation stage (points C to D in Figure 4) is the period when the whole pixel is still inundated although irrigation has stopped, i.e. the whole pixel is still in the AI state. Overall, the whole irrigation period (points A to D in Figure 4) combines the initial, middle, and late irrigation stages.
In general, the MBWI time series of a pixel may have one or more peaks exceeding the threshold. The following steps ( Figure 5) are used to determine the date of AI for different cases.
(1) For pixels with only one peak exceeding the threshold (Figure 4a), the first day when the MBWI reaches the threshold is taken as the AI date.
(2) For pixels having multiple peaks that exceed the threshold and the highest peak appears first (Figure 4b), the first day when the MBWI reaches the threshold is taken as the AI date because the subsequent peak(s) may be caused by icing or other factors. (3) For pixels having multiple peaks that exceed the threshold where the highest peak did not appear first (Figure 4c), we developed a "proximity algorithm" based on the assumption that the AI dates of adjacent pixels should be close or roughly the same. The proximity algorithm is based on a 3 by 3 window centered on this pixel with the following steps: (a) If more than half of the AI dates of other pixels in a 3 by 3 window are within a certain peak time range of the central pixel (points A to E in Figure 5), the AI dates in this 3 by 3 window are considered to be consistent. In this case, the AI date of the central pixel is the date when the MBWI first reaches the threshold within this consistent peak range.
(b) If the conditions in step (a) are not met, the AI date is taken as the date when MBWI first reaches the threshold within the peak time range where the maximum rising segment of the MBWI curve intersects.
(c) Step (a) is performed iteratively until the irrigation dates cannot be determined for more pixels.
(d) The irrigation date of the final remaining points is determined as the date when MBWI first reaches the threshold within the peak range of the maximum peak.

Sensitivity analysis of the threshold
Sensitivity analysis of the threshold is performed by taking the calibrated threshold in each subirrigation district as the baseline (T b ) and the range of variation is set from −0.3 to 0.3. Then the AI areas in the sub-irrigation districts are estimated with different thresholds, and the sensitivity of the threshold (S T ) can be calculated from Wen et al. (2022) where S T represents the ratio of the relative change in model output to the relative change in the threshold, A b is the estimated AI area corresponding to T b , ΔA is the variation of the simulated AI area, and ΔT represents the change of the threshold. Generally, a greater |S T | indicate the threshold is more sensitive.

Generating validation points from Sentinel-2 images
AI distribution maps are also verified using validation points extracted from Sentinel-2 . The Sentinel-2 MBWI distribution maps from 2016 to 2020 were generated using the GEE platform to produce reference AI points. MBWI was originally proposed based on the Landsat 8 surface reflectance product, where the zero value represents the default threshold for separating water from non-water bodies (X. Wang et al. 2018). Since the water depth and surface area of AI cropland are shallower and smaller than those of water bodies, it can be assumed that the MBWI threshold of water bodies is greater than or equal to that of AI cropland. Therefore, the threshold of water bodies is selected as the threshold of AI cropland to ensure the accuracy of the validation sample for AI lands. Since the TOA reflectance product of Sentinel-2 is used, the default threshold from Landsat 8 surface reflectance product has to be redefined by fitting the linear relationship between Landsat-based MBWI (LMBWI) and Sentinel-2-based MBWI (SMBWI) with images on the day when data is available from both datasets. AI date of the corresponding pixel is the date when the MBWI first exceeds the threshold to become a water-like body. Further, to comply with the MODIS-derived results, the percentage of irrigated Sentinel-2 pixels to the total number of Sentinel-2 pixels within each MODIS pixel (500 m) was recorded. A pixel with a percentage of AI over 50% is extracted as an irrigated pixel (Hao et al. 2022) and its irrigation date is recorded as the time when more than 50% of the pixel is irrigated. However, the threshold method is no longer applicable for the selection of non-irrigated sample points. The multiple snapshots of landscape from Google Earth, Sentinel-2, and Landsat8 imageries are visually interpreted to supplement the field sampling data.

Calibration and validation of the model
Available studies have shown that the choice of calibration and validation period has a significant impact on the evaluation of model performance and further simulations (e.g. Myers et al. 2021). Instead of arbitrarily choosing calibration and validation periods, we divide the dataset into training and testing sets. The training dataset includes information about the AI area of five sub-irrigation districts from 2010 to 2017. The testing dataset includes recorded AI area data from 2018 to 2020, field sampling points from 2019 to 2020, and validation points extracted from Sentinel-2 from 2016 to 2020. A k-fold cross-validation method is used to determine the appropriate threshold and evaluate the performance of the model by iteratively withholding samples from the training dataset (Y. Chen et al. 2018). Moreover, the testing dataset is used to evaluate the accuracy of the model in greater detail.

Cross-validation in each sub-irrigation district
The thresholds in each sub-irrigation district are calibrated by minimizing the mean absolute relative error (MARE) between simulated and observed annual autumn irrigated areas, which is where ŷ i and y i are the simulated and observed areas at the year i, and n is the total number of years. In order to select the threshold, we used k-fold cross-validation, which involves training the model using k−1 folds of training data and validating the trained model using the remaining data. The data of 8 years acquired from 2010 to 2017 were divided into four folds (k = 4), where data of 6 years is used to train the model, and the remaining 2 years' records are used for model validation. However, the data from 2010 to 2014 was used for calibration and validation in WLT, where data for three years was selected randomly for calibration and the other two years for validation. Consequently, a total of 28 models are obtained for WLBH, JFZ, YJ, and YC, while 10 models are for WLT after the cross-validation. The model with the lowest MARE for each sub-irrigation district was selected to further evaluate the accuracy.

Validation with testing dataset in the whole HID
In the testing dataset, the irrigation extent is validated using statistical AI area data. Further, both the irrigation extent and time are validated using field sampling points and points extracted from Sentinel-2.
Relative error (RE) and overall accuracy (OA) are used to assess the accuracy of AI area and specific locations, respectively, which are expressed as where N m is the number of simulation objects matching the observation object, and N is the number of the observation objects.
OA is also used to assess the accuracy of the determined AI date, where two different OAs are used based on different irrigation periods. For the middle to late period of irrigation (MTLP-based OA), N m in the OA is the number of samples whose observed irrigation dates are within the date intervals of the Mid-to-late irrigation period (points B to D in Figure 4). However, N m in OA based on the whole period of irrigation (WP-based OA) is the number of samples whose observed dates are within the date intervals of the whole irrigation period (points A to D in Figure 4).
WP-based OA is used in the validation of field sampling points because the recorded date may be in any stage of AI for a pixel. For validation points extracted from Sentinel-2, the recorded date is the date on which most of the areas (50% or more) in the pixel have been irrigated. Although the threshold is high, it may also be lower than the value of MBWI when the cropland has just begun to be irrigated. Due to the revisit cycle and cloud cover, Sentinel-2 may not be able to capture dates with maximum MBWI value for many pixels. Therefore, MTLP-based OA is more reasonable for points extracted from Sentinel-2.

Cross-validation results based on statistical autumn irrigation areas
Based on the results of the inter-annual fluctuations of average feature variable (MMAVE-MBWI) for three different reference land-use types, i.e. water bodies, all the croplands and grasslands, and part of croplands and grasslands with higher MMAVE-MBWI, there is a good agreement among the three land-use types from 2010 to 2020 (Figure 15, Appendix A3). Since most of the study area is covered by cropland and grassland , we selected the average MMAVE-MBWI of all the cropland and grassland pixels which is accurate enough to reflect the inter-annual changes of MMAVE-MBWI for the study area.
After MBWI normalization with reference to the average MMAVE-MBWI of all the cropland-grassland pixels, 4-fold cross-validation was used to calibrate and validate the model. It is observed the model performs well when data from different periods was used for calibration. The average MAREs during the calibration period of all models developed in the crossvalidation processes in each sub-irrigation district are smaller and range from 2.7% to 7.6% (Figure 6, left panel). The comparison shows that different calibration periods results in slight differences in the MAREs . The difference between maximum and minimum MAREs in JFZ is only 2%, and the differences in other sub-irrigation districts are about 5%.
However, the ranges of MAREs in the validation period are significantly greater than those in the calibration period (Figure 6, middle panel). These differences in MAREs might be associated with fewer years (only 2 years) for validation. When the calibration and validation periods were considered as a whole, the variations in MAREs in each sub-irrigation district becomes smaller and almost concentrated near the mean or median (Figure 6, right panel). The MARE of JFZ is the smallest with a mean value of 2.8%, while WLBH, YJ, YC, and WLT have mean MAREs close to each other ranging from 6% to 8%. These results suggest that the threshold-dependent classifier was quite robust and produced reasonably good results.
Among the models developed for each subirrigation district, we selected the model with minimum MARE during the whole calibration and validation period as the final model for further validation of AI location and calculation of AI time. The final MBWI T and MBWI Te of five sub-irrigation districts are shown in Table 3. The minimum and maximum MBWI T are observed in WLBH and WLT, respectively. This is mainly because WLBH has more sand while WLT has more clay (Figure 3(d), resulting in different surface water retention capacities.
The results of sensitivity analysis illustrated that the values of S T are −0.22, 0.04, 0.21, −0.13, and −0.37 for WLBH, JFZ, YJ, YC, and WLT, respectively. Since the autumn irrigated area decreases with the threshold, the signs of S T values are opposite to those of MBWI Te (Table 3). In other words, a 100% increase in the absolute values of the threshold results in the decrease of identified AI area by 22%, 4%, 21%, 13%, and 37%. In general, the models for different subirrigation districts have a low to moderate sensitivity to the threshold. Among the irrigation districts, JFZ has the lowest sensitivity, probably due to its higher percentages of AI in all years.

Validation points generated from Sentinel-2 images
The fitting results between SMBWI and LMBWI are SMBWI = 0.88*LMBWI−0.015, R 2 = 0.83, p < 0.05. Therefore, we set the threshold value of SMBWI as −0.015. If the cropland pixel's MBWI value is greater than −0.015 on a given day, the pixel is then identified as an AI pixel. A total of 6483 testing samples (3789 irrigated and 2694 non-irrigated) from 2016 to 2020 were selected from Sentinel-2 images to further verify the accuracy of the model (Table 1). The distribution of validation points is shown in Appendix A4 (Figure 16).

Accuracy assessment of autumn irrigation area and its distribution
The model performed well in the testing years, as illustrated by smaller relative errors between observed and simulated irrigation areas at WLBH in 2018 and JFZ, YJ, and YC from 2018 to 2020 (Table 4). The relative errors are within the acceptable range, among which 52% are in the range of −5.0% to 5.0% and 81% from −10.0% to 10.0%. The MAREs for the testing year (2018-2020) are 2.2%, 7.3%, 6.0%, and 4.5% for WLBH, JFZ, YJ, and YC, respectively. The MAREs of sub-irrigation districts and the whole HID from 2010 to 2020 are between 4% and 7%. Therefore, the threshold-based model developed using data from 2010 to 2017 can be applied to identify AI in other years with satisfactory precision.
Moreover, the overall accuracy for MMAVE-MBWIbased classification model was 90.0% and 97.0% according to the field sampling data and the reference data extracted from Sentinel-2, respectively (Table 5). In summary, the good performance of  model validations with independent datasets supports the robustness and potential of the method for extracting AI attributes.

Accuracy assessment of autumn irrigation timing
The accuracy of estimating irrigation time is given in Table 6. For field sampling points, WP-based OA is higher than 90.0% in both 2019 and 2020, with an average of 91.7%., However, the MTLP-based OA for points extracted from Sentinel-2 varies from 66.5% to 82.9% with an average of 76.4% when 50% of irrigated area is taken as the lower limit of the irrigated fraction to define an irrigated MODIS pixel. WP-based OA has higher values than those from MTLP-based OA because of the longer time interval. Furthermore, the MTLP-based OA from 2017 to 2020 can be significantly higher than in 2016. This may be related not only to the identification results but also to the reference data used for assessment. The pixels in study area have average Sentinel-2 images of 7.4, 12.2, 13.6, 13.5, and 12.7 days during the study period from 2016 to 2020. A significantly smaller number of images in 2016 may cause a loss of effective information, thus failing to provide a more realistic assessment of the model quality. Therefore, we attempted to increase the effective information by increasing the lower limit (80%) for the irrigation fraction of the validated pixels to account for the difference in MTLP-based OA in different years. It can be found that the average MTLPbased OA will rise to 88.9%, with a significant increase in 2016 in particular (Table 6).
Overall, the observed AI time matches well with the irrigation dates simulated by our model, which indicates the method used to calculate irrigation time is reliable. Figure 7 depicts the spatial distributionof AI areas in HID from 2010 to 2020, where the irrigated area depends mainly on the availability of runoff from   the Yellow River that varied by as much as 20% among different years. Over the 11 hydrological years, the irrigated area varied between a maximum of 4758 km 2 in 2011 and a minimum of 4121.5 km 2 in 2019. The mean autumn irrigated area from 2010 to 2020 was 4444.3 km 2 with a standard deviation of 215.25 km 2 , providing insights into the inter-annual variability in the extent of the autumn irrigated area. The proportion of the total AI area in each subirrigation district has remained stable for these years, with average values of 10.54% for WLBH, 26.70% for JFZ, 21.99% for YJ, 29.27% for YC, and 11.48% for WLT. Although there have been slight fluctuations in some years, the irrigated area has shown a downward trend during the past 11 years (Figure 8), with an average decreasing rate of 53.76 km 2 /year in the whole HID. Besides, YC and YJ also have higher decreasing rates of 20.50 and 17.44 km 2 /year, respectively. Based on the statistical data of planting areas acquired from Water Conservancy Development Center of HID and the estimated autumn irrigated areas in Figure 8, JFZ has the highest proportion of AI within the total planting area (83.5%), while WLBH has the lowest proportion (58.1%). Despite the largest cropland area in YC, its proportion of AI (70.8%) is considerably lower than that of JFZ. Overall, the proportion of AI has reached 72.0% for the whole HID.

Spatial-temporal changes in the distribution of irrigated croplands across HID
Superimposing multi-year images of AI area can effectively detect the maximum extent of irrigation area and find the autumn irrigated times for each pixel from 2010 to 2020 (Figure 9 a). Some lands have never been irrigated in autumn, mainly grassland, wasteland, and sunflowers in WLBH and WLT. Highly concentrated AI lands are mainly distributed in the JFZ and YJ, the southern WLBH, eastern YC, and northern WLT. The distribution of AI in the WLBH is mainly concentrated near the main canal.
By aggregating the 500 m irrigation map into a uniform 9 km 2 pixel and performing linear regression on the irrigation area of each aggregated pixel, we further studied the spatial trends in the variations of irrigated area over time on a fine scale (Figure 9(b). From Figures 7 and 9b, we can find that the autumn irrigated areas in the southern YC and the northern YJ have significantly decreased since 2014. The western part of the WLT has hardly been irrigated in autumn since 2016. The northeast corner of the JFZ has also seen a reduction in AI area since 2015. obvious differences in the spatial distribution of AI time among different years. In general, earlier irrigation was found around the boundary between YC and WLT and some areas in the north of HID. Compared with highly variable irrigation time in areas with small and scattered cropland, more synchronized irrigation was found in the southern part of HID (mainly in JFZ and YJ). The identified irrigation time of WLBH varies greatly from year to year.

Spatial-temporal changes in autumn irrigation timing
As an example, Figure 11 shows evolution process of the irrigation area in 2019. A small-scale AI area first appeared in YJ and the east of YC. Before October 15th, fewer areas in the HID have been irrigated, while a wide range of AI has appeared in JFZ, YJ, and YC since October 15th. By November 15th, the whole HID has almost completed AI. The results (cumulative area diagram shown in Appendix A5, Figure 17) show that irrigation was earlier in 2010, 2012, and 2018, while delayed in 2015 and 2020. Figure 12 shows that the start time of AI is delayed gradually from 2010 to 2020. Meanwhile, the proportion of cropland irrigated from mid-September to mid-October showed a downward trend and reached a new low of 6% in 2020. Since 2012, more than 70% of the cropland is irrigated from mid-October to mid-November, and this proportion is still increasing.

Discussion
Our study found that AI in HID showed different spatial distribution patterns each year, with a downward trend in the area, and a delayed starting time in the study period. This is consistent with our    field investigation and previous literature results (L. Wang et al. 2021;Yang, Liu, and Liu 2017).
The distribution pattern of AI is influenced by cropping patterns and soil texture. The cropping patterns differ in different sub-irrigation districts, especially the percentage of sunflower planting. The sunflower is more salt-tolerant and planted later than other major crops, which can be irrigated in the spring before sowing to leach salt and increase soil moisture. Therefore, sunflower has less demand for AI. As a result, JFZ with a lower percentage of sunflower (Wen, Shang, and Rahman 2019) has the highest proportion of AI within the total planting area. On contrary, YC and WLT with higher percentages of sunflower have lower proportions of AI. Furthermore, WLBH has the lowest proportion of AI mainly due to sandy soil texture (Figure 3(d). The sandy soil with poor moisture retention capacity results in poor effect in increasing the root zone moisture, which reduces the farmers' enthusiasm for AI and leads to great variation in the AI time ( Figure 10) dependent on the farmers' decisions.
The decreasing trend in the AI area ( Figure 8) is mainly related to the change in cropping patterns and irrigation water fees. Increased area and extent of sunflower planting in recent years (Yu and Shang 2017) is the main reason for the decreasing trend of AI area. The decreasing rate was greater in YC and YJ and smaller in WLBH, JFZ, and WLT. This is mainly because the southern part of YC and WLT has always been the growing area of sunflowers, but recently sunflowers are expanding toward the middle of HID, i.e. the rest of YJ and YC. Besides, the water fee of AI is higher than that of spring irrigation in HID, which results in the decreased enthusiasm of the farmers for AI and also leads to a decreasing trend in the AI area.
The postponement of AI time is related to not only cropping patterns but also some other factors such as climate change. In recent years, the proportion of late autumn crops tends to increase, such as maize harvested at the end of September, which leads to delayed AI time in these croplands. Meanwhile, climate change characterized by temperature rising and shortened freezing periods leads to the poorer effect of early irrigation (Yang, Liu, and Liu 2017).
The changes in AI area and time significantly impact the hydrological cycle, agriculture, ecosystem, and environment in HID. On the one hand, a decrease in the AI area leads to a drop in groundwater levels. Consequently, road slurry and land collapse caused by frost heave and thaw collapse in mid-spring will become less frequent (Yang, Liu, and Liu 2017). In addition, although the area irrigated by water diversion from the Yellow River is decreasing year by year in WLT, the total AI area is still stable (Figure 8). It indicates that the proportion of AI using other irrigation methods is increased annually in WLT, which has a good effect on saving irrigation water and controlling soil salinity (Mao et al. 2017). On the other hand, the changes in irrigation extent and timing will lead to corresponding changes in salinity and moisture distribution (Li et al. 2012). The multi-year water-salt balance in the HID will be disrupted, requiring artificial adjustment of irrigation strategies.
Our results can be used as a reference for irrigation and agricultural management in the HID. First, it can provide AI information in near-real-time (by the end of AI season) to guide spring irrigation and AI in the coming year. Second, the results provide information on croplands that are not irrigated in autumn, which are useful for determining the main planting areas for sunflowers and guiding the rational adjustment of the cropping patterns to prevent the single planted variety from affecting the sustainable development of local agricultural production (J. Liu et al. 2015).
The present model has been applied successfully in the HID. However, because it is an initial exploration in identifying the extent and timing of AI, the model still has some limitations and needs to be further improved.
To better detect the AI time, MODIS data with the daily temporal resolution is used. However, the moderate spatial resolution of MODIS data may influence the identification precision because of the inevitable problem of mixed pixels (Wardlow and Callahan 2014). The proportion of AI in HID is high, exceeding 70% of the crop planting area. Furthermore, the irrigated land is basically in large blocks, which provides a better opportunity to use MODIS data to identify the irrigated area. Although the model performs well in the entire HID and four of five sub-irrigation districts, its performance in WLBH is not as good as in other districts. This is mainly because the distribution of cropland in WLBH is more fragmented and MODIS cannot capture it well. The selection of remote sensing data with appropriate temporal and spatial resolutions depends mainly on the distribution of cropland in the irrigation district and research purposes (Velpuri et al. 2009). Generally, products whose spatial resolution matches the cropland scale for irrigation application should be used Massari et al. 2021). However, we often need data with high temporal resolution (e.g. daily scale) that may not have the high spatial resolution to match the irrigation scale (Durgun et al. 2020). In such cases, the percentage of cropland in each pixel can be considered in the classification in future studies or an appropriate method for irrigation land fraction estimation in pixel scale can be further developed (W. Liu et al. 2018;Zhang, Chen, and Lu 2015).
Another source of uncertainty may be associated with the threshold, which is also an unavoidable problem of the threshold-based method. The key is how to choose the threshold value. In some previous studies using MODIS data for identifying irrigation, the thresholds came from typical sample plots (L. Wang et al. 2021) that require the matching between sample purity and actual purity of the classification target; otherwise, it may lead to misclassification (Y. Chen et al. 2016). Therefore, we chose the conservative calibration method with statistical data to obtain the threshold. Second, the thresholds are normally not stable over different years. Compared with a fixed threshold, using a flexible Z-score normalized threshold can make the model robust enough to be applied in different years (B. Chen, Jin, and Brown 2019). However, we still need to be cautious in applying the thresholds to changed environmental conditions in the future. Besides, MBWI thresholds need to be adjusted for other irrigation districts with different management practices, irrigation volumes, soil characteristics, and cropping patterns.
Inaccuracies in land-use products can also lead to uncertainties in the identification of irrigated land. Identifying possible cultivated land is the first step in mapping irrigated land (Jin et al. 2016). Usually, the extraction of potential cultivated land depends on the existing land cover maps (Xie et al. 2019). However, the classified maps were generally for country or even global scales, which is seldom independently validated for a specified irrigation district (Gumma et al. 2020). Moreover, the temporal resolution of these products is relatively low, and the updating speed of these products generally cannot keep up with the speed of actual land-use change (Pan et al. 2020). According to the land-use maps for 2010 and 2020 (Figure 3), the distribution of lakes in WLBH has changed significantly from 2010 to 2020. Since lakes are possibly confused with the phenomenon of AI, landuse maps in only two years cannot eliminate the influences of lakes in other years. Therefore, precision land-use maps for the study region on an annual basis are beneficial for high precision in AI mapping.
For accuracy assessment, the mixed pixel problem complicates the evaluation of the final mapping accuracy. A threshold of 50% for AI or NAI area proportion was chosen to extract MODIS validation pixels based on sampling points and Sentinel-2 images with finer scales, which is a common practice in binary classification with mixed pixels (Hao et al. 2022). However, the most appropriate proportions for different regions need to be further explored (Colditz et al. 2018). In future studies, AI classification results considering the percentage of cropland in each pixel are useful to study mapping accuracies at different levels of cropland abundance, which can also provide useful insights to find the most appropriate proportion to identify a MODIS pixel as an AI pixel in the study region.
Moreover, the data gap of Sentinel-2 may affect the evaluation results on irrigation time. Studies have shown that the number of images involved is crucial when detecting objects with spatial-temporal hydrological fluctuations (Borro et al. 2014). The principles we set for determining the irrigation time for samples may not be applicable to all samples due to data gaps. Therefore, the optimal number of images for validation sample selection should also be considered in the future, which may aid in achieving a realistic and more accurate assessment of the model.

Conclusions
Knowing exactly when and where irrigation is applied is essential for effective irrigation water management. Autumn irrigation has been adopted in many regions of the world but is usually less investigated than irrigation during the growing season. A new approach was developed to map AI area and timing distributions using MODIS MBWI time series in the Hetao Irrigation District (HID) of Northwest China. This study addressed three key technical issues, i.e. selection of an appropriate WI for the freezing period, temporal incomparability of thresholds, and difficulty in validating the identified irrigation results, especially irrigation timing. The model is simple but effective and suitable for rapidly updating AI lands in large irrigation districts, especially for irrigated cropland distributed in large blocks. The generated maps from 2010 to 2020 for HID with a total land area of 1.189 million ha and an irrigated land area of 0.733 million ha delivered reasonable performance. The overall accuracy of irrigated area is over 90%, whereas the overall accuracies of irrigated dates assessed with "MTLP" and "WP" are 76.4% and 91.7%, respectively. The proposed mapping methodology and associated datasets provide the first detailed AI extent and timing distribution dataset for HID to date and can help reveal new insights into irrigated land use and water use change.
Based on the mapping results of 11 years of data, although the AI area has increased or decreased in some regions, there is no significant difference in the annual spatial distributions of AI extent compared to the significant difference in the annual spatial distribution of AI time in HID. Moreover, the area of AI tends to decline in the whole HID, and the starting time of AI is gradually delayed. These changes are related to many factors, including climate, irrigation fees, and changes in cropping patterns. The results obtained in this study can provide a reference for the rational formulation of AI scheduling under water-saving irrigation conditions and can provide a basis for reducing secondary salinization and promoting agricultural production in the study area.
x � ¼ x À x min x max À x min (A1) where x and x* are the original and normalized indices, and x min and x max are minimum and maximum values of the x series, respectively. By comparing the time profiles of all sampling points, multi-band water indices are found more sensitive to changes in water content. They usually have a more significant increase after autumn irrigation, exhibiting an improvement than the two-band normalized-difference water indices ( Figure 14). However, the inclusion of more bands also leads to more fluctuations in the multiband water indices.
In addition, different water indices are increased with varying magnitudes in late November (around DOY 325 for the representative irrigated pixel in Figure 14), possibly due to snow and/or ice. Among them, the increase of MBWI is small and usually lower than the increase caused by irrigation, while it even does not increase when other indices have abrupt changes in some cases. This is mainly because the number of absorption bands of ice and snow is similar to that of water, but their maximum absorption is different from that of water (Zeng et al. 1984;Satterwhite et al. 2003). The weights of MBWI in the visible and nearinfrared bands reduce the influence of ice and snow to some extent. In view of this, we finally chose MBWI to extract irrigation information.