Effectiveness of upscaling the vegetation temperature condition index retrieved from Landsat data using an algorithm combining the trend surface analysis and the spatial variation weight method

ABSTRACT Taking the Guanzhong Plain of China as the research area, a method combining the trend surface analysis (TSA) method and the spatial variation weight method (SVWM) was used to upscale the vegetation temperature condition index (VTCI) retrieved from Landsat 8 from a finer spatial resolution to a coarser. The upscaled results were compared with VTCIs retrieved from Aqua MODIS to provide technical support for the comprehensive application of drought monitoring results on two spatial scales. Meanwhile, the upscaled methods were systematically evaluated in a case study according to various indicators, including the spatial distribution characteristics for drought, the statistical characteristics, the semivariogram function (SVF), the structural similarity (SSIM), the correlation coefficient (r), and the root mean square error (RMSE). The results show that the dominant class variability weighted (DCVW) method performed better than did the arithmetic average variability weighted (AAVW) and median pixel variability weighted (MPVW) methods in terms of the statistical characteristics, SSIM, correlation and RMSE, and the upscaling model based on the union of TSA and DCVW had the highest accuracy. The DCVW is a promising tool for upscaling Landsat-VTCI images of the Guanzhong Plain from finer to coarser spatial resolutions because of its efficiency and flexibility.


Introduction
Drought is one of the main natural disasters impacting agricultural production worldwide and leads to major water problems (Bayarjargal et al., 2006;Svoboda et al., 2002). Traditional drought monitoring methods based on meteorological and hydrological data lack regional representation due to the limited number of meteorological stations. Remote sensing technology is flexible and objective and has advantages that include the ability to obtain enormous information on the occurrence, duration and severity of droughts over a wide area; therefore, it is widely used in the field of drought monitoring, early warning and impact assessments. Researchers have attempted to use a variety of parameters, such as the normalized difference vegetation index (NDVI) and the land surface temperature (LST), and their relationships to study drought monitoring and early warnings (Bayarjargal et al., 2006;Kongan, 1995). However, droughts are complex, slow, cumulative events; thus, drought patterns and intensities may go undetected. Wang et al. (2004), 2001 presented a method of monitoring drought by the vegetation temperature condition index (VTCI), which can accurately monitor the onset, duration and intensity of regional drought. VTCI has been widely used for drought monitoring, impact assessments and crop yield estimations by researchers (Mühlbauer et al., 2016;Patel et al., 2012;Peng et al., 2016;Wan et al., 2004).
With developments in imaging satellite technology, new types of sensors that can acquire abundant types of remote sensing data with multiple spatial and temporal resolutions are increasingly available, thus providing sufficient information for the study of surface parameters and the construction of quantitative models. However, remote sensing images provide data with varying spatial resolutions, which results in differences in the reflectivity and inversion of the surface parameters (Raffy, 1994). Moreover, the inversion parameters are of poor comparability due to uncertainties caused by sensor differences, surface heterogeneity, and differences in model mechanisms and algorithms. At the same time, single−resolution remote sensing images do not always accurately invert the size, shape and spatial distribution of geospatial variables because the resolution may not be optimal; therefore, the geospatial variables of a certain scale need to be optimized to accurately map their spatial features, processes and variability and the optimal spatial scale needs to be determined (Hay et al., 1997). Therefore, it is necessary to invert and analyze surface parameters using multitemporal and multispatial resolution remote sensing images. Because of diverse and uncertain interference factors during original data acquisition, research on scaling and fusion methods for different space−time scale data is more valuable in actual production. In actual remote sensing survey research, scale has an important impact. Droughts are complex and changeable phenomena, and a given scale may not be optimal for accurately mapping and clearly visualizing the distribution features, patterns, processes, and variations of droughts. To fully exploit multiscale spatiotemporal information and data, multiresolution satellite−derived parameters or variables are generally transformed to an optimal scale to provide adequate information in applications. However, the process of scale transforming and fusion must obey physical laws, which makes it much more difficult to transform and fuse remote sensing images (Raffy, 1994). At the same time, data processing will significantly affect the scaled result, which will change the statistical features and spatial characteristics and cause some loss of spatial information (Bian & Rachael, 1999). Therefore, it is necessary to develop appropriate scaling methods, which can maintain statistical characteristics or reveal new spatial pattern processes, to accurately identify useful drought information across scales, which will have positive significance for the effective monitoring of regional surface drought conditions.
In recent years, the emergence of scale and scaling has provided the possibility of achieving better application of multiscale spatiotemporal information and data, and considerable progress has been made in that aspect of study and application. In the 1990s, Meentemeyer and Goodchild proposed the concept of scale science (Goodchild & Quattrochi, 1997;Meenrenmeyer & Box, 1987). Friedl et al. (1995) developed a model of the remote sensing process to simulate imagery produced by various instruments, such as the Landsat Thematic Mapper and the Advanced Very High−Resolution Radiometer, and they demonstrated that scaling the NDVI influenced the range, variance, and uncertainty associated with estimates of LAI and the fraction of absorbed photosynthetically active radiation (FPAR) inverted from simulated NDVI data. Braswell et al. (2003) retrieved land cover distributions in two different parts of the Brazilian Amazon region by estimating relationships derived using a Bayesian−regularized artificial neural network (ANN) between land cover fractions derived from 30−m resolution ETM+ and reflectance data from 1−km resolution MODIS and MISR images. L. Wang et al. (2010) used a contextual approach based on mixed pixels and the support vector machine (SVM) algorithm to produce a scaling model of the net primary productivity (NPP) retrieved from fine resolution (TM) to coarse resolution (MODIS) and accomplished the correction of scale effect for NPP retrieved from coarse−resolution MODIS data.
The change characteristics of droughts at different spatiotemporal scales, which can be studied using multisensor remote sensing data with different resolutions, is an important topic in the field of quantitative remote sensing research. The purpose of this article was to develop a new upscaling method that combines the trend surface analysis method (TSA) and the spatial variation weight method (SVWM). This article discusses the methods of upscaling based on the VTCI drought monitoring results retrieved from the Landsat 8 OLI/TIRS (L8) data and the Aqua MODIS data during March to May from 2013 to 2019 as a case of the Guanzhong Plain in Shaanxi against this situation. And the statistical characteristics, the semivariogram function (SVF), the structural similarity (SSIM), the correlation coefficients (r) and the root−mean −square errors (RMSE) were analyzed to validate and evaluate the performance and superiority of the methods. Based on these analyses, the best method of upscaling Landsat−VTCI images from a finer to a coarser spatial resolution is chosen, and it will provide technical support for regional surface drought monitoring based on multisource satellite remote sensing data.

Study area
The Guanzhong Plain is located in Shaanxi Province in Northwest China, and it has an area of 5.55 × 10 4 km 2 (Figure 1). It is a plain basin that is surrounded by mountains, narrowed and closed in the west and open to the east. This area is the most important farming area in Shaanxi Province due to its low relief, fertile soil and natural conditions. Most of the area is covered by winter wheat in spring and early summer. The Guanzhong Plain is located in a warm temperate semihumid monsoon climate zone, and it is sensitive to climatic changes. The area is inland away from the ocean and belongs to the fragile zone of the northwestern ecological environment. The mean annual precipitation is approximately 500-750 mm and shows a high variability, with approximately 60% of the precipitation distributed from June to September and approximately 10% distributed from November to February of the following year. Seasonal droughts cause serious impacts on regional agricultural production, water resources and water quality.

Data source types and processing
Landsat data processing In this paper, we choose Landsat 8 OLI/TIRS (L8) data, which covers the satellite orbit ranks of 126/36, 127/36 and 128/36 during March to May from 2013 to 2019, based on cloud cover (CC ≤ 5%). The spatial resolution of the L8 data is 30 × 30 m, and the pass period is 16 days. The radiometric calibration and atmospheric correction of the L8 images were performed by use of ENVI software to reduce distortions based on the latest calibration coefficients of the OLI/ TIRS spectral bands and the FLAASH module. The NDVI and LST of the study area were calculated based on the above data.
The NDVI values were calculated based on reflectance in the near−infrared and red bands of the L8 data.
The LST is retrieved by the brightness temperature of band 10 of the L8 data with the mono−window algorithm proposed by Qin as follows Qin, Zhang et al., 2001): The brightness temperature can be calculated as follows: where L 10 is the radiance derived from the thermal infrared band 10 of the L8; and the values ofK 1 and K 2 are obtained from the header file of TIRS band 10, with K 1 ¼ 774:89W � m À 2 � sr À 1 � μm À 1 and K 2 ¼ 1321:08K. The basic concept of the mono−window algorithm is to retrieve LST from the thermal infrared band of Landsat data based on the thermal radiance transfer equation : where a and b are coefficients, with values −67.35535 and 0.458608, respectively; C and D are intermediate variables; T a is the atmospheric mean temperature; ε is the ground emissivity; τ is the atmospheric transmittance. So the calculation problem of LST can now be converted to a determine process of the parameters of T a , ε and τ. T a depends mostly on the in situ distribution of atmospheric temperature and atmospheric conditions at each layer of the profile, T a is a linear function of the near−surface air temperature under the standard atmospheric distributions (clear sky and without great turbulence). Combined with the data of the atmospheric temperature at the ground (T 0 ) obtained from local meteorological data (the temperature data is observed by the local meteorological stations at the time of satellite passing), T a was estimated according to the linear function of T a and T 0 by Qin (Table 1) Qin et al., 2003). Because of the research area is located in 33°N ~ 36°N (Table 2), the effective atmospheric mean temperature in mid −latitude summer was calculated as T a . Researches show that the change in atmospheric transmittance mainly depends on the dynamic change of atmospheric water content, other factors, such as air pressure, air temperature, aerosol content, O 3 , CO 2 , CO, NH 4 , etc., have little effect on the change of atmospheric transmittance due to their small dynamic changes. Due to its extreme importance in influencing the variation of atmospheric transmittance, water vapour content has been extensively used as the determinant in estimation of atmospheric transmittance (Franca & Cracknell, 1994;Qin et al., , 2003Sobrino et al., 1991). It is difficult to retrieve atmospheric water vapour content from Landsat data. Generally, the ratio between the band 2 and the band 19 of the MODIS L1B Calibrated Radiances products with the same date as Landsat data were used to calculate the atmospheric water vapour content. This method has advantage in partially eliminating the influence of the surface reflectance with wavelength on the atmospheric transmittance and improving the accuracy of the atmospheric water vapour content (Gao et al., 2007;Hu et al., 2015): where ω is the atmospheric water vapour content (g/ cm 2 ); α = 0.02, β = 0.6321; ρ 19 and ρ 2 are the apparent reflectance of band 2 and band 19, respectively. The transmittance decreases steadily with water vapour content increase, and the change of transmittance with water vapour is close to linearity for a small segment. So some simple linear equations established by Qin were adopted to estimate transmittance from water content for Landsat data (Table 2) (Table 3) Qin et al., 2003). The ground emissivity ε can be estimated by NDVI. Sobrino et al. found that the ground emissivity is highly correlated to the vegetation index, and proposed obtains the emissivity values from the NDVI considering different cases (Sobrino et al., 2004): (a) NDVI<0.2, in this case, the pixel is considered as bare soil and the emissivity is obtained from reflectivity values in the red region; (b) NDVI>0.5, pixels with NDVI values higher than 0.5 are considered as fully vegetated, and then a constant value for the emissivity is assumed, typically of 0.99; (c) 0:2 � NDVI � 0:5, in this case, the pixel is composed by a mixture of bare soil and vegetation, and the emissivity is calculated as follows: where ε v and ε s are the vegetation emissivity and the soil emissivity, and value for 0.99 and 0.97, respectively. For plain surfaces, the term dε is negligible, P v is the vegetation proportion obtained according to: The best way to validate the LST accuracy is to compare the in situ ground truth measurements of LST with the retrieved ones with the algorithm from the Landsat data. Since it is extremely difficult to obtain the in situ ground truth measurements comparable to the pixel size of Landsat data at the satellite pass, Qin et al. used the simulated data generated by atmospheric simulation programs LOWTRAN 7.0 to replace the real−time data. That is, the thermal  radiance reaching the remote sensor at the satellite level were simulated to calculate the LST according to certain ground conditions and atmospheric conditions. Comparison of the assumed LST used for the simulation with the retrieved one from the simulated total radiance enables to examine the accuracy of the LST and the algorithm. The verification results of six standard atmospheric conditions show that the algorithm is able to provide a quite accurate estimate of LST in most cases. The LST difference between the assumed and the retrieved ones are less than 0.4°C in most cases . More details about the algorithm for LST and its sensitivity can be found in the work of Qin et al. Qin, Zhang et al., 2001).

MODIS data processing
Multitemporal Aqua MODIS data, which include the daily surface reflectance products (MYD09GA) and the daily LST products (MYD11A1), were selected from March to May from 2013 to 2019. The Aqua MODIS satellite views the entire Earth's surface every 1 to 2 days and provides daily reflectance and LST products at a resolution of 926.62 m × 926.62 m. The projection transition, resampling, cropping and format conversion of the MODIS data were performed by the MODIS Reprojection Tool (MRT), which was developed by the National Aeronautics and Space Administration (NASA). Based on the daily NDVI and LST products, the 10−day maximum MODIS−NDVI and MODIS−LST products for each year at 10−day intervals were generated by using the maximum value compositing (MVC) approach (Helben, 1986).

Determining the vegetation temperature condition index
The VTCI follows the form that was defined in Sun (Sun et al., 2008): where NDVI i is the NDVI value in period i; LST max NDVI i ð Þ and LST min NDVI i ð Þ are the maximum and minimum LST values of the pixels that have the same NDVI i value within a study region, respectively; LST NDVI i ð Þ denotes the LST value of one pixel for which the NDVI value is NDVI i ; and a, b, a 0 and b 0 are undetermined coefficients that estimated from the scatter plots of the LST and NDVI by using different composite NDVI and LST products. The VTCI is related to NDVI changes as well as LST changes of pixels with a specific NDVI value. (10) is the difference between the maximum LST of the pixels and LST of one pixel, and LST max NDVI i ð ÞÀ LST min NDVI i ð Þ in equation (10) is the difference between the maximum LST and the minimum LST. LST max NDVI i ð Þ and LST min NDVI i ð Þ are regarded as the warm edge where there is less soil moisture availability and plants are under dry conditions and the cold edge where there is no water restriction for plant growth, respectively (Sun et al., 2008). Therefore, determining the warm edge and cold edge are the most remarkable problems associated with using the VTCI. The warm edge of the MODIS−VTCI is determined by using the multiyear maximum value composite LST and NDVI products, and the cold edge of MODIS data is determined by using the multiyear maximumminimum value composite LST products and the multiyear maximum value composite NDVI products. The VTCIs of L8 are calculated from the NDVI and LST and refer to (3). The warm edge and the cold edge of the L8 data are determined by using minimum and maximum LST and NDVI of one−day L8 image. The range of VTCI values is from 0 to 1. The lower the value of VTCI, the more severe the drought (Sun et al., 2008).

Coordinate transformation
Remote sensing images acquired by different sensors have different spatiotemporal resolutions and projection information, and it is necessary to establish a more accurate spatial registration relationship between images for scaling processing. Therefore, spatial coordinate transformation is required between MODIS−VTCI images (Lambert projection) and Landsat−VTCI (UTM) images. First, we need to determine the plane coordinate of a pixel of the MODIS −VTCI image and convert the plane coordinates of the pixel to geographic coordinates by using the Lambert reverse calculation algorithm. Then, the geographic coordinates are converted to plane coordinates in the UTM projection to realize the coordinate conversion of two images (Xu, 2011;Zhang, 2013).

Data source types and processing
In this study, we combined the TSA method with the SVWM to upscale L8 images from a finer resolution of 30 × 30 m to a coarser spatial resolution of 930 × 930 m.

Trend surface analysis method
The TSA method is a multivariate statistical analysis method that fits the spatial distribution and variability of geographic drought variables and provides additional information by constructing surfaces that are usually decomposed into trend surfaces and residual surfaces (Eppler & Full, 1992;Wang & Zuo, 2015). Based on a previous study (Bai et al., 2017), we select the quadratic polynomial trend surface to produce the trend surface. The functions of the quadratic polynomials trend surface are as follows: where VTCI i x i ; y i ð Þ, VTCI Ti x i ; y i ð Þ and e x i ; y i ð Þ are the retrieved VTCI values from the L8 data and the trend and residual VTCI values of variable VTCI i at location x i ; y i ð Þ, respectively; i is the number of selected data; and c 0 to c 5 are the polynomial coefficients.
After removing the coefficients of each of the polynomials (10), the remaining part can be considered as an independent variable x. Let Then, the problem of obtaining the trend surface equation is transformed into the problem of obtaining a solution.
The trend surface equation problem is then transformed into a multiple linear regression analysis problem of VTCI T and x 1 , x 2 , x 3 , x 4 , and x 5 as follows: To obtain the best fit for the actual spatial distribution of drought information, according to the least square theory (LS), the sum of the squares between the measured value VTCI i x i ; y i ð Þ and the trend value VTCI Ti x i ; y i ð Þ should be minimized to achieve a best fit as follows: The following formula was applied to calculate the partial derivative of Q for c 0 to c 5 , and this value was set to 0 to obtain the normal equation: This equation can be resolved via the following equation: To solve the regular equations with the common elimination method, the coefficients from c 0 to c 5 were estimated by using the least−squares method, and c 0 to c 5 were then substituted into the fitting model to obtain the quadratic polynomial trend surface analysis equations. The trend surfaces of each local window could be obtained with these equations.

The spatial variation weight method
The spatial and temporal variations of drought information are large. It is necessary to comprehensively consider the heterogeneity of the land surface and the spatial variability of drought information to detect interactions in land surface patterns and reduce drought information loss when aggregating drought information. Moreover, a strong spatial autocorrelation is exhibited because the reflectivity information of each pixel in a remote sensing image is affected by the surrounding pixels. Surface heterogeneity, spatial autocorrelation, spatial distributions of drought and random factors will affect spatial upscaling results. Wang et al. (G. Wang et al., 2001) proposed that the key to obtaining accurate upscaled results is to obtain the dominant spatial features and spatial variability of remote sensing images. Therefore, it is necessary to accurately map the drought information of each pixel in a study area. In actual research, spatial autocorrelation and spatial variability can be measured by the variance between the VTCI values of the dominant spatial feature and the VTCI values of each pixel of a remote sensing image in the study area, and the reciprocal of the variance is usually taken as the weight. The farther a VTCI pixel value is from the VTCI value of the dominant spatial features, the larger the variance is and the smaller the weight is. Different measurement techniques for the VTCI values of the dominant spatial characteristics result in different upscaling methods. In this study, the SVWM, including the dominant class variability weighted (DCVW) method, the median pixel variability weighted (MPVW) method and the arithmetic average variability weighted (AAVW) method were used to obtain the VTCI values of the dominant spatial characteristics of the trend surface of the local 31 × 31 window. Based on the VTCI values of the dominant spatial characteristics, we aggregated the VTCIs of the trend surface to upscale the Landsat−VTCI images of the Guanzhong Plain. The DCVW method determines the dominant spatial characteristics value based on the VTCI value frequency distribution of the trend surface of the local 31 × 31 window and takes the VTCI value with the highest frequency as the VTCI value of the dominant spatial characteristics of the local window. The variance between the VTCI value of the dominant spatial characteristics and the VTCI value of each pixel of the trend surface of the local window was calculated. To add the reciprocal of the variance, the VTCI weight is equal to the result of dividing the reciprocal by the resultant summed value. Each VTCI is then multiplied by its weight in the trend surface of the local 31 × 31 window and added together. The weighted average of the trend surface of the local 31 × 31 window is finally obtained according to the above steps. The formula is given as follows: where w i is the weight; m is the pixel number of the trend surface of the local 31 × 31 window; VTCI Ti is the value of a certain VTCI of the trend surface of the local 31 × 31 window; and VTCI d T is the VTCI value of the dominant spatial characteristics of the trend surface of the local 31 × 31 window.
In the MPVW method, it is assumed that the intermediate VTCI in the trend surface of the local 31 × 31 window has the highest probability of having the dominant spatial feature and the spatial variation and autocorrelation can be quantified by the variance between the intermediate VTCI and the VTCI of each pixel in the trend surface. The VTCI value of all pixels in the trend surface of the local 31 × 31 window is sorted, and when the VTCI value of some pixels are equal, only one of them participates in the sorting. If the total number of VTCI values participating in the sorting in the trend surface is odd, then the intermediate VTCI value is taken as the VTCI value of the dominant spatial feature of the trend surface. If it is even, then the mean of the two intermediate VTCIs is taken as the VTCI value of the dominant spatial feature as follows: In the AAVW method, the spatial variation and autocorrelation are quantified by the variance between the arithmetic mean value of the VTCIs and the VTCI value of each pixel in the trend surface. The arithmetic mean value of the VTCIs in the trend surface of the local 31 × 31 window is used as the VTCI value of the dominant spatial characteristics: The operating equations and processes of the DCVW, MPVW and AAVW methods are consistent except for the determination method of the dominant spatial characteristics value. In the actual operation process, only the VTCI value of the dominant spatial characteristics of the trend surface of the local 31 × 31 window needs to be replaced.

Concrete means and steps
First, we selected the corresponding periods of the Landsat−VTCI images and MODIS−VTCI images of the study area. As noted above, to register Landsat −VTCI plane coordinates, we chose a center pixel from a geo−referenced Landsat−VTCI image and extended 15 pixels above, below, to the left and right to form a local window of 31 × 31 pixels. Then, the TSA was used to build a polynomial model to produce the trend surface of the local window. The DCVW, AAVW and MPVW were used to upscale the Landsat−VTCIs by analyzing the spatial data of the trend surface and to extract Landsat−VTCIs over each block of the 31 × 31 pixels, and these VTCIs were then aggregated into one pixel that has drought information in the initial pixels. All the windows included the study area and were adjacent to each other without overlap. With aggregation, the Landsat−VTCIs were upscaled from a resolution of 30 × 30 m to a resolution of 930 × 930 m.

Evaluating the upscaling methods
Different upscaling methods exhibit different amounts of information loss and variation in the upscaled units (Raffy, 1994;G. Wang et al., 2001). Evaluation indicators, such as the image information entropy, average gradient (AG), SVF, SSIM, r and RMSE, were calculated and analyzed to quantitatively compare and evaluate different upscaling methods. The image information entropy can reflect the richness and complexity of the image information, and its value reflects the amount of information contained in the image. Considering the VTCI image as a sample whose gray values ([0,255]) are independent of each other, the grayscale distribution of the image is P ¼ P 0 ; P 1 ; . . . ; P j ; . . . ; P LÀ 1 � � . The unary gray entropy of an image is defined as follows: where L is the total gray level and p j is the proportion of the number of pixels with a gray value of j. The AG reflects the change in the contrast of small details in the multidimensional direction of the image, that is, the change in image density. AG is the average of the gradient values of all pixels in the image in the X and Y directions to characterize the relative clarity of the image. For VTCI images ([0, 255]), the greater the gray level change rate, the larger the corresponding gradient change (Huang et al., 2013). Therefore, the AG can be used to reflect differences in small details and texture transformation features in the image, and it is defined as follows: ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi @VTCI The SVF is a comprehensive index that is used to measure spatial dependence and heterogeneity (Cohen et al., 1990), and it is calculated as follows: where N h ð Þ is the number of VTCI pairs and VTCI i and VTCI iþh are the VTCI values at two pixels VTCI i x i ; y i ð Þ andVTCI iþh x i ; y i ð Þ, respectively, which are separated by a lag h.
The image structure contains the main information of the image, and SSIM can quantitatively interpret the structural information. From the perspective of image composition, the SSIM function has three parts: luminance, contrast and structure (Z. Wang et al., 2004): where L and M are the upscaled Landsat−VTCI image and the MODIS−VTCI image, respectively; l L; M ð Þ, c L; M ð Þ and s L; M ð Þ are the luminance comparison function, the contrast comparison function and the structure comparison function, respectively; α, β and γ are parameters that are used to adjust the relative importance of the three components and were originally set to μ X ; μ X and μ Y are the mean VTCI values of two images; δ X and δ Y are the standard deviations of the VTCI values of two images; δ XY is the covariance of the VTCI values of two images; and C 1 and C 2 are constants that are used to avoid instability.
The RMSE is often used to measure the deviation between the assessment data of an upscaled Landsat −VTCI image and the reference data of a MODIS −VTCI image: RMSE ¼ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi The upscaling ability was evaluated by calculating the image information entropy, AG and SVF. In addition, we also assessed the effectiveness of different upscaling methods by comparing the SSIM, r and RMSE values between the upscaled Landsat−VTCIs and the MODIS −VTCIs. The greater the values of SSIM and r, the smaller the values of RMSE and the better the upscaling.

Results
First, we performed a TSA to build a trend surface of Landsat−VTCIs of the local 31 × 31 window. The DCVW, MPVW and AAVW methods were then used to build the upscaling model by analyzing the spatial distribution characteristics and variations of the trend surface. Meanwhile, the quantitative drought monitoring results of the MODIS−VTCI images were used as the "real" drought monitoring results, and the statistical characteristics, SVF, drought distribution and texture characteristics, SSIM, r and RMSE between the upscaled Landsat−VTCI images (L−VTCI) and the MODIS−VTCI images (M−VTCI) were compared and analyzed to evaluate the upscaling method.

Analysis of the statistical characteristics
The process of upscaling will cause information loss. According to the statistical information of image information entropy in Table 4, the image information entropy of the original Landsat−VTCI image is reduced after it is upscaled to a lower spatial resolution image, and the range of the difference is 0. These results show that the variation ranges of information entropy are related to surface heterogeneity. The central part of Guanzhong Plain has various types of land use and land cover and the gray value of the image is scattered due to the greater heterogeneity of land surface, which leads to greater changes during the upscaling process. The information entropy of upscaled Landsat−VTCI images was significantly larger than that of the corresponding MODIS−VTCI images, and the difference range of the information entropy between them was 0.0065-1.0144, indicating that the upscaled images obtained by Landsat−VTCI with higher spatial resolution contain more information, and the drought information reflected by the upscaled Landsat−VTCI image is richer and more complex. Therefore, the upscaled Landsat−VTCI images can obtain more drought information characteristics and different upscaling methods result in different levels of information loss. AG reflects the texture difference of the image. Table 4 shows that the AG value of upscaled Landsat −VTCI images increases as the spatial resolution gets larger and the difference range of the AG before and after upscaling is 1. 5770-13.4044. This finding shows that the gradient of the Landsat−VTCI image presents a large change from the higher spatial resolution to the lower spatial resolution, which results in a loss of some details in the Landsat−VTCI, such as texture and edge data. The AG of the upscaled Landsat−VTCI image is higher than that of the corresponding MODIS−VTCI images, and the difference range is 0.0348-12.7356. This finding illustrates that the texture variation in drought information in the upscaled aggregated by Landsat−VTCI images with higher spatial resolution was larger than that of the MODIS−VTCI images in the corresponding time period; moreover, the texture feature of the upscaled Landsat−VTCI images was clearer compared to that of the MODIS−VTCI images. Therefore, studying image scaling and developing appropriate upscaling method are significant for both theory and practice. Table 4 shows us that the image information entropy and AG values of the upscaled images aggregated using the DCVW method were larger than that using the AAVW and MPVW, indicating that the images upscaled by DCVW contain more drought information and texture features than that by AAVW and MPVW, which means that the more information is lost by the AAVW and MPVW methods during the process of upscaling.

Analysis of the semivariogram function
The SVF value can indicate the spatial autocorrelation and variability of an image, and it was calculated for the upscaled images and the MODIS−VTCI images (Table 4). The results show that all of the images upscaled using the DCVW, AAVW and MPVW methods had larger SVF values than the MODIS−VTCI images. When the original Landsat −VTCI images of the study area were upscaled to a smaller number of VTCI units over the same spatial extent, the dominant spatial characteristics of the drought information of the images were smoothed. The SVF values obtained using the AAVW and DCVW methods were relatively smaller, indicating that the AAVW and DCVW methods produced relatively stronger smoothing towards the dominant spatial variability in the process of resampling the Landsat−VTCI images from a finer spatial resolution to a coarser spatial resolution.

Analysis of the spatial distribution characteristics and texture feature
The spatial distribution characteristics and variations of drought of the upscaled Landsat−VTCI images and the MODIS−VTCI images of the study area were visually compared. The images upscaled using the DCVW, AAVW and MPVW methods were darker in the eastern Guanzhong Plain, which is similar to the corresponding regions of the MODIS−VTCI images. The upscaled Landsat−VTCI images were lighter in the western Guanzhong Plain, which is similar to the MODIS−VTCI images. Overall, the spatial distribution and texture characteristics of the upscaled images were similar to those of the MODIS −VTCI images. These findings show that the upscaled Landsat−VTCI images at the coarser resolution and the corresponding MODIS−VTCI images had good consistency and indicate that the quantitative drought monitoring results of the MODIS−VTCI images can be used to evaluate the upscaled results. Figure 2 shows that the spatial distribution, image brightness level and texture features of the upscaled images and the MODIS−VTCI images were consistent. Moreover, the images upscaled using the DCVW, AAVW and MPVW methods were basically consistent with the MODIS−VTCI images in terms of texture feature gradient, and both the Weihe River in the middle study area and the Yellow River in the eastern study area could be easily identified in the upscaled images.
Considering the differences in recognizing the brightness, contrast and structural features of the Landsat−VTCI images in the upscaling process by different upscaling methods, the SSIM values between the MODIS−VTCI images and the images upscaled by the DCVW, AAVW and MPVW methods were further compared ( Table 5). The SSIM values between the MODIS−VTCI images and the images upscaled using the DCVW, AAVW and MPVW methods were larger, demonstrating that the brightness characteristics, contrast characteristics and structural features of the images upscaled using the DCVW, AAVW and MPVW methods were all similar to those of the MODIS−VTCI images. In other words, the upscaled images and the MODIS−VTCI images have a consistent image structure. Most of the SSIM values from the DCVW method were larger than those from the AAVW and MPVW methods. This difference implies that the images upscaled by the DCVW method better matched the MODIS−VTCI images with respect to the dominant features of luminance, contrast and structure compared to the images obtained with the AAVW and MPVW methods. Thus, the DCVW method has a stronger capacity to characterize image structure than the AAVW or MPVW methods in the upscaling process.

Analysis of the consistency
Correlations between the upscaled images and the MODIS−VTCI images were analyzed ( Table 5). All of the correlation coefficient (r) values were larger and in the range of 0.4370-0.6945, which means that all the Landsat−VTCI images upscaled using the DCVW, AAVW and MPVW methods had better correlations with the MODIS−VTCI images. The r values obtained from the three methods were close to each other, and the DCVW method consistently produced greater r values compared with the AAVW and MPVW methods. This result indicates that the improvements from upscaling the Landsat−VTCI images using the DCVW method were more obvious and significant than those based on using the AAVW and MPVW methods. Therefore, the DCVW method is more suitable for upscaling Landsat−VTCI images from finer to coarser spatial resolutions.
The RMSE values between the corresponding MODIS−VTCI images and the upscaled images using the DCVW, AAVW and MPVW methods were further quantitatively compared to analyze the systematic error between the Landsat−VTCIs and the MODIS−VTCIs and to evaluate the upscaling effects (Table 5). The RMSE values between the MODIS−VTCIs and the upscaled Landsat−VTCIs ranged from 0.0858 to 0.2137, and the values were smaller except for the value on 18 May 2014. The results show that the systematic error between the Landsat−VTCIs upscaled by the three methods and the corresponding MODIS −VTCIs was small and the upscaled Landsat−VTCI images and the MODIS−VTCI images showed similar degrees of drought. The RMSE values for the DCVW method were slightly lower than those for the AAVW and MPVW methods, which implies that the images upscaled using the DCVW method were closer to the quantitative drought monitoring results of the MODIS −VTCI images. The DCVW method was more suitable for upscaling Landsat−VTCI images of the Guanzhong Plain from finer to coarser spatial resolutions.

Discussion
The innovation of this research lies in the application of the TSA to construct the trend surface. The trend surface is constructed by a mathematical polynomial model to simulate the spatial distribution of drought and its regional variation, and it considers the influence of the high spatial variation and random factors in natural scenes. The spatial analysis problem was solved by a mathematical statistical method, and the spatial analysis was quantified and modeled, which has a potential promotional effect on aspects close to the actual drought surface. Conversely, the upscaling models that use the DCVW, AAVW and MPVW methods were built upon the foundation of the trend surface, and the trend surface can better characterize the dominant distribution of features, variability, patterns and processes of Landsat−VTCI images. Therefore, combining the TSA with the DCVW, AAVW and MPVW methods is more applicable to upscaling Landsat drought monitoring results.
The formation of drought is a complicated process. The VTCIs in the Guanzhong Plain vary in terms of structural factors and show regional differences. At the same time, spatial heterogeneity is one of the main causes of errors in the upscaled results. The MPVW method is the worst because the median value of the local window was used as the dominant class value in the process of upscaling. While the variation in drought during the local window may be large, the VTCI values may present a skewed distribution and the MPVW method will  yield a biased estimate because the median value of the local window is used as a dominant class value. The DCVW method uses the VTCIs of the dominant type of pixels as the dominant class value for upscaling. By calculating the variance between the VTCI of each pixel and the VTCI of the dominant type of pixel in the local window to measure the local spatial autocorrelation, the surface spatial pattern can be effectively detected to determine the loss of drought information. Thus, the upscaled model combining the TSA with the DCVW method better aggregates Landsat−VTCIs of the Guanzhong Plain from finer to coarser spatial resolutions. In this work, the LST calculation, as well as the coordinate transformation, is introduced from literatures. The LST calculation is vital for the VTCI, while the coordinate transformation has benefits to improve the accuracy when building the spatial position relationship between two images. More work about both of the LST calculation and the coordinate transformation is needed to in the future.

Conclusions
In this study, we evaluated and compared images upscaled by combining the TSA with the AAVW, MPVW and DCVW methods in a case study. The images upscaled using the DCVW, AAVW and MPVW methods generally matched the MODIS −VTCI images with respect to the spatial distribution characteristics and texture features of droughts. Different upscaling methods result in different levels of information loss, with the DCVW method losing less information in the process of upscaling relative to the AAVW and MPVW methods. The SSIM analysis results imply that the DCVW method is better at upscaling spatial drought variables than the AAVW and MPVW methods because it has a stronger capacity to characterize the image structure of the original Landsat−VTCI images. The DCVW method also performed better than the AAVW and MPVW methods in terms of the correlation analysis between the upscaled Landsat−VTCIs and the MODIS−VTCIs. Moreover, the RMSE values using the DCVW method were all slightly lower than those using the AAVW and MPVW methods. Systemic errors occurred between the quantitative drought monitoring results of the MODIS−VTCI images and the relative results of the Landsat−VTCI images, and the upscaled results with the DCVW method were closer to the quantitative drought monitoring results. In conclusion, the upscaling model based on combining TSA and the DCVW method has the highest accuracy. The method discussed in the paper is a promising tool for upscaling Landsat−VTCI images of the Guanzhong Plain from finer to coarser spatial resolutions.

Disclosure statement
No potential conflicts of interest were reported by the authors.

Funding
This work was supported by the National Natural Science Foundation of China under Grant No. 41371390.