Analysing the spatial context of the altimetric error pattern of a digital elevation model using multiscale geographically weighted regression

ABSTRACT Many freely available Digital Elevation Models (DEM) have increasingly been used worldwide due to the difficulty in acquiring accurate elevation data in some regions, emphasizing the need to investigate their accuracy and the factors that may influence their uncertainties. We performed an accuracy analysis of the Topodata DEM in the hydrographic region of Uruguay (Brazil) assuming that its vertical accuracy may be related to terrain characteristics. Multiscale Geographically Weighted Regression (MGWR) was applied to investigate the spatial scales over which terrain characteristics affect local variations in altimetric errors. MGWR outperformed Ordinary Least Squares (OLS) and Geographically Weighted Regression (GWR). MGWR results also showed that aspect, curvature, and artificial areas operate at much smaller scales than elevation and have a higher influence in areas with high positive altimetric errors. The model explains about 41% of the total variation of the altimetric error of the Topodata DEM in the study area. Our findings enrich the understanding of the global and local processes affecting the accuracy of the Topodata DEM and shed light on the importance of local terrain characteristics in effective DEM product development. HIGHLIGHTS DEM products provide fundamental information for several research areas. OLS, GWR and MGWR were applied to identify the factors explaining the altimetric error of a DEM. MGWR investigated the spatial scales over which terrain characteristics affect local variations in altimetric errors. MGWR outperformed OLS and GWR proving that terrain characteristics operate at different scales.

Although traditional methods to generate DEM are expensive and time-consuming (Uysal et al., 2015), Earth's surface analysis is becoming increasingly viable due to the rising availability of DEM products at different spatial resolutions (Drăguţ & Eisank, 2011).Moreover, due to the lack of more accurate three-dimensional data for geographic studies at a regional scale, freely available DEM are frequently used in several scientific applications which evidences the need to investigate the accuracy of DEM products (Liu et al., 2020).Approaches regarding the assessment of DEM quality are usually based on altimetric discrepancies between the DEM and the reference data, which must present reasonable density and be well-distributed in the study area (Polidori & El Hage, 2020).In addition, it is essential that these reference data have greater accuracy than the evaluated DEM to ensure a relevant statistical analysis as well as a spatial error analysis (Polidori & El Hage, 2020).
In recent decades, many technological advances in the creation and in the process of making available DEM products have been made, despite there are not enough specific guidelines yet regarding the assessment of the accuracy nor a perspective of suitability for the use of these products (Mesa-Mingorance & Ariza-López, 2020).In this sense, scale effects modelling has been considered an important research topic since there are not many studies dealing with the potential influence that such effects may have on some factors' modelling related to hydrology, soil science and geomorphology (Chang et al., 2019;Drăguţ & Eisank, 2011).Moreover, mathematical modelling of the predicted DEM error as a function of the landscape characteristics is also a promising research path (Polidori & El Hage, 2020).
Many studies have addressed the vertical accuracy of DEM (Hawker et al., 2019;Hirano et al., 2003;Hladik & Alber, 2012;Wessel et al., 2018;Weydahl et al., 2007) through external and internal quality assessment, i.e. with or without reference data, respectively (Polidori & El Hage, 2020).The vertical accuracy is often assessed through the comparison of the elevation of the DEM product with a reference elevation collected from a greater accuracy source where parameters such as mean, standard deviation and root mean square error (RMSE) are obtained to analyse the elevation discrepancies (Mesa-Mingorance & Ariza-López, 2020; Wise, 2000).
The most common parameter to assess the vertical accuracy of a DEM is the RMSE (Mesa-Mingorance & Ariza-López, 2020), but it is also possible to use information entropy (Wise, 2012) as well as geomorphometric analyses to measure the quality of a DEM (Pipaud et al., 2015;Szypuła, 2019;Temme et al., 2009).Furthermore, some studies have addressed spatial statistics techniques such as Ordinary Least Squares (OLS) and Geographically Weighted Regression (GWR) to model errors in DEM (Carlisle, 2005;DeWitt et al., 2015;Erdoğan, 2010;Gallay et al., 2010).
OLS is a technique used to estimate the parameters of multiple linear regression models, which is based on the principle of minimizing the sum of squared differences between the observed dependent variable values and the predicted values (Brunsdon et al., 1996;Hutcheson & Hutcheson, 2011).OLS assumes that the relationship between variables is constant across the entire study area, disregarding any spatial variations.On the other hand, GWR (Brunsdon et al., 1996;Fotheringham et al., 2002) and MGWR -Multiscale Geographically Weighted Regression (Fotheringham et al., 2017) are both spatial regression techniques that explicitly account for spatial heterogeneity by estimating relationships locally, allowing for spatially varying coefficients.GWR operates at a local scale, where the relationships between variables are estimated for each individual location within the study area, while MGWR simultaneously estimates the relationships at different scales.This allows MGWR to capture variations in relationships at local, regional, and global scales.Thereby, MGWR can model the relationship between the dependent and the independent variables considering the geographic scale at which processes occur, allowing it to differentiate spatial homogeneous and heterogeneous relationships that may influence the dependent variable at different locations (Fotheringham et al., 2017).Hence, it is possible to identify if an explanatory factor of the altimetric error operates locally, regionally, or globally in the study area.However, there is still no evidence of studies that have modelled the pattern of the altimetric error through MGWR.
Our study performs a vertical accuracy analysis using MGWR, aiming to identify the local factors that may explain the altimetric error of the Topodata DEM in the hydrographic region of Uruguay (Brazil), accounting for possible different spatial scales in the relationship between such local factors and the altimetric error.Accordingly, this research study aims to investigate not only if the vertical accuracy of DEM products is related to local terrain characteristics but also if there are spatial variations in the magnitude of those relationships.Additionally, this study aims to determine if the relationships between the terrain characteristics and the altimetric error operate at different spatial scales.Besides MGWR, two additional models were used for comparison purposes, namely OLS and GWR.
Our findings are expected to provide a better understanding of the global and local processes influencing the quality of Topodata products and highlight the importance of terrain characteristics in effective DEM product development, besides shedding light on some limitations of regression modelling applications.Providing further understanding of the features influencing DEM vertical accuracy may also contribute to improving the applications that rely on the altimetric data extracted from DEM.

Study area
The Uruguay River watershed is approximately 385,000 km2 being 174,412 km 2 of this area placed in the southern part of Brazil, covering 2% of the national territory, which is named in the hydrographic region of Uruguay (Brazil, 2006).The study area's altimetry ranges from 32 to 1822 meters above sea level (Figure 1).

Data and preprocessing
SRTMGL1v003 DEM and Topodata DEM Shuttle Radar Topography Mission (SRTM) was released in 2000 onboard the Space Shuttle Endeavour aiming to generate a near-global DEM of the Earth through radar interferometry (NASA, 2013).Initially, SRTM data was made available with a resolution of 1 arc-second for the United States territory and with 3 arc-seconds for other regions of the world.In 2015, the data with full resolution (1 arcsecond) was released globally (NASA, 2022).In this context, the Topodata project was developed by the Brazilian National Institute for Space Research (Instituto Nacional de Pesquisas Espaciais -INPE) to refine, through kriging techniques, SRTM data from the resolution of 3 arc-seconds (≈90 meters) into 1 arc-second (≈30 meters) over the Brazilian territory (Valeriano & Rossetti, 2012).This project also derived geomorphometric data from SRTM products providing information, such as slope, aspect, and curvatures, ready for use by the scientific community (Valeriano, 2008;Valeriano & Rossetti, 2008).Despite there being other DEM with better resolution, SRTM-90 (90meter pixel) and Topodata are highly applied in research over the Brazilian territory (Silva et al., 2022).Moreover, Topodata DEM is still widely used due to the unavailability of cartographic products in suitable scales for some Brazilian regions (Ferreira & Cabral, 2022).
Topodata DEM was released in 2008 and had been revised regularly by INPE (INPE, 2008).SRTM datasets most recent version (SRTMGL1v003) eliminated voids that were present in previous versions of SRTM products by using data from ASTER Global Digital Elevation Model (GDEM) Version 2.0, the Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010), and the National Elevation Dataset (NED) (NASA, 2013).

Brazilian Geodetic Network
The accuracy analysis in this study is based on official data from the Brazilian Geodetic System (IBGE, 2022a).The Brazilian Geodetic Network (BGN) contains geodesic stations implemented throughout the national territory with essential planimetric, altimetric and gravimetric information used as a reference in positioning activities as well as for correction and verification of Brazilian territory images (IBGE, 2019(IBGE, , 2022a)).Most stations are materialized through concrete landmarks with a metal plate on the top, identifying their coordinates, altitudes and gravity obtained by using high-precision geodetic procedures and models (IBGE, 2022a).A total of 1068 reference points from the BGN are present in the hydrographic region of Uruguay (Figure 1) georeferenced to the Geocentric Reference System for the Americas (SIRGAS2000horizontal datum) and Imbituba (vertical datum).It is important to mention that the Imbituba datum is defined by the calculated middle level of the sea with data from a tide gauge station and then propagating it throughout the Brazilian territory by high-precision geometric levelling (IBGE, 2019).The main characteristics of BGN, SRTM and Topodata DEM are shown in Table 1.
The variables aspect, curvature, elevation, relative relief, roughness, slope, TPI, TRI and VRM were derived from the SRTM DEM (NASA, 2013) to ensure that their values are independent of the altimetric error derived from the Topodata DEM.If we used the same DEM to derive the explanatory variables, then their values would be a function of the values of the dependent variable.Hence, the regression model would not be correctly specified (i.e. it would be an inappropriate model).On the other hand, distance to rivers and drainage density were derived from the drainage data of the National Water and Sanitation Agency (Agência Nacional de Águas e Saneamento Básico -ANA) (ANA, 2022).
Aspect measures the orientation of the slope for each location on which the compass direction ranges from 0° to 360°Clockwise, where 0°, 90°, 180° and 270°Correspond to north, east, south, and west, respectively (Kaliraj et al., 2015;Lei et al., 2022).Curvature refers to a morphological measure of the terrain topography, where positive values mean that the surface is upwardly convex, negative values reveal an upwardly concave whereas zero values indicate that the surface is flat (Lee & Sambath, 2006).
The LULC mapping was performed by the Brazilian Institute of Geography and Statistics (Instituto Brasileiro de Geografia e Estatística -IBGE) based on images from the Moderate-Resolution Imaging Spectroradiometer (MODIS) sensor and from LANDSAT-5 and LANDSAT-7 satellites.In addition, the technical review process also involved the incorporation of polygons from the vegetation maps and auxiliary information from the Continuous Cartographic Base of Brazil at a scale of 1:250,000 (IBGE, 2017(IBGE, , 2022b)).Since SRTM data were collected in year 2000, we use the LULC data from the same year trying to be as faithful as possible to the reality of the terrestrial surface at the mission's time.The following LULC classes are found in the hydrographic region of Uruguay (IBGE, 2017): (1) Artificial area -characterized by urban use, structured by buildings and road systems, where non-agricultural artificial surfaces predominate.
(2) Agricultural area -characterized by temporary and permanent crops, irrigated or not, with the land used for food production, fibre, and agribusiness commodities.It includes all cultivated land, which may be planted or fallow, and cultivated wetlands.It can be represented by heterogeneous agricultural zones or extensive areas of plantations.
(3) Pasture -area intended for the grazing of cattle and other animals, with cultivated herbaceous  , 2013) and Topodata DEM (Garofalo & Liesenberg, 2015;Miceli et al., 2011;Valeriano, 2008).artificial reservoirs (artificial water dams built for irrigation, flood control, water supply and electricity generation).Figure 2 and Table 2 show the spatial distribution of each LULC class.
To include the LULC classes in the regression modelling, it was necessary to consider an area around each reference point from the BGN.For this purpose, we considered the average distance parameter obtained through the "Calculate Distance Band from Neighbour Count" tool (ArcGIS Desktop software -version 10.8.2) which returns the average distance to the N nearest neighbour (Esri, 2022).Since this result was 5,438 meters, we defined the buffer radius value as 5,500 meters in this analysis.Then, the independent variables of the regression models were computed as the area of each LULC class inside the buffer at each location.This approach guaranteed that the scale of analysis used to compute the independent variables derived from the LULC classes was the same, regardless of the spatial distribution of the BGN points which are not regularly distributed.
To derive the altimetric errors, we followed the methodology applied by Satge et al. (2016) and Shean et al. (2016).Negative altimetric errors mean that the DEM analysed overestimates the elevation and positive errors underestimate the elevation at each verified point (Brasington et al., 2003;Holmes et al., 2000).
ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi 1 n where A REF is the altitude of the reference point from the BGN, A DEM is the altitude extracted from Topodata DEM, and n is the number of reference points.
The MAE is a statistical measure that computes the average absolute difference between the value considered the true (BGN) and the value extracted from the Topodata DEM.The RMSE measures the uncertainty in the computed values, defining the degree of correspondence between the reference values and those extracted from the DEM.Thus, lower MAE and RMSE values indicate better results (Rawat et al., 2019).
Voronoi tessellation is a concept proposed by Georgy Voronoi in 1907 based on a computational geometry data structure that has been applied in many scientific areas (Kastrisios & Tsoulos, 2018).Voronoi maps or Voronoi diagrams are built from polygons generated around a sample point.Each Voronoi polygon is obtained by intersecting perpendicular bisectors of adjacent points, where the nearest neighbour of any point inside the polygon is the sample point (i.e. the generator of the polygon) (Nene & Nayar, 1997;Safar, 2005).
Hot spot analysis is a spatial analysis technique used to identify statistically significant clusters of high (hot spots) or low values (cold spots) within a dataset using the Getis-Ord Gi* statistic (Getis & Ord, 1992).
The Global Moran's I is an inferential statistic used to measure the spatial autocorrelation in a dataset and to test whether the observed spatial pattern in the dataset is randomly distributed (null hypothesis) or spatially autocorrelated (alternative hypothesis).
A significant positive index reveals evidence of spatial clustering of similar values, and a negative one provides evidence of a dispersion pattern of dissimilar values (Moran, 1950;Prasannakumar et al., 2011).The Local Moran's I examines the significance of local spatial autocorrelation by calculating the Moran's I statistic for each location using neighbouring values to identify local clusters or spatial outliers (Anselin, 1995).Clusters with high or low values are defined as high-high (HH) or low-low (LL), respectively, and correspond to statistically significant positive spatial autocorrelation.A high-low (HL) outlier corresponds to a high value correlated with surrounding low values, and a low value correlated with surrounding high values is defined as a low-high (LH) outlier.Spatial outliers correspond to statistically significant negative spatial autocorrelation (Anselin et al., 2007).
OLS, GWR and MGWR are linear regression models, but each one operates at different spatial scales and make different assumptions regarding the spatial heterogeneity of the data set.OLS is a global model assuming a single coefficient to explain the relationship between each explanatory and the altimetric error in the whole study area, whereas GWR and MGWR are local regression models that allow the coefficients to vary in space.Moreover, MGWR also allows each independent variable to adopt a different spatial scale of analysis.
ArcGIS Desktop software (version 10.8.2) was used to run the ESDA and OLS analysis.We investigated all possible combinations of the 22 independent variables in 539,909 exploratory OLS models considering their statistical significance, the variance inflation factors (VIF), and the models' Adjusted R 2 .The OLS model which has presented the highest Adjusted R 2 , without multicollinearity issues, included four significant variables: aspect (X 1 ), curvature (X 2 ), elevation (X 3 ) and LULC_1 (artificial area) (X 4 ).These variables were then included in the GWR and MGWR models that were estimated using the MGWR 2.2 software (Oshan et al., 2019).
All variables in the MGWR model were standardized to increase the interpretability of the bandwidths of the spatial kernel.An adaptive bisquare kernel was applied in both GWR and MGWR models as the distance-weighting function to control the optimal number of nearest neighbours to be included in the local model fitting (5) (Fotheringham et al., 2017): where w ij is the weight between points i and j, d ij is the Euclidean distance between points i and j, and the bandwidth b i is the distance from focal point i to its M th nearest neighbor.The optimal number of neighbors (M) is determined by the lowest corrected Akaike's Information Criterion (AICc) that is obtained from multiple comparisons.The MGWR model is formulated in Equation 6.
where bwk denotes the specific optimal bandwidth used in the calibration of the intercept and k th conditional relationship (k ¼ 1; . . .; 4), and u i ; v i ð Þ are location coordinates for each focal point i (i ¼ 1; . . .; 1068).Model diagnostics and inferencerelated diagnostics are computed for local parameter estimates (Fotheringham et al., 2019;Yu et al., 2020).
In the GWR model, the notation bwk was removed because the optimal bandwidth used is the same for the intercept and all k conditional relationships.That notation was also removed from the OLS model, as well as the u i ; v i ð Þ location coordinates, because the OLS (global) model is specified using a single regression equation: the β k parameters (k ¼ 0; 1; . . .; 4) are the same for all locations and they are estimated using the whole data set.For further details see Fotheringham et al. (2002).

Statistical analysis
The altimetric errors of the Topodata DEM range from − 36.68 to 39.23 meters, and 75% of them are smaller than 1.22 meters (Table 3).There is very a strong correlation between the elevations extracted from SRTM and Topodata DEM when they are compared with the reference points from the BGN as the coefficient of determination (R 2 ) of a simple linear regression is higher than 0.99 in both cases (Appendix A1/A2).
The histogram indicates that the altimetric errors are negatively biased (overestimated) which is reinforced by the median value already shown in Table 3.Despite that, they present a distribution a little close to a normal curve (Appendix A3).
The boxplots of the elevation differences are shown in Appendix A4, where it is possible to notice that the SRTM and Topodata elevation are very similar to the BGN altitudes and that more than 50% of the altitudes are below 500 meters, an expected fact since Brazil is considered a low-lying country being 41% of its territory below 200 meters and barely 7% above 800 meters (Alvares et al., 2013).Even though it can be observed in Appendix A4 that almost 25% of the point altitudes are above 800 meters.
The altimetric errors are higher in areas with slopes above 30% (Appendix B1).Furthermore, we observe that the MAE and the RMSE increase as the slope also increases.Likewise, the altimetric errors are higher in places where the elevation is higher than 1,200 meters (Appendix B2), despite the altimetric errors being also higher in regions where elevation is between 900 and 1,200 meters when compared with places lower than 900 meters.Regarding the LULC, the classes that presented the highest altimetric errors were "forest vegetation" and "artificial area" (Appendix B3).The Pearson's correlation coefficient between elevation and altimetric error is approximately 0.22.The correlation matrix of all the candidate explanatory variables (Appendix C) shows that some of them are significantly correlated with each other, such as relative relief, VRM, TPI.

Spatial effects in altimetric errors
The Voronoi map of the altimetric error shows that most samples (82%) present altimetric errors between − 6.46 and 6.46 meters and that they are evenly distributed (Figure 3).There is no apparent trend of the altimetric error over the study area.
The Global Moran's I statistic revealed that the altimetric error did not have a significant spatial autocorrelation (index ≈0.02; p-value = 0.98), thus the altimetric error pattern does not appear to be significantly different from random.Hot Spot Analysis (Getis -Ord Gi* statistic) reveals both high altimetric errors (hot spot) and low altimetric errors (cold spot) values clustered spatially (Figure 4b).In this analysis, results demonstrate the existence of 7 hot spots in the upper region, 5 in the middle and lower region of the watershed at the 1% significance level, 6 in the upper and 1 in the middle part of the watershed at the 5% significance level, and 2 in the upper region of the watershed at the 10% significance level.In addition, this analysis also shows cold spots in the upper region (3 points) and 1 in the middle region of the watershed at 1% and 5% significance levels, respectively (Figure 4b).

OLS results
The summary of OLS results in Table 4 shows that all variables have a statistically significant coefficient.Furthermore, the low VIF values indicate that there is no evidence of redundancy among the explanatory variables.

GWR and MGWR results
Results of the MGWR model executed with the same variables as the OLS and GWR models show an improvement of the proposed model since it had a higher Adjusted R 2 (0.41) and a lower AICc (2,602.14)than the OLS and GWR models (Table 5).It was also noticed a decrease in the residual sum of squares (RSS).
Given the spatially varying nature of MGWR, the map of the Local R 2 (Figure 5a) highlights the areas where the local regressions have a better goodness-offit and provides insights on the locations where important variables may be missing.The Local R 2 ranges from 0.18 to 0.74 (Figure 5a).Low values of the Local R 2 in the flatter regions located in the middle of the study domain indicate that other factors may affect the altimetric error besides aspect and elevation, which have significant coefficients in that region (Figures 6a, 6c).The Local R 2 was higher than 0.5 in approximately 25% of the local regressions obtained with the MGWR model (Table 6).These areas, where the Local R 2 ranges from 0.50 to 0.74, overlap with the areas where the curvature presents significant local   coefficients (Figure 6b).Therefore, the inclusion of curvature contributed to improving the goodness-offit of the MGWR model in the northern and southern parts of the study domain.
There is no evidence of spatial autocorrelation in the residuals of the MGWR model over the study area because the value of the Global Moran's I statistic was equal to − 0.01, and it was not statistically different from zero (z-score = −0.01;p-value = 0.99).Hence, the spatial pattern of the residuals does not appear to be significantly different from random.
A Local Condition Number (Local_CN) greater than 30 indicates that there might be a multicollinearity problem in the model (Oshan et al., 2019).In this way, there is also no evidence of multicollinearity among the independent variables because the Local_CNs range from 1.14 to 2.67 (Figure 5b).

Spatial pattern analysis of the coefficients
The optimal bandwidths of the coefficients obtained from GWR and MGWR models are shown in Table 7.The GWR model presented a very restrictive bandwidth (173) compared with the bandwidth of the MGWR model since it is approximately half of the average bandwidth of the MGWR model (337).Furthermore, GWR assumes that the aspect, curvature, elevation and LULC_1 (artificial area) influence the altimetric error on the same scale, which is refuted by MGWR model results.The MGWR bandwidth of the curvature variable with 52 nearest neighbours indicates that this variable operates on a local scale.The MGWR bandwidths of the LULC_1 (artificial area) and aspect variables show that these variables operate on a regional scale.The MGWR bandwidth of the elevation variable reveals that this variable influences the altimetric error on a global scale because its bandwidth is exactly the possible maximum number of neighbours (1067).
The MGWR bandwidth and the standard deviation parameter estimates of the variables are inversely related because a large bandwidth of a variable influences the dependent variable on a large scale which means small heterogeneity and, consequently, a small standard deviation of parameter estimates (Fotheringham et al., 2019).Likewise, a short bandwidth influences the dependent variable on a local scale where the standard deviation of the local parameter is large (Table 7).
MGWR local coefficients of each explanatory variable are shown in Figure 6, where blank areas indicate that the coefficients are not significantly different from zero.Hence, the explanatory variable does not significantly affect the altimetric error in those locations.The significant coefficients of the aspect variable are all negative and appear in the upper and middle course of the hydrographic region of Uruguay (Figure 6a).On the other hand, the non-significant coefficients of this variable are mostly in the lower part of the study area.Therefore, the explanatory variable does not significantly correlate with the altimetric error in those locations.Figure 6b shows the variation of the curvature local parameters where the coefficients are all positive in parts of the upper and lower course of the watershed, considering only regions with significant values.Figure 6c reveals that the elevation variable significantly influences the altimetric error at a global scale and its coefficients are very similar across the whole study area, as they range from 0.42 to 0.45.All significant coefficients of the LULC_1 (artificial area) variable also presented positive values, and they are mainly in the lower course of the watershed (Figure 6d).The percentage of locations with significant coefficients (p 0.05 of t-test) of aspect, curvature, elevation, and LULC_1 were 82%, 36%, 100%, and 18%, respectively (Appendix D).

Discussion
Despite covering 2% of Brazil territory, the area of the Brazilian part of the Uruguay River watershed is approximately equal to double the area of some countries such as Azerbaijan (86,600 km 2 ), Hungary (93,030 km 2 ) or Portugal (92,212 km 2 ).Other regions of the Brazilian territory were discarded, and that area was chosen because of the reasonable coverage of points from the BGN, where we could identify 1068 reference points satisfactorily distributed over the study domain.Being the Topodata model exclusively developed for the Brazilian territory, it is expected that the altimetric range of the study area (32 to 1822 meters above sea level) may adequately represent the average of the national territory elevation.
The statistical analysis results demonstrated that the elevation and slope variables affect the accuracy of the Topodata DEM because higher places and steeper areas presented higher altimetric errors.In fact, the highest altimetric error residuals are related to the highest slope classes, and this result was also observed in similar studies (Gdulová et al., 2020;González-Moradas & Viveen, 2020;Gorokhovich & Voustianiouk, 2006;Mukherjee et al., 2013;Varga & Bašić, 2015).Additionally, altimetric errors are higher on higher-elevation surfaces, which was also found by Mukherjee et al. (2013) and Pandey et al. (2017).Nevertheless, some authors did not find a significant relationship between altimetric error and elevation (González-Moradas & Viveen, 2020;Varga & Bašić, 2015).
There is evidence of the influence of LULC on altimetric errors of the Topodata DEM since the artificial areas and forest vegetation presented higher altimetric errors when compared with the other classes as shown in the statistical analysis.Previous studies have already verified the LULC effect on altimetric error (Satgé et al., 2015;Yap et al., 2019), especially in vegetated areas (Dong & Shortridge, 2019;Gdulová et al., 2020;Leon et al., 2014) but such effect was also found in artificial areas (Dong & Shortridge, 2019;González-Moradas & Viveen, 2020), although smaller altimetric errors have been observed in built and homogeneous environments such as houses, roads and bare land (Leon et al., 2014).Despite the statistical analysis having shown some influence of the forest vegetation class on the altimetric error, this variable was not included in the local regression models because it was only significant in 0.02% of all the exploratory OLS investigated models.The best OLS regression model identified in this analysis included aspect, curvature, elevation and LULC_1 (artificial area) as explanatory variables of the altimetric error in the Topodata DEM.Nonetheless, Random Forest Regression (Breiman, 2001) could be considered in future studies for the selection of a reduced number of factors from a large set of potential explanatory variables.
The analysis of spatial effects in altimetric errors highlighted the pattern of spatial heterogeneity of the altimetric error.Generally, GWR outperforms global regression models because of its capability to deal with spatial non-stationarity.Though, GWR uses a single optimized bandwidth for all independent variables to define the local neighbourhoods, thus assuming that all relationships vary at the same spatial scale across all covariates (Fotheringham et al., 2002(Fotheringham et al., , 2019)).However, we assume that they may operate at different scales.In this sense, the MGWR approach is more appropriate since it computes an optimal bandwidth for each independent variable (Fotheringham et al., 2017).In fact, MGWR outperforms GWR because it allows examining the spatial scales in different processes by enabling the optimization of covariate-specific bandwidths (Fotheringham et al., 2019;Yu et al., 2020).
GWR and MGWR models were estimated using the same covariates of the best OLS model previously identified.Regression analysis results based on the Adjusted R 2 and AICc values proved that the MGWR outperforms the OLS and GWR models as expected.Moreover, our findings allow us to state that GWR may not be suitable for modelling the altimetric errors of a DEM because not all the explanatory variables influence the altimetric error on the same scale.Our results confirm the hypothesis that each explanatory variable operates on a scale as the curvature variable affects the altimetric error on a local scale, the LULC_1 (artificial area) and aspect on a regional scale, and the elevation influences the dependent variable on a global scale.MGWR local coefficients of different variables can be directly compared because the dependent and independent variables were standardized.MGWR coefficients analysis showed that high positive altimetric errors are mainly where the aspect and curvature variables coefficients are significant.Most of the highest positive altimetric errors (high-high clusters/hot spots) are placed in regions with the lowest coefficients of the aspect variable, which exhibits a negative relationship with the altimetric error.On the other hand, the curvature coefficients showed a positive relationship.Therefore, as the curvature values increase, so do the altimetric errors.However, it is possible to verify a (low-low) cluster of points with negative altimetric errors overlapping regions with the highest values of the curvature coefficients.Negative altimetric errors (low-low clusters) are mostly in the middle region of the study area, where only aspect and elevation variables have significant coefficients.Furthermore, negative errors occurred almost exclusively where the LULC_1 coefficients are not significant.
One of the limitations of this study is the use of the SRTM DEM to derive some explanatory variables as this product also has accuracy issues (Weydahl et al., 2007).Nevertheless, future work should consider the development of algorithms capable of dealing with the error arising from some of the explanatory variables addressed in this study aiming to reduce altimetric discrepancies in the DEM products.

Conclusion
This study performed a vertical accuracy analysis of the Topodata DEM in the hydrographic region of Uruguay assuming the hypothesis that its vertical accuracy would be related to terrain characteristics.The results of the statistical analysis showed that the MAE and RMSE values are sensitive to elevation, slope and some LULC classes, namely forest vegetation and artificial area.We performed a linear regression analysis through OLS, GWR and MGWR models to identify the factors that may explain the spatial patterns in the altimetric error of the Topodata DEM.The MGWR model showed better results than OLS and GWR because it models the relationship between the altimetric error and the factors influencing DEM vertical accuracy considering the geographic scale at which individual process occurs.The results show that different terrain characteristics operate at different scales and their relationships with altimetric error vary in space.The aspect, curvature, and artificial areas variables operate at much smaller scales than elevation which influences the altimetric error on a global scale.This implies that elevation is more relevant throughout the whole study area, whilst the other variables are relevant in certain areas since they operate on local or regional scales.
Our findings proved that different terrain characteristics operate at different scales and their relationships with altimetric error vary in space.In this way, this research provides a better understanding of the global and local processes influencing the quality of Topodata products and highlights the importance of terrain characteristics in effective DEM product development, besides shedding light on some limitations of regression modelling applications.

Figure 1 .
Figure 1.Study area and reference points in the hydrographic region of Uruguay (Brazil).

Figure 2 .
Figure 2. LULC in the hydrographic region of Uruguay.
Anselin's Local Moran's I, represented in Figure 4a, shows that 23 of the 1,068 points (≈2%) have significant positive (15 high-high and 3 low-low clusters) and negative (3 low-high and 2 high-low outliers) spatial autocorrelation.The high-high clusters (high altimetric errors surrounded by high altimetric errors) are in the upper watershed course, and 2 of the 3 lowlow clusters (low altimetric errors surrounded by low altimetric errors) are in the middle watershed course.Figure 4a also shows that 3 low-high outliers (low altimetric errors surrounded by high altimetric errors) are in the upper watershed course whereas there is 1 high-low outlier in the upper watershed course and 1 in the middle watershed course.

Figure 3 .
Figure 3. Voronoi map of the altimetric error.

Figure 4 .
Figure 4. Local Moran's I statistic (a) and hot spot analysis (b) of the altimetric error.

Figure 5 .
Figure 5. Voronoi map of the local R 2 (a) and local condition number (b) of the MGWR model.

Figure C1 .
Figure C1.Correlation matrix of the candidate explanatory variables.

Mosaic of agriculture and forest remnants
TIF *Sistema de Referência Geocêntrico para as Américas (Geocentric Reference System for the Americas); **World Geodetic System 1984; *** Brazilian official vertical datum; **** Earth Gravitational Model 1996.area characterized by the mixed occupation of agriculture, pasture and forestry associated with forest remnants.Other plant formations (herbaceous and shrubby) may occur to a lesser extent.(5) Forestry -area characterized by forest plantations of exotic and/or native species as monocultures.(6) Forest vegetation -area occupied by forests.Trees taller than 5 meters are considered forest formations, including areas of dense forest, open forest, and seasonal forest, in addition to the mixed ombrophiles forest.(7) Grassland -area characterized by natural grassland vegetation subject to grazing and other low-intensity anthropic interference.(8) Mosaic of anthropic areas and grasslandarea characterized by the mixed occupation of agriculture, pasture and/or forestry with remnants of grassland vegetation.Arboreal plant formations may occur to a lesser extent proportion.(9) Water body -It includes all inland waters such as rivers, streams, canals, and other linear bodies of water.It also encompasses naturally closed bodies of water (natural lakes) and

Table 2 .
Spatial distribution of LULC classes in the hydrographic region of Uruguay.

Table 3 .
Statistical accuracy indicators of the Topodata DEM.

Table 4 .
Parameter estimates for the OLS model.

Table 6 .
Frequency distribution of local R 2 values of the MGWR model.

Table 7 .
Summary of local regression results.