Evaluation of the elevation model influence on the orthorectification of Sentinel-2 satellite images over Austria

ABSTRACT The European Space Agency (ESA) is using PlanetDEM for the generation of their Sentinel-2 Level-1C orthophotos. There are concerns that this DEM is not accurate enough for this purpose, especially over mountainous regions like in Austria. In this study, we investigate the height accuracy of PlanetDEM in Austria by comparing it with a reference DEM that is derived from airborne laser scanning. Afterwards, we predict the errors in the Sentinel orthophotos that are caused by these DEM errors: 99.8% of all pixels have a PlanetDEM-induced displacement between ±10 m (i.e. one Sentinel-2 pixel) for all investigated four satellite tracks over Austria. Still, at a few spots, the location errors can reach 60 m. ESA’s goal is to achieve a multi-temporal geo-registration quality (for Level-1C products) of 3 m (95.5% confidence) once the Global Reference Image is established. If we assume that this error budget consists partly of a georeferencing and partly of a DEM-induced error and both are equal, then the DEM-induced error may only amount to 2 m. In this case, PlanetDEM would not allow for this accuracy in the mountainous regions, because there, according to the predictions of this study, 95.5% of the errors are between ±3.8 m.


Introduction
The European Space Agency (ESA)'s Sentinel-2 satellites are equipped with the Multi Spectral Instrument (MSI), which is observing in the visible and infrared of the electromagnetic spectrum with a ground sampling distance (GSD) of 10 m, 20 m and 60 m, respectively. Using two of these satellites, a revisit time of 5 days is anticipated at every point on the equator (Drusch et al., 2012). The first Sentinel satellite (2A) was launched on 25 June 2015, and the second satellite (2B) followed on 7 March 2017.
The orthorectification of the satellite images (product Level-1C), performed by the ESA, is aimed to reach a geometric accuracy of 20 m (95.5% confidence), and improved to 12.5 m when using ground control points (GCPs). The multi-temporal geo-registration quality shall be 0.3 pixel (95.5% confidence), enabled with the Global Reference Image (GRI) (ESA, 2017). The GRI is a set of cloud-free images in the sensor geometry (product Level-1B) with known exterior orientation, which is computed using GCPs.
Currently, the performance is evaluated to be 10.5 m (95.5% confidence) without GCPs, and 1.2 pixel (95.5% confidence) for multi-temporal georegistration. The latter is expected to improve once the GRI is completed and included in the processing chain (ESA, 2017).
Orthorectification requires a digital elevation model (DEM). The quality of the DEM has an impact on the quality of the geolocation of the pixels in the orthophoto. For the orthorectification, ESA uses the 90-m resolution model Planet-DEM-90; (Gascon, 2014). This commercially available product is based on Shuttle Radar Topography Mission (SRTM) and Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER, a Japanese sensor on board the Terra satellite).
There are concerns (EuroSDR, 2014) that the quality of this DEM does not allow achieving the required accuracy, especially over mountainous regions like in Austria; see Figure 1. Many applications (e.g. time series analysis or data fusion with other sensors) rely on a precise georectification. Therefore, it is crucial to know the size of the DEM-induced errors in the orthophotos and, thus, the accuracy gain that is possible by using a better DEM. SRTM was acquired by Interferometric Synthetic Aperture Radar (InSAR) and in flat regions the accuracy, as specified, is sufficient. In regions with strong terrain gradients, the effects of foreshortening, layover and radar shadow have severe impacts on the elevation model quality, and, thus, lead to errors of several of tens of meters in height. With such elevation errors, the orthorectification puts image content not at the correct position in the orthophoto, butdepending on the sign of the errorinwards or outwards. Finally, it means that the anticipated geolocation accuracy may not be reached in all places. Furthermore, the 90-m grid is not able to model all terrain features as occurring in rough landscapes. It shall be noted that the effective geolocation errors are not necessarily restricted to mountain peaks, but may also manifest at the valley bottom.
Because orthorectification is performed by ESA, less computational burden is put onto the users of the Sentinel-2 data. However, this only applies, if the accuracy as specified by ESA can really be reached. The aim of this article was to address the concerns on Sentinel-2 geolocation accuracy over the area of Austria caused by DEM errors. Within Austria's alpine region, steep slopes occur. Terrain model elevation errors in these regions have a severe effect on lateral displacement, since the displacement due to an elevation error and slope can add up. Such errors are expected from SRTM due to radar shadows (InSAR viewing geometry) and other effects.
For this analysis, terrain data of superior quality are used, acting as "ground truth". Two DEMs derived from airborne laser scanning (ALS) data are available for this study: one for the whole of Austria with grid width 10 m (AustriaDEM), and one for the state Tyrol with grid width 5 m (TyrolDEM). The height quality of PlanetDEM over Austria is determined by comparison with AustriaDEM. In order to show modifications of the original SRTM and ASTER data in PlanetDEM, EarthDEM, another DEM that is also based on SRTM and ASTER, will be compared with AustriaDEM too. All the data used for this study are summarized in Section 2.
At first in Section 3, the accuracy of PlanetDEM with respect to the AustriaDEM is determined. After a brief description of the four satellite paths of Sentinel-2 over Austria in Section 4, the orthophoto displacements caused by the actual height errors of PlanetDEM are estimated in Section 5 using again AustriaDEM as reference. There, in order to verify these predicted displacements, several Sentinel-2A Level-1C orthophotos are compared with aerial reference orthophotos of superior quality. In Section 6, the effect of a changing grid width on the predicted displacements is analysed, thus allowing a hint on which DEM could meet with the targeted multi-temporal geo-registration quality. Finally, Section 7 summarizes the main findings and gives a conclusion.

Data
The following data are used in this study: • Planet-DEM-90: This elevation model, referenced as PlanetDEM in the following, can be purchased from PlanetObserver (PlanetObserver, 2017). It contains orthometric heights given in a geographic grid (3″grid spacing in latitude and longitude, corresponding to 90 m arc length) and refers to WGS84. This height model is derived mainly from SRTM 1 and ASTER GDEM v2 data. It shall be noted that PlanetDEM is regularly updated and thus exists in different versions. Originally, we acquired PlanetDEM for Austria for this study in November 2015. However, any differences in the estimated orthophoto displacements that might be caused by differences in our version of PlanetDEM and the version that ESA is actually using for the generation of the Level-1C orthophotos should be excluded. Therefore, we contacted ESA in July 2017 and then received ESA's version of PlanetDEM for the area of Austria, which is now used in this study. Using GDAL (2017), the grid of PlanetDEM was projected using Lambert conformal conical projection (LCC in the following, EPSG:3416) and interpolated into a grid with 90 m spacing. The orthometric heights were then transformed into ellipsoidal ones using the software Opals (Pfeifer, Mandlburger, Otepka, & Karel, 2014) and the EGM96 geoid (Lemoine et al., 1998), which according to Farr et al. (2007) was used during SRTM processing. • EarthEnv-DEM90: This elevation model, referenced as EarthDEM in the following, is freely available (EarthEnv, 2017) and is also based on SRTM and ASTER data (Robinson, Regetz, & Guralnick, 2014). It also refers to WGS84 and is given in a geographic grid (3" grid spacing). Again, gdalwarp was used for projecting and gridding with respect to LCC (EPSG:3416) and a grid width of 90 m. Ellipsoidal heights were generated again using the EGM96 geoid. • AustriaDEM: For entire Austria, a digital terrain model with 10-m grid width, referenced as AustriaDEM in the following, is available for free in the course of the Austrian Open Government Data initiative (OGD, 2017). This DEM is derived from ALS data that were collected by the surveying departments of the individual Austrian states in various missions with dm height accuracy at various grid widths around 1 m. For the generation of AustriaDEM, these grids were merged and downsampled to 10 m. By downsampling, the very high original accuracy will decrease. After comparison with TyrolDEM (see next in this data list), we conclude that the height accuracy of AustriaDEM is 1 m (standard deviation). AustriaDEM is given in the Austrian geodetic datum (termed MGI) in LCC projection. Using gdalwarp, this grid was transformed from MGI/ LCC (EPSG:31287) to ETRS89/LCC (EPSG:3416) and interpolated into a grid with 10-m spacing. The orthometric heights were then transformed into ellipsoidal ones using the EGM2008 geoid (NGA, 2017). AustriaDEM will serve as the main reference DEM for the quality investigation of PlanetDEM. • TyrolDEM: For the state Tyrol (see Figure 1), a digital terrain model with grid spacing 5 m, derived from ALS, was provided from the Tyrolian department of surveying specifically for this study. Originally, this DEM was given with respect to the Austrian geodetic datum in two different Gauß-Krüger projections using orthometric heights. Using gdalwarp, these data were transformed from MGI/Gauß-Krüger (EPSG:31254 and EPSG:31255, respectively) into ETRS89/LCC (EPSG:3416) and interpolated into a grid with 10-m spacing. The orthometric heights were then transformed into ellipsoidal ones using the EGM2008 geoid. TyrolDEM will be used for a more detailed analysis in the Alps and for an analysis regarding the influence of the DEM grid width. • PhotoDEM: From the homepage of the Austrian mapping agency (BEV, 2017), a photogrammetry-based DEM of entire Austria with a grid width of 100 m can be downloaded for free. Using gdalwarp, this grid was transformed from MGI/Lambert (EPSG:31287) to ETRS89/LCC (EPSG:3416) and interpolated into a grid with 100-m spacing. The orthometric heights were then transformed into ellipsoidal ones using the EGM2008 geoid. PhotoDEM will also be used in the analysis regarding the influence of the DEM grid width. • Sentinel Images: For this study, the Sentinel-2A Level-1C products listed in Table 1 were used. They were downloaded from Copernicus (2017). Their selection was based on cloud and snow coverage, shadows and having passed the internal geometric quality test (visible in the product details before the download). Each Level-1C image is an orthophoto of size 100 × 100 km 2 referring to WGS84/UTM (in the case of Austria, either zone 32 or zone 33). The conversion into ETRS89/LCC (EPSG:3416) was again done using gdalwarp. Only the true colour image (TCI file) was used, having 10-m GSD. For two of these Level-1C scenes, the respective Level-1B images were thankfully provided by ESA. • Reference Orthophotos: From the Austrian mapping agency (BEV, 2017), orthophotos were available for this study. These orthophotos are derived from aerial image flights in years 2013-2015 with a GSD of 25 cm. Using gdalwarp, these orthophotos were transformed from MGI/Gauß-Krüger to ETRS89/LCC and resampled to 1-m GSD. These orthophotos serve as reference for the Sentinel-2 orthophotos.
Investigating the height accuracy of Planet-DEM-90 and EarthEnv-DEM90 Figure 2 shows the difference between PlanetDEM and the AustriaDEM as a colour coding. It can be seen that the differences are not purely random, but largely correlated with the steepness of the terrain (i.e. the mountains and hills are largely visible as positive height differences). Additionally, the differences show a systematic tile-like pattern (especially in the alpine regions; c.f. Figure 1). There is one rectangular area in the western part of Austria where the differences are very large (exceeding ±50 m). Figure 3 shows the difference between EarthDEM and the AustriaDEM. Because of the fact that PlanetDEM and EarthDEM are derived practically from the same data sources (SRTM and ASTER), the differences look very similar. However, EarthDEM does not show this tile-like pattern in the differences.
According to the homepage of PlanetDEM (PlanetObserver, 2017), the main data sources (SRTM and ASTER) were "extensively reprocessed". Obviously, this processing focused more on the mountains than the lowlands. Apparently, thereby at some point in time, a very large error might have occurred in that particular wrong tile in the west. Figure 4 shows the histograms of the computed differences. Table 2 lists some statistics. Table 2 lists the value σ MAD , which is a robust estimator for the standard deviation assuming a Gaussian distribution. However, it is only representative for the central 50% of the data. Additionally, the histograms of the differences of PlanetDEM and EarthDEM with  respect to the AustriaDEM are rather different from a Gaussian, having too many values close to zero and a bias towards the positive values. An explanation for that could be that the histograms consist of more than one population. Indeed, the comparison of these histograms with the spatial distribution of the differences (i.e. Figures 2 and 3) reveals that at least two such populations exist: (i) in the lowlands, both PlanetDEM and EarthDEM fit well to the AustriaDEM; (ii) in the Alps, the differences are much more dominant. Both DEMs are derived from radar data and, therefore, these observations could be explained by the radar properties (layover and foreshortening), which have negative influence in mountainous regions but not in flat ones. As a result, in the mountainous regions, both SRTMbased DEMs deviate more from the terrain than in the lowlands. In summary, PlanetDEM has a tile-like pattern of systematic differences in the Alps, whereas EarthDEM does not show this. Both PlanetDEM and EarthDEM have similar height deviations with respect to the AustriaDEM (plain standard deviation 12.5 m), 95% within −26 and 26 m (for PlanetDEM) and 95% within −19 and 27 m (for EarthDEM). With respect to the AustriaDEM, a small constant vertical shift is present for EarthDEM (median 1.4 m), whereas for PlanetDEM no such shift can be observed.

Sentinel-2 orbits covering Austria
Sentinel-2A covers Austria by four paths. Figure 5 shows the actual coverage over Austria. For this, a swath width of 296 km was chosen for each track. This value fits better to Austria than the nominal 290 km found in, e.g. Drusch et al. (2012). The tracks are numbered from west to east.
For computing the effect of the height errors of PlanetDEM on the orthophotos, these orbits are modelled as circles. The centre of them is assumed to be at the earth ellipsoid's centre of gravity. Two further points define each orbit; see Table 3. For the tracks 1 and 2, these two points were derived from the navigation information that came with the two Level-1B scenes. For the other two orbits, these two points were derived with additional information from N2YO (2017).

PlanetDEMApplied method
The MSI of Sentinel-2 collects data in the visible range of the electromagnetic spectrum with a GSD  Table 2 is superimposed. Table 2. Statistics of the height differences of PlanetDEM and EarthDEM with respect to AustriaDEM. Listed are mean and standard deviation, together with their robust variants median and σ MAD . The latter is a robust estimator for the standard deviation of a Gaussian distribution derived as σ MAD = 1.4826 ⋅ MAD, where MAD is the median of the absolute differences (with respect to the median derived from all values). Also given are various percentiles. All values are in meter, n is the number of pixels (size 90 × 90 m 2 ) from which these statistics are derived. of 10 m. For the generation of the respective orthophotos, the PlanetDEM is used (Gascon, 2014). In Section 3, it was documented that the plain standard deviation of this DEM is about 12 m. Additionally, very large errors were detected, exceeding 50 m, which are localized in certain irregular tiles of PlanetDEM. Consequently, the question arises, which displacements are caused by these errors of PlanetDEM, especially in the mountainous regions. The opening angle Ω of Sentinel-2 is small (21.06°) and centred around the nadir. In horizontal terrain, the displacement Δr in the orthophoto, caused by a height error Δh, becomes in general Here α is the look angle with ÀΩ=2<α<Ω=2: The displacement increases from the nadir towards the border of the swath. There, the viewing angle α obtains the maximum value of Ω/2 = 10.53°. Consequently, at the swath's border, the displacement over horizontal terrain becomes Δr ¼ 0:186 Á Δh (i.e. around 20% of the height error turns up as displacement). However, this is only true for horizontal terrain. In mountainous terrain, slope and exposition can increase or decrease this displacement, as shown in Figure 6. Additionally, in mountainous terrain, the height error of a radar-based DEM is generally larger than in flat horizontal areas. Therefore, the simple estimation using Equation (1) will not work in mountainous areas and, instead, a more rigorous analysis must be performed, which considers the actual shape of the terrain. For this analysis, the AustriaDEM will serve as reference.
Note that Figure 6 shows the relations in a spatial Cartesian (i.e. unprojected) coordinate system. Because all DEMs and the orthophotos refer to LCC, it is more convenient to perform the actual monoplotting in this projection. However, the projection ray connecting the true terrain point P, the wrong terrain point Q and the orbit point S P is then no longer a straight line, but a slightly bended one.
Therefore, the procedure for computing the displacement error in the orthophoto is as follows; c.f. Figure 6: (1) Define the orthophoto matrix (in LCC) to be congruent with the matrix of the reference AustriaDEM, both having grid width 10 m.
In the same way, define a displacement matrix D, which stores in each pixel the displacement of the corresponding orthophoto pixel.
(2) For each pixel, with centre point P′ = (X P , Y P ) T , in this grid do the following: (a) The stored height value (H P ) in this pixel of the AustriaDEM defines the respective  true terrain point P = (X P , Y P , H P ) T . Note that P′ is the ground projection of P . (b) Find the respective point S P on the satellite orbit from where P is actually seen. This calculation is done in the spatial Cartesian coordinate system associated with the ellipsoid. There, the satellite orbit is represented as a circle s using the points from Table 3 with the centre in the ellipsoid origin. The point S P is the intersection of the orbit circle s with the plane ω through P orthogonal to s in S P . The details regarding the computation of S P are found in appendix A (see Equation A1). (c) If the angle α between the nadir direction and the vector from S P to P is smaller than Ω/2, then proceed. (3) Intersect the ray from S P to P with PlanetDEM.
This method is known as monoplotting (Kraus, 2007). Since this computation is done in LCC (where PlanetDEM is given), the bending of the projection ray needs to be considered. This is done by computing the tangential vector t P of the bended projection ray in P. Further details regarding the computation of t P are described in Appendix A (see Equation A2).
Monoplotting using the tangential vector in P yields the wrong terrain point Q = (X Q , Y Q , H Q ) T . Its ground projection Q′ = (X Q , Y Q ) T gives the displaced orthophoto point (i.e. the location in the orthophoto where the orthorectification using PlanetDEM would map the image of point P). The length d P of the vector from P′ to Q′ gives the displacement in the orthophoto, which is then stored in the displacement matrix D. Note that the displacement is a 2D vector. However, since the direction of that vector is always orthogonal to the ground track of the satellite, it suffices to consider only that vector's length in the following, and assign the sign according to the following definition. The sign of d P is chosen positive if the vector from P′ to Q′ points away from S′ and negative if it points to S′.
Note that this presented method is rigorous in the sense that the mapping between orthophoto and original image is computed for each orthophoto pixel. However, according to ESA (2018), a more economic version of this mapping is used by ESAthe so-called anchor point method (Kraus, 2007). For this method, only the points of a sparser grid are mapped rigorously. Afterwards, a bilinear interpolation within these grid points is used to approximate the rigorous mapping for all the pixels between these grid points. This approach has the advantage of being much faster than the rigorous mapping for each pixel. The drawback is that a certain approximation error has to be accepted. However, according to Kraus (2007), this approximation error is of little relevance even under extreme conditions. Since speed is not an issue in Figure 6. Computing the orthophoto displacement using monoplotting. In point P' the height error of PlanetDEM is dH. Over horizontal terrain, the displacement would be d p . Because of the slope of PlanetDEM the actual displacement is dP and in the orthophoto the image of P does not appear in the correct location P' but in Q'.  our investigation, we apply the presented rigorous mapping for each pixel.

Results
The following Figures 7 and 8 show the resulting displacement matrices D for all four orbits as colour codings. Generally, it has to be said that the displacements are very small, and thus for viewing purposes the colour codings use a very small class width of 1 m (corresponding to a 1/10 of a Sentinel-2 OP pixel). All colour codings show two expected properties: (i) generally, the displacements increase with the viewing angle (i.e. from the ground track towards the swath borders). (ii) Flat areas are less effected than the mountainous regions.
The fact that the displacements are very small is also reflected in the statistics in Table 4. By subtracting the 0.1st from the 99.9th percentile, we see that 99.8% of all pixels have a PlanetDEM-induced displacement between ±10 m (i.e. one Sentinel-2 pixel) for all tracks. Still, at a few spots (0.2% of all pixels), the location errors can become much largerreaching up to 60 m.
Similarly, we can derive that 95.5% of the errors are roughly between ±3.8 m for tracks 1 and 2 (which cover mostly the Alps), between ±3.0 m for track 3 (which covers less of the Alps), and between −3.8 m and 1.2 m for track 4 (which is almost entirely over lowlands).
Once the GRI is established, the multi-temporal geo-registration quality shall be 0.3 pixel or 3 m (95.5% confidence) (ESA, 2017) (Gascon, 2014). In a recent study from ESA (Gascon et al., 2017), the authors report that the multi-temporal registration error is only targeted for orthophotos acquired from the same relative orbit. They say that orthophotos acquired from adjacent orbits, and thus with different viewing angles, cannot meet the requirement of 0.3 pixel, since the DEM-induced error will be larger than 0.3 pixel. The authors report no expectations by ESA for the multi-temporal registration error from adjacent orbits. Instead, they assume the vertical accuracy of PlanetDEM to be 16 m (95.5% confidence), and then roughly estimate the PlanetDEMinduced error at the swath border (over horizontal terrain) to be 5.6 m or 0.56 pixel (95.5% confidence). We showed in Section 3 that the vertical accuracy of PlanetDEM in Austria is actually 26 m (95.5% confidence). Despite this much larger value, the detailed analysis presented here showed that (for Austria) the PlanetDEM-induced error is actually only 3.8 m (95.5%, tracks 1 and 2). This highlights the importance of a thorough analysis.
Because no ESA expectation for the DEM-induced error is reported in Gascon et al. (2017), it is difficult to judge whether PlanetDEM will fulfil the expectation by means of the statistics in Table 4. We might determine a rough expectation by considering the following arguments: • Users will want Level-1C products with homogenous accuracy, i.e. irrespective of same (relative) or adjacent orbits. • Because of the DEM errors, however, the registration accuracy between orthophotos of adjacent orbits will be generally worse than for orthophotos of the same relative orbit. In Yan et al.(2018), the authors report 0.35 pixel (overall mean of misregistration distances) for original Level-1C orthophotos of the same relative orbit, and 0.45 pixel for orthophotos of adjacent orbits. • In Gascon et al. (2017), the authors report a multitemporal registration error of 0.22 pixel (95.5% confidence) obtained empirically for orthophotos acquired from the same relative orbit after a GCPbased refinement (thereby anticipating the usage of the future GRI). For the same relative orbit, this result is largely independent of the DEM errors, and, thus, we might consider this as the pure georeferencing error.
The latter is better than the original goal of 0.3 pixel (95.5% confidence). Therefore, we argue that the DEMinduced error might take the remaining part: ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 0:3 2 À 0:22 2 p = 0.2 pixel or 2 m. Georeferencing error and DEM-induced error would thus be of nearly equal size. If we compare this 2-m value with the above derived Table 4. Statistics of the length of the predicted orthophoto displacements over Austria for all four tracks. Units are in meter, n is the number of pixels (size 10 × 10 m 2 ) from which this statistic is derived.  ,798,420 437,041,938 656,675,159 190,131,224 95.5% ranges from Table 4, we see that PlanetDEM would not allow for this accuracy (i.e. reaching the 0.3 pixel multi-temporal error between tracks). It is important to understand that the predicted displacement errors in Table 4 refer to the true location of each orthophoto pixel. If two Sentinel-2 orthophotos from adjacent orbits are to be compared, then larger DEM-induced displacements must be expected. With the presented method, these displacements between orthophotos of adjacent orbits can also be predicted simply by the sum of the displacement matrices D of the respective individual orbits. For example, if an orthophoto of track 1 is to be compared with an orthophoto of track 2, then the statistics of the predicted displacements would be the following: 95.5% would be between ± 8 m and 98.8% would be between ± 16.5 m. The maximum displacement would be 73 m.

Verification
A proper method for verifying these predicted orthophoto displacements is the comparison of actual Sentinel-2 orthophotos with reference orthophotos. The latter were provided by the Austrian National Mapping Agency (BEV). Originally, these orthophotos, which are derived from aerial images, have a resolution of 0.25 m. For this study, they were downsampled to 1 m.
Before all data were transformed into LLC, the original reference orthophotos and the Sentinel-2 orthophotos referred to different datums. Thus, at first, the absolute georeferencing of the Sentinel-2 orthophotos should be checked. According to Gascon et al. (2017), the accuracy of the absolute geolocation (without GCPs) is currently 5 m (mean value) and 10 m (at 95.5% confidence).
At an earlier stage of this study, we checked the absolute geolocation accuracy of the two Sentinel-2 orthophotos acquired over the state Tyrol on 13 June 2017 and 26 June 2017. For this purpose, Harris corners were first extracted in the Sentinel and the reference orthophotos, matched using SURF features (Bay, Ess, Tuytelaars, & Van Gool, 2008) and then refined using least squares matching. Afterwards, the distances between matching points in each Sentinel orthophoto and the reference were computed. The mean of these distances was 6.2 m for the scene from 13 June 2017 and 4.2 m for the one from 26 June 2017. These mean values fit very well to the mean value of 5 m from Gascon et al. (2017). If a 2D affine transformation is applied on the Sentinel orthophotos, whose parameters are derived from the matching points, the RMS values of these differences improve a little bit to 3.3 m. 2 Still, we used the Sentinel-2 orthophotos in their original delivered georeferencing because of the reason explained in the next paragraph.
Checking the predicted displacements by comparing Sentinel-2 orthophotos with reference orthophotos is challenging. Because no artificial points are available, we are limited to natural points, whose identification accuracy will often be only around 0.5-1 pixel (and thus overtops possible errors from the absolute georeferencing). Therefore, it is almost impossible to check predicted errors that are less than or around one orthophoto pixel. This, however, is the predicted error for more than 99.8% of Austria.     Thus, we focus on the spots with the largest errors. The min and max values in Table 4 can reach 60 m (i.e. 6 pixels in the Sentinel-2 orthophoto). These spots were first detected automatically, and then manually investigated by comparing the Sentinel with the reference orthophotos.
The automatic detection started by converting the displacement matrix D of each track into a logical array D′ using D′ = abs(D) > 30 m (thus limiting to errors larger than three orthophoto pixels, in anticipation of the difficulties in comparing natural points). Afterwards, the connected components of D′ were computed. All components with an area larger than 50 pixels, or with maximum displacement larger than 40 m, were investigated manually, subsequently. The distribution of these detected spots is shown in Figure 9(a).
Many of these spots could not be used, because they were covered by clouds in the used Sentinel-2 orthophoto, they were in a shadow area in the Sentinel orthophoto or the reference orthophoto, or   no particular natural point could be found that was clearly identifiable in both orthophotos.
The following Figures 10-17 show for each track two spots where natural points could be identified clearly. The point is always marked in the reference orthophoto by a green circle. The location with the very same coordinates is then marked also in the Sentinel orthophoto and the colour-coded displacement map. The legend of the latter is shown in Figure 9(b). The IDs of these points is TN, where T is the track number (1-4) and N is the point counter per track (1, 2).
In all eight investigated spots, the predicted and the actually measurable displacement fit quite well. Table 5 lists these values again. For the location 21, the difference between predicted and measured is the largest with nearly 20 m. All spots with large displacements are close to terrain edges. This was expected because edges cannot be represented well in PlanetDEM with 90-m grid width. The location 21 is the peak of a very steep mountain. In principal, it could be possible that this peak is also not optimally represented in the used reference AustriaDEM with 10-m grid width.
For the state Tyrol, where spot 21 is located, another ALS-based DEM, but with 5-m grid width, is available: TyrolDEM. Therefore, the computation of the orthophoto displacement prediction was repeated with this DEM for the area of Tyrol. In the last column of Table 5, the resulting predicted displacements based on TyrolDEM are listed. Now, the predicted displacement for location 21 fits very well to the measured one, thus showing the general problem that a small grid width is required for a proper grid-based representation of peaks.
Investigating the grid width of the DEM to be used for Sentinel-2 orthophotos generation In Section 5.2, it was found that 95.5% of all pixels in the Sentinel-2 orthophotos in Austria have a PlanetDEM-induced displacement roughly between ±3.8 m, which might be slightly not enough to meet the expected multi-temporal geo-registration quality.
We could ask now the question to which extent this relation will change, if a DEM with a better resolution (i.e. smaller grid width) would be used for the orthophoto generation. We can try answering this, by using the reference TyrolDEM with its very high resolution of 5 m. For this investigation, the TyrolDEM is downsampled to DEMs with various larger grid widths by simple averaging. Then, each of these DEMs is tested against the original 5-m resolution version using the very same method as in Section 5.1. Table 6, which has the same structure as Table 4, lists the resulting statistics of the displacements for track 1. These statistics can be interpreted in the following ways: • TyrolDEM downsampled to 90 m has 95.5% between −2.2 and 2.5 m, and thus performs better than PlanetDEM (grid width also 90 m), which has 95.5% between −4.1 and 4.4 m. Even TyrolDEM downsampled to 150 m is a bit better with 95.5% between −3.6 and 4.5 m. The interpretation of this is that the grid width of the DEM used for orthophoto generation is only one aspect. Equally, if not even more important, is the accuracy of the individual grid points (i.e. the method used for acquiring the original elevation data). TyrolDEM (and Austria-DEM)  stems from ALS, whereas PlanetDEM is derived from SRTM and ASTER. If the accuracy of the measurement method is the important issue, then it also would seem questionable, if an SRTM/ASTER-based DEM with higher resolution would deliver the required quality. • Of course, since the 150-m version is derived from the 5-m reference, this comparison is not fully independent and a bit biased. Therefore, the independent PhotoDEM with grid width 100 m is also used in this analysis. It has 95.5% between −3.2 and 4.0 m and thus lies between the 90-m and the 150-m versions of TyrolDEM, however, much closer to the 150-m than the 90m version. This gives an idea about this bias, but still, a rough estimation on the effect of the changing grid width is possible. • The maximum displacement may reach up to approximately 50 m. This maximum value stays fairly constant over all tested grid widths. The interpretation of this is that the larger displacements at edges and peaks occur fairly easily once these terrain features are not represented in the grid. A circumstance that can already happen if the grid width is changed a little with respect to the reference grid width (5 m in this case). • If we take up the perspective that georeferencing and DEM-induced errors equally contribute to the multi-temporal geo-registration quality (see Section 5.2), then 2 m would be the limit for the 95.5% range. Taking into account the bias mentioned, this necessary quality could quite likely be achieved using a DEM with 50-m grid width, which, additionally, should be ALS-or Photogrammetry-based.

Summary and conclusions
At first, the height errors in PlanetDEM (having 90-m grid width), which is used for the orthorectification of Sentinel-2 images, were quantified by comparing it with the ALS-based reference AustriaDEM. The standard deviation of the height differences over entire Austria is 12 m and 95% of the differences are within −26 and 26 m. Generally, the vertical accuracy in the lowlands is much better than in the mountainous regions. In PlanetDEM, an irregular tile structure became evident, which suggests that the alpine regions are differently processed than the lowlandsprobably an artefact of the reprocessing of the original SRTM and ASTER data by PlanetObserver. In some of these alpine tiles, the heights are systematically wrong by 50 m. These elevation errors cause displacements in the Sentinel-2 Level-1C orthophotos, which were investigated next. Despite the very large errors in PlanetDEM, these predicted displacements are rather small. 99.8% of all pixels have a PlanetDEM-induced displacement between ±10 m (i.e. one Sentinel-2 pixel) for all four investigated tracks. Still, at a few spots (0.2% of all pixels), the location errors can become much largerreaching up to 60 m. Table 6. Statistics of the length of the predicted orthophoto displacements over Austria for DEMs with different grid widths for track 1. TyrolDEM acts as reference DEM. For the column PlanetDEM, the actual PlanetDEM was used. The values there are slightly different to the ones in Table 4 (column of track 1), because TyrolDEM and AustriaDEM are not identical in Tyrol. PhotoDEM refers to the DEM with 100-m grid width that is derived using photogrammetry. For the other columns downsampled versions of TyrolDEM were usedindicated by the respective grid width in the first row. Units are in meter, n is the number of pixels (size 5 × 5 m 2 ) from which this statistic is derived. The reason why the larger errors in PlanetDEM produce rather small errors in the orthophotos is mostly explained by the very small opening angle of the MSI of roughly 21°. Partly, it is also "pure luck" that the very large errors in PlanetDEM cause no big orthophoto displacements. These large height errors are located close to the ground tracks of the orbits. There, they have no effect because of the almost nadir looking direction.
By comparing real Sentinel-2 Level-1C orthophotos (GSD 10 m) with reference orthophotos (GSD 1 m), the predicted large displacements (ranging between 25 and 65 m) were checked at selected points in all four satellite tracks over Austria. At all eight points, the predicted and measureable displacement agreed up to a few meters.
Once the GRI is established, the multi-temporal geo-registration quality shall be 3 m (95.5% confidence). In Gascon et al. (2017), the authors simulate the usage of the future GRI by means of GCPs. They obtain a multi-temporal registration error of 0.22 pixel (95.5% confidence) for orthophotos acquired from the same relative orbit, where the result is largely independent of the DEM errors. If we assume that this is the pure georeferencing error, and allow the DEM-induced error to take the remaining part of the multi-temporal geo-registration quality, then the DEM-induced error may only amount to around 2 m. In this case, PlanetDEM would not allow for this accuracy in the mountainous regions, because there, according to the predictions of this study, 95.5% of the errors are between ±3.8 m.
In the last analysis, the effect of changing the grid width was investigated. It was concluded that an ALSor Photogrammetry-based DEM with 50-m grid width could be sufficient to meet the mentioned requirement of 2 m. Peaks, however, may not be well represented even in a DEM with a very small grid width. Consequently, at peaks larger displacements have to be anticipated (up to 60 m in this study).
Finally, it should be mentioned that all DEMs in this study were terrain models. Therefore, any object above the terrain (e.g. multistorey buildings or large trees) will be tilted in the orthophoto. This displacement is independent from the resolution or accuracy of the used terrain modeleven in flat areas. At the swath border over a horizontal plane, this error can reach around 20% of the object height, which, in case of a forest edge with 25-m height, would be 5 m (or half a pixel). It is important to remember this circumstanceespecially when comparing multi-temporal orthophotos of different orbits, where the errors can add up due to opposite viewing directions.
with R ¼ T 1 j j. From the first two equations, it immediately follows that X k n $ Â P ð ÞÂn $ ; (A1) which has two solutions. However, because one solution is close to P and the other is on the other side of the ellipsoid, the right solution is found easily using P T Á X>0. Afterwards, the third equation is simply applied as S P ¼ X=jXj Á R.
A.2. Computing the direction vector t P of the projection ray in P Once S P is found, the projection ray in the spatial Cartesian coordinate system is defined as the straight line from S P to P. In LLC, the map of the projection ray is a bended line, whose direction vector t P in P is needed for computing the monoplotting in LLC. It is found by numerical differentiation as follows: t P ¼ P 10 À P with P 10 ¼ ψ À1 P 10 ð Þ and P 10 ¼ P À S P jP À S P j Á 10 (A2) (i.e. P 10 is an auxiliary point on the projection ray 10 m away from P).