Factors affecting relative height and ground elevation estimations of GEDI among forest types across the conterminous USA

ABSTRACT The Global Ecosystem Dynamics Investigation (GEDI), a new spaceborne LiDAR system of the National Aeronautics and Space Administration (NASA), has the potential to revolutionize global measurements of vertical vegetation structure. However, GEDI performance among different forest types, and the factors influencing that performance, need to be evaluated against similar measurements from existing airborne LiDAR platforms. Ideally, comparisons across diverse forest types will inform future work quantifying biomass or mapping species habitats. Thus, we compared the second version of the GEDI L2A product (GEDI V2) with Airborne Observation Platform (AOP) leaf-on LiDAR data across 33 National Ecological Observatory Network (NEON) sites. GEDI ground elevation and relative height (RH) were compared with waveforms simulated from discrete point cloud airborne laser scanning (ALS) LiDAR. Results indicated that GEDI V2 obtained high accuracy on ground elevation and RH100 estimations (3σ) with RMSEs of 1.38 m and 2.62 m, respectively. GEDI produced forest height estimations (RH100) for all 12 forest types with a %RMSE below 25%. GEDI RHs were sensitive to ground finding accuracy, and GEDI performance of RH estimation varied with the forest profiles of different forest types. For factors influencing GEDI performance, more than 21% of GEDI RH95 and 33% of ground elevation variations can be explained by land surface attributes, observing sensor system characteristics, and the collection time differences between GEDI and NEON LiDAR. Furthermore, geolocation error remains an essential factor affecting GEDI performance, and its effect varies among forest and land cover types, especially for canopy height estimation. The findings reported here can provide insights to guide and enhance future GEDI-based global forest structure mapping and applications.


Introduction
Knowing vertical forest structure is vital for assessing ecosystem carbon storage, annual productivity, and biodiversity (Dupuis et al. 2020; Schneider et al. 2020). Forest vertical structure plays a key role in modeling ecosystem functioning (Fahey et al. 2019; Leila et al. 2020). The past 20 years of published research demonstrate the value of relative heights (RHn) measured by airborne LiDAR for estimating biomass and canopy volume (Lefsky et al. 2002; Lim et al. 2003). A relative height is defined as the distance between the elevation of the detected ground return and the elevation at which n% of the accumulated waveform energy is reached, where n ranges from 1 to 100 (García et al. 2020). Forest profiles reveal vertical structural complexity (e.g. canopy rugosity, rumple index), gap distributions, and the vertical distribution of foliage, which can indicate the characteristics of light penetration to the forest floor and be used as an indicator of forest disturbance (Brown et al. 2020; Fahey et al. 2020) and for wildlife biodiversity monitoring (Coops et al. 2016). Thus, forest vertical canopy structure is critical for global change studies over vegetated regions of the world.
Airborne laser scanning (ALS) is a mature technology that has been widely used to understand the vertical structure and functioning of regional or even nationwide forests (Atkins et al. 2018; Nord-Larsen and Schumacher 2012; Kotivuori, Korhonen, and Packalen 2016). Nationwide airborne LiDAR projects span broad geographic areas that contain varied and complex land cover types, forest ecotypes, and tree species (Coops et al. 2016). Some examples of nationwide ALS projects include the National Ecological Observatory Network (NEON) in the USA (Ordway et al. 2021), the ALS campaign of the Swedish National Mapping Agency, Sweden (Nilsson et al. 2017), and the ALS collected in permanent field plots of national forest inventories in Norway (Hauglin et al. 2021). Yet, directly using ALS for global or near-global forest structure measurement is still challenging (Lang, Schindler, and Dirk Wegner 2019).
Spaceborne LiDAR missions, such as the Ice, Cloud, and land Elevation Satellite (ICESat) and the Global Ecosystem Dynamics Investigation (GEDI), which sample datasets globally or near globally (Neumann et al. 2019; Magruder, Neumann, and Kurtz 2021), provide a promising way to make up for the current deficiency of global LiDAR-measured vertical structure and ecosystem functioning estimations (Schutz et al. 2005; Neuenschwander et al. 2020; Lefsky 2010; Marc et al. 2011). In December 2018, the National Aeronautics and Space Administration (NASA) launched GEDI, which is mounted on the International Space Station (ISS) and samples land surfaces between 51.6°S and 51.6°N. GEDI is the first spaceborne high-resolution (25 m footprint diameter) laser ranging system specifically designed to measure vertical structure and topography globally (Dubayah et al. 2020c). GEDI was designed to provide about 10 billion cloud-free observations during its initial two-year mission, but has already exceeded this goal, and operations have now been extended by NASA until September 2023 (http://database.eohandbook.com/database/missiontable.aspx). With its wide coverage and high resolution, GEDI should significantly improve the understanding of the role of forest structure in global ecology. Prelaunch studies reported that GEDI waveforms simulated from airborne discrete point cloud LiDAR were adequate for estimating forest vertical structure and predicting forest functioning (Schneider et al. 2020). The effectiveness of the simulator was validated by Hancock et al. (2019) for several ALS instruments in tropical and temperate forests. However, the actual GEDI performance of forest structure estimation must be evaluated against existing higher-resolution observations to better understand and apply GEDI in forest structure and functioning estimates.
To date, GEDI products have been released in two versions. The first version (GEDI V1), available since January 2020, has been widely compared to highly accurate ALS forest structure parameters across different regions and forest types (summarized in Table 1). Previous studies using GEDI V1 showed various performance accuracies for terrain and canopy height estimations with the root mean square errors (RMSE) ranging from less than 1.00 m to 6.13 m and from 2.09 m to 9.20 m, respectively (Adam et al. 2020;Dorado-Roda et al. 2021;Quiros et al. 2021;Guerra-Hernández and Pascual 2021;Potapov et al. 2021). GEDI V1 performance for canopy height measurements was influenced by observation data quality which can be influenced by factors of GEDI preprocessing algorithms, land cover types, slope, canopy cover, height, acquisition time, beam type, sensitivity, geolocation uncertainty, and landscape heterogeneity (Adam et al. 2020;Roy et al. 2021). Fayad et al. (2021b) found that around 14% of height errors, indicated by RMSE values, were attributable to distorted waveforms in steep areas compared to flat areas. Roy et al. (2021) showed that GEDI performance depends on GEDI horizontal geolocation accuracy and land cover heterogeneity around GEDI footprints. Quiros et al. (2021) demonstrated that canopy height, slope, land cover type, canopy cover, and GEDI sensitivity played important roles in GEDI performance. Adam et al. (2020) reported that GEDI observation time and beam types could influence GEDI performance for terrain and canopy height estimations.
Compared to GEDI V1, the second version (GEDI V2), released in April 2021, reduced the horizontal geolocation error from 23.80 m (1σ) to 13.20 m (1σ) (Dubayah et al. 2021). In addition, GEDI V2 reports the relative heights from an optimum GEDI pre-processing algorithm rather than from the fixed algorithm used in GEDI V1. The optimum algorithm is the one recommended from among the six waveform pre-processing algorithms (a1-6), which use different thresholds for denoising the waveform and detecting the waveform-signal range above the noise (Supplementary Table 1). Additionally, more quality metrics and land surface characteristics were added to the GEDI V2 products, for example, indicators of geolocation errors (latitude_bin0_error and longitude_bin0_error), elevation errors (elevation_bin0_error), slope, and tree cover (landsat_treecover). For GEDI V2 performance assessment, Liu et al. (2021) illustrated that GEDI V2 estimated terrain elevation and canopy height with RMSE values of 4.03 m and 5.02 m, respectively. Liu et al. (2021) also indicated that GEDI performance varied with land cover type, slope, canopy cover, height, acquisition time, beam type, and sensitivity. However, several questions remain unanswered, which include (1) How does the GEDI V2 performance vary among RHs? (2) For GEDI V2, what are the relative performances for forest structure measurement from the bottom-to-top canopy among different forest types?
(3) What is the degree to which separate factors influence GEDI V2 performance, and can an understanding of the effects of such factors be used as a quality metric for filtering GEDI observations? To answer these questions, this paper aims to provide insights on (1) GEDI V2 overall performance for ground elevation and RH estimations; (2) GEDI relative performance for ground elevation and RH estimations among different forest types; and (3) the quantified influence of various factors on GEDI V2 relative height (RH) and ground elevation estimations in forests, together with the effectiveness of using knowledge of these factor effects to provide quality metrics for GEDI observation filtering. To address these research questions, GEDI V2 relative height and ground elevation estimates were analyzed against ALS LiDAR data from 33 NEON sites as reference data. These 33 NEON sites were distributed among 15 ecological climate domains and mainly covered 12 forest types across the conterminous United States (CONUS).

Table 1. Previous studies of GEDI performance of terrain and relative height estimates. V1 and V2 are the first and second versions of GEDI, respectively. IQR is the interquartile range. The accuracy metrics r, R², Bias, MAE, %Bias, and %RMSE are the Pearson correlation coefficient, coefficient of determination, mean error, mean absolute error, percent Bias, and percent RMSE, respectively.
Table columns: Studies; Forest types; Reference datasets; Accuracy; Factors.

Study area
Assessing GEDI performance for forest structure measurement would ideally span variability in canopy structures with a range of canopy cover, height, and vertical biomass distribution (Silva et al. 2018; Powell et al. 2010). Thus, the 33 sites of NEON LiDAR datasets distributed across the CONUS are used as the study area (Figure 1). These sites are distributed over various climate zones with elevations ranging from 15.00 m to 3513.00 m, annual average temperatures ranging from −0.30°C to 22.50°C, and annual average precipitation ranging from 173 mm to 2530 mm (Supplementary Table 2). Over these 33 sites, forests include evergreen coniferous forest, mixed forest, and deciduous broadleaf forest. Among forest areas, 12 forest types with different tree species were selected to analyze GEDI performance among different forest types.  complex vertical layers in flat areas, such as Oak/Pine, Oak/Hickory, and Oak/Gum/Cypress, with average layer numbers of 5, 4, and 4, respectively. Broadleaf forest types with high canopy cover include Maple/Beech/Birch. Details of the canopy height, cover, canopy layer, and topographic slope of the 12 forest types can be found in Supplementary Table 3.

GEDI-data
Each GEDI laser pulse illuminates a footprint of ~25 m diameter, and the reflected pulse is stored as a full waveform that records vertical objects continuously from the ground to the top of the canopy (Qi et al. 2019). The GEDI instrument consists of three lasers, two of which are used at full power (power beams). The third laser is split into two beams (coverage beams) (Luthcke et al. 2019). These four beams are dithered to generate eight observation beam tracks. The GEDI lasers are High Output Maximum Efficiency Resonator (HOMER) lasers with a 15.6 ns pulse and a 242 Hz pulse repetition rate at a wavelength of 1064 nm. GEDI products of the relative height (L2A) (Dubayah et al. 2020a) and vertical profile metrics (L2B) (Dubayah et al. 2021) from 2019 to 2020 were downloaded from NASA Earthdata Search (https://search.earthdata.nasa.gov/search) based on a spatial query submitted to the GEDI Finder (https://lpdaacsvc.cr.usgs.gov/services/gedifinder).

NEON LiDAR datasets
The discrete point cloud LiDAR datasets released by the NEON Airborne Observation Platform (AOP) (hereafter referred to as NEON LiDAR) were used as reference data in this study (National Ecological Observatory Network 2022). The NEON LiDAR datasets were all collected by a consistent instrument at the peak greenness of the growing season and were reported to be accurate for ground elevation estimation, with errors lower than 0.15 m on open, flat, and reflective surfaces (NEON.DOC.002293) (Goulden 2014). The RMSE of canopy height models (CHM) extracted from NEON LiDAR was lower than 1 m (NEON.DOC.002387). A total area of 5705.63 km² of NEON LiDAR was employed in this study, all of which was collected during 2017-2019 over the 33 NEON sites. Specifically, 74.80%, 14.39%, and 10.81% of the NEON LiDAR data overlapping GEDI footprints were collected in 2019, 2018, and 2017, respectively. The NEON aircraft collected datasets at an altitude of 1 km above ground level. The NEON LiDAR instrument was the Optech ALTM Gemini, with a wavelength of 1064 nm, a laser footprint diameter of 0.25 m, 1-4 laser pulses per square meter, a maximum scan angle ranging from 18 degrees to 42 degrees, and an average point cloud density from 2 points/m² to 18 points/m². Details of the NEON LiDAR at each site can be found in Supplementary Table 2. To collocate the NEON LiDAR with GEDI, the 1 m resolution NEON CHM datasets were also downloaded from https://data.neonscience.org/data-products/explore (National Ecological Observatory Network 2022b).

Forest type inventory
The forest type map used in this study, with a 250-m spatial resolution, was produced by the USDA Forest Service Forest Inventory and Analysis (FIA) Program and Remote Sensing Applications Center (RSAC).

Methods
The methods of this study include the pre-processing of the NEON LiDAR and GEDI datasets, followed by (1) analysis of GEDI performance over all observations, (2) analysis of GEDI performance among different forest types, and (3) analysis of the factors influencing GEDI performance in forest areas and of the effectiveness of using those factors for GEDI filtering.

Pre-processing
(1) NEON LiDAR pre-processing
To make the NEON LiDAR comparable with GEDI waveform LiDAR, we simulated GEDI waveforms from NEON LiDAR using the rGEDI simulator (referred to as GEDI-simulated waveforms) (Hancock et al. 2019). In principle, rGEDI convolves the point cloud density into a Gaussian pseudo-waveform by weighting each point according to its distance from the footprint center (Blair and Michelle 1999). The GEDI-simulated waveforms have a footprint size of 25 m, a full width at half maximum (FWHM) of 15.6 ns, and a digitizer resolution of 15 cm, but no Gaussian noise. Good quality GEDI-simulated waveforms were selected using a conservative point density threshold (3 points/m²) to ensure the simulator performance. To reduce the interactive influence of RH and ground elevation, the actual ground elevation of NEON LiDAR, generated by interpolating the ground points within each footprint, was used rather than the ground elevation extracted from GEDI-simulated waveforms. However, the ground elevation of NEON LiDAR is referenced to the GEOID12A vertical datum, which differs from the WGS-84 ellipsoid reference of GEDI. Thus, the ground elevation of NEON LiDAR was converted to the WGS-84 ellipsoid using the VDatum software (https://vdatum.noaa.gov/). The relative canopy heights do not need conversion because they are relative to the ground elevation regardless of the vertical reference datum.
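As a rough illustration of the Gaussian pseudo-waveform idea described above, the following sketch builds a waveform from a handful of points and reads relative heights against an externally supplied ground elevation. It is not the rGEDI implementation: the function names, the 5.5 m horizontal footprint sigma, and the plain (x, y, z) tuples are assumptions made for the example. Only the 15.6 ns FWHM and the 15 cm bin size come from the text.

```python
import math

SPEED_OF_LIGHT = 0.299792458  # metres per nanosecond
FWHM_TO_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))  # about 2.355


def pulse_sigma_m(fwhm_ns=15.6):
    """Convert a pulse FWHM in ns to a range (two-way travel) sigma in metres."""
    return fwhm_ns / FWHM_TO_SIGMA * SPEED_OF_LIGHT / 2.0


def simulate_waveform(points, center, footprint_sigma=5.5, bin_size=0.15):
    """Return (bin elevations, energy) for a list of (x, y, z) points.

    footprint_sigma is an assumed horizontal Gaussian width in metres.
    """
    sigma_z = pulse_sigma_m()
    cx, cy = center
    z_min = min(p[2] for p in points) - 4.0 * sigma_z
    z_max = max(p[2] for p in points) + 4.0 * sigma_z
    n_bins = int(math.ceil((z_max - z_min) / bin_size)) + 1
    elevations = [z_min + i * bin_size for i in range(n_bins)]
    energy = [0.0] * n_bins
    for x, y, z in points:
        # Horizontal weight: points near the footprint centre count more.
        d2 = (x - cx) ** 2 + (y - cy) ** 2
        weight = math.exp(-d2 / (2.0 * footprint_sigma ** 2))
        # Vertical spreading: each point becomes a Gaussian pulse in range.
        for i, elev in enumerate(elevations):
            energy[i] += weight * math.exp(-((elev - z) ** 2) / (2.0 * sigma_z ** 2))
    return elevations, energy


def relative_height(elevations, energy, ground_elevation, percent):
    """Height above ground where cumulative energy (from the bottom) reaches percent."""
    target = sum(energy) * percent / 100.0
    cumulative = 0.0
    for elev, value in zip(elevations, energy):
        cumulative += value
        if cumulative >= target:
            return elev - ground_elevation
    return elevations[-1] - ground_elevation
```

Passing the ground elevation in explicitly mirrors the choice in the text to use the interpolated NEON ground surface rather than a ground mode detected in the simulated waveform.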
For factors that potentially influenced the consistency of NEON LiDAR datasets, beam density, point density, and scan angle within GEDI-simulated waveform footprints were used as quality indicators. Since the NEON LiDAR datasets used in this study were collected during the peak greenness season of 2017-2019, the absolute year difference (year interval) and date difference (date interval) between GEDI and NEON LiDAR were calculated. As a result, the year interval ranged from 0 to 3, with 51.73%, 38.51%, 7.06%, and 2.70% of samples having 0-, 1-, 2-, and 3-year differences, respectively. It is worth noting that the date interval, ranging from 0 to 233, did not consider the year interval; for example, day of year (DOY) 150 of 2017 and DOY 151 of 2018 have a 1-day difference. As described by Roy et al. (2021), the higher the landscape heterogeneity, the more influential geolocation errors are for GEDI canopy height estimation. Thus, the coefficient of variation of the CHM within a 75 × 75 m focal area (i.e. three times the GEDI footprint diameter) was calculated to represent the landscape heterogeneity (Kamoske et al. 2019).
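The interval and heterogeneity measures above are simple enough to sketch directly. In the example below, the function names, the nested-list CHM layout, and the use of the population standard deviation are assumptions for illustration. The window half-width of 37 cells gives a 75 × 75 window on a 1 m CHM.

```python
import math
import statistics
from datetime import date


def year_interval(gedi_date, neon_date):
    """Absolute difference in calendar year."""
    return abs(gedi_date.year - neon_date.year)


def date_interval(gedi_date, neon_date):
    """Absolute difference in day of year, ignoring the year itself."""
    return abs(gedi_date.timetuple().tm_yday - neon_date.timetuple().tm_yday)


def chm_coefficient_of_variation(chm, row, col, half_width=37):
    """Coefficient of variation of CHM cells in a square window centred on (row, col)."""
    r0, r1 = max(0, row - half_width), min(len(chm), row + half_width + 1)
    c0, c1 = max(0, col - half_width), min(len(chm[0]), col + half_width + 1)
    cells = [chm[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    mean = statistics.fmean(cells)
    if mean == 0:
        return math.nan  # undefined for a window with zero mean height
    return statistics.pstdev(cells) / mean


# DOY 150 of 2017 versus DOY 151 of 2018, as in the example in the text.
print(date_interval(date(2017, 5, 30), date(2018, 5, 31)))  # 1
print(year_interval(date(2017, 5, 30), date(2018, 5, 31)))  # 1
```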
(2) GEDI data pre-processing
It is worth noting that GEDI includes six pre-processing algorithms, which use different thresholds for waveform smoothing and for detecting the start and end of the signal (Adam et al. 2020; Fayad et al. 2021a). The relative heights (from RH01 to RH100) derived from the optimum algorithm (recorded by aN) and the ground elevation (elev_lowestmode) were extracted from GEDI L2A datasets. Then, the high-quality GEDI observations were identified as those with quality_flag (referred to as QA) equal to 1, degrade_flag equal to 0, landsat_water_persistence lower than 10%, and urban_proportion equal to 0. The opposite values of these metrics represent waveforms affected by cloud or poor signal quality, orbit-degraded observations, water-covered observations, and urban-covered observations, respectively. To further filter out potential orbit degradation that was not marked by degrade_flag, GEDI observations were removed if the differences between the GEDI ground elevation and the two Digital Elevation Models (DEMs) were greater than 50 m. These two DEMs, which have the same vertical datum as GEDI, were the SRTM (Shuttle Radar Topography Mission) DEM (digital_elevation_model_srtm) and the TanDEM-X DEM (digital_elevation_model) (Fayad et al. 2021a; Dubayah et al. 2021). As a result, 146,292 GEDI footprints within the bounding boxes of the NEON sites were extracted from the GEDI L2A product. A leaf_off_flag equal to 0 was used to select leaf-on observations. In addition, evergreen forest observations during the leaf-off season were retained where pft_class was equal to 1 (i.e. evergreen needleleaf trees) or 2 (i.e. evergreen broadleaf trees). Thus, 55,217 leaf-on samples, with DOY ranging from 14 to 349, were employed to assess GEDI performance of relative height and ground elevation estimations.
To correctly select forested footprints, the forest footprints were identified as observations with landsat_treecover and modis_treecover greater than 0% and canopy height greater than 5 m (Nophea and Putz 2009).
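The screening rules above can be expressed as three small predicates. This is a minimal sketch, assuming each footprint has already been read into a plain dictionary keyed by the GEDI L2A field names quoted in the text. The dictionary layout, the `rh100` key used for canopy height, and the choice to reject a footprint when it differs from either DEM by more than 50 m are assumptions for illustration.

```python
EVERGREEN_PFT_CLASSES = {1, 2}  # evergreen needleleaf, evergreen broadleaf
DEM_FIELDS = ("digital_elevation_model_srtm", "digital_elevation_model")


def is_high_quality(shot, max_dem_difference=50.0):
    """Quality, degradation, water, urban, and DEM-difference screening."""
    if shot["quality_flag"] != 1 or shot["degrade_flag"] != 0:
        return False
    if shot["landsat_water_persistence"] >= 10 or shot["urban_proportion"] != 0:
        return False
    # Assumption: a large difference from either DEM rejects the footprint.
    return all(
        abs(shot["elev_lowestmode"] - shot[field]) <= max_dem_difference
        for field in DEM_FIELDS
    )


def is_leaf_on(shot):
    """Leaf-on observations, plus evergreen observations in the leaf-off season."""
    return shot["leaf_off_flag"] == 0 or shot["pft_class"] in EVERGREEN_PFT_CLASSES


def is_forest(shot, min_height=5.0):
    """Forest footprints: tree cover above 0% in both products and height above 5 m."""
    return (
        shot["landsat_treecover"] > 0
        and shot["modis_treecover"] > 0
        and shot["rh100"] > min_height  # hypothetical key for canopy height
    )


def select_forest_shots(shots):
    return [s for s in shots if is_high_quality(s) and is_leaf_on(s) and is_forest(s)]
```

Keeping the three predicates separate makes it easy to report how many footprints each step removes, which is the kind of sample count the text gives after each stage.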
For the analysis of factors influencing GEDI accuracy, metrics indicating location (longitude and latitude of the footprint centroid), topography (ground elevation), vertical layers (number of peaks, representing the number of detected modes in the received GEDI waveform), and tree cover (landsat_treecover) were extracted from GEDI L2A. The topographic slope was calculated from the SRTM DEM. Other metrics were also extracted from GEDI L2A, including the signal-to-noise ratio (SNR) of the waveform (sensitivity), solar elevation (night for values lower than 0, otherwise day), laser pulse energy (beam type; BEAM0000, BEAM0001, BEAM0010, and BEAM0011 are coverage beams, and the others are power beams), GEDI footprint location errors (latitude_bin0_error and longitude_bin0_error), and elevation errors (Bin0 elevation errors, represented by elevation_bin0_error). To describe the complexity of the vertical profile, the foliage height diversity (FHD, fhd_normal) was extracted from GEDI L2B; FHD is the Shannon entropy of the vertical foliage profile (Valbuena et al. 2012).
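Two of the factors above reduce to short computations. The sketch below labels a beam as coverage or power from its name and computes a Shannon entropy over a vertical foliage profile. The function names and the list-of-layer-values input are assumptions for illustration, and the binning used for fhd_normal in the L2B product may differ from this simplified form.

```python
import math

COVERAGE_BEAMS = {"BEAM0000", "BEAM0001", "BEAM0010", "BEAM0011"}


def beam_type(beam_name):
    """Label a GEDI beam as 'coverage' or 'power' from its name."""
    return "coverage" if beam_name in COVERAGE_BEAMS else "power"


def foliage_height_diversity(profile):
    """Shannon entropy, -sum(p_i * ln p_i), of a vertical foliage profile.

    profile holds non-negative foliage amounts per height layer.
    """
    total = sum(profile)
    if total <= 0:
        return 0.0
    entropy = 0.0
    for value in profile:
        if value > 0:
            share = value / total
            entropy -= share * math.log(share)
    return entropy
```

A profile with foliage spread evenly over four layers gives ln 4, about 1.386, while a single-layer profile gives 0, so larger values indicate a more vertically complex canopy.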

GEDI performance analysis over all observations
Taking the NEON LiDAR-derived ground elevation and GEDI-simulated RH100 as the reference, the accuracy of GEDI ground elevation and RH100 was analyzed using the Bias (1), MAE (2), RMSE (3), r (4), %Bias (5), and %RMSE (6) as statistical metrics. The variations of GEDI performance in ground elevation and RH100 estimations were compared across the 33 NEON sites. Additionally, GEDI RH measurement accuracies across the vertical profile were evaluated by comparing the Bias, MAE, RMSE, %RMSE, and %Bias from RH01 to RH100.

\[ \mathrm{Bias} = \frac{1}{n}\sum_{j=1}^{n}\left(H_{j,\mathrm{GEDI}} - H_{j,\mathrm{GEDI\text{-}simulated}}\right) \tag{1} \]
\[ \mathrm{MAE} = \frac{1}{n}\sum_{j=1}^{n}\left|H_{j,\mathrm{GEDI}} - H_{j,\mathrm{GEDI\text{-}simulated}}\right| \tag{2} \]
\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{j=1}^{n}\left(H_{j,\mathrm{GEDI}} - H_{j,\mathrm{GEDI\text{-}simulated}}\right)^{2}}{n}} \tag{3} \]
\[ r = \frac{\sum_{j=1}^{n}\left(H_{j,\mathrm{GEDI}} - \bar{H}_{\mathrm{GEDI}}\right)\left(H_{j,\mathrm{GEDI\text{-}simulated}} - \bar{H}_{\mathrm{GEDI\text{-}simulated}}\right)}{\sqrt{\sum_{j=1}^{n}\left(H_{j,\mathrm{GEDI}} - \bar{H}_{\mathrm{GEDI}}\right)^{2}}\,\sqrt{\sum_{j=1}^{n}\left(H_{j,\mathrm{GEDI\text{-}simulated}} - \bar{H}_{\mathrm{GEDI\text{-}simulated}}\right)^{2}}} \tag{4} \]
\[ \%\mathrm{Bias} = \frac{\mathrm{Bias}}{\bar{H}_{\mathrm{GEDI\text{-}simulated}}} \times 100\% \tag{5} \]
\[ \%\mathrm{RMSE} = \frac{\mathrm{RMSE}}{\bar{H}_{\mathrm{GEDI\text{-}simulated}}} \times 100\% \tag{6} \]

where n is the total number of footprint samples, and H_{j,GEDI} and H_{j,GEDI-simulated} are the RH (or ground elevation) of the jth sample from the GEDI and GEDI-simulated waveforms, respectively. \bar{H}_{GEDI} and \bar{H}_{GEDI-simulated} are the mean RH (or ground elevation) of the GEDI and GEDI-simulated waveforms, respectively. Note that positive Bias values indicate that GEDI estimates the canopy to be higher than the GEDI-simulated data, while negative values indicate that GEDI estimates the canopy to be shorter than the GEDI-simulated data.
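The six accuracy metrics can be computed in a few lines. The following is a minimal sketch, assuming paired lists of GEDI and GEDI-simulated values for the same footprints. The function name and the choice to normalize %Bias and %RMSE by the mean of the reference (GEDI-simulated) values are assumptions for illustration. The sign convention (GEDI minus reference) follows the note that positive Bias means GEDI is higher.

```python
import math


def accuracy_metrics(gedi, reference):
    """Bias, MAE, RMSE, Pearson r, %Bias, and %RMSE of GEDI against a reference."""
    n = len(gedi)
    differences = [g - s for g, s in zip(gedi, reference)]
    bias = sum(differences) / n
    mae = sum(abs(d) for d in differences) / n
    rmse = math.sqrt(sum(d * d for d in differences) / n)

    mean_gedi = sum(gedi) / n
    mean_ref = sum(reference) / n
    covariance = sum((g - mean_gedi) * (s - mean_ref) for g, s in zip(gedi, reference))
    spread_gedi = sum((g - mean_gedi) ** 2 for g in gedi)
    spread_ref = sum((s - mean_ref) ** 2 for s in reference)
    r = covariance / math.sqrt(spread_gedi * spread_ref)

    return {
        "bias": bias,
        "mae": mae,
        "rmse": rmse,
        "r": r,
        # Assumption: percentages are relative to the mean reference value.
        "pct_bias": 100.0 * bias / mean_ref,
        "pct_rmse": 100.0 * rmse / mean_ref,
    }


metrics = accuracy_metrics([18.0, 19.0, 22.0], [19.0, 20.0, 24.0])
print(round(metrics["bias"], 3), round(metrics["rmse"], 3))
```

The function raises a ZeroDivisionError when either series is constant, because the correlation is undefined in that case.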

GEDI performance analysis across different forest types
GEDI is expected to perform inconsistently among forest types because of differences in tree species, forest cover, height, and canopy layers (Adam et al. 2020; Dorado-Roda et al. 2021; Quiros et al. 2021; Guerra-Hernández and Pascual 2021). Thus, the accuracies of ground elevation and RHs were assessed for different forest types against the RHs extracted from GEDI-simulated waveforms, taking the Bias, MAE, RMSE, %RMSE, and %Bias as accuracy metrics. Considering the possible interactive influence of ground and RH estimation, we further analyzed the relationship between GEDI biases of ground elevation and relative height estimates among forest types.

Analysis of factors influencing GEDI performance
(1) The importance of factors
To investigate factors affecting GEDI performance of ground elevation and RH estimates over forested areas, the relative importance of 19 different factors (Table 2) was calculated. These 19 factors included land surface attributes, GEDI and NEON LiDAR sensor system characteristics, and the time interval between the GEDI and NEON LiDAR acquisitions. The amount of variance in the RH difference between GEDI and GEDI-simulated waveforms explained by each factor was estimated using the "lmg" metric of the "relaimpo" R package. The "lmg" metric decomposes R² into non-negative contributions (Grömping 2009), recognizing that the variance explained by each factor depends on its order relative to the other factors in a regression model. Thus, averaging the contribution of each factor to the explained variance over regression models with different orderings of the factors measures the relative importance of factors, whether correlated or not (Grömping 2006, 2009).
(2) Analysis of the effectiveness of using factors to filter GEDI datasets
It is necessary to examine the effects of individual factors contributing to GEDI observation errors and the possibility of using this knowledge to improve ground elevation and relative height estimations by filtering GEDI observations. Thus, we analyzed how RMSE changed as the values of each factor increased, to explain the effects of the factors that contributed to GEDI observation errors. To eliminate slope effects on other factors, the analysis of factor effects on GEDI performance was conducted in flat areas (slope <15.00°). The 15.00° threshold was used because the height error introduced by topographic slope to GEDI relative height estimation is negligible below it (Yang et al. 2021; Wang et al. 2019). Then, factors were considered as filters for GEDI observation screening, and the RMSE values and sample counts before and after filtering were compared.
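The order-averaged R² decomposition used for factor importance can be sketched in pure Python for a small number of predictors. This is an illustrative implementation of the averaging idea, not the relaimpo code: the function names and the dictionary-of-columns input are assumptions, and the normal-equation solver assumes the predictors are not perfectly collinear.

```python
import itertools
import math


def _solve(matrix, rhs):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * solution[c] for c in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def r_squared(columns, y):
    """R^2 of an ordinary least squares fit of y on columns plus an intercept."""
    if not columns:
        return 0.0
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(k)]
    beta = _solve(xtx, xty)
    mean_y = sum(y) / n
    ss_total = sum((v - mean_y) ** 2 for v in y)
    ss_resid = sum(
        (y[i] - sum(b * x for b, x in zip(beta, row))) ** 2
        for i, row in enumerate(design)
    )
    return 1.0 - ss_resid / ss_total


def lmg_importance(predictors, y):
    """Average each predictor's R^2 gain over all orderings of the predictors."""
    names = list(predictors)
    p = len(names)
    cache = {}

    def fit(subset):
        key = frozenset(subset)
        if key not in cache:
            cache[key] = r_squared([predictors[name] for name in sorted(key)], y)
        return cache[key]

    shares = {}
    for name in names:
        others = [other for other in names if other != name]
        share = 0.0
        for size in range(p):
            # Fraction of orderings in which exactly `size` predictors precede `name`.
            weight = math.factorial(size) * math.factorial(p - size - 1) / math.factorial(p)
            for subset in itertools.combinations(others, size):
                share += weight * (fit(subset + (name,)) - fit(subset))
        shares[name] = share
    return shares
```

The shares sum to the R² of the full model, which is what allows statements such as "the factors together explained 21% of the variance" to be split among individual factors. The number of sub-models doubles with each added predictor, so this brute-force form is only practical for small factor sets.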
For example, a topographic slope threshold of 15.00° distinguishes GEDI observations in flat areas from those in steep areas, beam type separates power beams from coverage beams, and solar elevation separates night observations from day observations. Sensitivity values greater than 0.94 were used to screen out the low-SNR datasets. Moreover, a 95% left-side confidence interval of the coefficient of variation, FHD, number of peaks, location errors, and Bin0 elevation errors was used to detect extreme observations caused by high landscape heterogeneity, vertical complexity, canopy layers, GEDI geolocation uncertainty, and GEDI elevation uncertainty, respectively. Factors including the longitude, latitude, elevation, tree cover, forest types, and time interval were not considered as filters because these factors were unique to different forest areas and therefore not suitable for general filtering of GEDI points.
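The before-and-after comparison used to judge a filter can be sketched as follows. The function names, the list-of-dictionaries layout, and the reading of the "95% left-side confidence interval" as keeping values at or below the 95th percentile are assumptions made for this example. The 0.94 sensitivity threshold comes from the text.

```python
import math


def percentile(values, q):
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(math.floor(position))
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def filter_effect(samples, keep):
    """RMSE before and after a filter, plus the fraction of samples retained.

    Each sample is a dict with an 'error' entry (GEDI minus reference).
    """
    kept = [s for s in samples if keep(s)]
    return {
        "rmse_before": rmse([s["error"] for s in samples]),
        "rmse_after": rmse([s["error"] for s in kept]),
        "retained": len(kept) / len(samples),
    }


samples = [
    {"error": 1.0, "sensitivity": 0.97, "fhd": 2.1},
    {"error": -1.0, "sensitivity": 0.95, "fhd": 2.4},
    {"error": 5.0, "sensitivity": 0.90, "fhd": 3.3},
    {"error": -5.0, "sensitivity": 0.93, "fhd": 2.8},
]

# Fixed-threshold filter on sensitivity.
print(filter_effect(samples, lambda s: s["sensitivity"] > 0.94))

# Percentile-based filter on FHD (assumed reading of the 95% cutoff).
fhd_cutoff = percentile([s["fhd"] for s in samples], 95)
print(filter_effect(samples, lambda s: s["fhd"] <= fhd_cutoff))
```

Reporting the retained fraction alongside the RMSE change matters because a filter that lowers RMSE only by discarding most footprints is of limited practical use.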
(3) Analysis of dataset pre-processing for GEDI performance evaluation
The pre-processing of GEDI and NEON datasets determines both the quantity and quality of the GEDI datasets that are usable for analysis, potentially influencing how representative the experimental results are for the GEDI application community. Examining the influence of four pre-processing steps in this paper helps illustrate the degree to which the pre-processing methods are reasonable. Thus, comparisons of GEDI RH95 estimations were conducted to answer the following questions: (1) Can using quality_flags from the five remaining algorithms improve GEDI performance compared to using the recommended optimum algorithm in the GEDI L2A product? (2) Does narrowing the allowed difference between GEDI elevation and the TanDEM-X DEM (to 40, 30, 20, or 10 m) yield more accurate GEDI RH95 and ground elevation estimations than the 50 m difference currently used in GEDI pre-processing? (3) How do observations from different phenological periods affect ground elevation estimation compared to growing season observations? (4) How do GEDI-simulated waveforms using different footprint sizes influence GEDI performance evaluations? In addition, RMSE values of GEDI performance before and after colocation were compared among different forest types to further understand the influence of GEDI geolocation uncertainty. Colocation between GEDI and NEON LiDAR was conducted using the R² values between the RHs of GEDI and the NEON LiDAR CHM (Liu et al. 2021; Blair and Michelle 1999).
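One way to picture an R²-based colocation is a grid search over horizontal offsets: a group of footprints is shifted together, and the offset at which GEDI heights agree best with heights sampled from the CHM is retained. The sketch below is only an assumption about how such a search could look, not the procedure of the cited studies. The function names, the 30 m search radius, the 5 m step, and the `reference_height` callback are all hypothetical.

```python
import math


def squared_correlation(a, b):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    if s_aa == 0 or s_bb == 0:
        return 0.0
    return s_ab * s_ab / (s_aa * s_bb)


def best_offset(footprints, gedi_heights, reference_height, max_shift=30.0, step=5.0):
    """Return (dx, dy, r2) for the common shift that maximizes R^2.

    footprints: list of (x, y) reported footprint centres in metres.
    reference_height: hypothetical callable giving a CHM-based height at (x, y).
    """
    steps = int(round(max_shift / step))
    best = (0.0, 0.0, -1.0)
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            dx, dy = i * step, j * step
            sampled = [reference_height(x + dx, y + dy) for x, y in footprints]
            score = squared_correlation(gedi_heights, sampled)
            if score > best[2]:
                best = (dx, dy, score)
    return best


def demo_surface(x, y):
    """Synthetic canopy surface used only to exercise the search."""
    return 20.0 + 10.0 * math.sin(x / 15.0) + 5.0 * math.cos(y / 20.0)


demo_footprints = [((i * 60.0) % 200.0, (i * 37.0) % 150.0) for i in range(12)]
demo_heights = [demo_surface(x + 10.0, y - 5.0) for x, y in demo_footprints]
print(best_offset(demo_footprints, demo_heights, demo_surface))
```

On a flat or linearly sloping canopy surface every offset gives the same R², so a search of this kind only constrains the geolocation where the canopy is heterogeneous.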

Ground elevation and RH100 estimations across the CONUS
The distributions of ground elevation and RH100 differences between GEDI and NEON LiDAR are shown in Figure 2. On average, GEDI observations overestimated ground elevation but underestimated RH100. There were 2.27% and 2.60% observation outliers (Figure 2b). The Bias, MAE, RMSE, %Bias, and %RMSE of ground elevation and RH100 for each NEON site showed the variations of GEDI performance over the CONUS (Figure 3, Supplementary Table 4). GEDI ground elevation and RH100 estimations showed greater biases in the predominantly forested (Figure 1) and topographically heterogeneous coastal regions of the eastern and western CONUS than in the flatter, less forested central CONUS. Overall, for ground elevation, the respective ranges of Bias, MAE, RMSE, %Bias, and %RMSE were from 0.04 m to 3.51 m, 0.17 m to 5.06 m, 0.21 m to 6.90 m, 1.00% to 18.17%, and 3.79% to 31.46%. For RH100, the respective ranges of Bias, MAE, RMSE, %Bias, and %RMSE were from −5.74 m to 0.03 m, 0.38 m to 6.60 m, 0.48 m to 8.78 m, −22.38% to 1.76%, and 13.60% to 31.64%. Differences in Bias among NEON sites of greater than 3.50 m for ground elevation and 5.70 m for RH100 indicate that local factors strongly affect GEDI performance.

Comparison of RHs between GEDI and GEDI-simulated waveforms
The RH05 to RH100 of GEDI and GEDI-simulated waveforms were compared (Figure 4). All pairs of GEDI and GEDI-simulated RHs were positively correlated. The RH95 corresponded most closely between GEDI and GEDI-simulated waveforms, with an MAE of 1.35 m and an RMSE of 2.08 m. Bias values were negative for all RHs, indicating that GEDI consistently underestimated RHs compared to the GEDI-simulated waveforms. The GEDI underestimation was lowest for RH95 (0.35 m) and highest for RH05 (1.73 m). It is worth noting that RH100 underestimated canopy heights by 1.12 m. This means that a 20.00 m canopy height estimated by discrete point cloud LiDAR (GEDI-simulated RH100) would correspond to an 18.88 m canopy height estimation from GEDI. Conversely, a 20.00 m RH95 estimation from LiDAR-derived GEDI-simulated data corresponds more closely, to a 19.66 m GEDI RH95 estimation. This example illustrates the greater similarity between GEDI-simulated and GEDI waveforms for RH95 than for RH100. The %Bias changed in step with Bias, whereas %RMSE, unlike RMSE, was lower for RH100 than for the RHs below RH100.

GEDI performance analysis among different forest types
Comparisons of %RMSE and %Bias among RHs for each forest type (Figure 5 and Supplementary Figure 1) showed that GEDI RH95 corresponded more closely with GEDI-simulated waveforms than RH100. Forest relative height profiles, representing vertical biomass distribution, showed comparable %RMSE differences among forest types. However, mid- to low-RH values in the profiles should be used with caution, because the mismatch between GEDI and GEDI-simulated waveforms is probably more apparent in mid- to low-RH than in high RH given the different technologies used (full-waveform LiDAR has a higher penetration ability than ALS). The GEDI performance of RH100 estimation showed that GEDI had the lowest accuracy in tall forest stands in steep areas (California mixed conifer with %RMSE of 21.39%, Fir/Spruce/Mountain Hemlock with %RMSE of 17.98%) and in high-cover forest types (Maple/Beech/Birch with %RMSE of 18.04%), but performed best in relatively short forests in flat areas, such as Longleaf/Slash pine and Oak/Pine, with %RMSE values of 11.59% and 11.80%, respectively (Figure 5).
The GEDI performance of ground elevation estimation differed among forest types (Table 3). A strong negative relationship between the differences of RH95 and ground elevation was clear for broadleaf forest types, such as Maple/Beech/Birch, Oak/Hickory, and Aspen/Birch (Figure 6). This means that underestimates of canopy height are due in part to poorer signal penetration, which hampers detection of the true ground surface and leads to overestimates of ground elevation. For example, Maple/Beech/Birch with closed canopies, and Douglas-fir and Oak/Hickory with dense understories (Supplementary Table 3), had lower ground energy returns, resulting in larger MAEs and RMSEs than other forest types (Table 3). For needleleaf or mixed forests, this negative relationship was apparent to varying degrees in some forest types (Longleaf/Slash pine, Loblolly/Shortleaf pine, Oak/Pine, and Oak/Gum/Cypress) but minimal or nonexistent in others (Lodgepole pine, Douglas-fir, Fir/Spruce/Mountain Hemlock, California Mixed Conifer). Here, we show that finding true ground is a particular challenge in broadleaf forest types with high tree cover (Supplementary Table 3).

The relative importance of each factor for GEDI performance
Given the much higher correspondence of RH95 than RH100, we used RH95 in the factor importance analysis. Overall, the 19 effect factors (Table 1) explained 33.31% and 21.19% of the variances of GEDI ground elevation and RH95 estimations, respectively (Figure 7). The GEDI RH95 estimation is less concentrated among the factors, but 61.04% of the 21.19% response variance was explained by the number of peaks, coefficient of variation, and tree cover (Figure 7). NEON LiDAR-related factors were relatively unimportant (<5.00%) in GEDI ground elevation estimation, supporting the use of NEON LiDAR as a reference for GEDI performance assessments. A given factor often played different roles in ground elevation and RH95 estimations, with substantially different importance in each. For example, slope was a significant factor for ground elevation estimation, whereas it was much less critical for RH95 estimation. The coefficient of variation was one of the most important factors for RH95 estimation, but it was of minimal importance for ground elevation estimation. The parameters built into the GEDI L2A product, including solar elevation, beam type, and bin0 elevation errors, were the least important factors in ground elevation and RH95 estimations, explaining less than 5.00% of both respective variances (Figure 7). Figure 8 shows that, for GEDI RH95 estimation, increases in slope, tree cover, FHD, location errors, bin0 elevation errors, scan angle, date interval, and year interval degraded results, while increases in sensitivity, number of peaks, and beam density improved results. Forest type, tree cover, scan angle, and year interval caused the largest RMSE fluctuations in GEDI RH95 estimation, with 4.85, 1.21, 1.27, and 2.92 m, respectively. Sensitivity, number of peaks, solar elevation, and date interval contributed 0.50-1.00 m to RMSE variations. The coefficient of variation, beam type, elevation, and point density contributed less than 0.50 m to GEDI RH95 RMSE variations.

The relationship between GEDI performance and factors
The RMSE of GEDI ground elevation and RH95 estimation showed the influence of each factor. Here, we compared the samples of GEDI observations before and after using factors as quality flags. Individually, using factors as filters reduced the number of GEDI observations to varying degrees (Figures 9-10). For instance, removing GEDI observations in steep areas improved GEDI performance for both RH95 and ground elevation estimation (Figure 8a) but decreased the available GEDI observations by 36.00%, consequently removing many good-quality observations that had small differences between GEDI and NEON LiDAR. Filtering by sensitivity improved GEDI performance for ground elevation estimation (Figure 8g) but removed 18.66% of GEDI observations. The effects of using beam type and solar elevation as filters on ground elevation and RH95 estimations were forest-type-specific (Tables 4-5). Moreover, using beam type and solar elevation as filters removed 45.04% and 37.86% of GEDI observations, respectively. While sacrificing relatively few GEDI observations (<5.00% of samples; Figures 9-10), using coefficient of variation, FHD, sensitivity, number of peaks, location errors, and bin0 elevation errors as filters had minimal effect, reducing RMSE by less than 0.05 m in both RH95 and ground elevation estimations.
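The trade-off described above, between RMSE reduction and the share of observations lost, can be sketched as follows. This is a minimal illustration under stated assumptions: `filter_tradeoff` is a hypothetical helper, the slope threshold of 20 degrees is arbitrary, and the sample values are invented for demonstration.

```python
import numpy as np

def filter_tradeoff(diff, keep):
    """Compare RMSE before and after applying a boolean filter.

    diff : GEDI minus reference differences (m)
    keep : boolean mask, True for observations retained by the filter
    Returns the RMSE reduction (positive = improvement) and % of samples removed.
    """
    diff = np.asarray(diff, dtype=float)
    keep = np.asarray(keep, dtype=bool)
    if not keep.any():
        raise ValueError("filter removes every observation")
    rmse_all = np.sqrt(np.mean(diff ** 2))
    rmse_kept = np.sqrt(np.mean(diff[keep] ** 2))
    return {
        "delta_rmse": rmse_all - rmse_kept,
        "pct_removed": 100.0 * (1.0 - keep.mean()),
    }

# Hypothetical example: drop footprints on slopes steeper than 20 degrees
diff = np.array([1.0, 1.0, 3.0, 3.0])      # RH95 differences (m)
slope = np.array([5.0, 10.0, 30.0, 40.0])  # slope (degrees)
result = filter_tradeoff(diff, slope <= 20.0)
```

Reporting both numbers together makes it explicit when a filter improves RMSE only by discarding a large fraction of otherwise usable footprints.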

Analysis of dataset pre-processing for GEDI performance evaluation
The effectiveness of filtering GEDI observations in pre-processing was analyzed here. Filters included the quality_flags derived from the six GEDI waveform pre-processing algorithms, the DEM difference, and the phenology flag (leaf_off_flag) for selecting growing season observations. The results indicated that optimizing the pre-processing algorithm substantially improved the GEDI performance of ground elevation and RH estimations compared to results without selecting the optimum algorithm (GEDI V1), as shown in Supplementary Table 5. Moreover, using the six-algorithm quality_flags rather than the optimum algorithm alone can further reduce the RMSE values of GEDI estimates of ground elevation and RH95 (Figure 11a).
We further explored how the TanDEM-X DEM values built into GEDI L2A can help filter out GEDI data with biased ground elevation. As expected, narrowing the DEM difference from 50.00 m to 10.00 m reduced the RMSE values of GEDI estimates of RH95 and ground elevation from 5.86 m to 5.01 m and from 2.67 m to 2.48 m, respectively (Figure 11b). Notably, narrowing the DEM difference sacrificed 22% of GEDI observations, including many good-quality observations and some unique observations with tall canopy height (Supplementary Figure 2). Thus, further filtering by DEM difference is debatable and may be unsuitable for many users, such as those aggregating shots over a region of interest.
Regarding the effects of phenological differences on GEDI ground elevation estimation, constraining data to periods with similar phenology reduced bias and RMSE by 0.36 m and 0.85 m, respectively, compared to periods of different phenology (Figure 12). Thus, use of data with similar phenology is recommended to obtain consistent observations between GEDI and other LiDAR datasets.
It is also notable that discrete point cloud LiDAR with a small footprint has a much higher spatial resolution than GEDI with its relatively large footprint. To investigate the potential effects of these scale discrepancies, we compared the differences between GEDI and GEDI-simulated waveforms as a function of simulation footprint size (Figure 13). The results show that increasing the GEDI-simulated footprint size increased the accuracy of GEDI ground elevation estimation but decreased the accuracy of RH95 and RH100 estimations (Supplementary Table 6). The errors introduced by using LiDAR datasets of different spatial scales, thus, cannot be ignored. Apart from the footprint size of the simulated GEDI waveform, any misregistration between GEDI and NEON LiDAR, dominantly caused by GEDI geolocation errors, is another factor influencing the consistency between GEDI and GEDI-simulated waveforms. A comparison of top canopy height (RH100) estimations before and after colocation illustrates the effect of GEDI geolocation errors (Supplementary Table 7). Results show that GEDI RH100 improved substantially, with values of Bias, MAE, and RMSE less than 0.07, 2.64, and 3.83 m after geolocation correction (Supplementary Table 7). The colocation effects on GEDI RH100 estimation vary among forest types (Figure 14) and land cover types (Supplementary Figure 3).
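The role of the simulation footprint size can be illustrated with a toy waveform simulator. This is only a sketch and not the simulator used in the study: it assumes uniform weighting of returns inside a circular footprint (a real simulator would typically weight returns by distance from the footprint center), and the defaults for `bin_size` and `pulse_fwhm` are illustrative values, not parameters taken from the paper.

```python
import numpy as np

def simulate_waveform(x, y, z, center, footprint_diameter=25.0,
                      bin_size=0.15, pulse_fwhm=2.3):
    """Toy waveform: vertical histogram of returns blurred by a Gaussian pulse.

    Assumptions (illustrative only): uniform horizontal weighting, a Gaussian
    pulse whose width is given as FWHM in metres, fixed vertical bin size.
    """
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    inside = np.hypot(x - center[0], y - center[1]) <= footprint_diameter / 2.0
    if not inside.any():
        raise ValueError("no returns fall inside the footprint")
    zi = z[inside]
    pad = 3.0 * pulse_fwhm  # padding so the blurred signal is not truncated
    edges = np.arange(zi.min() - pad, zi.max() + pad + bin_size, bin_size)
    counts, edges = np.histogram(zi, bins=edges)
    # Convert FWHM to a Gaussian standard deviation, expressed in bins
    sigma_bins = pulse_fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0))) / bin_size
    half = int(np.ceil(4.0 * sigma_bins))
    kernel = np.exp(-0.5 * (np.arange(-half, half + 1) / sigma_bins) ** 2)
    kernel /= kernel.sum()
    wave = np.convolve(counts, kernel, mode="same")
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, wave

# Hypothetical returns: ground at 0 m, canopy at 10 m, a tall tree 30 m away
x = [0.0, 1.0, -2.0, 30.0]
y = [0.0, 1.0, 3.0, 0.0]
z = [0.0, 10.0, 10.0, 50.0]
elev_small, wave_small = simulate_waveform(x, y, z, center=(0.0, 0.0))
elev_large, wave_large = simulate_waveform(x, y, z, center=(0.0, 0.0),
                                           footprint_diameter=80.0)
```

In this example the larger footprint captures the distant tall tree, so the top of the simulated waveform rises. This is one way a change in footprint size can alter RH95 and RH100 without any change in the underlying point cloud.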

Accuracy analysis of GEDI performance of ground elevation and RH estimations
This paper focuses on the GEDI V2 performance of ground elevation and relative height estimations. The representativeness of our results is supported by the consistent and accurate NEON LiDAR reference datasets, which span complex geomorphic units (the CONUS) and various land cover and forest types, including evergreen needleleaf forests, evergreen broadleaf forests, and deciduous broadleaf forests (Table 2). In line with previous studies (A. Liu et al. 2021; Guerra-Hernández and Pascual 2021; Potapov et al. 2021), our results showed that, on average, GEDI observations overestimated ground elevation but underestimated RH100 (Figure 2). However, underestimation of ground elevation and overestimation of RH100 have been reported in some cases (Adam et al. 2020; Quiros et al. 2021). Such overestimations of GEDI RH100 were potentially caused by using outdated DTM and CHM as reference datasets, which underestimated canopy height because tree growth was ignored (Adam et al. 2020). The underestimation of ground elevation was site-specific and could be caused by outliers dispersed over a relatively small evaluation area and/or complex rugged terrain (Quiros et al. 2021). By using a consistent instrument (NEON LiDAR) over a vast area, with more than 90% of datasets collected within a 1-year interval, we consider the results of this study to be more robust and convincing than previous research that evaluated GEDI performance using inconsistent or small reference areas (Adam et al. 2020; Guerra-Hernández and Pascual 2021; Potapov et al. 2021).
For effective use, RH100 errors are expected to be less than 1 m (Luthcke et al. 2019; Wenlu and Dubayah 2016; Duncanson et al. 2020). After removing outliers exceeding 3σ (Figure 2), the accuracy of the GEDI V2 RH100 estimations reported here, with a Bias value of 0.96 m, meets this standard. A similar study for the USA (Liu et al. 2021) reported RMSE values of 4.03 m and 5.02 m for GEDI ground elevation and RH100 estimations. Our results, with corrected RMSE values of 1.38 m (ground elevation) and 2.62 m (RH100), show the large effects that relatively few outliers can have on GEDI performance evaluations.
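The sensitivity of RMSE to a few outliers can be illustrated with a simple 3σ screen. This is a minimal sketch, assuming a single pass that removes differences lying more than three standard deviations from the mean difference; the exact procedure used in the study may differ, and the sample values are invented.

```python
import numpy as np

def sigma_clip_mask(diff, n_sigma=3.0):
    """Boolean mask of differences within n_sigma standard deviations of the mean.

    Assumption: a single clipping pass around the mean difference.
    """
    diff = np.asarray(diff, dtype=float)
    return np.abs(diff - diff.mean()) <= n_sigma * diff.std()

def rmse(diff):
    """Root mean square of a set of differences."""
    return np.sqrt(np.mean(np.asarray(diff, dtype=float) ** 2))

# Hypothetical differences (m): 20 well-matched footprints and one gross outlier
diff = np.array([0.5, -0.5] * 10 + [50.0])
keep = sigma_clip_mask(diff)
rmse_raw = rmse(diff)            # dominated by the single outlier
rmse_clipped = rmse(diff[keep])  # reflects the well-matched footprints
```

Here one outlier among 21 footprints raises the RMSE from 0.5 m to more than 10 m, which mirrors the kind of gap between uncorrected and corrected values discussed above.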

Accuracy analysis of GEDI performance among different forest types
Average results obscure large variations of GEDI performance among the 12 temperate forest types investigated here (Figure 5, Supplementary Table 3). In general, GEDI RH100 estimation was most accurate for forest stands with less canopy cover on flat ground (e.g. Longleaf/Slash pine and Oak/Pine) and least accurate for steep areas and forests with dense canopy cover (e.g. Maple/Beech/Birch and California mixed conifer) (Figure 5). The %RMSE for RH100 estimations ranged from 11.59% (Longleaf/Slash pine, needleleaf evergreen stands in flat areas) to 21.39% (California mixed conifer, tall stands in steep areas) in this study. Despite the considerable disparity, these results compare favorably to previous studies that reported %RMSE values of 21.10% in relatively open forest (mean cover less than 42.00%) (Quiros et al. 2021), or ranging from 27.68% to 41.37% across an agro-forestry-pastoral ecosystem comprising cork oak, Pinus pinaster, and Pinus pinea (Dorado-Roda et al. 2021).
The ability of GEDI to accurately and precisely capture vertical forest structure varied among different forest types (Figure 5). For example, Maple/Beech/Birch forests with typically dense canopy cover have relatively large errors for both top and understory relative height estimation (Figure 6) due to the challenge for GEDI to find true ground surfaces (Liu et al. 2021). In comparison, California mixed conifer forests, composed of uneven-aged trees (Supplementary Table 3) spreading across mountainous areas, perform worst in top of canopy height estimation due to strong variations of both vertical and horizontal structural patterns (Wiggins et al. 2019; Gonzalez et al. 2010). Consequently, incorporation of forest type into GEDI estimations is expected to further improve our ability to quantify forest structure and biomass over large regions.
Figure 11. Improvement of GEDI performance of ground elevation and RH95 estimates from filtering. Positive values indicated that RMSE values were higher before filtering using the quality_flags of all six algorithms (a) and differences between GEDI and TanDEM-X (b). ∆RMSE was the RMSE difference before and after using filter.

Factors of GEDI performance and their potential for filtering
Apart from GEDI performance analysis, this study explored the factors influencing GEDI estimations and the potential utility of using factors as filters for screening GEDI observations (Figures 7-14, Tables 4-5, Supplementary Figure 2). As described in section 3.3, GEDI performance of ground elevation and RH estimations was most strongly influenced by land surface characteristics and the GEDI sensor system factors (Figures 7-8). Although filtering GEDI observations based on the examined factors can be used to improve statistical results, this improvement comes with the risk of sacrificing many good-quality GEDI observations whose RH and ground elevation values are consistent with the reference datasets (Figures 9 and 10). Some of the analyzed factors interacted with each other. For example, topographic slope yielded inclined ground and vegetation surfaces within a footprint, resulting in complex pulse signal interactions with the land surface (Wang et al. 2019). As slope increases, waveform extents are broadened and the mixture of ground and vegetation signals is exacerbated (Lee et al. 2011; Xing et al. 2010; Wang et al. 2019). Consequently, the geometric optical relationship between laser pulses and land surfaces in steep areas differs from that in flat areas. Slope also modulates the effects of other factors on GEDI ground elevation estimation, which differ between flat and steep areas (Figure 8e-h, k-l). Slope-affected factors included FHD, elevation, sensitivity, number of peaks, beam density, point density, and scan angle. Even though flat-area GEDI observations were used to analyze the other factors influencing GEDI performance, there are other potential interactions among factors that should be considered in future work.
For instance, the potential for aggravation of geolocation errors in steep forest areas is demonstrated by the comparable RH95 estimation improvements from removing steep-area observations (Figure 8b) and from applying geolocation corrections (Figure 14). The influences of beam type and solar elevation were forest-type-specific (Table 5), which is inconsistent with previous studies that reported that power beams performed better than coverage beams, and night observations better than day observations, in RH estimation (Liu et al. 2021; Dorado-Roda et al. 2021).
These results probably reveal that GEDI V2 was already using a conservative threshold to set the quality_flag and degrade_flag for GEDI observation filtering for forests over the CONUS. Notably, vegetation phenology differences have the potential to influence LiDAR waveform characteristics, impacting assessments of GEDI performance for ground elevation and relative height estimation (Shao et al. 2019). Prior studies have generally ignored the influences of time gaps of DOY and phenology between GEDI and reference LiDAR collections (Adam et al. 2020;Guerra-Hernández and Pascual 2021). Based on our analyses of the importance of individual effect factors, the phenological effects of time-of-year differences between NEON LiDAR and GEDI observations were more remarkable than those related to the number of years between observations (Figure 7). Thus, matching phenological conditions while minimizing the time interval between GEDI and corresponding reference LiDAR is recommended.
Previous studies (Potapov et al. 2021; Quiros et al. 2021; Liu et al. 2021; Adam et al. 2020) demonstrated that GEDI performance varied with the pre-processing algorithm used to derive RH. Our paper demonstrated the effectiveness of using the RH derived from the optimum pre-processing algorithm, which reduced the RMSE of RH100 estimation relative to results obtained without it (Figure 11a, Supplementary Table 5).
Additional challenges exist for any assessment because GEDI V2 datasets are known to have geolocation errors averaging more than 10 m (Liu et al. 2021; Hansen et al. 2020; Dubayah et al. 2021). We investigated the effectiveness of using a colocation calibration method based on the correlation between GEDI and GEDI-simulated waveforms to mitigate misregistration between GEDI and ALS (Fayad et al. 2021b; Quiros et al. 2021; Roy et al. 2021). Comparisons between results with and without colocation calibration showed that GEDI geolocation errors introduced uncertainties for GEDI RH100 estimation, with average Bias, MAE, and RMSE values of 1.06, 1.18, and 1.79 m, respectively. However, actual geolocation effects on canopy height estimation were forest-type-specific (Figure 14). Conversely, for ground elevation estimation, geolocation-related uncertainties accounted for less than 0.40 m of RMSE, supporting previous reports of slight effects from geolocation errors on ground elevation estimation (Liu et al. 2021). Alignment of GEDI with GEDI-simulated waveforms depends on the availability of airborne LiDAR datasets from which to derive them (Roy et al. 2021). The currently limited availability of airborne LiDAR datasets makes global colocation correction impracticable for large-scale forest monitoring applications at present. Our findings do show the importance of improving the geolocation of GEDI observations for future forest structure assessments.
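The idea behind correlation-based colocation can be sketched as a search over candidate horizontal offsets. This is a simplified illustration, not the calibration method used in the study: it assumes a simulated waveform has already been generated at each candidate offset, and it treats a single footprint on its own. The name `best_offset` and the example waveforms are hypothetical.

```python
import numpy as np

def best_offset(gedi_wave, simulated_by_offset):
    """Pick the candidate offset whose simulated waveform best matches GEDI.

    gedi_wave           : 1-D array, observed waveform
    simulated_by_offset : dict mapping (dx, dy) offsets in metres to simulated
                          waveforms sampled on the same vertical bins
    Returns the offset with the highest Pearson correlation and that correlation.
    """
    gedi_wave = np.asarray(gedi_wave, dtype=float)
    best, best_r = None, -np.inf
    for offset, sim in simulated_by_offset.items():
        r = np.corrcoef(gedi_wave, np.asarray(sim, dtype=float))[0, 1]
        if r > best_r:  # NaN correlations (flat waveforms) never pass this test
            best, best_r = offset, r
    return best, best_r

# Hypothetical waveforms on a common set of vertical bins
gedi = np.array([0.0, 1.0, 3.0, 1.0, 0.0, 2.0, 0.0])
candidates = {
    (0.0, 0.0): np.array([2.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0]),
    (5.0, -5.0): 2.0 * gedi + 1.0,  # same shape, different gain and baseline
}
offset, r = best_offset(gedi, candidates)
```

Pearson correlation is insensitive to gain and baseline differences between the two waveforms, so the match is driven by waveform shape alone.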
Removal of poor-quality GEDI observations so that better results are achieved might be an alternative for mitigating errors and improving applications (Potapov et al. 2021;Marc et al. 2011). Removal of less than 5.00% of GEDI observations, using location errors as a filter, did not completely filter out GEDI observations with large geographic errors (Figure 8j). Thus, methods for eliminating the geolocation uncertainty of GEDI observation should be further developed. Compared to homogeneous surfaces, the influence of geographic errors on GEDI canopy height measurements is more pronounced in heterogeneous regions (Roy et al. 2021). The negative relationship between GEDI canopy height measurement errors and the variations of canopy height within a focal area (Figure 8d) illuminates the possibility of using variations over space to filter out GEDI observations in heterogeneous areas. Thus, future work may focus on developing quantitative relationships between spatial variations and GEDI measurement errors of canopy height to alleviate the negative influence of GEDI geolocation errors.

Limitations and uncertainties
This paper assessed GEDI performance of ground elevation and RH estimations across the CONUS. Taking the CONUS as the study area, this paper focused on temperate forests. Thus, insight into GEDI performance of ground elevation and RH estimations in tropical forests was not provided. Because the instrument is mounted on the ISS, the latitudinal coverage of GEDI observations is limited, and GEDI cannot observe boreal forests. Therefore, study of these areas will require GEDI to be combined with other spaceborne LiDAR systems to conduct global forest structure monitoring (Liu et al. 2022). Additionally, this paper focused on factors influencing GEDI performance for estimating canopy heights and ground elevation. Canopy heights are key inputs for models that estimate forest function, and so data such as GEDI observations hold promise for enhancing these assessments. However, future work is needed to assess the potential effects of biased canopy height inputs on forest function estimation.
Another overlooked limitation in defining GEDI observation uncertainties is the effect of different LiDAR systems (waveform or discrete point cloud) in ALS, which undermines the consistency of forest structure extraction (Sun Guoqing and Jon Ranson 2000). For example, RH estimations from discrete point cloud LiDAR data can differ from those defined using waveform data. Discrete systems define RH by the quantiles of point density above ground level, while waveform systems define RH as the elevation difference between the level at which n% of the cumulative waveform energy is reached and the ground elevation (Kamoske et al. 2019; Harding et al. 2001). Using a simulator to unify different LiDAR datasets is one solution (Blair and Michelle 1999; Hancock et al. 2019). The GEDI simulator, named rGEDI, transforms the density of discrete point cloud LiDAR within a 25 m diameter footprint to a GEDI-simulated waveform that has the same FWHM and digitizer resolution as the GEDI waveform. In this way, RHs derived from GEDI-simulated waveforms have the same definition as RHs of GEDI waveforms. Moreover, the rGEDI simulator accuracy in forested areas has %RMSE ranging from 8.02% to 13.13% and RMSE ranging from 1.66 m to 4.01 m (Supplementary Table 7), which compares favorably with rGEDI simulator accuracies reported in tropical and temperate forests, with %RMSE of ~17.53% and RMSE of 5.70 m (Hancock et al. 2019). Thus, it is feasible to use GEDI-simulated waveforms when comparing GEDI and discrete point cloud LiDAR.
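The two RH definitions contrasted above can be sketched side by side. This is a simplified illustration, assuming a noise-free waveform that has already been trimmed to the signal extent and a known ground elevation; operational waveform processing involves additional steps (noise removal, smoothing, ground mode detection) that are omitted here. The function names are hypothetical.

```python
import numpy as np

def waveform_rh(elevation, energy, ground_elevation, percents):
    """RH from a waveform: height above ground at which n% of energy is reached.

    Energy is accumulated from the lowest bin upward. Assumes non-negative,
    noise-free energy values and a known ground elevation.
    """
    elevation = np.asarray(elevation, dtype=float)
    energy = np.asarray(energy, dtype=float)
    order = np.argsort(elevation)                 # bottom to top
    elev = elevation[order]
    cum = np.cumsum(energy[order])
    cum_pct = 100.0 * (cum / cum[-1])             # last value is exactly 100
    idx = np.searchsorted(cum_pct, np.asarray(percents, dtype=float), side="left")
    idx = np.minimum(idx, elev.size - 1)
    return elev[idx] - ground_elevation

def point_cloud_rh(heights_above_ground, percents):
    """RH from discrete returns: quantiles of return heights above ground."""
    return np.percentile(np.asarray(heights_above_ground, dtype=float), percents)

# Hypothetical waveform listed top-down: strong ground return at 100 m
elevation = [104.0, 103.0, 102.0, 101.0, 100.0]
energy = [2.0, 2.0, 1.0, 1.0, 4.0]
rh_wave = waveform_rh(elevation, energy, ground_elevation=100.0,
                      percents=[50, 95, 100])

# Hypothetical discrete returns (heights above ground, m)
rh_points = point_cloud_rh([0.0, 1.0, 2.0, 3.0, 4.0], [50, 100])
```

Because one definition weights by returned energy and the other by return counts, the same canopy can yield different mid- and low-RH values, which is the inconsistency that simulating waveforms from the point cloud is meant to remove.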

Conclusion
GEDI observations have substantial potential for improving global forest structure estimation and monitoring, which could tremendously enhance our understanding of global ecology. This study used GEDI-simulated waveforms derived from NEON LiDAR at 33 sites covering different forest types across the CONUS as reference data to evaluate GEDI V2 performance in estimating ground elevation and the relative heights of vegetation. We analyzed the relative importance of 19 factors known to affect GEDI results and explored the effectiveness of using selected factors to filter GEDI datasets to improve results. Overall, GEDI underestimated RH but overestimated ground elevation compared to NEON LiDAR. Greater than 33.00% and 21.00% of the variations of GEDI ground elevation and RH95 measurement bias, respectively, can be explained by land surface attributes, sensor system factors, and the time interval between GEDI and NEON LiDAR observations. For the GEDI RH measurements among forest types, performance was sensitive to ground finding accuracy and vertical profiles of canopy density, indicating that particular care is necessary when interpreting results for broadleaved forest types. Geolocation errors remain a problem for GEDI estimates of ground elevation and RH, and their effects vary by forest type (Figure 14) and land cover type (Supplementary Figure 3). The findings reported here are expected to provide insights into the characterization of GEDI observations and guide future studies of GEDI-based forest structure estimations.
Numata was also supported by NASA LCLUC (80NSSC20K0365).