Health assessment of plantations based on LiDAR canopy spatial structure parameters

ABSTRACT The Yellow River Delta (YRD) has China's largest artificial Robinia pseudoacacia forest, which was planted in the late 1970s and suffered extensive dieback in the 1990s. The health grade of the R.pseudoacacia forest (named canopy vigor grade, CVG) could be achieved by using high-resolution images and canopy vigor indicators (CVIs). However, a previous study showed that there was no significant correlation between CVG and the field-estimated aboveground biomass (AGB) of R.pseudoacacia forest. Therefore, this study aims to construct forest health indicators (FHIs) based on canopy spatial structure parameters extracted from LiDAR. The FHIs included Weibull_α (the scale parameter of the Weibull density function that reflects the shape of the tree canopy), VCI (vertical complexity index), sdCC (the standard deviation of canopy cover), H99 (the 99th percentile height) and cvLAD (the coefficient of variation of leaf area density), and could significantly distinguish three forest health grades (FHG) (p < 0.05). The FHG was positively correlated with forest AGB (rs  = 0.51, p = 0.004), and the similarity value with CVG was 63.33%. The results of this study confirmed that the FHIs can reflect both canopy vigor and tree productivity, and distinguish forest health status without prior classification information.


Introduction
As the main body of terrestrial ecosystems, forests play a key role in both the global carbon cycle and climate change mitigation (Pan et al. 2011). Forests are degrading and even disappearing at an alarming rate worldwide (Hansen et al. 2013). Planted forests are the priority choice for the management of degraded ecosystems, which can prevent wind and fix sand, maintain soil and water loss and improve saline soil (Jiang, Zhu, and Zeng 2003). Planted forests account for approximately 7% of the world's total forest area, of which China's planted forests account for 6.9×107 ha, which ranks China first in the world (FAO 2020). However, since the 1990s, the growth and development of planted forests in China have been stagnating and even dying, which has greatly reduced the carbon storage of forests (Jiang, Zhu, and Zeng 2003;Zhu et al. 2005). Therefore, the rapid and accurate assessment of the health status of plantations is helpful for sustainable forest management and carbon sequestration of terrestrial forest ecosystems.
for measuring forest biophysical parameters, such as tree height, volume, and biomass (Lausch et al. 2017), whose changes can reflect tree growth (Naesset and Gobakken 2008;Silva et al. 2016). LiDAR can also easily calculate a variety of parameters that describe forest ecological conditions, the most critical of which is canopy cover (Lausch et al. 2017;Leeuwen and Nieuwenhuis 2010). The vertical distribution differences of branches and leaves in the canopy can be distinguished by quantifying the canopy profile with a Weibull distribution (Tompalski et al. 2015). In addition, LiDAR-derived structural parameters can describe both the vertical (Zhang, Lin, and She 2017) and horizontal heterogeneity (Moran, Rowell, and Seielstad 2018) of stands, which have a close bearing on forest health and biodiversity (Gao et al. 2014). In summary, the LiDAR-derived parameters can well represent the changes in canopy spatial structure and therefore quantitatively evaluate the forest health status.
The purpose of this study is to realize automatic and rapid identification of forest health status using structural parameters extracted from LiDAR without prior classification information. The proposed FHI system is expected to comprehensively reflect canopy vigor and forest productivity.

Study area
The YRD is located on the east coast of China ( Figure 1a) and has a warm temperate subhumid continental monsoon climate. It has a shallow water table, low soil productivity, and widespread saline soil (Wu, Liu, and Huang 2016). Because of the characteristics of nitrogen fixation, the prevention of soil erosion, and salt tolerance, R.pseudoacacia trees have been planted as the main afforestation tree species in the YRD since the 1970s. However, since the 1990s, these trees have suffered from many diebacks and even deaths. Up to 2013, four R.pseudoacacia forests ( Figure 1b) covered a total area of 27.94 km 2 , of which 41.46% were healthy, 36.09% suffered from medium dieback and the remaining 22.45% from severe dieback or were completely dead (Wang et al. 2015). The Gudao forest, with a total area of 5 km 2 , was selected as our study area (Figure 1c). The forests are mostly pure stands of mature stands with tree spacing ranging from 3 m to 4 m. The forest structure is simple, with herbaceous species in the understory.

UAV-LiDAR data
The LiDAR point cloud data were acquired in June 2017 using the Li-Air airborne LiDAR System (GreenValley, China), which was equipped with a Riegl VUX-1 four-return laser scanner, an IMU (Novatel), and a dual frequency GPS (Novatel). An eight-rotor UAV platform was flying at 120 m above the ground with a ground speed of 4.8 m·s −1 and a cruising radius of 2 km. The average point density was 70 m −2 with a ranging accuracy of 10 mm, and the spot diameter was approximately 50 mm. The acquired data were georeferenced to the UTM coordinate system (geodatum: WGS84).

Field data
Similar to a previous two-level sampling design (Wang et al. 2015), samples were collected at two levels, namely, 30 m × 30 and 10 m × 10 m. In each 30 m × 30 m plot, five 10 m × 10 m subplots were arranged, deployed in the four corners and one at the center. The geographic coordinates of these plots were recorded with the GEOXT6000 GPS (Trimble Navigation Company) locator. In each subplot, we selected one standard tree and evaluated the five CVIs (Table 1) referring to the Crown Condition Classification Guide (Schomaker et al. 2007). The live crown ratio was measured by the ratio of live crown to tree height. The crown diameter was expressed by the ratio of the crown diameter to the healthy trees of the same species. Visually reconstruct damaged or missing crown outline by comparing crowns of healthy trees of the same species. Then determine the crown dieback by assessing what percentage of the outlined area is dieback. The crown density and foliar transparency were estimated based on the crown density-foliage transparency card that proposed by the USFS. The card provides a table of percentages for various combinations of density  and transparency. For each plot, the indicators were obtained by averaging the five standard trees, and each indicator was classified as Class 1, 2, or 3 based on the thresholds. Finally, the CVG at the plot level was determined by specific rules (Table 2). Tables 1 and 2 refer to the previous studies (Wang et al. 2015). A total of 30 field plots (the black dots in Figure 1c) were investigated in June 2017, June 2019, and October 2019, including 14 healthy plots, 10 medium dieback plots, and 6 severe dieback plots. The stand conditions for 2017 and 2019 are the same, because the R.pseudoacacia forest in this study is mature forest, the growth of trees basically stopped. Height and DBH were measured for all trees in the above plot. Tree height was determined by Vertex IV and Vertex Laser Geo (Haglöf Sweden AB, Långsele, Sweden), and tree DBH was measured at 1.3 m above the ground with a tape measure. The plot-level forest AGB was derived from all trees, of which the trunk, branch, and leaf biomass were calculated using allometric Equations (1) -(3) for R.pseudoacacia forest (Lu et al. 2020). (1) where W trunk , W branch , and W leaf are the trunk, branch, and leaf biomass, respectively, D is the DBH, and H is the tree height. Table 3 shows a summary of plot-level tree height (H), DBH, and AGB divided according to the three CVGs. The CVG and AGB obtained from the field were used to assess the effectiveness of our methodology. Figure 2 shows an overview of establishing the forest health indicators (FHIs) and classifying forest health in four steps. In the first step, the normalized LiDAR point cloud data were used to construct the canopy height model (CHM), from which the LiDAR parameters were then derived. In the second step, the FHIs were established via hierarchical clustering based on Pearson correlation coefficients (PCCs). Hierarchical clustering was also employed to construct the forest health classification guide (FHCG), which was used to determine the forest health grade (FHG) of R.pseudoacacia. In the third step, the FHG based on FHCG was assessed by field data. The last step is to map the FHG of R.pseudoacacia forest in the Gudao forest.

Preprocessing and LiDAR parameters
The LiDAR data were preprocessed in LiDAR 360 3.1 software (Green Valley, Being, China). First, the LiDAR point clouds were denoised to remove outliers. Then, the clouds were filtered into ground points and vegetation points. The digital elevation model (DEM) and digital surface model (DSM) were generated by interpolation of ground points and the maximum number of vegetation points, respectively. The CHM was subsequently constructed by subtracting the DEM from the DSM with a pixel size of 1 m. The LiDAR point clouds were normalized against the DEM.
2.3.1.1. Vertical structure parameters. The vertical structural parameters were extracted from the normalized point clouds. To avoid outliers, the 99th percentile height (H 99 ) was used to represent the maximum tree height at the plot scale (Huesca et al. 2019). The extraction of the other four parameters required vertical stratification of the point clouds.
The canopy height profile (CHP) is represented by the ratio of the number of point clouds in a certain height layer to the total number of point clouds, which describes the vertical distribution of photosynthesis and nonphotosynthetic tissue (Zhao, Popescu, and Nelson 2009). The height percentiles, instead of the height layer, were used to simplify the CHP into 10 height layers in this study. The two-parameter Weibull density function was applied to fit the CHP, and the scale parameter (α) and shape parameter (β) were derived by the maximum likelihood estimation method (Equation (4)) (Tompalski et al. 2015;Zhang, Lin, and She 2017): where f(x) is the point cloud density, x is the height percentiles, and α and β refer to the scale and shape parameters of the Weibull density function, respectively. Leaf area density (LAD) can describe the vertical structure distribution of photosynthetic tissues in forests, and its estimation depends on LAI (Detto et al. 2015). LAI is calculated based on the Beer-Lambert law (Richardson, Moskal, and Kim 2009). When derived from LiDAR data, it was calculated by dividing the gap fraction of the height layers by height Δz. LAD was defined according to the number of point clouds above and below the target height layer i (Detto et al. 2015). The calculation formulae are as follows: where GF is the gap fraction, κ is the extinction coefficient (a value of 0.5 was adopted in this study), n ab is the number of point clouds above the height layer, n be is the number of point clouds below the height layer, and n tot is the number of point clouds. The coefficient of variation (cvLAD) (Equation (7)) (Carrasco et al. 2019) was used to quantify the vertical variation in leaves: where H is the total number of height layers and LAD is the mean LAD across all layers. The vertical complexity index (VCI) (van Ewijk, Treitz, and Scott 2011) was used to quantify the vertical distribution of all aboveground vegetation, and it is calculated as: where n represents the total number of elevation levels divided vertically, and P i represents the portion of points in each elevation level out of the total point cloud (%). A VCI value of one indicates that the canopy configuration is more evenly distributed than that of zero in the vertical direction.
2.3.1.2. Horizontal structure parameters. The horizontal structural parameters are divided into two parts. One part is extracted based on CHM, which is a surface model expressing the height of vegetation from the ground. The standard deviation of CHM (CHM std ) is a continuous parameter that represents vegetation heterogeneity based on canopy heights (Huesca et al. 2019). The other part is extracted based on point cloud. Canopy cover (CC) is defined as the vertical projection area of forest canopies divided by the total stand area, which can reflect the stand density (Lefsky et al. 2002). The horizontal change of canopy cover within a stand can be expressed by the standard deviation of canopy cover (sdCC) (Moran, Rowell, and Seielstad 2018). For a 30 m × 30 m plot, it was first split into 900 subplots of 1 m 2 each. The sdCC was derived by calculating the CC for each subplot and assigning the standard deviation of the subplot CCs to the plot. The definitions of all LiDAR parameters are summarized in Table 4. All the parameters CC Canopy cover (the percentages of first returns above 2 m). (Lefsky et al. 2002) sdCC Standard deviation of canopy cover. (Moran, Rowell, and Seielstad 2018) were calculated by the lidR package and the raster package in R software and extracted with 30 m resolution, which matched the size of field plot.

Establishment of FHIs and FHCG
Cluster analysis is an exploratory classification technique in which samples that are closer together are classified into one category by calculating the distance between them, and those that are farther apart belong to different categories (Kaufman and Rousseeuw 2009;Thompson et al. 2016). Without prior knowledge, that is, training samples, a set of forest health indicators (FHIs) is directly explored and constructed from UAV LiDAR point cloud data by an unsupervised clustering method. The specific steps are as follows. First, the LiDAR parameters with a similar correlation were grouped into one category via hierarchical clustering analysis using the Pearson correlation coefficient (r) as the pairwise distance metric. According to the principle that the correlation coefficient within the category is the largest (representing all indicators of the category) and the correlation coefficient between the categories is the least (having the lowest correlation with indicators of other categories), only one parameter was selected from each category to constitute forest health indicators. Then, each indicator was divided into three classes by hierarchical clustering, corresponding to healthy, subhealthy, and unhealthy forest health grades, and used to established a forest health classification guide (FHCG). The above steps were realized by the corrplot function and hclust function in R software.

Classification and assessment
The FHG of 30 plots was defined according to the rules in Table 5, namely, healthy (H), subhealthy (S), and unhealthy (U). The method of grade determination is to see if a plot meets the healthy or unhealthy criteria. Plots that qualify for neither of these grades are assigned to the subhealthy grade. The verification of the results was divided into two parts. The first part was the evaluation of classification results. Linear discriminate analysis (LDA) (Li and Yuan 2005) was applied to 30 field plots to visually determine the degree of separation between three FHGs. The nonparametric Kruskal-Wallis test (Breslow 1970) was used to determine differences in FHIs among the three health grades. The second part was to verify the rationality of the FHG, the CVG, and ABG of 30 field plots as test data. To judge whether the FHG based on LiDAR structural parameters can represent forest canopy vigor, the similarity between the FHG and the CVG was calculated by the indicator SIM (Han and Kamber 2007): where n same is the number of plots with the same discriminant results and n tol is the total number of plots. It measures how similar two grades are, with a score of 0-1. Moreover, the relationship between FHG and AGB was measured by the Spearman rank correlation coefficient (r s ) (Spearman 1987) (Equation (10)): where n is the number of plots and d i is the rank difference of FHG and AGB in each plot. The result can be positive correlation, negative correlation and no correlation. These steps were performed using the MASS package and ggplot2 package in R software.

FHG mapping
In order to generate the FHG map covering the entire study area, the study area was segmented, with a cell size of 30 m × 30 m corresponding to the field plot. Five FHIs were extracted from the cell. According to the FHCG obtained in the previous steps and combined with Table 5, the FHG map can be obtained. Finally, the different degraded forms of R.pseudoacacia were classified with the photos taken by field investigations.

FHIs and FHCG
Five categories were grouped from the LiDAR parameters (Figure 3). The first category consisted of Weibull_β, CC, and VCI with a correlation coefficient of 0.64-0.94, and the second category contained CHM std and H 99 (r = 0.84). sdCC, cvLAD, and Weibull_α were each grouped into separate categories (r < 0.5). According to the principle that the selected parameter should have the largest intraclass and the smallest interclass value, the parameters of Weibull_α, H 99 , and VCI were selected in three categories. Therefore, the FHIs were composed of five parameters, i.e. Weibull_α, H 99 , VCI, sdCC, and cvLAD. Table 6 and Table 7 showed the established FHCG, in which each indicator was recorded in three classes. Class 1 represents a healthy condition characterized by tall trees (H 99 ), flourishing canopies (Weibull_α, cvLAD), and evenly distributed vegetation both vertically (VCI) and horizontally (sdCC). In contrast, class 3 is unhealthy. The trees are short, and the canopy is not closed. Class 2 does not meet the criteria of either class 1 or 3.
The classification of sdCC (Table 7) is different because it presents different distribution patterns in different CC classes (Figure 4a). Class 1 has a high CC and low sdCC, thereby suggesting the high uniformity of the canopy in the horizontal direction (red box in Figure 4a). The plot with a low CC but a high sdCC (blue box in Figure 4a) indicates that most of the tree canopies have dieback; however, there are trees that are still growing well (Class 3 in Figure 4a). In the FHI establishment step, VCI was selected. VCI and CC are highly correlated (r = 0.94), which can reflect the same stand status (Figure 4b). Therefore, VCI was selected to judge the class of sdCC instead of the CC in this study.

Assessment on separability and similarity
The LDA on 30 field plots indicated that the FHGs can be separated explicitly in the two discriminate coefficient axes, except for one healthy and subhealthy plot that had a partial overlap ( Figure  5a). The Kruskal-Wallis test result showed that all the selected indicators except cvLAD (p = 0.087) were significantly different (p < 0.05) among the three FHGs (Table 8). However, if the insignificant indicator cvLAD was removed, there was more overlap between healthy and subhealthy plots by using the remaining four indicators (Figure 5b).
The calculated SIM value between the FHG and the CVG was 63.33%, indicating that the majority of the FHG had the same grade as the CVG. Moreover, the FHG was significantly correlated with the forest AGB (r s = 0.51, p = 0.004), while CVG was not (r s = 0.097, p = 0.61). Figure 6 is the FHG map covering the entire study area generated based on FHIs. Based on field surveys, the degraded R.pseudoacacia trees were distributed on the north and south sides of the study area, while trees growing in the middle are in good health. Windthrow (Figure 6a) and trees with severe canopy dieback ( Figure 6b) were distributed along the river in the north. The trees in the south of the study area grew short and had large gaps between trees, but the leaves are green and without defoliation (Figure 6c).

Contribution and variability of FHIs
In order to explore the role of each indicator in forest health assessment, we used LDA to evaluate it from the modeling perspective, which was estimated by comparing the performance of all LDA models, removing one indicator each time. The result indicated that Weibull_α and VCI were the most important indicators, followed by sdCC and H 99 , and cvLAD was relatively less important ( Figure 7). With the degradation of R.pseudoacacia, the shape of the canopy changed from obovate to spindle ( Figure 8a); thus, the indicators reflecting the canopy three-dimensional structure changed obviously. Figure 8b shows the structural and morphological variability of the four indicators with the largest contribution in Figure 7 under different FHGs. The indicator Weibull_α was a good indicator of canopy shape with a higher value and a higher peak point (the percentile height of H 65 ) on the fitting curve at a healthy plot and a smaller value and a lower peak point (the percentile height of H 30 ) at an unhealthy plot (the first row of Figure 8b). The VCI described the vertical distribution of tree components such as leaves and branches. With the decline in tree health  status, the point clouds were concentrated in the lower canopy and understory layers with a declining VCI value (the second row of Figure 8b). The trees in healthy plots were generally higher (the third row of Figure 8b). From the top view, indicator sdCC described the canopy horizontal   distribution. In the healthy plots, the trees flourished with a large VCI, and the sdCC was small (the fourth row of Figure 8b), thereby indicating closed forest and little variability. The cvLAD with poor contribution was less separable among three health grades than the other four FHIs. This is because the R.pseudoacacia plantation in the study area had a single stand and only grass with different degrees of light loving under the stand (Wang, Lu, and Pu 2016), resulting in a peak in the middle of the LAD profile in all three health status stands (Figure 9). The cvLAD is a minor indicator, however, it can be significant in specific cases, especially at the deciduous stands (Bouvier et al. 2015). At the same time, our result also showed that discrimination confusion would increase when this indicator was refused ( Figure 5). Therefore, cvLAD was still included by FHIs. In our study, the distinction between healthy and unhealthy stands was confused because their canopy shape and stand structure were similar. Compared with the flourishing crowns in healthy stands, the subhealthy trees only had slight defoliation in the treetops and some lateral branches. As in previous studies, there was no clear boundary between the two (Wang et al. 2015). Better indicators and methods to accurately distinguish the subhealthy are our future work.

FHIs vs CVIs
Because of the low terrain, the groundwater level in the study area is generally lower than 1.5 m, resulting in the widespread distribution of saline soil (Chen et al. 2019;Wu, Liu, and Huang 2016). The R.pseudoacacia forests have been in the later stage of degradation due to the longterm adverse water and salt conditions. To understand the difference between FHIs and CVIs, correlation analysis was performed. Table 9 showed that Weibull_ α and VCI were highly correlated with CVIs, which played an important role in identifying the dieback stand. For example, in the north part of the study area (Figure 6b), the trees have serious dieback and low canopy density. The low correlation with CVIs was the vertical structure indicator (H 99 and cvLAD) (Table 9), which is the advantage of FHI. Compared with CVIs reflecting canopy appearance, the structural parameters extracted from LiDAR can provide more detailed 3D structural heterogeneity information for forest health assessment (Bourgoin et al. 2020) and are more suitable for forests with obvious structural differentiation in late degradation. For example, in the southern part of the study area (Figure 6c), the growth of trees has been inhibited since planting with small height and DBH values due to long term and slow disturbance (Lu et al. 2020), so the trees are not healthy. Since there was no tree crown dieback, the tree was considered as healthy when discriminating by CVI, because only the indicator of crown diameter was in Class 2, and the other four indicators were in Class 1. However, when using FHI, the grade was corrected to subhealthy or even unhealthy because of the unqualified H 99 , sdCC and cvLAD. Therefore, the advantage of FHIs is that the horizontal indicators play the same role as CVIs in characterizing tree canopy vigor, while the vertical indicators compensate for the deficiency in characterizing tree productivity.
At the early stage of disturbance, leaf biochemical parameters such as chlorophyll and water content changed. However, LiDAR-based indictors lose effectiveness when the leaves are discolored but not dropped (Torres et al. 2021). At this time, optical and hyperspectral remote sensing image data can play a role (Huang, Anderegg, and Asner 2019). Therefore, the combination of LiDAR data and hyperspectral remote sensing imagery will be the future direction of forest health status identification and quantification, which can monitor the stage or degree of forest degradation to take corresponding management countermeasures. In addition, due to the disadvantages of expensive data acquisition and limited coverage area of LiDAR, canopy texture analysis based on very high spatial resolution (VHSR) optical imagery also provides an alternative indicator for forest structure assessment (Bourgoin et al. 2020;Dash et al. 2017). In the future, the correlation between canopy texture extracted from VHSR optical imagery with continuous coverage and forest structural parameters extracted from LiDAR covering local typical areas can be used to complete rapid assessment of forest health on a large scale.

Conclusions
In this study, a new set of FHIs was established by using UAV-LiDAR data and cluster analysis, which demonstrated the potential of LiDAR technology in forest health monitoring. The results showed that the FHIs, including Weibull_α, VCI, sdCC, H 99 , and cvLAD, could clearly distinguish forest health status without prior classification information. The LDA identified Weibull_α and VCI as the dominant indicators, but the comprehensive use of five indicators can improve the accuracy of health status discrimination. FHIs could consider both canopy vigor and tree productivity when evaluating forest health, with a similarity between the FHG and the CVG of 63.33% and a significant positive correlation (r s = 0.51, p = 0.004) with the forest AGB. With the decline of R.pseudoacacia health status, the canopy spatial structure changed: the shape of the canopy changed from obovate to spindle (Weibull_ α), the situation of canopy dieback intensified (cvLAD), and the structural variability of the vertical and horizontal directions increased (VCI, sdCC). In addition, the unhealthy R.pseudoacacia was generally shorter (H 99 ). FHIs can be easily obtained from LiDAR data, and the development and popularity of LiDAR technology enables rapid and effective monitoring of forest health with no or few field plots.

Data availability statement
The data that support the findings of this study are openly available in 'Zenodo' at https://doi.org/10.5281/ zenodo.5724611.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported by National Natural Science Foundation of China: [Grant Number No. 41471419 and No. 31971579].