Identifying treetops from aerial laser scanning data with particle swarming optimization

ABSTRACT In this study, the particle swarming optimization procedure was applied to parametrize two Local Maxima (LM) algorithms in order to extract treetops from LiDAR-data in a test area (10 km2) of heterogeneous forest structures of conifers in the Alps. The obtained results were compared with those of a widely used variable-size window LM algorithm calibrated using literature values. Quantitative statistical parameters like matching, extraction, omission, and commission rates were calculated. The experimental results showed the effectiveness of the proposed method, which was capable to identify the 91% of the trees and estimate the 92% of the real above ground biomass with a total extraction rate close to 1. Almost all the dominant and codominant trees were extracted, while the extraction rate of the dominated trees averaged over 50%


Introduction
Since the 1990s, the use of Light Detection and Ranging (LiDAR) technology for forest inventories and management has gained more and more importance because of its suitability to get detailed information about the vegetation and the objects on the ground surface (Dubayah & Drake, 2000;Lim, Treitza, Wulderb, St-Ongec, & Floodd, 2003). The raw data of an Airborne LiDAR System (ALS) survey are an irregularly distributed point cloud, which provide 3-dimensional (3D) information of the landscape (Heritage & Large, 2009). Different algorithms have been proposed to analyse these data in order to derive the Digital Elevation Model (DTM), representing the bare earth surface, the Digital Surface Model (DSM), which contains all the objects on the surface (buildings, vegetation, etc.), and the Canopy Height Model (CHM), which is obtained by subtracting DTM from DSM. Specifically, CHM is a product widely used in forestry and forest sciences (Chirici, McRoberts, Fattorini, Mura, & Marchetti, 2016). A proper characterization of the 3D-structure of a forest involves the extraction from the LiDAR data of the most important structural and biometrics forest parameters, such as tree height and diameter distribution, canopy cover, and forest biomass. There are two possible approaches to extract these structural attributes from a discretereturn LiDAR point cloud: the single-tree-based and the plot-based methods (Wulder, Bater, Coops, Hilker, & White, 2008). The main difference between these two approaches is that the former focuses on the metrics of each single tree of a given forest, while the second one aims at developing a number of site-specific empirical statistical relations to estimate forest attributes, without attention to the biometrics of the single trees forming the forest.
The single-tree approach generally requires at least 5 pts m −2 as point density, but it allows extracting single-tree attributes, such as tree total height, crown height, crown diameter with an appropriate accuracy (Hyyppä, Kelle, et al., 2001;Popescu & Wynne, 2004;Naesset and Nelson 2007;Wang et al., 2016). On the other hand, the plot-based approach allows deriving meaningful results at plot level such as mean basal area and forest volume with a lower point density (0.5-1 pts m −2 ). A limit of the plot-based approach is that the elaborated data cannot be used as inputs for most of the current forest dynamics models that are based on position and size of single trees (e.g. SORTIE, Pacala et al. 1993; -PPA, Purves, Lichstein, & Pacala, 2007;2008). In addition, several models for the assessment of the protection ability of forests against natural hazards need the position of the single tress as input (e.g. Rockyfor3D, Dorren L.K.A., 2015). However, for both approaches, the validation procedure consists of comparing field-and LiDAR-derived measurements at single-tree or plot scale.
Many algorithms were proposed for the single-tree extraction method, based both on raster and pointcloud data. The most applied algorithms are the extraction of tops and crowns by local maxima methods (Popescu & Wynne, 2004), the region growing method (Tiede, Hochleitner, & Blaschke, 2005), the watershed method (Chen, Baldocchi, Gong, & Kelly, 2006) and the individual tree segmentation on point clouds (Li, Guo, et al., 2012). In general, the detection rate of trees using the above-mentioned algorithms ranges between 40 and 80% being this percentage much lower in complex and heterogeneous forests (Persson, Holmgren, & Söderman, 2002).
In order to apply these algorithms, a number of parameters, e.g. the radius of the crowns, the minimum distance between the trees, the minimum height of the trees, must be preliminarily defined. Generally, these parameters are subjectively chosen looking at the structural characteristics of the forest (e.g. the mean area of the crown, the mean distance among the trees, etc.). For instance, several authors (Daley, Burnett, Wulder, Niemann, & Goodenough, 1998;Popescu, Wynne, & Nelson, 2002;Wulder, Niemann, & Goodenough, 2000) propose to use variable window sizes for the extraction of treetops and the estimation of vegetation parameters. This approach comes from the assumption that there are different tree crown sizes in a forest and the Local maxima (LM) filter should be redefined to appropriately fit the single-tree structure. The variable-size window assigned to each pixel can be evaluated considering local parameters such as the semivariance or the local breaks of the slope (Wulder et al., 2000) or the height of the tree itself (Popescu & Wynne, 2004). The second parameter to define when using LM filtering is the shape of the window. Doruska and Burkhart (1994) and Popescu and Wynne (2004) stated that a uniform circular moving window is appropriate to extracted the crown of many tree species considering their branching pattern. This empirical approach would achieve good results over forest with relative low heterogeneity, while it seems to be inappropriate when uneven-aged and multi-layered forests have to be analysed. Therefore, the optimization and validation of these parameters respect to a specific goal, such as the assessment of the forest volume, seems to be a crucial step to improve our capability to exploit the potential of LiDAR data. This optimization process should be based on an automatic and reliable procedure to be feasible and implementable in different types of forest.
In this paper, we present the preliminary results of a new approach based on the Particle Swarming Optimization (PSO) procedure in order to optimize the parameters of two LM single-tree extraction algorithms applied to both raster and point-cloud LiDAR data. PSO was widely used in various kind of optimization problems (Eberhart & Shi, 2001;Voss & Feng, 2002;Liang, Qin, Suganthan, & Baskar, 2006;Yisu, Knowles, Hongmei, Yizeng, & Kell, 2008;You, 2008;Li, Yang, et al. 2012, Ma, Yu, & Hu, 2013Yang et al. 2015;Du, Ying, Yan, Zhu, & Cao, 2017). In forest science, PSO has been applied for the calibration of models for forest planning and management (Hu, Sarosh, & Dong, 2011;Shan, Bettinger, Cieszewski, & Wang, 2012), forest fire susceptibility, image manipulation for land cover classification and forest landscape patterns mapping (Sameen & Pradhan, 2017;Sami, El-Bendary, & Berwick, 2012). However, to the best of our knowledge, it has never been used to parametrize algorithms for the detection of single trees using LiDAR data as input. In comparison with other optimization approach, PSO has a number of advantages. The main strength of PSO is its fast convergence, which compares favourably with many metaheuristics like Genetic Algorithms (GA) and Simulated Annealing (SA) (Abraham, Guo, & Lio, 2006). Moreover, it can be easily implemented, and requires few parameter settings and computational memory (Eberhart & Shi, 2001;You, 2008). Compared with Real Coded Genetic Algorithms (RGA) PSO has memory and creates constructive cooperation between the particles of the swarm, which share information among themselves. The knowledge of good solutions is retained by all particles, whereas in RGA, previous knowledge of the problem is discarded once the population changes. Compared to multi-resolution grid search, PSO can be applied for all kind of problems and in particular when the problem is solvable with simple mathematical equations, while multigrid methods are usually applied for solving partial differential equations and nonlinear problems. Moreover, the nature itself of the process of treetop detection from LiDAR data cannot be approached with multigrid methods starting from coarser to finer resolution because this approach would work only with big isolated maxima based on CHM (Canopy Height Model), and all the other treetops in the area would be lost. This paper describes an application of such method in forestry. The main objective of this paper is to discuss the usefulness of the PSO procedure for calibrating single-tree detection algorithms from LiDAR data. In addition, the obtained results were compared with those obtained with a widely applied LM algorithm based on the identification of local maxima of CHM using a variable-size window as proposed by Popescu et al. (2002).

Study area
The study area is part of the Aurina Valley located in north eastern Italy in the Autonomous Province of Bolzano. The total area covered by the LiDAR survey is approximately 10 km 2 with the altitude ranging from 1059 to 2406 m a.s.l. and an average elevation of 1729 m a.s.l.
Coniferous mixed forests are the dominant forest typology of this area with Norway spruce (Picea abies) (60%), larch (Larix decidua) and stone pine (Pinus cembra) as dominant species. Most of the area is of public property and managed by the local community of Aurina Valley, with a limited number of private properties. The main function of the forest is the protection against rock fall and avalanches and selection and shelterwood cuttings by small groups are the elective silvicultural systems in this forest area. The site was selected in 2012 for its high variability in forest structure, which represents a very frequent situation in Alpine environment.

Field sample survey
Three circular plots of 15 m radius were randomly selected from a 50 × 50 m regular grid within a buffer zone of 150 m from the forest road (Figures 1-2). Based on the pre-existing map of the forest structures of the Forestry Division of the Autonomous Province of Bolzano-Bozen, the plots belong to the following vertical forest structures: biplane, adult and multiplane. In summer 2013, all trees within each plot were mapped and diameter at breast height (DBH), species and height of each tree with height greater than 1.3 m    Table 1). The DBH was measured using a diameter tape and the height of the trees using a Vertex hypsometer (Haglöf, Sweden). The tree height and diameter (DBH) distribution of trees are reported in Figure 3. The main dendrometric characteristics of the 3 plots are reported in Table 2.
The position of the centre of each plot and the position of each tree within each plot were estimated using the application for digital field mapping Geopaparazzi (http://geopaparazzi.github. io/geopaparazzi, Brovelli, Minghini, & Zamboni, 2016). In Geopaparazzi, we considered both the GPS signal and the Canopy Height Model (CHM) loaded as background layer. Since the CHM image was used as reference, the tree position was referred to the top of the trees. We preferred to use this method instead of a more traditional approach (e.g. polar coordinates of the base of the stems from the centre of the plot), as most of the trees were heavily tilted and the position of the top did not match that of the base of the trees.
In order to calculate the stem volume from DBH and height of the single tree, we applied a site-specific allometric relationship using the data collected in 12 plots including the three that we extracted for this test.

LiDAR survey
The LiDAR data were collected in one single flight in September 2012 at an average flying height of 500 m a.g.l. An Optech GEMINI Airborne Laser Terrain Mapper (ALTM) was used by Helica s.r.l. company (Udine -Italy) to collect the data. Maximum 4 returns were recorded from each pulse. The final LiDAR point cloud had an average density of 10 points/m 2 . The most important LiDAR survey parameters are shown in Table 3.
LiDAR data were collected, processed and delivered by Helica s.r.l in Universal Transverse Mercator (UTM) coordinate system, in WGS84 -ETRF2000 datum. The Z coordinates of the 3-dimensional LiDAR data have been transformed in orthometric elevation using the software Cartlab (www.sxst.it/siti/ sxst/sxst.nsf/prodotti/CartLab-3?opendocument) and the geoid ITALGEO2005. The data were classified by manual filtering using the Terrascan software within the platform Terrasolid (Terrasolid Ltd., Helsinki) using the algorithms proposed by Axelsson (1999) and Sithole (2005). The classification differentiated ground and low vegetation according to the ASPRS Standard LiDAR Point Classification  Figure 3. Distribution of the heights (a) and DBH (b) of the trees with DBH greater than 17.5 cm in the three studied plots.  (LAS specifications version 1.2), while the remaining points were labelled as Unclassified.
Estimating single tree position and height from laser data A single-tree top extraction method based on the identification of LM (Hyyppä, Schardt, et al., 2001;Persson et al., 2002) applied to both raster and point cloud data was the approach in the present study. LM are points with higher elevation respect to the others in a window of a given size and shape. This technique is based on the assumption that the trees have a tip and this is much reliable with conifers than broadleaves. In order to optimize the parametrization of algorithms (e.g. the size of the window and the threshold on the height of the trees) a PSO method was used and the obtained results compared with those of a manual calibration of the LM algorithm as proposed by Popescu et al. (2002), and implemented in the Fusion software (McGaughey, 2007). Specifically, data analysis and optimization method ( Figure 4) consisted of the following five steps: (1) run the local maxima algorithm using a set of randomly defined parameters for both raster and point-cloud datathe preliminary result is a set of tree positions and heights from the input datasetin this step there is no need to specify the minimum number of reference trees because PSO can run also with a single tree; (2) compare the obtained positions and heights of the LiDAR-extracted tree tops with those of trees mapped in the field using a matching procedure, which considers the distance and difference in height between the LiDARderived and the field-measured trees; (3) assess the obtained results based on a target function that takes into account the number of true positives (TP: LiDAR-extracted trees that match for position and height the field-measures ones), false positives (FP: LiDARextracted trees that do not exist in the field), and false negatives (FN: trees measured in the field that cannot be paired to LiDAR-extracted tree tops); (4) after the first simulation, the parameters of local maxima algorithms are iteratively changed to move toward the best possible solution of the target function; (5) the optimizer iterates until either the target solution or the number of maximum imposed iterations (2000) has been reached.

Local maxima extractor for raster data
The LM algorithm, when applied to raster data, scans the raster canopy height model (CHM) by using a moving window and identifies as local maxima the cells in the centre of the moving window with the highest height value in comparison with all the cells covered by the window. The LM technique used in this study adopted a circular window with a constant radius that was automatically defined through PSO procedure. In addition, since we focused our study on trees with DBH greater than 17.5 cm, we introduced a tree height threshold, so that only local maxima higher than the threshold were considered. The minimum tree height threshold was a further parameter that has been calibrated through PSO.
A preliminary analysis revealed that a lot of FPs was located at the boundaries of the trees crowns. These FPs were related to the presence of cells of the lower and ground layers creating the morphological conditions for the identification of local maxima ( Figure 5). Therefore, we introduced in the LM algorithm a filter for local maxima on the crow boundaries based on the difference in height between the height of the candidate local maximum and that of all the neighbouring cells contained in the moving window. When this difference in height was greater than the threshold for at least two combinations of cells inside the moving window, the local maximum was considered on the boundary of a crown and excluded. Also, this vertical threshold was a further parameter that has been optimized through PSO. Therefore, by the PSO procedure the following three parameters were optimized: the size of the moving window, the minimum tree height threshold and the difference in height among the cells of the moving window to identify crowns boundaries.
The result of the raster-based tree top extractor was a set of tree positions and heights.

Local maxima extractor for point-cloud data
First, the vegetation points were normalized by subtracting the ground elevation taken from the DTM (Lee, Slatton, Roth, &Cropper, 2010 andLi, Guo, et al., 2012), obtaining the so-called "pointcloud CHM". The LM algorithm, applied to the point-cloud CHM, compared the height of each point against the heights of all the points included in a circular window centred on the point itself. When the central point resulted to be the highest, then it was automatically considered as a tree top candidate and added to a temporary list of extracted trees. Because of the breaching pattern of the trees, branches in the upper canopy layer could generate false local maxima (Figures 6 and 7). In these cases, the branch creates a peak far and misaligned respect to the centre of the tree with a significant difference in height between the centre and the neighbouring cells on the outer side of the tree respect to the inner side where the branch comes out. Such false positive points were automatically removed when the cell of the DSM (used as reference layer for checking the elevation) containing this treetop candidate had a height lower than a given threshold respect to the heights of the surrounding cells. This filtering procedure was applied to all the points contained in the list of treetop candidates in order to remove branches erroneously interpreted as treetops.
On the remaining treetop candidates a further filter was applied in order to avoid that within a given radius two local maxima were considered. This radius threshold was considered as a parameter of the algorithm and it mainly depends on the forest density, structure, species composition and crown dimensions. When more than one treetop candidate was extracted within this threshold radius only the highest treetop candidate was kept.
The final result of the point-cloud-based treetop extractor was a set of tree positions and heights.

Particle swarming optimization
The PSO  is an iterative computational method used to find an optimal solution given a measured set of data. The PSO technique has proven to be very efficient for solving unconstrained real valued optimization problems (Parsopoulos & Vrahatis, 2002). It is inspired to the movement of swarms, which are composed of particles that represent candidate solutions. At each iteration, the core algorithm produces a new candidate solution (particle). Each particle moves to a better position in the parameters space towards its own previous best position (local best), and towards the best solutions of all the other members of the swarm (global best). The fitness of this optimization process is based on the results of a fitness function, whose mathematical structure should be chosen according to the aim of the analysis (e.g. forest volume assessment). For all particles, a fitness value is calculated each time the extraction algorithm is run, and the result will influence the direction of the particles fly through the problem space in the following iteration. This iterative process will go on until the following iteration will not get the desired result of the fitness function or the program reaches the maximum number of possible iterations.
The PSO main concept consists in changing the position of each particle toward its personal best (pbest) and global best (gbest) solution. The position and the velocity of the particles can be represented by a position-vector x i and a velocity-vector v i , where i is the index of that particle. The best position of each particle so far found is saved in the vector x # i , while the best position among the swarm is stored in the best position-vector x*. During the iteration time t,  the modified velocity of each particle can be calculated by Equation (1).
where w is the inertia weight and c 1 and c 2 are constant values usually set to 2  representing the coefficients of self-recognition and social components, while r 1 and r 2 are the random numbers, which are used to maintain the diversity of the population and are uniformly distributed in the interval [0,1] (Abraham et al., 2006). The inertia weight has been originally employed in the range 0.9-1.2 (Shi & Eberhart, 1998a) and it defines how much of the velocity of the previous time step should be retained: larger values facilitate global exploration, while smaller ones tend toward local exploration (Shi & Eberhart, 1998b). Eberhart and Shi (2000) indicated that initially setting the inertia weight to a large value and then linearly decreasing it leads to a better performance than using a fixed value, while Abraham et al. (2006) suggested a starting value around 1 with gradually reducing towards 0. In the implemented algorithm, the inertia weight was calculated using the power law of Equation (2) w where initDecelFactor is the starting value of w, iterationStep is the number of the current iteration and decayFactor is the factor used to decrease the initial value towards lower values. In agreement with Abraham et al. (2006), the new position of the particle is updated as the sum of the previous position and the new velocity as follows: In the particle swarm model, each particle searches the solutions in the problem space with a range [−s, s] and any other range has to be translated to a symmetrical range. In order to guide the particles effectively in the search space, the maximum moving distance during any iteration must be clamped in between the maximum velocity range [−vmax, vmax] with 0 < vmax < s (Abraham et al., 2006).
In this article, we used parameters derived from literature review and from the experience before the application to our model and data. The values of the parameters used for the different simulations are summarized in Table 4.

Tree matching procedure
In our model, the fitness function was based on the comparison between the LiDAR-extracted and the field-mapped trees. In order to perform this comparison, we implemented an automatic tree-matching algorithm that identifies the number of true positives (TP), false positives (FP), and false negatives (FN). This tree matching procedure was based on the assumptions that the horizontal distance between the field-mapped tree and the LiDAR-extracted treetop cannot be more than 3 m and the difference in height between field-mapped and LiDAR-extracted tree should be less than 15% of the field-measured height of the tree (Monnet, Merminy, Chanussotz, & Bergery, 2010).
The automatic matching procedure consists of four steps: (6) each field-mapped tree is compared with all LiDAR-extracted trees that are located within the threshold distance; (7) among the LiDAR-extracted trees located within this threshold distance, only those with similar height are considered. This similarity in height between field-mapped and the LiDAR-extracted trees is defined through a percentage of the field-measured height; (8) the obtained candidate matching trees are associated to each mapped tree and sorted by distance from it; (9) when field-mapped trees are near to each other, one or more LiDAR-extracted trees will be present within the threshold distance of different field-mapped ones, which leads to an ambiguity in the process of associating the right extracted tree to the mapped ones, since the association normally works in a way that the first analysed mapped tree is associated to the nearest extracted, which in turn could be instead the nearest to an other mapped tree that would be analysed later in the list (Figure 8). Therefore, the mapped trees that have the same extracted tree in their list of possible matches are analysed and only the nearest mapped tree to the common extracted maintains it in its list of possible matches, while the extracted tree is removed from the list of the other mapped trees.

Fitness function
PSO searches for the best value of an index of goodness using a specific fitness function (Makhoul, Kubala,  Weischedel, 1999). In this study, for both raster and point cloud LM-algorithm, we implemented the following three different fitness functions: (i) the harmonic mean of precision and recall, (ii) the minimizing of FP trees, (iii) the minimizing of FN trees.
Where the harmonic mean takes the arithmetic mean of the FPs and FNs and basically means that FPs and FNs are equally important when the TPs stays the same. It is calculated as: where P and R are precision and recall, respectively calculated as follow: and the general formula: where β is a parameter which weights the relative importance given to recall respect to precision and can assume only for positive real values. After some tests we decided to use two different values of β: β = 2 for minimizing the FN and β = 0.01 for minimizing the FP.
Our algorithm implements the optimization of the fitness function by maximizing its value in the range between 0 and 1. Therefore, the model converges when the fitness function is maximized (approximately to 1), which means that all LiDAR-extracted trees have a hit in the dataset of field-mapped trees, while no false positives or false negative have been generated. Even if the ideal situation of reaching 1 does not occur, and the optimization stops due to reaching the maximum number of iterations, it is assumed that the best available solution for the current set of parameters has been reached.

Parallelization of the code
The proposed method iteratively works on swarms of particles, which means that an algorithm can be run also thousands of times before a convergent solution or the number of user-defined maximum iterations has been reached. On the bright side, the methodology is perfectly suitable for computational parallelization of processes both at plot and swarm level calculation. In our implementation, we decided to perform parallelization at plot level. Working in Linux 64bits operating system with an 8-CPU (Intel ® Xeon ® X5550, 2.66 GHz) computer allows us to reduce the processing time by 8 times when 8 plots are processed simultaneously. This is particularly useful with point-cloud data, as they are complex to process and resource demanding. . Schematic representation of the association process used in the automatic matching procedure to avoid erroneous association of trees due to a partial shift between LiDAR-derived and field-measured data.

Data analysis
For each combination of LM algorithm and fitness function, the following statistical parameters were calculated for each plot: • Total number of LIDAR-extracted trees (N ext ); • Number of field-mapped trees (N map ); • Number of TP trees (N tp ); • Number of FN trees (N fn ); • Number of FP trees (N fp ); • Extraction rate (R ext = N ext /N map ); • Matching rate (R mat = N tp /N map ); • Commission rate (R com = N fp /N map ); • Omission rate (R om = N fn /N map ); • V meas = volume of field measured trees; • V tp = volume of true positive trees; • V fn = volume of false negative trees; • V fp = volume of false positive trees; • V ext = V tp + V fp = extracted volume; • R Vtp = V tp /V meas = matching volume rate; • R Vfn = V fn /V meas = omission volume rate; • R Vfp = V fp /V meas = commission volume rate; • R Vext = V ext /V meas = extraction volume rate.
In addition, for each combination LM algorithm and fitness function the following root mean square errors (RMSE) were calculated: • RMSE ext = root mean square error of extraction rates; • RMSE h = root mean square error of tree heights; • RMSE volext = root mean square error of extracted forest volume rate.
The abovementioned parameter was assessed for the whole tree population of each plot and with the exception of the RMSE volext by splitting the population of each plot in 3 groups of height classes at the 33rd and 66th percentile. The Table 5 shows the threshold height values for the 33rd and 66th percentile of each plot. The RMSE ext and the RMSE vol were calculated as follows: where V map is the field-measured aboveground biomass (AGB) and V ext is the LiDAR-extracted AGB at plot level and N is the result of: (10) the number of plots (3) multiplied by the number of LM-algorithms (2) when RMSE ext is referred to the fitness function (11) the number of plots (3) multiplied by the number of fitness functions (3), when RMSE ext is referred to the LM-algorithm The height of each single LIDAR-extracted tree was calculated as the elevation of the LiDAR data in the point where a LM was detected and the volume of the aboveground biomass (AGB) was estimated for both field-mapped-and the LiDAR-extracted trees through a field-derived species-independent allometric function relating the tree volume to the tree height and DBH (diameter at breast height) as follows: where DBH is the tree diameter at 1.3 m of height, and H is the total tree height. Since the volume of both field-measured and LiDAR-extracted trees was estimated using the same equation we did not introduce an additional source of discrepancy between the two estimates. However, since only position and height of trees were extracted from the LiDAR data, the DBHs were estimated using the following hypsometric relationship derived from the field measurements: The obtained results were then compared with those achieved by applying to the same set of LiDAR-data the LM algorithm proposed by Popescu et al. (2002), and here used as reference. This algorithm was implemented in the Fusion software by the Silviculture and Forest Models Team of the United States Department of Agriculture (USDA) Forest Service, in conjunction with the University of Washington Precision Forestry Cooperative. It identifies LMs of a CHM using a variable-size circular window (WSWA), whose size is based on the canopy height similarly concept (Popescu & Wynne, 2004;Popescu et al., 2002) and specifically in this case the width of the window was calculated using the equation proposed by Kini and Popescu (2004) for mixed pines and deciduous trees as follows: where h t is the height of the canopy in meters and width is the radius of the circular window. After checking for normality and homogeneity of variance, data of the abovementioned parameters were analysed by the analysis of variance (ANOVA) with LM-algorithm and fitness function as independent variables considering the different plots as replicates. Paired comparisons were carried

Detection rates
The fitness function did not affect the size of the optimized moving window, while when point-cloudbased LM algorithm was applied the optimized moving window resulted significantly lover (p < 0.05) in comparison with the raster-based LM algorithm (Table 6). For both algorithms, the multi-layered forest structure needed a smaller moving window to optimize the detection rates (Table 6). According to the three different optimization functions, Table 7 shows the detection rates of LMalgorithms based on raster, point-cloud and variablesize window (WSWA) methods. Irrespective of the forest structure, the algorithms based on PSO performed much better than WSWA (Figure 9(a)). The R mat ranged between 0.55 and 0.91 and averaged 0.74 for the PSO-based LM-algorithms, while it averaged 0.48 for the WSWA method (Table 7, Figure 9(a)). On average, the raster-and the point-cloud-based LM algorithms produced similar results in terms of R mat , but the raster-based algorithm detected more FPs, leading to the higher R com and, in same cases, to an overestimation of the total number of trees (Rext> 1) (Table 7, Figure 9(a)). However, the RMSE ext was not statistically different between rasterand point-cloud-based algorithms because of the  Table 7. Matching rate (Rmat), commission rate (Rcom), omission rate (Rom) and extraction rate (Rext) of LiDAR-extracted treetops for raster-based LM-algorithm, point-cloud-based LM algorithm and the variable-size window LM algorithm (WSWA) using the three different target functions (harmonic mean, minimization of FP and minimization of FN) within the particle swarming optimization procedure. compensation between R com and R om (Figure 9(c)). No significant statistical interaction between the LM algorithm and the fitness function was found for both R mat and R com (Figure 9(b)). On the whole, irrespective of the target function, the point-cloud-based LMalgorithm led to very similar results, showing to be less sensitive than the raster-based LM-algorithm to the type of function that was used for the optimization of parameters (Table 7). On average, the R mat of the three target functions was not statistically different (Figure 9(b)), but R com was significantly higher when Min FN was applied in comparison to Min FP function (Figure 9(b)). However, the RMSE ext was not affected by the fitness function (Figure 9(c)).

Raster-based LM-algorithm
Considering the three height classes in which we divided the trees of each plots, the PSO-based LM algorithms detected almost all the dominant trees (CLASS III), whereas the R mat of the WSWA was on average 0.75 (Table 8). The raster-based algorithm detected a higher number of FP dominant trees in comparison to the point-cloud-based algorithm, so that the R com averaged 0.04, 0.14 and 0.06 for pointcloud-based, raster-based and WSWA LM algorithm, respectively (Figure 10(e)). However, no statistical difference in terms of RMSE ext was observed among the three methods ( Figure 11).
While the target function did not affect the R mat of dominant trees, the R com was, on average, lower when the Min FP was adopted (Figure 10(b)). On the whole, the RMSE ext of dominant trees was not affected by the target function (Figure 11(f)).
For intermediate trees (Figure 10(c-d)) (CLASS II) R mat averaged 0.83 when the PSO-based LM algorithms were applied, while it dropped to 0.51 when the WSWA was adopted (Table 8). The R mat of intermediate trees was similar for the two PSO-based Figure 9. Mean matching (Rmat = dark grey), commission (Rcom = light grey) and extraction (Rext = rdark + light grey) rates of treetops for the three tested methods (a) and target functions (b) with the correspondent RMSEext (c, d). Different letters denote significant differences (p < 0.05) among methods of target functions within the same detection rate or RMSE. methods, but the R com of raster-based algorithm was three times higher than that of point-cloud-based one (Figure 10(c)). The target function did not affect both R mat and R com of intermediate trees and the RMSEext was not affected by both the LM-algorithm and the target function (Figure 11(c-d)).
The mean R mat of dominated trees (CLASS I) declined to 0.48, 0.43, and 0.20 for raster-based, point-cloud-based, and WSWA algorithms, respectively (Figure 10(e)). However, no statistical differences among the LM-algorithms were observed (Figure 10(e)) both in terms of R mat and R com . RMSE ext of dominated trees was not influenced by the LM-algorithms and the target function (Figure 11(f)). If the R mat was not affected by the target function, the use of the Min FP as target function reduced to zero the number of FP dominated trees (R com = 0) without any statistically significant reduction of R mat (Figure 10(f)).

Tree heights and forest volume
The optimization process was based on an automatic matching procedure, which used the distance and the difference in height between the LiDAR-extracted and the field-measured trees as parameters. Therefore, the heights of the LiDAR-extracted trees necessarily fitted very well those of the field-measured ones. On the whole, the RMSE of LiDAR-extracted against fieldmeasured height of trees (RMSE h ) was very low and averaged 0.96 m. The RMSE h was significantly affected by LM-algorithm and the plot structure. The pointcloud-based algorithm was found to be a more accurate method to estimate the tree height than the rasterbased algorithm (Figure 12(a)) and on average the RMSE h was lower for the multilayered and higher for the biplane plot, respectively (Figure 12(b)). The fitness function did not prove to be able to affect the accuracy of the estimation of tree height  Table 9 shows, for each combination of LM algorithm as well as for the WSWA method, averaged on the fitness function, the per cent of the above ground biomass (AGB) of true positive (TP), false positive (FP), and false negative trees (FN), by using the fieldmeasured volume as reference. The two algorithms based on PSO performed much better than WSWA irrespective of the target function (Figure 13(a)). The rate volume of LiDAR-extracted TP trees (R Vtp ) respect to the total field-measured volume ranged between 0.78 and 0.92 and averaged 0.88 for the PSO-based LM algorithms (Figure 13(a)), while it  averaged 0.61 for the WSWA. However, the LMalgorithm did not influenced the R Vfp and the fitness function did not affected both R Vtp and R Vfp and the RMSE volext (Figure 13). With the raster-based LM algorithm, the LiDAR-extracted volume of the TPs resulted, in some cases, greater than the volume of the field-measured trees even if the R mat was lower than 1. This is because the raster-based method tended to overestimate the average height of treetops (Figure 12(a)). However, no statistical difference between the two PSO-based methods was found in terms of R Vext and R Vtp , and RMSE volext when all the data were pooled together (Figure 13(a)). Although, on average the results in terms of LiDAR-extracted volume were similar between the two PSO methods, the point-cloud-based LM algorithm showed to be less sensitive than the raster-based LM algorithm to the selected target function (Figure 13(b)). However, no significant interaction between the target function and the LM-algorithms were found for RVext, R Vtp , R Vfp , and RMSE volext ( Figure 13).
The results related to the volume of dominant and intermediate trees are in line with those obtained for the entire population (Table 10). In fact, the PSObased methods performed much better than the WSWA in terms of R Vtp , and only for the dominant trees also in terms of RMSE volext (Figures 14-15). No statistically significant differences between the two PSO-based methods were observed for all the considered parameters connected to the volume of dominant and intermediate trees (Figures 14-15(c-d)). The fitness function has not shown any significant effect on R Vtp , R Vfp and RMSE volext of dominant trees (Figures 14-15(a-b)). Concerning the estimation of the volume of dominated trees, the three algorithms did not lead to statistically significant differences, even if the WSWA produced lower R Vtp and higher RMSE volext absolute values in comparison to the other algorithms, confirming what we have observed in the higher height classes. The Min FP allowed reducing to zero the volume of FP trees, without compromising the value of R Vtp .

Discussion
Most of the methods and algorithms so far proposed to extract treetops from LiDAR data were based on the subjective choice of a number of parameters, which are defined considering the characteristics and the spatial distribution of trees of the studied forests. However, this approach has often produced variable and inconsistent results according to the structure, species composition and development stage of the forest. In the present study, for the first time, we applied the particle swarming optimization procedure to parametrize two LM algorithms in order to extract treetops of coniferous from LiDAR-data.  The obtained results were compared with those of a commonly used variable-size window LM algorithm, which was subjectively parametrized according to Popescu et al. (2002), and used as reference. Our approach, based on PSO iterative parametrization process, seems to be very promising as the detection rates of the treetops were much higher than that achievable with the reference LM algorithm. This suggests that a rough parametrization of the selected algorithm can produce a strong reduction of the estimation accuracy even when the forest structure is well known. Our preliminary results are encouraging also in comparison with recent studies in Alpine region, where complex forest structures were considered. For instance, Eysn et al. (2015) tested 8 singletree detection methods in 8 Alpine study areas, covering different forest types and structures. Five of the 8 proposed methods were based on LM algorithms with predefined parameters (e.g. size of the sliding window). The mean R mat in the abovementioned study ranged between 0.4 and 0.5, which is roughly half of the matching rate we found in the present study. These sharply different performances are partially due the different DBH thresholds, 17.5 cm in our study and between 4 and 12 cm in Eysn's study. Nevertheless, also by narrowing down the comparison to the higher height classes, it is clear that the PSO procedure allowed us to improve the estimation accuracy. However, we cannot exclude that the greater heterogeneity in the quality of the LiDAR data and forest structures has negatively influenced the Eysn's results. Figure 13. Mean matching (RVtp = dark grey), commission (RVfp = light grey) and extraction (RVext = rdark + light grey) rates of volume for the three tested methods (a) and target functions (b) with the correspondent RMSE (c, d). Different letters denote significant differences (p < 0.05) among methods or target functions within the same detection rate or RMSE. The matching rates in different height classes show that especially in dominant and intermediate layers the PSO-based method can detected a higher number of treetops than WSWA method. In this regard, our methods have to be considered more advanced not only because they are based on an automatic parametrization process, but also for the number of parameters that were optimized in order to reduce biases and improve the detection ability. This finding is in agreement with the results of Kaartinen el al. (2012), who observed that advanced methods, like those based on point-cloud 3D analysis, generally perform better than simple methods based on LM detection using a rasterized CHM as base.
A high accuracy can be reached when a high matching rate is combined with a low commission rate. In this context, the point-cloud-based method combined with the Min FP target function was found to be the best compromise, as it allowed us to get high matching rates associated with the lowest commission rates. This was generally observed for all the height layers. On the contrary, the raster-based LM algorithm tended to over-perform due to high commission rate. As the best size of the mowing window in the raster-based LM algorithm was automatically selected by the PSO procedure, we cannot exclude that in some cases the choice of a small-size mowing window might have led to a higher commission rate. In spite of our filtering procedure, small local maxima kernel, can identified local irregularities as potential treetops, and this results in a high commission rate. The point-cloud based LM algorithm performed better than raster-based method also in terms of detection accuracy of tree heights with consequential benefit in terms of Figure 14. Mean matching (RVtp = dark grey), commission (RVfp = light grey) and extraction (RVext = dark + light grey) rates of volume of three classes of height: DOMINANT (a-b), INTERMEDIATE (c-d) and DOMINATED (e-f) trees for the three tested methods and for the three target functions. Different letters denote significant differences (p < 0.05) among methods or target functions within the same detection rate. accuracy of forest volume estimation. The lower vertical estimation accuracy of the raster-based method is probably due to the rasterization process itself that implies a smoothing of the vertical irregularities and as a consequence an increase in the RMSE h . On the other hand, when the point cloud-based method is used, a high accuracy of tree height estimation seems to be reachable only with a high LiDAR-data point density. In this context, an analysis of the sensitivity of vertical estimation accuracy to the LiDAR-data point density seems to be crucial for both raster-and point cloud-method. A further indication deals with the importance of forest structure on vertical accuracy. In agreement with the results of Eysn et al. (2015), we got the best vertical accuracy for multi-layered forest while the biplane structure seems more challenging in this regard.

Conclusion
The particle swarming optimization iterative procedure has been proved to be a powerful method to parametrize local maxima algorithms based on raster or point cloud LiDAR data in order to properly detect treetops of coniferous stands with complex forest structures. This optimization procedure allowed us to obtain high detection rates and estimation accuracy of forest volume. However, further studies are needed to test and validate our methods on a larger dataset covering Figure 15. Root mean square error rates of extracted volume (RMSEVolext) for the three classes of height: DOMINANT (a-b), INTERMEDIATE (c-d) and DOMINATED (E-F) trees for the three tested methods and for the three target functions. Different letters denote significant differences (p < 0.05) among methods or functions. different type of forests and forest structures. Finally, the effect of different point density and flight parameters on the detection accuracy of the proposed methods should be performed to understand the robustness and the sensitivity of our approach to the most common variables of LiDAR dataset.