Comparison of regression models to estimate biomass losses and CO2 emissions using low-density airborne laser scanning data in a burnt Aleppo pine forest

ABSTRACT Knowledge of the forest biomass reduction produced by a wildfire can assist in the estimation of greenhouse gas emissions to the atmosphere. This study focuses on the estimation of biomass losses and CO2 emissions from the combustion of Aleppo pine forest in a wildfire that occurred in the municipality of Luna (Spain). The availability of low point density airborne laser scanning (ALS) data allowed the estimation of pre-fire aboveground forest biomass. A comparison of nine regression models was performed in order to relate the biomass, estimated in 46 field plots, to several independent variables extracted from the ALS data. The selected multivariate linear regression model, including the percentage of first returns above 2 m and the 40th percentile of the return heights, was validated using a leave-one-out cross-validation technique (6.1 ton/ha root mean square error). Biomass losses were estimated in a three-phase approach: (i) wildfire severity was obtained using the difference normalized burn ratio, (ii) Aleppo pine forest was delimited using the National Forest Map and ALS data and (iii) burning efficiency factors were applied considering severity levels. Biomass losses were then transformed into CO2 emissions (426,754.8 ton). This study evidences the usefulness of low-density ALS data to accurately estimate pre-fire biomass, in order to assess CO2 emissions in a Mediterranean Aleppo pine forest.


Introduction
Wildfires are a socio-environmental hazard in Mediterranean ecosystems, acting as a source of greenhouse gas (GHG) emissions to the atmosphere (Akagi et al., 2013; Andreae et al., 1988; Seiler & Crutzen, 1980; Van Der Werf et al., 2010; Wiedinmyer et al., 2011). Consequently, fires are able to alter the behaviour of the carbon cycle at regional or even global scales (Narayan, Fernandes, Van Brusselen, & Schuck, 2007), as well as to decrease the effect of carbon sequestration by forest ecosystems (Van Der Werf et al., 2006; Wiedinmyer & Neff, 2007). In the Mediterranean basin, an average of 45,000 fires is recorded yearly (Oliveira, Oehler, San-Miguel-Ayanz, Camia, & Pereira, 2012), increasing the albedo and determining the current landscape (Pausas, Llovet, Rodrigo, & Vallejo, 2008). Although these values and the resulting emissions are variable in time and space, biomass burning contributes significantly to the total direct carbon monoxide (CO) emissions (Pétron et al., 2004). These natural or anthropogenic disturbances might be enhanced by climate change, which increases fire risk (Moriondo et al., 2006), particularly in summer months (Sebastián-López, Salvador-Civil, Gonzalo-Jiménez, & San-Miguel-Ayanz, 2008). In Spain, fire statistics show a reduction in the number of fire events during the last decade (2001-2010), as well as in the total burned area (Rodrigues, Ibarra, Echeverria, Perez-Cabello, & de la Riva, 2014; San-Miguel-Ayanz et al., 2012). However, the occurrence of large fires (>500 ha) has increased. In 2015, 39% of the total area affected by fires was burned in a large fire (MAGRAMA, 2016a). Moreover, if fire recurrence is high, the regeneration process might fail even for species with high resilience such as Aleppo pine (Pinus halepensis Mill.) (Pausas et al., 2008), influencing carbon sequestration.
In this context, scientists, fire managers and decision-makers require the most accurate information available on fire emissions and their impact on the environment and population. Accounting for carbon dioxide (CO2) emissions is essential for climate regulation policies and the evaluation of their effects (Mieville et al., 2010), as well as for understanding the services that forests provide to societies (Lal, 2008; Pan et al., 2011). Estimating GHG emissions from fires requires (i) the delimitation of the burned area, (ii) the estimation of pre-fire biomass, (iii) the assessment of the fraction of biomass consumed by fire, also defined as burning efficiency (De Santis, Asner, Vaughan, & Knapp, 2010) and (iv) the use of conversion factors to estimate GHG emissions. However, little research has been conducted on quantifying pre-fire biomass and biomass consumed by fire. According to De Santis et al. (2010), biomass consumption was traditionally estimated using a two-step methodology which includes (i) the estimation of pre-fire biomass by applying allometric regression equations using destructive sampling or biomass values per species and (ii) the estimation of post-fire biomass by field-based weighing (Prasad et al., 2001; Sá, Pereira, & Silva, 2005; Ward et al., 1996) or by visual examination (Roy, Jin, Lewis, & Justice, 2005). An alternative approach is based on the use of remote sensing imagery for pre-fire biomass estimation. Despite the wide acceptance of optical and radar remote sensing to estimate forest attributes such as biomass (Chuvieco, 2009; Leboeuf, Fournier, Luther, Beaudoin, & Guindon, 2012; Le Toan, Beaudoin, Riom, & Guyon, 1992; Tanase, de la Riva, Santoro, Pérez-Cabello, & Kasischke, 2011), airborne laser scanning (ALS) is considered one of the best techniques for estimating forest structural parameters (Lefsky, Cohen, Parker, & Harding, 2002; Maltamo, Naesset, & Vauhkonen, 2014; Vosselman & Maas, 2010).
The main objective of this study is to estimate the CO2 emissions derived from the consumption of the aboveground tree biomass (AGB), which refers to the total biomass of the trees considering stem, branches and needles, in a heterogeneous Aleppo pine forest located in the Aragón Region (Spain). To achieve this goal, discrete, multiple-return, low point density ALS data and field plots representative of pine stands were used to fit and validate the AGB models. A secondary objective was the comparison of different regression models, including machine learning.
In addition, the majority of previous approaches to CO2 estimation assume that biomass is completely consumed. However, during wildfires in conifer stands, in some cases only the needles and the fine twigs of the pine crowns are consumed (Call & Albini, 1997; Mitsopoulos & Dimitrakopoulos, 2007; Scott & Reinhardt, 2001). Consequently, different combustion factors were applied to avoid assuming that biomass was completely consumed by the fire (French, Goovaerts, & Kasischke, 2004). The fire severity levels were extracted from the difference normalized burn ratio (ΔNBR) spectral index applied to Landsat 8 OLI images.

Study area
The study area, burned on 4 July 2015, is located in Luna municipality, northeast of Spain (42°12ʹN, 0°45ʹW). Aleppo pine, which has a high ignition potential and is well adapted to these Mediterranean environmental conditions, represents almost 50% of the forested area in Aragón. The fire scorched an area of 14,263 ha, of which 3390.4 ha was woodland. Monospecific Aleppo pine covered 62.3% of those forested areas. As can be observed in Figure 1, for forest inventory purposes, the field campaign to estimate AGB was conducted in a nearby unburned area (Figure 1(b)). The proximity between both sites (see Figure 1(a) and (b)) and the similarity of environmental characteristics such as slope, climate and vegetation make it possible to extrapolate the AGB model to the burned area (Figure 1(a)). This similarity was previously evaluated by comparing some variables derived from the ALS metrics such as slope, canopy cover and tree height. These pine forests, which are structurally heterogeneous, appear fragmented into stands of variable size, accompanied by an evergreen understorey with species such as Quercus ilex subsp. rotundifolia, Quercus coccifera, Juniperus oxycedrus, Buxus sempervirens and Juniperus phoenicea.
The climate of the region is Mediterranean with continental features. Annual precipitation is medium-low and irregular, averaging 525 mm and mostly occurring in autumn and spring. Winter has a monthly mean temperature of less than 10°C, whereas summers have temperatures of ~20°C. The study area is characterized by a hilly topography, with elevations ranging from 430 to 1150 m above sea level and slopes from 0° to 39°. The lithology of the study area corresponds to Miocene shales and sandstones, alternating with conglomerates.

Field plot data
Field data were acquired in 46 circular plots, 15 m radius at the unburned area during June and July 2015 (Figure 1(b)). The location of the field plots was selected, within the limits of the Aleppo pine stands at the unburned area, using a stratified random sampling technique, in order to achieve a representative sample of the variability of the terrain (Naesset & Økland, 2002), forest structure and tree density (Montealegre et al., 2016). Thereby, terrain slopes, tree height and canopy cover of the study area were derived from ALS point cloud to define homogeneous areas.
The centre of the selected plots ( Figure 1(b)) was located in the field using a Leica VIVA GS15 CS10 GNSS real-time kinematic Global Positioning System. The average accuracy of the planimetric coordinates was 0.18 m. Tree breast height diameter (dbh) was measured at 1.3 m, using a Mantax Precision Blue diameter caliper (Haglöf Sweden®). It should be noted that only the trees with a dbh >7.5 cm were measured in each plot. The AGB was calculated for each plot according to Montero, Ruiz-Peinado, and Muñoz (2005) allometric equation and extrapolated to per hectare biomass value (kg of dry biomass per ha) considering the plot area (Equation (1)).

$$\text{Biomass (kg/ha)} = \left( \sum CF \cdot e^{a} \cdot dbh^{b} \right) \cdot \frac{10{,}000}{A_{plot}} \quad (1)$$
where the sum runs over the measured trees of the plot; CF is a correction factor ($CF = e^{SEE^{2}/2}$), e being the Euler number and SEE the standard error (0.151637); dbh is the breast height diameter in cm; a (−2.0939) and b (2.20988) are the specific parameters for Aleppo pine; and $A_{plot}$ is the area of each plot (706.8 m²). These data act as ground truth to fit and validate the AGB predictive model, which would be extrapolated to the burned area (Figure 1(a)) to estimate pre-fire biomass. The extrapolation of the AGB model was carried out in a Geographical Information System (GIS) environment using the selected Light Detection and Ranging (LiDAR) metrics and the coefficients of the model.
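The plot-level calculation described above can be sketched as follows. This is only an illustration: it assumes the usual back-transformed log-log allometric form (CF · e^a · dbh^b per tree, summed and scaled to one hectare), and the function names are hypothetical; the parameter values are those quoted in the text.

```python
import math

# Parameters quoted in the text for Aleppo pine (Montero et al., 2005).
A = -2.0939
B = 2.20988
SEE = 0.151637
PLOT_AREA_M2 = 706.8  # circular plot with a 15 m radius


def tree_biomass_kg(dbh_cm):
    """Dry aboveground biomass of one tree (kg) from its dbh (cm).

    Assumed form: CF * e^a * dbh^b, with CF = e^(SEE^2 / 2).
    """
    cf = math.exp(SEE ** 2 / 2)
    return cf * math.exp(A) * dbh_cm ** B


def plot_biomass_kg_ha(dbh_list_cm, plot_area_m2=PLOT_AREA_M2, min_dbh_cm=7.5):
    """Per-hectare biomass of a plot; only trees with dbh > 7.5 cm count."""
    total_kg = sum(tree_biomass_kg(d) for d in dbh_list_cm if d > min_dbh_cm)
    return total_kg * 10_000 / plot_area_m2
```

For instance, `plot_biomass_kg_ha([14.2, 20.0, 28.1])` would return the per-hectare biomass of a hypothetical plot with three measured trees.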

Remote sensing data
The ALS data were captured for the burned and unburned areas in several surveys carried out between January and February 2011, 4 years before the fire ignition, using a small-footprint oscillating-mirror airborne Leica ALS60 discrete-return sensor. The Spanish National Plan for Aerial Orthophotography (PNOA) provided these data with a nominal density of 0.5 points/m² (IGN, 2017b). Data were delivered by the National Geographic Information Centre (CNIG) in 2 × 2 km tiles of raw data points in LAS binary file format v. 1.2. The x, y and z coordinates were provided in the UTM Zone 30 ETRS 1989 geodetic reference system with orthometric heights. The point cloud was captured with up to four returns measured per pulse. The ALS60 sensor operated at a wavelength of 1.064 µm, with a beam divergence of 0.22 mrad and a scan angle of ±29° from nadir. Considering all returns, the ALS point cloud density was 1.5 points/m², with a vertical accuracy better than 0.2 m, both for the area burned on 4 July 2015 and for the unburned one. The temporal lag between the ALS data acquisition at the unburned area, captured in 2011, and the fieldwork campaign, performed in June and July 2015, was considered appropriate, as no significant changes took place in the study area in that period.

Methods
The two-phase approach methodology includes the pre-fire biomass estimation through the comparison of different models and the estimation of biomass losses by applying three burning efficiency factors to assess the CO2 emissions to the atmosphere (Figure 2).

Pre-fire AGB estimation
This section describes the process followed for ALS data processing, as well as the generation of pre-fire AGB model.

ALS data processing
The first processing step was noise point removal, which included verification of the overlapping returns. Thereafter, ALS point clouds were filtered using the multiscale curvature classification algorithm (Evans & Hudak, 2007) to extract the ground points. This algorithm, implemented in the MCC 2.1 command-line tool, is suitable for this environment according to Montealegre, Lamelas, and de la Riva (2015a). Then, a digital elevation model (DEM) with a 1 m grid size was generated using the Point-TIN-Raster interpolation method (Renslow, 2013), following Montealegre, Lamelas, and de la Riva (2015b). The normalized heights were obtained by subtracting the ground elevation value of the DEM from each point height. The normalized ALS tiles were clipped to the spatial extent of each field plot (Figure 1(b)). Furthermore, a wide range of statistical metrics commonly used as independent variables in forestry were calculated (Evans, Hudak, Faux, & Smith, 2009) using FUSION LDV 3.30 open source software (McGaughey, 2008). It should be noted that ALS-derived variables were generated after applying a threshold value of 2 m height so as to remove ground and understorey laser hits according to Nilsson (1996) and Naesset and Økland (2002).
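To make the height normalization and the two metrics that later enter the AGB model more concrete, the following minimal sketch computes them for the returns of a single plot. It is not the FUSION implementation: the function names, the input layout (height and return number per point) and the linear-interpolation percentile rule are assumptions made for illustration, and FUSION may define percentiles slightly differently.

```python
def normalize_heights(points, ground_elevations):
    """Subtract the DEM elevation under each point from its z value.

    points: list of (z, return_number); ground_elevations: DEM value per point.
    """
    return [(z - g, rn) for (z, rn), g in zip(points, ground_elevations)]


def percentile(sorted_values, q):
    """Percentile by linear interpolation between the closest ranks (assumed rule)."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    return sorted_values[lower] + (pos - lower) * (
        sorted_values[upper] - sorted_values[lower]
    )


def plot_metrics(returns, height_break=2.0):
    """Return (% of first returns above the break, 40th height percentile).

    returns: list of (normalized_height_m, return_number) for one plot.
    Heights at or below the 2 m break are excluded from the percentile.
    """
    first = [h for h, rn in returns if rn == 1]
    pct_first_above = (
        100.0 * sum(h > height_break for h in first) / len(first) if first else 0.0
    )
    canopy = sorted(h for h, _ in returns if h > height_break)
    p40 = percentile(canopy, 40) if canopy else 0.0
    return pct_first_above, p40
```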
Model for estimating pre-fire AGB
With the aim of comparing the predictive performance of different regression methods for the estimation of AGB, eight regression methods were analysed: a multivariate linear regression (MLR) model, two machine learning algorithms and five regression tree structures. These methods are briefly described below.
MLR has been widely employed to estimate forest parameters by relating dependent variables, from the fieldwork campaign, and independent variables, extracted from the ALS point cloud (García, Godino, & Mauro, 2012; Gonzalez-Ferreiro, Dieguez-Aranda, & Miranda, 2012; Lim, Treitz, Wulder, St-Onge, & Flood, 2003; Means et al., 1999; Naesset & Økland, 2002; Watt et al., 2013). As a first step, following Montealegre et al. (2016), the Spearman's rank correlation coefficient (ρ) was calculated in R software in order to select the ALS variables that show the strongest correlation with field plot biomass data. Only ALS metrics with a ρ value of at least ±0.5 were selected. Then, the selected variables were included in a forward stepwise regression, in order to avoid overfitting by selecting the smallest possible number of predictor variables. The fitted model was selected according to measures of goodness of fit. Moreover, it was verified whether the fitted model meets the basic assumptions of linear regression models according to García et al. (2012). Logarithmic transformation of dependent and independent variables was explored in the cases where the statistical hypotheses of linear regression models could not be fulfilled (García et al., 2012; Means et al., 1999), as well as to verify whether the measures of goodness of fit of the models improved.
Support vector machine (SVM) is a supervised learning model which has associated learning algorithms that analyse and recognize patterns. This method assumes that input data are separable in space (Mountrakis, Jungho, & Caesar, 2011). SVM tries to find, among multidimensional hyperplanes, the optimal separation between classes, where the separability is a maximum. The data located in the hyperplane are the most difficult to classify since they have lower separability, and they are called support vectors. SVM was implemented using the R package "e1071", and models with linear and radial kernels were computed. In both SVM models, the parameter cost was defined in the interval 1-1000 and the parameter gamma in the interval 0.01-1, applying the best parameters after tuning the model.
Random forest (RF) is an ensemble learning method that uses decision trees as base learners. RF combines decision trees, each of which depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest (Breiman, 2001). The algorithm adds randomness to bagging and increases the diversity of decision trees by growing them from different subsets. In each decision tree, RF divides the nodes by using the best variables from a random sample. RF was implemented through the R packages "randomForest" (Liaw & Wiener, 2002) and "caret" (Van Essen et al., 2001). The RF model was adjusted using two parameters: the number of trees to grow (ntrees) and the number of variables selected randomly at each split (mtry). They were in the intervals 1-1000 and 1-2, respectively.
The regression tree structures are nonparametric regression techniques based on "If-Then" rules. In this study three linear models and two non-linear local models are computed. The R package "CORElearn" has been used to perform the different regression trees. It should be added that the differences between them refer to the regression model considered in the leaf nodes.
Locally weighted linear regression (LWLR), or loess, is a method which fits a regression surface to data by smoothing the dependent variable as a function of the independent variables (Cleveland & Devlin, 1988). The coefficient of smoothness is fitted by computing weighted mean square error and considering a distance function.
Linear model with a minimum description length (MDL) principle is based on the rule developed by Rissanen (1978), which considers that regularities in a set of data can be used to compress the data by using fewer symbols, from a finite alphabet, than needed to describe the data literally.
Reduced linear model (RLM) is a linear model computed by the least square method and, after that, simplified using an exhaustive search to remove those variables that contribute little to the model in order to minimize the estimated error, as in regression tree models like the so-called M5 in Waikato Environment for Knowledge Analysis (Weka) software (Quinlan, 1992).
K nearest neighbour (KNN) is a lazy learning method based on the KNN algorithm (Fix & Hodges, 1951), which includes two phases. First, the KNNs are searched using the complete dataset and considering an established distance. Then, the mean of the k-most similar instances is used for the prediction.
Weighted k nearest neighbours (WKNNs) is a refinement of the KNN algorithm, which gives greater weight to the closer neighbours according to their distance to the observations. In this sense, the weighted mean of the KNNs is used for the prediction.
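The difference between the KNN and WKNN predictions can be illustrated with the short sketch below. It is not the CORElearn implementation: the Euclidean distance, the inverse-distance weighting and the function name are assumptions chosen for illustration.

```python
import math


def knn_predict(train_x, train_y, query, k=3, weighted=False):
    """Predict a value for `query` from its k nearest training instances.

    train_x: list of feature tuples; train_y: list of target values.
    weighted=False gives the plain mean of the neighbours (KNN);
    weighted=True gives an inverse-distance weighted mean (WKNN, assumed scheme).
    """
    # Phase 1: search the k nearest neighbours over the complete dataset.
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    neighbours = by_distance[:k]
    # Phase 2: aggregate the neighbours' target values.
    if not weighted:
        return sum(y for _, y in neighbours) / len(neighbours)
    # The small constant avoids division by zero for exact matches.
    weights = [1.0 / (d + 1e-9) for d, _ in neighbours]
    return sum(w * y for w, (_, y) in zip(weights, neighbours)) / sum(weights)
```

With training points at 0.0, 1.0 and 3.0 and a query at 0.25, the plain KNN mean (k = 2) treats both neighbours equally, whereas the weighted version pulls the prediction towards the closer instance.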

Model validation and comparison
The algorithms were computed after applying a preprocessing phase which is based on the normalization of the data in values ranging from 0 to 1. The scaling of the data avoids weights saturation (Görgens et al., 2015) and may improve the performance of the models. In order to avoid overfitting of the model by selecting the smallest possible number of predictor variables, a forward stepwise regression was used.
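The 0 to 1 normalization mentioned above corresponds to a min-max rescaling, which could be sketched as follows (the function name is hypothetical, and mapping a constant variable to zero is an assumption made to avoid division by zero):

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly so that min -> 0 and max -> 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant variable carries no information; map it to 0 (assumption).
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]
```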
Considering that fieldwork is a time-consuming task and increases the costs of the study, it was not possible to measure a high number of field plots. In this sense, the 46 measured plots, although they may seem a low number, are enough to meet the statistical requirements. Accordingly, the models were validated using a leave-one-out cross-validation (LOOCV) technique (Maltamo et al., 2014), so as not to further reduce the sample (Andersen, McGaughey, & Reutebuch, 2005). For those methods with randomness, LOOCV was executed 100 times so as to increase the robustness of the results (García-Gutiérrez et al., 2015).
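As an illustration of the LOOCV procedure, the sketch below leaves each plot out in turn, refits a model on the remaining plots and summarizes the prediction errors as RMSE and bias. A one-predictor least-squares line stands in for the actual models, and bias is taken as the mean of predicted minus observed values; both choices, like the function names, are assumptions for the example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def loocv_rmse_bias(xs, ys):
    """Leave-one-out cross-validation returning (RMSE, bias)."""
    errors = []
    for i in range(len(xs)):
        # Fit on every observation except the i-th one.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predicted = slope * xs[i] + intercept
        errors.append(predicted - ys[i])  # positive error = overestimation
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    bias = sum(errors) / len(errors)
    return rmse, bias
```

For methods with a random component, the same loop would simply be repeated (100 times in this study) and the results averaged.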
The comparison between models was performed by analysing the results in terms of root mean square error (RMSE) and bias. Furthermore, Friedman nonparametric test was applied in order to compare the performance of the different models (Friedman, 1940). The test was carried out separately for each RMSE measure of each fold of the cross-validation (Stojanova, Panče, Valentin, Andrej, & Sašo, 2010). In those cases where the null hypothesis of Friedman test was rejected, which implies that the models were not equivalent, the Nemenyi (1963) post-hoc test was used to determine whether the differences between the models were statistically significant, with a significance level of 0.05.

Estimates of biomass losses and conversion to CO2 emissions into the atmosphere
The estimation of biomass losses was performed in three phases: (i) wildfire severity estimation, (ii) pre-fire Aleppo pine forest location mapping and (iii) selection of burning efficiency factors related to pre-fire vegetation (De Santis et al., 2010;Oliva & Chuvieco, 2011).
First, wildfire severity was estimated according to FIREMON methodology (Key & Benson, 2006). NBR was calculated for pre-fire and post-fire images (Equation (2)). Then, the ΔNBR was estimated by the subtraction of NBR post-fire from NBR pre-fire (Equation (3)). Subsequently, the burned area was delimited using this index in a GIS environment.
$$NBR = \frac{\rho_{NIR} - \rho_{SWIR}}{\rho_{NIR} + \rho_{SWIR}} \quad (2)$$
$$\Delta NBR = NBR_{pre\text{-}fire} - NBR_{post\text{-}fire} \quad (3)$$
where $\rho_{NIR}$ (near infrared) and $\rho_{SWIR}$ (short-wave infrared) refer to the reflectance of Landsat 8 OLI bands 5 and 7, respectively.
In a second phase, the location of pre-fire Aleppo pine woodland was delimited using the Spanish National Forest Map (MAGRAMA, 2016b) and the canopy height model derived from the ALS data captured prior to the fire. In order to improve the accuracy of forest location, stands less than 2 m high were excluded from the analysis.
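The NBR and ΔNBR calculations of the first phase reduce to simple per-pixel band arithmetic, sketched below for single reflectance values. The function names are hypothetical, and in practice the same operations would be applied to whole raster bands in a GIS.

```python
def nbr(nir, swir):
    """Normalized burn ratio from NIR and SWIR reflectance of one pixel."""
    return (nir - swir) / (nir + swir)


def delta_nbr(nir_pre, swir_pre, nir_post, swir_post):
    """Pre-fire NBR minus post-fire NBR; higher values indicate higher severity."""
    return nbr(nir_pre, swir_pre) - nbr(nir_post, swir_post)
```

For a hypothetical pixel with pre-fire reflectances of 0.40 (NIR) and 0.10 (SWIR) and post-fire reflectances of 0.15 and 0.25, the pre-fire NBR is 0.6, the post-fire NBR is −0.25 and ΔNBR is 0.85.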
Third, a thorough bibliographic search of burning efficiency values for Mediterranean conifer forests was conducted (Deeming, Burgan, & Cohen, 1977;Miranda et al., 2005). However, few approaches were suitable to our Mediterranean conifer forests and most of them assume that forest biomass is consumed completely (French et al., 2004). The goal of this study was to obtain spatialized coefficients related to different burn severity levels. Thus, following De Santis et al. (2010) methodology, three burning efficiency factors, related to pre-fire vegetation, were applied considering low, moderate and high severity levels. The four generic severity ranges proposed by Key and Benson (2006) (Table 1) were reclassified in three ranges so as to match with the three burning efficiency factors. In this regard, the low burning efficiency value denotes low consumption of the leaves and very low woody branches consumption; the moderate burning efficiency value indicates intermediate consumption of the leaves and moderate consumption of small branches; and the high burning efficiency value suggests a complete consumption of the leaves and high loss of small branches and twigs.
The conversion of biomass losses to CO2 emissions requires the estimation of the biomass carbon content and the application of an emission factor. The carbon content was computed using a conversion factor of 0.499 set by Montero et al. (2005) for Aleppo pine. With respect to the emission factors, several conversion factors have been proposed so as to estimate different GHG emissions to the atmosphere. In this sense, the CO2 emissions to the atmosphere generated from forest biomass combustion were obtained according to the equation of Trozzi, Vaccaro, and Piscitello (2002), which includes the same parameters as the equations established by IPCC (2006), Levine (2003) and Seiler and Crutzen (1980) (Equation (4)).
$$CO_{2} = \varepsilon \cdot \delta \cdot C \quad (4)$$
where ε is the fraction of total carbon emitted as CO2 (0.888); δ is the factor of conversion from the emissions in ton of carbon to the emissions in ton of CO2 (44/12); and C is the carbon content.
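As a worked check of this conversion, the sketch below multiplies the biomass losses by the carbon fraction (0.499), the fraction of carbon emitted as CO2 (0.888) and the carbon-to-CO2 conversion factor (44/12). The product form is inferred from the parameter definitions in the text, and the function name is hypothetical.

```python
CARBON_FRACTION = 0.499   # carbon content of Aleppo pine biomass (Montero et al., 2005)
EPSILON = 0.888           # fraction of total carbon emitted as CO2
DELTA = 44 / 12           # ton of CO2 per ton of carbon


def co2_emissions_ton(biomass_loss_ton):
    """CO2 emitted (ton) for a given loss of dry biomass (ton)."""
    carbon_ton = CARBON_FRACTION * biomass_loss_ton
    return EPSILON * DELTA * carbon_ton
```

Applied to the biomass losses reported in this study (262,659.7 ton), `co2_emissions_ton(262659.7)` gives about 426,754.8 ton, which agrees with the reported total.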

Results
A summary of the field plot characteristics is presented in Table 2. Inventoried trees present a variety of diameters, from 14.2 to 28.1 cm, and diverse heights, ranging from 7.2 to 17.2 m. This accounted for the variability of biomass in the study area. All models included two ALS-derived variables: the percentage of first returns above 2 m (t-test: 8.3) and the 40th percentile of the return heights (t-test: 4.4), both variables showing a direct and coherent relation with AGB: the higher the value of the variables, the higher the biomass.
The regression models to estimate the AGB are summarized in Table 3. The MLR and the SVM with radial kernel (cost = 570 and gamma = 0.03) models presented the lowest RMSE, with 6.1 and 7.3 ton/ha, respectively. The LWLR regression tree performs slightly better than SVM with linear kernel (cost = 210 and gamma = 0.01), with RMSE of 8.3 and 8.5 ton/ha, respectively. Furthermore, the remaining regression trees as well as the RF model (ntrees = 500 and mtry = 1) show a lower accuracy. It should be added that most of the models present bias values close to zero, except for the SVM with linear kernel, WKNN and KNN models, which show a slight overestimation with values close to 1.
The performance comparison between the models, using the Friedman test, indicates that the models are not equivalent (p-value < 0.001). However, the application of the post-hoc Nemenyi test shows that only the WKNN (p-value = 0.0) and KNN (p-value = 0.0) models present statistically significant differences at the 0.05 significance level.
Figure 3 shows the scatter plots of the observed AGB against the model predictions for the different regression models. MLR and SVM with radial kernel show consistent results and high coefficients of determination (0.88 and 0.87, respectively). SVM with linear kernel and LWLR also present good coefficients of determination (0.84 and 0.83, respectively). Lower coefficients of determination as well as less stable results are evidenced in the scatter plots for the remaining regression trees, especially WKNN and KNN.
The implementation of the MLR model (Equation (5)) in a GIS allowed the estimation of pre-fire AGB, which amounts to 546,486.7 ton.
$$\text{PrefireAGB} = 1.007 \times 10689.32 \times e^{(0.0158 \times \text{percentage of first returns above 2 m})} \times e^{(0.0713 \times \text{40th percentile of height})} \quad (5)$$
The Aleppo pine forest was burned with high severity in most of the area, as can be observed in Figure 4(a). The biomass losses range from 4 ton/ha to more than 12 ton/ha (Figure 4(b)). As can be observed in Table 4, high severity areas represent ~60% of the Aleppo pine burned area, accounting for ~70% of biomass losses. Finally, the combustion of Aleppo pine forest in the Luna wildfire emitted 426,754.8 ton of CO2 into the atmosphere.
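The pre-fire AGB model of Equation (5) can be evaluated cell by cell from the two ALS metrics, as in the sketch below. The coefficients are those of Equation (5); the function and argument names are hypothetical, and the output unit is presumably kg/ha as in Equation (1), although the equation itself does not state it.

```python
import math


def prefire_agb(pct_first_returns_above_2m, p40_height_m):
    """Pre-fire AGB predicted by the MLR model of Equation (5).

    pct_first_returns_above_2m: percentage (0-100) of first returns above 2 m.
    p40_height_m: 40th percentile of the return heights, in metres.
    """
    return (
        1.007
        * 10689.32
        * math.exp(0.0158 * pct_first_returns_above_2m)
        * math.exp(0.0713 * p40_height_m)
    )
```

Because both exponents are positive, the prediction increases with canopy cover and with height, which matches the direct relation between the two variables and AGB reported above.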

Discussion
The use of GHG emissions equations is widely accepted for accounting for forest biomass combustion by a wildfire (IPCC, 2006; Levine, 2003; Seiler & Crutzen, 1980; Trozzi et al., 2002). Moreover, several conversion factors as well as emission factors, from global to regional scales, have been proposed to accurately estimate emissions to the atmosphere. However, one of the main uncertainties related to the use of these equations is the estimation of pre-fire biomass and biomass losses. In this sense, LiDAR technology has been proposed as the best technique to accurately estimate forest structural parameters, such as biomass, and artificial intelligence methods have been applied and compared to generate this variable. Nevertheless, little research has focused on comparing the performance of several regression models, including machine learning algorithms and regression trees, with that of traditional MLR models to estimate forest parameters. This study proposes the use of low point density ALS data to improve the estimation of pre-fire AGB by comparing a set of state-of-the-art methods and traditional linear regression methods in a Mediterranean Aleppo pine forest, considering that biomass estimation is the key information to compute burning emissions.
The results demonstrate that low-density ALS data can be used to accurately estimate pre-fire biomass.
The two ALS-derived variables included in the models were analogous to those proposed by other authors (e.g. Guerra-Hernández et al., 2016b; Montagnoli et al., 2015). These variables concern the canopy cover distribution and the vertical distribution of the point cloud. The comparison between regression models shows that the MLR model has the lowest RMSE (6.1 ton/ha) and bias (0.0), matching the values obtained by other authors (Gonzalez-Ferreiro et al., 2012; Montealegre, Lamelas, de la Riva, García-Martín, & Escribano, 2015c). Consequently, MLR slightly outperforms the nonparametric methods, supporting the findings of Görgens et al. (2015). However, no statistically significant differences between MLR and SVM with radial kernel were found. This suggests that the results partly agree with Gleason and Im (2012), Gagliasso et al. (2014) and García-Gutiérrez et al. (2015), who obtained lower estimation errors with nonparametric techniques, although the latter authors included a relatively high number of independent variables in the models. In this sense, the use of a large number of variables tends to increase the performance of the models. Nevertheless, the selection of a reduced number of biologically representative variables, especially when computing non-linear regression models, might generate more understandable models for forest management purposes. This might also explain why MLR models outperform other nonparametric models, considering the number of variables included in the models of Görgens et al. (2015). It should be noted that, as in several previous studies (García, Riaño, Chuvieco, & Danson, 2010; Naesset & Gobakken, 2008; Naesset & Økland, 2002), it was necessary to perform a logarithmic transformation of the dependent variable in order to meet the assumptions of the linear regression model.
The three-phase approach is considered a suitable option for estimating biomass losses, which amount to 262,659.7 ton. This methodology overcomes the lack of post-fire forest structure information derived from ALS data and constitutes an alternative to field estimation of burning efficiency, which is laborious, expensive and requires a detailed knowledge of the pre-fire scenario (De Santis et al., 2010). The use of conversion and emission factors for the Mediterranean basin, included in the equation of Trozzi et al. (2002), makes it possible to accurately estimate CO2 emissions at a regional scale, which total 426,754.8 ton.
When comparing nonparametric methods and linear regression models, discrepancies appear. Our findings show that the use of nonparametric methods does not always ensure the best biomass estimations. In this sense, the generation of several models, such as MLR, SVM with radial kernel, LWLR or SVM with linear kernel, might be taken into account. Furthermore, the use of variable selection processes should be considered, in order to determine a limited number of variables which are biologically representative. Consequently, new artificial intelligence and nonparametric models, which have several advantages such as not requiring normality, should be used for forestry and environmental purposes with the aim of obtaining robust and understandable models. The improved estimation of CO2 emissions from biomass burning, by including ALS data as relevant information for computing biomass, is expected to help better understand the interactions between fire disturbances and the emissions to the atmosphere. Nevertheless, it should be noted that our model does not consider the emissions generated by the combustion of litter, shrubs and young trees, implying an underestimation of the total CO2 emissions during the wildfire. It was not possible to estimate these fractions of biomass. In fact, the few studies developed to estimate shrub biomass were performed using high-density ALS data or full waveform LiDAR (Estornell, Ruiz, Velázquez-Martí, & Hermosilla, 2012; E. Greaves et al., 2016; Swatantran, Dubayah, Roberts, Hofton, & Bryan Blair, 2011). Moreover, the majority of them were developed in areas without tree cover due to the difficulty of the pulse to penetrate the canopy (Vosselman & Maas, 2010). In this sense, further research is needed on the estimation of shrub biomass using low point density ALS data in order to improve the estimates of GHG emissions to the atmosphere.
Considering that the findings are site-dependent, the comparison of different biologically representative models for biomass estimation at regional scales, as well as alternative variable selection processes, may be considered. In this sense, the use of multi-temporal ALS or multi-temporal series of remote sensing data might be useful to better understand the effect of wildfire disturbances on the atmosphere. It would also be desirable to focus on accounting for the emissions of CO2 or other GHGs generated by the combustion of other Mediterranean species.

Conclusions
This study verifies the usefulness of low-density ALS data to accurately estimate pre-fire AGB in a monospecific Aleppo pine forest, which is relevant information to compute biomass losses caused by fire and CO2 emissions. The comparison of the effectiveness of a set of state-of-the-art artificial intelligence methods and traditional linear regression methods is especially interesting to improve the modelling of forest parameters. The best model for pre-fire AGB estimation was the MLR, which included two ALS variables, the percentage of first returns above 2 m and the 40th percentile of height, presenting an RMSE of 6.1 ton/ha and a bias of 0.0. No statistically significant differences were found between MLR and SVM with radial kernel, which is the second best model. The three-phase approach used for the estimation of biomass losses and the subsequent transformation into CO2 makes it possible to quantify the emissions to the atmosphere from the combustion of Mediterranean Aleppo pine forest in the Luna wildfire, which total 426,754.8 ton.