Probabilistic modelling of wildfire occurrence based on logistic regression, Niassa Reserve, Mozambique

Abstract Fires are one of the main causes of disturbance in Niassa Reserve, Mozambique, with economic and environmental impacts. Fire occurrences are recorded cyclically across the reserve. However, studies on the main causative factors and on the identification of the most susceptible locations are very limited. This study therefore had two objectives: (1) to determine the main significant factors for wildfire occurrence and (2) to map the probability of wildfire occurrence using logistic regression. Independent variables included a vegetation index (NDVI) and climatic, topographic and socioeconomic data. The analysis period was from 2001 to 2015 and comprised the months with the most wildfire occurrences (May to December). According to the results, the main factors that determine the occurrence of fires were NDVI, temperature and elevation, followed by precipitation, slope, relative humidity and human settlements. The spatial distribution of the probability of fire occurrence reveals that zones with high and very high risk are located in the west and central west (areas with higher accumulation of dry biomass); medium risk zones are located in the centre of the reserve, while in the central east and east the probability of fire occurrence is low and very low. Results showed that the prediction of wildfire ignition using logistic regression presented acceptable precision (area under the curve 74%).

Niassa Reserve is the most extensive area of miombo vegetation conservation worldwide (Ribeiro et al. 2008), with one of the largest fauna concentrations in Mozambique (Leo-Smith et al. 2007). Though fires can play an ecologically significant role in biogeochemical cycles and functioning of this system, the frequency and intensity of fires many times lead to forest vegetation destruction with huge negative effects in atmospheric chemistry (atmospheric pollution and carbon emission) and in ecology (loss of biodiversity, landscape instability and proliferation of invasive species) (Chuvieco 2003;Bond and Keeley 2005), becoming a threat to economic goods and to human health (Shlisky et al. 2007).
According to Timberlake et al. (2004), fires recur in Niassa Reserve every year from May to October. From 2000 to 2012 alone, 45% of the reserve area burnt annually or every two years and 27% burnt every 3-4 years, while only 9% of the total area did not record burning during this period (Ribeiro et al. 2017). According to the same authors, fire in the Reserve presents a return interval of 3.29 years, which means a total frequency of 0.36 year⁻¹.
Miombo forests' ecological characteristics, their seasonality, climatic factors and physiographic characteristics have influenced the incidence of fires in Niassa Reserve. In addition to these factors, the pressure on this area, with an increase of anthropic activities, has considerably increased the number of fires detected and burnt areas.
However, for the burning to occur, three conditions are required: propitious meteorological conditions; availability of combustible vegetation; existence of an ignition source (Parisien and Moritz 2009). The action of each of these factors is different for each region and time of the year, which causes high variability in the pattern observed in fires. In African forests, it is known that this variability is determined by a combination of factors: climatic (rainfall and temperature), herbivory and human activities (Archibald et al. 2010).
In this perspective, the understanding of the way the environment is occupied, its physical characterization, including biological and climatic aspects of each geographic region, and the determination of controlling factors of fires can assist in the detection of locations more susceptible to the occurrence of wildfire, facilitating the planning of strategies for fire prevention and fighting (San-Miguel-Ayanz et al. 2003;del Hoyo et al. 2011).
The modelling of fire risk thus becomes an important tool for forest managers in the identification of locations with high risk of forest fire, also leading to the optimization and allocation of resources for firefighting (San-Miguel-Ayanz et al. 2003;Mohammadi et al. 2014). In locations like Niassa reserve, the identification of determinant factors to control wildfire and the use of maps of fire risk probability can serve, therefore, as a preventive or protective approach to improve fire management.
Methods coupled to Geographic Information Systems, integrating remote sensing data, have often been used to model the probability of fire risk and to determine controlling factors at large, regional and local scales. Among them, statistical methods stand out: artificial neural networks (de Vasconcelos et al. 2001; Costafreda-Aumedes et al. 2015), the maxent algorithm (Renard et al. 2012), the autoregressive model (Prestemon et al. 2012), classification trees (Lozano et al. 2008), global logistic regression (Zhang et al. 2013), multiple linear regression and random forest (Oliveira et al. 2012; Guo et al. 2016a).
However, the choice of the modelling method depends on the characteristics of the dependent variable. When a very fine spatial resolution is used (for example 1 km), a binomial response is required because only presence or absence is recorded (Taylor et al. 2013).
In such cases logistic regression is commonly used. It is one of the most widely used statistical methods, both for the prediction of fire risk and for determining the causes of fire, at the global level (de Vasconcelos et al. 2001; Lozano et al. 2008; Syphard et al. 2008; del Hoyo et al. 2011; Padilla and Vega-García 2011; Magnussen and Taylor 2012). Compared to other techniques, it is flexible: variables can be continuous and/or categorical, and they are not required to follow a normal distribution (Legendre and Legendre 1998; Catry et al. 2009).
Thus, this study had as objectives (1) determine the main significant factors for wildfire occurrence and (2) map the probability of wildfire occurrence in Niassa Reserve, based on logistic regression.

Study area
Niassa Reserve is located in the north of Mozambique, between parallels 12°36′46.67″ and 11°26′05.83″ south and meridians 32°25′20.16″ and 38°31′23.16″ east (Figure 1). The reserve is part of the Rovuma watershed and its territorial extension is 42,311 km². However, this study is concentrated in the central area of the reserve (Conservation Area), with around 23,040 km²; the remaining portion is part of the buffer area, administered by the concessionary for touristic purposes. The climate in the region is dry sub-humid tropical, characterized by two distinct climatic seasons: dry (May to September) and rainy (October to April). The average annual temperature ranges from 20 to 26 °C, while average annual rainfall ranges from 770 to 1,140 mm.
The vegetation cover is characterized by the occurrence of four large vegetal formations: deciduous forest; open semi-deciduous forest, mountain forest, riverside forests (woods) and scrubland (Nhongo et al. 2017). Of the total reserve area, 72% is covered by Zambezian dry miombo forest, which occurs on sandy soils and high lands, with predominance of Brachystegia spiciformis, Brachystegia boehmii and Julbernardia globiflora (White 1983). Elevations range from 136 to 1,413 m above sea level, with a gradual increase from east to west and the occurrence of several inselberg rock formations.
The reserve has the lowest demographic density in the country, around 1.3 inhabitants/km² (Ribeiro et al. 2017).

Dependent variable: active fire
Active fire data were obtained for the period from January 2001 to December 2015 from the MODIS sensor on board the Aqua and Terra platforms, with a spatial resolution of 1 km (collection 6, monthly thermal anomalies product MCD14ML), made available by NASA FIRMS (Fire Information for Resource Management System) through the website (https://earthdata.nasa.gov/earth-observation-data/nearreal-time/firms/). Each MODIS active fire position represents the centre of a 1 × 1 km pixel that is labelled by the algorithm as containing one or more fires.
To avoid false alarms (commission errors), only high reliability fire pixels were considered (>80% reliability). The product is not free of errors. In some cases it underestimates fire occurrence, for example for short-duration burnings that occur between available images or that start and end before the satellite overpass; very small fire fronts that are hardly detectable; cloud cover at the time of image acquisition; heavy smoke; and ground fires under dense forest that do not affect treetops (Oliveras et al. 2014; Giglio 2015; Anderson et al. 2015). It can also overestimate fire pixels over targets with contrasting temperatures (e.g. the limit between forest and bare soil on hot days) and over sandy soils or exposed rock presenting high temperatures on hot days (Schroeder et al. 2008; Devisscher et al. 2016), and some highly questionable fire pixels are still classified as nominal reliability, despite the adjustments made in collection 6 (Giglio 2015).

Independent variables
Independent variables comprised four categories: climate, vegetation, topography and socioeconomic factors. Criteria for selection of variables were based on previous studies on fire occurrence (Achard et al. 2008; Wotton et al. 2010; Gralewicz et al. 2012; Oliveira et al. 2012; Chuvieco et al. 2012) and on knowledge of the study area. It is worth mentioning that the variable land use and cover was not included because the study area presents the lowest demographic density and, therefore, low diversity of land use and cover. 70% of the reserve area is occupied by vegetation cover, represented in this study by NDVI. Details of the variables used are presented in Table 1. The distance to roads and human settlements was generated based on the Euclidean distance of each cell to the nearest road or human settlement.
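As an illustration of the distance variable described above, the following minimal Python sketch computes, for every cell of a small grid, the Euclidean distance to the nearest feature cell. The grid size, the cell size and the settlement positions are hypothetical, and the brute-force search is only an assumption made for clarity; it is not the GIS procedure used in the study.

```python
import math

def distance_grid(n_rows, n_cols, features, cell_size=1.0):
    """Distance from each cell centre to the nearest feature cell (brute force)."""
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Euclidean distance in cell units to the closest feature
            nearest = min(math.hypot(r - fr, c - fc) for fr, fc in features)
            row.append(nearest * cell_size)
        grid.append(row)
    return grid

# Hypothetical 4 x 5 grid of 1 km cells with two settlement cells.
settlements = [(0, 0), (3, 4)]
dist_km = distance_grid(4, 5, settlements, cell_size=1.0)
```

Each value in `dist_km` can then be read as one cell of a distance-to-settlements layer; the same function applies to road cells.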

Ignition frequency of wildfire conditioning factors
First, an analysis of ignition frequency of fires in the 15-year period was made with regard to the conditioning factors. Histograms were generated representing the frequency of fire ignition per class in each conditioning factor. Mean and coefficient of variation were also calculated for each factor.
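The frequency analysis described above can be sketched in Python as follows. The temperature values and class limits are hypothetical, and the coefficient of variation is assumed here to be the sample standard deviation divided by the mean, expressed in percent.

```python
import statistics

def class_frequencies(values, edges):
    """Count values per class; class i covers [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return counts

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical air temperatures (deg C) sampled at ignition points.
temps = [22.5, 23.1, 24.0, 24.8, 25.2, 25.9, 26.4, 23.7]
edges = [20, 23, 26, 29]
counts = class_frequencies(temps, edges)   # histogram of ignitions per class
mean_temp = statistics.mean(temps)
cv_temp = coefficient_of_variation(temps)
```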

Modelling approach
To estimate the probability of fire risk in Niassa Reserve, logistic regression was applied based on the following equation:

$$P = \frac{1}{1 + e^{-z}} \qquad (1)$$

where P is the probability of occurrence of the event, which is modelled as a dichotomous variable; z is obtained from a linear combination of independent variables fitted by maximum likelihood, with constant a, partial regression coefficients b and original values of variables x:

$$z = a + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$$

The use of logistic regression assumes that the response variable is dichotomous, which requires both presence and absence of fire sources. For presence, fire sources recorded from 2001 to 2015 were used, coded as 1 (representing occurrence). For nonoccurrence (absence), 31,834 random points not coincident with ignition points were generated across the whole reserve, in a 1:1.5 ratio of ignition sources to random points (Catry et al. 2009; Chang et al. 2013), and coded as 0 (representing nonoccurrence).
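The construction of the dichotomous response can be sketched as below: ignition cells are coded 1 and randomly drawn cells that do not coincide with ignitions are coded 0, with 1.5 absences per ignition. The grid dimensions, the ignition positions and the function name `sample_absences` are hypothetical and serve only to illustrate the idea.

```python
import random

def sample_absences(n_rows, n_cols, ignition_cells, ratio=1.5, seed=0):
    """Draw random cells that do not coincide with ignition cells."""
    occupied = set(ignition_cells)
    candidates = [(r, c) for r in range(n_rows) for c in range(n_cols)
                  if (r, c) not in occupied]
    n_absent = round(ratio * len(occupied))   # 1.5 absences per presence
    rng = random.Random(seed)                 # fixed seed for reproducibility
    return rng.sample(candidates, n_absent)

# Hypothetical ignition cells on a 10 x 10 grid.
ignitions = [(1, 1), (2, 5), (4, 3), (7, 7)]
absences = sample_absences(10, 10, ignitions)

# Response variable: 1 = ignition, 0 = non-ignition.
records = [(cell, 1) for cell in ignitions] + [(cell, 0) for cell in absences]
```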
Additionally, in order to validate the model, two distinct groups were generated, one to build the model (training) and the other for its validation, a procedure used by several authors (Catry et al. 2009; Chuvieco et al. 2009; del Hoyo et al. 2011; Guo et al. 2016b), who vary only in the percentage assigned to each group. In the present work, 50% of ignition points (10,612) and 50% of nonignition points (15,917) were selected as the training subset, and the remaining points were used to test the model's predictive capacity, that is, for its validation. All analyses were made using the software SPSS 24.0.
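A 50/50 split applied separately to ignition and nonignition points, as described above, might look like the following sketch. The point identifiers and the seed are hypothetical; the study itself used SPSS, so this is only an illustration of the partitioning logic.

```python
import random

def split_half(points, seed=0):
    """Shuffle a list of points and split it into two halves."""
    shuffled = points[:]                      # leave the input untouched
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical identifiers for ignition and non-ignition points.
ignition_ids = list(range(20))
absence_ids = list(range(100, 130))

# Splitting each group on its own keeps the 1:1.5 ratio in both subsets.
train_fire, valid_fire = split_half(ignition_ids, seed=1)
train_abs, valid_abs = split_half(absence_ids, seed=1)
```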
Since logistic regression assumes that independent variables should not be correlated (collinearity), multicollinearity among independent variables was tested using Tolerance and VIF (Variance Inflation Factor). Multicollinearity is present when there is some level of inter-relation among predictive variables (Villagarcía 2006). Its existence in a regression model may distort the model estimates or reduce their precision. Variables presenting significant collinearity (VIF ≥ 10 and Tolerance < 0.1) must be removed from the model. The values of Tolerance and VIF obtained indicated that there are no multicollinearity problems.
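The collinearity diagnostics can be illustrated with a short pure-Python sketch. It relies on the standard identity that the VIF of predictor j is the j-th diagonal element of the inverse of the predictors' correlation matrix, and that Tolerance is 1/VIF. The two predictor columns are hypothetical values chosen so that their correlation is 0.8.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    centred = [[v - m for v in col] for col, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in centred]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centred[i], centred[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif(columns):
    """Variance inflation factor of each predictor column."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

# Two hypothetical predictors with correlation 0.8.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vifs = vif([x1, x2])
tolerances = [1.0 / v for v in vifs]
```

With a correlation of 0.8 both VIF values are 1/(1 − 0.8²) ≈ 2.78, well below the threshold of 10.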
The Forward Stepwise method was applied to build the logistic regression model. The forward method starts with a model containing only the constant and adds variables one at a time, beginning with those most strongly correlated with the response variable. When no variable is included at a given stage, the process stops and the variables selected up to that stage define the final model (del Hoyo et al. 2011).
The significance of each variable was assessed by the Wald test (Legendre and Legendre 1998) at the 5% (P < 0.05) significance level. In addition, the odds ratio was calculated as the exponentiated coefficient Exp(b_i), which indicates the change in the odds of occurrence resulting from a change of one unit in the predictor.
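As a worked illustration of the odds ratio, the sketch below exponentiates the coefficients reported for the final model in Equation (2) of the results. The dictionary keys are shorthand labels; no standard errors or Wald statistics are reproduced here because they are not given in the text.

```python
import math

# Coefficients b_i as reported in Equation (2) of this paper.
coefficients = {
    "Elev": 0.004,
    "Dist_sett": 0.000004,
    "NDVI": -1.131,
    "Prec": 0.022,
    "Slop": -0.040,
    "Temp": -0.396,
    "UR": 0.020,
}

# Odds ratio Exp(b_i): multiplicative change in the odds of ignition
# for a one-unit increase in the predictor, other predictors held fixed.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}
```

An odds ratio below 1 (for example NDVI) means that the odds of ignition fall as the predictor increases, while a value above 1 (for example elevation) means that they rise.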
Once the model was defined, the next stage was testing its performance, which was done using different approaches. The global assessment of the model was made using the Hosmer-Lemeshow goodness-of-fit test (Hosmer et al. 1997; Hosmer and Lemeshow 2000). According to Norusis (2002), if the test result is below 5%, the null hypothesis that there is no difference between observed and predicted values is rejected, meaning that the model does not fit the data. To assess the predictive capacity of the logistic model, 2 × 2 classification tables of observed and predicted values were also created, using and comparing the training and validation data sets. To determine the threshold probability (cutoff) above which fire ignition is accepted and below which it is considered that no fire occurred, the Youden index was applied (Garcia et al. 1995; Chang et al. 2013), which was also used in previous studies to determine the best cutoff values in logistic regression to predict the occurrence of fires (Catry et al. 2009; Chang et al. 2013). The optimal value corresponds to the intersection between sensitivity and specificity (de Vasconcelos et al. 2001). The training data set was used to build the classification table and determine the optimal cutoff value, which was 0.3824 for the present work.
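The choice of cutoff can be sketched as follows. The code uses the usual definition of the Youden index, J = sensitivity + specificity − 1, and returns the threshold that maximises it; this is an assumption about the exact rule, since the text also refers to the intersection of the sensitivity and specificity curves. The observed labels and predicted probabilities are hypothetical.

```python
def youden_cutoff(y_true, y_prob):
    """Return the cutoff that maximises J = sensitivity + specificity - 1."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(y_prob)):
        # A point is predicted as ignition when its probability >= cutoff.
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= cutoff)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < cutoff)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:                       # keep the first maximum found
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Hypothetical observed labels and predicted probabilities.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.9]
cutoff, j_stat = youden_cutoff(y_true, y_prob)
```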
Another procedure used to assess how well a model is parameterized and calibrated was the receiver operating characteristic (ROC) curve, which analyses the proportion of true positives classified as positive (sensitivity) and of true negatives classified as negative (specificity), and is a plot of sensitivity versus 1 − specificity for several probability thresholds (Swets 1988; Fielding and Bell 1997). A model that denotes good performance is one with a large area under the curve (Catry 2007; del Hoyo et al. 2011; Jiménez-Valverde 2012). Values between 0.5 and 0.7 indicate low precision, values between 0.7 and 0.8 indicate acceptable precision, values between 0.8 and 0.9 indicate good precision and values above 0.9 reveal excellent predictive capacity of the model (Swets 1988; Hosmer and Lemeshow 2000; McCune et al. 2002; del Hoyo et al. 2011).
[...] (2010). For the standardized coefficient, the calculation multiplies each nonstandardized logistic coefficient by the standard deviation of the variable to which the coefficient belongs; the higher the absolute value of the standardized coefficient, the higher the importance of the variable (Galán and López 2003); and (vi) change in R² when the variable was removed from the model (the higher the change, the more important the variable). For the present study, the change in the log-likelihood (−2 LL) was used. The most important variable in the model is the one with the lowest global score.
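The area under the ROC curve can be computed without plotting, using its equivalence with the probability that a randomly chosen ignition point receives a higher predicted probability than a randomly chosen nonignition point. The sketch below implements that pairwise count and labels the result with the precision classes quoted above; the handling of values falling exactly on a class limit is an assumption, and the data are hypothetical.

```python
def roc_auc(y_true, y_prob):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5      # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))

def precision_label(auc):
    """Precision classes as quoted in the text (limits assumed inclusive below)."""
    if auc < 0.7:
        return "low"
    if auc < 0.8:
        return "acceptable"
    if auc <= 0.9:
        return "good"
    return "excellent"

# Hypothetical observed labels and predicted probabilities.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.9]
auc = roc_auc(y_true, y_prob)
```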

Spatial modelling
To produce an ignition probability map, all independent variables were represented in layers, in a geographic information system.

Model validation
The validation of the logistic regression model is obtained by applying the model to the validation sample (Hair 1998). Thus, independent analysis was made using the validation group to check the efficacy of the predictive model built. For such, the ROC curve and classification table were used. Additionally, to assess the predictive capacity of the risk map produced, a distribution of ignition points of the validation group across classes of the ignition probability map was made.

Analysis of frequency of fire ignition
A preliminary analysis of the spatial distribution of fires against the selected variables (Figure 2) shows that the highest frequencies of fires occurred in areas where temperatures were mild (between 23 and 26 °C), with low precipitation (<50 mm), while 90% of fires occurred in areas with low to medium relative humidity (40-60%). Most fire sources occurred at low to medium elevation (150-650 m) and around 80% on flat slopes and gentle hillsides (0-25%). With regard to Aspect, ignitions were evenly distributed across the north, east, south and west quadrants, but with higher incidence in the north of the reserve. The highest frequencies of fires were also recorded in areas with low and medium NDVI values (0.2-0.4). Results also revealed that most fires occurred at a distance between 10 and 20 km from human settlements and near roads. This analysis also shows that the variable with the highest variability was precipitation (coefficient of variation, CV = 213) and the one with the lowest variability was air temperature (CV = 6.1).

Model of fire ignition
After several iterations, the final model selected the variables most correlated with wildfire, which were NDVI, Air temperature, Elevation, Precipitation, Slope, Relative humidity and Distance to human settlements. These variables were significantly related to the probability of occurrence of wildfire ignitions (P < 0.05). Roads and Aspect were excluded. The significance of the explanatory variables and their respective coefficients are presented in Table 2. According to the results, a negative relation was found between fire occurrence and NDVI, Air temperature and Slope, and a positive relation was observed with Elevation, Precipitation, Relative humidity and Distance to human settlements.
The Hosmer and Lemeshow goodness-of-fit test indicated insufficient fit of the regression to the data (χ² = 310.422, df = 8, P < 0.0001). The model's predictive capacity was also assessed using classification tables comparing observed and predicted values; using a cutoff value of 0.3830, 67.10% of cases were correctly classified with the training data (Table 3). The area under the curve (AUC) was 75% (Figure 3), indicating acceptable precision. Taken together, these measurements suggest that the model can be accepted as a significant logistic regression model.
The model obtained was represented by Equation (2):

$$P_i = \frac{1}{1 + e^{-(10.580 + 0.004\,\mathrm{Elev} + 0.000004\,\mathrm{Dist\_sett} - 1.131\,\mathrm{NDVI} + 0.022\,\mathrm{Prec} - 0.040\,\mathrm{Slop} - 0.396\,\mathrm{Temp} + 0.020\,\mathrm{UR})}} \qquad (2)$$

where P_i is the probability that a point corresponds to a fire ignition; Elev is elevation; Dist_sett is distance to settlements; NDVI is the normalized difference vegetation index; Prec is rainfall; Slop is slope; Temp is temperature; UR is relative humidity.
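To show how the fitted model can be evaluated at a single location, the sketch below codes Equation (2) under the assumption that it has the usual logistic form P = 1/(1 + e^(−z)), with z equal to the constant 10.580 plus the weighted predictors. The input values and their units (metres for elevation and distance, mm for rainfall, percent for slope and relative humidity, °C for temperature) are hypothetical and are not taken from the study.

```python
import math

def ignition_probability(elev, dist_sett, ndvi, prec, slop, temp, ur):
    """Evaluate Equation (2), assuming the form P = 1 / (1 + exp(-z))."""
    z = (10.580
         + 0.004 * elev
         + 0.000004 * dist_sett
         - 1.131 * ndvi
         + 0.022 * prec
         - 0.040 * slop
         - 0.396 * temp
         + 0.020 * ur)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical cell: 500 m elevation, 15 km from settlements, dry-season NDVI.
p_dry = ignition_probability(elev=500, dist_sett=15000, ndvi=0.3,
                             prec=10, slop=5, temp=24, ur=50)

# Same cell with greener vegetation: the negative NDVI coefficient lowers P.
p_green = ignition_probability(elev=500, dist_sett=15000, ndvi=0.7,
                               prec=10, slop=5, temp=24, ur=50)
```

Applying the same function to every cell of the predictor layers yields the kind of ignition probability surface described in the spatial modelling section.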

Relative importance of variables
The result of relative importance of variables is presented in Table 4. The global scoring, calculated by the sum of classification of all variables, is presented in the last column. NDVI was the most important variable in fire occurrences, followed by air temperature, elevation, precipitation, slope and air relative humidity, and in last position, distance to human settlements.

Spatial modelling of probability of fire occurrence
The spatial distribution of the logistic probability of the final model (Figure 4) is the estimate of fire risk for Niassa Reserve. The probability scale was divided into five classes: very low (0.00-0.20), located in the east; low (0.20-0.40), located in the central east; medium (0.40-0.60), in the central area of the reserve; high (0.60-0.80), in the central west; and very high (>0.80), in the west of the reserve. In terms of relative distribution, 10% of the reserve area presents very low probability of occurrence, 14% low, 21% medium, 28% high and 27% very high (Table 5).
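The five-class scheme can be expressed compactly in code. In this sketch a probability falling exactly on a class limit is assigned to the lower class, so that "very high" contains only values above 0.80; that boundary rule and the sample probabilities are assumptions made for illustration.

```python
from bisect import bisect_left
from collections import Counter

BREAKS = [0.20, 0.40, 0.60, 0.80]
LABELS = ["very low", "low", "medium", "high", "very high"]

def risk_class(p):
    """Map an ignition probability to one of the five risk classes."""
    return LABELS[bisect_left(BREAKS, p)]

def area_share(probabilities):
    """Percentage of equally sized cells falling in each class."""
    counts = Counter(risk_class(p) for p in probabilities)
    total = len(probabilities)
    return {label: 100.0 * counts[label] / total for label in LABELS}

# Hypothetical probabilities for eight cells of a probability map.
cells = [0.05, 0.30, 0.50, 0.70, 0.95, 0.85, 0.65, 0.90]
shares = area_share(cells)
```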

Model validation
According to the validation results, presented in the contingency table, the model correctly classified 66.8% of all observations, slightly below the result obtained with the training group (67.1%). The results lead us to conclude that the model has good predictive capacity for ignition points but a lower capacity to classify nonignition points, which may mean that the area predicted as fire-prone is overestimated. In short, the correct classification rates in the validation sample are almost identical to those in the training sample (Table 5), and one may conclude that the logistic regression model has strong empirical support both in the validation sample and in the training sample.
The model performance was analysed through the area under the ROC curve. For the validation group, it was 0.74, which reveals acceptable capacity of the model, slightly reducing when compared to the results of the training group (0.75), but both are quite satisfactory.
In order to assess the predictive capacity of the map produced, the ignition points of the validation group were distributed across the classes of the ignition probability map. According to the results (Table 5), although the very high probability class represents 27% of the area, it contains 39% of the ignition points of the validation group and presents the highest density (0.66 ignitions per km²). The high class represents 28% of the area and contains 29% of the ignition points. Together, the two classes account for 68% of the ignition points of the validation group. The medium class occupies 21% of the area and contains 18% of ignition points. Only 13% of ignitions are located in the low and very low risk classes, which represent 24% of the territory. With these results one can conclude that the ignition probability map presents good predictive capacity.

Determinant factors in fire occurrence
The logistic regression analysis provided grounds for understanding the determinant factors for fire occurrence in Niassa Reserve. Results show that NDVI, air temperature and elevation are the most important variables, followed by precipitation, slope and relative humidity; the least important variable in the determination of fire occurrence is distance to human settlements. These results are consistent with recent studies on determinant factors for the occurrence of fire in African savannahs and conservation areas (Berjak and Hearne 2002; Trollope and Trollope 2004; Archibald et al. 2009).
NDVI is the most important variable; it presented a strong negative relation with fire occurrence, which means that the probability of fire occurrence increases as vegetation vigor decreases. According to Eva and Lambin (1998), in African savannahs NDVI is reduced seasonally, and in the dry season it can reach values as low as those of burnt areas and exposed soil. Work by Nhongo et al. (2017) in Niassa Reserve reports low NDVI values in the dry season, ranging, on average, from 0.2 to 0.4, according to the type of vegetation cover. This means that as NDVI values decrease, there is a reduction in vegetation vigor (moisture content in combustible vegetation) and an increase in dry biomass accumulation, with consequent fire risk. One can assume that moisture is inversely proportional to the inflammability of the combustible vegetation.
Studies developed globally show that the combustible moisture condition is a critical factor that influences the danger of wildfire in ecosystems prone to fire, like African savannahs (Zarco-Tejada et al. 2003). There are records of significant correlation coefficients between vegetation index and combustible moisture content, based on the hypothesis of dependence of chlorophyll content with regard to the content of water in treetops (Dasgupta et al. 2007;Glenn et al. 2008). Results with obtention of coefficients of negative correlation of vegetation and wildfire were also reached (Leblon et al. 2007;Bisquert et al. 2011;Fan et al. 2017).
According to Fried et al. (2008), high temperatures proportionally reduce combustible moisture, making zones highly susceptible to fires. Research developed in pastures and savannahs in South Africa indicated that air temperature has a highly important positive effect on fire intensity (Yakubu et al. 2015). In the present study a negative relation was obtained between air temperature and fires in Niassa Reserve. However, it is important to mention that 60% of fires occurred in areas with medium and high temperatures, of approximately 23-26 °C (dry season). The negative relation is possibly related to the spatial dynamics of burnings within the reserve, which migrate from east (higher temperatures) to west (lower temperatures), as well as to the analysis window (May to December). If the analysis is considered in the long term, however, air temperature presents a positive relation with fire sources. A negative relation between fire and air temperature was also found by Chang et al. (2013), Guo et al. (2015) and Ye et al. (2017), again indicating the complexity of these analyses.
Several studies mention that low precipitation is typically described as a determinant factor for fire risk globally (Batista 2000; Chang et al. 2008). According to the fitted model of the present work, there is a positive relation between fire probability and precipitation. However, despite the positive relation, 90% of fires occur in areas with low precipitation, of approximately 0-50 mm. It is worth highlighting that the water year begins in October and that fires occur until December, which influences the positive relation between rainfall and fires.
Among climatological variables, relative humidity is, in isolation, one of the least important factors in fire susceptibility in Niassa Reserve. In general, high air relative humidity reduces the possibility of fires, so the positive relation obtained here seems contradictory: according to Turner et al. (1961), relative humidity below 30-40% is the optimal condition for the start and spread of wildfire, whereas air relative humidity above 60% may prevent sustained combustion of vegetation material (Ronde et al. 1990). In the present study, despite the positive relation between relative humidity and fires, 90% of fires occur under conditions where relative humidity is below 60%, meaning that fires occur under conditions favorable to vegetation combustion.
The elevation variable, on the other hand, has a significant and positive effect on fire occurrence. Higher numbers of fires are recorded between 400 and 650 m, where mountain forests and deciduous vegetation occur. According to Castro and Chuvieco (1998), elevation influences vegetation structure, combustible moisture and air humidity. Combustible distribution depends on topography. Certain species of trees, particularly species with larger structures, are located at higher altitudes. In Niassa Reserve there is a direct relation between the increase in vegetation structure (vegetal biomass) and altitude. Vegetation density and structure increase from east (low elevation) to west (high elevation), due to the increase of precipitation and reduction of air temperature (Nhongo et al. 2017). Studies developed in subtropical forests have documented an increase in tree density as altitude increases (Millar et al. 2004; Dolanc et al. 2014).
Considering that the present study was made during the dry period, where vegetation reaches low NDVI values, one can infer that the increase of dry biomass (accumulation of forest's combustible load), precisely in areas with higher elevation and larger vegetation structures, determines the higher incidence of fires in Niassa reserve. The positive relation between Elevation and wildfire was also found by Schwartz et al. (2015); Sass and Sarcletti (2017) and Zhang et al. (2016).
On the other hand, steeper areas are associated with increased wildfire risk, which can be due to the increase in the speed of fire propagation; slope also influences wind conditions, air humidity and the moisture of combustible material (Jaiswal et al. 2002). Results obtained for Niassa Reserve reveal that most wildfires occur in areas with Slope below 25%.
Finally, it is known that areas near human settlements are more prone to fires, because they are exposed to fires from vehicles and their loads and to fires started by people who pass through the area or live in it (Ferraz and Vettorazzi 1998; Jaiswal et al. 2005). However, previous studies have verified that the probability of fire occurrence in Niassa Reserve neither increased nor decreased with distance to human settlements. One of the reasons may be related to low demographic density, proximity to river courses, and mitigating measures near human settlements. Zumbrunnen et al. (2011) demonstrated the nonlinear character of the relation involving fire occurrence, population density, human settlements and roads, particularly the levelling of fire occurrence when potential anthropic ignition sources increase. Therefore, expected increases in the number of inhabitants and the associated expansion of urbanized areas and road cover may not result in more fires.
Areas with very high and high risk probability, located in the west and central west of the reserve, respectively, are mostly covered by deciduous and mountain forests (Nhongo et al. 2017). Though presenting mild temperatures due to their higher altitudes, they are also characterized by vegetation with larger structure and higher treetop density. As a result, this is the region with the most accumulation of dry biomass in the dry period when compared to other areas of the reserve, and consequently with higher fire risk.
The centre of the reserve presents medium probability of wildfire occurrence, with predominance of deciduous forests, dominated by species of trees and a layer of welldeveloped grasses on the inferior stratum; temperatures are medium to high and altitudes are intermediary. These results were expected, because in these areas there is accumulation of dry biomass, and climatological variables present magnitudes prone to wildfire occurrences.
However, at the reserve's central east and east, the probability of occurrence is low, despite the high air temperatures, low precipitation and air relative humidity. This pattern can be explained by the occurrence of semi-deciduous open vegetation, constituted mostly by grasses and sparse trees, typical of savannahs, with low biomass density. According to Scholes et al. (1997), the strongly seasonal character of water availability in these phytophysiognomies leads to accumulation of fine, dry and easily inflammable combustibles that can potentially burn every year. However, the low density of biomass influences low occurrence of fires and their nonmaintenance for a long period after the start.
These results are, therefore, in accordance with the analysis of relative importance of variables used, which showed NDVI, air temperature and elevation as determinant factors in fire occurrences.

Validation of results
The model obtained showed good predictive capacity when applied to the validation data set. ROC curve analysis, with 74% of agreement (acceptable precision) and 66.8% global classification, shows that results were good when compared to other logistic regression models developed to predict fire occurrences. For example, Bisquert et al. (2011) estimated the probability of fire occurrence in the Galicia Region (northwest of Spain) and obtained global precision of 58.2%; Padilla and Vega-García (2011) modelled the occurrence of fires caused by humans, with precision ranging from 47.4 to 82.6% for different ecoregions in Spain; Chang et al. (2013) modelled the occurrence of fires based on logistic regression in Heilongjiang province (China) and obtained global accuracy of 64.9%.
Another aspect worth mentioning is the reliability of the final map, since the two classes (high and very high) can predict 68% of the total of ignition points of validation.

Conclusions
In this work, the feasibility of modelling fire risks was illustrated, and also which biophysical and human factors are important drivers of fire in Niassa Reserve, based on logistic regression. The model applied showed good performance, proving to be statistically significant with regard to the explanation of spatial correlation between wildfires and their location.
NDVI, air temperature and elevation are the main determinant factors of fire sources, followed by precipitation, slope, relative humidity and distance from human settlements. NDVI is the most important factor, for it reflects both the moisture contained in combustible material and its seasonality, and the amount of combustible material (biomass accumulation), which influences the occurrence of fires in Niassa Reserve.
Areas in the west of the reserve, with higher altitude, larger vegetation structures and consequently more accumulation of biomass in the dry season, are more susceptible to fire occurrence, showing that vegetation, climate and topography have significant control on fires in this region.
The results obtained in the present study, in addition to providing better understanding of the spatial distribution of wildfire in the reserve, provide an important tool to guide the management of fires in Niassa Reserve, since they consider characteristics of vegetation cover.
Though several studies point to human factors as drivers of fire sources, it was observed that vegetation is one of the main factors for occurrence of fires in Niassa Reserve, due to seasonality and biomass accumulation. However, despite the strong influence of vegetation cover, climatic and topographic factors, as well as the effect of human factors should not be ignored.
One of the strategies for management of fires would be controlled burn to reduce combustibles and fire intensity, and this should be developed in areas with high susceptibility to wildfire.