Modelling spatial patterns of wildfire occurrence in South-Eastern Australia

ABSTRACT This paper describes the development and validation of spatial models for wildfire occurrence at a broad landscape scale. The hotspots databases from the Moderate Resolution Imaging Spectroradiometer (MODIS) and logistic regression models are investigated for the comprehensive understanding of environmental and socioeconomic determinants regulating the spatial distribution of wildfires over the 11-year period 2003–2013. The probability of occurrence of at least one fire on a 1 km2 grid cell in a 1,030,000 km2 region located in South-Eastern Australia is studied for the prediction of future fire occurrence. Our research shows that wildfires are most likely to occur in mountainous areas, forests, savannas and lands with high vegetation coverage, and are less likely to occur on grasslands and shrublands. Wildfires also tend to occur in areas near human infrastructures. Environmental variables are strong individual predictors of fire occurrence while socioeconomic variables contribute more to the final model. The influence of environmental and socioeconomic conditions on wildfire occurrence and the spatial patterns of wildfires identified in this study can assist fire managers in implementing appropriate management actions in South-Eastern Australia. This paper also demonstrates the potential of applying the MODIS active fire product in wildfire occurrence studies.


Introduction
Wildfire is a major environmental and ecological issue across the world. Wildfires can alter the structure of ecosystems, affect ecological processes and functions, threaten human lives and increase fire-suppression costs (Bowman et al. 2009). Australia is recognized as one of the most flammable continents in the world. Wildfires in the southeastern region of the continent can occur at any time of the year, but mostly occur during the hotter months from October to March. Wildfires in a densely settled area require rapid, appropriate policy responses because they can cause massive loss of life and property (Russell-Smith et al. 2007). Limited resources and the need for quick responses demand accurate prediction of where fires will occur. Employing fire occurrence records within empirical models is essential to quantify the characteristics of fire activities to support planning and decision-making (Andrews & Finney 2007). These models can be used to identify fire-prone areas and help forest managers target suppression efforts (Pew & Larsen 2001; Syphard et al. 2008; Romero-Calcerrada et al. 2010; Wang & Anderson 2010; Renard et al. 2012).
Fire occurrence has been studied in many countries, mainly in North America (Preisler et al. 2004; Syphard et al. 2008; Preisler et al. 2011; Gralewicz et al. 2012; Magnussen & Taylor 2012; Hawbaker et al. 2013), Europe (Catry et al. 2010; Chuvieco et al. 2010; Oliveira et al. 2012; Martínez-Fernández et al. 2013; Chuvieco et al. 2014) and Australia. Most recent fire studies in Australia have focused on regional areas such as the Australian Capital Territory (McRae 1992), the mallee woodlands and heathlands of South-Eastern Australia (Krusel et al. 1993; Gibson et al. 2015), the whole of Victoria (Dowdy & Mills 2012a, 2012b), the Sydney region (Bradstock et al. 2009; Penman et al. 2013) and south-west Western Australia (Plucinski 2014; Plucinski et al. 2014). These studies are either temporally based (e.g. Dowdy & Mills 2012a), emphasizing the influence of dynamic meteorological variables on the temporal variation of fire incidence, or spatially based (e.g. Penman et al. 2013), investigating the effects of geographic variables on the spatial pattern of fire occurrence (Plucinski 2011). None of the spatially based fire occurrence studies has been conducted at a scale that covers a complete fire-prone state in Australia.
Wildfires are caused by both lightning and human activities. The spatial distributions of both types of fires are usually regulated by top-down and bottom-up drivers across multiple scales (Parisien & Moritz 2009). At broad spatial scales dominated by top-down drivers, wildfires are selective with respect to different land cover types on account of their relationship with fuel loads and types (Cumming 2001; Mermoz et al. 2005; Moreira et al. 2009). Vegetation indexes have also proved to be useful for fire danger prediction due to their connection with fuel moisture content (Bisquert et al. 2011; Caccamo et al. 2012). At fine spatial scales dominated by bottom-up drivers, topography influences fire extent by affecting the rate and direction of fire spread (Rothermel 1983) and by creating microclimates that affect fuel moisture content and air temperature (Heyerdahl et al. 2001; Sharples 2009). Socioeconomic variables such as distance to roads are found to be significant in predicting fire (especially human-caused fire) locations at both regional (Romero-Calcerrada et al. 2008; Vilar et al. 2010) and national (Hawbaker et al. 2013) scales.
Most fire occurrence studies are based on historical data sourced from fire management agency observation records (e.g. Bradstock et al. 2009). In the past few decades, emerging space-borne sensors with short revisit times and high-accuracy measurements have made it possible to understand fire occurrence over a large scale (Pausas & Keeley 2009). In Australia, continental-scale spatial and temporal fire patterns have been statistically assessed based on Advanced Very High Resolution Radiometer (AVHRR) imagery (Craig et al. 2002; Russell-Smith et al. 2007). The Moderate Resolution Imaging Spectroradiometer (MODIS) on Terra (1999) and Aqua (2002), which has been found to be the most precise and reliable system regarding the accuracy and completeness of target detection (Justice et al. 2002), provides a better option.
Studies on fire occurrence (e.g. Renard et al. 2012; Hawbaker et al. 2013; Yulianti et al. 2013) have been undertaken using spatial and temporal wildfire occurrence information provided by active fire products (Giglio et al. 2003) generated from MODIS data. However, none of these studies has explored the relationship between MODIS-based active fire locations and their determinants in Australia.
The choice of modelling method depends on the characteristics of the dependent variable. When using a very fine spatial resolution (e.g. 1 km), a binomial response is required because only presence/absence is recorded (Taylor et al. 2013). Logistic regression has been used extensively to model the probability of fire occurrence (Chou 1992; Pew & Larsen 2001; Vasconcelos et al. 2001; Syphard et al. 2008; Padilla & Vega-García 2011; Magnussen & Taylor 2012). The advantage of logistic regression is that it can relate locally observed independent variables to each fire record (Taylor et al. 2013) and that its results are easy to interpret. Although other techniques, e.g. non-parametric methods, may be theoretically superior to logistic regression, they are prone to overfitting and lack transparency (Magnussen & Taylor 2012).
In this paper, we used 11 years (2003–2013) of MODIS active fire data in combination with environmental and socioeconomic explanatory data to build a binary logistic regression model in order to estimate the probability of at least one fire within a 1 km2 grid cell, and to generate a fire occurrence probability map over South-Eastern Australia. Our objective is to provide quantitative statistics of the environmental and socioeconomic factors contributing to the spatial distribution of wildfires at a broad landscape scale and provide practical guidance for fire management in this region.

Study area
South-Eastern Australia, as defined in this research, comprises mainland New South Wales (NSW), Victoria (VIC) and the Australian Capital Territory (ACT), covering an area of 1,030,000 km2 (figure 1). The dominant land cover types (see Section 2.2.2 and figure 1) in this region are open shrublands (39%), croplands (26%), evergreen broadleaf forests (13%) and woody savannas (10%), as calculated by the authors. The climate in this region is temperate: cold and damp in winter, hot and dry in summer. Low-frequency, high-intensity fires occur in this area due to the latitudinal gradient in summer monsoon rainfall activity (Murphy et al. 2013).

Data description
A range of data were collected and transformed for different purposes. Some of them were used for statistical analysis (see table 1) while others, e.g. CLUM land use (see Section 2.2.1 for the details), were used as filters of fire occurrence points.

Land use
The Catchment scale Land Use of Australia Map (CLUM) was used as a filter of fire occurrence points in this study. The data-set was updated in March 2014 and published by the Department of Agriculture (DA). Collected by the Australian Collaborative Land Use and Management Program (ACLUMP) state and territory partners, land use in CLUM is mapped at a detailed catchment scale (1:25,000–1:100,000). CLUM data is classified according to the Australian Land Use and Management (ALUM) Classification version 7, which makes it a single dominant land use map for a given area based on the major management objective. The classification has six primary classes of land use: (1) conservation, natural environments; (2) production from relatively natural environments; (3) production from dryland agriculture and plantations; (4) production from irrigated agriculture and plantations; (5) intensive uses; (6) water. Resampling of the 50-m resolution data to 1 km was performed using the majority algorithm to make it consistent with the resolution of MODIS hotspots.

Land cover
The MODIS 500 m Land Cover Type product (MCD12Q1) based on the classification system defined by the International Geosphere Biosphere Program (IGBP) was used for the fire occurrence points filtering and the statistical analysis. It identified 17 classes (11 natural vegetation classes, 3 developed and mosaic land classes and 3 non-vegetated land classes) at a global scale. We reclassified them into six primary classes to consider the influence of primary vegetation types on fire occurrence in the study area: (1) forests; (2) shrublands; (3) savannas; (4) grasslands; (5) permanent wetlands; (6) croplands, water bodies and others. The MCD12Q1 data in 2003 was chosen because it was the year that had the highest number of fire points in the study area. The data was also resampled from a resolution of 500 m to 1 km using the majority algorithm.
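The majority resampling applied to the categorical land use and land cover layers can be illustrated with a minimal sketch. It assumes a small in-memory grid of class codes rather than a real raster file, and the function name and the tie-breaking rule (smallest class code wins) are illustrative assumptions, since the text does not state how ties were resolved.

```python
from collections import Counter


def majority_resample(grid, factor):
    """Aggregate a categorical grid into factor x factor blocks by majority vote."""
    rows, cols = len(grid), len(grid[0])
    coarse = []
    # Incomplete blocks at the right and bottom edges are dropped.
    for r in range(0, rows - rows % factor, factor):
        coarse_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [grid[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            counts = Counter(block)
            top = max(counts.values())
            # Assumed tie-breaking rule: the smallest class code wins.
            coarse_row.append(min(k for k, v in counts.items() if v == top))
        coarse.append(coarse_row)
    return coarse


# Toy 4 x 4 grid of reclassified land cover codes (1-6), aggregated 2:1,
# analogous to going from 500 m to 1 km cells.
fine = [
    [1, 1, 2, 2],
    [1, 3, 2, 2],
    [4, 4, 6, 6],
    [4, 2, 6, 1],
]
coarse = majority_resample(fine, 2)
```

In practice a GIS package would perform this step on the full raster; the sketch only makes the aggregation rule explicit.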

Vegetation
We used Collection 5 MODIS global monthly Vegetation Index product series (MYD13A3) as an indicator of fuel load in the study area. The MYD13A3 data is provided monthly at 1-km spatial resolution as a gridded level-3 product. Two types of vegetation index were used: Normalized Difference Vegetation Index (NDVI) and a new Enhanced Vegetation Index (EVI). NDVI is the most commonly used index to assess live fuel moisture content (Hardy & Burgan 1999;Chuvieco et al. 2004;Caccamo et al. 2012), but can experience saturation under high-density vegetation conditions (Sellers 1985). EVI minimizes canopy background variations and has an improved sensitivity in high biomass regions. We tested both NDVI and EVI in January 2003.

Topography
Our topographical variables include elevation, slope, transformed aspect index (northwestness) and distance to zero residual contours. The elevation layer was resampled at 1-km resolution from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global Digital Elevation Model Version 2 (GDEM V2) 30 m data. The resampling was carried out by calculating the mean of the cell values within each 1 km2 rectangular block. Slope (in percentage) and aspect (in degrees) maps were derived from elevation data. Because the aspect is a circular variable that cannot be used in linear statistics, we cosine-transformed the aspect layer to a linear variable to obtain an index of 'northwestness' which can better distinguish xeric exposures (high index values) from mesic exposures (low index values) (Franklin et al. 2000). Another topographical variable tested in this study is the distance to zero meso-scale elevation residual contours as suggested by McRae (1992). These contours were produced by generating a macro-scale surface using an ordinary Kriging interpolation method, and subtracting it from a finer resolution elevation surface to finally leave a zero meso-scale elevation surface. Because the resolution of the explanatory variable should not be finer than 1 km in this study, the meso-scale contours we generated are coarser than those in McRae's study.
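The block-mean resampling and the cosine transform of aspect can be sketched as follows. The exact form of the transform is not given in the text; the sketch assumes the common form cos(aspect − 315°), which yields +1 for north-west-facing and −1 for south-east-facing cells. The function names are illustrative.

```python
import math


def block_mean(grid, factor):
    """Resample a continuous grid by averaging each factor x factor block."""
    rows, cols = len(grid), len(grid[0])
    return [
        [sum(grid[r + dr][c + dc]
             for dr in range(factor) for dc in range(factor)) / factor ** 2
         for c in range(0, cols - cols % factor, factor)]
        for r in range(0, rows - rows % factor, factor)
    ]


def northwestness(aspect_deg):
    """Assumed transform: cos(aspect - 315 deg), with aspect in degrees clockwise from north."""
    return math.cos(math.radians(aspect_deg - 315.0))


elevation_coarse = block_mean([[100, 300], [500, 700]], 2)
index_nw = northwestness(315.0)   # faces north-west
index_se = northwestness(135.0)   # faces south-east
```

Under this assumed form, high index values correspond to the xeric north-westerly exposures and low values to mesic south-easterly ones, matching the interpretation given above.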

Socioeconomic data
Most wildfires are of anthropogenic origin, either deliberate or accidental, which indicates the potential relationship between fire occurrence and socioeconomic factors such as wildland-urban interface (WUI), distance to roads and railways and population density. The WUI in this study was defined as the boundary between wildlands and urban areas. Wildland (see Section 2.2.6) and urban residential areas were derived from the 50-m resolution CLUM map. A raster map representing the distance to the nearest WUI was generated at a 1-km resolution. We derived primary roads, secondary roads and railways from the database of OpenStreetMap (OSM), a collaborative project to provide open, freely available geographic data of the world (Neis et al. 2011). From the OSM database, 1-km resolution distance maps were generated based on the Euclidean distance to the nearest road. We used population density data in 2003 with spatial units of Local Government Area (LGA) from Australian Bureau of Statistics (ABS).
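A Euclidean distance raster of the kind described above can be sketched with a brute-force search over a small grid. This is only an illustration: the binary feature mask, the function name and the 1-km cell size parameter are assumptions, and a real workflow would use an optimized distance transform from a GIS package.

```python
import math


def distance_raster(feature_mask, cell_size_km=1.0):
    """Distance (km) from each cell centre to the nearest feature cell centre."""
    features = [(r, c)
                for r, row in enumerate(feature_mask)
                for c, value in enumerate(row) if value]
    if not features:
        raise ValueError("feature_mask contains no feature cells")
    return [
        [cell_size_km * min(math.hypot(r - fr, c - fc) for fr, fc in features)
         for c in range(len(row))]
        for r, row in enumerate(feature_mask)
    ]


# Toy mask: 1 marks cells crossed by a road, 0 marks all other cells.
roads = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
dist_to_road = distance_raster(roads)
```

The same function could be applied to a mask of WUI boundary cells or railway cells to produce the other distance layers.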

Fire occurrence
The dependent variable, wildfire occurrence, was originally derived from Collection 5 MODIS global monthly fire location product (MCD14ML) detected using a contextual algorithm (Giglio et al. 2003). The MCD14ML data is the combined Terra and Aqua MODIS Level 2 swath 5-min MOD14/MYD14 active fire product, so it contains a precise date and time of the active fire (hotspot) when MODIS passes over. The spatial resolution of MCD14ML is 1 km. The validation of the MODIS active fire product based on the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) imagery shows a commission error between 2% and 3% globally (Morisette et al. 2005; Csiszar et al. 2006; Schroeder et al. 2008), even though high commission errors are found in low fire activity areas such as urban sites and agricultural locations (Hantson et al. 2013). Each fire activity is provided with a detection confidence level (low, medium or high); however, hotspots of all confidence levels are included in our study because the low-confidence hotspots provide additional information which outweighs the problem of slightly higher commission errors (Hantson et al. 2013).
In addition to the existence of commission error, the MODIS active fire product has other limitations. First, it does not distinguish between fire causes (lightning and human), which makes it impossible to analyse each explanatory variable in its causality context. Also, an omission error of 18% related to the size of the fire patch is found. Because the majority of human-caused fires may not be large enough to be detected by MODIS sensors, a bias towards natural-caused fires should be recognized. Moreover, the fact that prescribed burning and other non-wildfires were also recorded makes the data filter process necessary. In addition, MODIS active fire points represent fire activities being recorded at given times and locations, which means that they contain information on both fire ignition and spread. Therefore, fire occurrence in our study should be defined as the occurrence of fire activity rather than ignition.
Our study period was from January 2003 to December 2013, including all the years when both MODIS Aqua and Terra data products were available. There were missing data in 2007 from mid-August, on part of 21 April 2009, and on 22 April 2009. The number of MODIS hotspots was 176,884.
As mentioned earlier, it is inappropriate to use all the MODIS hotspots in our models because fires that occurred on non-wildland areas are also recorded. A mask representing wildland areas in the study area was generated using both CLUM land use and MODIS land cover data to filter the fire hotspots. The pixel value of the binary raster was 0 if located at land cover class (6), as well as at land use classes (3), (4), (5) and (6) except for plantation forestry, or 1 if located somewhere else. All the hotspots on the non-wildland areas were removed. This process was also able to minimize the influence of commission errors of the product. We then generated a histogram representing the monthly distribution of fire hotspots (figure 2). The monthly profiles show that the majority of fires occur from November to February when fire danger levels are the highest, although there are slight upward fluctuations in spring and autumn months due to the influence of prescribed burning programmes that are generally conducted during cooler periods (autumn and early spring in our study area). We therefore only used fire data within the typical fire season (November to February) to mitigate the influence of prescribed burning. We created a continuous 1-km spatial resolution density map by calculating the number of fire points occurring during the 11 fire seasons of 2003–2013 that fall within each cell of wildland areas. In total, there were 538,690 data points in the density map, which included 28,761 presence points and 509,929 absence points. Data analysis was conducted based on sample statistics in order to reduce the data volume. Among many sampling methods, we utilized the stratified sampling method, i.e. the ratio of presence to absence is preserved. A total of 10,000 samples (which include 596 presence and 9,404 absence points) were randomly selected. That is, 1.86% of the population is sampled.
All the non-zero values were converted to 1 so that the value of each point denotes the presence (figure 3) or absence of at least one fire incidence within each cell.
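The filtering and sampling steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names, the plantation-forestry flag, the rule that land cover class (6) is always masked out, and the toy counts are all hypothetical.

```python
import random


def is_wildland(land_cover_class, land_use_class, is_plantation_forestry=False):
    """Return 1 for wildland cells and 0 otherwise (one reading of the mask rule)."""
    if land_cover_class == 6:  # croplands, water bodies and others
        return 0
    if land_use_class in (3, 4, 5, 6) and not is_plantation_forestry:
        return 0
    return 1


def to_presence(fire_counts):
    """Convert per-cell fire counts to presence (1) or absence (0)."""
    return [1 if n > 0 else 0 for n in fire_counts]


def stratified_sample(labels, n_samples, seed=0):
    """Draw indices so that the presence share matches that of the population."""
    rng = random.Random(seed)
    presence = [i for i, y in enumerate(labels) if y == 1]
    absence = [i for i, y in enumerate(labels) if y == 0]
    n_presence = round(n_samples * len(presence) / len(labels))
    n_absence = n_samples - n_presence
    return sorted(rng.sample(presence, n_presence) + rng.sample(absence, n_absence))


# Toy population: 90 cells without fire and 10 cells with at least one fire.
fire_counts = [0] * 90 + [3, 1, 2, 5, 1, 1, 4, 2, 1, 7]
labels = to_presence(fire_counts)
sample = stratified_sample(labels, n_samples=20)
n_presence_sampled = sum(labels[i] for i in sample)
```

With a presence share of 10% in the toy population, a sample of 20 cells contains 2 presence cells and 18 absence cells.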

Modelling approach
To estimate the probability, P, of at least one fire within a cell, we developed a multiple logistic regression model. Let $P_i$ be the probability of at least one fire in cell $i$, and $x_{ij}$ be the value of the $j$th covariate in cell $i$. The logistic regression can then be defined as:
$$P_i = \frac{1}{1 + \exp\left[-\left(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_n x_{in}\right)\right]}$$
where $\beta_0$ is the intercept and $\beta_1, \dots, \beta_n$ are the coefficients of the explanatory variables $x_{i1}, \dots, x_{in}$. We fitted the model using all the fire occurrence background points. The ratio of ones to zeros is 1:24.6. We developed univariate logistic regression models for each explanatory variable to evaluate the independent influence of each variable on the fire occurrence. Following the suggestions of Serneels and Lambin (2001), we also tested the performance of quadratic or logarithmic versions of continuous variables. Our final model was chosen by implementing the Akaike Information Criterion (AIC) in a backwards stepwise algorithm (Venables & Ripley 1999).
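As a worked illustration of the logistic form and of the AIC used for model selection, the sketch below evaluates a fire probability for one cell and computes AIC = 2k − 2 ln L from a Bernoulli log-likelihood. The coefficient and covariate values are invented for the example and are not the fitted values of this study; the function names are illustrative.

```python
import math


def fire_probability(intercept, coefficients, covariates):
    """Logistic model: P = 1 / (1 + exp(-(b0 + sum_j b_j * x_j)))."""
    linear_predictor = intercept + sum(
        b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def log_likelihood(probabilities, labels):
    """Bernoulli log-likelihood of observed presence/absence labels."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for p, y in zip(probabilities, labels))


def aic(log_lik, n_parameters):
    """Akaike Information Criterion; lower values indicate a preferred model."""
    return 2 * n_parameters - 2 * log_lik


# Hypothetical cell: the linear predictor is -3.0 + 2.0 * 1.5 - 0.5 * 0.0 = 0.
p = fire_probability(-3.0, [2.0, -0.5], [1.5, 0.0])
null_model_aic = aic(log_likelihood([0.5, 0.5], [1, 0]), n_parameters=1)
```

In a backwards stepwise search, the AIC of the current model would be compared with the AIC of each model obtained by dropping one variable, and the drop giving the lowest AIC retained until no drop improves it.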
To avoid the effects of multicollinearity, we used Spearman's rank correlation to compare the correlations among continuous explanatory variables (table 2). A correlation above 0.6 (Wintle et al. 2005) among the explanatory variables was found between NDVI and slope (r = 0.65, P < 0.001), NDVI and population density (r = 0.81, P < 0.001), NDVI and EVI (r = 0.97, P < 0.001), EVI and population density (r = 0.84, P < 0.001), slope and elevation (r = 0.67, P < 0.001), and distance to WUI and population density (r = 0.62, P < 0.001). Therefore, we excluded EVI, slope and population density from further analysis. We also implemented a diagnostic procedure using the variance inflation factor (VIF), a measure that shows how much the variance of a coefficient is increased due to collinearity (Belsley et al. 1980). Instead of comparing the correlation between pairs of variables, VIF calculates the linear relationship between one variable and all the other variables. The VIFs ranged from 1.01 to 8.29, indicating no evidence of multicollinearity under the criterion recommended by Belsley et al. (1980).
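The two collinearity diagnostics can be made concrete with a short sketch. Spearman's coefficient is computed here as the Pearson correlation of average ranks, and the VIF as 1 / (1 − R²), where R² comes from regressing one explanatory variable on all the others. The function names and toy values are illustrative; the study itself used R for these computations.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def vif(r_squared):
    """Variance inflation factor from the R^2 of one predictor on the others."""
    return 1.0 / (1.0 - r_squared)


rho = spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])  # monotonic, so rho is 1
```

By this formula, the largest VIF reported above (8.29) corresponds to an R² of roughly 0.88 for that variable against the others.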
We computed the receiver operating characteristic (ROC) curve of our models to determine the optimal discrimination threshold for predicting fire occurrence. By plotting the true positive rate (sensitivity) against the false positive rate (1 − specificity), we were able to evaluate the accuracy of the model. The area under the curve (AUC) of the ROC curve is used to measure the model fit, with a measure of 0.9–1 excellent, 0.8–0.9 good, 0.7–0.8 fair, 0.6–0.7 poor and 0.5–0.6 fail (Swets 1988). To validate the performance of the final model, we used a 10-fold cross-validation approach (Breiman et al. 1984). The procedure was to randomly partition the original samples into 10 groups of approximately equal size, use a single group as the validation data and the remaining 9 groups as training data. The process was repeated 10 times and each time the AUC of the validation data was calculated. A statistical summary of the 10 AUCs was calculated to measure the uncertainty related to the location of the presence or absence. The mean of the AUCs is the cross-validated AUC estimate.
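The AUC and the 10-fold partition can be sketched in a few lines. The AUC is computed here through its rank interpretation, the probability that a randomly chosen presence cell receives a higher predicted score than a randomly chosen absence cell (ties count as one half), which equals the area under the ROC curve. The function names, the seed and the toy scores are assumptions for illustration; the study used the pROC and cvAUC packages in R.

```python
import random
import statistics


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both presence and absence points are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_samples, k=10, seed=0):
    """Randomly partition sample indices into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def summarize(fold_aucs):
    """Mean (the cross-validated AUC estimate) and sample standard deviation."""
    return statistics.mean(fold_aucs), statistics.stdev(fold_aucs)


toy_auc = auc([0.8, 0.4, 0.6, 0.2], [1, 1, 0, 0])  # 3 of 4 pairs ranked correctly
folds = k_fold_indices(10000, k=10)
```

Each fold would serve once as validation data while a model is refitted on the other nine, and the ten validation AUCs would then be passed to `summarize`.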
The contribution of each variable was calculated by conducting a jackknife procedure based on the change in AUC (Bar Massada et al. 2013), which was the 10-fold cross-validated AUC in this study. The approach consists of removing explanatory variables from the full model one at a time, and calculating the cross-validated AUC. The difference of AUC values between the full-model and the model without the variable denotes the loss of explanatory power of the model in absence of a given factor. In addition, the AUC values of univariate models were computed, and the variables were ranked accordingly.
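The jackknife procedure can be sketched generically as a leave-one-variable-out loop. The `cv_auc` argument stands for any function that refits the model on a given variable set and returns its cross-validated AUC; the toy scoring function and the variable names below are purely hypothetical placeholders and carry no information about the results of this study.

```python
def jackknife_importance(variables, cv_auc):
    """AUC lost when each variable is removed in turn from the full model."""
    full_model_auc = cv_auc(variables)
    return {
        removed: full_model_auc - cv_auc([v for v in variables if v != removed])
        for removed in variables
    }


# Hypothetical stand-in for model refitting: each toy variable adds a fixed
# amount of AUC above the 0.5 baseline of a random classifier.
toy_gain = {"var_a": 0.20, "var_b": 0.10, "var_c": 0.05}


def toy_cv_auc(variables):
    return 0.5 + sum(toy_gain[v] for v in variables)


importance = jackknife_importance(list(toy_gain), toy_cv_auc)
ranking = sorted(importance, key=importance.get, reverse=True)
```

In a real model the contributions are not additive, which is why a variable that is strong on its own can show a small loss when removed: correlated variables remaining in the model carry much of the same information.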
All statistical analyses in this study were conducted using R version 3.1.1 (R Development Core Team 2014). Logistic regression was fitted using a generalized linear model (GLM). VIFs, ROCs and cross-validated AUCs were computed using the R packages fmsb (Nakazawa 2014), pROC (Robin et al. 2011) and cvAUC (LeDell et al. 2014), respectively.

Results
According to the univariate logistic regression modelling results (table 3), all the explanatory variables are statistically significant (P ≤ 0.05), except for northwestness (P = 0.37). In the group of MODIS land cover, wildfires are most likely to occur on forests and savannas, and least likely to occur on shrublands and grasslands. The permanent wetlands class is not a significant predictor (P = 0.95) of fire occurrence due to its small sample size (four observations). NDVI also shows the expected positive relationship to the response variable. The occurrence of fire is positively related to elevation, and is negatively related to distance to zero meso-scale elevation residual contours. Fire occurrence is negatively related to all the socioeconomic variables (distance to primary road, distance to secondary road, distance to railway and distance to WUI), which suggests that fires are more likely to occur close to human facilities and urban areas. The final model for fire occurrence includes four environmental variables (land cover, NDVI, elevation, northwestness) and three socioeconomic variables (distance to primary road, distance to secondary road and distance to WUI). The AUC values corresponding to the 10-fold procedure range from 0.858 to 0.906, with a standard deviation of 0.014 (table 4). This means that, although the AUC value can be affected by the location of the points, the variability of the AUC value is acceptable. The cross-validated AUC estimate is 0.886, which indicates that our model performs well in predicting fire occurrence.
According to the AUC values of the univariate models (figure 4), NDVI shows the strongest predictive power, followed by elevation and land cover. The jackknife estimate of variable importance (figure 4) shows slightly different results. Elevation contributes the most in wildfire occurrence prediction, which coincides with its performance in the univariate model. Distance to WUI and land cover rank the second and the third, respectively. The remaining variables are ordered as follows: distance to primary road, NDVI, distance to secondary road and northwestness.
A 1-km resolution fire occurrence probability map was generated by applying the coefficients of the final model to raster layers corresponding to the explanatory variables (figure 5(a)). Relatively high wildfire probabilities (greater than 0.2) were found in forestry, high-elevation areas close to urban areas, while low probabilities (less than 0.2) existed inland away from the coast and mountainous areas. The largest area of high wildfire probability existed in the Great Dividing Range along the coastline of the study area, with the most flammable areas located in the forestry areas of the Australian Alps extending across eastern VIC, southeastern NSW and the ACT, followed by the wildland-urban interface areas at New England Range in northeastern NSW and Blue Mountains above the Sydney Basin. Natural conservation regions such as Mount Kaputar National Park in northeastern NSW and Grampians National Park in western VIC are also more likely to burn. In addition, 10 prediction maps corresponding to the 10-fold procedure were generated and converted into a standard deviation map (figure 5(b)), which provides insights into the spatial distribution of the uncertainty of the final model. The variability was distributed primarily in high fire probability areas, e.g. the Australian Alps.
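The conversion of the fold-wise prediction maps into a standard deviation map can be sketched as a cell-by-cell calculation. The sketch assumes that each map is a small grid of probabilities with identical dimensions and uses the population standard deviation; the text does not state whether the population or sample form was used, and the function name and toy values are illustrative.

```python
import statistics


def cellwise_std(prediction_maps):
    """Per-cell standard deviation across a list of equally sized probability maps."""
    n_rows = len(prediction_maps[0])
    n_cols = len(prediction_maps[0][0])
    return [
        [statistics.pstdev([m[r][c] for m in prediction_maps])
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Two toy 1 x 2 prediction maps standing in for the maps from different folds.
fold_maps = [
    [[0.1, 0.5]],
    [[0.3, 0.5]],
]
uncertainty = cellwise_std(fold_maps)
```

Cells where the fold models disagree receive high values, while cells with identical predictions across folds receive zero.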

Discussion
Our model is able to describe the spatial pattern of wildfire occurrence in South-Eastern Australia over the 11-year period from 2003 to 2013. Wildfire locations in our study area are found to be significantly influenced by land cover types. Forests are most susceptible to fire due to the dominance of the fire-prone eucalyptus-related vegetation and heavy fuel loads. Savannas rank second probably due to their inherent features relating to ease of ignition (Murphy et al. 2013). Shrublands are least susceptible to burning compared to other land cover types, due to the lower predominance of grass components in these areas (Murphy et al. 2013). Our result is slightly inconsistent with the findings in other landscapes (e.g. Mermoz et al. 2005;Oliveira et al. 2014), possibly because of the low-level shrub canopy cover (<60%) in most shrub areas of South-Eastern Australia.
As expected, fires are more likely to occur in areas with high values of vegetation index on account of its strong relationship with fuel flammability (Caccamo et al. 2012). Fires are also more likely to occur in areas with higher elevation. This can be explained by the fact that in our study area, the spatial distribution of the vegetation corresponds to that of elevation, which means there is more vegetation to be burnt at higher elevation. When it comes to the influence of aspect on fire occurrence, some researchers in the northern hemisphere have found that a southerly aspect (or a northerly aspect in the southern hemisphere) is more flammable because south-facing slopes receive longer and more direct solar exposure, which decreases the fuel moisture content and enhances flammability (Mouillot et al. 2003; Mermoz et al. 2005), while others found that northern slopes are more fire-prone because more water is available, which results in heavier fuel loads (Carmo et al. 2011; Oliveira et al. 2014). However, in this study northwesterly aspect is not able to predict fire occurrence. This may be because the coarse spatial resolution fails to provide sufficient or correct information on solar exposure and fuel load. Fires tend to be distributed in areas near the zero meso-scale elevation residual contour, which is generated by removing micro- and macro-scale variation of elevation, leaving only the meso-scale residual and generating contours accordingly. This finding is consistent with McRae's (1992) finding about natural ignitions in the ACT area. This non-obvious pattern can provide practical information for fire risk mapping. The finding that fires are located in areas near human infrastructure (roads and railways) and the WUI is consistent with the results of other studies at small landscape scales (e.g. Penman et al. 2013).
Most environmental factors (NDVI, elevation and land cover) are informative when analysed independently, which is expected because their spatial patterns correspond with the distribution of wildfires at such a broad spatial scale. On the other hand, some of them (e.g. land cover) have low variable contributions because much of the information they provide is included in a relatively more influential variable (e.g. elevation).
Socioeconomic variables do not exhibit good predictive power in the univariate analysis. There are likely several reasons for this. First, the MODIS active fire product does not distinguish between ignition sources, which may obscure the contribution of the socioeconomic variables. Second, the fact that not all human-caused fire occurrences are retrieved also indicates that the independent predictive powers of socioeconomic factors are possibly being underestimated in this study. Third, our fire occurrence points contain information for both ignition and spread. Although fire ignitions in Australia have been shown to be strongly influenced by human activities (Willis 2005), fire spread is fundamentally a function of fuel, climate and terrain (Pyne et al. 1996). However, the contribution of variables to the final model tells a slightly different story. Distance to WUI contributes more predictive power than NDVI and land cover, which supports the influence of human activities on fire occurrence. Therefore, socioeconomic variables should not be ignored in fire risk assessments at broad landscape scales. Moreover, the association between human activity and fire occurrence indicates the threat of wildfire to human lives and assets. Reducing the fuel load near densely settled areas close to fire-prone bushlands is therefore an essential issue from the perspective of wildfire management.
The fire probability map produced from the final model illustrates the most fire-prone locations in South-Eastern Australia. The distribution of relatively high uncertainty in high fire occurrence areas and the limited predictive capacity of logistic regression in these areas (Rodrigues & de la Riva 2014) mean that fire probability has possibly been underestimated. Nevertheless, the prediction map still provides useful information on areas where environmental and socioeconomic conditions meet the requirement for enhanced likelihood of fire occurrence. According to this map, firefighting resources and fire prevention activities should be allocated close to mountainous areas, forests and savannas, as well as lands with heavy fuel loads. Areas close to WUI and transport networks should also be emphasized.
We use MODIS data rather than historical records in this study because the former is globally accessible, which makes it possible to conduct a study at a broad spatial scale, apply the method to other regions of the world, assess the suitability of the model and explore the variation of spatial patterns in different study areas. This is especially true in data-poor regions. However, researchers should bear in mind the inherent drawbacks of the MODIS active fire product such as the existence of commission error (which can be minimized by introducing controlling factors), the indistinguishability of ignition sources and the bias towards natural-caused fires. Another defect of our model is that the explanatory variables only represent the environmental and socioeconomic conditions of the study area for a short period of time even though the active fire data covers 11 years. To reduce the impact of the temporal mismatch, we chose data recorded at the most appropriate time (e.g. January 2003). Nevertheless, the overall goodness of fit of our final model is satisfactory. Further improvement of the model would be obtained by the introduction of precise occurrence data that has lower omission error or can identify ignition types. Adding more explanatory variables, especially climate variables, would also be helpful.

Conclusion
In this paper, we used logistic regression in combination with land cover, vegetation index, topographic and socioeconomic information to characterize the spatial pattern of the occurrence of at least one fire in a 1 km2 grid cell in South-Eastern Australia over the period 2003–2013. Our models and the final map suggest that mountainous areas, forests, savannas and lands with high vegetation coverage would be most fire-prone, while grasslands and shrublands can be less vulnerable to wildfire in the study area. Wildfires also tend to occur in areas near human infrastructures and the WUI. Environmental variables are powerful in predicting fire occurrence when analysed individually while socioeconomic variables contribute more to the final model. The results extend knowledge about the influence of environmental and socioeconomic conditions on wildfire occurrence and the spatial pattern of wildfire in mainland NSW, VIC and the ACT, which can help fire agencies in these three regions better arrange their limited resources and target management activities. Our study also demonstrates that the MODIS active fire product is a useful data source to study environmental and socioeconomic controls on the distribution of wildfire, although attention should be paid to the data manipulation procedure and the interpretation of the modelling result.