Modeling habitat suitability of Dipterocarpus alatus (Dipterocarpaceae) using MaxEnt along the Chao Phraya River in Central Thailand

Abstract
Dipterocarpus alatus plays a dominant role in the ecology and economics of riparian forests in Thailand. Using MaxEnt modeling, we identified potentially suitable regions for D. alatus along the Chao Phraya River in central Thailand. The modeling procedure used 465 occurrence records and 19 Worldclim environmental factors as well as aspect, slope and elevation data. The results indicated that precipitation is the key influential factor affecting the distribution of D. alatus. Highly suitable regions for this species included Nakhon Sawan, Uthai Thani, Lopburi, Phra Nakhon Si Ayudhya, and Ang Thong Provinces along the Chao Phraya River. The statistically significant area under the receiver operating characteristic curve value (0.904) indicated that MaxEnt can be used to accurately predict suitable regions for growing commercially valuable plants such as D. alatus. These model results can facilitate habitat conservation and sustainable resource utilization of rare and important plants.


Introduction
Ecological models that predict suitable cultivation regions have become a valuable tool for assessing habitat suitability and resource conservation to protect important plant species. Dipterocarpus alatus (family Dipterocarpaceae) is an important timber tree that plays a dominant role in the ecology and economics of riparian forests in Thailand (Orwa et al. 2009; Asanok et al. 2017). Its wood is used to make plywood for construction, and its resin is used for illumination, waterproofing baskets and boats, and making paint, varnish, and lacquer. Dipterocarpus alatus is believed to be restricted to central and southern Vietnam, Cambodia, Laos, Myanmar, the Philippines, Thailand, and India (Nghia 2005). It is widely distributed in evergreen and dry deciduous forests on ancient alluvial, granite, and basalt rocks with low relief and gentle slopes, in areas where water levels rise and fall rapidly during both the dry and rainy seasons. Optimal conditions for the species include humidity of 75-85%, precipitation of 1500-2200 mm, mean annual temperatures of 25-27 °C, and a dry season lasting 4-6 months (Tam et al. 2014; The Forest Herbarium 2017).
The Chao Phraya River is located in an urbanized area in central Thailand, crossing 11 cities along its length of 372 km. The river lies at the center of Thai economic and social development. Dipterocarpus alatus is the dominant species within the riparian area along the Chao Phraya River (Asanok et al. 2017). However, disturbances associated with both urbanization and agriculture frequently degrade the riparian forest, particularly D. alatus populations, and have been attributed to local and regional changes in land use (Santos et al. 2016; Asanok et al. 2017). Ecological modeling can be used to identify the habitat distribution of D. alatus and the environmental factors that affect species distribution and enhance levels of active ingredients in the trees. Ultimately, we aim to use the data to better conserve and manage riparian forests along the Chao Phraya River.
Techniques based on statistical modeling and geographic information systems (GIS) have become more widely used in ecology and conservation biology (Guisan and Zimmermann 2000; Warren et al. 2008; Brito et al. 2009). Associations among species occurrences and the environmental features of particular habitats are assessed using species distribution models (SDMs; Franklin 2009). Species records can be obtained from field surveys as well as herbarium and museum databases to develop SDMs. Maximum entropy (MaxEnt) is a relatively standard model for precisely predicting species distributions (Phillips et al. 2006; Sunil and Thomas 2009); it has been widely used in studies on suitable areas for endangered species, the suitability of the climatic environment for specific species, and priority assessments for species conservation (Lu et al. 2012; Zheng et al. 2016; Koch et al. 2017). MaxEnt works on the principle of estimating the probability distribution for a variable (e.g. the spatial distribution of a species) that is the most spread out while remaining subject to constraints such as known observations of the target species. MaxEnt uses entropy to generalize specific observations of species presence and does not require or even incorporate points where the species is absent within the theoretical framework. Points where the species is present are obtained using global positioning system (GPS) locations of target species, whereas points where the species is absent are not usually recorded. Several salient features of MaxEnt models are that they only require presence data, they can use both continuous and categorical data, and they employ efficient deterministic algorithms. Furthermore, the MaxEnt output is continuous and generative, and the models work well with limited data.
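The maximum-entropy principle described above can be illustrated with a minimal Python sketch. It is not the algorithm implemented in the MaxEnt software (which adds regularization and several feature classes); it only shows the core idea of finding the most spread-out distribution over grid cells whose expected feature values match the means observed at presence points. The landscape, the two features, and the function name `fit_maxent` are all hypothetical.

```python
import numpy as np

def fit_maxent(z, presence_idx, lr=0.5, n_iter=2000):
    """Fit p(cell) proportional to exp(z[cell] @ w) so that the expected
    feature values under p match their means at the presence cells."""
    target = z[presence_idx].mean(axis=0)   # empirical constraint values
    w = np.zeros(z.shape[1])
    for _ in range(n_iter):
        logits = z @ w
        p = np.exp(logits - logits.max())   # subtract max for numerical stability
        p /= p.sum()
        w += lr * (target - p @ z)          # gradient ascent on presence log-likelihood
    return w, p

# Hypothetical landscape: 500 cells, 2 standardized environmental features.
rng = np.random.default_rng(0)
raw = rng.normal(size=(500, 2))
z = (raw - raw.mean(axis=0)) / raw.std(axis=0)
# Assumption for illustration: the species occurs where feature 0 is high.
presence_idx = np.flatnonzero(z[:, 0] > 0.5)

w, p = fit_maxent(z, presence_idx)
gap = np.abs(p @ z - z[presence_idx].mean(axis=0)).max()
print("weights:", w)
print("largest constraint gap:", gap)
```

Among all distributions that satisfy the same mean constraints, the exponential form fitted here is the one with the highest entropy, which is why no absence points are needed.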
Despite advancements in plantation management, D. alatus is considered a threatened species with a reduced and degraded natural distribution. Hence, we chose D. alatus as a representative species to establish new approaches for the conservation and sustainable use of important plant species along riparian areas of the Chao Phraya River. Our objectives were to (1) characterize the habitat and niche of D. alatus based on field-based studies, (2) identify landscape-scale environmental correlates through niche-based models, and (3) identify suitable areas for D. alatus conservation through model projections.

Study area and species occurrence data
The study area was located in the riparian zone of the Chao Phraya River, beginning at the Pak Nam Pho sector of Nakhon Sawan Province and ending at the Pak Nam sector of Samut Prakan Province. The 372-km long Chao Phraya River is the main river in central Thailand, passing through 11 cities and provinces (Figure 1). The upper part of the river flows through the cities of Nakhon Sawan, Uthai Thani, Chai Nat, Sing Buri, Lopburi, and Ang Thong, and the lower part continues through the cities of Ayutthaya (former capital of Thailand), Pathum Thani, Nonthaburi, Bangkok (current capital), and Samut Prakan. The origin of the river (Nakhon Sawan) is quite far from the ocean, whereas the mouth (Samut Prakan Province) forms an estuary along the coast of the Gulf of Thailand. We focused only on the floodplain within the riparian area between the highest flood level of the river in 2011, which fell within an elevation of 0-30 m above mean sea level (Hydro and Agro Informatics Institute 2012; Gale and Saunders 2014), and the water's edge. The resulting study area had a flat topography and encompassed an area of 14,252 km² (Figure 1).
Sites containing D. alatus were identified using random field surveys at different sites along the riparian zone of the Chao Phraya River. Plant samples were collected, and herbarium specimens were prepared and submitted to the Forest Herbarium at the Department of National Parks, Wildlife and Plant Conservation, Bangkok for identification and validation. A total of 467 sites with D. alatus populations were identified (Figure 1), and their geographical coordinates were recorded using a Garmin GPS 76 handset (Jaryan et al. 2013).

Modeling procedure
The GPS coordinates of the 467 sites were recorded in the "CSV" file format. The CSV file was used as the input file for MaxEnt. In addition to species occurrence data, environmental data were also used as model inputs. Data for 19 bioclimatic parameters were downloaded from the Worldclim data portal version 2.0 (www.worldclim.org) for our area of interest, i.e. the Chao Phraya River riparian zone. These bioclimatic variables represent annual trends and seasonality in factors such as mean temperature, precipitation, and annual temperature range, as well as limiting factors such as the mean temperatures and precipitation rates of the coldest or hottest month (Table 1). Bioclimatic data were freely available and had a resolution of 30 arc-seconds. These data were downloaded and used in the model (Hijmans et al. 2005). Downloaded data were in the "GRID" format. Data were then converted to "ASCII" format using ArcGIS software (ver. 10.6; ESRI, Redlands, CA, USA; Scheldeman and Zonneveld 2010) to generate data compatible with MaxEnt. Before running the model, autocorrelations among the predictor variables were checked as a recognized source of error (Dormann et al. 2007) using R software (ver. 3.4.1; R Core Development Team). Variables with correlation coefficients >0.90 (Table 2), species ecological parameters, significant parameters for extreme environmental conditions, and habitat aspect, slope and elevation were selected as input variables for the model. We used 11 variables in total. Aspect, slope and elevation data were obtained from the Digital Elevation Model (DEM) database in the Shuttle Radar Topography Mission (SRTM) website (http://srtm.usgs.gov/index.php).
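The correlation check among predictors can be sketched as follows. The study performed this step in R; the Python version below is only an illustration of flagging predictor pairs whose absolute Pearson correlation exceeds 0.90. The function name `correlated_pairs` and the synthetic values standing in for extracted raster data are assumptions, not part of the original workflow.

```python
import numpy as np

def correlated_pairs(values, names, threshold=0.90):
    """Return (name_a, name_b, r) for predictor pairs with |Pearson r| > threshold.

    values: 2-D array with one row per sampled cell and one column per predictor.
    """
    r = np.corrcoef(values, rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) > threshold:
                pairs.append((names[i], names[j], float(r[i, j])))
    return pairs

# Synthetic stand-in data (hypothetical values, not from the study).
rng = np.random.default_rng(1)
n = 300
bio12 = rng.normal(1200, 150, n)            # annual precipitation, mm
bio13 = 0.2 * bio12 + rng.normal(0, 3, n)   # wettest month, built to track bio12
elev = rng.normal(15, 5, n)                 # elevation, m, independent of the others

names = ["bio12", "bio13", "elev"]
pairs = correlated_pairs(np.column_stack([bio12, bio13, elev]), names)
for a, b, r in pairs:
    print(f"{a} vs {b}: r = {r:.3f}")
```

A list of flagged pairs like this one supports a manual decision, informed by the species' ecology, about which variables to carry into the model.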
Table 1 (partial). Bioclimatic variables. Asterisks indicate variables used as model inputs.
Variable   Input   Description
…                  Mean temperature of wettest quarter
bio9               Mean temperature of driest quarter
bio10      *       Mean temperature of warmest quarter
bio11      *       Mean temperature of coldest quarter
bio12      *       Annual precipitation
bio13      *       Precipitation of wettest month
bio14              Precipitation of driest month
bio15              Precipitation seasonality (coefficient of variation)
bio16      *       Precipitation of wettest quarter
bio17              Precipitation of driest quarter
bio18              Precipitation of warmest quarter
bio19      *       Precipitation of coldest quarter

MaxEnt software (ver. 3.4.1) was downloaded from http://biodiversityinformatics.amnh.org/opensource/maxent. MaxEnt scores habitat suitability for a species on a scale of 0 (area of lowest suitability) to 1 (area of maximum suitability). To understand the spatial distribution of D. alatus along the Chao Phraya River, the ESRI shape-file of the administrative boundary map along the river was overlaid onto the grid file containing bioclimatic variables in ArcGIS 10.6. This grid file was then overlaid on SRTM-DEM data to generate information on the altitudinal range of D. alatus in the area. SRTM-DEM data were downloaded at a resolution of 30 m and then resampled at 1 km. MaxEnt also generates response curves for each predictor variable and uses the jackknife method for highlighting the relative influence of each variable (Fielding and Bell 1997; Khanum et al. 2013; Swanti et al. 2018). In this study, we evaluated MaxEnt model performance according to the omission/commission rate, a threshold-dependent binomial test based on omissions and predicted area (Phillips and Dudík 2008). Four arbitrary habitat suitability categories for D. alatus based on predicted habitat suitability (IPCC, 2007) were defined as follows: extremely low suitability (0-0.2), low suitability (0.2-0.4), moderate suitability (0.4-0.6), and high suitability (0.6-1).
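The four suitability categories can be applied to a continuous suitability raster as in the following Python sketch. Because the stated ranges share their endpoints, the sketch assumes that a boundary value belongs to the higher class (for example, 0.2 is treated as "low"); that convention and the names used are illustrative assumptions.

```python
import numpy as np

CLASS_EDGES = [0.2, 0.4, 0.6]
CLASS_LABELS = ["extremely low", "low", "moderate", "high"]

def classify_suitability(scores):
    """Map suitability scores in [0, 1] to class codes 0-3.

    np.digitize with the default right=False assigns a score equal to an
    edge to the higher class, e.g. 0.2 -> 1 ("low").
    """
    return np.digitize(np.asarray(scores, dtype=float), CLASS_EDGES)

scores = [0.05, 0.2, 0.45, 0.6, 1.0]   # hypothetical cell values
codes = classify_suitability(scores)
for s, c in zip(scores, codes):
    print(s, CLASS_LABELS[c])
```

The same call works unchanged on a 2-D array read from a raster, returning an array of class codes with the same shape.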

Results and discussion
Habitat and niche of D. alatus
The performance of an ecological model is typically judged using tests and validation. However, users do not need to prove whether model results accurately depict reality. The omission rate is the fraction of the test localities that fall into pixels not predicted as suitable for the species, whereas the predicted area is the fraction of all pixels that are predicted as suitable for the species (Phillips et al. 2006). In Figure 2, the red line indicates the mean area, the black line indicates the predicted omission rate, and the light blue line indicates omission rates of the model training samples. The omission rate is calculated using both the presence records used for training and the test records (Anderson et al. 2003). The threshold-independent receiver operating characteristic (ROC) curve was also analyzed. ROC performance is represented by the area under the curve (AUC; Figure 3). The ROC curve plots sensitivity (the true-positive fraction, i.e. the absence of omission error) against the proportion of incorrectly predicted observed absences (1 - specificity; the false-positive fraction, i.e. commission error). Because only presence data are available, specificity is defined using the predicted area rather than true commission. An AUC value of 0.50 indicates that the model performs no better than random and is a poor predictor, whereas a value of 1 indicates optimum model accuracy (Swets 1988). Model results should be rigorously evaluated, as a species' ecological niche covers a broader area than the geographical range of the species and not all suitable areas are inhabited. Thus, using the maximum amount of information available for species distribution and the variables directly linked to species distribution is recommended. In this context, sites were surveyed for the presence of D. alatus and for predicted presence, to ground-truth the model. In the model, the lines of omission from the training data were close to predicted omission rates (Figure 2). In addition, the AUC value for the training data was close to 1 (0.904), indicating that the model performed better than random, thus validating the accuracy of the model (Figure 3).
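The evaluation quantities defined above can be written out in a few lines of Python. This is a minimal sketch of the definitions, not of MaxEnt's internal implementation: the AUC is computed here as the probability that a randomly chosen presence cell receives a higher score than a randomly chosen background cell, which is the presence-only analogue in which background cells stand in for absences. All function names and scores are hypothetical.

```python
import numpy as np

def omission_rate(test_scores, threshold):
    """Fraction of test presences falling in cells predicted unsuitable."""
    return float(np.mean(np.asarray(test_scores) < threshold))

def predicted_area(all_scores, threshold):
    """Fraction of all cells predicted suitable at the threshold."""
    return float(np.mean(np.asarray(all_scores) >= threshold))

def auc(presence_scores, background_scores):
    """Rank-based AUC: P(presence score > background score), ties count 0.5."""
    p = np.asarray(presence_scores, dtype=float)[:, None]
    b = np.asarray(background_scores, dtype=float)[None, :]
    return float(np.mean((p > b) + 0.5 * (p == b)))

presence = [0.9, 0.8, 0.3]           # hypothetical scores at presence sites
background = [0.1, 0.2, 0.4, 0.85]   # hypothetical scores at background cells

print(omission_rate(presence, 0.5))
print(predicted_area(background, 0.5))
print(auc(presence, background))
```

A low omission rate combined with a small predicted area is what distinguishes an informative model from one that simply predicts suitability everywhere.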

Correlations of environmental factors
Climatic patterns establish the broad limits of the distribution of plant taxa at the regional-to-global level (Shimwell et al. 1982; Woodward 1987; Prentice 1992; Taylor and Hamilton 1994). The jackknifing plots comparing climatic variables in the model are depicted in Figure 4. The environmental variable that demonstrated the highest gain and contributed 35% to the model when used in isolation was elevation (elev). The environmental variable that produced the largest decrease in gain when omitted was bio11 (mean temperature of the coldest quarter), and aspect exhibited the lowest gain. These results highlight that precipitation plays a key role in the distribution and spread of D. alatus. During our field surveys, we observed that the trees grow along the river and into the forests. Furthermore, the model indicated that D. alatus prefers adequate rainfall and high elevation, which is consistent with results from previous studies (Nghia 2005).

Identification of suitable areas for D. alatus
MaxEnt produces a continuous raster with values ranging from 0 to 1, representing relative habitat suitability. There is no set rule to establish thresholds; the choice instead depends on the data used or the mapping objective, and therefore varies among species. From our MaxEnt analysis, we obtained threshold values based on a variety of statistical measures; these values were saved in a file called "maxentResults.csv". Some of the most commonly used thresholds are a minimum training presence logistic threshold, 10th percentile training presence logistic threshold, and equal training sensitivity and specificity logistic threshold (Phillips et al. 2006). In this study, 10th percentile training presence logistic thresholds were applied. To set the 10th percentile threshold, the maxentResults.csv file was examined, and the column titled "10th percentile training presence logistic threshold" was selected. The value in the last row of this column, which represents the average of all runs performed along with averaged model results, was used to reclassify the averaged model results to match the selected threshold using ArcGIS. The resulting final map had four classifications (Figure 5). Green areas in Figure 5 depict sites with the highest probability of D. alatus presence, whereas those in red represent areas with the lowest probability. Areas of moderate probability are shown in yellow. Of the total area of 53,483 km², 5.84% (704.27 km²) was highly suitable, 14.59% (1757.37 km²) was suitable, 24.83% (2991.10 km²) was moderately suitable, and 54.72% (6592.02 km²) was poorly suitable for D. alatus. The model output (Figure 5) obtained in this study revealed that the provinces of Nakhon Sawan, Uthai Thani, Lopburi, Phra Nakhon Si Ayudhya, and Ang Thong were most likely to host D. alatus, whereas D. alatus was likely to occur with moderate probability in Chainat, Singburi, and Saraburi Provinces.
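The idea behind the 10th percentile training presence threshold can be sketched in Python as follows. The study read this value directly from maxentResults.csv; the sketch instead recomputes an approximation from the suitability scores at training presence sites, using NumPy's default linear interpolation, which may differ slightly from the value MaxEnt reports. The scores and function names are hypothetical.

```python
import numpy as np

def tenth_percentile_threshold(training_presence_scores):
    """Score below which roughly 10% of training presences fall."""
    return float(np.percentile(training_presence_scores, 10))

def binary_map(suitability, threshold):
    """True where a cell is predicted suitable at the given threshold."""
    return np.asarray(suitability, dtype=float) >= threshold

train_scores = np.linspace(0.1, 1.0, 10)   # hypothetical scores at presence sites
threshold = tenth_percentile_threshold(train_scores)
print(round(threshold, 3))
print(binary_map([0.05, 0.5], threshold))
```

Applying the threshold accepts that about one training record in ten may fall outside the predicted suitable area, which makes the resulting map less sensitive to outlying or mislocated records than the minimum training presence threshold.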

Implications for management
Geographical data can be used to analyze species distributions, habitat requirements, and disturbance risks. In addition, the conservation status of a species can be determined by synthesizing information for each of its known populations, particularly with regard to changes in historical range and vulnerability status within its characteristic habitat. Maps based on intrinsic features of the land, such as natural regions and physiographic areas, are the best way of presenting tree distribution data. Species and habitat relationship modeling using precise locality data on microclimate, topography and soil in association with site-specific location data of target taxa can help elucidate the interrelationships and controls of biotic and abiotic factors on species distribution patterns. Our ecological model identifies potential habitats along the Chao Phraya River for the reforestation and conservation of D. alatus, which can help reverse the decline of its natural populations. From a socio-ecological perspective, most youth are unaware of the importance of D. alatus, and reforestation projects focused on this species are now rarely conducted. Thus, increased awareness is crucial, and areas with high occurrence probabilities must be identified for future management efforts.

Conclusions
The present study indicated that the habitat distribution patterns of a dominant tree species could be modeled through MaxEnt using occurrence records and environmental variables. The model predicted that 5.84% of the riparian zone along the Chao Phraya River in central Thailand was highly suitable for D. alatus. The distribution map of potential habitats can help to identify the likely distribution of D. alatus, and land-use management plans can be enacted based on existing and probable populations of D. alatus. In addition, our model results may facilitate the discovery of new populations of D. alatus, the identification of priority survey sites, and the design of a priority conservation or resource management zone based on the ecological boundaries associated with D. alatus. The distribution modeling of associated species could also help to determine other regions where D. alatus populations can grow well and regenerate. Thus, similar ecological modeling approaches can inform the spatial plans of species conservation and regeneration in response to climate change. The approach presented in this study appears quite promising for predicting suitable habitat for threatened and endangered species with small sample records, and may be an effective tool for biodiversity conservation planning, monitoring, and management.