Habitat suitability modelling for Lagotis cashmeriana (ROYLE) RUPR., a threatened species endemic to Kashmir Himalayan alpines

ABSTRACT Rare and endemic species comprise globally a priority conservation concern in view of being at a higher risk of extinction. Recording the occurrence data for such species, especially in hardly accessible alpine habitats, is a rather challenging task. Modelling serves as effective tool for predicting habitat suitability and practising artificial introductions for such species with encouraging conservation implications. We used Maxent modelling to predict the habitats suitable for Lagotis cashmiriana (ROYLE) RUPR., a threatened species endemic to Kashmir Himalaya. Our modelling approach consisted of two rounds of modelling and one round of ground validations. The first round of modelling was based on the published literature and herbarium records and the second round included the new records that were obtained from ground validations based on first model predictions. Through this approach, we were able to identify several new populations of L. cashmiriana and reported a significant increase in its range size. We also identified the suitable areas for reintroduction of L. cashmiriana in the western Himalayan region after identifying a broad range of environmental factors that determine the distribution of this species.


Introduction
From the last few decades several factors like climate change, habitat fragmentation, alien species invasion, over-exploitation and pollution have emerged as major threats to global biodiversity. Of particular concern are threatened and endemic species. Characterised by restricted geographic ranges, high degree of habitat specialization, small population size, low reproductive capacity and limited geographic distribution, these species are at greater risk as compared to other widely distributed species (Markham, 2014;McKinney, 1999;Myers et al., 2000;Silva et al., 2017). At the same time these species have important role in ecosystem functioning as they help in maintaining the ecosystem diversity, make ecosystems resistant to invasion and act as indicators of general patterns of species diversity (Lyons et al., 2005;Lyons & Schwartz, 2001), thus, substantiating their conservation on priority basis.
Effective conservation strategies of threatened species demand a proper knowledge of their geographic distributions and determining suitable areas for their reintroduction (Nazeri et al., 2010;Polak & Saltz, 2011;Rodríguez-Salinas et al., 2010;Seddon et al., 2014). However, data about geographic distribution of plants remains often biased for being collected from easily accessible areas and during specific periods of the year (Funk & Richardson, 2002;Hijmans et al., 2000). Reintroduction of threatened and endemic species requires a detailed knowledge about the suitable habitats for the species occurrence. However, assessment of suitable habitats has been done for a small number of species due to time constraints. Ecological Niche Modelling tools serve as one of the most appropriate solutions in solving the problem of conservation in many ways (Botts et al., 2012;Huisman & Millar, 2013). They aid in estimating the distribution of species in conservation assessments and have been widely applied in conservation biology to predict the potential distributions of species (Chefaoui et al., 2005;Peterson & Vieglais, 2001;Rushton et al., 2004). They are now being increasingly used to determine various factors which govern the distribution of species (Elith et al., 2006;Guisan & Zimmermann, 2000;Kozak et al., 2008). Recently, numerous modelling techniques have been successfully applied in case of artificial introductions or selecting appropriate sites for their conservation and management (Elith & Leathwick, 2009;Gaston, 1996;Pecl et al., 2017;Watson et al., 2014). Recent developments in Ecological Niche Modelling (ENM) have explored applications to diverse conservation issues, including suitable habitat and species range estimates (Bellard et al., 2012;Chefaoui et al., 2005;Gaubert et al., 2006;Gritti et al., 2013), protected area prioritization and network design (Huisman & Millar, habitat disturbance on species distributions (Araújo & Peterson, 2012;Banks et al., 2005;Bing et al., 2019), to aid in IUCN categorization of species (Pena et al., 2014) and projecting future distributions under climate change (Bellard et al., 2012;Franco et al., 2018;Moraitis et al., 2019;Wang et al., 2018). ENM approach combines species occurrence data with ecological/environmental variables (temperature, precipitation, elevation, geology, and vegetation) to create a model representing species distributions compatible with the environment (Elith & Leathwick, 2009). Availability of high-resolution satellite imageries, downscaling tools for environmental variables and interpolated spatial datasets on climate and vegetation has enhanced the accuracy of prediction of the models manifold. Presently there is a wide range of environmental niche models for studying species distributions such as Bioclim (Busby, 1991), Domain (Carpenter et al., 1993), linear, multivariate and logistic regressions (Mladenoff et al., 1995), generalized linear modelling and generalized additive modelling (Frescino et al., 2001), discriminant analysis (Manel et al., 1999), genetic algorithms (Stockwell & Peters, 1999), artificial neural networks (Manel et al., 1999;Moisen & Frescino, 2002), and support vector machines (Guo et al., 2005).
Kashmir Himalaya which is a part of Indian Himalayan region occupies a pivotal position in representing a unique biospheric unit in the region. The mountainous region lies between 32°20ʹ-34°50ʹ North latitude and 73°55ʹ-75°35ʹ East longitude . Two thousand plant species have been recorded from the region , grouped under 710 genera and 132 families out of which 8% species are exclusively endemic to Kashmir. Despite the region comprises of only 0.48% landmass of India (Dar et al., 2008) its contribution to the country's angiosperm flora is 12% of which 3% are endemic (Dar et al., 2012). Most of these endemic species are restricted to the alpine and sub-alpine habitats. Three hundred and fifty-five species of plants have been rendered threatened due to various anthropogenic activities, such as habitat loss or modification, over-exploitation of economically important plants, alien species invasion, unchecked grazing, agricultural expansion, unplanned development and influx of tourists (Dar et al., 2008;Dar & Naqshi, 2002). Lagotis cashmiriana an endemic and threatened plant species of Kashmir Himalaya is facing an imminent threat due to over-grazing, fragile habitat, landslides, excessive tourist flow, construction of roads and over-exploitation for local use (Dar et al., 2006;Tali et al., 2014). Earlier Dar et al. (2006) carried studies on reproductive ecology and Exsitu conservation strategies of Lagotis cashmiriana as a means for its recovery and restoration. Identification of suitable habitats for reintroduction of Lagotis is the next important step in recovery and restoration of the species. Keeping in view the above-mentioned scenario we used Ecological Niche Modelling approach to model Lagotis cashmiriana with following objectives: (i) to develop reliable, statistically accurate, prediction maps depicting the potential distribution of the species, (ii) to find the suitable combinations of environmental variables driving distribution of species, (iii) to locate new populations and assess the population status in the predicted habitats through field validation and relate it with model thresholds, and (iv) to identify suitable habitats for species reintroductions.

Study area
We focussed on the Kashmir Himalaya a part of Indian Himalayan region situated in the northwest Himalayan biogeographic zone between 33°20-34° 54 N latitudes and 73°55-75°35E longitudes, covering an area of 16, 000 km2. Two of the 12 important biogeographic zones of India namely Northwest Himalaya (including the Kashmir region together with Tilel, Gurais, Keran and Karnah) and Trans-Himalaya (including the Ladakh region) occur in this region (Rogers & Panwar, 1988). The Kashmir Himalayan region is comprised of lofty mountains of the Pir Panjal in the South and Southwest and by the Great Himalayan range in the North and East with a deep elliptical bowl-shaped Kashmir valley in the middle. The altitude of the region ranges from 1500 (Valley) to 5, 420 m Kolahoi (the highest peak).

Study species
Lagotis cashmiriana Royle Rupr. (Kashmir Hare's Ear) is a perennial herb of family Scrophulariaceae found in alpine slopes of Kashmir Himalayas. It is found at an altitude ranging from 3,000 to 4,000 m. It is usually found in shady and wet areas and among rock crevices. The species is not only over-exploited in view of myriad medicinal uses but its individuals are also damaged by herbivores in various populations. Further, landslides, excessive tourist flow and construction of roads are the other factors which in conjunction with hostile habitat conditions contribute to the present threat status of this endemic species.

Species distribution modelling and ground validations
We followed stepwise modelling approach to get insights into the ecology and distribution of L. cashmiriana. Our modelling approach consisted of two rounds of Species Distribution Modelling (SDM) and one round for ground-validation processes. We based our initial model on fourteen secondary occurrence records obtained through herbarium specimens from Kashmir University Herbarium (KASH), published sources and field researchers. All these records were confirmed through repeated field surveys and accordingly were fitted in the model. Prior to fitting the occurrence localities in our models we considered issue of spatial autocorrelation among the presence localities. Although our surveys focussed on collecting coordinates from the entire region of our study; however, the records which were used to build the first model were spatially autocorrelated. We used "SDM tool box" (Brown, 2014) to remove spatially autocorrelated localities.
We carried out extensive surveys during 2014, 2015 and 2016 to explore the robustness and pertinence of the model in predicting the population status of the species in each locality of occurrence as predicted under various model thresholds. We surveyed twenty new sites across Kashmir Himalayan region as predicted by the model to be highly suitable for the species. The newly located sites (new populations) were simultaneously recorded. These new records were added to previous records and the final model was created to predict the suitable habitats for L. cashmeriana. To select localities for sampling, we converted the model predictions to binary maps (suitable/unsuitable) using the lowest presence threshold (LPT; also known as minimum training presence), i.e., the lowest prediction value returned by Maxent for a location with observed presence of the species. Assessment of the actual habitat type of the species in the localities of occurrence was done through repeated field surveys. We also superimposed the predicted potential areas on Google Earth Ver.6 (www. google.com/earth) imageries for habitat quality assessment. The predicted suitability maps were exported in KMZ format using Diva GIS ver. 7.3 (www.diva-gis. org). KMZs are zipped Keyhole Mark up Language (KML) files which specify a set of features such as place marks, images, polygons, 3D models or textual descriptions for display in Google Earth. The exported KMZ files were overlaid on satellite imageries in Google Earth to ascertain the actual habitat condition prevailing in the areas of occurrence.

Environmental data
We used bioclimatic variables to model the distributions of L. cashmiriana. These GIS data sets characterize global climates from 1950 to 2000 using average monthly weather station data and are available at different spatial resolutions. (Hijmans et al., 2005) and are known to influence species distributions (Root et al., 2003). To fit the models at the local scale, besides climate we explored the possible effects of several other types of physiographic factors such as elevation and slope (Table 1). Climatic predictors were obtained from WorldClim (Hijmans et al., 2005) with a spatial resolution of 30 arc-seconds (http://world clim.org/current) and resampled to a 500 m cell size for usage in the regional scale models. All spatial procedures were implemented in ArcGIS 10.1. We tested all the predictor variable for pair-wise correlations using the Spearman's rank correlation test, and only those with a correlation coefficient lower than 0.85 were taken (Elith et al., 2006;). Based on the correlation and taking into consideration the species ecology and the importance of extreme environmental conditions, a set of eleven variables was finally selected to fit our models.

Habitat suitability modelling
SDM software Maxent 3.3.2  was used to estimate the potential range of the species. Maxent predicts the distribution of species using the principle of maximum entropy, which finds the probability that is the closest to uniform combining environmental data with presence localities and background records sampled from the overall study area . Since the sample size for our species was low, we used only the linear and quadratic features . All other parameters were maintained at default settings. As recommended by Phillips et al. (2006) we used default settings. The Maxent default  (Franklin, 2009). In comparison with other presence only Species Distribution Models (SDMs), Maxent has been found more valid in delimiting the species fundamental niche (Elith et al., 2006). While running the Maxent model one can increase the number of replicates which facilitates the cross validation and subsequently the model calibration. Maxent uses machine learning technique which estimates the distribution of a species while conforming the empirical averages of the climate information associated with the occurrence data . It is one amongst the "presence-only" group of species distribution modelling methods which has been widely used and has the capacity to handle low sample sizes. To validate the model robustness, we executed 20 replicated model runs for the species with a threshold rule of 10 percentile training presence. In the replicated runs, we employed crossvalidation technique where samples were divided into replicate folds and each fold was used for test data. Other parameters were set to default as the program is already calibrated on a wide range of species datasets (Phillips & Dudik, 2008). Model quality was evaluated based on Area Under Curve (AUC) value and the model was graded following Thuiller et al. (2005) as: poor (AUC < 0.8), fair (0.8 < AUC < 0.9), good (0.9 < AUC < 0.95) and very good (0.95 < AUC < 1.0). Further, potential area of distribution and/or reintroduction were categorized into five classes based on logistic threshold of 10 percentile training presence, i.e., very-high (0.-762-1), high (0.572-0.761), medium (0.381-0.571), low (0.325-0.570) and very low (0-0.324). The lack of absence data especially for those species which have not been well documented and that are rare poses a major limitation of many studies of species distributions (Chefaoui and Lobo, 2008;Wisz & Guisan, 2009). However, several authors have (Chefaoui and Lobo, 2008;Elith et al., 2006;Graham et al., 2004) proposed random creation of pseudo-absences as an alternative way to overcome these limitations of presence only datasets, thus making the predictions more reliable and accurate (Elith et al., 2006). Pseudo-absences were generated by selected randomly assigning unoccupied grid cells within a polygon containing the collectively known distribution of each species within the study region.

Habitat suitability modelling
Our first model showed most of the alpine habitats of Kashmir Himalaya to be highly suitable for L. cashmiriana which is rather coherent with the described geographic range of the species. Besides, northern parts of Pakistan (Pakistan occupied Kashmir) were also predicted to be the suitable habitats for the target species (Figure 1). The extensive field surveys carried out during 2014, 2015 and 2016 based on our model thresholds covered almost the entire predicted area. At five sites (Sinthan Top, Pehjan, Peer Ki Gali, Kousarnag and Margan) new populations of L. cashmiriana were successfully located. Our final model (2) which included the newly identified sites showed rather widespread distribution adding some parts of Himachal, Uttrakhand and additional parts of Pakistan occupied Kashmir Himalaya to be also suitable.

Model calibration and factors determining species distribution
Our both the models attained an AUC value of>0.90 (0.99 ± 0.0009, 0.99 ±.0004) and thus can be considered as excellent . Both can successfully discriminate between suitable and unsuitable habitats. Jackknife tests of model validations also confirmed that Maxent predicted the species' occurrence significantly better than random expectations (p < 0.05). Based on the analysis of variable contribution as given by Maxent, Precipitation Seasonality had the highest contribution in both the models followed by Mean Diurnal Range and Precipitation of Coldest Quarter as shown in Table 2 and Figure 2. The response curves for the environmental predictors most determinant for the species distribution of L. cashmiriana are presented in Figure 3. Overall, the response curves reveal that the species is mainly distributed in areas with lower values of Precipitation of Coldest Quarter, Precipitation Seasonality and low to medium temperatures, which is coherent with the known distribution of the species along the north-western Himalaya (Kashmir Himalaya).

Habitats for reintroduction
Our field surveys and habitat analysis in the species occupied habitats showed that L. cashmiriana occupies the alpine slopes with moist and rocky habitats. It sometimes grows in rock crevices and prefers pebbled and sandy soils at an altitudinal range of 3000-4000 m (Table 3). Superimposing the predicted potential habitat map of the species on Google Earth satellite imageries revealed a mosaic of habitats to be suitable for the species persistence (Figure 1(c,d)). High to very high habitat suitable areas for the species were continuous alpine patches of north-western Himalayan region. Medium to low habitat suitability areas were subalpine slopes among evergreen forests. The species was found to be closely associated with Juniper spp., Rhododendron spp. and Bergenia spp. which form thick mats and help to maintain moist conditions and also serve as safe refuge for the species. Besides in certain instances the species occurred between rock crevices assisted by moist habitats. Analysis of habitat suitability under current climatic conditions reveals that overall suitable area for species reintroduction is 951 km 2 of which 472 km 2 is less suitable, 230 km 2 medium, 132 km 2 highly suitable and 117 km 2 very highly suitable areas for L. cashmiriana.

Discussion
Rare and endemic species have acquired top priority for conservation worldwide because these species are at higher risk of extinction. Mapping potential habitats for rare and endemic species can aid in conservation planning and management. We used Ecological Niche Modelling approach as an important tool for conservation of Lagotis cashmiriana an endemic and threatened plant species of Kashmir Himalaya. Our models fitted with both climatic and non-climatic predictors depict, from a robust modelling approach, the potential range of the species besides identifying the most suitable areas for its occurrence. Moreover, our models were successful in predicting the previous distribution range of the target species and identifying highly suitable areas which are coincident with grid cells where the species have not been recorded yet. Based on our model predictions followed by extensive field surveys we were able to locate five new populations of L. cashmiriana thus validating our spatial projections. Our spatial projections can support targeted surveys to collect additional records for the species, help identifying source and sink populations, and support the selection of populations to target urgent conservation measures.  Our study also explains the role of habitat suitability modeling in identifying the habitats for reintroduction of threatened and endemic plant species. Analysis of habitat suitability under current climatic conditions revealed that overall suitable area for species reintroduction is 951 km 2 for L. cashmiriana of which 472 km 2 is less suitable, 230 km 2 medium, 132 km 2 highly suitable and 117 km 2 very highly suitable. The predicted suitable areas compose a mosaic of habitats including alpine rocky slopes, grasslands, pastures and also forest areas in upper reaches. Rocky alpine slopes, moist areas near alpine streams, and habitats among Juniperous patches and Betula patches are among high probability areas for the species; hence, these areas could be used for in situ conservation of the species.
However, while taking species reintroduction plans into consideration, appropriate habitats should be carefully selected based on field observations. Reintroduction of the target species in the identified habitats would help a great deal in rehabilitating the species population and in improving conservation status hence in conserving the overall biodiversity of the region.
Our study can provide a road map for applying distribution modelling as a conservation tool for other species which are threatened and need immediate conservation efforts. Our Model predictions and field assessments reveal that L. cashmiriana has a limited potential distribution. Majority of the predicted habitats are also highly affected by human activities like road construction, over-exploitation, habitat fragmentation etc. Thus there is an immediate need to go for conservation measures both Insitu and Exsitu. Earlier Dar et al. (2006) worked on vegetative propagation methods for L. cashmiriana however these studies need to be further supplemented by micropropagation techiniques for mass multiplication of the target species. The plants can then be reintroduced in their natural suitable habitats using Ecological Niche Modelling tool.
Recently, several studies have used numerous modelling approaches to successfully map distributions of species with fewer occurrences. For instance, Pearson et al. (2007) used only five species occurrence records and modelled Geckos in Madagascar by using Maxent modelling technique. Similarly, De Siqueira et al. (2009) modelled a rare plant with only seven presence records by using GARP software. In our study we also used Maximum Entropy Modelling approach as it has been one of the most widely used among the best performing methods. Maxent has a high predictive performance for both small and large sample sizes (Elith et al., 2006;Hernandez et al., 2006;Pearson et al., 2007;Wisz et al., 2008). Our models provide an excellent discriminatory ability following Thuiller (2003) showing AUC values above 0.90 for L. cashmiriana. Our final models which were calibrated with rather large number of occurrences predicted comparatively larger areas as suitable in comparison with known distribution, thus increasing the chances to  add new locations of occurrence for the species. In our study many areas predicted by our models as suitable were without the target species. Possible reasons for this could be either due to commission error by the model or because of the inability of the species to disperse to these locations. Discovery of additional populations of the target species is important since the species current habitats are being rapidly fragmented by humans. Extensive field surveys are required to further identify unrecovered populations. For this purpose those areas should be focused first which are predicted as highly suitable so that new populations can be discovered in the short term. The choice of suitable number and combination of environmental predictors is crucial when modelling rare and endemic species. It has been observed in ecological science that a few variables account for about 95% of the variation in distribution. Hence while modelling species distributions suitable number of relevant environmental variables should be selected. Too many variables can lead to over-fitting and over-prediction of distributions and too fewer variables can lead to under prediction. The inclusion of climatic and topographic information, to a certain extent, minimizes the potential source of error in prediction. However, with the improvement in technology both spatially and temporally, there could be better availability of data on vegetation, bioclimatic and topography for reliable prediction on the species distribution pattern. In our study we included both climatic and topographic variables which we believe have direct relevance to the species. The temperature variables describe the thermal tolerance of the species, the annual precipitation describes water availability and the physiographic factors describe the macro-habitat characteristics of the species. Out of nineteen bioclimatic variables we cautiously selected only those variables which were not highly correlated and which gave maximum contribution to the maxent model. Our models show that a mixture of climatic and non-climatic variables is needed to explain endemic species distributions in Western Himalaya. The response curves reveal that the species is mainly distributed in areas with lower values of Precipitation of Coldest Quarter, Precipitation Seasonality and low to medium temperatures, which is coherent with the known distribution of the species along the north-western Himalaya (Kashmir Himalaya). Our model can also inform in advance the range dynamics of the target species under climate change scenarios as reflected clear involvement of climatic variables in delimiting the distribution of L. cashmiriana. Climate change has been found to cause distribution shifts in many plant species (Hoegh-Guldberg et al., 2008) and narrowly endemic species are believed to face more extinction risks as compared to others species (Thomas et al., 2004). Specifically for our test species the model highlighted a larger dependence on features of the precipitation regime and low temperatures, which would support more accurate forecasts if climate change scenarios are applied.
We agree that the variables that we selected for our study did not account for species dispersal and biotic interactions which are important factors for determining species distributions (Kearney & Porter, 2009); however, we do not possess enough biological knowledge about the target species to account for such interactions.

Conclusion
L. cashmeriana is an endemic and threatened plant species of Kashmir Himalayan region. It is facing an imminent threat due to over-grazing, fragile habitat, landslides, excessive tourist flow, construction of roads and over-exploitation for local use, thus demanding its immediate conservation. We used Ecological Niche Modelling approach as a firsthand conservation tool to predict suitable habitats for L. cashmiriana. Our results complement the growing body of literature that indicates the significance of Ecological Models to predict potential species distributions, identify new populations of threatened and endemic species and to locate suitable habitats for species reintroductions. On the basis of model predictions, we successfully reconfirmed previous populations and located five new populations of L. cashmiriana. We were able to identify most suitable habitats were the target species can be reintroduced. We believe that well-designed extensive field surveys in the predicted regions will further improve the estimates of range size, which may likely reduce the current threat status for L. cashmiriana. Although we have successfully predicted distribution under current climate conditions, the distribution of L. cashmiriana needs to be studied under future climate by applying various climate change scenarios. Our modelling method can be applied for other endemic and threatened species of the region.