Distribution shifts in habitat suitability and hotspot refugia of Andean tree species from the last glacial maximum to the Anthropocene

ABSTRACT Forecasting the effects of climate change on the distribution of Andean trees (Polylepis, Rosaceae) is important to understand how species respond to climate variability and to assess their resilience to the ongoing climate crisis. Here, paleodistribution modelling is used to assess distribution shifts of 17 Polylepis species during the Last Glacial Maximum (LGM), mid-Holocene (MH), and the Anthropocene in the central Andes. Species distribution models (SDMs) were computed by combining presence records and current climatic conditions using Maxent and projected onto three climatic scenarios for the LGM (~22,000 yr BP), the MH (~6,000 yr BP), and the Anthropocene (1,970–2,000). Subsequently, current refugia hotspots were identified by intersecting suitable habitat over the LGM, MH, and anthropogenic conditions for all the studied species. SDMs for the LGM and MH showed a contraction of climate suitable habitat for most of the species in comparison to the Anthropocene. Four current refugia hotspots were identified: central Cordillera of Peru, Lake Titicaca basin, western Cordillera of Bolivia, and northern Chile. In general, SDMs were consistent with patterns proposed with pollen records, and partially with available phylogeographic studies. Current hotspots are highly important areas for the conservation of Polylepis and associated biota. This study assists in understanding distribution shifts over millennia of Andean tree species in response to climate change and identifying key conservation areas for the delineation of future conservation strategies.


Introduction
The distribution of species has changed over time, contracting or expanding due to climate variability and human influence. As the cold conditions that prevailed during glacial cycles began to recede after the Last Glacial Maximum, major reorganizations of ecosystems produced biodiversity distribution patterns worldwide that involved changes in the abundance of species and large-scale migrations of taxa [1][2][3].
Nevertheless, a few important regions served as climate refugia and remained relatively stable, which helped to sustain certain specialized species and enable their persistence over time [4,5]. Understanding how species distributions changed in the past has important implications when defining conservation strategies, as it can provide useful knowledge on how species responded to extreme climatic events and to identify key conservation areas particularly for highly sensitive and endemic species.
Ongoing global climate change is presently impacting plant composition and ecosystem functions of Andean ecosystems. Recent research demonstrates that Andean ecosystems are changing as a consequence of increasingly warmer and drier conditions [6][7][8][9]. For instance, in the central Andes, between 1965 and 2012, maximum temperature increased at a rate of 0.15°C decade −1 [10] and between 1985 and 2004, precipitation decreased while droughts episodes increased in frequency and intensity [11][12][13]. Global circulation models project a much more intense warming in the Andean highlands, with a potential increase in temperature ranging from 2.5°C to 5.9°C and a 20% decline in precipitation between 2070 and 2099, with potentially catastrophic consequences in ecosystem composition, structure, and the services they provide [14]. Hence, identifying key areas for the conservation of Andean biota is important for maintaining biodiversity in the future.
Polylepis (Rosaceae) includes 20-45 species of small trees and shrubs that inhabit the Andean highlands from Venezuela to Chile and Argentina [15,16]. Polylepis forms small and fragmented woodlands, where the genus usually constitutes the dominant or exclusive arboreal component [17]. In the Central Andes, Polylepis is found in a diversity of high-altitude vegetation formations, including the Puna, humid montane forests, inter-Andean dry forests, and the Tucuman-Bolivian forest [18][19][20]. These woodlands support a particular biota, including several endemic species that rely on these forests for foraging, nesting, and breeding [21][22][23]. Despite their ecological importance and high-conservation priority, Polylepis woodlands have decreased dramatically over the last decades due to the land-use intensification and global climate change [24][25][26][27][28][29].
Paleoecological studies suggest that Polylepis distribution and abundance have fluctuated over time due to climate variability and human disturbance [30,31]. During the Last Glacial Maximum (~25,000-21,000 years BP), ice and cold weather restricted the presence of Polylepis above 4000 m asl and, as a result, woodlands retracted to protected sites at lower elevations [30,32,33]. Subsequently, during the glacial-interglacial transition, conditions became warmer and more humid, which enable upslope migration of Polylepis and its expansion over the Altiplano. However, during the mid-Holocene (9,500-5,000 years BP), aridity prevailed, lowering lake levels and Polylepis distributional range contracted [30,33]. When mesic conditions returned between 5,000 and 4,000 yr BP, Polylepis was not able to expand as anthropogenic disturbance, particularly wood extraction and burning intensified, especially after agropastoralism became the main subsistence strategy throughout much of the Andes around 3,500 years ago [34][35][36]. After the European conquest, forest loss and habitat degradation increased due to intensification of economic activities, such as farming, grazing, and mining as well as an increase of firewood demand [37,38]. During the last several decades, loss of Polylepis woodlands has continued due to unsustainable land-use activities and ongoing global climate change [24][25][26][27][28][29].
Species distribution models (SDM) have become a common tool to reconstruct species responses to environmental changes, identifying climate refugia and predicting how species will respond to climate change [5,25,39]. SDM project suitable areas for a given species occurrence based the relationship between occurrence records and environmental characteristics that define the environmental niche [40]. These models can be used to identify patterns of modern distribution of a species based on suitable areas and use it to project changes in the species distribution under past or future climatic scenarios, assuming a constant biotic response or niche stability [5,[41][42][43][44].
In this study, we seek to complement previous paleoecological reconstructions and identify possible areas of climate refugia by estimating the potential distribution of 17 species of Polylepis during the Last Glacial Maximum (LGM, ~22,000 yr BP), mid-Holocene (MH, ~6,000 yr BP) and the Anthropocene (1,970-2,000). We use species distribution models (SDM) to examine the geographic occurrence of suitable climate under paleoclimate scenarios and modern conditions based on present occurrence records of the species. This approach is based on the assumption of climate niche conservatism, meaning that past changes in the distribution of Andean trees were mainly the product of migration or local extinction in response to changing climatic conditions.
The main goal is to identify areas of long-term climatic stability in the central Andes (Peru, Bolivia, northern Chile, and northern Argentina) and to identify regions of high conservation value for Polylepis woodlands. We specifically aim to: a) define the environmental ecological niche for each species, b) assess if species distribution during the LGM, middle Holocene, and Anthropocene overlaps, and whether the range size shifts over time, and c) identify the current refugia, i.e. regions climatically suitable from the Last Glacial Maximum to the Anthropocene.

Study area and species
A total of 17 species of Polylepis were selected that are distributed in the Central Andes. These Polylepis species were specifically selected because they occur in dry (i.e. Puna, inter-Andean dry forests and Bolivian-Tucuman forest) or in humid mountain regions (i.e. humid Puna and Yungas mountain forest). The Andean dry Puna (3,500-4,100 m) is characterized by low temperatures (2-7°C), scarce precipitation (250-400 mm), and high solar radiation [45]. In contrast, the humid Puna is distributed at elevation between 3,700 and 4,000 m asl (up to 4,100 m) and the weather is cold (7-9°C) and relatively humid (500-1,600 mm) [45]. The inter-Andean dry forest is found between 500 and 3,300 m asl, in regions with a (12-16°C) and dry (500-700 mm) weather [45]. The Bolivian-Tucuman forest (18-28° S) is distributed between 800 and 3,900 m asl and in areas with a relatively warm (5-23°C) and humid (700-2,000 mm) weather [45]. The humid mountain forests or Yungas is distributed between 1,000 and 4,200 m asl, the weather is warm (7-24°C) and humid (1,500 ->6,000 mm) with high cloud cover [45].

Species occurrences
We obtained occurrence records of the selected Polylepis species in the Central Andes from the Global Biodiversity Information Facility (https://www.gbif.org/ ) and herbarium collections (Missouri Botanical Garden, Herbario Nacional de Bolivia, and Herbario del Sur). All herbarium specimens were checked for accurate species identification with the support of specialists. Additional occurrences were obtained in the published literature, field reports, and field surveys conducted by the authors. To control for the quality of the occurrence data, we reduced spatial uncertainty by removing non-georeferenced records, outliers, and duplicate records for a given species. To account for the occurrence of several woodlands of the same species at a finer spatial scale, we imported presence records into QGIS (version 3.8 [46]) and kept only one record within 50 m of Euclidean distance. Locality data varied between 14 and 190 unique point localities per species.

Climatic variables and ecological niche models
To generate SDM, we employed 19 bioclimatic variables available at WorldClim (https://www.worldclim. org/). Current climatic variables (1,970-2,000) at 2.5-arc minute resolution were used to produce the current distribution models. To generate past distribution models during the Last Glacial Maximum (~21,000 years BP) and the middle Holocene (~6,000 years BP), three different circulation models were selected: The community Climate System Model (CCSM4), the Model for Interdisciplinary Research on Climate (MIROC-ESM), and the Max-Planck-Institute fur Meteorologie (MPI-ESM-P) with a spatial resolution of 2.5 arc minutes. As the study region, we delimited a rectangle that surrounded the extent of the Central Andes (latitude −80 to −70° S) and clip all environmental layers. This area was used to model the distribution of all the species. We acknowledge that it includes large regions that species do not inhabit because of limitations to dispersal; thus, potential distribution maps should be taken into consideration.
To model the potential distributions of species, we employed Maxent v. 3.3.3k [47]. Maxent utilizes a maximum entropy algorithm to model species distribution with presence-only data (Phillips et al., 2006). Given that a high correlation is expected between many climatic variables, we conducted an initial run in Maxent using all 19 bioclimatic variables for current conditions for each species. The selection of environmental variables was based on a stepwise removal of environmental variables. This technique assumes that the best performing combination of environmental variables is unknown and it aims to approximate this optimum through a stepwise removal of the least contributing variables [48]. We employed a MaxEent jackknife test to identify which bioclimatic variables contribute the least to the overall model for each species. The variables contributing with less than 0.01 were removed from the list of environmental variables, and this process was repeated for all the 17 Polylepis species [48,49]. The value of variable contribution within each species was determined heuristically via a jackknife approach in the Maxent algorithm, and it fluctuates slightly between each species. We set up Maxent to select randomly 75% occurrence localities for model training and the rest 25% localities for model testing, and 1,000 iterations with a convergence threshold of 0.00001 [50,51]. We subsequently generated modeled potential distribution models using current, LGM, and MH scenarios for three circulation models (CCSM, MIROC, and MPI-ESM-P). We used the receiver operating characteristic curve analysis (AUC) and receiver operating characteristic (ROC) to assess the predictive capacity of the models [52]. ROC analysis is a method designed to evaluate the specificity (absence of commission) and sensitivity (absence of omission error) at a diagnostic test [53]. The AUC provides a threshold-independent measure of model performance as compared with that of null expectations [51].
To identify if species distributions changed in the past, an occurrence binary prediction was generated for each species at each time period. The distribution threshold was set at the minimum probability area containing all the training localities (Phillips et al., 2006). Maxent produces a modeled species distribution with relative probabilities of occurrence between 0 and 100, however the minimum probability threshold allows for the identification of a standardized percentage of the distribution area, thus allowing for comparisons of different models (e.g. Anthropocene, MH, and LGM) [47]. We combined maps to identify grid cells classified as suitable across all scenarios, producing "agreement maps", representing a conservative approach to account for variation between climate scenarios. Agreement maps were used to identify presumed species-specific refugia, by intersecting cells classified as suitable over consecutive times. "Current refugia" distribution models were constructed by identifying areas considered suitable during the LGM, MH, and Anthropocene (Supplementary information 1) [54].

Results
The distribution of sampling points in the environmental space (Figure 1) shows that several species share a similar climatic niche. For example, several species (P. rugulosa, P. subsericans, and P. tarapacana) currently distributed in the semiarid Puna occur under the same climatic conditions but P. tomentella occupies warmer areas. The species found commonly in the humid puna and inter-Andean dry forests (P. besseeri, P. subtusalbida, and P. triacontandra) also occupy areas with similar climates, except for P. incarum that occupies areas comparatively drier. The species that inhabit the humid montane forests (P. flavipila, P. lanata, P. neglecta, P. pauta, P. pepei, P. racemosa, Figure 1. Climatic envelope of 17 species of Polylepis in the central Andes as a function of: a) mean annual precipitation vs. mean annual temperature, b) maximum temperature of warmest month vs. minimum temperature of coldest quarter, and c) precipitation of driest quarter vs. precipitation of wettest quarter in the central Andes used for species distribution models. and P. weberbaueri) showed a high overlap and little disaggregation in the climate space. The only two species distributed in the Tucuman-Bolivian forest, P. crista-galli and P. hieronymi, also exhibit high similarity in their climate niche.
Across the 17 species, the average cross-validated test AUC was 0.98. AUC values ranged from 0.93 in P. racemosa and 0.998 in P. rugulosa and P. subsericans, which indicates good high classifier performance. The most important predictor variable for each SDM varied across species (Table 1). Nevertheless, temperature seasonality was one of the main predictors for 37.5% of the species, including P. besseri, P. neglecta, P. pepei, P. subsericans, P. substusalbida, and P. tarapacana. Minimum temperature of coldest month for 25% of the species, such as P. crista-galli, P. hieronymi, P. lanata, P. pauta, and P. subtusalbida. Some variables were important predictors for a very few species. For example, P. rugulosa distribution was mainly explained by the mean temperature of the coldest quarter and P. flavipila distribution was mostly associated with precipitation of coldest quarter.
Paleoclimate models projected different suitable habitats for the studied Polylepis species. During the LGM, the CCSM4, MPI-ESM-P, and MIROC-ESM models projected a consistent increase in suitable area for P. hieronymi, P. pepei, P. tarapacana, and P. tomentella and a decrease in suitable area for P. crista-galli, P. flavipila, P. pauta, and P. subsericans (Figure 2). SDMs showed inconsistent outcomes (i.e. contractions under some climate scenarios and expansions under other climate scenarios) for some species, including P. lanata, P. neglecta, P. racemosa, P. subtusalbida, and P. weberbaueri. The SDMs of P. besseri project a climatically suitable area on the region occupied by Lake Titicaca, we considered that this is an overprojection and that it is unlikely that the species colonized this region in the past.
During the middle Holocene, SDMs projected a relative increase in suitable area compared to the LGM suitable area for P. incarum, P. triacontandra, P. lanata, and P. racemosa (Figure 2). In contrast, SDMs showed a decrease in suitable area for P. neglecta, P. rugulosa, and P. weberbaueri. SDMs generated for P. pauta, P. subsericans, P. tomentella, and P. triacontandra showed inconsistent projections.
Agreement maps showed that suitable habitat spanned a smaller spatial extent in both previous time periods compared to current conditions for all the species (Table 2). During the LGM, the area of suitable habitat was on average 67% smaller that the warmer modern conditions, with the exception of P. hieronymi, P. rugulosa, P. tarapacana, and P. triacontandra which showed a larger suitable area in the past (range: 141-327%). During the MH, despite conditions became warmer and drier, the suitable area increased for 76% of the species, although for most of the species it was still smaller than the current suitable area. Exceptions were P. incarum, P. lanata, P. racemosa, and P. triacontandra, for which suitable areas were larger than modern scenarios, between 134% and 1158%. The current refugia model (the intersection between LGM, MH, and modern suitable habitat) showed a similar distribution of the centers of climate refugia for Polylepis species (Figure 3). The areas identified were the central Cordillera of Peru, the Lake Titicaca basin, the western Cordillera of Bolivia, and the semiarid Andes of Bolivia, Chile, and Argentina.

Discussion
The Last Glacial Maximum (~22,000 yr BP) was characterized by cooler conditions (5-8°C colder than modern conditions) that restricted vegetation cover and resulted in the contraction and downslope migration of tree species [30,55,56]. SDMs suggest that the suitable habitat for most of the studied Polylepis species was between 22% and 98% smaller during the LGM than under current conditions. However, our analysis does not take into account the area extent covered by glaciers, and it is feasible that ice cover further restricted the presence of Polylepis at higher elevations (>4,000 m asl), particularly P. tarapacana an P. incarum as they currently inhabit such elevations. Paleovegetation reconstructions have shown that at elevations above 4,000 m asl, the occurrence of Polylepis was scarce due to the presence of glaciers and cold conditions during the Last Glacial Maximum (20,000 and 17,000 BP) [30][31][32]. Palynological studies have also shown that Polylepis remained abundant between 2,800 and 3,400 m asl during the LGM [31]. This niche could have been occupied by a diversity of species that currently inhabit lower elevations, such as P. pauta, P. pepei, P. incarum, and P. racemosa.
In a few species, paleoclimate models projected an increase in suitable area during the LGM. These species currently inhabit a diversity of ecosystems, as coldadapted P. tarapacana in the arid Puna. Palynological studies have suggested that P. tarapacana woodlands persisted near Salar de Uyuni during the LGM [57] but were possibly absent near Nevado Sajama [58]. Furthermore, phylogeographic studies of P. tarapacana in the semiarid Central Andes found low levels of genetic differentiation between populations [59,60]. The low levels of genetic differentiation could be either the product of extensive genetic flow among populations or larger but disjunct woodlands during the late Pleistocene.
During the mid-Holocene, temperatures increased substantially (1-3°C warmer than today) and precipitation decreased considerably, generating arid conditions in the central Andes [56]. Species distribution models suggest that suitable habitat expanded from the LGM to the mid-Holocene for most of the species, in particular, for P. hieronymi, P. rugulosa, P. tarapacana, and P. triacontandra. These results agree with fossil records that have shown an increase in Polylepis abundance as well as its expansion during the MH (~9,500-5,000 yr BP) in several areas of the central Andes, including Lake Titicaca, Lake Miski, Chocheras, Huamanmarca, and Chochos [30][31][32][33]56]. It is likely that increasing fire activity may have further resulted in the contraction of suitable habitat of Polylepis during the MH. Evidence has shown that fire regime also increased in the central Andes during this period [55,56]. Fire is an important modulatory mechanism for the distribution of Polylepis, as it promotes patchiness by increasing adult mortality and reducing seedling recruitment [24,61]. However, the effect of fire on Polylepis cover is heterogeneous across the Andes. For example, in Khomer Kocha (central Bolivia) and Pacucha (Peru), a marked decline in Polylepis during the MH was associated with increasing fire activity [30][31][32]55,62]. By contrast, despite fire was a persistent element of the landscape in Huamanmarca and Miski (southern central Peru), Polylepis remained as a relative dominant component of the vegetation during periods of increased burning between ~7,000 and 2,500 yr BP [33]. Topographic relief and water balance have been attributed to play an important role in the permanence of Polylepis woodlands in these sites, as an irregular terrain likely acted as a physical barrier by protecting plants and generating humid microenvironmental conditions [29,30,33].
It is also worth mentioning that isolated and small populations of Polylepis species likely survived outside of the modelled range boundaries likely favored by specific and fluctuating microclimatic and microenvironmental conditions [31,33,63]. For example, Polylepis forests likely thrived during the middle Holocene in the high-rugosity catchments of Huamanmarka and Miski in the Peruvian southern cordillera [33]. Nevertheless, despite the importance of micro-refugia for Polylepis permanence, the spatial resolution used to run the simulations cannot be used to identify them, highlighting the potential that using increasingly finer-scale datasets can have for improving the fidelity and precision of the results. Current refugia were defined as areas that have remained suitable over millennia and across the LGM, MH, and Anthropocene. The modeled results suggest that climate refugia overlap for several species of Polylepis and we were able to identify four refugia hotspots located in the Central Cordillera of Peru, the Lake Titicaca basin, the western Cordillera of Bolivia, and the semiarid Andes of northern Chile. These putative climate refugia hotspots cover a diversity of ecosystems, including the semiarid Puna, the meso-Andean valleys, the humid montane forests, the Inter-Andean dry forests, and the Tucuman-Bolivian forest. Pollen records provide evidence of the existence of Polylepis woodlands in several locations corresponding to putative refugia hotspots [32,55,56,58,62,[64][65][66][67].
The agreement between past and present-day distributions for several Polylepis species could be elucidated by relative climate stability in the refuge areas. Local conditions seem to have buffered the extreme effects of the climate variability during the Late Pleistocene and Holocene, contributing to the survival of Polylepis species.
The identification of current refugia hotspots has important conservation implications for the conservation of Polylepis woodlands in the central Andes. These hotspots also include areas that have been previously identified as high conservation priority areas due to high species richness and endemism of plants and animals, including Polylepis-associated fauna [21,[68][69][70]. The SDMs also support the importance of the  southern-central Andes as a high conservation area [71]. The Bolivian inter-Andean dry forests and the Tucuman-Bolivian forest are highlighted as regions that merit further investigation efforts and conservation attention. The current refugia hotspots could constitute stable areas for the permanence and survival of Polylepis species. These areas may harbor Polylepis woodlands with high levels of genetic diversity in comparison to areas non-climatically stable. At the moment, few regional phylogenetic studies of Polylepis have been conducted, and it is unknown if this pattern hold for the other species. Results have shown, for example, that the semiarid northern Andes of Chile constitutes an area of relatively high-genetic diversity for P. tarapacana [59]. Multispecies genetic diversity studies between putative refugia and colonization areas could help to further understand the complex biogeographic history of Polylepis in the Andes.
This study provides evidence that SDM can be used as a complementary approach to paleoecological studies, offering spatially explicit projections that can be used to draw hypothesis about past and future geographic distributions of Andean species. The development of SDMs under different scenarios could greatly be benefited by taken into consideration the distributions of glaciers, paleolakes, fire occurrences, and archeological data to model the past role of environmental variability and humans on the distribution of Andean trees. Furthermore, the integration of ecological-niche and palaeoecological studies constitutes a useful approach for identifying critical areas in which to prioritize research and conservation efforts. This is particularly important considering the challenges of designing long-term conservation strategies under the ongoing climate crisis.