Contrasting phylogeographic structures between freshwater lycopods and angiosperms in the British Isles

ABSTRACT Aquatic plants face many novel challenges compared to their terrestrial counterparts. The habitat they occupy is typically highly fragmented, with isolated water bodies surrounded by swathes of “dry desert”. This can result in reduced gene flow, inbreeding, and potentially local extinction. The level of gene flow and degree of genetic structure in these species is also likely to be influenced by the mating system they adopt. To test this hypothesis we compare the phylogeographic structure of two freshwater plants in the British Isles, the largely clonal angiosperm Littorella uniflora, and the heterosporous lycopod Isoetes lacustris. We sampled both plants from lakes where they co-occur, and used restriction site-associated DNA sequencing (RAD-Seq) to infer their relationships. Genetic structure among lakes is higher in the angiosperm, which we associate with reduced sexual reproduction, and hence lower levels of gene flow between lakes. Furthermore, we found evidence of lineage-specific association to certain lake nutrient types in L. uniflora, which might result from environmental filtering of specific ecotypes. Overall, we conclude that the reproductive system of lycopods, which is less specialized to terrestrial conditions, provides an advantage following the secondary colonization of aquatic habitats by enabling frequent genetic exchanges between populations and potentially facilitating faster adaptation.


Introduction
The transition from aquatic to terrestrial environments has happened multiple times in animals and plants (Vermeij and Dudley 2000). This is typically accompanied by an array of challenges related to survival and reproduction (Li 2014). In plants, the ancestral mode of reproduction is inherently linked to the presence of water (Renzaglia et al. 2000), and the adaptation to dry conditions once plants became terrestrial required increasing degrees of specialization of the reproductive system (Banks 2009;Linkies et al. 2010;Niklas and Kutschera 2010;Qiu, Taylor, and McManus 2012) Several lineages subsequently made the transition back to aquatic environments, which is likely to disproportionally affect their dispersal abilities depending on their reproductive strategy.
In basal groups of land plants such as mosses, ferns, and lycophytes, male gametes are flagellated and desiccation intolerant, with sexual reproduction often requiring damp habitats even in terrestrial environments (Banks 2009). Secondarily aquatic species of these groups are therefore able to reproduce sexually underwater (Rury 1978;Nagalingum, Schneider, and Pryer 2006;Hutsemékers, Hardy, and Vanderpoorten 2013). By contrast, submerged flowering plants (angiosperms) share the mating systems of their terrestrial ancestors, and generally only sexually reproduce above the water using flowers (Cox 1988;Laushman 1993), although sexual reproduction underwater has evolved in some taxa (Philbrick 1988). The type of propagule will further affect dispersal in aquatic environments. Water-borne propagules will be efficient for dispersal within the aquatic environments, but the production of desiccation resistant propagules, such as fruits and seeds, may facilitate their dispersal across the "dry desert" between isolated aquatic habitats (Li 2014).
Gene flow between populations is determined by their dispersal ability and mating system. This, in turn, affects the genetic structure of populations, impacting their adaptive potential and resilience to environmental change (Loveless and Hamrick 1984). While population size and their spatial distribution will also influence the intraspecific genetic structure, in plants, the reproductive system is arguably the most important factor (Loveless and Hamrick 1984;Holsinger 2000). This has important evolutionary consequences (Morjan and Rieseberg 2004;Eckert et al. 2010;Schiffers et al. 2014; Barrett and Harder 2017), particularly in highly fragmented habitats (Young, Boyle, and Brown 1996;Aguilar et al. 2006). Habitat fragmentation is especially likely for plants from freshwater environments, such as rivers and lakes. These environments are ephemeral in evolutionary time, and not necessarily directly connected to other suitable habitats, leading to high risks of local extinction, small effective population sizes, and inbreeding depression (Barrett, Eckert, and Husband 1993). Despite these limitations, plants are widespread in freshwater environments and indeed have very large species ranges compared to terrestrial plants, a paradox that has long fascinated biologists (Barrett, Eckert, and Husband 1993). Solving this paradox requires estimating effective dispersal rates and gene flow using population genetics approaches. A number of studies have inferred the genetic structure of angiosperms and more basal groups of plants (e.g Lokker et al. 1994;Dong et al. 2007;Hutsemekers et al. 2010;Korpelainen et al. 2013;Zhu, Yu, and Xu 2015;Hofstra and de Winton 2016;Martínez-Garrido et al. 2017). However, genetic structure has never been directly compared between angiosperms and basal vascular plants colonizing the same freshwater environments.
The lycopod Isoetes lacustris L. and the angiosperm Littorella uniflora (L.) Aschers co-occur in lakes across Northern Europe (Murphy 2002). Despite 400 million years of independent evolution (Kenrick and Crane 1997), these two species exhibit convergent ecological and phenotypic traits. Both species have independently adapted to carbondepleted aquatic environments via a relatively slow growth rate, evergreen leaves, isoetid growth form, internal lacunae allowing access to sediment CO 2 , and Crassulacean Acid Metabolism (CAM) (Keeley 1981;Richardson et al. 1984;Boston 1986;Keeley 1998;Madsen, Olesen, and Bagger 2002). While their distribution, ecology and vegetative types are convergent, these two species retain divergent reproductive systems corresponding to their taxonomic groups. The angiosperm L. uniflora propagates asexually when submerged, by producing short stolons (Robe and Griffiths 1998), although the buoyancy and longevity of floating whole plants (Spierenburg et al. 2013) may also allow asexual dispersal over short distances within lakes. Flowering, and therefore sexual reproduction, can only occur when water levels decrease during the summer, exposing plants near the shores to the air (Robe and Griffiths 1998). Rates of outcrossing are unknown in L. uniflora, although Tessene (1968) found possible evidence of self-incompatibility in the closely related L. americana Fernald. Because immersion can be variable between populations and years (Hoggard et al. 2003) genetic exchanges might be limited in L. uniflora. On the other hand, seed dispersal might occur over long distances, with bird dispersal considered the most likely mechanism (Thorne 1972;Hoggard et al. 2003). However, little is known about how these traits influence the population genetic structure of L. uniflora (Hoggard et al. 2003).
In contrast, the reproduction of I. lacustris occurs via the fusion of micro-and mega-spores. Because spores disperse in the water (Vöge 2006), genetic exchanges are possible between submerged plants, although rates of outcrossing versus selfing are unknown. Little is known about the between-lake dispersal mechanism of heterosporous lycopopds (Larsén and Rydin 2015;Troia et al. 2016), with water fowl-and wind-mediated dispersal being the most prominent suggestions (Brunton 2001;Hoot, Taylor, and Napier 2006;Troia 2016). Long-distance dispersal in this species may still be challenging, as drying spores of the two closely related species I. lacustris and I. echinospora Durieu results in failure to germinate (Kott and Britton 1982). Whilst a number of studies of Isoetes species suggest some geographic structure, many of these are based on endangered species that have suffered population decline, and the age of the populations are unknown (Jin-Ming et al. 2005;Kim, Shin, and Choi 2009;Hofstra and de Winton 2016).
In this study, we contrast the intraspecific structure of L. uniflora and I. lacustris in Britain. Ice sheets covered most of Northern Europe, including Britain, until about 12,000 years ago and these geographic areas were subsequently recolonized from refugia (Cottrell et al. 2002;Hoarau et al. 2007). Both studied species were present in refugia in Ireland prior to recolonization, and are estimated to have arrived at similar times in paleolakes throughout Europe (Godwin 1984;Birks 2000). As a result, populations of I. lacustris and L. uniflora in Britain are highly similar in ecology and demographic history, and therefore represent an excellent system in which to understand the effects of their contrasting reproductive systems on population genetic structure, and its implications for adaptive evolution in these species. Using restriction site-associated DNA sequencing (RAD-Seq) of population samples spread from Snowdonia in Wales, to Aberdeenshire in Scotland and the Outer and Inner Hebrides of the Scottish Isles, we (i) infer the intraspecific genetic structure for each species, (ii) test for elevated differentiation in L. uniflora resulting from limited opportunities for sexual reproduction and (iii) test for genetic differentiation among nutrient types of lakes. Overall, this first parallel phylogeographic investigation of freshwater lycopods and angiosperms sheds new light on the effect of sexual reproductive strategies on population genetic structure, in association with habitat specialization.

Plant material and sequencing
Samples of Isoetes sp. and Littorella uniflora were collected from the Scottish mainland and the Outer and Inner Hebrides in August-September 2016, dried and stored in silica gel. In addition, individual samples of I. lacustris and L. uniflora were collected in 2016 from Llyn Idwal in Snowdonia, Wales (Fig 1(a) and 2 (a), Supplementary Table 1). Lake type was classified according to the Scottish Natural Heritage standing water database and the scheme of Duigan, Kovach, and Palmer (2007).
DNA was extracted from silica-dried leaf material using the DNeasy Plant Mini Extraction kit (Qiagen) following the manufacturer's protocol, with the exception of the elution step, which was performed once with 50 µl AE buffer. RAD-Seq libraries were built following the protocol of Soria-Carrasco et al.
(2014) using a modified common indexed adaptor to allow for paired-end sequencing (Peterson et al. 2012). In short, DNA (approximately 200-700 ng) was double-digested with EcoRI and MseI. Barcoded adaptors were ligated to the EcoRI side and a common adaptor was ligated to the MseI side. Following ligation, libraries were PCR amplified using standard Illumina sequencing primers. A total of 96 samples from the same and different projects were pooled based on relative estimates of library concentrations. The library pool was size selected by gel extraction, with a target size of 300-600 bp, and purified using the Qiagen QIAQuick Gel Extraction Kit (Qiagen).
Paired-end sequencing (2x125 bp) was performed on a single HiSeq2500 lane at the Edinburgh Genome Centre following standard protocols.
Raw-sequencing data were cleaned using the trimmomatic tool kit (Bolger, Lohse, and Usadel 2014), removing adaptor and primer sequences with the ILLUMINACLIP option in palindrome mode. The expected primer and adaptor sequences were supplied to the program and a maximum of two mismatches were allowed. The cleaned reads were further trimmed by removing low quality bases with q < 3 from both the 5ʹ and 3ʹ ends. Finally, bases with a quality score below 15 in a four base sliding window were also removed. Only reads longer than 36 bp after trimming were kept for downstream analyses. The cleaned reads were de-multiplexed and barcodes were removed using the processRADtag.pl script from the program STACKS (Catchen et al. 2013). Cleaned de-multiplexed reads are available from the NCBI Sequence Read Archive (accession number SRP155707).

Assembly and analyses of chloroplast genomes
Cleaned and trimmed reads were mapped onto previously assembled plastomes of I. lacustris and L. uniflora collected from Llyn Idwal, Wales (Wood 2018), using bowtie2 v.2.2.3 (Langmead and Salzberg 2012) with default settings for paired end reads. Base calls for each plastid genomic position were extracted using in-house developed shell-scripts (Olofsson et al. 2016) and maximum likelihood phylogenies were  Identification and analyses of nuclear polymorphisms RAD loci were de novo assembled using the program ipyrad v.0.7.2 (Eaton 2014), with default parameters for clustering and assembly. To avoid incorporation of plastid and mitochondrial loci in the final assembly, only clusters with coverage below 100x were processed. The maximum number of alleles per single nucleotide polymorphism (SNP) was set to two and only loci present in at least 40% of samples were incorporated in the final assembly. All samples from each genus were used for two separate clusterings.
For Isoetes a second assembly was performed using only the samples of the species I. lacustris (see Results).
A random single nucleotide polymorphism (SNP) with less than 60% missing data was extracted from each assembled RAD locus using vcftools v.0.1.15 (Danecek et al. 2011). The resulting unlinked SNP dataset was used for phylogenetic and genetic structure analyses. A maximum likelihood phylogeny was inferred for each genus in RAxML under a GTR + G substitution model and node support was evaluated with 100 bootstrap pseudo-replicates. Principal component analyses (PCA) were performed in the R package adegenet (Jombart 2008) using the dudi.pca function. Finally, pairwise F ST between different geographic regions and lake types as well as homozygosity were calculated in vcftools v.0.1.15.

Results
Genetic structure within L. uniflora A mean of 83% (SD = 4%) of the chloroplast genome of L. uniflora was covered by the filtered reads (Supplementary Table 2). The inferred plastid phylogeny was overall poorly resolved, with low support values (Fig 1(b)). Interestingly, some geographically distant populations were grouped together (e.g. samples from North Uist (20) and Perthshire (45) or Harris (9) and Coll (30) ; Fig 1(b)) while geographically close populations were placed in different parts of the tree (e.g. samples from two Lochs in Perthshire (41 and 45) or from Tiree (36 and 37) ; Fig 1(b)). Overall, a high diversity was observed, including within the single lake from Snowdonia, Wales (sample A, B and C ; Fig 1(b)).
The number of cleaned reads per sample varied from 800,000 to 2.4 million (mean = 1.5 million; SD = 520,000), probably reflecting variation in the quality and quantity of input DNA and libraries (Supplementary Table 3). A total of 128,359 RAD loci were assembled for L. uniflora. After filtering, 14,669 of these with 1.7% polymorphic sites were retained for analyses. The level of homozygosity was moderately high (mean F = 0.55, Supplementary Table 3).
The first two principal components (PC) in L. uniflora explained 16.7 and 12.4% of the variation in the data, respectively (Fig 3(a)). The first PC separated two samples (30 and 19) from all others, mirroring the chloroplast phylogeny where these two samples form a separate lineage (Fig 1(b)). The remaining samples formed three groups on the second PC, one of which corresponded to the Welsh samples (A, B and C), while the two others represent different types of lake independently of geography (41 and 45 from mesotrophic lakes, and 20, 37, 38 and 39 from oligotrophic lakes -Fig 1(a) and 3(a)). This pattern was broadly recapitulated in the maximum likelihood nuclear phylogeny, which placed the two distinct samples (30 and 19) as identified in the PCA as monophyletic and sister to all other samples (Fig 1(c)). Among the remaining samples, the monophyly of the mesotrophic and oligotrophic groups was strongly supported (bootstrap values of 97 and 82 ; Fig 1(c)). However, some important incongruences are observed between the chloroplast and nuclear phylogenies, such as a lack of clustering by lake type in the chloroplast phylogenies (Fig 1(b,c)). Pairwise F ST values (Table 1) show a moderate differentiation based on geographic origin with values ranging from 0.14 to 0.22 between populations from different regions. However, pairwise F ST among phylogenetic groups mostly confirms the genetic structure we observe.

Genetic structure within I. lacustris
A mean of 51% of the plastome of I. lacustris was covered by sequencing reads (Supplementary Table 2). The phylogeny inferred from plastomes revealed two divergent groups within Isoetes, with a bootstrap support of 100 (Fig 2(b)). Comparison of previously published I. lacustris and I. echinospora sequences identified a diagnostic SNP in that trnL gene, which suggested that members of the smaller clade were I. echinospora and those of the larger clade were I. lacustris (Fig 2(b)). Bootstrap support within the I. lacustris group was generally low.
In total 134,378 RAD loci were assembled for Isoetes, of which 16,451 were retained after filtering (Supplementary Table 4). These loci contained 4.4% polymorphic sites. A second assembly was performed using only I. lacustris samples, which resulted in a total of 99,672 RAD loci, of which 19,855 were retained after filtering, with 3.5% showing polymorphisms (Supplementary Table 5

The principal component analysis performed on all
Isoetes samples clearly separated the two Isoetes species identified in the chloroplast phylogeny (Fig 3(b)). Similarly, the nuclear phylogeny of Isoetes clearly separated the two species into two highly supported monophyletic clades (Fig 2(c)). Within I. lacustris, evidence of clustering is less clear than in L. uniflora, with samples broadly distributed over the first PC (explaining 14.5% of the variation) with little clustering between the geographic regions or lake types (Fig 3  (c)). The second PC explains 14.0% of the variation and broadly separates one sample (48), from a eutrophic loch on North Uist (Outer Hebrides), from the rest. Branch support values within the nuclear phylogeny of I. lacustris are low, and no clustering by geography or lake type is evident (Fig 2(d)). F ST values between geographic regions were generally lower in I. lacustris than L. uniflora (0.09-0.11 vs 0.14-0.22), with similar levels of differentiation between the Welsh, Scottish Mainland and Outer Hebrides samples (0.09-0.11, Table 2). Oligotrophic and mesotrophic samples showed limited genetic differentiation (F ST = 0.10, Table 2).

Waves of colonization of the British Isles
As the ice sheets retreated in post-glacial Britain, L. uniflora and I. lacustris were both early colonizers of the exposed aquatic habitats (e.g. Birks 2000). This pattern does, however, not seem to have involved a single wave of colonization from a limited number of sources. For both species we identified divergent genetic lineages in geographically proximal lakes. The cohabitation of distinct genetic groups is consistent with multiple, independent colonizations (e.g. Prentice, Malm, and Hathaway 2008;Rosenthal, Ramakrishnan, and Cruzan 2008;Hedrén 2009;Schenekar, Lerceteau-Köhler, and Weiss 2014). The distinct group of individuals of Littorella identified in some of the Hebridean lakes (samples 30 and 19) might represent glacial survivors (Westergaard et al. 2011) or post-glacial colonization from a distinct glacial refuge (Jiménez-Mejías et al. 2012). While these two scenarios cannot be distinguished without additional sampling beyond the British Isles, the coexistence of different genetic groups indicates that the freshwater plant populations are not homogenized. This view is moreover supported by the overall high chloroplast diversity coupled with a lack of a clear isolation by distance as estimated with pairwise F ST .

Higher population structure in Littorella
A higher level of genetic structure is observed in L. uniflora compared to I. lacustris, in terms of phylogenetic resolution, clustering in the principal component analyses, and pairwise genetic distances. These results are consistent with the hypothesis of more frequent sexual reproduction in I. lacustris than in L. uniflora. A lower rate of sexual reproduction in L. uniflora could potentially also explain higher levels of nucleotide diversity in I. lacustris. An alternative explanation is that migration between lakes is higher in I. lacustris than in L. uniflora. However, we find the latter scenario unlikely as desiccation boosts the germination of L. uniflora seeds (Arts and van der Heijden 1990), while it reduces that of I. lacustris spores (Kott and Britton 1982). Furthermore, the similar colonization times of these species observed in paleolakes (Godwin 1984;Birks 2000) suggests similar rates of dispersal. Establishing the causal mechanism for the higher genetic structure in Littorella would require additional studies, but our results suggest than gene exchanges in freshwater plants are more effective in lycopods capable of sexual reproduction underwater than in flowering plants where sexual reproduction is only possible in emerged flowers. Rather than being linked to the effectiveness of dispersal among lakes, we suggest that the observed pattern stems from the rate of intra-population genetic exchanges and the resulting impact of rare migrants on the different genetic pools. Lake nutrient type is correlated with geography in our sampling. However, an effect of habitat type is suggested for L. uniflora, where the population from the oligotrophic Loch of Lowe (39) in Scotland clusters more closely to those of the oligrophic Hebridean and Welsh lakes more than 250 km away, as opposed to the populations from mesotrophic lakes in Aberdeenshire only 6-9 km away (Fig 1(a) and 3(a)). This pattern suggests that selection was acting on migrants of L. uniflora colonizing the lake, effectively filtering genotypes by the nutrient condition. Littorella uniflora shows increased growth rates in response to increased nutrient concentrations (Christiansen, Skovmand Friis, and Søndergaard 1985), but declines in growth in highnutrient habitats (e.g. Farmer and Spence 1986;Robe and Griffiths 1992), potentially due to competition or nitrogen toxicity (Robe and Griffiths 1994;Smolders, Lucassen, and Roelofs 2002). Transplant experiments between eutrophic and oligotrophic lakes in Cumbria found some evidence of adaptation to increased nutrient levels (Robe and Griffiths 1992), suggesting the existence of ecotypes specializing in lakes of different nutrient status. Ecological filtering would also be consistent with higher levels of homozygosity observed in Littorella, due to reduced hybrid fitness. We suggest that the capacity to thrive in mesotrophic lakes evolved in some L. uniflora populations before or at the early stages of the colonization of the British Isles, limiting the subsequent migration to different lake types. Within I. lacustris, there is less evidence of genetic associations between samples due to nutrient type, although a single sample from a eutrophic lake is relatively highly differentiated from the other populations (Fig 3(c)). Growth of Isoetes is also likely to be influenced by nutrient levels (Gacia and Ballesteros 1994;Arts 2002), so that local adaptation might be expected. Our results do not test for local adaptation, but indicate that genetic lineages within I. lacustris are not restricted to specific lake types. More genetic exchanges as a result of frequent sexual reproduction would increase the pool of adaptive alleles available to the populations, potentially facilitating adaptation to complex, heterogenous aquatic environments (Santamaría 2002;Becks and Agrawal 2010;Luijckx et al. 2017). The extent to which these exchanges could contribute to adaptation to particular lake types would be dependent on multiple factors, such as rates of migration, the strength of selection and the genetic architecture of the trait (Rundle and Nosil 2005;Leimu and Fischer 2008). Testing the extent to which different Isoetes populations are adapted to varying nutrient conditions would require dedicated experiments (Blanquart et al. 2013), but our results suggest that the ability to reproduce sexually underwater could facilitate the spread of adaptive alleles between populations in I. lacustris.

Conclusions
In this study, we compared the genetic structure within the British Isles of two freshwater plants belonging to highly divergent groups; the lycopod I. lacustris and the angiosperm L. uniflora. Our investigations revealed higher levels of population structure in L. uniflora than in I. lacustris. We suggest this stems from increased opportunity for underwater sexual reproduction in the lycopod I. lacustris, with Litorella uniflora being reliant on above water structures. We also show that certain lineages of Littorella appear to be restricted to lakes of particular nutrient status. We suggest that this pattern results from early adaptation of some populations to new habitats following by strong ecological filtering. This pattern is not observed in I. lacustris, which could be explained by frequent genetic exchanges in this species allowing the potentially more rapid spread of adaptive alleles among lineages.

Notes on contributors
Daniel P. Wood is a postdoctoral research associate at Bangor University. His research focuses on the evolution of heavy metal tolerance in plants.
Contribution: conceived the study, secured plant material, performed lab work, data analysis and wrote the paper with the help of all authors.
Jill K. Olofsson is a postdoctoral research associate at the University of Sheffield. Her research focuses on the population genomics of complex trait evolution. Contribution: performed lab work and data analysis.

Scott W.
McKenzie is an ecological consultant with expertise in aquatic ecology.

Contribution: secured plant material
Luke T. Dunning is a postdoctoral research associate at the University of Sheffield. His research focuses on the genomics of adaptation in grasses. Contribution: conceived the study, secured plant material.