Biodiversity of phytoplankton cyst assemblages in surface sediments of the Black Sea based on metabarcoding

Abstract Resting stages are common for the life cycle of some phytoplankton species, including blooming and potentially toxic species. The “seed bank” accumulated in the sediments can initiate blooms in the water column and could be an early warning signal of harmful algal blooms (HABs). In order to identify the phytoplankton cyst assemblages, thirteen surface sediment samples were collected from different sites in the Black Sea. The diversity of the resting stages was assessed using high-throughput sequencing metabarcoding (V7-9 hypervariable region of the 18S rDNA). One hundred and eighty microalgal species were identified with high level of similarity to the reference sequences. Dinoflagellates were dominated by Biecheleria, Gymnodinium and Karlodinium. Within diatoms, Skeletonema, Chaetoceros and Thalassiosira were the most abundant genera. Sixteen of the detected operational taxonomic units (OTUs) were assigned to harmful microalgae (12 dinoflagellates species, 1 diatom, 1 haptophyte and 2 raphidophytes). No pattern of microalgal sequences depth distribution was discriminated. The results show that DNA metabarcoding has a great potential for assessment of the phytoplankton diversity in environmental sediments.


Introduction
Dormant stages are common for the life cycle of many planktonic organisms [1]. Dormant stages could be a result of sexual reproduction (the resting cysts of dinoflagellates) or of vegetative division (spores or resting cells of diatoms) [2]. These stages are of multiple functions: a strategy to overcome unfavourable conditions, a seed population for bloom initiation and re-colonisation of the pelagic domain, aiding in the genetic recombination of the population and more effective mobilisation of nutrients [3]. Thus, a benthic life cycle stage can provide survival advantages in unstable pelagic environments [4]. Some of the species forming benthic resting stages may produce harmful algal blooms (HABs) having negative ecological effects and causing severe economic losses [5], i.e. benthic cysts could represent a "dormant threat" for an ecosystem in the future.
Current benthic intra-and inter-specific diversity is poorly investigated and we lack the understanding of how and in which range the sediment seed bank supplies the overlaying water column with genetic standing variation. Recently molecular studies have become a powerful tool for decoding marine protist benthic diversity [6][7][8]. The aim of this study was to assess the phytoplankton biodiversity in the Black Sea floor by HTS metabarcoding.  Table S1). Samples were stored in a dark place at 8 C in order to ensure that only resting stages survived in the sediments.

DNA extraction and sequencing
DNA was extracted from 0.5 g of sediment samples using an ISOIL DNA extraction kit (NIPPON GENE, Tokyo, Japan). Triplet DNA samples were separately prepared from each sediment of the sampling locations. To carry out a metagenomic analysis using the MiSeq 300PE platform (Illumina, USA), a set of universal primers to amplify the V7-9 hypervariable regions of the 18S-rRNA gene (SSR-F1289-sn, F: TGGAGYGATHTGTCTGGTTDATTCCG; SSR-R1772-sn, R: TCACCTACGGAWACCTTGTTACG) were used [9]. The massively parallel paired-end sequencing workflow was derived from the document 16S metagenomic sequencing library preparation: preparing 16S ribosomal gene amplicons for the Illumina MiSeq system distributed by Illumina (part no. 15044223 Rev. B). Two-step PCR was employed to construct the paired-end libraries [10]. The amplified PCR products were quantified, and the indexed second PCR products were pooled in equal concentrations and stored at -30 C until use for sequencing.

Treatment of massively parallel sequences (MPSs) data and selection of operational taxonomic units (OTUs)
Nucleotide sequences were demultiplexed and trimmed using Trimmomatic version (v.) 0.35. Trimmed sequences shorter than 200 bp were filtered out. The remaining sequences were merged into paired reads using Usearch v. 8.0.1517. Sequences were aligned with each other using Clustal Omega v. 1.2.0. Illumina paired-end reads were processed using mothur v. 1.33.0 software [11]. Identical sequences (100% similarity) were collated using the "unique.seqs" command in Mothur. Erroneous and chimeric sequences were detected and removed in Mothur using the "pre.cluster" (diffs = 4) and "chimera.uchime" (minh ¼ 0.1) [12]. Contiguous sequences identified using "count.seqs" in Mothur were considered OTUs and were used in the subsequent taxonomic analysis.

Taxonomic identification of OTUs
The taxonomic identification of each OTU was done using a BLAST search [13] against a template sequence database constructed with sequences downloaded from the National Center for Biotechnology Information (NCBI) FTP server that satisfied the definite conditions as described previously [10]. Subsequently, the taxonomic information was obtained from the BLAST hit with top bitscores for each query sequence, then, the OTUs of the same top-hit were merged.

Statistical analyses
The OTU accumulation curves were drawn using the rarecurve function in vegan package v. 2.0-10 [14] for the R statistical environment. The frequency distributions of BLAST top hit similarities for OTUs and MPSs were calculated at 1% difference increments. To compare among supergroup levels, frequencies of the OTUs and MPSs that were identical to those in the INSDs scoring >0.990 BLAST top hit similarity were calculated for each supergroup. Supergroups were defined as Alveolata, Amoebozoa, Apusozoa, Archaeplastida, Excavata, Hacrobia, Metazoa, Opisthokonta, Rhizaria, Stramenopiles, and Viridiplantae as described in [10].
Taxonomic richness (S), abundance (N), Shannon-Wiener diversity (H 0 ) and Pielou's evenness (J') were calculated by PRIMER v. 5 package (Primer-E Ltd., Plymouth, UK). The differences among sampling sites (congregated in CO, SH and OS habitats based on station depth) were tested by a one-way analysis of variance (ANOVA), with "habitat" as orthogonal and random factor with 10 levels. 3D Non-parametric multidimensional scaling (nMDS) performed on the Bray-Curtis matrix of the means was used to represent the relationships among all sequences (PRIMER v. 5 package). SIMPER procedure was performed to test the similarities within the three habitats and the dissimilarities between the habitats and to identify the responsible taxa. Box-Whisker plots were constructed to depict the descriptive statistics results for the three habitats (StatSoft, Inc. 2011).

Results and discussion
The present metabarcoding-based study provides structural information of benthic phytoplankton diversity in the Black Sea detecting a high OTU richness. After removal of sequence errors and chimeras, 2,483,047 sequences were obtained from all surface sediment samples. The MPSs number between samples varied from 139,855 to 248,521 (191,004 average (AVG) ± 31,795 standard deviation (SD)). The maximum number of MPSs after data treatment was about 1.8 times higher than the minimum number of MPSs obtained from the 13 samples. The sequences were clustered into 2,251 18S OTUs and varied between 277 and 748 (557 AVG ±168 SD), which is lower than the OTUs richness observed in other seas [6,7,15] ( Table 1).
The relationship between the numbers of MPSs and OTUs for the samples was investigated and shown as  a rarefaction curve. The rarefaction curves computed for each sample as triplicates reached saturation (Figure 2), suggesting that the sequencing effort was sufficient for estimating the eukaryotic biodiversity in the sediments. The frequency distribution of the top BLAST hit similarity indicated that approximately 28% of the OTUs and 57% of the MPSs were identical (>0.990) to those registered in the INSDs and used for the taxonomic assignments (Figure 3). The results are comparable with these from other studies [7].
The total number of protist sequences in the dataset was 2,054,782 (1459 OTUs): 436 OTUs (29.9% of all protist OTUs) and 1,158,927 MPSs (56.4% of all protist MPSs) score >0.990 BLAST top hit similarity. This confirmed the previous finding about the lack of data for a high proportion of benthic OTUs in the reference databases [7].
Phytoplankton taxa were represented by 180 OTUs (77.8% identified to the species level). The relative abundance at a class level is shown in Figure 4.
Although the sampling locations differ significantly in depth, no pattern of microalgal sequences (taxa) depth distribution (by habitats) was discriminated as shown in the 3D nMDS plot of samples clustering ( Figure 5). The SIMPER average similarity in CO, SH and OS was not high (55.6%, 64% and 55.5%, respectively) and matched the low SIMPER average dissimilarity between habitats: CO-SH (38.9%), CO-OS (42.5%) and SH-OS (34.9%). This is well supported by the high  number and uniformity of sequences (taxa) that represent the 70% cut-off dissimilarity matrix between habitats, indicating rather homogeneous assemblages at the "habitat" level as well as by the ANOVA results (p ¼ 1.0, at p 0.05)), indicating no statistically significant difference. For example, among the top 10 species in the Bray-Curtis matrices, 7 were common for the three "habitats" (Biecheleria sp., Gonyaulax spinifera, Ebria tripartita, Azadinium polongum, Woloszynskia pascheri, Nannochloropsis limnetica and Biecheleriopsis adriatica). In contrast, the difference among the sampling stations within the CO, SH and OS "habitats" was high ( Figure 5, Supplemental Figure S1 and Table S2).
Previous morphology-based studies reported a high diversity of dinoflagellate cysts in Black Sea sediments [3,16,17], but so far there is little information about diatoms resting spores. The molecular data obtained in this study added new information about sediment phytoplankton biodiversity. Among 51 identified dinoflagellate species, 30 have not been described in the Black Sea sediments until now. The dominant dinoflagellate OTUs were assigned to Biecheleria (67.9% of the dinoflagellate sequences), followed by Gymnodinium (14.6%) and Karlodinium (3.7%). Biecheleria OTUs were not identified to the species level due to the equal similarity of the sequences with two species Biecheleria brevisulcata and Biecheleria cincta. Biecheleria baltica described in the Black Sea sediments [16] was not found in the samples analysed in this study. Gymnodinium was represented by five species (Gymnodinium aureolum, Gymnodinium catenatum, Gymnodinium nolleri, Gymnodinium simplex and Gymnodinium sp.non-discriminated between Gymnodinium catenatum and Gymnodinium impudicum). Cysts of Gymnodinium nolleri were found in sediments from different Black Sea areas [3,16,17], Gymnodinium impudicum in Western Black Sea [3], whereas the other identified Gymnodinium species were reported only in the plankton [18]. Karlodinium was represented only by the potentially toxic K. veneficum, found in all samples. For this species there is no information in the literature about cyst formation, but it has been described in the Black Sea plankton [10,18].
In the present study, 40 diatom species were identified in the whole dataset, including the deepest sediment samples (>1900 m). The most abundant diatom OTUs were Skeletonema (65.7% of diatom sequences), followed by Chaetoceros (21.8%) and Thalassiosira (8.1%). The similar pattern was described for the Mediterranean Sea [8]. Some of the species identified in the present study (Chaetoceros mitra, C. wighamii, Navicula sp., Skeletonema menzelii, S. pseudocostatum, Staurosira elliptica, Thalassiosira pseudonana and T. rotula), were found in sediment samples from other seas on the basis of molecular and morphological studies [8,19].
Chlorophytes were represented by high OTUs number (23.9% of microalgal OTUs), with many freshwater species (e.g. Chlorella sorokiniana, Choricystis parasitica, Desmodesmus abundans, Desmodesmus intermedius). Additionally, the presence of freshwater picoplankton Nannochloropsis limnetica (Eustigmatophyceae), the resting stages of which have been reported in lake samples [20], was registered in 11 stations. New data for haptophyte and prasinophyte flagellates in Black Sea sediments were also obtained, among which Isochrysis galbana and Mantoniella squamata were recently described as forming resting stages in Mariager Fjord [21]. In total 16 of the detected OTUs were assigned to microalgal species listed as harmful [22] (Supplemental Table S3). They constitute 8.9% of all microalgae distinguished at the species level with scores >0.990 (8.2% of all microalgal sequences). Most of the potentially toxic species belonged to dinoflagellates, some present in all samples with a significant total sequence number (like Gymnodinium catenatum and Karlodinium veneficum). Among the OTUs assigned to genus Alexandrium, one is known as a paralytic shellfish poisoning (PSP) producer (A. minutum), two species are considered non-toxic (Alexandrium affine and Alexandrium tamutum) and two OTUs could not be identified to the species level and were interpreted as Alexandrium sp. Three species reported as producers of azaspiracids were also registered (Amphidoma languida, Azadinium dexteroporum and Azadinium poporum). The ichthyotoxic species Margalefidinium polykrikoides, Karlodinium veneficum, Pheopolykrikos hartmannii and Prymnesium polylepis were found in most of the samples. In addition, OTUs  Figure 1). similar to the yessotoxins-producing Protoceratium reticulatum and Lingulodinium polyedra, potentially toxic diatom Pseudo-nitzschia delicatissima and the toxigenic raphidophytes Heterosigma akashiwo and Fibrocapsa japonica were also detected in the sampling set. More than a half of the potentially toxic species identified in the present study had not been previously detected in Black Sea sediments, but resting stage formation had been reported in different seas [23][24][25][26], except for Amphidoma languida, Azadinium dexteroporum, Karlodinium veneficum, Pseudo-nitzschia delicatissima and Prymnesium polylepis. Generally the genus Pseudo-nitzschia is not known to form a resting stage with the exception of one single report [27].
The results suggest that environmental metabarcoding has the potential to improve the identification of resting stage cells in sediments with the prospect to assess their potential as a "seeding population" for further blooms. However, the method also has some limitations. It can provide only relative abundance data about the number of different OTUs in the samples related to the difficulty to interpret the cell abundance due to variations in the copy numbers of rRNA genes among different groups [28] and even within the same genus [29]. In addition, the problem with the DNA from sedimented vegetative cells, as well as the extracellular DNA preserved in the sediment [30] may affect the reliability of the results, although Piredda et al. [8] stressed the effectiveness of sediment storage in the dark and cold for selecting the resting stages.
Although the V7-V9 and V1-V3 hypervariable regions allow better resolution than the other regions of the SSU rRNA gene [9], their taxonomic identification power is still under consideration. Of the generated 18S OTUs, only 28.5% had sufficiently close matches (>0.990 BLAST top hit similarity) ( Figure 3) confirming that the effective number of sequences registered in the international nucleotide sequence databases from benthic protists is still insufficient [7]. Out of the OTUs satisfying the taxonomic assignment criteria about 56.4% could be identified at the species level, 35.2% classified to the genus level, and 8.4% matched multiple genera, indicating the lack of intergenus taxonomic resolution of the marker (e.g. Alexandrium pseudogonyaulax and A. hiranoi, Chaetoceros neogracile and C. curvisetus, Thalassiosira eccentrica and T. antarctica).

Conclusions
The obtained environmental high-throughput sequence data provide a better assessment of the taxon composition of benthic propagules, adding new information about sediment phytoplankton biodiversity. The results indicate that high protist diversity is accumulated in Black Sea sediments. In addition to already reported cysts morphotypes, 30 dinoflagellate species were identified. New data for the presence of other phytoplankton groups as diatoms, chlorophytes, haptophytes and prasinophytes were also provided. The identification of 16 potentially toxic species, some of which found in all stations, highlights the ecological significance of the benthic resting stages for HAB events. Moreover most of the potentially toxic species (Amphidoma languida, Azadinium poporum, Azadinium dexteroporum, Karlodinium veneficum, Prorocentrum cordatum, Prymnesium polylepis, Pseudo-nitzschia delicatissima, Fibrocapsa japonica and Heterosigma akashiwo) had not been reported earlier in Black Sea sediments. By improving the detection of HAB species diversity in the sediment, the high-throughput sequencing approach could be instrumental for advancing our knowledge about their life cycle. Further studies are needed towards understanding the association between life stages, biodiversity and HAB events.