Mitochondrial and plastid genome variability of Corallina officinalis (Corallinales, Rhodophyta)

ABSTRACT Corallina officinalis is a calcifying red alga, common in tide pools in the North Atlantic with occasional reports from the north-east Pacific. It is an important habitat-forming alga, providing shelter and substrata to many other organisms. To date there are only five published organellar genomes for Corallina, including C. chilensis and C. ferreyrae. This study reports the first four published plastid genomes for C. officinalis, along with three new mitogenomes from samples in the United Kingdom, Spain and Iceland. The plastid genome is 178 kbp and 99.9% of bases are identical for all samples. The mitogenomes are more variable than the plastid genomes, with lengths varying from 26.2 to 26.7 kbp and 99.0% base identity. Structure and length of both of the genomes are consistent with other published Corallina genomes. The most variable mitochondrial gene is sdhD (3.3% variability), while all plastid genes have <1% base variability, with the most variable being psb30 (0.95% variability). The stability of the plastid genome means it is not useful for examining intra-specific variability within Corallina. We discuss whether the ratio of mitogenome and plastome sequences recovered in the readpool of NGS sequencing is indicative of relative copy number.


Introduction
The red coralline alga Corallina officinalis (Corallinales) is a calcified seaweed that is widespread in the North Atlantic (Brodie, Walker, Williamson, & Irvine, 2013), with a few reports from the north-east Pacific (Hind, Gabrielson, Lindstrom, & Martone, 2014; Magill, Maggs, Johnson, & O'Connor, 2019). It inhabits tide-exposed rock pools, where it provides vital ecosystem services such as substrata provision and a habitat that facilitates invertebrate recruitment (Nelson, 2009; Perkins et al., 2016). Corallina officinalis has a high-magnesium-calcite skeleton, the most soluble polymorph of calcium carbonate deposited by marine calcifiers, making it potentially vulnerable to the global decrease in ocean pH (i.e. ocean acidification), which can critically undermine the structural integrity of calcifying organisms. To date, studies have reported a wide tolerance of this species to fluctuating tide-pool conditions, including ambient carbonate chemistry, water temperature and light regime (Williamson, Perkins, Voller, Yallop, & Brodie, 2017; Williamson et al., 2018). However, recent work has demonstrated that physiological responses in C. officinalis vary over its distribution in the NE Atlantic, with populations from the UK and Spain showing markedly different responses to conditions (Kolzenburg et al., 2019).
We know from population genetic analysis that there is significant genetic structure (based on nuclear SNPs) within C. officinalis over a wide latitudinal gradient, from Iceland to Spain (Yesson, Jackson, Russell, Williamson, & Brodie, 2018). There is evidence of gene flow between the British Isles and Spain, but Icelandic populations appear more isolated (Yesson et al., 2018). We also know there is substantial genetic variability within and between Corallina species based on widely used mitochondrial and plastid DNA "barcode" sequences (Williamson et al., 2015). However, genome-level assessment has yet to be undertaken.
Genome analysis can provide valuable insight into the relationships of many organisms, including red algae (Iha et al., 2018). However, relatively few florideophycean algae (the largest class of red algae) have had both organellar genomes sequenced: of the 102 florideophycean plastid sequences available (Cho, Choi, Lam, Kim, & Yoon, 2018), 40 are accompanied by a mitogenome (Bustamante, Calderon, & Hughey, 2019; Salomaki & Lane, 2017). To date, three species of Corallina have complete mitogenomes published: C. officinalis (Williamson, Yesson, Briscoe, & Brodie, 2016), C. chilensis (Alejo et al., 2019) and C. ferreyrae (Bustamante et al., 2019); the latter two species also have complete plastid genomes (Alejo et al., 2019; Bustamante et al., 2019). This study examines the mitochondrial and plastid genomes of four samples of Corallina officinalis from three countries to quantify organellar genome variability within this species and to evaluate how useful these genomes are for studying intra-specific patterns.

Materials and methods
Total genomic DNA was extracted from 0.5 to 1 cm² of each frond sample using a modified CTAB extraction method (Williamson et al., 2015). Double-stranded DNA was quantified with a Qubit 2.0 fluorometer (Invitrogen, Waltham, MA). Indexed libraries were constructed with a TruSeq Nano DNA sample preparation kit (Illumina Inc., San Diego, CA) using the recommended ~100 ng of gDNA and sequenced on an Illumina MiSeq flowcell with v3 chemistry (2 × 300 bp paired-end reads). Two runs were performed: run 1 contained equal amounts of the UK samples, and run 2 contained the Spanish and Icelandic samples. These are the same data used in constructing the mitochondrial genome of sample BM001215284.
After sequencing, the read pools were assessed for quality using FastQC version 0.11.8 (Babraham Bioinformatics, Cambridge, UK). Adaptors were removed from the sequences using cutadapt (Martin, 2011), which was also used to trim poly-A tails, and the final 10 bp were trimmed from all reads. After trimming, reads shorter than 35 bp were discarded.
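The end-trimming and length filter described above can be sketched in a few lines of Python. This is a minimal illustration only, not the cutadapt pipeline itself: the function name `trim_and_filter` and the (name, sequence, quality) tuple used to represent a read are hypothetical, adaptor and poly-A removal are assumed to have been done already, and "end" is assumed to mean the 3' end of each read.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical in-memory representation of one read: (name, sequence, quality string).
Read = Tuple[str, str, str]

END_TRIM = 10  # bases removed from the 3' end of every read (value from the text)
MIN_LEN = 35   # reads shorter than this after trimming are discarded (value from the text)


def trim_and_filter(reads: Iterable[Read],
                    end_trim: int = END_TRIM,
                    min_len: int = MIN_LEN) -> Iterator[Read]:
    """Trim a fixed number of bases from the 3' end, then drop short reads."""
    for name, seq, qual in reads:
        keep = max(len(seq) - end_trim, 0)  # number of bases retained
        seq, qual = seq[:keep], qual[:keep]
        if len(seq) >= min_len:
            yield name, seq, qual
```

With these settings a 45 bp read is kept at 35 bp, whereas a 44 bp read falls below the threshold and is discarded.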
Seed sequences were used for the assembly process. For mitogenomes, the cox1 region from accession KU641510 was used. For the plastid genomes the rps1 region was used based on C. ferreyrae (NC_041636). The seed-and-extend algorithm was run until the contig could be circularized.
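The text does not name the assembler, so the following is only a toy sketch of the general seed-and-extend idea, written under strong simplifying assumptions: reads are error-free, all on one strand, and every stretch of `min_overlap` bases is unique in the genome. The function names and parameters are hypothetical; a production assembler also handles sequencing errors, reverse complements, repeats and read pairing.

```python
from typing import List, Optional, Tuple


def extend_once(contig: str, reads: List[str], min_overlap: int) -> Optional[str]:
    """Extend the contig's 3' end using the read with the longest exact overlap."""
    best_overlap, best_read = 0, None
    for read in reads:
        # The overlap must leave at least one new base, hence len(read) - 1.
        longest = min(len(read) - 1, len(contig))
        for overlap in range(longest, min_overlap - 1, -1):
            if contig.endswith(read[:overlap]):
                if overlap > best_overlap:
                    best_overlap, best_read = overlap, read
                break  # longest overlap for this read found
    if best_read is None:
        return None
    return contig + best_read[best_overlap:]


def seed_and_extend(seed: str, reads: List[str], min_overlap: int = 20,
                    max_steps: int = 100000) -> Tuple[str, bool]:
    """Grow a contig from the seed; return (contig, circularized?)."""
    if len(seed) < min_overlap:
        raise ValueError("seed must be at least min_overlap bases long")
    contig = seed
    anchor = seed[:min_overlap]  # start of the seed, used to detect wrap-around
    for _ in range(max_steps):
        extended = extend_once(contig, reads, min_overlap)
        if extended is None:
            return contig, False  # no read extends the contig any further
        contig = extended
        wrap = contig.find(anchor, 1)  # has the start of the seed reappeared?
        if wrap != -1:
            return contig[:wrap], True  # trim the duplicated start: circular
    return contig, False
```

The circularization test here is deliberately naive: the contig is treated as closed as soon as the first `min_overlap` bases of the seed reappear downstream, which is only safe when that stretch is unique.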
Annotations were transferred from published Corallina genomes (Alejo et al., 2019; Bustamante et al., 2019; Williamson et al., 2016) in Geneious ver. 2019.1.3. Gene orders were validated by manual comparison with the template and by coverage. Annotations of transfer RNAs (tRNAs) were verified by tRNAscan-SE 2.0 (Chan & Lowe, 2019). Gene boundaries were extended/contracted to the widest matching open reading frame (ORF).

Results
All four samples produced millions of short sequence reads (c. 300 bp). After filtering, the four read pools ranged in size from 17 to 57 million reads (see Table 1). Mean coverage of the full genomes was in the hundreds for all samples and both genomes. The Icelandic sample (the northernmost sample) had the lowest ratio of plastid to mitochondrial reads, while the southernmost sample (from northern Spain) had the highest.
Mitogenome lengths ranged from 26,265 to 26,700 bp, which represents a 1.7% length variability. GC content was consistently 30.0%. The structure of the mitogenome was conserved over all samples, and each mitogenome codes for 23 protein-coding genes, 25 tRNA genes, and two rRNA genes (Table 2, Supplementary figure S1). Each mitogenome included a large (1999-2434 bp) intergenic region (IGR) containing only transfer RNAs, with the longest stretch of unannotated sequence lying between the trnY and trnN regions (up to 898 bp). We note that this section contains long inverted repeats at either end. Within this large IGR, between the two long repeats, were the only two features that differed in length between populations: trnS1 (76-88 bp) and trnY (83-88 bp). All other regions were fully conserved in length. The most variable region is sdhD, which shows 3.3% (8/243) base variability, while overall there is 0.99% variability across annotated regions (Table 2). Supplementary table S2 provides a detailed list of the annotated regions of the mitogenomes.
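The two variability figures quoted for the mitogenome can be reproduced with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the length variability is assumed to be the range divided by the shortest length (which gives the reported 1.7%), and gaps in an alignment are counted as a distinct state, which may not match how the published values were calculated.

```python
from typing import Sequence


def length_variability(lengths: Sequence[int]) -> float:
    """Range of genome lengths as a percentage of the shortest genome (assumed definition)."""
    shortest, longest = min(lengths), max(lengths)
    return 100.0 * (longest - shortest) / shortest


def percent_variable_sites(aligned: Sequence[str]) -> float:
    """Percentage of alignment columns that contain more than one state."""
    if not aligned:
        raise ValueError("alignment is empty")
    width = len(aligned[0])
    if width == 0 or any(len(seq) != width for seq in aligned):
        raise ValueError("sequences must be aligned to the same non-zero length")
    variable = sum(1 for column in zip(*aligned) if len(set(column)) > 1)
    return 100.0 * variable / width


# Mitogenome length range reported in the text: 26,265 to 26,700 bp.
mito_length_variability = round(length_variability([26265, 26700]), 1)  # 1.7
# sdhD: 8 variable bases out of 243 aligned positions.
sdhd_variability = round(100.0 * 8 / 243, 1)  # 3.3
```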
The plastid genome lengths ranged from 178,170 to 178,183 bp, a much lower length variability than the mitogenome (0.2%). GC content was consistently 30.2%. Each plastid genome codes for 205 protein-coding genes (including 27 hypothetical conserved proteins and 10 unassigned ORFs), 2 non-coding RNAs, 31 transfer RNA genes, and 3 ribosomal RNAs (Fig 1, Table 2). A single group II intron is situated within the chlB gene, along with an intron-encoded ORF (ORF 456). Overall, the four plastid genomes were highly conserved, with only 102 variable bases (99.9% conserved). Of these variable bases, 91 are in coding regions and 61 are in named/recognized genes (excluding ycf genes and ORFs). The majority are synonymous substitutions, with only 14 resulting in a change of amino acid (Table 2). Supplementary table S3 gives a detailed list of the annotated regions of the plastid genomes.
The relatively conserved plastid genome gives much lower genetic distances between samples than the more variable mitogenome (Fig 2, Table 3). The Icelandic sample is the most genetically distinct according to the plastid genomes, but the north Devon sample shows slightly higher genetic distances according to the mitogenomes. The mitogenome and plastid genome phylogenies also differ slightly: C. ferreyrae is sister to C. officinalis (with weak support) based on the mitogenome, while C. chilensis is sister (with high support) based on the plastid genome data. The within-species relationships reflect the genetic distances, with the low genetic differentiation of the plastid genome being reflected in weak support values for the recovered relationships.
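The distance measure behind Table 3 is not stated here. As an assumption, the simplest option, the uncorrected p-distance (the proportion of compared sites that differ), is sketched below for a set of aligned sequences. The function names are hypothetical, and positions where either sequence has a gap or an ambiguous base are skipped.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance between two aligned sequences (A/C/G/T sites only)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":  # skip gaps and ambiguity codes
            compared += 1
            differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def pairwise_distances(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """p-distance for every pair of named sequences in the alignment."""
    return {
        (name1, name2): p_distance(alignment[name1], alignment[name2])
        for name1, name2 in combinations(alignment, 2)
    }
```

Applied separately to the whole-genome mitochondrial and plastid alignments, a function of this kind yields one distance per pair of samples, which is the form of comparison summarized above.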

Discussion
This study reports the first complete plastid genome for the widespread C. officinalis from samples in the North Atlantic. Two other Corallina species have complete mitochondrial and plastid genomes published, C. chilensis (Alejo et al., 2019) and C. ferreyrae (Bustamante et al., 2019).

Plastomes
Corallina officinalis plastomes are similar in size to the other Corallina plastomes, differing in length from C. chilensis by < 200 bp in all samples. The gene order amongst Corallina is identical for all shared annotations, although the rnpB and petJ regions are not annotated in either of the previously published Corallina sequences. rnpB is a non-coding RNA region which shows only 0.6% base variability; in contrast, the petJ gene shows higher base variability (6.1%). We also note that closely matching unannotated regions are contained within the other published Corallina sequences, and that these regions are both present and annotated in other published plastomes of Corallinaceae (Janouškovec et al., 2013). Another notable feature is a large (1600 bp) insertion in C. officinalis (relative to other Corallina species) between ompR and rrs rRNA, which encompasses two substantial (> 400 bp) ORFs (ORF 1811 and ORF 1809). The presence of species-specific regions containing multiple ORFs has been observed in plastid genomes of other red algae (Janouškovec et al., 2013). The group II intron within the chlB gene is present in other Corallinaceae (Alejo et al., 2019; Bustamante et al., 2019; Janouškovec et al., 2013). Overall, plastomes are highly conserved within and between Corallina species, which is consistent with the high plastome conservation reported for red algae in general (Iha et al., 2018).

Mitogenomes
As for the mitogenomes, there is at most an 805 bp length difference with other Corallina, and the structure and gene order are highly conserved. C. chilensis has a 500 bp ORF (orf158), and a similar 507 bp ORF, located between trnW and trnA, is evident in all C. officinalis mitogenomes. This section of the genome is highly variable across the genus Corallina, with 26.4% of bases showing some variation. This region appears to be a hotspot of variability for the red algae, as Iha et al. (2018) found high variability for this region in Gracilariaceae. The sdh3 region could also be a potential target for sequencing studies as it showed the highest variability of all genes; however, it is noted that this region is lost in some other Florideophyceae (Yang et al., 2015), potentially limiting its wider use. The trnT and trnI regions are not annotated for other Corallina mitogenomes (Alejo et al., 2019; Bustamante et al., 2019), although these sequence regions are highly conserved across Corallina. The trnI region is situated alongside trnL1 between cob and nad6, whereas in other corallines it is positioned within a group II intron between nad5 and nad6 (Lee et al., 2018). The length variability of trnS1 (gct) seen within C. officinalis is also observed over the Corallinaceae, with the length for Neogoniolithon (94 bp) being longer than any of our samples (Lee et al., 2018). trnY also shows length variability across other Corallina (C. chilensis 84 bp, Alejo et al., 2019; C. ferreyrae 83 bp, Bustamante et al., 2019). The numbers of tRNAs (25), CDS (23) and rRNAs (2) are in the same range reported for other red algae (Yang et al., 2015), although the tRNA count is the joint highest reported, matched only by the Rhodymeniophycidae Plocamium. The higher variability within Corallina mitogenomes relative to plastid genomes fits the expected pattern reported for many other organisms (Smith & Keeling, 2015).
Graphic generated by DNAPlotter (Carver, Thomson, Bleasby, Berriman, & Parkhill, 2009).

Ratio of cpDNA to mtDNA
The ratio of cpDNA reads to mtDNA reads is lowest in the Icelandic sample. It is notable that this is the most northerly population (close to the northern limits of the species in the North Atlantic). Environmental conditions for these northern populations are characterized by lower irradiance, temperature, and carbonate saturation. We know that environmental conditions can affect copy number of plastid and mitochondrial genomes in plants (Wang, Anderson, & Griffin, 2004), so it would be worth further investigation to test whether the relative coverage rates in sub-Arctic Iceland are influenced by the colder temperatures and lower light levels, particularly in light of evidence that the photoregulatory capacity of C. officinalis appears to decrease in more northerly populations (Kolzenburg et al., 2019).
The relative copy number of organellar genomes is a potentially interesting variable characteristic that is not   (Palmeira & Rolo, 2015) and western-blotting (Picard et al., 2011)). Currently, the turnover rate of Krebs cycle enzymes is the nearest proxy for absolute mitochondrial number, at least in animal models (Larsen et al., 2012). Assembly of multiple, paired, conspecific organellar genomes allows relativized estimation of copy-number proportions, as the assembler detects relative abundance of organellar sequences in the readpool. However, it is difficult to draw conclusions based on the low level of sampling in this study, particularly when copy number can change over time (Zoschke, Liere, & Börner, 2007).
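One way to express the read-pool ratio discussed above is to normalize each organelle's read count by its genome length, so that the much larger plastome is not favoured simply for being longer. The sketch below is a hedged illustration: the function names are hypothetical, reads are assumed to have the same mean length for both genomes (so read length cancels out), and the result describes relative abundance in the sequenced DNA extract rather than absolute copy number per cell.

```python
def reads_per_base(n_reads: int, genome_length: int) -> float:
    """Mapped reads per base of genome: a proxy for sequencing depth."""
    if genome_length <= 0:
        raise ValueError("genome length must be positive")
    return n_reads / genome_length


def plastid_to_mito_ratio(plastid_reads: int, mito_reads: int,
                          plastid_length: int, mito_length: int) -> float:
    """Length-normalized ratio of plastid to mitochondrial read abundance."""
    mito_depth = reads_per_base(mito_reads, mito_length)
    if mito_depth == 0:
        raise ValueError("no mitochondrial reads")
    return reads_per_base(plastid_reads, plastid_length) / mito_depth


# Hypothetical read counts, for illustration only (not data from this study);
# the genome lengths are within the ranges reported in the Results.
example_ratio = plastid_to_mito_ratio(
    plastid_reads=200_000, mito_reads=10_000,
    plastid_length=178_170, mito_length=26_265,
)
```

A ratio above 1 would indicate that plastid sequence is more abundant per base than mitochondrial sequence in that read pool; comparing this value between samples is the kind of relative estimate the paragraph above describes.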
In conclusion, this study reports the first four plastomes and three novel mitogenomes for C. officinalis. Plastome variability is very low within the species, making it an unlikely target for assessment of intra-specific variability. Mitogenome variability is higher and would make a better target for sequencing-based studies within the group.