Point of View

Long non-coding regulatory RNAs in sponges and insights into the origin of animal multicellularity

Pages 696-702
Received 13 Dec 2017
Accepted 28 Mar 2018
Accepted author version posted online: 04 Apr 2018
Published online: 25 May 2018

ABSTRACT

How animals evolved from a single-celled ancestor over 700 million years ago is poorly understood. Recent transcriptomic and chromatin analyses in the sponge Amphimedon queenslandica, a morphologically-simple representative of one of the oldest animal phyletic lineages, have shed light on what innovations in the genome and its regulation underlie the emergence of animal multicellularity. Comparisons of the regulatory genome of this sponge with those of more complex bilaterian model species and even simpler unicellular relatives have revealed that fundamental changes in genome regulatory complexity accompanied the evolution of animal multicellularity. Here, we review and discuss the results of these recent investigations by specifically focusing on the contribution of long non-coding RNAs to the evolution of the animal regulatory genome.

Multicellular life has evolved independently in at least 10 different eukaryotic lineages as diverse as animals, fungi, plants, slime molds and seaweeds [1-3]. The transition from a unicellular to a multicellular lifestyle required the emergence of genomic regulatory systems that enable dynamic spatiotemporal and cell type-specific gene expression, which in turn allows for greater functional diversity and specialization of cell types (reviewed in [4]). This process is thought to be orchestrated by the interplay between regulatory genes, including transcription factors (TFs) and signaling molecules, and non-coding regulatory DNA and RNA sequences [5]. Although it is widely appreciated that these systems ultimately evolved from genomic regulatory mechanisms present in single-celled ancestors, the questions of how and when multicellularity evolved, and what genomic innovations underpinned these evolutionary steps, remain a focus of evolutionary and developmental biology.

While many TFs and signaling genes evolved after the divergence of animals and their closest living relatives (e.g., choanoflagellates) and, thus, are correlated with the evolution of animal multicellularity (e.g., [6-8]), others have an older origin and are found in unicellular organisms [9-18]. For instance, an early burst in the diversity of LIM homeobox TFs and cell adhesion genes (e.g., Type IV collagens) occurred in unicellular holozoans well before the emergence of the first animals, consistent with the later co-option of these gene families into roles in multicellular development [19]. The striking conservation of the TF family repertoire in non-bilaterian multicellular animal lineages (cnidarians, placozoans, ctenophores and sponges) - and to a lesser extent unicellular holozoan lineages - further supports the premise that the evolution of animal multicellularity must have resulted from the evolution of other regulatory features that likely control how and when genes are employed during animal development, including cis-regulatory DNA, chromatin modifications and non-coding RNAs (ncRNAs) [5].

Molecular reconstruction of the most recent common ancestor of animals is, therefore, crucial to determine which of these regulatory features, if not all, were instrumental in the origin of animal multicellularity. Sponges are one of the earliest-branching animal lineages, diverging from other animals around 700 Mya [20]. Hence, key insights into animal origins could be gained by comparing genomic traits shared between sponges and all other animals (i.e., eumetazoans). Since their divergence, sponges and eumetazoans have had radically different evolutionary histories, with the ancestor of bilaterians, cnidarians and placozoans giving rise to a range of morphologically-complex body plans, and the ancestor of sponges yielding one morphologically-simple body plan. However, although sponges and eumetazoans share a remarkably similar repertoire of developmental genes [6,21-25], their disparate evolutionary trajectories had yet to be reconciled in terms of regulatory non-coding genome content and organization.

Here, we review recent discoveries of long non-coding RNAs and their associations with specific chromatin states in the sponge Amphimedon queenslandica, and discuss their implications for the evolution of gene regulation and animal multicellularity.

Animal long non-coding RNA-based regulation: Insights from the sponge Amphimedon queenslandica

The application of next-generation sequencing technologies over the last decade, largely in bilaterian model species, has revealed that animal genomes encode thousands of lncRNAs [26-39], which may be sense or antisense, intronic, or intergenic with respect to protein-coding genes [40]. LncRNAs range in length from 200 nucleotides to more than 10 kilobases, and are often multi-exonic and polyadenylated. Despite lacking obvious protein-coding potential, lncRNAs perform a wide range of regulatory roles beyond RNA's cardinal function in the flow of genetic information (see, e.g., [41-44]). These regulatory roles can be performed by the lncRNA transcripts themselves, or they may be mediated by the involvement of lncRNAs in higher-order chromatin structure, by the recruitment of regulatory protein complexes through RNA-protein interactions, or by a combination thereof (reviewed in [45]). For instance, lncRNAs act as scaffolds to bring two or more proteins into a complex or into physical proximity [42,46]. LncRNAs can also act as guides to recruit chromatin-modifying enzymes and can be required for localization of ribonucleoprotein complexes to specific targets [41,43]. Finally, several lncRNAs have been shown to act as decoys that titrate away microRNAs or regulatory proteins [41]. These roles suggest that evolutionary innovations involving regulatory lncRNAs were a cornerstone for the increasing complexity of genetic regulation in animals.

Unlike many protein-coding sequences, lncRNA genes are rapidly evolving and exhibit poor primary sequence similarity between species; orthologous lncRNAs are therefore difficult to identify [47], which precludes a detailed understanding of their evolution in terms of sequence, structure and function. In fact, while the role of lncRNAs in the regulation of developmental gene activity now appears to be widespread amongst animals [23,48-65], only a handful of lncRNAs have thus far been shown to possess conserved function(s) in evolutionarily divergent animals, and all functional studies are currently restricted to bilaterians [52,58,66-68] (Fig. 1A). Understanding the early evolution of these putative master orchestrators can contribute to reconstructing the origin of animal gene regulatory complexity and multicellularity.

Figure 1. Early evolution of animal long non-coding RNAs: Insights from the sponge Amphimedon queenslandica. (A) Despite a growing number of lncRNAs having been identified in bilaterian animals, the systematic investigation of lncRNAs in non-bilaterian animals has been lagging behind and, thus, we lack an understanding of their origin and early evolution. Yellow background highlights the animal kingdom. (B) and (C) Identification of Amphimedon lncRNAs. (B) Schematic representation of the Amphimedon queenslandica life cycle. Larvae emerge from maternal brood chambers and then swim in the water column as precompetent larvae before they develop competence to settle and initiate metamorphosis. Upon settling, the larva adopts a flattened morphology as it metamorphoses into a juvenile, which displays the hallmarks of the adult body plan. This juvenile will grow and mature into a benthic adult [121]. Adapted from [23]. (C) Developmental expression profiles of Amphimedon lncRNAs. Expression profiles of the top 50 differentially expressed lncRNAs during the transition from pelagic swimming competent larva to benthic juvenile. Each row represents data for one lncRNA. Pelagic stages include precompetent (P) and competent (C) larva; benthic stages include juvenile (J) and adult (A). Red indicates high expression level, light blue low expression. Adapted from [23].

Animal lncRNAs tend to be abundantly expressed in discrete cell types [69,70] and exhibit more tissue [31] and developmental stage specificity [71-74] than protein-coding genes at different expression ranges, suggesting that animal development requires the fine-scale regulation of expression of specific lncRNAs [75]. Consistent with this, we recently showed that the sponge Amphimedon queenslandica expresses an array of lncRNAs akin to their bilaterian counterparts (i.e., in a spatiotemporal and cell type-specific manner) [23,76]. The analysis of Amphimedon lncRNA expression profiles has indeed revealed that lncRNA abundances correlate with morphogenetic and developmental milestones [23,76] - a hallmark of regulatory molecules (Fig. 1B, C). For example, while the complexity of the morphogenetic events during Amphimedon early embryo cleavage is reflected in the high diversity of the lncRNAs expressed at this stage, the subsequent embryonic stages have markedly fewer lncRNAs expressed at high levels [23]. This observation is similar to previous findings in bilaterians, where early embryonic stages appear to be a period of active transcription of lncRNAs [23,31,51,63,75,77], perhaps to regulate maternal transcripts or transcription of cell-cycle genes [75].

The highly dynamic and tightly regulated expression of lncRNAs in sponges and bilaterians suggests these features were present in their last common ancestor. However, the origin and evolution of this class of non-coding RNAs remain unclear.

Evolutionary conservation of animal lncRNAs: Homology or co-option?

The scarcity of lncRNA annotations (especially in non-bilaterian animals) and their rapid sequence divergence have made it challenging to understand lncRNA evolution [47]. Only a handful of functionally homologous bilaterian lncRNAs have been identified and analyzed [52,58,66-68]. These studies suggest that sequence conservation is not an essential requirement for lncRNA functionality [78]. Nonetheless, highly conserved elements within lncRNA sequences (micro-homologies), interspersed with longer and less conserved stretches of nucleotide sequence, have been reported [79,80], and include the miR-7 binding site in the lncRNA Cyrano [52], the PRC2-binding elements in the lncRNA Xist [81], and short sequences derived from Alu repeat elements [82]. Other lncRNA features that appear to be conserved include syntenic relationships to neighboring genes [58,79,83], conservation of secondary structure [84-87], and specific expression patterns [79,88,89].

As in bilaterians, where lncRNAs have been shown to be co-expressed with multiple protein-coding genes [50], Amphimedon lncRNAs also appear to belong to co-expressed developmental gene modules [23]. By comparing gene co-expression networks between Amphimedon and Sycon ciliatum [63] - a distantly-related calcisponge - we have recently identified several putative evolutionarily conserved developmental modules of co-expressed homologous genes and lncRNAs in sponges [76]. This is despite the lack of sequence similarity between the network-embedded sponge lncRNAs. One such example comprises two lncRNAs, one in Amphimedon and one in Sycon, that differ in sequence but are both co-expressed, and thus presumably co-regulated, with the G protein-coupled receptor Frizzled B (a key component of the Wnt signaling pathway in animal development) and other regulatory genes (e.g., TGF-beta) [23,63,76]. As gene regulatory networks and modules are central to the control and timing of animal development [90-92], the finding of similar sets of homologous protein-coding genes co-expressed with lncRNAs in evolutionarily divergent sponge species points to lncRNAs being developmental regulators that might operate in conserved gene regulatory networks. Given the lack of sequence identity between these lncRNAs and the absence of functional data in sponges, the independent co-option of non-homologous lncRNAs into these modules cannot be discounted at this time.
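The notion of co-expression used above can be illustrated with a minimal sketch. It is not the network method of the cited studies; it simply assumes that each transcript has one expression value per developmental stage and that two transcripts count as co-expressed when the Pearson correlation of their profiles reaches an arbitrary threshold. The function names, the threshold `min_r`, the gene labels and all expression values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2 stages)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        # A flat profile carries no co-expression signal; treat as uncorrelated.
        return 0.0
    return cov / (sd_x * sd_y)

def coexpressed_partners(lncrna_profile, gene_profiles, min_r=0.9):
    """Return the names of genes whose profile correlates with the lncRNA.

    min_r is an arbitrary illustrative threshold, not a published cutoff.
    """
    return sorted(
        name
        for name, profile in gene_profiles.items()
        if pearson(lncrna_profile, profile) >= min_r
    )

# Hypothetical expression values across four developmental stages.
lncrna = [1.0, 2.0, 3.0, 4.0]
genes = {
    "geneA": [2.0, 4.0, 6.0, 8.0],  # rises with the lncRNA
    "geneB": [4.0, 3.0, 2.0, 1.0],  # falls as the lncRNA rises
    "geneC": [1.0, 1.0, 1.0, 1.0],  # constant
}
partners = coexpressed_partners(lncrna, genes)
```

Under these assumptions only `geneA` is retained. Because the comparison relies on expression profiles alone, two lncRNAs with no sequence similarity can still fall into equivalent modules in different species, which is why such a result cannot by itself distinguish homology from independent co-option.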

Recent discoveries of lncRNAs in unicellular lineages closely related to animals generate further uncertainty about the evolutionary origin of animal lncRNAs. In fact, while lncRNAs appear to be greatly expanded in multicellular animals [23,63,80,93], several hundred lncRNAs have now been annotated in two of the closest unicellular relatives of animals, Capsaspora owczarzaki and Creolimax fragrantissima [94,95]. This is consistent with this class of non-coding RNAs antedating the origin of animal multicellularity and development.

The origin of animal cis-regulatory complexity: Insights from the sponge histone modifications landscape

If differences in information content (e.g., proteome or lncRNAome size) between animals and their unicellular relatives cannot fully account for their phenotypic differences, evolutionary innovations in gene regulatory mechanisms, rather than the introduction of novel regulatory genes (i.e., TFs and lncRNAs), could have been a crucial step in the emergence of animal multicellularity. Recent chromatin profiling of histone H3 post-translational modifications (PTMs) in Amphimedon has shown that sponges use the same chromatin-based regulatory system found in more complex animals [96]. Based on the presence and genomic location of specific histone PTM patterns (e.g., H3K27me3, H3K4me1 and H3K27ac) associated with regulatory features that had previously only been identified in more complex animals, these results identified novelties that may underlie the emergence of multicellular animals, including epigenetic memory mediated by chromatin repressors, and distal cis-regulatory elements (i.e., enhancers) [96,97].

Enhancers are regulatory elements critical for accurate spatiotemporal and cell type-specific expression of the genes that regulate development in animals and are distinguished by a unique chromatin signature of histone H3 lysine 4 monomethylation and lysine 27 acetylation (H3K4me1 and H3K27ac) [98-100]. As in eumetazoans, predicted enhancer elements in Amphimedon contain putative binding sites for TFs that are important for animal development (e.g., SOX) [96,101-105]. Their identification in Amphimedon is, therefore, consistent with this regulatory feature evolving along the metazoan stem at the transition to multicellularity, as signatures of animal enhancers have not been detected at regulatory sites of closely related unicellular holozoan sister taxa [94] (Fig. 2A).

Figure 2. Long non-coding RNAs are defined by specific chromatin signatures. (A) Recent analyses [94,96,101] of non-coding regulatory DNA and histone marks have revealed that some cis-regulatory mechanisms, such as those associated with proximal promoters, are present in non-metazoan holozoans (right panel) while others appear to be metazoan innovations, most notably distal enhancer regulation (left panel). Shown is a schematic representation of the presence or absence of the typical chromatin signatures associated with animal distal enhancer elements [the transcriptional cofactor p300, histone H3 lysine 4 monomethylation (H3K4me1), histone H3 lysine 27 acetylation (H3K27ac), and ATAC site]. Adapted from [122]. (B) LincRNAs can be separated into two distinct populations of polyadenylated transcripts based on the chromatin status at their transcription start sites. Shown is the enrichment of H3K4me1 (left) and H3K4me3 (right) (ChIP versus input) at enhancer-associated and promoter-like lincRNAs, respectively.

Enhancer elements are known to be associated with the transcription of non-coding RNA transcripts, termed enhancer RNAs (eRNAs) [106-108]. The majority of eRNAs have been shown to be short, unspliced and non-polyadenylated transcripts. However, a second subset of long intergenic non-coding RNAs displaying enhancer-like activity (elincRNAs) has recently been described. These polyadenylated transcripts arise from genomic locations with typical enhancer properties, such as enrichment of histone H3 monomethylated at lysine 4 (H3K4me1) and depletion of histone H3 trimethylated at lysine 4 (H3K4me3). This is in contrast to yet another type of lincRNA, the more canonical promoter-associated lincRNAs (plincRNAs; low H3K4me1-to-H3K4me3 ratio) [94,109-112]. These two distinct populations of chromatin-enriched lincRNAs were also recently discovered in Capsaspora [94] and Amphimedon [96] (Fig. 2B). As in bilaterians, these two lincRNA populations showed only minor differences in length, expression level, and expression variation [94]. While their functional significance is yet to be determined, they may represent distinct classes of polyadenylated lincRNA transcripts with putative diverse biological functions, consistent with elaborate genome regulation by lncRNAs being already present in unicellular holozoans [94,109-112].
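The distinction between elincRNAs and plincRNAs rests on the relative enrichment of H3K4me1 and H3K4me3 at the transcription start site, and can be sketched as a simple ratio test. This is an illustration only, not the procedure of the cited studies: the function name, the pseudocount and the cutoff are assumptions, and the inputs are taken to be ChIP-over-input enrichment values.

```python
import math

def classify_lincrna(h3k4me1, h3k4me3, pseudocount=1.0, log2_ratio_cutoff=0.0):
    """Label a lincRNA from histone mark enrichment at its start site.

    h3k4me1, h3k4me3: non-negative enrichment values (ChIP versus input).
    pseudocount and log2_ratio_cutoff are arbitrary illustrative choices,
    not values taken from the cited studies.
    """
    if h3k4me1 < 0 or h3k4me3 < 0:
        raise ValueError("enrichment values must be non-negative")
    # The pseudocount avoids division by zero when H3K4me3 is absent.
    log2_ratio = math.log2((h3k4me1 + pseudocount) / (h3k4me3 + pseudocount))
    # High H3K4me1-to-H3K4me3 ratio: enhancer-like; otherwise promoter-like.
    # A ratio exactly at the cutoff is labelled promoter-like.
    return "elincRNA" if log2_ratio > log2_ratio_cutoff else "plincRNA"

# Hypothetical enrichment values for two transcripts.
enhancer_like = classify_lincrna(h3k4me1=8.0, h3k4me3=1.0)
promoter_like = classify_lincrna(h3k4me1=1.0, h3k4me3=8.0)
```

With these made-up values the first transcript is labelled an elincRNA and the second a plincRNA. In practice the cutoff would need to be chosen from the observed distribution of ratios rather than fixed in advance.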

What can we learn about the evolution of multicellularity by investigating other lineages?

Animals are not the only multicellular organisms, and thus not the only system suitable for studying the origins of multicellularity. Multicellularity has evolved independently multiple times across the tree of life, including at least 10 times across eukaryotes [3,113]. For instance, studies on the evolution of multicellular plants showed that multicellularity evolved several times across two main plant lineages: streptophytes (charophyte algae and all land plants) and chlorophytes (the sister clade, which includes green algae) [114-116]. Comparative genomic analyses between two representatives of the chlorophytes - the single-celled green alga Chlamydomonas and its multicellular relative Volvox [117] - show patterns of diversification in gene content very similar to what has been observed in animals, with very few differences between the Chlamydomonas and Volvox genomes that could explain the drastic differences in their morphologies.

While land plants employ the same tool kit of chromatin-based gene regulatory mechanisms as animals [118], the roles of non-coding RNAs and histone PTMs in the evolution of plant multicellularity remain unknown. Recent reports show that gene repression by PRC2-mediated histone methylation is present in unicellular algae [119,120], but its role(s) in determining the spatiotemporal and cell-specific gene expression patterns in early multicellular plant lineages, together with the role(s) and evolution of lncRNAs, have yet to be determined.

Concluding remarks

Recent in-depth analysis of gene regulation in the sponge Amphimedon queenslandica shows that fundamental changes in the non-coding regulatory architecture of the genome occurred along the metazoan stem, in concert with the evolution of the multicellular condition. It now appears that most of the genes and long non-coding regulatory mechanisms underlying the formation of complex animals, like ourselves, had an unexpectedly early origin - probably as early as the first steps in the evolution of multicellular animals from single-celled organisms, at least 700 Mya. Thus, the first animals likely evolved from a unicellular ancestor through the co-option of multiple ancestral gene modules, as well as the evolution of new coding gene families, distal enhancer elements and other classes of non-coding RNAs, such as microRNAs and Piwi-interacting RNAs [122]. These changes are likely to have contributed to an increase in the capacity to regulate spatial and temporal gene expression, which is necessary for complex multicellularity. With a complex gene regulatory landscape already in place at the dawn of animals, the further differential expansion of genomic regulatory repertoires in bilaterian animals likely accounts for their increased regulatory and morphological complexity relative to non-bilaterian animals.

Acknowledgments

This work was supported by an Australian Research Council grant (FL110100044) to BMD.

Disclosure of potential conflicts of interest

No potential conflicts of interest were disclosed.

Additional information

Funding

This work was supported by the Australian Research Council, FL110100044.

References

 
