RNA-seq analysis of the salt stress-induced transcripts in fast-growing bioenergy tree, Paulownia elongata

ABSTRACT Paulowina elongata is a fast-growing tree species is grown in different climates and types of soils. Environmental adaptability as well as high-yielding biomass make the P. elongata species an ideal candidate for biofuel production. High soil salinity is known to inhibit plant growth dramatically or leads to plant death. The purpose of this study was to characterize the salt-induced changes in the transcriptome of P. elongata. Transcriptome differences in response to salt stress were determined by RNA sequencing (RNA-seq) using next generation sequencing and bioinformatics analysis. A total of 645 genes were found to have significant altered expression in response to salt stress. Expression levels of a selective subset of these genes were chosen and confirmed using quantitative real-time PCR. To the best of our knowledge, this is the first report of salt-induced transcriptome analysis in P. elongata. The current study indicates that differential expression of a select group of genes of P. elongata and their possible roles in pathways and mechanisms related to salt tolerance. Functional characterization of these genes will assist in future investigations of salt tolerance in P. elongata, which could be used to enhance biofuel production. Abbreviations: ABA: abscisic acid; SOS: salt overly sensitive; ROS: reactive oxygen species; PCR: polymerase chain reaction; qPCR: quantitative real-time polymerase chain reaction


Introduction
High levels of salinity in soil are a reducing agent for plant growth and productivity due to altered water uptake and ion-specific toxicities (Stavridou et al. 2017). Plants exhibit an intricate regulatory network involving plant hormones that play a central role in modulating physiological responses to salinity. This abiotic stress substantially affects agriculture worldwide, including biomass crops used for biofuel production purposes. Knowledge of genes involved in salt stress regulation is limited, yet 20% of the world's cultivated plant productivity is impacted by salinity (Stefanov et al. 2016). According to the United Nations, as of 2013, salt-induced crop loss was estimated at $441 per hectare or equivalent to $27.3 billion per year (World Losing Farm Soil … 2013). Most plants are categorized as glycophytes, meaning that they are intolerant to high levels of salinity. The growth rate of a glycophyte is severely inhibited or lethal under exposure to 100-200 mM of NaCl, while halophytes can survive under higher levels of salinity without deterring growth rate or causing plant death until salt levels reach 300-400 mM of NaCl in the soil (Carillo et al. 2011). At a molecular level, high levels of salinity in soil affects plants in two major ways: (1) a decrease in water potential and (2) high levels of toxic Na + . The effects of salinity lead to inhibition of physiological and biochemical processes (Sairam & Tyagi 2004). Several biological mechanisms such as the abscisic acid (ABA), salt overly sensitive (SOS) and reactive oxygen species (ROS)-related pathways are instrumental in providing salt tolerance. The majority of these salt tolerance mechanisms involve ionic homeostasis, activation of antioxidant enzymes, generating nitric oxide and hormone modulation (Tsukagoshi et al. 2015).
Abiotic stresses, such as salinity, affect productivity and profitability of plants used for biofuel technology. Increasing scarcity of petroleum-based fuel has increased the demand of alternative sustainable fuel. Biofuel is an alternative fuel produced in the form of bioethanol, syngas, biodiesel, and biogases (Yuan et al. 2008). Biofuel currently provides 10% of the world's energy and contributes up to 90% of energy in some of the lowest income regions of Latin America, Asia, and Africa (Tomes et al. 2010). Bioethanol is a platform of biofuel derived from starch, sucrose or lignocellulosic biomass through a variety of chemical processes (Yuan et al. 2008). The U.S.A and Brazil produce more than half of the world's bioethanol using corn and sugarcane (Sarkar et al. 2012). However, a major deterrent of using crops, such as corn or sugarcane, is their primary use as food sources conflicting with their possible use for biofuel. Unlike starch, lignocellulosic biomass is found within all plants, opening the usability of non-crop plants as biofuel and eliminating conflicting issues related to affecting world food supplies. A major focus of biofuel research is to improve overall production efficiency by selecting and using optimal plants capable of producing high quantities of biomass. Due to fast-growing nature of Paulownia, the plant is an ideal candidate for biomass production and biofuel production.
Paulownia elongata produces wood for a variety of purposes and has potential to produce bioethanol from lignocellulosic biomass (Donald 1990;Zhu et al. 1986). Paulownia trees are known for their rapid growth, adaptability to different climates, and thriving in different types of soils (Donald 1990). P. elongata trees have been utilized for phytoremediation of contaminated mine lands and could be used to improve soil quality in different scenarios involving land with polluted soil (Doumett et al. 2008). P. elongata can be harvested in as little as four years, while 5-7-year-old trees can produce approximately a cubic meter of wood (Ayrilmis & Kaymakci 2013). A three-year-old Paulownia tree, on an average, can sequester 50.11 lb CO 2 /year (29.91-92.74 lb CO 2 /year). Studies conducted at Fort Valley State University, Fort Valley, GA, U.S.A. indicate that three-year-old trees can reach a diameter of 14.8 cm and achieve a height of 8.31 m (Basu et al. 2015). The tree benefits from not requiring replanting and can be used for fodder and medicinal purposes (Ipekci & Gozukirmizi 2003). New shoots sprout from a stump after harvesting and grow as much as 20 feet in the first growing season (Basu et al. 2015). Approximately 65% of P. elongata is composed of cellulose or hemicellulose; high content of lignocellulosic biomass has made it an attractive energy source for biofuel production (Naik et al. 2010;Joshee 2012;Yadav et al. 2013) . P. elongata is tolerant of a wide variety of abiotic and biotic stresses, yet its tolerance toward salinity is not fully understood (Nicholas et al. 2007).
Cellulose obtained from wood could be a major source of different types of biofuel including bioethanol production (Suzuki et al. 2006). An increase in vegetative growth of plants leads to a higher production of plant biomass depending on a number of factors such as cell elongation, photosynthetic efficiency, and secondary wall biosynthesis (Demura & Ye 2010). These biochemical processes are adversely affected under saline conditions resulting in lower vegetative growth and loss of biomass yield. Hence, a sound knowledge of genes regulating these pathways in diverse climatic conditions would help selection of desirable tree stocks for future tree breeding and improvement programs. Characterizing salt stress-induced genes is vital for revealing possible functions of these genes in defense mechanisms, genomics, and transcriptomics of plants. Gene characterization can also be used to develop effective methods for combatting worldwide salinity. The recent development of RNA-sequencing (RNA-seq) allows us to study the expression of genes at a significantly deeper level. For example, RNA-seq analysis of Bermuda grass, identified genes regulating lignin synthesis and phytohormone signaling which affects cell wall loosening and cell growth (Tsukagoshi et al. 2015). Findings from the Bermuda grass study can be used for future improvements of biofuel processing and in the same manner analysis of plants in response to different abiotic stresses could improve processing of second generation biofuels.
This study investigates P. elongata gene expression changes induced by high concentrations of salt (125 mM) under control conditions. Total RNA was extracted from leaves of P. elongata after a 10-day salt stress treatment. RNA-seq libraries were prepared, sequenced, and analyzed. A de novo transcriptome of P. elongata was generated using the Trinity software. The de novo transcriptome served as reference for the Tuxedo and RSEM suite software generate a list of statistically significant salt-induced expressed genes. Differentially expressed genes in response to salt stress were identified and characterized by BLASTing the sequences against NCBI's non-redundant (nr) database. Available literature was used to investigate potential mechanisms and salt stress-related pathways. Identified genes in this study could play a critical role in understanding salt tolerance in plants considered for biofuel production. The characterization of salt-induced genes could be used for ameliorating P. elongata growth and salt tolerance in the future, making second generation biomass biofuel more appealing and P. elongata trees a more suitable candidate for biofuel production.

Materials and methods
Entire experimental design is shown as a flowchart in Figure 1.

Salt treatment conditions
Three-year-old P. elongata pots were grown in a greenhouse under 25-30°C, 75% relative humidity and 13 h of natural sunlight for the experiment. Plants were irrigated with 1.5 L of 125 mM NaCl every 24 h for 10 days in Pro-Mix HP Mycorrhizae potting soil (ingredients: Canadian Sphagnum peat moss (65-70%), perlite and limestone (Premier Tech Horticulture, Canada). Salt water drained completely through the soil and out of the pots. Control plants were irrigated under the same conditions with 1.5 L of water. There were two biological replicates for controls and two biological salt stress replicates. Total RNA extraction and RNA-seq Total RNA was extracted from Paulownia leaves using Quick-RNA MiniPrep Kit (Zymo Research, U.S.A.) following the manufacturer's protocol. Paulownia RNA was treated with DNase I (Life Technologies, U.S.A.) and the RNA Clean-Up kit (Zymo Research, U.S.A.) eliminated genomic DNA contamination. Barcoded cDNA libraries were synthesized from 1 µg of total RNA using the TruSeq RNA Sample Prep Kit (v2; Illumina, U.S.A.) following the manufacturer's low sample protocol. Barcoded cDNA libraries were pair-end sequenced using the MiSeq Reagent Kit (v3; 150-cycle; Illumina, U.S.A.) utilizing the MiSeq instrument located in the Department of Biology at California State University, Northridge, .U.S.A. The sequencing generated a total number of pair-end reads per sample listed in Table 1. P. elongata barcoded cDNA libraries samples were stored in −20°C.

RNA-seq gene expression analysis
Currently, there is no reference genome available for P. elongata, therefore the Trinity software (v2.0.6) (Haas et al. 2013) generated a de novo transcriptome utilizing the Illumina reads from all the samples pooled together. The original de novo Trinity transcriptome assembly resulted in 84,136 contigs, these contigs were then examined for potential contamination by BLASTing all of the contigs against the NCBI non-redundant (nr) protein database using the blastx command in the DIAMOND (v0.71) program (Buchfink et al. 2015), and then the output matches were screened for non-plant matches using the MEGAN6 program (Huson et al. 2007). In total, 44 contigs were identified as contamination (10 bacterial matches, 34 fungal matches), and were removed from the transcriptome prior to analysis. The finalized de novo Trinity transcriptome assembly resulted in 84,092 contigs with an average length of 929.4 bp. The transcriptome was employed as a reference for further analysis (Kim et al. 2013). Next, the Tuxedo software suite, Bowtie 2 software (v2.1.0), TopHat2 software (v2.0.9), Cufflinks software (v2.1.1), and Cuffdiff   software (v2.2.1) determined sequences with differential expression levels in response to salt stress in comparison to the controls (Trapnell et al. 2012). The Tuxedo suite mapped, quantified, and determined the expression differences between these conditions from the generated reads. Sequences with a q-value (<.05) were considered statistically significantly expressed. The data were also analyzed using the RSEM program (v1.3.0) (Li & Dewey 2011) to further determine the differentially expressed genes when the P. elongata was salt stressed compared to the controls. Contigs with a p-value (<.004), which is equal to the q-value (<.05) used for the Tuxedo software suite, were considered statistically significant.

Functional annotation of salt-induced genes
In order to identify a selective group of salt-induced genes, the subset of statistically significant expressed sequences generated from the RNA-seq gene expression analysis were BLASTed against the NCBI nr database for the top hits for each sequence. The TUXEDO software was used to generate FASTA files (Supplementary data 1) of all genes and the FASTA files were used for the Blast2Go program. The Blas-t2Go software (www.blast2go.con) functionally characterized all statistically significant expressed sequences (Götz et al. 2008). The software was used to perform: (1) top species similarities from BLAST search and (2) gene ontology (GO) terminologies.

Quantitative real-time PCR
BLAST results classified salt-induced genes in P. elongata, from which a subset of 20 were chosen (from the final output of TUXEDO software) for further investigation of their function in providing salt tolerance (Table 2). Quantitative real-time PCR (qPCR) was conducted to validate RNA-seq results, using iTaq Universal SYBR Green Supermix (BioRad, U.S.A.), of selected P. elongata salt-induced genes under control conditions (Table 2). A BioRad CFX 96 real-time PCR machine was used for qPCR experiments. The qPCR conditions were as follows: initial denaturation: 95°C for 3 min., annealing temperature varied (Table 2), and final extension at 65°C.

Salt stress conditions
Daily salt treatment of P. elongata was conducted for 10 days using a concentration of 125 mM NaCl, a concentration that is toxic for most glycophytes such as P. elongata (Carillo et al. 2011). P. elongata phenotypic conditions slightly worsened after the salt treatment despite receiving the high concentrations of salt water (data not shown). A separate salt treatment of 28 days of 125 mM NaCl was conducted demonstrating how P. elongata can survive high concentrations of salinity for a longer period of time (data not shown). Further studies could analyze and compare salt-induced changes in the transcriptome of P. elongata in response to salt treatments lasting different periods of time.

Salt stress RNA-seq
Each cDNA barcoded sample produced approximately 10 million pair-end sequence reads (Table 1). As previously mentioned, there are currently no reference genomes for P. elongata. At first the Poplar genome was used as a reference genome, however, using this reference genome produced inaccurate results and filtered out the majority of the sequence reads. Therefore, the Trinity software generated a de novo transcriptome and the Tuxedo suite software analyzed gene expression analysis. The RSEM software generated the following data: a total of 645 genes with significant expression changes in P. elongata after a 10-day salt treatment (p-value cutoff of .0004). Out of these 645 genes, 318 genes were downregulated, 304 genes were upregulated, 5 genes were turned off and 18 genes turned on (Supplementary data 2). The Blast2Go program determined the top three species mapping the closest to the transcriptome of P. elongata, on the basis of best sequence alignments with lowest E-value, are: (1) Sesamum indicum, (2) Mimulus guttatus, and (3) Boea hygrometrica (Figure 2(a)). However, the top three species found for all blast hits (irrespective of E-value) are (1) M. guttatus, (2) S. indicum, and (3) Nicotiana tabacum (Figure 2(b)). All sequences were annotated using the GO terminologies (www.geneontology.org) with subvocabularies of CC, BP, and MF (CC, cellular component; BP, biological process; MF, molecular function). The top three GO terms for BP were: metabolic process, single organism process, and cellular process. The top three GO terms for MF were catalytic activity, binding and transport activity. The top three GO terms for CC were: cell, cell part and membrane (Figure 2(c)). These results are consistent with other GO analysis of non-model organisms having undergone salt treatment such as the Petunia hybrida, a salt-resistant Solanaceous plant. The Petunia hybrida showed increased levels of metabolic processes in response to salt stress and a total number of unigenes in the transcriptome mapped directly to Solanum lycopersicum (Villarino et al. 2014) similar to the GO of P. elongata as in this study. Metabolic processes was a top GO term, most likely due to toxic Na + levels, plants have to maintain ion homeostasis and activate antioxidant enzymes, the stress plants undergo require metabolic processes to increase substantially (Gupta & Huang 2014). Prominent expression in catalytic, transporter, and transcription activity (Figure 2(c)) reflects P. elongata salt tolerance by attempting to maintain cellular homeostasis. The SOS pathway is known to activate as a response to high levels of salinity in plants; the signaling pathway involves antiporters responsible for regulating export and import of Na + and H + ions across the vacuolar membrane which corresponds to the GO annotations displaying high expression of metabolic processes and cell compartmentalization activity (Carillo et al. 2011). Future sequencing and annotation of the P. elongata genome would help characterize the transcriptome and understand salt stress-related pathways better.

Quantitative real-time PCR analysis
RT-qPCR was performed based on the gene expression information obtained from the Tuxedo software (Supplementary file 2). The top five upregulated and downregulated genes were chosen for RT-qPCR validation, while the most promising genes, based on functional characterization, turned on and turned off were also selected for RT-qPCR validation (Figure 3). RNA expression of salt-stressed-induced genes was normalized against an actin11 housekeeping gene labeled HACTIN. RT-qPCR results confirmed RNA-seq data for upregulated and downregulated salt-induced genes, but inconsistencies with RNA-seq results were found in turned on or off gene results (Table 2).

Hypothesized gene pathways and mechanisms
The current literature was used to hypothesize potential roles of novel genes in salt stress-related pathways and mechanisms. The gene encoding for a 21 kDa protein, G4KDA, is a highly upregulated salt-induced gene found in this study. G4KDA acts as a calmodulin-binding protein that is responsible for activating calcium channels (Jun et al. 1996). NADPH induces hydrogen peroxide activating calcium channels by an ABA-mediated response induced by salt stress leading to stomatal closure (Naranjo et al. 2006). It is possible that G4KDA protein may serve as a signaling molecule of ABA, activating salt stress response mechanisms. G4KDA protein and ABA have been previously linked as accumulating proteins in response to different abiotic stresses in rice, demonstrating that these might have an interconnecting role arising from abiotic stress. The results showed the G4KDA protein accumulating in large quantities in response, to not only salt stress, but to drought stress as well (Pareek et al. 1998). The G4KDA gene might play a key role in salt stress resistance by indirectly regulating stomatal closure.
Glyceraldehyde-3-phosphate dehydrogenase GAPCP2, chloroplastic-like (known as GADPH) is associated with nitric oxide and is another upregulated salt-induced gene from the study labelled as G53P. GADPH has been suggested to interact with osmotic stress-activated protein kinases that are regulated through NO activity (Wawer et al. 2010). Osmotic protein kinases are known to play a role in regulation of stomatal closure. GADPH and NO might play a regulatory role, despite the fact that osmotic kinases are traditionally associated with the ABA pathway (Wang et al. 2009). G53P could be a novel regulator of a nitric oxide pathway preventing stomatal closure and photosynthesis inhibition during high levels of salinity in P. elongata plants.
An additional upregulated salt-induced gene encoding for GDSL esterase/lipase, GDSL, revealed to play a role in plant  resistance to wound stress and pathogenic stress. Studies have demonstrated that Salicylic Acid (SA)-induced treatment activated overexpression of a GDSL esterase/lipase resulting in increased salt tolerance, GDSL is suggested to be induced by SA as a response to salt stress (Jiang et al. 2012). Overexpression of GDSL esterase/lipase in Arabidopsis causes inhibition of NaCl activity. Inhibition might be linked to the release of fatty acids by GDSL esterase/lipase serving as hormone signal transduction molecules (Naranjo et al. 2006). A hypothesized pathway would involve SA induced by salt stress consequently activating GDSL esterase/lipase. GDSL esterase/lipase would release fatty acids inhibiting NaCl activity that would increase overall salt tolerance. Other genes identified by RNA-seq were also associated with important mechanisms inhibited as a result of the toxicity, osmotic imbalance, and Na + accumulation within plants.

Conclusion
Salt stress-related pathways and mechanisms in P. elongata lack extensive research, this study identified 645 genes with altered expression levels from a 10-day salt treatment. The RNA-seq analysis demonstrated novel genes were upregulated and downregulated while others turned on and turned off due to salt stress in P. elongata. Treatment over a longer period of time could be used in conjunction with the data obtained from the 10-day treatment to analyze salt-induced changes in gene expression over a longer period of time. Different levels of salinity up to 200 mM have been tested on Paulownia hybrid lines (Stefanov et al. 2016) displaying the trees ability to survive high levels of salinity; transcriptome expression in response to different levels of salinity would be beneficial tool for understanding the mechanisms involved in salt tolerance. The function of genes in relation to salinity was hypothesized based on previous publications that have also referenced genes in response to salt stress in plants. The data generated could have future implications for genetically modified plants that could be engineered to for better salt tolerance. By overexpressing upregulated genes found in this study, modified P. elongata plants could be a more suitable candidate for second generation biofuel in the future. Possible hypothesized pathways in this study could be used as a foundation to test cellular pathways that are not completely known. Further studies of downregulated genes might also identify many unknown processes that occur during salt stress and be used to ameliorate both agricultural production and biofuel crop production. This study demonstrates P. elongata as a novel candidate for studying abiotic stresses. Overall, salt-induced genes from this investigation encoded for functional groups such as water channel proteins, transcription factors, and oxidative stress defenses which are traditionally linked to potentially provide salt tolerance (Parida & Das 2005). The RNA-seq analysis provides great insight into novel salt-induced genes of P. elongata and serves as a possible reference for future investigations of salinity and possible biofuel plant candidates.

Data archiving statement
All data have been registered and deposited at the National Center for Biotechnology Information (NCBI) under the Bio-Project accession number PRJNA343673 and BioSample accession number SAMN05792024. Additionally, the raw sequence reads have been deposited in the Sequence Read Archive under the accession numbers SRR4291420 and SRR4291421. This Transcriptome Shotgun Assembly project has been deposited at DDBJ/EMBL/GeBank under the accession GFBN00000000. The version described in this paper is the first version, GFBN01000000. The quantitative gene expression (RNA-seq) components of the study including the raw counts of sequencing reads and normalized abundance measurements have been deposited in the NCBI Gene Expression Omnibus database under the accession numbers GSE94501, GSM2477139, and GSM2477140.