Caragana korshinskii phenylalanine ammonialyase is up-regulated in the phenylpropanoid biosynthesis pathway in response to drought stress

Abstract Drought is one of the most severe abiotic stresses, the damage due to which, various plant species mitigate by activating mechanisms that are not yet well understood. Caragana korshinskii is a xerophytic shrub found in the semi-arid regions of northwest China with high tolerance to several abiotic stresses, including drought. Based on the de novo transcriptome data from C. korshinskii leaflets collected along a precipitation gradient on the Loess Plateau (China), most of the differentially expressed genes were explored using trend analysis along the precipitation gradient. Gene ontological analysis showed that “phenylpropanoid biosynthesis process → secondary metabolite biosynthetic process” terms were the most significant gene ontologies, whereas Kyoto Encyclopedia of Genes and Genomes-based analysis indicated that the biosynthesis of secondary metabolites was a significant metabolic pathway. Real-time polymerase chain reaction and enzyme activity analyses confirmed the increased transcription of the phenylalanine ammonialyase (PAL) gene in C. korshinskii under drought stress in field and laboratory conditions. These results suggested that C. korshinskii adjusts its secondary metabolism to water-deficit environments and activates PAL by drought stress. Therefore, further studies on the obtained data can expand the current understanding of the molecular and genetic mechanisms responsible for the drought endurance in C. korshinskii.


Introduction
In nature, the stresses such as drought, salinity, freezing and heat are the major limitations to plant growth [1]. Drought refers to the deficiency of water in the soil and is one of the major widespread abiotic determinants of many plant species [2] that have devastating impacts on the environment and water resources [3]. Severe drought has led to the widespread death of the trees in many forest biomes [4]. Due to global warming, a future massive drought-induced die-off of trees may be severe and extensive [5], and regionalscale droughts are expected to intensify [4]. In China, the frequency of severe droughts increased, especially on the Loess Plateau, with extreme climatic conditions [6]. In this area, the apparent drying and warming processes occurred from 1971 to 2010, and the majority of the regions exhibited spatially increasing trends of drought frequency and severity [7,8]. The level of available water will reduce further, thereby increasing the need for plants that are highly drought-tolerant [2,9]. Plants respond to drought stress by complex adjustments including the induction of droughtresponsive genes and adaptations based on the modifications in protein synthesis and metabolites [10][11][12]. Hence, drought tolerance in plants is a physiologically complex trait that is regulated by diverse metabolic and genetic mechanisms [13], and the identification of these underlying mechanisms might provide key insights into the adaptive responses of plants to water deficiency.
Caragana korshinskii is a perennial xerophytic flowering plant of the family Fabaceae, with small, pinnate leaflets, widely distributed in the arid and semi-arid regions of northwest China [14,15]. It is a vital forage bush with high drought resistance, and hence, used as a model species in abiotic stress studies [16]. Owing to its tolerance to abiotic stresses, C. korshinskii has been used for dune stabilization and water conservation in north China [17,18]. In addition, this species restores degraded land by fixing atmospheric nitrogen, serves as supplemental livestock forage and plays a key role in vegetation succession from shifting dunes to sandy grasslands [19]. Previous studies have focused on the taxonomy, community distribution, and biological and ecological characteristics of C. korshinskii, as well as the changes in several abiotic stress-related genes [14,17,20,21]. However, molecular response mechanisms to water deficiency remain unknown, especially in natural habitats because of the significant selective pressure imposed by the precipitation gradient.
In order to better understand the drought tolerance mechanisms in C. korshinskii under different water conditions, we conducted a de novo transcriptome analysis, analyzed the cluster of multiple droughtrelated genes and reported the expression profiling of transcriptomes in the leaflets. The results allowed us to identify the stress-related candidate genes and reveal the essential pathways mediated by regulatory gene networks. The phenylalanine ammonialyase (PAL) gene was identified as an up-regulated response in phenylpropanoid biosynthesis pathway in a C. korshinskii population along a precipitation gradient. The obtained data might extend the current knowledge regarding the strong drought resistance of C. korshinskii at the molecular level.

Sampling sites, conditions and plant material
The sites for collecting the samples were selected as the Loess Plateau in Shaanxi Province and Inner Mongolia, Northwest China (Figure 1), where the annual precipitation (from 150 mm in the northwest to 800 mm in the southeast) and the annual evaporation (from 1400 mm in the southeast to 2000 mm in the northwest) show a gradient distribution and 55%-78% of the precipitation occurs from June to September [22][23][24]. The annual precipitation, annual evaporation, soil moisture content (SMC) and plant leaf water potential were reported previously [25]. However, no significant difference was detected in the annual average temperature from the northwest to the southeast, soil pH and altitude along the precipitation gradient [26]. The collection and freezing of plant materials and soil samples with respect to C. korshinskii were described previously [27].

RNA isolation and library construction
The three total RNA samples, each one from three individuals in a site, were isolated using TRIzol (Invitrogen, Carlsbad, CA, USA), according to the manufacturer's instructions, and treated with RNase-free DNase I (Takara Bio, Shiga, Japan) for 30 min at 37 C to remove residual DNA. RNA quality was verified using a 2100 Bio-analyser (Agilent Technologies, Santa Clara, CA, USA) and by RNase-free agarose gel electrophoresis. Poly (A) mRNA was isolated using oligo-dT beads (Qiagen, Venlo, the Netherlands). All mRNA was broken into short fragments by the addition of fragmentation buffer. First-strand cDNA was generated by reverse transcription using random hexamer-primers, followed by the synthesis of second-strand cDNA using RNase H and DNA polymerase I. The cDNA fragments were purified using the QIAquick PCR Purification Kit (Qiagen, Venlo, the Netherlands), washed with EB buffer for end reparation and poly (A) addition, and ligated to sequencing adapters. Following agarose gel electrophoresis and extraction, cDNA fragments were purified and enriched by polymerase chain reaction (PCR) to construct the final cDNA library, which was sequenced by Illumina Hi-Seq TM 2000 (Illumina, San Diego, CA, USA) using the paired-end technology of Gene Denovo Co. (Guangzhou, China). Clean reads were selected by removing low-quality sequences (Q value 10), reads with more than 5% of unknown nucleotides and reads containing adapter sequences.

Sequence, assembly and annotation of reference transcriptome
To obtain a reference transcriptome for the leaflets of C. korshinskii, RNA-Seq data were generated using RNA from all leaf samples. Raw reads were filtered to remove low-quality sequences (Q value 10), reads with more than 5% of unknown nucleotides and reads containing adapter sequences. Unigene annotation was performed using BLASTX (ftp://ftp.ncbi.nlm.nih. gov/blast/executables/blast+/2.2.29/) with a threshold E-value of 1e-5 against the National Centre for Biotechnology Information (NCBI) non-redundant (NR) database (https://www.ncbi.nlm.nih.gov/), the Swiss-Prot protein database (https://www.expasy.ch/sprot), the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (https://www.genome.jp/kegg/) and the Clusters of Orthologous Groups (COG) database (https://www.ncbi.nlm.nih.gov/COG/). The sequence direction of unigenes was determined according to the best alignment. When results differed between the databases, the direction was determined consecutively by the NR, Swiss-Prot, KEGG and COG databases. When a unigene was not aligned, sequence direction was confirmed using ESTscan. GO annotation was analyzed using Blast2GO (BioBam Bioinformatics S.L., Valencia, Spain). The functional classification of unigenes was performed using WEGO [28]. KEGG pathway annotation was performed by Blastall (ftp.ncbi.nlm.nih. gov/blast/executables/blast+/2.2.29/) against the KEGG database.

Alignment of reads and normalization of gene expression levels
Sequence reads were mapped to reference sequences using SOAPaligner/soap2 [29], a tool designed for short sequence alignment. The coverage of reads in a gene was used to calculate gene expression levels. Using this method, we determined the expression levels of all detected genes. Reads uniquely mapped to a gene were used to calculate the expression level. The gene expression level was measured by the number of uniquely mapped reads per kilobase pair of transcript per million reads mapped (RPKM) [30]. For DEGs between Huangling and Yulin or Yulin and Dalad Banner populations, the effect of drought stress was restricted to a false discovery rate (FDR) of 0.001 and an absolute value of log2 ratio of !1 [31].

Drought treatment of C. korshinskii in laboratory
The weighing method was adopted to set three gradients to ensure that the water content of the soil was maintained at 75%, 55% and 35% of the maximum, which were termed as suitable water availability (CK), mild drought stress (MD) and severe drought stress (SD), respectively [32]. C. korshinskii seeds obtained from Huangling of the Loess Plateau of China were germinated under humid conditions at 28 C in darkness for two days. Then, germinated seeds were selected for sowing in pots with a diameter of 11 cm and a height of 30 cm; each pot was filled with 4 kg of sand and soil (sand/soil ¼ 1:1). The cultivars were grown in a glasshouse under natural light conditions (16 h light/8 h dark photoperiod and the temperature was 25 ± 2 C during the day and 18 ± 2 C during the night) [14]. The drought treatment was started by controlling the soil moisture levels when the seedlings grew to a height of about 25 cm after 3 months for 20 days as described previously [33]. Subsequently, the leaf samples were collected, frozen in liquid nitrogen and stored at -80 C until further use.

RNA extraction and real-time PCR confirmation of RNA-Seq data
Total RNA was extracted from C. korshinskii leaflets using RNAiso Plus (TaKaRa Bio, Shiga, Japan), according to the manufacturer's instructions. The purified RNA was dissolved in RNase-free water, and RNA quantity measured using a NanoDrop ND-2000 Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). Messenger RNA (mRNA) was transcribed using PrimeScript RT reagent Kit with gDNA Eraser (TaKaRa Bio, Shiga, Japan). The PAL sequence published in the GenBank database (DQ652565.1) was used to select a 214-bp region with high homology as a target for real-time PCR amplification. Primers for quantitative reverse transcription PCR (qRT-PCR) were designed using Primer Premier 5.0 (Premier Biosoft International, Palo Alto, CA, USA) and synthesized by Invitrogen (Shanghai, China). The PAL-specific primers PAL-F (Supplementary  Table S1) were used to amplify the 214-bp region.
Real-time PCR reactions were carried out using a Bio-Rad iQ5 Multicolor Real-Time Detection System (Bio-Rad, Hercules, CA, USA) using a SYBR Green-based PCR assay. Each reaction mixture (20 lL) contained 2 lL diluted first-strand cDNA, 250 nmol/L of each primer, 10 lL TransStart Tip Green qPCR SuperMix (TransGen Biotech, Beijing, China) and 7 lL water. Real-time PCR conditions were as follows: 94 C for 30 s, followed by 40 cycles at 94 C for 5 s, 60 C for 30 s, and a final extension step at 72 C for 5 min. Real-time PCR analysis was performed in triplicate. The expression levels of reference genes were determined using CT values and calculated using the 2 -DDCt method [34].
Determination of enzyme activity of PAL in C. korshinskii from three sampling sites PAL activity was measured as described by Lynda et al. [35] with some modifications. Leaflets were pooled (0.1 g) and ground in ice with 1 mL 0.05 mol/L Tris-H 2 SO 4 buffer (pH 8.4) containing 15 mmol/L b-mercaptoethanol, 1 mmol/L ethylenediaminetetraacetic acid and 1% polyvinylpyrrolidone. The homogenate was subjected to centrifugation at 12,000 Âg for 20 min at 4 C, and the supernatant was used in the subsequent enzyme assays. A 1 mL reaction mixture consisted of 0.05 lmol Tris-H 2 SO 4 buffer (pH 8.4), 6 lmol L-phenylalanine and 10 lL enzyme extract. The control group included the reaction liquid except for L-phenylalanine. After 70 min at 37 C, the reaction was stopped by the addition of 0.1 mL of 5 mol/L HCl. The relative absorbance was measured at 290 nm using a spectrophotometer. A molar extinction coefficient of 10,900 L/(mol cm) was used to calculate the enzyme activity [36]. The PAL activity was expressed in lmol transcinnamic acid per milligram protein per minute.

Statistical analysis
The laboratory data were expressed as the average of three independent experiments, and statistical analyses were performed using SPSS 16.0 (SPSS Inc., Chicago, IL, USA); P-value < 0.05 is considered as statistically significant.

Results and discussion
Analysis of three libraries and comparison of differentially expressed genes (DEGs) Solexa sequencing was used to generate a de novo library per sampling site. Tags were mapped against the reference genes in each of the three libraries, and the statistical difference was determined in the classification of raw reads and the distribution of gene coverage (Supplementary Figure S1). The saturation of sequencing data refers to the point at which new unique tags could not be detected [37]. Next, the saturation analysis was performed to assess whether the sequencing depth and transcriptome coverage were sufficient, and the results showed that the three libraries were saturated with transcripts (Supplementary Figure S2). A Venn diagram (Figures 2(A,B)) was constructed to illustrate the distribution of genes with common expression that might be unique to the Huangling, Yulin and Dalad Banner libraries. Figure  2(A) shows the distribution of genes expressed in all three libraries; 46,036 non-DEG genes were coexpressed in the three libraries. We also analyzed the distribution of DEGs (Figure 2(B)) to illustrate the genes that were differentially expressed between the three libraries. A total of 2,441 specific DEGs were found in the Yulin library as compared to the Dalad Banner library, and 1,040 common DEGs were found in both libraries. Similarly, 3,156 DEGs were found in both Huangling and Yulin libraries and Yulin and Dalad Banner libraries.
Furthermore, we compared the de novo profiles of the three libraries (Huangling vs. Yulin, Yulin vs. Dalad Banner, and Huangling vs. Dalad Banner) to identify the up-or down-regulated DEGs in response to the precipitation gradient. The results (Supplementary Figure S3) showed that 4,437 DEGs were significantly up-regulated in Dalad Banner as compared to Huangling; 4,075 DEGs were down-regulated in Yulin as compared to Huangling; 5,699 DEGs were up-regulated and 4,241 DEGs were down-regulated in Dalad Banner as compared to Yulin; 4,338 DEGs were upregulated and 3,660 DEGs were down-regulated in Yulin as compared to Huangling. The distribution and characteristics of DEGs revealed similar expression patterns in each pair of libraries, indicating that several key genes were involved in the response processes of C. korshinskii when subjected to abiotic stress.
Drought stress can significantly alter the gene expression in plants. The identification of genes related to drought tolerance aids in better understanding the plant survival mechanisms under water deficiency [38]. In this study, we evaluated the gene expression patterns in C. korshinskii leaflets collected from the three sites (Huangling, Yulin and Dalad Banner) along the precipitation gradient on the Loess Plateau. As shown in Figure 2(A), 46,036 genes, commonly expressed in the three generated libraries, might be responsible for gene activation and suppression in C. korshinskii. Several genes were differentially expressed when drought stress occurred, while unique DEGs were evenly distributed in all libraries ( Figure  2(B)). Thus, a gradual increase in drought levels resulted in gradual changes in the transcriptomes of C. korshinskii, indicating the stability of genomic changes that provided the molecular basis for drought adaptation in C. korshinskii populations along a precipitation gradient on Loess Plateau.

Cluster analysis of gene ontology (GO) terms in DEGs
Considering the gradual strengthening of the drought stress from Huangling to Yulin to Dalad Banner, a total of 5,623 DEGs were grouped into eight clusters using short time-series expression miner (Figure 3). Among the eight major clusters, clusters 2-4 were enriched in the up-regulated transcripts and clusters 5-7 in the down-regulated transcripts. Next, we performed the GO term analysis within the up-regulated and downregulated cluster groups, excluding clusters 1 and 8, in order to identify the genes that are involved in the metabolic processes. All clusters were classified into three main categories: biological processes, cellular components and molecular functions. The biological process category was comprised of four major sub-categories, including cellular process, metabolic process, response to stimulus, and biological regulation. The cellular component category consisted of three major sub-categories, including cell, cell part and organelle, while the molecular function category included three major sub-categories: catalytic activity, binding and transporter activity ( Figure 3). As indicated by the GO analysis of interactions and relationships (Supplementary Figure S4), "phenylpropanoid biosynthesis process ! secondary metabolism process" terms ( Figure 4) were identified as the major droughtrelated GO terms (FDR 0.25).
Cluster analysis was carried out to interpret the transcriptional levels of C. korshinskii and identify any underlying molecular mechanisms. Of the eight trend profiles, we primarily focused on the up-and downregulation trends. The GO term analysis of these trends revealed a series of metabolic processes affected by drought stress that were classified into three categories: biological process, molecular function and cellular component. The identification of these major GO terms indicated that the transcriptome recovery of the genome balance occurred after the occurrence of the precipitation gradient. For example, metabolic and cellular processes included in the GO term "biological process" (Figure 3) were highly annotated, suggesting that alterations in the comprehensive gene expression occurred during the adaptation to drought stress. "Response to stimulus" and "biological regulation," both accounted for 18% of the total GOs and were relevant to the precipitation gradient.
To further characterize the functional roles of DEGs responsible for the precipitation gradient in C. korshinskii, the pathway analysis of DEGs was performed. Significant differences were identified in the signalling pathways in C. korshinskii by pairwise comparison between the samples (Huangling vs. Yulin, Yulin vs. Dalad Banner and Huangling vs. Dalad Banner, respectively). Moreover, in these samples, the phenylpropanoid biosynthesis was considered the distinct pathway that contributed to precipitation gradient ( Figure 5). Previous studies demonstrated that plant responses to stress rely on the function of complex gene networks. Plant secondary metabolism, generally related to plant defence responses, has evolved based on the interactions between plants and the environment [39]. Moreover, plant secondary metabolites have been found to play a major role in several aspects of plant life, some of which are essential for survival [40]. In this study, the biosynthesis of secondary metabolites in C. korshinskii was identified as a highly significant KEGG pathway that underwent fundamental changes at the transcriptome level and was enriched by both up-and down-regulated genes. Therefore, water-deficit conditions favour a shift in the balance between the production and elimination of secondary metabolites.
The secondary metabolism of plants, especially the vital physiological role of phenylpropanoid biosynthesis in plant development [41], is demonstrated in several ways. All phenylpropanoid substances are generated directly or indirectly through this pathway, whereas active components play crucial roles in the growth and development of plants [42]. Phenylpropanoids serve as essential chemical regulators of plant communication with insects and microbes [43]. The phenylpropanoid metabolism in plants contributes to the defensive response against pathogens [44]. Interestingly, phenylpropanoids exert a significant effect on plant responses to abiotic stresses [45]; however, their effect on drought response is yet unknown. The results from the present study showed that phenylpropanoid biosynthesis is the first significant KEGG pathway to participate in drought stress response (Figures 3 and 4), followed by phenylalanine metabolism (Table 1).
Real-time PCR and enzymatic activity of PAL confirmation by RNA-seq data in field and laboratory C. korshinskii The first step of phenylpropanoid biosynthesis is catalysed by the PAL enzyme (EC 4.3.1.5) [46]. The sequence of PAL in C. korshinskii has orthologs in leguminous plants as further supported by the phylogenetic analysis (Figure 6(A)). In addition, clear orthologous correlations were found in the PAL gene of C. korshinskii among drought-tolerant plants, such as Robinia pseudoacacia [47], which might be attributed to the evolution as a result of living in water-deficit environments. The transcriptome analysis results for PAL were compared to those obtained from qPCR of C. korshinskii in field and laboratory ( Figures  6(B-D)). The comparison revealed that the expression of this gene increased significantly along the precipitation gradient from Huangling to Yulin to Dalad Banner. Also, the enzymatic activity of PAL was detected from C. korshinskii of Loess Plateau, which showed a significant difference in the up-regulated expression along the precipitation gradient, corresponding to the analysis on the laboratory sample under drought stress. These results indicated that changes in the enzymatic activity of PAL ( Figures  6(E,F)) were consistent with increasing PAL expression and the up-regulation of the phenylpropanoid biosynthesis pathway.
Based on the phenylpropanoid biosynthesis or phenylalanine metabolism, PAL (EC 4.3.1.5) is a major enzyme that connects primary metabolism with metabolic phenylpropanoids [48] and is responsible for the initiation of the biosynthesis of several secondary metabolites from phenylalanine. The phenylpropanoid pathway produces diversified intermediate metabolites and converts them into secondary metabolites, such as lignin, flavonoids, isoflavones and alkaloids [49]. These products play a critical role in the growth and development of plants, and their content was closely related to the enzymatic activity of PAL; hence, PAL is a vital component of the physiological processes in plants. In this study, PAL expression showed the same trend as the enzymatic activity of PAL under drought stress in field and laboratory samples. The involvement of PAL in the core reactions of phenylpropanoid metabolism in C. korshinskii suggested its critical role in phenylpropanoid biosynthesis and the crosstalk between phenylpropanoid and secondary metabolites.
After the PAL enzyme is activated, the phenylpropanoid pathway branches into two primary pathways: lignin biosynthetic pathway and flavonoid metabolic pathway [50]. The resulting products act as plant antitoxins, cell wall components and signal transduction molecules [51]. Another primary pathway that branches off the phenylpropanoid pathway is the flavonoid metabolic pathway after the activation of the PAL enzyme [50]. The flavonoid metabolic pathway plays a crucial role in multiple processes, such as defence against pathogens and pests, pollination, light screening and seed development, which are involved in environmental stress protection [52]. In this study, the expression of PAL was up-regulated in C. korshinskii, suggesting a protective role of the phenylpropanoid pathway against drought stress.

Validation of RPAM data by qPCR
To further validate the results obtained from tagmapped genes and computational analysis, six genes (three up-regulated and three down-regulated ones) were randomly selected for qPCR (Supplementary  Table S1) using samples from all the three sites. Consequently, similar qPCR trends were observed for the selected genes and those detected by sequencing (Figure 7).
Plants in arid and semi-arid areas employ a variety of strategies and mechanisms to endure stress under long-term water-deficit conditions. Signalling pathways are considered as key elements of regulatory networks, underpinning the cross-tolerance responses to stress and leading to stress defence and growth regulation. The results from this study indicated that significantly up-regulated pathways related to drought responses might require increased metabolic changes and oxidation resistance, thereby explaining the dominance of C. korshinskii in the arid Loess Plateau.
Water starvation exerts contrasting effects on the transcriptome responses in C. korshinskii. This study also indicated that major pathways related to plant growth, such as zeatin biosynthesis, were significantly down-regulated with the decreasing precipitation gradient (Table 2). Zeatin is a plant hormone from the family of cytokines [53], first discovered in the immature kernels of genus Zea. Zeatin promotes the growth of lateral buds, whereas its application to meristems stimulates cell division [54]. Salt stress might restrict plant growth by decreasing zeatin levels [55]. A previous study reported that zeatin levels remain low throughout the stress period but increase rapidly after stress relief [56]. This effect was considered to be associated with drought-related processes, although the underlying mechanisms are yet to be elucidated entirely.

Conclusions
In the present study, the transcriptome profile dataset of C. korshinskii was generated by the de novo assembly of next-generation sequencing data. Next,   stress. The involvement of PAL in the core reactions of the phenylpropanoid metabolism in C. korshinskii implicated its critical role in phenylpropanoid biosynthesis and the crosstalk between phenylpropanoid and secondary metabolites. These data expand our understanding of the molecular and genetic mechanisms underlying the drought endurance in C. korshinskii, identify genes useful for breeding droughtresistant genotypes and apply to restore desertification.