Transcriptional profiling of alkaline stress-induced defense responses in soybean (Glycine max)

Abstract Alkaline stress has been considered a major environmental threat to plant growth and agricultural production. Even though the mechanisms have been characterized in some plants, the information about the alkaline stress-induced defense responses in soybean is insufficient. To identify defense related genes, the transcriptional profiles of soybean seedlings were examined using RNA sequencing under alkaline stress. The results showed that a total of 4603 genes were up-regulated and 1659 were down-regulated compared to expression in control plants. Further, the functional enrichment analysis showed that these differentially expressed genes were significantly enriched in 43 pathways and mainly associated with catalytic activity and binding, metabolic process, cellular process and cell progress. Then, 19 transcription factors from AP2-EREBF, MYB families and isoflavone related genes were chosen and identified to be induced under alkaline stress by quantitative real-time polymerase chain reaction (qRT-PCR). Taken together, our results highlight the differentially expressed genes and identify a set of genes associated with the response to alkaline stress, which findings establish a foundation for further analyses of soybean alkaline stress tolerance mechanisms. Supplemental data for this article is available online at https://doi.org/10.1080/13102818.2021.1976078 .


Introduction
Alkaline stress has been considered as a major detrimental factor for crop productivity. A total of 3,105,000 km 2 of global land is affected by alkaline soil [1]. Compared with neutral salts stress such as NaCl and Na2SO4, alkaline stress due to NaHCO3 and Na2CO3 stress may cause more harmful effects to plant growth [2]. Soil alkalinity inhibits plant growth and development by means of high pH, HCO 3 and CO 3 2-. Severe alkaline stress can inhibit sugar production, N metabolism and photosynthesis, as well as disrupt the ionic homeostasis and limits the absorption of ions, such as H 2 PO 4 -, Cl -, Al 3+ and Fe 2+ [3,4]. Previous studies have showed that high soil pH affects nutrient uptake, organic acids balance and ions balance [5]. Also, the high pH could initiate the accumulation of malonic dialdehyde and reactive oxygen species [6]. Therefore, alkaline stress is different from other types of abiotic stress, so plant cells may respond to it with a more complex tolerance mechanism [7]. Thus, an understanding regarding the basic mechanisms of plant responses towards alkaline stress is urgently needed as little is known about basic alkaline stress responses especially in soybean crop.
Soybean has been adopted globally as one of the most important crops. Unlike other cereal crops, where carbohydrates are the main constituent of the nutrient profile, soybean is comprised of protein, oil, micronutrients and isoflavones and thus represents greater nutritive value [8,9]. The productivity of soybean is significantly hampered by abiotic stresses, especially due to alkaline stress [5]. Alkaline stress deficit dramatically limits the growth and yield for soybean. RNA sequencing is an effective technology to facilitate the understanding of molecular mechanisms in plant stress studies [10]. It has been applied for elucidation of the responses of plants to alkaline stress, including alfalfa and sugar beet [11,12].
However, studies about transcriptome analysis of soybean in response to alkaline stress are limited. Therefore, in this study, we sequenced transcripts from soybean under NaHCO 3 stress. Differentially expressed genes were identified and annotated. Functional categorization of responsive genes was conducted to unveil their roles in various pathways and metabolic processes. Furthermore, we focused on the transcription factors and isoflavone related genes and detected their expression changes under alkaline stress by quantitative real-time polymerase chain reaction (qRT-PCR), and suggested the critical roles of these candidate genes under alkaline stress conditions.

Plant material, growth and tissue harvest
The soybean (DN50) seeds were germinated and grown as reported in a previous study [13]. For alkaline stress, the 12-day-old seedlings were grown in Hoagland's nutrient solution with 50 mmol/L NaHCO 3 for 6 h. The soybean whole seedlings were harvested, and stored at −80 °C for further analysis.

RNA extraction and sequencing of RNA-seq library
Total RNA was extracted from soybean seedlings, using an E.Z.N.A plant RNA kit (Omega Bio-Tek, Beijing), according to the manufacturer's instructions. RNA samples were sent to BGI-Shenzhen Co. Ltd (Shenzhen, China), and 150-bp paired-end reads were generated using BGI-Seq500 platform [14]. The Wm82.a2.v1 was used as the soybean reference genome.

Sequencing data analysis and identification of differentially expressed genes
After removing adapter and low-quality reads (less than 150 bp) from the raw data, only high-quality tags were retained. All the down-stream analyses performed were based on the high-quality clean data. A reference genome database was constructed and paired-end clean reads were aligned using bowtie2 v2.2.8 and HISAT2v2.0.4. tools [15]. Differentially expressed genes were identified using the Cuffdiff program, and the changes were screened with the threshold |log 2 ratio| > 1 and FDR < 0.01 [16].

Functional annotation and pathway analysis
The GO functional classification and enrichment analyses were performed by WEGO and agriGO software. The enrichment of KEGG pathway and transcription factors were performed using Benjamini and Hochberg method.

Quantitative real-time PCR
Complimentary DNA (cDNA) was synthesized from total RNA by First Stand cDNA Synthesis kit (Toyobo, Osaka). qRT-PCR was performed using UtraSYBR Mixture (Baioleibo, Beijing) on 7500 Real Time PCR system. The reference gene GmGADPH and gene-specific primers were used to verify the RNA-Seq results of differentially expressed genes (Supplemental Table S1). Relative transcript levels were analyzed using the 2 -ΔΔCT method and one-way analysis of variance (ANOVA; *p < 0.05, **p < 0.01) followed by Duncan's test.

Quality analysis of the RNA-seq datasets
In order to establish the transcriptional profiles of soybean under alkaline stress, total RNA from three samples of plant subjected to alkaline stress were used, while three untreated samples were used for generation of an informative reference transcriptome database. After data filtering, the proportion of the total number of clean read segments was higher, such as the proportions of all clean read segments were over 86.34%, while Q20 values were found greater than 97%. The sequencing sample and the reference genome sequence alignment was higher than 90.39%, and the unique-mapping was over 67.2%. The numbers of clean reads of samples were obtained as shown in Table 1.
By calculating the Pearson correlation coefficient of all gene expression amounts between each sample, we identified the correlation coefficient of gene expression between the test samples. The results showed that the correlation coefficient of each sample met the standard, and the gene expression level was similar (Supplemental Figure S1). Furthermore, the significantly expressed genes were calculated by FPKM (FPKM > 1). The results showed that 75.7% of the genes were significantly expressed under alkaline stress, compared with 75.0% of the genes in non-treated soybean samples (Supplemental Figure S2).

Identification and GO enrichment analysis of differentially expressed genes (DEGs)
In the present study, a total of 6262 genes were identified to have significant changes in their expression level under alkaline stress (|log 2 ratio|>1, FDR < 0.01). Among them, 4603 genes were up-regulated and 1659 genes were down-regulated ( Figure 1).
To further investigate the roles of DEGs, we conducted GO enrichment analysis. According to the GO enrichment analysis, the DEGs in alkaline treated soybean were significantly enriched in 43 pathways, including 18 in biological processes, 14 in cell components and 11 in molecular function (Supplemental Figure S3). GO analysis also revealed that the DEGs were mainly associated with catalytic activity (GO:0003824 2423 unigenes), binding (GO:0005488 2472 unigenes), metabolic process (GO:0008152 1792 unigenes), cellular process (GO:0009987 1792 unigenes) and cell (GO:0005623 2009 unigenes) in the categories of molecular function, biological process and cellular component ( Figure 2).

KEGG pathway analysis and isoflavone involved in the alkaline stress responses
KEGG pathway analysis showed that the DEGs associated with alkaline stress were involved in 133 pathways. For example, most of the DEGs were widely enriched in the pathways of plant-pathogen interaction, MAPK signaling pathway, plant hormone signal transduction, phenylpropanoid biosynthesis, isoflavonoid biosynthesis, flavonol biosynthesis and anthocyanin biosynthesis (Supplemental Table S2). In this study, we observed that the secondary metabolite biosynthesis pathway was the most predominant among pathways, especially the isoflavonoid biosynthesis pathway. To investigate the potential roles of isoflavone in alkaline stress response, we detected the expression patterns of isoflavone biosynthesys related genes using qRT-PCR analysis (Figure 3). The results showed that eight genes appeared with up-regulated expression levels under alkaline treatment, which are consistent with the results from the transcriptome sequencing data. This finding suggests that isoflavones may play important roles in response to alkaline stress.

Transcription factors involved in the alkaline stress responses
Transcription factors have been found to play a crucial role in response to abiotic stresses, such as salt, drought and cold [17][18][19]. In this study, many transcription factors were differentially expressed when soybean plants were exposed to alkaline treatment. The comparison showed that the number of DEGs from AP2-EREBF, MYB, WRKY, NAC, bHLH and C2H2 families were comparatively higher than other families (Figure 4), which suggested that these transcription factors might play significant roles in the response to alkaline stress. Among them, AP2-EREBF and MYB transcription factor families had the largest number of differentially expressed genes. Hence, we focused on AP2-EREBF and MYB-related genes, and 11 AP2-EREBF and MYB genes were selected for further gene expression analysis by using qRT-PCR analysis. Intriguingly, all of these genes appeared with higher transcript levels in response to alkaline stress ( Figure 5). These results are in agreement with the common knowledge that transcription factors might be important for plant in response to alkaline stress.

Discussion
Alkaline stress is one of the major detrimental factors for crop productivity [20]. Soybean has been considered one of the most important nutritive value crops [21]. However, the productivity of soybean is significantly limited by alkaline stress. Therefore, to establish the transcriptional profiles of soybean under alkaline stress, we sequenced transcripts from soybean under NaHCO 3 stress.
Previous studies have identified a large number of alkali-responsive genes in different plants, such as rice, alfalfa and peach [11,22,23]. This is also supported by our results that the expression levels of a total of 6262 DEGs (including 4603 upregulated and 1659  Figure 1. the expression profiles of differentially expressed genes in soybean under alkaline stress. the heat map was generated using R gplots package and shows the differentially expressed of genes. downregulated) were significantly changed under alkaline stress (Figure 1). We found that these DEGs were significantly enriched in 43 pathways (Supplemental Figure S3), and mainly associated with catalytic activity, binding, metabolic process, cellular process and cell ( Figure 2). These biological processes provide valuable information for understanding the mechanism of soybean response to alkaline stress. For example, the amino acid production and nitrogen metabolism were significantly inhibited in metabolic process under alkaline stress [24]. The catalytic activity and binding process play important roles against alkaline stress in wild jujube [25].
Isoflavonoids are abundantly present in soybean and play multiple roles in plant response to abiotic stresses [26]. Isoflavone metabolism might play a crucial role in response to UV-B stress [27]. Isoflavone synthase genes IFS1 and IFS2 highly correlated with isoflavone accumulation under water deficit conditions [28]. The isoflavone contents were significantly increased in chickpea sprouts germination stage under salt stress [29]. In the present study, we also found some DEGs were enriched in the category of flavonol biosynthesis (Supplemental Table S2). Therefore, we detected the expression patterns of eight isoflavone biosynthesis related genes, and found that these genes were significantly induced under alkaline treatment ( Figure 3). Thus, our results indicated that isoflavones may play important roles in alkaline stress mediated signaling pathways.
Besides isoflavone biosynthesis related genes, we found that some of the DEGs were transcription factors, such as AP2-EREBF, MYB, WRKY, NAC, bHLH and C2H2 families (Figure 4). This result is consistent with previous studies that these transcription factor families are involved in the response to various abiotic stresses [18]. In this study, the expression of 11 AP2-EREBF and MYB genes were detected by qRT-PCR, and all of them appeared with higher transcript levels under alkaline stress ( Figure 5). The results are in line with the studies that AP2-EREBF and MYB transcription factor families are involved in various abiotic stresses [30,31]. For example, a large number of AP2/ERF genes were induced in response to drought and salinity stresses in durum wheat [32]. GmERF3 was found to be induced by salt, drought, ethylene and ABA, and this category of transcription factors can bind to the GCC box or DRE/CRT element to respond under  . expression pattern analysis of isoflavonoid biosynthesis related genes under alkaline stress. the seedlings were treated with 50 mmol/l nahco 3 for 6 h. the transcript data results were analyzed using the 2 -ΔΔct method and one-way analysis of variance (anoVa; *p < 0.05, **p < 0.01) followed by Duncan's test. the expression levels are normalized relative to GAPDH gene in Glycine max. Data are mean values with standard deviation (±SD).
abiotic stresses [33]. Additionally, MYB transcription factors also play essential roles in response to abiotic stresses in various species, such as in cassava and wheat [34,35]. The ABA-mediated AtMYB96 could improve drought resistance through activation of cuticular wax biosynthesis [36]. Previously, we also isolated an ethylene-responsive gene GsERF71 and found that overexpression of GsERF71 could enhance plant response to alkali stress in Arabidopsis. Similarly, overexpression of soybean GmMYB68 significantly enhanced the resistance to salt and alkali stresses [37]. These findings confirmed that these transcription factors may contribute to the response to alkaline stress, and it might be interesting to study their functional roles and mechanisms in response to alkaline stress.

Conclusions
This study is a comprehensive transcriptome analysis of RNA sequencing in soybean under alkaline stress. Overall, 6262 genes were identified as responsive genes under alkaline stress. The KEGG pathway analysis showed that these responsive genes were involved in 133 pathways. Nineteen AP2-EREBF, MYB and isoflavone related genes were identified, which suggests that these genes can be good candidates for understanding the functions in response to alkaline stress.

Disclosure statement
No potential conflict of interest was reported by the authors.  . expression pattern analysis of ap2-eReBF and myB genes under alkaline stress. the seedlings were treated with 50 mmol/l nahco 3 for 6 h. the transcript data results were analyzed using the 2 -ΔΔct method and one-way analysis of variance (anoVa; *p < 0.05, **p < 0.01) followed by Duncan's test. the expression levels are normalized relative to GAPDH gene in Glycine max. Data are mean values with standard deviation (±SD).

Data availability
All data that support the findings reported in this study are available from the corresponding authors upon reasonable request.