Transcriptome analysis for genes involved in fructan biosynthesis in the Jerusalem artichoke (Helianthus tuberosus L.)

Abstract Jerusalem artichoke (Helianthus tuberosus L.) is a fructan-accumulating plant and an industrial raw material for fructan production. However, the genetic mechanism of fructan biosynthesis remains unclear. Therefore, this study performed transcriptome analysis to determine the genetic differences between two Jerusalem artichoke genotypes with different fructan contents. Approximately 19.73 Gb of clean data were obtained after filtering, and a total of 164,006 unigenes were annotated based on GO, KOG and KEGG functional classification. Seven homologous genes involved in fructan biosynthesis were then obtained by homology comparison. CL11458.Contig2 and CL7122.Contig14 were candidate genes for sucrose:sucrose 1-fructosyl-transferase (1-SST) and fructan:fructan 1-fructosyl-transferase (1-FFT), respectively. In the phylogenetic tree, SST is closely related to SST from artichoke, and FFT is located in the same branch as FFT from ‘Qingyu No.1’ Jerusalem artichoke and Viguiera. SST and FFT were the main regulatory genes of fructan accumulation in Jerusalem artichoke. Therefore, this study successfully identified major genes involved in fructan biosynthesis. Future studies should aim to verify the role of each candidate gene in fructan biosynthesis in Jerusalem artichoke.


Introduction
Jerusalem artichoke (Helianthus tuberosus) is a sunflower plant of the Compositae family that originated in North America [1] and is cultivated in North America, Europe and Asia. Its planting area is mainly concentrated in Northwest China [2]. Because of its rich fructan (inulin) content (the fructan content in tubers is >70% of the dry matter), Jerusalem artichoke is one of the main raw materials for processing inulin series products [3][4][5].
High-throughput RNA sequencing (RNA-Seq) has become a powerful technology for analysing gene expression under specific conditions and has been widely used in many biological systems. Jerusalem artichoke (H. tuberosus) is a new economic crop, but research on it began late, so more comprehensive studies are needed. When no reference genome information is available, transcriptome sequencing is an effective research method for finding candidate genes. The fructan content of Jerusalem artichoke germplasm has been determined, and the Jerusalem artichoke chloroplast genome has been obtained using Illumina sequencing technology. The genetic relationship between Jerusalem artichoke (Helianthus tuberosus) [14] and artichoke in the phylogenetic tree is relatively close, indicating that transcriptome sequencing technology is feasible for Jerusalem artichoke. Therefore, in this study we selected JA1 (768 mg/g) and JA3 (524 mg/g), which differ greatly in fructan content, used RNA-Seq technology for transcriptome sequencing and screened the dominant genes involved in fructan biosynthesis.

Plant materials
In this study, two germplasm resources, JA1 and JA3 (Figure 1), were selected from the Jerusalem artichoke (Helianthus tuberosus) germplasm resource bank at the Academy of Agriculture and Forestry Sciences, Qinghai University. The difference in fructan content between them was relatively stable over 3 years. The average fructan content of the tuber dry matter of JA1 and JA3 was 768 mg/g and 524 mg/g, respectively. Mature tubers were selected for sequencing with three biological replicates, immediately frozen in liquid nitrogen and stored at −80 °C until further use.

cDNA preparation and Illumina sequencing
RNA was extracted from Jerusalem artichoke tubers using the Tiangen RNAprep Pure Plant Kit. RNA quality was then checked on a 1% agarose gel, and the concentration was determined using a NanoDrop nucleic acid detector (Thermo Fisher Scientific, Wilmington, DE, USA). The RNA was sent to Beijing Nuohe Biotechnology Co., Ltd. (Beijing, China) for sequencing. cDNA libraries were constructed following the mRNA-Seq sample preparation protocol (Illumina Inc., San Diego, CA, USA) and sequenced on an Illumina HiSeq 2000 by Nuohe Biotechnology Co., Ltd. (Beijing, China), with a read length of 100 bp and three replicates.

Data filtering and assembly
Before assembly, the raw reads generated by sequencing were filtered to remove ambiguous, adapter and low-quality sequences to obtain high-quality clean reads. The clean reads were then assembled with the Trinity software. The assembled unigenes were compared against the NCBI protein database using blastx (E-value < 0.00001). The coding sequences (CDS) were extracted from the unigene sequences based on the blastx results and translated into peptide sequences [15].
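The E-value cut-off described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the blastx results are available in the standard 12-column tabular format of BLAST+ (`-outfmt 6`), in which the E-value is the 11th column; the study does not state which output format was used, and the function name and example identifiers below are hypothetical.

```python
E_VALUE_CUTOFF = 1e-5  # threshold stated in the text (E-value < 0.00001)


def best_hits(tabular_lines, cutoff=E_VALUE_CUTOFF):
    """Return {query_id: (subject_id, evalue)}, keeping for each query
    the hit with the lowest E-value among hits below the cutoff.

    Assumes BLAST+ tabular rows (outfmt 6): tab-separated, with the
    query id in column 1, subject id in column 2, E-value in column 11.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue >= cutoff:
            continue  # hit does not pass the E-value filter
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical rows; the unigene and protein identifiers are invented.
example_rows = [
    "Unigene_A\tprot_1\t92.4\t310\t20\t1\t5\t934\t1\t310\t2e-150\t520",
    "Unigene_A\tprot_2\t60.1\t200\t70\t4\t20\t619\t8\t207\t3e-40\t160",
    "Unigene_B\tprot_3\t35.0\t80\t50\t2\t1\t240\t10\t89\t0.02\t35.1",
]
hits = best_hits(example_rows)
print(hits)
```

In this example only Unigene_A retains a hit, because the single hit of Unigene_B has an E-value above the cut-off.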

Differential gene expression analysis
The expression level of unigenes was calculated using FPKM values, and the differences between JA1 and JA3 transcripts were analysed using the IDEG6 software. Unigenes with FDR ≤ 0.001 and |log2 ratio| ≥ 1 were identified as significantly differentially expressed genes.
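The two thresholds above can be sketched as a simple classification rule. This is only an illustration of the stated criteria, not the IDEG6 procedure itself: the FDR value is assumed to be supplied by the statistical test, and the small pseudocount added to avoid division by zero is an assumption that does not appear in the text.

```python
import math

FDR_CUTOFF = 0.001        # FDR <= 0.001, as stated in the text
LOG2_RATIO_CUTOFF = 1.0   # |log2 ratio| >= 1, i.e. at least a two-fold change


def classify(fpkm_ja1, fpkm_ja3, fdr, pseudocount=0.01):
    """Label a unigene as 'up', 'down' or 'not_significant' in JA1 vs JA3.

    The pseudocount is an assumption made here so that unigenes with
    zero expression in one sample do not cause a division by zero.
    """
    log2_ratio = math.log2((fpkm_ja1 + pseudocount) / (fpkm_ja3 + pseudocount))
    if fdr <= FDR_CUTOFF and abs(log2_ratio) >= LOG2_RATIO_CUTOFF:
        return "up" if log2_ratio > 0 else "down"
    return "not_significant"


# Hypothetical expression values, chosen only to exercise each branch.
print(classify(40.0, 10.0, 1e-5))   # four-fold higher in JA1, low FDR
print(classify(10.0, 40.0, 1e-5))   # four-fold lower in JA1, low FDR
print(classify(40.0, 10.0, 0.01))   # large change but FDR too high
```

JA1 is treated as the treatment and JA3 as the control, so "up" means higher expression in JA1.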

Protein annotation and classification
To annotate and classify genes, the assembled unigenes were searched against public databases, including NR, NT, KOG, Swissprot (http://www.uniprot.org/), KEGG (http://www.genome.jp/kegg/), GO (http://geneontology.org/) and Pfam (http://pfam.xfam.org/). All unigenes that differed significantly between JA1 and JA3 were mapped to the GO and KEGG pathway databases. GO classification and KEGG pathway analysis of each unigene were conducted to search for pathway annotations significantly enriched among the differentially expressed genes.

Screening of key genes for fructan biosynthesis
Reference gene sequences involved in fructan biosynthesis in the galactose metabolism and starch and sucrose metabolism pathways were selected, and unigenes associated with fructan biosynthesis were screened out. The expression levels of each unigene in JA1 and JA3 were compared based on the fold change value, and the differences among all homologous genes were analysed to identify the most likely candidate genes regulating fructan biosynthesis.

Bioinformatics analysis
The sequences of fructan synthase genes were retrieved from the NCBI database, and a phylogenetic tree of fructan synthase genes from different species was constructed using the neighbour-joining algorithm in MEGA 6.0.

Illumina sequencing and unigene assembly
A total of 37.62 Gb of raw data, comprising 250.83 M reads, were obtained from the materials. The raw reads of JA1 and JA3 numbered 43.843 M and 39.769 M, and the clean reads of JA1 and JA3 numbered 43.842 M and 39.768 M, respectively. Both Q20 and Q30 were >94%, indicating that the Illumina sequencing was of sufficient quality for further analysis (Table 1).
The Trinity software was used to assemble unigenes. A total of 199,216 unigenes were generated, with an average length of 1102 bp and an N50 length of 1723 bp (Figure 2). Of these, 164,006 unigenes were annotated against the Nr, Nt, Swissprot, Kyoto Encyclopedia of Genes and Genomes (KEGG), KOG, Pfam and GO databases, with their intersection also reported, accounting for 82.33% of the 199,216 unigenes (Table 2). Sequence comparison showed that the predicted proteins are homologous to those of Helianthus annuus and Cynara scolymus, with the highest homology, 92.44%, to Helianthus annuus (Figure 3).
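For readers unfamiliar with the N50 statistic reported above, the following minimal sketch shows the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The function name and the toy lengths are hypothetical and are not taken from the assembly in this study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    Sequences are sorted from longest to shortest and accumulated
    until at least half of the total length is covered; the length
    of the sequence that reaches this point is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths were given")


# Toy example: the total length is 54, so half is 27; 10 + 9 + 8 = 27.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # prints 8
```

An N50 larger than the mean length, as reported here (1723 bp versus 1102 bp), is expected when an assembly contains many short sequences and fewer long ones.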

Differentially expressed unigenes in JA1 and JA3
The expression levels of a total of 178,931 unigenes were calculated based on fragments per kilobase of transcript per million mapped reads (FPKM) values. Compared with JA3, 30,519 unigenes were up-regulated and 35,055 were down-regulated in JA1. A total of 113,357 unigenes showed no differential expression (Figure 4).
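As a brief illustration of the FPKM measure used above, the sketch below applies the standard definition: the number of fragments mapped to a unigene, divided by the unigene length in kilobases and by the total number of mapped fragments in millions. The function name and the example numbers are hypothetical; the exact software and settings used to compute FPKM in this study are not stated in this section.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    fragments              -- fragments mapped to this unigene
    length_bp              -- unigene length in base pairs
    total_mapped_fragments -- all mapped fragments in the sample
    """
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical unigene: 500 fragments, 2000 bp long, 25 million mapped fragments.
print(fpkm(500, 2000, 25_000_000))  # prints 10.0
```

Because the value is normalised for both transcript length and sequencing depth, FPKM values can be compared between unigenes of different lengths and between the JA1 and JA3 libraries.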

Differential GO and KEGG analyses
GO analysis covered a total of 58,661 differentially expressed unigenes, of which 23,301 were assigned to biological processes, 23,665 to cellular components and 11,695 to molecular functions. Within the biological process category, most unigenes were involved in metabolic process (5673), cell structure (5443) and single-organism process (3361) (Figure 5). A total of 88,568 unigenes were annotated in the KEGG database, and 35,858 of these were differentially expressed; metabolic pathways accounted for 62.8% of all differentially expressed unigenes (Figure 6), which were mainly concentrated in carbohydrate and amino acid metabolism.

Screening of candidate genes for fructan biosynthesis
The fructan synthase genes include 1-SST, 1-FFT, 6G-FFT and 6-SFT. Different types of fructans are produced by enzymes that are encoded by different genes (Figure 7). We found that the expression level of SST mRNA in JA1 was 3.96 times that in JA3 in the sucrose-to-1-kestotriose transformation pathway, whereas the expression level of FFT mRNA in JA1 was 8.28 times that in JA3 in the 1-kestotriose-to-inulin transformation pathway. Both SST and FFT were therefore expressed at higher levels in JA1 than in JA3, which is consistent with the higher tuber fructan content of JA1 and suggests that SST and FFT expression in tubers is associated with fructan content.
Based on the GO function and KEGG pathway analysis, 102 differential genes were found in these biological processes in Jerusalem artichoke JA1 and JA3. Seven homologous genes involved in fructan biosynthesis were identified (Table 3). Sequence comparison of all homologous genes showed that none of the transcripts was related to 6-SFT or 6G-FFT. These results are consistent with the metabolic model of fructan synthesis in higher plants first proposed by Edelman and Jefford [13], indicating that 6G-FFT and 6-SFT may not play a role in Jerusalem artichoke fructan synthesis. In the transcriptome data, SST has four homologous unigenes, which are different transcripts of the same gene, and FFT has three homologous unigenes. These different unigenes may result from alternative splicing of the same gene. Therefore, CL11458 and CL7122 are candidate genes for SST and FFT, respectively.
[Figure caption] Differentially expressed genes between the Jerusalem artichoke JA1 and JA3 varieties. The genes were classified into three classes: red dots indicate genes whose expression level in JA1 was higher than in JA3 (fold change ≥ 2); blue dots indicate genes whose expression level in JA3 was higher than in JA1 (fold change ≥ 2); grey dots indicate genes that were not differentially expressed (0.5 < fold change < 2). Note: JA1 has high fructan content.

Genetic analysis of candidate genes
To compare the phylogenetic relationships of SST and FFT with those of other species, the MEGA 6.0 software was used to construct a phylogenetic tree (Figure 8). SST and SST from artichoke are genetically close in the phylogenetic tree. In a 1997 study on artichoke, the Cy21 gene encoding 1-SST was cloned, and 1-SST was detected in transiently transformed tobacco (Nicotiana tabacum Linn.) leaves and potato (Solanum tuberosum L.) tubers, indicating that the enzyme encoded by the Cy21 gene can catalyse the synthesis of fructans in transgenic plants [16]. FFT is in the same branch as FFT from 'Qingyu No.1' Jerusalem artichoke [17] and Viguiera [18], with a close genetic distance. The FFT from 'Qingyu No.1' Jerusalem artichoke has been functionally verified by transgenic analysis, indicating that FFT can regulate fructan synthesis [17]. SST and FFT are the main genes for fructan synthesis in other crops, indicating that the SST and FFT identified here may have the same fructan synthesis function. Recent studies have shown that genes encoding 1-FFT and 1-SST are involved in fructan synthesis in asparagus (Asparagus officinalis L.) roots [19]. In addition, the 1-SST from Schedonorus arundinaceus, which efficiently converts sucrose into 1-kestotriose, has been used for fructooligosaccharide production by immobilized Pichia pastoris cells [20]. These studies support the view that SST and FFT are related to fructan synthesis.
[Figure caption fragment] CBM41476.2, Pachysandra terminalis 6-SST/6-SFT; CAH18938.1, Bellis perennis 1-FFT; CAA70855.1, Cynara cardunculus var.
Note. JA1 was used as the treatment and JA3 was used as the control.