Molecular response of gall induction by aphid Schlechtendalia chinensis (Bell) attack on Rhus chinensis Mill

ABSTRACT Horned gall is named after its shape of irregular diamond, and is induced by the fluid-feeding aphid Schlechtendalia chinensis (Bell) attacks on the leaflets that are located in the branchlets of Rhus chinensis Mill., which is enriched in gallotannin and can be widely used in medicine and the food industry. To explain the molecular mechanism of gall development, we performed transcriptome analysis by Illumina deep sequencing and digital gene expression of four tissues, including galls, leaves that grew on the same branch as the gall taken (GL), leaves from a branch without any gall (LW), and leaves from a tree without any gall (CL). Differentially expressed genes abundantly enriched in the biosynthesis of secondary metabolites, plant–aphid interactions, and plant hormone signal transduction were highly expressed in galls compared with GL and LW. Phytohormone signal transduction, dominated by IAA and ABA, coordinates primary and secondary metabolism and thus induces gall induction and development after attack by aphids. This study provides a theoretical basis for the genetic improvement and processing of gallnut resources.


Introduction
Rhus (Anacardiaceae) contains more than 250 species found in temperate and tropical regions worldwide that are characterized by the presence of several compound groups, including tannins, flavonoids, triterpenoids, phenolics, and aromatic alkanes (Djakpo and Yao 2010). Of these, Rhus chinensis is an important plant that has long been used in folk medicine for the treatment of cold, fever, and malaria in China, Japan, and Korea (Rayne and Mazza 2007). Notably, the extract of galls induced by infection with Schlechtendalia chinensis (Bell) is known to be effective in bacterial control since galls are rich in gallotannin (50-70%) and phenolic compounds, such as gallic acid (Djakpo and Yao 2010). The tannin-rich horned galls have extensive use in medicine, chemical engineering, and the food and textiles industries and account for the majority of the yield of gallnut in China (Chen 2000).
In nature, species-specific gall is abnormal plant tissue, and is generally induced by insect parasitism, which then evolves into a relationship between the insect and the host plant akin to mutualism (Cooper and Rieske 2010;Tokuda 2012;Nabity et al. 2013). To disclose the adaptive significance of galls, several hypotheses have been proposed, including defense from natural enemies, protection from the external environment, expansion of the surface area available for consumption, and enhancement of the nutritional quality of plant tissue (Price et al. 1987). Based on the nutrition hypothesis for the adaptive nature, gall insects control the nutrient levels in galls for their own benefit (Hartley and Lawton 1992). Mapes and Davies (1998) reported that the gall-fly larva is needed to maintain early gall development. When the gall inducer dies in a young developing gall, cell division of gall tissue slows and cell differentiation is altered back to more normal stem tissue. Gall-fly mortality and even the timing of gall-fly mortality can have substantial effects on the quality of gall tissue (Sarahe et al. 2008). There is a dynamic biochemical interaction that is usually well balanced for most of the Aphididae between aphids and plants. Gall-forming Aphidoidea trigger and control abnormal growth in the plant to the insects' advantage, possibly by eliciting vigorous oxidation in selective meristematic tissues, thereby limiting supply of molecular oxygen and inhibiting oxygen-dependent growth-controls (Miles 1999). However, the molecular mechanisms that underlie gall induction and gall development remain unknown. RNA-seq is a powerful and efficient tool to identify differentially expressed genes (DEGs) at the whole-genome level and is especially effective for screening the transcriptome of non-sequenced species ).
In the present study, we investigated the molecular processes involved in gall and leaf development in R. chinensis by high throughput RNA-seq. A range of analysis related to transcriptome analysis, including de novo assembly, gene characterization, gene classification, and gene enrichment, were performed. In order to identify the molecular features that distinguish gall and leaf and to disclose what makes the gall unique, the gall and three types of leaf were compared. Two major questions were addressed: What is the main molecular difference between gall and leaf and which factors are needed to involve in the gall initiation and development?

Plant material
R. chinensis and Plagiomnium maximoviczii (Lindb) T. Kop. are the primary and secondary hosts to which S. chinensis migrates. In mid-April, fundatrices appear at the R. chinensis and induce horned gallnuts that remain small for several months and then rapidly expand in the autumn while the number of female aphids in a gall drastically increases. In the late October, the galls crack open and alate aphids disperse, and their offspring develop on the secondary hosts in winter. In the next spring, alate sexuparae are formed on the secondary hosts and migrate to R. chinensis where they produce sexual males and females. After sexual reproduction, each female produces only one fundatrix that can induce a new gall (Takada 1991).
As reported previously , galls and leaves were collected at 100 days from gall formation in eight-yearold trees at a natural plantation located in Yanjin county (Yunnan province) of China (28°04 ′ ～28°11 ′ N, 104°14 ′ ～ 104°27 ′ E). Four types of samples were randomly harvested from different locations of the tree crown and mixed. Of these, three types of tissue samples were collected from trees with galls, which contain galls, leaves that grew at the same branch as the gall taken (GL), and leaves from branches without any gall (LW). In addition, control leaves were collected from trees without any galls (CL) (Figure 1). After collection, galls were immediately cut into two halves, and aphids as well as impurities on the inner gall surface were removed, followed by washing with deionized water and wiping by filter paper. Next, samples were immediately frozen in liquid nitrogen and stored at −80°C. Ten randomly selected galls or leaves per plant were pooled, and measurements were based on three biological and technical replicates, respectively.
RNA extraction, library construction, and RNA-seq Total RNA was extracted by TRIzol Reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's protocol. The concentration of the purified RNA was determined using a UV-VIS spectrophotometer (Quawell Q5000, San Jose, CA, USA), and the purity and degradation of the total RNA were checked with 1% agarose gels. Using Illumina's TruSeq RNA Sample Preparation Kit (Illumina, Inc., San Diego, CA, USA), the mRNA-seq library was constructed, and mRNA was purified using oligo (dT) magnetic beads. The mRNA was fragmented into small pieces using divalent cations after purification, and the cleaved RNA fragments were used for first strand cDNA synthesis using reverse transcriptase and random primers, followed by second strand cDNA synthesis using DNA polymerase I and RNaseH. Subsequently, the cDNA fragments were subjected to an end repair process and ligation of adapters. The product was purified and enriched by PCR to create the final cDNA library, which was sequenced using an Illumina HiSeq ™ 2000 sequencing platform.

Analysis of sequencing results
After sequencing, raw reads were filtered by discarding the reads with adaptor contamination, masking low-quality reads with ambiguous 'N' bases and removing the reads in which more than 10% bases had a Qvalue <20 (Tang et al. 2011). Next, the clean reads obtained were assembled into contigs by the Trinity program, in which an optimized kmer length of 25 was used for de novo assembly . The contigs were linked into transcripts according to the paired-end information of the sequences. Furthermore, the transcripts were clustered by nucleotide sequence identity, and the longest transcripts were defined as unigenes to eliminate redundant sequences.

Functional annotation
To annotate the unigenes, a BLASTX search (E ≤ 10 −5 ) with unigenes was performed against protein databases, including non-redundant protein (Nr), Swiss-Prot, Kyoto Encyclopedia of Genes and Genome (KEGG) and Cluster of Orthologous Groups (COG) of proteins, and the best aligning results were used to determine the sequence direction of Figure 1. Photo showing sample types. Two types of trees were sampled. The first was a galled tree that was infected with gall-forming aphids (A and B), and the other was an ungalled tree that was uninfected (C). From the two types of trees, four types of tissues were harvested, including gall, GL, LW, and CL. CL represents control leaves that grow on ungalled trees; GL signifies galled leaves that grow in the same branches as galls and are adjacent to galls; LW represents leaves without gall that grow in the gall-free branches in galled trees. Panels A, B, and C represent 100, 155, and 100 days from gall initiation, respectively. the unigenes. ESTScan was used to predict the coding regions and sequence direction when a unigene could not be aligned to sequences in any of the above databases (Grabherr et al. 2011). The expression abundance of the unigenes was represented by the number of fragments per kilobase of transcript per million fragments mapped (FPKM) (Iseli et al. 1999). Additionally, Blast2GO (Grabherr et al. 2011) and WEGO (Conesa et al. 2005) were used for the functional annotation of genes. The KEGG pathway annotation was performed using Blastall software and the KEGG database.

Detection of SSR and single-nucleotide polymorphism markers
The assembled sequences were screened to search for potential simple sequence repeat (SSR) markers using MISA (http://pgrc.ipk-gatersleben.de/misa/). The principle requirement for the identification of perfect dinucleotide motifs was a minimum of six repeats, and for tri-, tetra-, penta-, and hexa-nucleotide motifs, a minimum of five repeats (Ye et al. 2006;Wei et al. 2011). In addition, we used SOAPsnp to detect single-nucleotide polymorphisms (SNPs) in the transcripts, which can be identified on the consensus sequence by comparison with the unigenes (Zeng et al. 2010).

Analysis of DEGs
To determine the expression levels of unigenes, the ratio of FPKM values was defined as the fold-changes in the expression of each gene, and thus DEGs between two groups were identified (Mortazavi et al. 2008). The false discovery rate (FDR) control method was used to identify the threshold of the P-value for transcript abundance (Reiner et al. 2003). In this study, significant DEGs were identified when the fold change values were above log2 (FPKM of sample/FPKM of control) ≥ 1, and FDR ≤ 0.001.

Results
Transcriptome sequencing and de novo assembly RNA from different developmental tissues was pooled to obtain a broad gene library related to gall development; a total of 86.17 million raw reads and 6.71 gigabase pairs were obtained. With quality checks, the average guanine and cytosine (GC) content reached 43.73%. High-quality reads were defined as reads with Q ≥ 20 and no ambiguous 'N' by SOAPdenovo (Li et al. 2009). Thus, after removal of adaptor sequences and contaminated or short reads, 43,027,436 high-quality reads were assembled into 88,592 contigs. Furthermore, short-read sequences were assembled into 59,522 unigenes with a mean length of 889 bp using the Trinity de novo assembly program, including 40,588 unigenes (68.19%) with lengths of 100-1000 bp, 11,855 unigenes (19.92%) with lengths of 1000-2000 bp, and 7079 unigenes (11.89%) with lengths >2000 bp (Table 1 and Figure 2). The N50 values of contigs and unigenes were 1224 and 1675 bp, respectively. The number of reads based on the relative position in a gene (5 ′ -3 ′ ) exhibited a normal distribution ( Figure  3), which indicates a randomly fragmented transcriptome. To facilitate access to the R. chinensis transcriptome data, raw paired-end sequence data in FASTQ format were deposited in the NCBI Sequence Read Archive (SRA) database with the accession number SRA302347.

Functional annotation and classification
For annotating the assembled unigenes, a threshold of 10 −5 was used when performing a BLASTX search against diverse databases, consisting of the NCBI non-redundant protein (Nr) database, NCBI non-redundant nucleotide sequence (Nt) database, UniProt/Swiss-Prot, KEGG, C OG of proteins   and Gene Ontology (GO). Next, 63.39% (37,732) of the unigenes were successfully annotated. Based on the BLASTX results, 36,153 (60.74%), 31,128 (52.30%), and 24,856 (41.76%) unigenes displayed significant similarity to Nr, Nt, and Swiss-Prot database, respectively (Table 2). With regard to the species annotation for the unigenes, 8345 (23%) unigenes displayed significant homology with sequences of Vitis vinifera, and 21%, 18%, and 13% of the mapped sequences had a similarity with sequences of Ricinus communis, Populus balsamifera subsp. trichocarpa, and Amygdalus persica, respectively. R. chinensis and the four species mapped are dicotyledonous, implying that these species may share similar genetic and genomic characteristics. A total of 38,943 coding sequences (CDSs) were successfully identified, of which, 35,833 CDS were based on BLASTX, and another 3110 gene relied on ESTscan software prediction. The remaining 20,579 unigenes (34.57%) was not identified, possibly because the length of the CDSs in some transcripts is too short, the background database does not have sufficient information, and some transcripts correspond to non-coding RNAs.
Based on GO classification, 42,508 unigenes were divided into three categories, which includes 51.76% in biological processes, 29.16% in cellular components, and 19.08% in molecular functions. The three largest biological processes were cellular processes, metabolic processes, and single-organism process. For the cellular components, the major classifications were cell, cell part, and organelle. Under the molecular functions category, most of the classifications were of the catalytic activity, binding, and transporter activity type. The above-mentioned data imply that gall development is an extremely complicated biological process.

Detection of SSR and SNP markers
Based on the screening of 59,522 unigenes, 7526 sequences containing 8926 SSRs were identified. Of these, 1172 unigene sequences contain more than one SSR, and 391 unigene sequences contain SSRs in a compound formation. The largest numbers of motifs are of the mononucleotide and trinucleotide type, which account for 40.43% and 29.01% of the SSR sequences, respectively (Table 3). Dinucleotide (1956) is the most abundant repeat type in all SSRs, followed by AG/CT (1139), AAG/CTT (916), AT/ AT (670), and ATC/ATG (355), respectively. The number of quad-, penta-, and hexa-nucleotide motifs is relatively less. Furthermore, 73,226 SNPs were identified with a frequency of 1/722, and contain 45,648 transitions and 27,578 transversions. Of these, the majority are of the A/ G and C/T type, which account for 31.09% and 31.25% of all SNPs, respectively. The four other types of nucleotide variations, A/C, A/T, C/G, and G/T, account for 9.22%, 11.39%, 7.75%, and 9.31% of the SNPs, respectively ( Table 4).

Sequencing of digital gene expression
To identify the digital gene expression (DGE) profiles involved in gall development, four DGE libraries (Gall, GL, LW, and CL) were generated and sequenced using an Illumina platform. Quality control and filtering of raw data were implemented to decrease data noise, which included removing the sequences of adapters, sequences with a high content of unknown bases and low-quality reads before further analysis. The percentages of clean reads from the four tissues are 99.58%, 99.61%, 99.69%, and 99.87%, respectively, suggesting that the sequence data are of high quality ( Figure 4). The reads in every position are evenly distributed, indicating a good randomness ( Figure 5). In addition, a sequence saturation analysis was performed; the growth curve of detected genes flattens when the number of reads reaches a certain value, which implies that the number of detected genes tends to saturation ( Figure 6).

Comparison of gene expression between tissues
As shown in Figure 7, 39,845, 36,389, 35,796, and 39,526 genes were annotated from Gall, GL, LW, and CL tissues, respectively. Of these, 24,951 genes overlapped in all four tissues. In addition, 5985, 1141, 1194 and 2633 genes specifically expressed in only one tissue type. As for gene expression, those genes that satisfied the criteria of more than a twofold difference between tissues and an FDR ≤ 0.001 were defined as DEGs. Thus, 7979, 1634, and 3190 up-regulated genes, and 6792, 2889, and 6474 down-regulated genes were found when Gall, GL, and LW were compared with CL, respectively (Figures 8 and 9). These DEGs were further analyzed by GO classification (Figure 10). Three GO classifications, cellular processes, metabolic processes, and single-organism processes, are the three largest biological processes. In the category of cellular components, the major classifications were cell, cell part, and organelle. Catalytic activity, binding, and transporter activity formed the biggest group in molecular function. Interestingly, the GO classification for different tissues showed the same pattern, which indicates that these tissues (Gall, GL, and LW) possibly have similar metabolic processes. To verify this conclusion, a pathway enrichment analysis was performed ( Figure 11). For all groups of comparisons, the three largest classifications are biosynthesis of secondary metabolites, plant-aphid interactions, and plant hormone signal transduction. Interestingly, starch and sucrose metabolism was enriched both in the Gall vs. GL and GL vs. CL comparisons, which implies that gall-related tissues are more involved in starch and sucrose metabolism, compared to the leaf.

DEGs involved in plant hormone signal transduction
Generally, IAA, GA, CTK, and brassinosteroid (BR) are classified as growth hormones, while ABA, ETH, JA and salicylic acid (SA) are classified as defense hormones. A total of 28 DEGs involved in IAA metabolism were identified, including 3 auxin influx carrier proteins, 5 auxin-responsive proteins, 16 auxin response factors, and 4 SAUR family proteins. Additionally, four isoforms of the two-component response regulator ARR, which plays an important role in CTK signal transduction, were annotated. Of these, the gene expression ratio of three genes (Unigene10325, CL3428.Contig1 and Uni-gene18453) in Gall vs. CL is noticeably higher than in YL vs. CL and LW vs. CL. Moreover, four gibberellin receptor (GID1) and two DELLA proteins were identified. In the ABA signal transduction category, 19 protein phosphatase 2c and 2 serine/threonine-protein kinase SRK2E-like isoform 1 proteins were found. Five ETH-related genes were identified. Interestingly, the majority of the ABA and ETH-related genes were over expressed in Gall tissues (Table 5).

Functional genes involved in secondary metabolism, plant-aphid interactions, and starch and sucrose metabolism
Overall, plant hormones regulate primary and secondary metabolism. To determine the metabolic differences among tissues, the three largest pathways (secondary metabolism, plant-aphid interactions, and starch and sucrose metabolism) were selected, and 50 unigenes with the highest expression in gall tissues for each pathway were screened. Interestingly, the majority of these 150 unigenes have higher expression levels in Gall than in GL and LW. After further analysis by hierarchical clustering (Figure 13), three clustering patterns were found. Cluster 'a' consists of 53 genes that have the highest expression in Gall/CL, the lowest expression in GL/CL, and tempered expression in LW/CL. Cluster 'b' consists of 93 genes whose expression is dramatically decreased going from Gall/CL to GL/CL, and then LW/CL. Cluster 'c' only includes four genes, of which three genes (beta-glucosidase 46-like, endo-1, 4-beta-glucanase, polygalacturonase precursor) are only expressed in Gall/CL, and the WRKY transcription factor was significantly increased going from Gall/CL, GL/CL, and through to LW/CL.

Discussion
General characteristics of the transcriptome of R. chinensis The significance of the BLAST comparison depends partly on the length of the query sequence; short sequencing reads would rarely be matched to known genes (Novaes et al. 2008). In the present study, 63.39% (37,732) of the unigenes were annotated, which is on the lower end when compared to other uncharacterized plants, such as Litsea cubeba (56.00%) (Han et al. 2013), Siraitia grosvenorii (59.90%) (Tang et al. 2011), Phyllostachys heterocycla (69.75%) (He et al. 2013), Dendrocalamus sinicus (78.71%) , Dendrocalamuslatiflorus (78.90%) , and Mangifera indica (80%) (Azim et al. 2014). The large number of unmapped unigenes might be due to the short sequence reads from sequencing and the relatively short sequences of the assembled unigenes, most of which probably lack conserved functional domains (Hou et al. 2011). Other possible reasons include the fact that some of these unigenes belong to non-coding RNAs ) and insufficient sequence information in public databases (Hou et al. 2011).

DEGs involved in plant-aphid interactions
Galls are products of parasitism by aphids upon the Rhus tree; their development is an extremely complicated physiological process and is implicated in active attack by aphids, passive defense of trees, confrontation between the host and the aphid, and mutual coordination between the two. It is expected that massive transcripts linked to plant-aphid interactions would be enriched in the four tissues, especially in galls. Serine/threonine-protein kinases phosphorylate the OH group of serine or threonine residues, leading to a functional change in the target protein; protein phosphorylation has been identified as one of the most important events in the disease resistance pathway (Cao et al. 2011). For example, the tomato bacterial speck disease resistance gene Pto encodes a serine/threonine-protein kinase (Martin et al. 1993a); the bacterial blight disease resistance gene Xa21 (Song et al.    (Brueggeman et al. 2008), and stripe rust resistance gene Yr36 in wheat (Fu et al. 2009) also code for a serine/ threonine-protein kinase domain. The network of serine/ threonine-protein kinases in plant cells appears to act as a 'central processor unit', accepting input information from receptors that sense environmental conditions, phytohormones, and other external factors, and converting it into appropriate outputs, such as changes in metabolism, gene expression, and cell growth and division (Hardie 1999). In the present study, 87 transcripts encoding serine/threonine-  . Pathway enrichment analysis of DEGs. In this figure, we used three indices to measure the degree of enrichment, including the Rich factor, Qvalue and number of genes enriched in a pathway. Of these, the Rich factor represents the ratio of the number of DEGs, accounting for all annotated genes that share the same pathway as the DEGs. Larger values of the Rich factor imply a greater degree of enrichment. The Qvalue is the adjusted P-value by multiple hypothesis testing. The Qvalue ranges from 0 to 1, and values closer to 0 imply more significant enrichment. protein kinases were identified, indicating their indispensable role in gall formation.
In the present study, 24 WRKY transcription factors were identified, including WRKY1, WRKY 2, WRKY17, WRKY22, WRKY26, WRKY44, WRKY47, WRKY57, WRKY60, WRKY72, and WRKY74. The large family of WRKY transcription factors exerts an influence over plant growth, development, and plant environment interactions (Ülker and Somssich 2004;Rushton et al. 2010). In Arabidopsis thaliana, AtWRKY44 is involved in trichome development (Johnson et al. 2002), while AtWRKY6, AtWRKY53, and AtWRKY70 are involved in leaf senescence (Robatzek and Somssich 2002;Ülker et al. 2007). AtWRKY3, AtWRKY4, AtWRKY52, AtWRKY53, and AtWRKY70 all participate in the plant's response to pathogen attack (Pandey and Somssich 2009). AtWRKY22 and AtWRKY29 are two downstream components in a MAPK pathway that is responsible for conferring resistance against both bacterial and fungal pathogens (Asai et al. 2002). A transcript (CL1111.Contig2_Rhuscm) was identified as WRKY transcription factor 22-like, which was highly expressed in LW, mildly expressed in GL, and slightly expressed in Gall. This implies that both leaves and galls show a significant defense response. A transcript (Unige-ne11387_Rhuscm) identified as WRKY1 was highly expressed in Gall, mildly expressed in LW, and slightly expressed in GL. This is consistent with evidence that the order of withering is GL, LW, and Gall (Figure 1(B)). According to a previous study (Qiao et al. 2016), WRKY1 is associated with stomatal movement in drought stress. Thus, it is possible that WRKY1 regulates water management during gall development.

DEGs involved in secondary metabolism
Two shikimate dehydrogenase genes from grapevine, VvSDH3 and VvSDH4, were mainly expressed in immature berry tissues in which galloylated flavan-3-ols were accumulated, and the content of gallic acid, β-glucogallin, and galloylated flavan-3-ols was increased (Bontpart et al. 2016). In contrast, two transcripts encoding shikimate dehydrogenase (CL2974.Contig1 and CL3909.Contig1) that expressed highly in galls were identified in this study, attesting to the fact that shikimate dehydrogenase has an impact on gallic acid metabolism.
In plants, anthranilate synthases are monofunctional, and consist of two non-identical subunits, the alpha (ASA) and the beta subunit (ASB) (Crawford 1989). ASA alone can convert chorismate to anthranilate with ammonia acting as an amino group donor, while ASB acts as an accessory component to ASA. In most plants, there are two non-identical copies of ASA genes that are regulated by different mechanisms (Bohlmann et al. 1995;Tozawa et al. 2001). In Arabidopsis, the mRNA expression of ASA1 was approximately 10 times higher than that of ASA2 and was induced by pathogen infection as well as wounding (Niyogi and Fink 1992). At the protein level, the enzymatic activities of ASA are also differentially regulated through feedback inhibition by Trp (Bohlmann et al. 1996;Kreps et al. 1996). The regulated expression of ASA genes suggests that the control of anthranilate flux is an important step in regulating the production of Trp (Lu et al. 2005), which is converted to a diverse range of secondary metabolites, such as IAA, serotonin and, in many medicinal plants, the pharmacologically important terpene indole alkaloids (Dubouzet et al. 2007). In this study, two transcripts encoding the anthranilate synthase alpha subunit (CL3745.Contig1 and Unigene19101) were identified with abundant expression in galls compared with leaf tissues, suggesting that the anthranilate synthase alpha subunit is possibly associated with gallic acid synthesis.

DEGs involved in starch and sucrose metabolism
Fructokinase directly regulates sucrose degradation after the conversion of b-D-fructofuranose and D-fructose-6-phosphate, and the latter is the starting material for glycolysis (Martin et al. 1993b). Additionally, fructokinase performs a regulatory role in photosynthetic tissues, regulating photosynthesis, growth, and senescence (Dai et al. 1999). During hypoxia, overexpression of fructokinase can improve adenylate energy status in tomato root (Gharbi et al. 2007). Additionally, fructokinase significantly is up-regulated during fiber elongation in cotton (Yang et al. 2008). In this study, a fructokinase-2 gene (Unigene15517) was abundantly expressed in gall compared with the other three types of tissues, which implies that up-regulation of the fructokinase-2 gene might be associated with gall formation.
Invertase, also called beta-fructofuranosidase, is a glycoprotein that cleaves the terminal non-reducing beta-fructofuranoside residues. Based on solubility, optimum pH, Hormone concentration data were extracted from our previously published paper . Gene expression data are listed at Table 5. isoelectric point, and subcellular localization, plant invertases can be classified into three subgroups: vacuolar (soluble acid), cytoplasmic (soluble alkaline), and cell-wall-bound invertase. Soluble acid invertase has important biological functions in sucrose metabolism and predominantly hydrolyses sucrose for growth and developmental processes (Kulshrestha et al. 2013). In addition, sucrose hydrolysis by soluble acid invertase helps in the regulation of osmotic pressure, which controls cell expansion and depends on the size of vacuole (Ji et al. 2005). Sucrose synthase can catalyze the cleavage of sucrose and produce uridine diphosphoglucose, which is the substrate for cellulose synthesis (Nairn et al. 2008). Three transcripts with abundant expression in gall encoding soluble acid invertase (Unigene4744), sucrose synthase (CL127.Contig1), and cellulose synthase A catalytic subunit 3 (CL1316.Contig1) were identified.
Pectinesterase is a ubiquitous cell-wall-associated enzyme thought to be responsible for the demethylation of galacturonyl residues in high-molecular-weight pectin (Phan et al. 2007), which has been implicated in many developmental processes, including cellular adhesion, stem elongation, pollen tube development, abscission, and fruit ripening (Brummell and Harpster 2001;Micheli 2001;Bosch and Hepler 2005). Twelve transcripts encoding pectinesterase were enriched in four types of tissues. Of these, eight transcripts were highly expressed in gall, and four transcripts were substantially expressed in the other three tissues. Given that galls experience a substantial accumulation of biomass when compared with leaves, we speculate that the above-mentioned DEGs, which are involved in sugar and starch metabolism, play a key role in primary substance metabolism.

Endogenous hormones coordinate primary and secondary metabolism
Plants reorganize their metabolism as they establish vegetative structures, acquire nutrients, produce viable offspring, survive drought, and other abiotic stresses, as well as navigate the challenges of maintaining mutualistic relationships while thwarting the advances of parasitic or herbivorous organisms. Phytohormones are important regulators of plant growth and development, in addition to playing an important role in their adaptation to their respective environments. Often, phytohormones are classified by their most prominent function, such as 'growth hormones' or 'defense hormones' (Schäfer et al. 2016). However, these hormones have also been shown to participate in adaptations different from their classical functions. For example, JAs can regulate flower development (Stitz et al. 2014) and senescence induction (Ueda and Kato 1980), while cytokinins also participate in interactions with pathogens and herbivores (Smigocki et al. 1993;Choi et al. 2010). In addition, many phytohormone pathways interact with each other.
Phytohormone interactions can occur at the signaling level. A classic example is the GA pathway amplifying JA signaling by the binding of the GA-regulated DELLA proteins to JASMONATE ZIM-DOMAIN (JAZ)-suppressors, negative regulators of JA signaling (Hou et al. 2010). Phytohormones can also influence each other at the metabolic level, as is Figure 13. Hierarchical clustering of the DEGs. FPKM values were transformed to log 2, analyzed with Cluster 3.0 (http://www.eisenlab.org/eisen/) and displayed using Treeview. The three columns represent the different ratios between tissues, including gall/CL, GL/CL, and LW/CL. The color ranges from red to green for the up-regulated and down-regulated proteins, respectively. Undetected proteins on the gels are displayed in grey. The gene names are listed on the right of the figure. In terms of differential expression, three subclusters were defined and are marked by lowercase (a-c).
reported for AXs that can reduce CK levels by promoting cytokinin oxidase/reductase-mediated degradation of CKs (Werner et al. 2006;Carabelli et al. 2007). However, phytohormone pathways are also known to interact at the signaling and metabolic levels. Additionally, phytohormones can simultaneously affect nutritional value and the abundance of defense compounds.
In the present study, abundant enrichment of signal transduction pathways involving seven types of hormones was identified, which is in accordance with our previous study, wherein concentrations for the seven types of hormones showed significant differences. However, the PCA results between hormone concentration and signal transduction do not completely overlap (Figure 12), possibly due to differences between hormone synthesis and signal transduction. Interestingly, ABA has high weight in the PCA results for both hormone concentration and signal transduction, suggesting that it plays a key role in gall formation. In this study, 22 transcripts were related to ABA signal transduction, of which 19 transcripts were annotated as protein phosphatase 2c, and the majority of them retain higher expression in galls compared with other tissues.
protein by the ubiquitin 26S proteasome. Therefore, the inhibitory effect of DELLA protein on plant growth is reversed (Vierstra 2003;Ueguchi-Tanaka et al. 2007). Of the four unigenes encoding the gibberellin receptor GID1, only Unigene11024 is consistent with the majority of DEGs in being related to secondary metabolism and plant-aphid interactions ( Figure 13); therefore, we speculate that Uni-gene11024 has an important role in GA signal transduction regulating gall development, and its specific function needs to be determined.
Currently, there are no mRNA expression data for R. chinensis. In this study, transcriptome data based on RNA-seq technology were obtained and gene expression profiling of four tissues was performed. We identified thousands of DEGs enriched in the biosynthesis of secondary metabolites, plant-aphid interactions, and plant hormone signal transduction, which were abundantly expressed in galls compared with other leaf tissues. Of the seven endogenous hormones, IAA and ABA dominated hormone signal transduction according to PCA analysis. Additionally, the transcriptome data were verified by quantitative real-time PCR ( Figure  14). In conclusion, due to aphid attack, hormone signal transduction coordinates primary and secondary metabolism and the defense response, leading to gall induction ( Figure 15). Table 6. Quantitative PCR validation of the unigenes from RNA-seq.
These data lay the groundwork for further genomic research on gallnuts.