Genome-wide analysis of WRKY transcription factors and their response to abiotic stress in celery (Apium graveolens L.)

ABSTRACT Celery (Apium graveolens L.) is rich in nutrient substances and is cultivated worldwide. WRKY protein family is one of the largest transcription factor families in plants and is involved in growth and development, signal transduction, senescence and stress resistance. In this study, we identified 69 celery WRKY family transcription factors based on conserved WRKY domain sequence from the celery genomic and transcriptomic data. Phylogenetic analysis of celery and Arabidopsis WRKY proteins divided these celery WRKY proteins into three groups, seven subgroups. The numbers of WRKY proteins in celery and other 16 species were compared to illustrate the evolution of WRKY transcription factors in plants. Quantitative real-time polymerase chain reaction (qRT-PCR) was used to analyse the expression profiles of five selected AgWRKY genes under abiotic stress (abscisic acid, cold, drought and salt) treatments. The qRT-PCR results indicated that those selected AgWRKY family genes were upregulated under abiotic stress treatments. This study provides a basic research for further investigation of the structure and function of AgWRKY genes.


Introduction
Transcription factors (TFs) are proteins that can bind to sequence-specific DNA and interact with target genes [1,2]. TFs are found in large amounts in the plant genome, and have an important role in signalling pathways to regulate plant growth and development [3][4][5]. In the Arabidopsis genome, more than 5% of the genes encode at least 1533 TFs [6]; and 12.2% of the soybean genome genes code more than 5600 TFs [7].
TFs can be classified as families on the basis of their DNA-binding domain [8]. The WRKY family is one of the largest TF families and is mainly found in plants [9]. WRKY proteins contain one or two WRKY domains that contain about 60 amino acids, which are composed of a peptide sequence WRKYGQK motif at the N-terminus followed by a C 2 H 2 or C 2 HC zinc-binding motif [9]. In some of the WRKY TFs, the conserved WRKYGQK domain also can be replaced by WRKYGKK, WRKYDQK or WRKYDHK [10,11]. WRKY TFs have been identified to interact with the W-box (TTGACT/C) and SURE (sugar responsive) cis element in the promoter region of large number of plant target genes [12][13][14]. WRKY TFs can be divided into three groups in accordance with the number of WRKY domains and the structure of the zinc-finger motifs.
Group I contains two WRKY domains with a C 2 H 2 zinc-finger motif. Group II WRKYs have only one WRKY domain and a C 2 H 2 zinc-finger motif, and can be further divided into five subgroups (IIa-IIe). Group III contains one domain and a C 2 HC zinc-finger motif [9].
Celery (Apium graveolens L.), originated from the Mediterranean basin, is one of the most important vegetables in the Apiaceae family [33]. This biennial herb is rich in dietary fibre, flavonoids, carotenoids and volatile oils, and is now cultivated worldwide [34]. However, little reports on celery WRKY TFs have been found to date. In the present study, we identified 69 WRKY TFs from celery. The transcriptome data of celery have been built by our group [35,36]. Phylogenetic relationships and domain structure of AgWRKY family factors were analysed. The expression profiles of several AgWRKY genes under abiotic stress were detected. Our study could be used to investigate the structure and resistance to abiotic stress of WRKY genes in celery.

Analysis of celery AgWRKY family TFs
The multiple alignments of Arabidopsis and celery WRKY protein sequences was performed by ClustalW with default parameters [39]. WRKYs in Arabidopsis are listed in Supplementary Table S1. A phylogenetic tree was constructed using MEGA 5.0 by the neighbour-joining method [40]. The chemical and physical characteristics of AgWRKY amino acid sequences were analysed by the Expert Protein Analysis System (ExPASy) program (http://web.expasy.org/protparam/) [41]. The conserved motifs of AgWRKYs were identified by MEME (Version 4.11.2) [42]. The protein interaction networks were constructed by STRING software [43].

Plant materials, growth conditions and stress treatments
Celery plants were grown in pots within a soil/vermiculite mixture (3:1) in phytotron. The phytotron programme was set as 25/18 C (day/night) for 16/8 h; during the day the light intensity was 300 mmol m ¡2 s ¡1 , the relative humidity was set at 75%. After two months, celery plants were transferred to 4 C growth chamber or irrigated with 200 g¢L ¡1 PEG6000, 200 mmol¢L ¡1 ABA and 200 mmol¢L ¡1 NaCl, respectively. Clipped leaf blades were sampled after 0, 1, 3, 6, 12, 24 and 48 h. All the samples were frozen in liquid nitrogen immediately and then stored at ¡70 C for RNA extraction.

Total RNA extraction and cDNA reverse transcription
Total RNA of celery leaf blades was extracted using an RNA extraction Kit (Tiangen, Beijing, China) according to the manufacturer's instructions. cDNA was reverse transcribed from total RNA by a Prime Script RT reagent Kit (TaKaRa, Dalian, China) according to the manufacturer's instructions.

Quantitative real-time polymerase chain reaction (qRT-PCR) analysis
AgWRKYs from five subgroups (I, IIa, IIc, IId, III) were selected for qRT-PCR analysis. All the specific primers were designed by Primer Premier 6.0 and synthesized by Genscript Inc (Nanjing, China). Celery TUB-B gene was used to normalize the relative expressions of AgWRKYs [44]. All the primers are listed in Table 1. The Bio-Rad IQ5 real-time PCR System (Bio-Rad, CA, USA) and the TaKaRa SYBR Premix Ex Taq (TaKaRa, Dalian, China) were used for qRT-PCR reaction. The operating instructions were as follows: 95 C for 30 s, following 40 cycles of 95 C for 5 s, 60 C for 30 s and melting curve analysis (61 cycles) at 65 C for 10 s. The 20-mL reaction system contained 10-mL SYBR Premix Ex Taq, 7.2 mL ddH 2 O, 2 mL diluted Table 1. The primer sequences used for qRT-PCR amplification in this study.
Each reaction was repeated three times. The relative gene expression level was calculated by Pfaffl's method [45].

Results and discussion
Identification and classification of AgWRKY TFs in celery Using HMMER and BLAST, we searched the celery genome with the WRKY domain model (PF03106) to identify all the WRKY TFs in celery. A total of 69 AgWRKYs were identified and denoted from AgWRKY1 to AgWRKY69 (Supplementary Table S2). The WRKYGQK domain existed in 64 AgWRKY TFs, while the remaining 5 AgWRKYs contained WRKYGQF, WRKYRQK or WRKYGKK domains. The average amino acid length of the putative AgWRKY proteins is 365, ranging from 118 to 726. In accordance with the number of the WRKY domains and the structure of the zinc-finger motifs, the AgWRKYs were classified into three groups ( Table 2). Group I contains 10 AgWRKYs with two WRKY domains with a C 2 H 2 zinc-finger motif. Group II includes 53 AgWRKYs, which can be further divided into five subgroups. Group III members, six AgWRKYs, have one domain and a C 2 HC zinc-finger motif. A phylogenetic tree of WRKY proteins from celery and Arabidopsis was constructed ( Figure 1). Group I and group III aggregated into one clade, while group II was divided into five subgroups. Groups IIa and IIb composed of one branch, and groups IId and IIe detached from one clade. On the basis of further analysis of WRKYs in Arabidopsis and rice, the systematization of group II can be reorganized into IIa +IIb, IIc, IId+IIe [46].

Evolution and distribution of WRKY transcription factors in different species
WRKY TFs have been identified in many species, including Giardia lamblia from Diplomonads and Dictyostelium discoideum from Amoebozoa [46,47]. We constructed a phylogenetic tree of 17 species in the evolutionary history and compared the number of WRKY TFs in each species ( Figure 2) [17,[19][20][21][22][48][49][50][51][52][53][54]. The primitive eukaryote G. lamblia, slime mold D. discoideum and the green alga Chlamydomonas reinhardtii only have a single group I copy, which indicated that WRKY TFs are likely to have derived from group I. Brand et al. [55] arranged the WRKY TF evolution relationship with group I as existing earliest, followed by IIa and IId, and group III as the youngest group in the WRKY TFs family. In moss Physcomitrella patens, groups IIa and IIe are absent, whereas group III exists, suggesting that group III WRKY TFs may have evolved earlier than groups IIa and IIb in moss [49]. The higher plants contain a larger number of WRKYs and more subgroups than lower plants. The quantitative distribution of WRKY TFs in plants indicates that WRKY TFs family originated from group I. Within group II, group IIc has more WRKY TFs than other subgroups. In monocots, group III has more members than group I, which is opposite to eudicots. Group III may have a more important role in monocot plant physiological biochemical process [21,[56][57][58].

Motif discovery and physicochemical analysis of AgWRKY TFs
The conserved motifs of AgWRKY proteins were predicted by the MEME program. As shown in Figure 3, motif 1 and motif 3 contain a WRKYGQK sequence. Most AgWRKYs contained motif 1. Group I members were shown to possess both motif 1 and motif 3. The motif composition in each subgroup was observed to be similar. For instance, most of the group IIc members contained motifs 1, 2 and 4. Group I possessed motifs 1, 2, 3 and 4, whereas group III contained motifs 1, 2 and 5. Both groups IIa and IIb contained motifs 1, 2, 5, 6 and 8, which indicated that these two subgroups were adjacent in evolutionary relationship.
The Protparam tool was used to predict AgWRKYs' physical and chemical characteristics (Supplementary Table S3). The average molecular weight of the AgWRKYs was 40.35 kDa. The molecular weight variation of the members of group IIc was significant. However, the molecular weight variation in other groups did not change significantly. The theoretical isoelectric point (pI) ranged from 4.44 to 10.3, and the average pI for all proteins was 7.37. Several AgWRKY proteins were found to have pI values over 10, e.g. AgWRKY17, AgWRKY64, AgWRKY24, AgWRKY41 and AgWRKY47. In all AgWRKYs, the grand average of hydropathicity was less than zero, suggesting that AgWRKY proteins were hydrophilic. The percentage of aliphatic amino acids was twofold higher than that of aromatic amino acids in AgWRKY proteins, approximately. The positively charged amino acids in AgWRKYs were a bit more than the negatively charged amino acids at pH 6.5, except in group IIc, in which the proportion of negatively charged amino acids was a little higher.

Interaction networks of AGWRKYs and related proteins
To investigate the relationship between AgWRKYs and related proteins, the protein interaction networks were  1677 933 828 WRKYGQK IIe C-X5-C-X23-HXH AgWRKY69 921 WRKYGQK IIa C-X5-C-X23-HXH integrated by STRING software using the Arabidopsis association model [43]. In this study, we found 14 AgWRKY proteins associated with each other (Supple  mentary Table S4). Of the 14 AgWRKY proteins, AgWRKY16, AgWRKY19, AgWRKY45, AgWRKY51 and AgWRKY56 belonged to group I; AgWRKY57 and AgWRKY69 belonged to group IIa; AgWRKY42 and AgWRKY59 belonged to group IIb; AgWRKY15, AgWRKY39, AgWRKY44, AgWRKY49 and AgWRKY67 belonged to group IIc. As shown in Figure 4, the association between AgWRKY45 and AgWRKY16/AgWRKY51 was very strong, indicating these proteins may co-regulate some biological processes. Salt tolerance zinc (STZ) finger ZAT10 and MKS1 have strong associations with AgWRKY19, AgWRKY56, AgWRKY57 and AgWRKY69. In higher plants, ZAT10 and MKS1 play important roles in defence responses [59,60]. ZAT10 is a transcriptional repressor and is involved in plant tolerance to salinity, heat and osmotic stress [59]. MKS1 can repress salicylic acid (SA)dependent resistance and activate jasmonate (JA)dependent gene expression [60]. Those AgWRKYs may be also involved in the defence response of celery plants against abiotic stress.

Expression level analysis of AgWRKY genes during different developmental stages in celery
According to the transcriptome data of celery [35,36], the transcript abundances of AgWRKY genes were analysed. The expression level analysis of 30 AgWRKY genes of celery leaves in three developmental stages (leaf length was 10, 20 and 30 cm) is shown in Figure 5. The genes of groups I, IId and IIe of celery AgWRKY TFs show higher expression levels in stage 2 than in other stages. Groups IIa and III genes were more highly expressed in stage 3 than in stages 1 and 2. The expression levels of group IIc genes were lower than those of other groups. In three developmental stages, most of group III genes showed a low expression level, except for AgWRKY10 and AgWRKY31 genes in stage 3.

Expression profiles of the selected AgWRKY genes under abiotic stress treatments
Evidence proved that WRKY TFs participate in a series of abiotic stresses. GmWRKY20 in wild soybean was upregulated under ABA treatment [61]. Overexpression of OsWRKY8 in Arabidopsis improved the resistance to PEG and NaCl [62]. Arabidopsis transgenic plants with GmWRKY21 can tolerate cold stress [32]. Five AgWRKY TFs closely related to stress-related AtWRKYs [31,[63][64][65][66] were selected to deduce their role under abiotic stress: AgWRKY2 from group I, AgWRKY8 from group III, AgWRKY38 from IIa, AgWRKY47 and AgWRKY50 from groups IId and IIc, respectively.
To reveal the expression patterns of AgWRKYs in different abiotic stresses, qRT-PCR was used to detect the expression levels of the five selected AgWRKYs (AgWRKY2, AgWRKY8, AgWRKY38, AgWRKY47 and AgWRKY50) in celery leaves ( Figure 6). Under ABA treatment, the expression levels of AgWRKY2, AgWRKY8, AgWRKY38 and AgWRKY47 showed an increased trend at first, followed by a reduction and then an increase. On the contrary, the expression levels of AgWRKY50 went down first and were then upregulated, with a final decline after 48 h. All the five selected AgWRKY genes were highly expressed under cold treatment, suggesting that these AgWRKYs may play important roles in the response to cold stress. Under drought treatment, AgWRKY2, AgWRKY8, AgWRKY38 and AgWRKY50 were  upregulated during the first 24 h and then showed a decrease in expression. AgWRKY47 maintained a high expression level after 48 h drought treatment. The expression patterns of all the five AgWRKYs genes under salt treatment were similar; they increased at first and then declined, but were enhanced again at the end of the treatment. AtWRKY71 is reportedly induced by H 2 O 2 , mannitol and ABA [66]. Its orthologous gene AgWRKY50 was also found to be upregulated under ABA, drought treatment and other abiotic treatments. High temperature can induce upregulation of AtWRKY39 [31] and its orthologous gene in celery, AgWRKY47, was upregulated under different abiotic stresses in this study. These results demonstrated that the five AgWRKY TFs were involved in abiotic stress response. The results provided the foundation for investigating the roles of AgWRKYs in the resistance to abiotic stress in celery. This study expands the knowledge of WRKY TFs in celery and other higher plants. Further research should be performed to reveal the regulatory mechanism of WRKY TFs in the abiotic stress regulation network.

Conclusions
In this study, we identify 69 AgWRKY TFs from celery transcriptome data. AgWRKY proteins can be divided into three groups, of which group II contains five subgroups. Bioinformatics analysis predicted that AgWRKYs in the same subgroup share similar physicochemical properties. qRT-PCR results showed that the expression of AgWRKY genes can be upregulated or downregulated in response to various abiotic stresses. Our study will provide useful information for further research into the role of AgWRKY TFs in regulation mechanism in abiotic stress response. This study can provide potential useful information for the study of WRKY TFs in other plant species and molecular breeding of celery.