Isolation and characterization of a MADS-box gene in cucumber (Cucumis sativus L.) that affects flowering time and leaf morphology in transgenic Arabidopsis

Abstract The MADS-box family genes are important transcription factors that play essential roles in plant growth and development. In this study, a MADS-box gene, CsMADS02, was identified and functionally characterized in cucumber (Cucumis sativus L.). CsMADS02 encoded a typical SEP MADS-box protein which contained a MADS-box, a K-box, an I region and a C region, as well as two short conserved SEP motifs. Sequence alignment and phylogenetic analysis showed that the CsMADS02 protein belonged to the SEP3 subfamily and was highly homologous to other published plant SEP proteins. Expression analyses revealed that CsMADS02 was primarily expressed in flowers and ovaries, and differentially expressed during the development of male and female flowers. Ectopic overexpression of CsMADS02 caused earlier flowering of Arabidopsis. In addition, morphological changes were also observed in transgenic plants, including curled and fewer rosette leaves. These findings suggest that CsMADS02 is a potential regulator of the flowering time and leaf morphology in cucumber.


Introduction
MADS-box proteins, a large transcription factor (TF) family, are key regulators involved in many aspects of the developmental process in plants. They contain a highly conserved N-terminal domain with a length of approximately 60 amino acids, which is named MADS and possesses functions of DNA-binding and dimerization [1,2]. There are two major categories (type I and type II) of plant MADS-box genes, and the type II members are also known as MIKC-type MADS because they harbour the common structure of four domains including a MADS (M) domain, an intervening (I) domain, a keratin-like (K) domain, and a C-terminal (C) domain [3,4]. The MIKC-type MADS-box proteins can be further divided into canonical (MIKC^C) and star type (MIKC*) depending on the alteration of their motif structure and on phylogenetic standards [5,6].
Cucumber (Cucumis sativus L.) is a major vegetable crop cultivated worldwide, and is also a model system for flower development studies owing to its diversity of floral sex types [38][39][40]. Our previous study had identified 43 MADS-box family genes in cucumber, and some of them may function in determining the flower organ identity and floral transition [1]. However, up to now, the specific functions of these genes still remain to be elucidated. In the present study, a MADS-box gene from cucumber named CsMADS02 was cloned and characterized. The expression patterns of CsMADS02 in different tissues and during flower development were analyzed. The overexpression of CsMADS02 affected the flowering time and leaf morphological development in Arabidopsis, suggesting a potential role of CsMADS02 in cucumber growth and development.

Plant materials and growth conditions
Cucumis sativus var. sativus line 9930 was used in this study. Cucumber seeds were germinated and grown in trays containing soil mixture (peat: sand: pumice, 1:1:1, v/v/v). Both female flowers (FF) and male flowers (MF) during different developmental stages at the 20 main-stem node stage were collected for RNA isolation. The five developmental stages of MF1/FF1 to MF5/FF5 were separated on the basis of their corolla length according to a previous study [41]. All of the collected samples were frozen immediately in liquid nitrogen and stored at −80 °C until RNA isolation.
The Arabidopsis thaliana Col-0 ecotype (wild-type, WT) was used in this study. WT and transgenic Arabidopsis plants were grown on soil at 22-24 °C in an artificial climate chamber with a 16-h light/8-h dark cycle and 70% relative humidity.

Gene cloning and sequence analysis
Total RNA isolation and first-strand cDNA synthesis were carried out using Trizol reagent (Tiangen, China) and Superscript III RNase H-Reverse Transcriptase kit (Invitrogen, USA), respectively, according to the manufacturers' instructions. A pair of primers (CsMADS02-1F and CsMADS02-1R) was designed and synthesized (Shanghai Sangon, China) based on the sequence of CsMADS02 (Gene ID: Csa008448) in our previous study (Table 1) [1]. Semi-quantitative reverse transcription polymerase chain reaction (RT-PCR) was conducted to amplify the coding sequence (CDS) of the CsMADS02 gene using the cDNA of female flower samples according to the method described previously [42]. The PCR product was purified and cloned into the pMD18-T vector (TaKaRa, Japan) for sequencing. The genomic DNA of CsMADS02 was obtained from the cucumber genome database (http://cucumber.genomics.org.cn/), and the exon-intron structure of CsMADS02 was analyzed by comparing the CDS and genomic DNA (gDNA) sequences using Gene Structure Display Server (GSDS, http://gsds.cbi.pku.edu.cn/). The theoretical isoelectric point (pI), molecular weight (MW) and grand average of hydropathy index (GRAVY) of the CsMADS02 protein were determined by ProtParam (http://web.expasy.org/protparam/). The online tools WoLF PSORT (https://www.genscript.com/tools/wolf-psort), Plant-mPLoc (http://www.csbio.sjtu.edu.cn/bioinf/plant-multi/), and CELLO (http://cello.life.nctu.edu.tw/) were employed to predict the subcellular localization of the CsMADS02 protein. The conserved domains of the CsMADS02 protein were examined using the SMART (Simple Modular Architecture Research Tool) program (http://smart.embl-heidelberg.de/).

Sequence alignment and phylogenetic tree analysis
Multiple sequence alignments of the full-length protein sequences of CsMADS02 and several published MADS-box proteins from other plant species were conducted by Clustal Omega with default parameters. A neighbour-joining (NJ) phylogenetic tree was constructed using the MEGA 5.0 software [43].

Cis-elements analysis of the promoter of CsMADS02 gene
To understand the transcriptional regulation and potential expression patterns of CsMADS02, the 1,500 bp region upstream of the start codons of

Expression pattern analysis of CsMADS02
Transcriptome sequencing (RNA-seq) data were used to study the expression patterns of CsMADS02. The raw data of 10 different tissues including ovary (unexpanded, unfertilized, and fertilized), flower (male and female), root, tendril, tendril base, stem and leaf were retrieved from a public repository database (SRA, Sequence Read Archive) based on a previous study [44]. Gene expression levels were calculated as Fragments Per Kilobase of exon model per Million mapped reads (FPKM) according to our previous study [45]. To assess CsMADS02 expression patterns during floral organ development, RT-PCR was carried out in different developmental stages of female and male flowers in cucumber as described previously [42]. The corresponding primers (CsMADS02-2F and CsMADS02-2R) are listed in Table 1.
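As an illustration of the FPKM normalization mentioned above, the following minimal Python sketch applies the standard formula (fragment count scaled by exon length in kilobases and by library size in millions). The function name and the example numbers are hypothetical and are not taken from the study; the actual pipeline is the one described in [45].

```python
def fpkm(fragments: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Return FPKM for one gene.

    fragments              -- fragments mapped to the gene's exons
    exon_length_bp         -- summed exon length of the gene, in bp
    total_mapped_fragments -- all mapped fragments in the library
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and library size must be positive")
    # 1e9 = 1e3 (bp -> kb) * 1e6 (fragments -> millions of fragments)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments, 1,000 bp of exons, 20 million mapped fragments
example_value = fpkm(500, 1000, 20_000_000)
print(example_value)
```

Because the value is scaled by both gene length and sequencing depth, it allows comparison of one gene across libraries of different sizes, which is the use made of it in the tissue comparison.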

Arabidopsis transformation and morphological analyses
To express CsMADS02, the CDS of CsMADS02 was amplified using the specific primers (CsMADS02-3F and CsMADS02-3R, which are listed in Table 1), and inserted into pHB [42], resulting in the pHB::CsMADS02 overexpression construct. Then the pHB::CsMADS02 construct was introduced into Agrobacterium tumefaciens strain GV3101 for transformation into A. thaliana by the floral dip method [46]. Transformants were selected on 1/2 Murashige and Skoog (MS) medium supplemented with 50 mg/L hygromycin, and the expression of CsMADS02 in the T2 generation was examined by RT-PCR. AtTubulin4 was used as an internal control and the corresponding primers are listed in Table 1. Transgenic phenotypes were analyzed in the T2 and T3 generations, and the day of sowing was counted as day 0.

Results and discussion
Cloning and sequence analysis of the CsMADS02 gene
The CDS of the CsMADS02 gene was amplified by RT-PCR and sequenced, using a pair of primers designed based on the sequence of the CsMADS02 gene (Gene ID: Csa008448) identified in our previous study [1]. The sequencing results showed that the CDS of CsMADS02 was 729 bp in length, and encoded a deduced protein of 242 amino acids (Figure 1A). Analysis of the deduced amino acid sequence using the SMART program (http://smart.embl-heidelberg.de/) revealed that CsMADS02 contained a characteristic MADS domain and an additional K-box, which were present at positions 1-60 and 83-174, respectively (Figure 1A). GSDS analysis comparing the CDS and gDNA sequences of the CsMADS02 gene showed that it contained 8 exons and 7 introns (Figure 1B). The ProtParam analysis showed that the deduced CsMADS02 protein had a theoretical pI of 8.80 and a molecular weight of 27.75 kDa. The grand average of hydropathicity (GRAVY) was calculated to be -0.637, suggesting that the protein is hydrophilic. The subcellular localization analysis performed using WoLF PSORT, Plant-mPLoc and CELLO predicted that CsMADS02 is located in the nucleus.
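Two of the figures above can be checked with simple arithmetic: a 729-bp CDS contains 243 codons, one of which is the stop codon, giving 242 residues; and GRAVY is the mean Kyte-Doolittle hydropathy value over all residues, with negative values indicating a hydrophilic protein. The Python sketch below is only illustrative. The function names and the five-residue peptide are hypothetical, and the CsMADS02 sequence itself is not reproduced here.

```python
# Kyte-Doolittle hydropathy scale (the scale on which GRAVY is defined)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def protein_length_from_cds(cds_bp: int) -> int:
    """Residue count for a CDS that includes its stop codon."""
    if cds_bp < 6 or cds_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3 and at least 6 bp")
    return cds_bp // 3 - 1  # subtract the stop codon


def gravy(protein: str) -> float:
    """Grand average of hydropathicity: mean hydropathy per residue."""
    if not protein:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in protein.upper()) / len(protein)


print(protein_length_from_cds(729))  # 729 bp CDS reported in the text
print(round(gravy("MGRGR"), 3))      # hypothetical toy peptide, not CsMADS02
```

Applied to the full deduced protein sequence, the same average would be expected to reproduce the ProtParam GRAVY value reported above.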

Sequence alignment and phylogenetic relationships between CsMADS02 and MADS-box family members from other plant species
Our previous study showed that CsMADS02 clustered with VvSEP3 in the SEP subfamily [1], suggesting that it is a member of the SEP proteins. Multiple sequence alignments were performed with CsMADS02 and other published plant SEP proteins, including AtSEP3 [47], TaMADS1 [48], GmMADS28 [49], OsMADS1 [50], GbSEP [51], ZjMADS47 [52] and SlCMB1 [23]. The results showed that CsMADS02 had 55.98-86.03% identity in deduced amino acid sequence with these SEP proteins: OsMADS1 in Oryza sativa (55.98% identity), SlCMB1 in Solanum lycopersicum (59.05% identity), GbSEP in Ginkgo biloba (59.91% identity), TaMADS1 in Triticum aestivum (68.78% identity), AtSEP3 in A. thaliana (78.33% identity), GmMADS28 in Glycine max (83.82% identity) and ZjMADS47 in Ziziphus jujuba (86.03% identity) (Figure 2). In addition, the MADS domain and the K-box were relatively conserved, while the I and C regions were less conserved, which is in accordance with the results of a previous study [53]. Furthermore, two short conserved motifs (SEP motifs I and II) were present in the SEP proteins although the C-terminal regions were highly divergent (Figure 2).
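For readers who wish to reproduce identity percentages of this kind, the sketch below shows one common way to compute pairwise identity from two rows of an alignment. It assumes that columns containing a gap in either sequence are excluded from the denominator; alignment tools differ on this convention, so values may deviate slightly from those reported above. The function name and the short aligned fragments are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences have a residue
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments: 7 ungapped columns, 6 identical
print(round(percent_identity("MGRGK-VE", "MGRGRAVE"), 2))
```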
To investigate the divergence of CsMADS02 compared to other plant SEP proteins during evolution, a phylogenetic tree was constructed using the amino acid sequences of CsMADS02 and 33 other SEP proteins from various plant species. The phylogenetic analysis indicated that these proteins could be divided into three distinct subgroups (SEP1/2, SEP3, and SEP4), and CsMADS02 was clustered in the SEP3 subgroup (Figure 3), suggesting that CsMADS02 belongs to the SEP3 subfamily. Besides, in the SEP3 subgroup, CsMADS02 was the closest to SEP members in dicotyledons, such as Z. jujuba, Prunus persica and Malus domestica, whereas it was most distant from those in monocotyledons, such as T. aestivum, O. sativa and Zea mays (Figure 3).

Promoter region analysis of the CsMADS02 gene
The identification of cis-elements in plant promoters is important for understanding the specificity of interactions between TFs and target gene promoters, and subsequently for revealing gene regulatory networks [54]. To investigate the cis-elements in the promoter region of the CsMADS02 gene, a 1500-bp sequence upstream of the translation initiation codon was downloaded and several putative cis-elements were identified using the PlantCARE tool (Supplementary material Table S2). A series of putative cis-elements involved in development, hormone and stress responses are displayed in Figure 4. The Skn-1_motif, which is required for endosperm-specific promoter activity, was found at four positions, suggesting that CsMADS02 may have tissue-specific expression. Two other development-related elements, O2-site and MBSI, which are involved in regulating zein metabolism and flavonoid biosynthetic genes, respectively, were also identified in the promoter region of CsMADS02. In addition, the CsMADS02 promoter possessed two heat stress-responsive elements (HSE) and a series of hormone-related elements such as ABRE, CGTCA-motif, GARE-motif and TCA-element, implying that CsMADS02 may function in response to heat stress and hormones (Figure 4).
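The promoter extraction step can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the PlantCARE procedure: the function names, the coordinate convention (0-based forward-strand index of the A of the start codon) and the toy sequences are all hypothetical, and the simple exact-match scan stands in for PlantCARE's curated motif database.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def upstream_region(chrom: str, atg_pos: int, strand: str = "+",
                    length: int = 1500) -> str:
    """Sequence immediately upstream of the start codon, in gene orientation.

    atg_pos is the 0-based forward-strand coordinate of the 'A' of the ATG.
    The region is truncated if it runs off the end of the sequence.
    """
    if strand == "+":
        return chrom[max(0, atg_pos - length):atg_pos]
    if strand == "-":
        # Upstream of a minus-strand gene lies at higher forward coordinates
        return reverse_complement(chrom[atg_pos + 1:atg_pos + 1 + length])
    raise ValueError("strand must be '+' or '-'")


def find_motif(seq: str, motif: str) -> list:
    """0-based start positions of all (overlapping) exact matches."""
    positions, start = [], seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


# Toy example: 3 bp upstream of an ATG on the forward strand
print(upstream_region("AAACCCATGGGG", 6, "+", length=3))
# Exact-match scan for the pentamer CGTCA in a toy sequence
print(find_motif("ACGTCAGGCGTCA", "CGTCA"))
```

A full analysis would also scan the reverse strand and allow degenerate motif definitions, which this sketch omits for brevity.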

Expression analysis of CsMADS02 in different tissues
To understand the function of CsMADS02 during the growth and development of cucumber, its expression patterns in different tissues including ovary (unexpanded, unfertilized and fertilized), flower (male and female), root, tendril, tendril base, stem, and leaf were analyzed using public transcriptome data [44]. The results showed that expression was detected in flowers and ovaries, with the highest expression in female flowers (Figure 5). However, its expression was not observed in other tissues, such as leaf, root, stem and tendril (Figure 5).
To study the potential role of CsMADS02 during flower development, its expression was examined in MF1/FF1 to MF5/FF5, staged on the basis of the corolla lengths of flowers [41]. As shown in Figure 6A, the expression of CsMADS02 was not detected at two stages of male flower development (MF1 and MF3), and was relatively higher at MF2, MF4 and MF5. During female flower development, CsMADS02 was constitutively expressed, with the highest expression at FF3 (Figure 6B). The expression of CsMADS02 thus exhibited different patterns between male and female flowers during development (Figure 6), which is consistent with the expression profiles of the SEP genes in other plant species. For example, the Arabidopsis SEP genes are preferentially expressed in flowers, and they are necessary for specifying the identities of floral organs and ovules [13,55]. GmMADS28 is the homolog of AtSEP3 in soybean, and its expression was clearly observed in reproductive organs including flowers, seeds and pods, but not in leaves or roots [49]. Three SEP subfamily genes in Z. jujuba (ZjMADS30, ZjMADS47 and ZjMADS48) were found to have high expression levels in floral organs including sepals, petals and pistils, and might be involved in jujube flower development [52]. The tissue-specific expression of SEP genes in flowers may be related to specific biological functions in flower development. Hence, the conserved expression of SEP subfamily members in floral organs suggests that CsMADS02 might play a role in regulating flower development in cucumber.

Effects of CsMADS02 overexpression on flowering time and leaf morphology in transgenic Arabidopsis
To reveal the potential roles of CsMADS02 in flower development, CsMADS02 was inserted into the plant expression vector pHB (Figure 7A), and expressed in Arabidopsis under the control of the double constitutive cauliflower mosaic virus (CaMV) 35S promoter. Among the 16 independent transgenic plants, 12 displayed an earlier flowering phenotype. Two representative transgenic lines (OE1 and OE2) with constitutive expression were selected for further analysis (Figure 7B). As shown in Figure 8A, the transgenic plants flowered significantly earlier than WT; OE1 and OE2 flowered approximately 10 and 2 days earlier than the WT plants, respectively (Figure 8B), suggesting that CsMADS02 has a significant effect on flowering time. This early flowering phenotype has been reported for some SEP genes when overexpressed in Arabidopsis, such as AtSEP3 [47], LMADS3 [56], TaMADS1 [48], PrpMADS5 [57] and ZjMADS47 [52]. In addition, overexpression of GmMADS28 in tobacco also led to early flowering [49]. CsMADS02 is highly homologous to ZjMADS47, GmMADS28 and AtSEP3, all of which belong to the SEP3 subfamily and share 78.33-86.03% amino acid sequence identity with it (Figures 2 and 3). These results suggest that SEP3 genes can act as significant regulators of flowering time.
Besides the alteration in flowering time, the transgenic plants also exhibited abnormal leaf morphology. Curling at the leaf edge was observed in the transgenic plants at 2 weeks, and became severe at the bolting stage (Figure 8D-F). In contrast to the low-expression transgenic line (OE2), nearly all leaves were severely curled in the high-expression transgenic line (OE1). In addition, the transgenic plants had fewer rosette leaves than WT plants (Figure 8C). These results suggest that CsMADS02 also affects leaf morphological development.
Similar effects on flowering time and leaf morphology in transgenic Arabidopsis have also been reported in previous studies. For example, Arabidopsis plants overexpressing TaMADS1 exhibited earlier flowering as well as smaller and curled leaves compared with WT plants [48]. Besides early flowering, LMADS3-overexpressing Arabidopsis plants showed only two to three small curled rosette leaves and two small curled cauline leaves on the inflorescence [56]. In addition, ectopic expression of SEP genes in other plants may promote flowering by activating the flowering time gene orthologs. For example, the expression of AtSEP3 was found to increase in TaMADS1-overexpressing and ZjMADS47-overexpressing transgenic plants [48,52]. A previous study has also revealed that the E-type SEP proteins are required for the development of floral organs by forming MADS-box protein complexes with proteins of other lineages [3]. Hence, CsMADS02 may be involved in flowering and leaf morphological development through direct or indirect interactions with other MADS-box proteins.

Conclusions
In summary, we isolated and characterized a MADS-box gene, CsMADS02, from cucumber. CsMADS02 belongs to the SEP3 subfamily and is closer to the SEP3 proteins of Z. jujuba and other dicotyledons than to those of monocotyledons. Overexpression of CsMADS02 affects the flowering time and leaf morphology in Arabidopsis, suggesting that it may have similar roles in cucumber. Further studies are needed to functionally characterize CsMADS02 and other class E MADS-box genes during plant development.

Disclosure statement
The authors declare that they have no conflict of interest.