Cloning, expression and functional analysis of the SOC1 homologous gene in pak choi (Brassica rapa ssp. chinensis Makino)

Abstract Flowering transition is important in plant growth and development. The SUPPRESSOR OF OVEREXPRESSION OF CO 1 (SOC1) gene, a member of the MADS-box gene family, is an integrator involved in the regulation of flower formation in plants. Current studies have shown that the structural domain of SOC1 is relatively conserved in a variety of plants. To further investigate the function of the SOC1 homologous gene in pak choi, the full-length coding sequence of pak choi BrcSOC1 was cloned, and the structure, subcellular localization and spatiotemporal expression pattern of BrcSOC1 were analyzed in this study. The tissue-specific expression results showed that BrcSOC1 was expressed in roots, leaves, bolts, flowers and pods. The expression results at different developmental stages showed that BrcSOC1 was expressed at the highest level at floral bud differentiation stage 5. When BrcSOC1 was overexpressed in Arabidopsis thaliana, the transgenic plants flowered significantly earlier than the wild type, and the expression analysis indicated that BrcSOC1 positively regulated the expression of the downstream genes AGAMOUS-LIKE 24 (AGL24) and LEAFY (LFY). These results provide an important reference for further research on the mechanism of BrcSOC1 in the flowering pathway of pak choi.


Introduction
The flower is an important reproductive organ of angiosperms and the most abundantly varied structure during the evolution of angiosperms [1]. In the molecular genetic studies on flower development, the development of floral organs has been most intensively studied. The majority of ABC model genes in plant floral organ models belong to the MADS-box transcription factor family; therefore, MADS-box genes are essential for floral organ development [2]. In 2000, Arabidopsis thaliana 35S::CO suppressor mutants were screened, and 35S::CO was identified as causing four early-flowering mutants, one of which suppresses the overexpression of CO1 (CONSTANS 1) and was named the SOC1 gene [3], a member of the MADS-box gene family. Studies have shown that SOC1 genes and plant flowering-related genes form a complex regulatory network, and the interaction between these genes can promote flowering [4].
SOC1 in Arabidopsis thaliana can act on its downstream flowering genes by regulating the long daylight, vernalization and gibberellin pathways to induce flowering in plants [5,6]. AGL24 is also a member of the MADS-box family [7,8]. It has been shown that AGL24 and SOC1 form a positive feedback loop that regulates downstream flower organ trait genes [9]. When SOC1 is induced at the shoot apex, it forms a critical heterodimeric structure with AGL24 [10], which activates transcription of the floral meristematic tissue characteristic gene LFY. Another floral meristem characteristic gene, APETALA1 (AP1), can be activated by Flowering Locus T (FT), and when LFY and AP1 start to be expressed, the shoot apex begins to differentiate the floral bud primordia according to the ABC model [11]. In addition to the model plant Arabidopsis thaliana, the function of the SOC1 gene has been extensively studied in species such as barley (Hordeum vulgare) [12], soybean (Glycine max) [13], tobacco (Nicotiana tabacum) [14], phalaenopsis (Phalaenopsis jiuhbao) [15] and leaf mustard (Brassica juncea) [16].
Pak choi (Brassica rapa ssp. chinensis Makino) belongs to the genus Brassica in the family Cruciferae. Flower development in Chinese cabbage is regulated by multiple genes in a complex manner. Since Chinese cabbage is a typical cross-pollinated crop with obvious heterosis, genes related to pollen development, pollination and fertilization in Chinese cabbage have been widely studied [17,18]. However, current research on Chinese cabbage flower development has mostly focused on male gametophytes and male sterility, and systematic studies on the subject are lacking. The functions of homologs of SOC1, a key floral integrator, have been verified in a range of plants. A previous study showed that Bra000393 (SOC1) was significantly upregulated at stage 5 in pak choi [19], but the precise functions of SOC1 homologs in pak choi have not been elucidated. Therefore, in this study, BrcSOC1 (GenBank ID: OM339438), a homolog of SOC1, was cloned from pak choi, and its expression pattern was analyzed by quantitative real-time polymerase chain reaction (qRT-PCR). The subcellular location of the BrcSOC1-encoded protein was determined. In addition, BrcSOC1 was overexpressed in Arabidopsis thaliana and the flowering phenotype of the transgenic plants was observed to clarify its function in flowering, laying a foundation for understanding the molecular mechanism of flowering regulation in pak choi.

Plant material and treatment
In this experiment, pak choi '75#', an easily bolting self-incompatible line from Shanxi Agricultural University (Shanxi Academy of Agricultural Sciences), was used as the research material. The seeds were soaked in sterilized distilled water in a beaker at 55 °C ± 5 °C for 1 h. Two layers of sterile filter paper were laid in a sterilized Petri dish, and the soaked seeds were spread evenly on the filter paper. The seeds were placed in a light incubator at 16 h/8 h light/dark, 25 °C ± 1 °C and 55% ± 5% relative humidity for 2 d. After the seeds had sprouted, they were moved to a refrigerator at 4 °C for 20 d for vernalization, during which the light was controlled at 16 h/8 h light/dark. The filter paper was kept moist during the entire process. After vernalization, seedlings were transplanted into 50-hole cavity trays with a substrate ratio of peat:vermiculite:perlite = 3:1:1 and managed in a solarium (maximum temperature 30 °C, minimum temperature 10 °C, light 12 h/12 h) for regular cultivation. Floral bud differentiation was observed during plant growth, and shoot apices were sampled at different developmental stages: the post-vernalization stage (shoot apices with cotyledons), the vegetative stage (10 d after transplanting), stage 0 (15 d after transplanting, immediately prior to floral bud differentiation), stage 1 (floral bud differentiation stage 1) and stage 5 (floral bud differentiation stage 5) [20]. Samples were also taken from different tissues: roots, leaves, bolts, flowers and pods.

RNA extraction and cDNA synthesis
The shoot apexes of '75#' at floral bud differentiation stage 5 were used as material for total RNA extraction using an RNAprep pure plant kit (DP432, TIANGEN Biotech, China). Total RNA integrity was checked by 1% agarose gel electrophoresis, and first-strand cDNA was synthesized using the TransScript® first-strand cDNA synthesis SuperMix (AT301, TransGen Biotech, China).

Full-length coding sequence cloning of BrcSOC1
The nucleotide sequence of Bra000393 (SOC1) in the Brassica database (http://brassicadb.cn), published by the Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, was used as a template to design primers. The full-length CDS primers 393 F and 393 R were designed using the online primer design software Primer 3.0 (Table 1). The SOC1 homologous gene of pak choi was amplified by a standard polymerase chain reaction (PCR) procedure using TaKaRa PrimeSTAR® HS DNA polymerase (R010A, Takara Bio Inc., Japan) on a Mastercycler® nexus thermal cycler (Eppendorf, Germany). PCR products were separated by 1% agarose gel electrophoresis and recovered using a TaKaRa MiniBEST agarose gel DNA extraction kit ver. 4.09 (9762). The recovered fragments were ligated using a TaKaRa pMD™ 19-T vector cloning kit (6013) and transformed into Trans5α chemically competent cells (CD201, TransGen Biotech, China). Single colonies were screened for resistance and verified by PCR, and the positive clones were sent to Sangon Biotech Co., Ltd. (Shanghai, China) for sequencing.

Bioinformatics analysis of BrcSOC1
The sequencing results were analyzed using DNAMAN 6.0, and the protein BLAST tool at NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome) was used to identify homologous amino acid sequences from leaf mustard (Brassica juncea), European oilseed rape (Brassica napus), turnip (Brassica rapa), radish (Raphanus sativus), white mustard (Sinapis alba) and Arabidopsis thaliana. Multiple amino acid sequence alignment and homology tree construction were performed using DNAMAN 6.0 and MEGA 5.1, respectively. To further understand the properties of the protein encoded by the cloned gene, SMART (http://smart.embl-heidelberg.de/) was used to predict the protein structural domains of BrcSOC1.
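The homology percentages reported from the alignment can be understood as the share of aligned positions that carry the same residue. The following is a minimal sketch of that idea only; it is not the DNAMAN 6.0 algorithm, the function name `percent_identity` is hypothetical, the choice to skip gapped columns is an assumption, and the example peptides are invented for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are skipped (an assumption;
    alignment programs differ in how they treat gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented 10-residue peptides differing at one position (not real SOC1 data).
identity = percent_identity("MVRGKTQMKR", "MVRGKTEMKR")
print(f"{identity:.1f}% identity")
```

Because tools differ in gap handling and in whether they report identity or similarity, values computed this way may not match those of a given alignment program exactly.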

Subcellular localization of BrcSOC1
To determine the intracellular location of the BrcSOC1 protein, a subcellular localization vector was constructed via the homologous recombination cloning method. Homologous primers BrcSOC12300-EGFP-F (Kpn I) and BrcSOC12300-EGFP-R (Xba I) were designed (Table 1), the full-length coding sequence of BrcSOC1 (without the stop codon) was amplified, and the plasmid pCAMBIA2300-35S-EGFP was linearized by double digestion with Xba I and Kpn I. The target fragment and linearized vector were ligated using a NovoRec® PCR one-step targeted cloning kit (NR001, Vazyme, China) to generate the recombinant fusion vector pCAMBIA2300-BrcSOC1-EGFP. Subsequently, the negative control (NC) group 35S:EGFP and the experimental group 35S:BrcSOC1-EGFP were transiently expressed in leaves of Nicotiana benthamiana by Agrobacterium tumefaciens strain GV3101-mediated transformation. The transformed material was managed routinely in the growth chamber, and the infiltrated area was cut after 48 h. The tobacco leaf epidermal cells were observed under an Olympus laser confocal scanning microscope (FLUOVIEW FV3000, Japan) with an excitation wavelength of 488 nm and an emission wavelength of 510 nm.

BrcSOC1 gene expression analysis
To clarify the expression pattern of BrcSOC1 in pak choi shoot apexes at different developmental stages and in different tissues, qRT-PCR was used to determine its expression levels in the shoot apex at the post-vernalization stage, 10 d after transplanting, 15 d after transplanting, floral bud differentiation stage 1 and stage 5, as well as in roots, leaves, bolts, flowers and pods. Total RNA was extracted using an RNAprep pure plant kit (DP432), reverse transcription was performed using a TaKaRa PrimeScript® RT reagent kit (RR037A), and each reaction system contained 500 ng of total RNA.
The qRT-PCR primers BrcSOC1-F/R were designed according to the genomic sequence of Brassica rapa V1.5 (Table 1). qRT-PCR was performed on an ABI 7500 real-time PCR instrument using a TaKaRa SYBR® Premix Ex Taq™ II (Tli RNaseH Plus) kit (RR820A). The Chinese cabbage ACTIN gene was used as the internal reference gene, and the primers BraACTIN-F/R (Table 1) were designed and amplified simultaneously with the target gene. Relative quantification was calculated using the 2^(−ΔΔCt) method [21]. Three biological replicates and three technical replicates were performed for each sample.
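The relative quantification step can be illustrated with a short calculation. This is a minimal sketch of the 2^(−ΔΔCt) arithmetic, assuming technical replicates are averaged before subtraction; the function name `relative_expression` is hypothetical and the Ct values are invented for illustration, not data from this study.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Each argument is a list of Ct values (technical replicates).
    dCt  = mean Ct(target) - mean Ct(reference gene)
    ddCt = dCt(sample) - dCt(calibrator)
    """
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2 ** (-dd_ct)


# Invented Ct values: target vs. reference gene in a test sample and in the
# calibrator sample (for example, the earliest sampled stage).
fold = relative_expression(
    ct_target_sample=[22.0, 22.5, 21.5],
    ct_ref_sample=[18.0, 18.0, 18.0],
    ct_target_calibrator=[24.0, 24.0, 24.0],
    ct_ref_calibrator=[18.0, 18.0, 18.0],
)
print(f"fold change = {fold:.2f}")
```

In this invented example the target Ct in the test sample is two cycles lower than in the calibrator after normalization, which corresponds to a fourfold higher relative expression. The method assumes roughly equal amplification efficiency for the target and reference primers.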

Phenotypic analysis of transgenic Arabidopsis thaliana plants
BrcSOC1 was inserted into the plant expression vector pBWA(V)KS, and the obtained pBWA(V)KS-BrcSOC1-GUS plasmid was then transferred into Agrobacterium tumefaciens GV3101 and transformed into Arabidopsis thaliana using the floral dip method [22]. T1 generation plants were obtained by screening on 1/2 Murashige and Skoog (MS) solid medium containing 50 mg L⁻¹ kanamycin (Kan) and were subjected to β-glucuronidase (GUS) reporter gene staining and PCR. Subsequently, the positive transgenic plants were further screened to obtain T2 generation resistant plants after an individual harvest. The T2 generation seeds were sown and transplanted at the same time as seeds of the wild-type A. thaliana ecotype Columbia (Col) and cultured under the same growth conditions. The flowering phenotypes of the transgenic plants and wild-type plants were compared.
Note to Table 1: F indicates forward primer, R indicates reverse primer, and the underline indicates restriction sites.

Gene expression analysis of transgenic Arabidopsis thaliana
To study the expression of BrcSOC1 in transgenic A. thaliana lines and its effect on the expression of downstream genes, the expression levels of BrcSOC1, AtSOC1, AtAGL24 and AtLFY were determined in two transgenic lines of the T2 generation and wild-type Col through qRT-PCR. Fresh and tender rosette leaves of wild-type Col and transgenic plants (more than 100 mg) were cut and rapidly ground in liquid nitrogen. Total RNA was extracted from rosette leaves using an RNAprep pure plant kit (DP432). A NanoDrop was used to determine the RNA concentration and optical density (OD) at 260 nm, and 1% agarose gel electrophoresis was utilized to evaluate RNA integrity. The RNA samples that passed quality control were stored at −80 °C. RNA from T2 generation line 1, line 2 and the Col line was used as the template for reverse transcription using a TaKaRa PrimeScript® RT reagent kit (RR037A). The Arabidopsis thaliana ACT11 gene was used as the internal reference gene, and the internal reference gene primers AthACT11-F/R, AtSOC1-specific primers AtSOC1-F/R, AtAGL24-specific primers AtAGL24-F/R and AtLFY-specific primers AtLFY-F/R were designed (Table 1). The cDNA from wild-type Col and the two T2 generation transgenic lines was used as the template for qRT-PCR.

Data analysis
Quantitative data are presented as average values with standard deviation (±SD). Statistical analysis was performed via Duncan's test using GraphPad Prism 8. Differences were considered statistically significant at the p < 0.05 level.
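As an illustration of how values are summarized as mean ± SD, the sketch below uses Python's standard library. It assumes the sample standard deviation (n − 1 denominator), which the text does not specify; the helper name `mean_sd` and the measurements are hypothetical. Duncan's multiple range test is not reproduced here.

```python
from statistics import mean, stdev


def mean_sd(values):
    """Return (mean, sample standard deviation) for a list of measurements."""
    return mean(values), stdev(values)


# Invented days-to-flowering counts for five plants (not data from this study).
days_to_flowering = [30, 32, 31, 29, 33]
avg, sd = mean_sd(days_to_flowering)
print(f"{avg:.2f} ± {sd:.2f} d")
```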

Cloning and sequence analysis of pak choi BrcSOC1
Primers were designed using the genome sequence of Chinese cabbage, and the fragment was amplified by RT-PCR using pak choi cDNA as the template to obtain the SOC1 homologous gene Bra000393 of pak choi, which was sequenced to give a fragment of 1117 bp. Amino acid translation was performed using DNAMAN 6.0, and the results showed that the open reading frame of Bra000393 contained 642 bp, encoding 213 amino acids. Multiple amino acid sequence alignment using DNAMAN 6.0 showed that BrcSOC1 had high homology values of 99%, 99%, 100%, 99%, 99% and 95%, respectively (Figure 1), with BjSOC1 (leaf mustard, Brassica juncea), BnSOC1 (European oilseed rape, Brassica napus), BrSOC1 (turnip, Brassica rapa), RsSPL8 (radish, Raphanus sativus), SaSOC1 (white mustard, Sinapis alba) and AtSOC1 (Arabidopsis thaliana), indicating that the cloned fragment was indeed pak choi SOC1. The phylogenetic tree constructed using MEGA 5.1 showed that BrcSOC1 belonged to the same branch as the homologs from other cruciferous plants and clustered into a small branch with those of Brassica rapa and Brassica napus (Figure 2). BrcSOC1 was thus most closely related to the B. rapa and B. napus homologs, and it is speculated that they might have similar or even identical functions. Using SMART to predict the protein structural domains, BrcSOC1 was found to have a MADS structural domain at amino acids 1-60 and a K-box structural domain at positions 80-172, indicating that BrcSOC1 belongs to the MADS-box transcription factor family, like other SOC1 homologs.
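The relationship between the reported open reading frame length and protein length can be checked with simple arithmetic: 642 bp is 214 codons, and if the last codon is a stop codon (an assumption consistent with the reported figures), 213 codons encode amino acids. A minimal sketch, with a hypothetical helper name:

```python
def orf_protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length in bp.

    Assumes the ORF length is a multiple of three; when includes_stop is True,
    the final codon is treated as a stop codon and not counted.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# ORF length reported in the text for BrcSOC1.
n_residues = orf_protein_length(642)
print(n_residues)
```

The same arithmetic applies to the MADS domain: 60 amino acids correspond to 180 bp of coding sequence.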

Subcellular localization of BrcSOC1
A previous study showed that AtSOC1 was localized in the nucleus in Arabidopsis thaliana [23]. Since the structural domains of BrcSOC1 are relatively conserved with those of SOC1, WoLF PSORT (https://www.genscript.com/wolf-psort.html?src=leftbar) was first used, and the bioinformatics prediction indicated that BrcSOC1 might be localized in the nucleus, mitochondria or cytoskeleton. To identify the specific location of BrcSOC1 in cells, tobacco leaf transient expression analysis was used to further verify the subcellular localization of BrcSOC1. The pCAMBIA2300-35S-BrcSOC1-EGFP fusion expression vector was transformed into GV3101, which mediated the transient overexpression of the fusion protein in leaf cells of N. benthamiana. The results showed that BrcSOC1 was localized in both the nucleus and cytoplasm (Figure 3); the nuclear signal was consistent with the bioinformatics prediction.
Figure 1 (caption fragment): ... NP_182090.1). The dark blue color shows the same amino acid alignment, the light blue color shows more than two differences in amino acid sequence alignment, and the pink color shows one difference in amino acid sequence alignment.

Gene expression analysis of BrcSOC1
To clarify the expression pattern of BrcSOC1 in pak choi, qRT-PCR was used to quantify its relative expression in the shoot apex at different developmental stages and in different tissues (Figure 4). Overall (Figure 4a), from the post-vernalization transplanting to floral bud differentiation stage 5, the expression of BrcSOC1 first decreased and then increased: it decreased briefly from transplanting to 10 d after transplanting (vegetative growth period), then increased gradually, and increased sharply from floral bud differentiation stage 1 to stage 5, where it reached its highest value. Comparing the expression in different tissues (Figure 4b), BrcSOC1 was expressed in all tissues of pak choi, with the highest expression in leaves, followed in decreasing order by bolts, flowers, roots and pods.

Functional analysis of BrcSOC1 during flowering
After transferring BrcSOC1 into A. thaliana, a total of 42 T1 plants were screened on 1/2 MS solid medium containing 50 mg L⁻¹ Kan, and all of them were confirmed to be transgenic plants by GUS staining and PCR (Figure 5). Compared with the wild-type Col, the T2 generation OE-BrcSOC1 plants required an average of 4.08 fewer days from sowing to flowering; the number of rosette leaves at flowering was reduced by 1.83 on average, and the flowering period was more consistent among the T2 generation plants (Table 2, Figure 5). This result indicated that overexpression of BrcSOC1 in A. thaliana promoted bolting and earlier flowering. The results of qRT-PCR (Figure 6) showed that the expression of the BrcSOC1 gene in fresh and tender leaves of T2 generation transgenic lines was significantly higher than that in the wild-type Col line at the bolting stage, indicating that the BrcSOC1 gene was overexpressed in the transgenic early-flowering lines. The expression level of endogenous AtSOC1 was also slightly higher in the overexpressing plants than in the wild-type plants. Moreover, the genes downstream of SOC1, AtAGL24 and AtLFY, were both highly expressed in the transgenic lines, indicating that BrcSOC1 could positively regulate the flowering-related genes AtAGL24 and AtLFY and thus promote flowering.

Discussion
Recent studies have shown that a large proportion of transcription factors involved in the regulation of flower development belong to the MADS-box gene family, all of which contain a highly conserved MADS-box structural domain with a length of approximately 180 bp; hence, these genes are also called MADS-box genes [24]. The SOC1 homolog cloned in this experiment is a member of the MADS-box gene family and was predicted to contain a MADS structural domain of 60 amino acids, encoded by approximately 180 bp, which is consistent with the results of previous studies [24]. In Brassica, SOC1 is triplicated across the three subgenomes LF, MF1 and MF2; all copies maintain a high sequence identity (92.8% to 100%) and have an important role in the molecular mechanism of flowering regulation in Brassica [25]. In this study, BrcSOC1 showed high homology (99% to 100%) with Brassica homologs, e.g. BjSOC1, BnSOC1 and BrSOC1, and 95% homology with AtSOC1 from the cruciferous model plant A. thaliana, and presumably has similar or even identical functions to these homologous genes. Subcellular localization is an indispensable technical tool for studying gene function; it can assign a protein or expression product to a specific location in the cell and thus provide a direction for understanding the mechanism of action of the gene [26]. Therefore, we performed subcellular localization of BrcSOC1 by transient expression in leaf cells of N. benthamiana. The results showed that BrcSOC1 was localized in both the nucleus and cytoplasm, and it was hypothesized that it could function as a transcription factor. Gene regulatory changes often contribute to species specificity as well as intraspecies variation in complex phenotypes [27].
Note to Table 2: Data are average values with standard deviation (±SD). The significant differences were tested by Duncan's test. a and b represent significant differences between Col and T1, and Col and T2, respectively, at the p < 0.05 level.
There are differences in the expression patterns of homologous genes in different species. Research on A. thaliana has shown that the SOC1 gene encoding the MADS-box transcription factor is mainly expressed in developing leaves and meristematic tissues and is an important integrator in the vernalization, autonomous and photoperiodic pathways to control flowering [28]. The function of AtSOC1 is associated with the development of floral organs and with the integration of flowering-related genes during the floral transition from vegetative to reproductive growth. In other species, the function of the SOC1 homologous gene has also been extensively studied. Two homologs of SOC1 in chrysanthemum, CISOC1-1 and CISOC1-2, were identified; their expression increased under short-day treatment and was higher in the shoot apex and leaves at early plant developmental stages [29]. The expression of SOC1 homologs in B. napus was higher both in flowering plants and in bolting but not yet flowering plants than in non-bolting plants, suggesting that they play a positive regulatory role in the flowering pathway [30]. In addition, homologous genes of SOC1 in plants such as Pyrus bretschneideri [31] and Bambusa oldhamii [32] showed high expression in leaves and flower buds. In this experiment, the expression level of BrcSOC1 reached its peak at floral bud differentiation stage 5, while the expression pattern in different tissues showed the highest expression in leaves, followed by bolts and flowers, which is consistent with the results of previous studies [19] and suggests that BrcSOC1 function may be related to flower development in pak choi.
The flowering-promoting function of SOC1 has been repeatedly confirmed in research ranging from model plants to horticultural crops. The expression of SOC1 increases substantially during the flowering transition, and overexpression of SOC1 positively regulates flowering. In Brassica juncea, both BcSOC1-1 and BcSOC1-2 can advance the flowering time under exogenous gibberellin treatment and low-temperature treatment, and the expression levels of both genes are significantly upregulated [33]. The K-domain of the SOC1-like gene VcSOC1K was used to regulate blueberry MADS-box gene expression, and overexpression of VcSOC1K was found to promote blueberry flowering, which in turn increased the blueberry yield [34]. Overexpression of SOC1 homologs from the horticultural plants Paeonia [35], Eriobotrya japonica [36] and Juglans regia L. [37] promotes flowering in each case, and these homologs have important roles in flowering time regulation and the transition to reproductive growth. In the present study, overexpression of BrcSOC1 in A. thaliana shortened the average flowering time by 4.08 d, further indicating that the BrcSOC1 gene in pak choi could positively regulate flowering.
SOC1 has a role as a regulator in the organogenesis of plant development because it not only regulates the flowering time but also the flowering type and determines the formation of plant floral meristematic tissue [38,39]. Comprehensive analysis of the phenotype of transgenic leaf mustard in the field revealed changes in 11 agronomic traits, such as flowering time, number of lateral branches, number of flower nodes and seed yield. Among these, BniSRB2_SOC1 and BjuVAR20_SOC1 overexpression directly affected two important developmental traits, flowering time and number of lateral branches [16]. In the present study, BrcSOC1 was overexpressed in A. thaliana, and the number of lateral bolts was significantly increased in transgenic plants. SOC1 and LFY are important flowering regulators that integrate flowering signals from multiple pathways in A. thaliana [40]. AGL24 is a dose-dependent flowering promoter that may act downstream of SOC1 and upstream of LFY [8]. Overexpression of strawberry FaSOC1 in A. thaliana leads to early flowering and upregulates LFY expression, while FaSOC1 and AGL24 can interact with each other [41]. In this experiment, BrcSOC1 was overexpressed in A. thaliana and positively regulated the expression of AtAGL24 and AtLFY, suggesting that it might be an integrator of the flowering stage in pak choi and could synergize with multiple flowering genes to promote flowering.

Conclusions
In summary, SOC1 is an integrator of flowering that promotes flowering in a variety of ways. In this experiment, BrcSOC1 was cloned from pak choi and its sequence was analyzed, which showed that its open reading frame contained 642 bp and encoded 213 amino acids. The SMART protein structural domain results indicated that BrcSOC1 belonged to the MADS-box transcription factor family. In addition, the expression pattern of BrcSOC1 at different developmental stages and in different tissues of pak choi was analyzed; its expression peaked at floral bud differentiation stage 5 and, among the tissues examined, was highest in leaves. The subcellular location of BrcSOC1 was determined to be the cell nucleus and cytoplasm. Overexpression in A. thaliana revealed that BrcSOC1 positively regulates flowering and the formation of floral organs. The results of this study provide a theoretical basis for further studies on the molecular mechanisms of flowering regulation in pak choi. Future experiments will aim to identify and validate its role in the molecular network of flowering regulation in pak choi through protein interactions.

Data availability statement
All data that support the findings reported in this study are available from the corresponding author HLP upon reasonable request.

Disclosure statement
No potential conflict of interest was reported by the authors.