Overexpression of GhSWEET42, a SWEET-like gene from cotton, enhances the oil content and seed size

Abstract
SWEET (‘sugars will eventually be exported transporters’) family genes reportedly play a critical role in sugar translocation and oil biosynthesis in various plant species. However, their functions in cotton are unknown. The present study demonstrated that while GhSWEET42 was widely expressed in different cotton tissues, it had the highest expression level in the developing ovules. Hence, it may perform a vital role in seed development. We constructed GhSWEET42 transgenic Arabidopsis lines to verify the biological function of this gene and found that the oil content and weight of the seeds produced by the overexpression lines were 18–23% and 19–20% higher, respectively, than those of the wild-type. Gas chromatography–mass spectrometry (GC–MS) analysis revealed that the increase in oil content in the transgenic seeds was mainly attributable to a relative increase in unsaturated fatty acids (FAs). Moreover, the transgenic seeds exhibited upregulation of certain genes associated with FA and triacylglycerol biosynthesis as well as cell expansion. GhSWEET42 might work synergistically with the aforementioned genes. These findings indicate that GhSWEET42 may be essential in oil biosynthesis and seed development in cotton. The results of the present work may facilitate further explorations into the molecular mechanism of cottonseed oil biosynthesis as well as the cultivation of novel oil-rich cotton varieties.


Introduction
Cotton (Gossypium hirsutum L.) belongs to the Malvaceae family [1]. Currently cultivated species include upland cotton (G. hirsutum L.), island cotton (Gossypium barbadense L.), grass cotton (Gossypium herbaceum L.) and Asian cotton (Gossypium arboreum L.). Of these, upland cotton is the most widely planted [2]. Cotton is one of the most important economic crops in the world and is the major source of natural textile fibre [3,4]. Cotton fibre is considered the principal product of cotton cultivation. The quality of the cotton fibre directly affects that of textiles [5–7]. The seed oil and fibre derived from cotton production have important economic significance [8]. Cotton is not only an important fibre crop but also plays an important role in the oil industry and the supply of edible oil [9]. Oil comprises ∼16% of the total seed weight and is the main by-product of cotton processing [10]. The oil is suitable for human consumption as it has a high unsaturated fatty acid (UFA) content [11]. Dietary cottonseed oil can lower total cholesterol and trans-fatty acid intake and may help prevent certain diseases [12–15]. Cottonseed oil is also an important industrial resource [16] and is used in biodiesel fabrication. In response to increasing energy demand, limited resources and environmental degradation, biodiesel production has attracted research attention [9,17]. Seed oil is a reasonable alternative to conventional petroleum-based diesel as both substances have similar chemical compositions [18]. Compared to fossil fuels, however, cottonseed oil has a negative carbon profile and may significantly reduce greenhouse gas emissions. It is also suitable as a raw material in biodiesel production [19–22]. Conventional oil crops are the main sources of seed oils. The fuels produced from grains can only meet a small proportion of the total demand [18]. Hence, increasing seed oil content may help alleviate the shortage of biodiesel. Moreover, the mechanisms by which seed oil yield can be increased should be identified and applied [23].
Triacylglycerol (TAG) is one of the most important organic compounds in cottonseed. It consists of three fatty acids (FAs) esterified to a glycerol backbone [8]. Seed oil formation and transformation are highly complex and regulated by numerous transcription factors (TFs) and enzymes [24]. While the LPAAT gene may marginally increase cottonseed oil content, upregulating it might significantly increase the total TAG and FA content [25]. The heterologous expression of each GhACCase subunit can increase the oil content of upland cotton seeds [26]. In Arabidopsis (Arabidopsis thaliana), GhWRI1a is highly expressed in the ovules and increases the seed oil content [27]. GhPDATs are expressed during the cottonseed oil accumulation stage, and ectopic GhPDAT1d expression increased the oil content of Arabidopsis seeds [28]. The identification of the genes regulating oil biosynthesis is of high value in the cultivation of high-oil cotton germplasms [29].
The SWEET family comprises newly discovered sugar transporters that function as bidirectional uniporters/facilitators and promote sugar diffusion along the concentration gradients across cell membranes [30]. Sucrose is the main source of carbon-based energy and is delivered to the developing seeds via the phloem [31]. SWEET proteins regulate sugar transport to the seeds and affect seed set, filling and composition. In Arabidopsis, sweet11;12;15 triple mutants displayed retarded embryo development and reduced seed weight and starch and lipid content [32]. In soybean, SWEET10a/b knockout resulted in decreased seed oil content and size [33].
Ideally, increasing the seed oil content should not affect fibre yield or quality. In this manner, the socioeconomic benefits of cotton may be improved and ensured. Fifty-five SWEET genes have been identified in cotton and they have been divided into four groups based on their evolutionary relationships. GhSWEET3, 6, 7, 40, 42, 38 and 51 are all in the first clade of Group III and are highly expressed in the seeds and ovules [34].
Here, we aimed to elucidate the functions of GhSWEET42 by analysing its spatiotemporal expression patterns and heterologously overexpressing it in Arabidopsis. GhSWEET42 expression was substantially higher in the ovules than in the other tissues of cotton plants. Transgenic experiments demonstrated that GhSWEET42 overexpression could increase Arabidopsis seed size and oil content. Furthermore, certain genes governing FA content and seed size were upregulated in the transgenic lines. The findings of the present work provide evidence that GhSWEET42 has a positive regulatory effect on seed size and oil content. They contribute to research into the genes regulating cottonseed oil biosynthesis and provide theoretical and empirical foundations for the breeding of high-oil cotton varieties.

Plant materials and growth conditions
The Arabidopsis Col-0 and upland cotton TM-1 (Texas Marker-1) used in the present study were obtained from the germplasm resources of the School of Agricultural Science and Engineering of Liaocheng University, China. Arabidopsis transgenic and wild-type (WT) seeds were surface sterilized in 1 mL of 75% (v/v) ethanol for 5 min and then in 2.6% (w/v) sodium hypochlorite solution for 10 min and washed 10 times with sterile water [35]. The surface-sterilized seeds were evenly distributed on solid one-half-strength Murashige-Skoog (½ MS) medium (pH 5.8) containing 1.5% (w/v) sucrose and 0.8% (w/v) agar [36]. The seeds were then vernalized in the dark at 4 °C for 3 days and maintained in a culture room at 22 °C under a 16 h light/8 h dark (long-day) photoperiod until maturity. Field planting was carried out in a loess field at the Agricultural College Experimental Field of Liaocheng University (36.4° N, 116.0° E).

Construct preparation and Arabidopsis transformation
The GhSWEET42 coding region sequence (GH_D12G2595) was retrieved from the National Center for Biotechnology Information (NCBI; https://www.ncbi.nlm.nih.gov/) database [34]. Specific amplification primers were designed for the target gene using the online tool Primer3 (https://github.com/primer3-org). The total RNA of the samples was extracted with an E.Z.N.A.® Plant RNA Kit (No. R6827-01; Omega Bio-Tek, Norcross, GA, USA) according to the manufacturer's instructions. The NanoDrop™ Lite Spectrophotometer (No. ND-LITE-PR; Thermo Scientific, Waltham, MA, USA) was used to assess RNA quality. Extracted RNA was then reverse-transcribed into complementary DNA (cDNA) with HiScript® III RT SuperMix for polymerase chain reaction (PCR; +gDNA wiper; No. R312-02; Vazyme Biotech, Nanjing, China). GhSWEET42 was amplified from the cDNA of 3 days post anthesis (DPA) ovules, which was stored at −20 °C until use. PCR amplification was conducted using the high-fidelity enzyme 2× T8 High-Fidelity Master Mix (No. TSE111; Tsingke Biotechnology Co. Ltd., Beijing, China) and cDNA as the template. The PCR conditions were 98 °C for 2 min followed by 34 cycles of 98 °C for 10 s, 60 °C for 10 s and 72 °C for 15 s, with a final extension at 72 °C for 5 min. The GhSWEET42 coding sequence was cloned into the pCAMBIA1305.1 vector using MonClone Single Assembly Cloning Mix (No. 250548; Monad Biotech Co. Ltd., Wuhan, China) and the construct was designated pCAMBIA1305.1-SWEET42 (Supplemental Figure S1). Sangon Biotech (Shanghai, China) verified the accuracy of the sequence by the Sanger method. The PCR primers are listed in Table 1.
The pCAMBIA1305.1-SWEET42 vector was introduced into Agrobacterium tumefaciens GV3101 chemically competent cells (No. TSC-A01; Tsingke Biotechnology Co. Ltd., Beijing, China) and transformed into Arabidopsis by floral dipping [37]. Briefly, the whole inflorescences of healthy plants were immersed in Agrobacterium suspension (OD600 ∼0.8) for 20-30 s, incubated in the dark for 24 h and grown in a greenhouse. The mature, dry seeds were then collected after 1 month. The primary transformants and their progeny were selected on solid ½ MS medium containing 25 mg/L hygromycin. Plants with healthy growth were used to identify transgenic seedlings.
The qRT-PCR conditions were 95 °C for 30 s followed by 34 cycles of 95 °C for 30 s, 60 °C for 30 s and 95 °C for 15 s. The RNA transcript fold changes were calculated by the 2^(−ΔΔCt) method. AtUBQ10 was the internal control for Arabidopsis [36], and GhHIS3 was the internal control for cotton. All qRT-PCR analyses were performed using three independent biological replicates. The qRT-PCR primers are listed in Table 1.
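The 2^(−ΔΔCt) calculation used for the fold changes can be sketched as follows. This is a minimal illustration rather than the analysis script used in the study: the function name and all Ct values are hypothetical, and the sketch assumes roughly 100% amplification efficiency for both the target gene and the internal control (AtUBQ10 or GhHIS3).

```python
from statistics import mean


def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of Ct values from replicate reactions.
    Assumes ~100% amplification efficiency for target and reference.
    """
    # dCt: target Ct normalised to the internal control, per condition
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCt: sample relative to the calibrator condition
    dd_ct = d_ct_sample - d_ct_control
    return 2 ** (-dd_ct)


# Hypothetical Ct values (not data from this study)
fold = fold_change_ddct(
    ct_target_sample=[22.0, 22.1, 21.9],
    ct_ref_sample=[18.0, 18.0, 18.0],
    ct_target_control=[24.0, 24.1, 23.9],
    ct_ref_control=[18.0, 18.0, 18.0],
)
print(f"fold change = {fold:.2f}")
```

A target that crosses threshold two cycles earlier in the sample than in the calibrator, with an unchanged internal control, corresponds to an approximately fourfold higher transcript level.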

FA composition and oil content analysis
The seed lipids were extracted as previously described [38]. Briefly, after the hydrolysis/ether extraction of fat, the sample spiked with an internal standard was saponified and methyl esterified under alkaline conditions. The FA methyl esters were analysed by capillary column gas chromatography, and their content was quantified by the internal standard method. For each group, mature dry seeds (500 mg) were selected and ground, and their lipid content was determined by gas chromatography/mass spectrometry (GC-MS; Trace1310 ISQ; Thermo Fisher Scientific, Waltham, MA, USA). The chromatographic column used was a TG-FAME (50 m × 0.25 mm × 0.20 μm; Thermo Fisher Scientific). The temperature program was as follows: 80 °C for 1 min, then increased to 160 °C at a rate of 20 °C/min and held for 1.5 min, and then increased to 230 °C at a rate of 5 °C/min and held for 6 min. The carrier gas flow rate was set to 0.63 mL/min, and the split ratio was 100:1; the ion source and transmission line temperatures were 280 °C and 240 °C, respectively; the solvent delay time was 4.00 min, and the electron ionization energy was 70 eV.
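As a quick check of the oven program, the total run time can be worked out from the ramp rates and hold times. The sketch below is illustrative only: it assumes that the durations quoted after each ramp (1.5 min and 6 min) are isothermal holds at the target temperature, and the function name is hypothetical.

```python
def oven_program_minutes(start_temp, initial_hold, ramps):
    """Total duration of a GC oven program in minutes.

    ramps: list of (target_temp_C, rate_C_per_min, hold_min) tuples.
    """
    total = initial_hold
    current = start_temp
    for target, rate, hold in ramps:
        total += abs(target - current) / rate  # time spent ramping
        total += hold                          # isothermal hold at target
        current = target
    return total


# Program as read from the text (holds are an assumed interpretation)
run_time = oven_program_minutes(
    start_temp=80, initial_hold=1.0,
    ramps=[(160, 20, 1.5), (230, 5, 6.0)],
)
print(run_time)  # 26.5
```

Under this reading, the run lasts 26.5 min, of which the first 4.00 min fall within the solvent delay.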

Seed weight and size determinations
One thousand mature dry seeds were randomly selected per batch, at least three different batches were used, and the seeds were weighed on an electronic analytical balance (No. ME204E; Mettler Toledo, Columbus, OH, USA). Appropriate amounts of the transgenic and WT seeds were sampled from the same batch and observed under a microscope (No. KL2; Leica Microsystems, Wetzlar, Germany). The lengths and widths of the transgenic and WT seeds were measured and analysed with ImageJ [https://imagej.nih.gov/ij/; National Institutes of Health (NIH), Bethesda, MD, USA].

Data analysis
The data are means ± standard deviation (SD) and were analysed and plotted in GraphPad Prism
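The summary statistics reported here (mean ± SD of replicate measurements, and values expressed relative to the WT set to 100% as in the figure panels) can be reproduced with a few lines of code. The following is a sketch with made-up numbers, not the data from this study, and the helper names are hypothetical.

```python
from statistics import mean, stdev


def summarize(values):
    """Return (mean, sample SD) for a list of replicate measurements."""
    return mean(values), stdev(values)


def relative_to_wt(line_values, wt_values):
    """Mean of a line expressed as a percentage of the WT mean (WT = 100%)."""
    return 100.0 * mean(line_values) / mean(wt_values)


# Hypothetical 1000-seed weights in mg (three batches per genotype)
wt = [20.0, 21.0, 19.0]
oe = [24.0, 25.0, 23.0]

wt_mean, wt_sd = summarize(wt)
oe_mean, oe_sd = summarize(oe)
print(f"WT: {wt_mean:.1f} ± {wt_sd:.1f} mg")
print(f"OE: {oe_mean:.1f} ± {oe_sd:.1f} mg "
      f"({relative_to_wt(oe, wt):.0f}% of WT)")
```

Note that `stdev` returns the sample standard deviation (n − 1 denominator), which is the usual choice for a small number of biological replicates.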

GhSWEET42 expression pattern
We analysed the bioinformatics data from previous studies on cotton SWEET family genes and found that the GhSWEET42 gene was highly expressed in seeds and ovules [34]. Thus, GhSWEET42 might be involved in ovule development in cotton. We used cDNAs from the petals, bracts and ovules of 1, 3, 10, 15 and 20 DPA plants and fibres from 10, 15 and 20 DPA plants in the qRT-PCR analyses to validate the accuracy of the bioinformatics analysis and spatiotemporal GhSWEET42 expression patterns (Figure 1). GhSWEET42 expression was always higher in the ovules than in the other tissues, peaked at 3 DPA and gradually decreased thereafter until maturity. Hence, GhSWEET42 might play a critical role in ovule development.

Molecular characterization of GhSWEET42 transgenic Arabidopsis
Transgenic Arabidopsis lines expressing GhSWEET42 under the control of the 35S promoter were constructed to determine GhSWEET42 function in planta. Six independent transgenic lines were identified and three representative T3 homozygotes were selected for the subsequent analyses.
The RT-PCR analysis showed that GhSWEET42 was highly expressed in the OE-1, OE-2 and OE-3 lines but not in WT (Figure 2A). GhSWEET42 expression was measured and compared by qRT-PCR in three overexpressing transgenic lines (Figure 2B). While GhSWEET42 was upregulated in all three lines, its relative expression level was highest in OE-3.

GhSWEET42 overexpression increases seed oil content in transgenic Arabidopsis
Quantification of the total fatty acid methyl esters (FAMEs) reflects the seed oil content [38]. Here, we measured the total FAME content to determine whether GhSWEET42 overexpression increases seed oil accumulation (Figure 3A). Compared with the WT, OE-1, OE-2 and OE-3 had 18%, 17% and 23% more seed oil, respectively (Figure 3A). We then analysed the FA composition of the seed TAG to identify the changes underlying the increase in total FAME content (Figure 3B,C). Although the FA content was higher in the OE lines than in the WT, the FA composition remained similar among them. Therefore, GhSWEET42 overexpression increased the seed oil content solely by increasing the FA content without affecting its composition.

GhSWEET42 overexpression increases seed size and weight in transgenic Arabidopsis
We measured the size and weight of dry homozygous seeds to determine the influence of GhSWEET42 on seed size in transgenic Arabidopsis (Figure 4). Compared with the WT, the transgenic lines had larger seeds (Figure 4A). The transgenic seeds were 12-14% longer and 14-16% wider than those of the WT (Figure 4C,D). The 1000-seed weight was significantly higher in the transgenic lines than in the WT (Figure 4B), suggesting that GhSWEET42 overexpression affects seed size and weight in transgenic Arabidopsis.

GhSWEET42 overexpression upregulates certain genes governing cell expansion and TAG and FA biosynthesis in transgenic Arabidopsis
The transgenic lines had a relatively higher seed lipid content, possibly because they also had comparatively greater FA and TAG biosynthesis activity as a result of GhSWEET42 overexpression. To test this hypothesis, we selected several genes implicated in FA and TAG biosynthesis that have been previously reported [25,39–45] and measured their transcription levels by qRT-PCR (Figure 5A,B). Acyltransferase is required for the Kennedy pathway in TAG biosynthesis [46]. The GPAT9 and LPAAT2 expression levels were similar in the WT and all transgenic lines. In contrast, the DGAT1 transcription level was about twice as high in the transgenic lines as in the WT (Figure 5A). LPCAT2 was upregulated in all transgenic lines relative to the WT. Phospholipid:diacylglycerol acyltransferase (PDAT) catalyses the final step in TAG generation from diacylglycerol [41]. The PDAT1 and PDAT2 transcription levels were twice and four times higher, respectively, in the transgenic lines than in the WT (Figure 5A). The FATA1, FATB and PDHE-1α expression levels were two to three times as high in the transgenic lines as in the WT (Figure 5B).
To elucidate the molecular mechanism by which GhSWEET42 regulates seed size, several genes that are reported to be involved in cell expansion were selected and their expression levels were measured (Figure 5C). The transcription levels of the expansin genes EXPA1, EXPA8 and EXPA10 were higher in the transgenic lines than in the WT, and the EXPA8 and EXPA10 expression levels were significantly higher than that of EXPA1 in the transgenic lines (Figure 5C).
As overexpression of GhSWEET42 upregulated genes implicated in cell expansion as well as in FA and TAG biosynthesis, this may account for the finding that seed size and oil content were higher in transgenic plants than in WT plants.

Discussion
Sugar is an indispensable carbon source and energy source in plant growth and development [47]. Multicellular organs in plants such as seeds are sinks for fructose, sucrose and other carbohydrates that are translocated to them [30]. This process itself involves the exchange of carbon sources and energy sources [31]. In apoplastic sugar loading, sucrose enters the cell wall space and is transported through it [48]. The MtN3/saliva/SWEET-type genes are evolutionarily conserved in higher eukaryotes [49]. In plants, these genes regulate numerous physiological processes and biochemical reactions during growth and development and under biotic or abiotic stresses [50]. The results of an earlier study on cotton indicated that the genes encoding the SWEET sugar efflux transporters control these processes [34]. As GhSWEET42 is upregulated in the seeds and ovules, it may play vital roles in vegetative and reproductive growth [34]. Given this finding, we assumed that GhSWEET42 is a candidate regulator of seed size and oil content in cotton.
The qRT-PCR results of the present work showed that GhSWEET42 mRNA was expressed at different levels in 11 different samples. GhSWEET42 was expressed at higher levels in the 3 DPA ovule than in any other organ or tissue (Figure 1). Hence, this gene might perform crucial functions in the early stages of ovule development.
The TFs implicated in FA biosynthesis and TAG conversion have been the focus of seed oil content engineering in Arabidopsis and other plant species [23]. A previous investigation showed that overexpression of GmSWEET10a or GmSWEET10b significantly increased the oil content of soybean seeds [33]. In sweet11;12;15 triple mutants of Arabidopsis, the seed lipid content was 71% lower than that of the WT [32]. Homologous genes in different organisms usually have similar biological functions. Here, our results showed that GhSWEET42 overexpression significantly increased the oil content of Arabidopsis seeds (Figure 3A). FAs are essential for the normal functioning of all living organisms [51]. We found that the transgenic lines had higher UFA (linoleic acid (C18:2n6c), linolenic acid (C18:3n3) and eicosapentaenoic acid (C20:5n3 EPA)) levels than the WT. In contrast, the levels of certain saturated fatty acids (SFAs) such as pentadecanoic acid (C15:0), stearic acid (C18:0) and arachidic acid (C20:0) were similar in both the transgenic lines and the WT (Figure 3B,C). Therefore, the observed relative increase in the oil content of the transgenic lines was mainly the result of relative increases in their UFA content.
In the transgenic lines, GhSWEET42 might have regulated the oil content by upregulating other genes governing FA and TAG biosynthesis. Diacylglycerol acyltransferase (DGAT) catalyses the final and rate-limiting step in TAG biosynthesis [52]. In Arabidopsis pollen and seeds, the DGAT and PDAT pathways may cooperate in TAG biosynthesis [41]. PDAT1/2 is expressed in developing seeds and may be a key regulator of hydroxy FA production in the seeds of transgenic plants [53]. Our results showed that the expression levels of various genes regulating FA and TAG biosynthesis were higher in transgenic than in WT Arabidopsis (Figure 5A,B). Based on our experimental findings, we speculated that two factors might explain the observed increases in the oil content of GhSWEET42-overexpressing lines. Firstly, GhSWEET42 may be a key gene in the process of oil biosynthesis. Secondly, overexpression of GhSWEET42 may increase seed oil content by affecting the expression level of other genes related to FA and TAG biosynthesis. Nevertheless, further research is required to test these hypotheses in cotton.
Seed size is an important agronomic trait in crop domestication as it partially determines the ability of the plant to adapt to its external environment and greatly affects yield [48,54]. In soybean, GmSWEET10a or GmSWEET10b knockout and overexpression significantly decreased and increased the 100-grain weight, respectively, compared to the WT [33]. Our results showed that the seed size and 1000-seed weight of the transgenic Arabidopsis lines overexpressing GhSWEET42 were higher than those of the WT (Figure 4). GhSWEET42 may positively regulate both plant development and seed size. Moreover, qRT-PCR revealed that the relative expression levels of several cell expansion-related genes were also higher in the transgenic plants than in the WT (Figure 5C). Similar studies on seed size regulators have been reported recently. Compared to the WT, Arabidopsis plants overexpressing the ZmGS5 promoter were larger and had higher biomass, seed size and weight. The opposite was true for those harbouring antisense ZmGS5 [36]. Arabidopsis overexpressing GhDA1-1A had significantly greater seed size and weight than the WT. Hence, GhDA1-1A is a promising cotton breeding target [55]. KIX8/9 and PPD1/2 disruption resulted in relatively enlarged seeds because of increased cell proliferation and elongation in the integuments [54]. The observed increase in Arabidopsis seed size may be explained by an increase in oil content as oil is the principal storage component in this plant. Our results demonstrated that the seeds of the transgenic lines had a higher oil content than those of the WT (Figure 3).

Conclusions
In the present work, we identified GhSWEET42 by transcriptome analysis and qRT-PCR and discovered that it is preferentially expressed in cotton seeds and ovules. In transgenic A. thaliana, GhSWEET42 increased seed oil biosynthesis and accumulation and improved seed quality. Compared with the WT, the oil content and weight of the seeds of the overexpression lines were 18-23% and 19-20% higher, respectively. Furthermore, the genes governing seed oil biosynthesis and cell expansion were also upregulated. Overall, the present study revealed a novel gene that promotes seed oil biosynthesis in cotton and lays the foundation for clarifying the molecular mechanism of seed oil biosynthesis and breeding high-oil cotton varieties. Though the role of heterologous GhSWEET42 expression in A. thaliana was preliminarily determined, it remains to be established whether the gene performs a similar function in cotton. Future studies should aim to overexpress and knock out GhSWEET42 in cotton to determine the effects of this gene on seed size and oil content. Transcriptome analysis should be performed to identify other genes interacting with GhSWEET42 as well as the regulatory mechanisms involved. If GhSWEET42 effectively increases the seed oil content without affecting other traits such as fibre yield, it may have a positive impact on cotton production.
analysed data and wrote the manuscript. All authors read and approved the final manuscript.

Figure 3. Seed oil content and fatty acid composition of GhSWEET42 transgenic Arabidopsis lines. (A) FA content in the seeds of WT and transgenic lines. (B,C) FA composition (g/100 g) of WT and transgenic seeds. Error bars indicate the standard error (±SE) of three independent biological replicates. Different letters indicate statistically different groups (p < 0.05; ANOVA with Tukey's multiple comparison test). The values shown above the columns are the FA content of the seeds of the transgenic lines relative to that of the WT seeds (100%).

Figure 4. Sizes of seeds of GhSWEET42 transgenic lines. (A) Mature WT, OE-1, OE-2 and OE-3 seeds; bar = 500 μm. (B) Seed weight. (C) Relative seed length. (D) Relative seed width. Error bars indicate the standard error (±SE) of three independent biological replicates. Different letters indicate statistically different groups (p < 0.05; ANOVA with Tukey's multiple comparison test). The values shown above the columns are the seed sizes of the transgenic lines relative to that of the WT seeds (100%).

Figure 5. qRT-PCR analysis of GhSWEET expression in transgenic lines. The samples used were siliques at 7-10 DPA from WT and transgenic Arabidopsis overexpression lines. (A) Expression levels of TAG biosynthesis-related genes. (B) Expression levels of FA biosynthesis-related genes. (C) Expression levels of cell expansion-related genes. AtUBQ10 was the internal control. Different letters indicate statistically different groups (p < 0.05; ANOVA with Tukey's multiple comparison test).

Table 1. Primers used for qRT-PCR in the present study.