The complete chloroplast genome of Solanum sisymbriifolium (Solanaceae), the wild eggplant

Abstract Solanum sisymbriifolium is a critical wild eggplant resource with resistance to many serious diseases that affect eggplant production. In this study, the chloroplast genome of S. sisymbriifolium was sequenced using Illumina high-throughput sequencing technology. The complete chloroplast genome is 155,771 bp long, with a GC content of 37.76%. It consists of a large single-copy region (86,404 bp), a small single-copy region (18,525 bp), and a pair of inverted repeat regions (25,421 bp each). A total of 128 genes were annotated in the chloroplast genome, including 83 protein-coding genes, 37 transfer RNA genes, and eight ribosomal RNA genes. The phylogenetic tree of 17 complete chloroplast genomes shows that S. sisymbriifolium is closely related to Solanum wrightii.

Keywords Solanum sisymbriifolium; chloroplast genome; phylogenetic tree

Solanum sisymbriifolium (Solanum sisymbriifolium Lamarck 1758), a native of South America, is a perennial herbaceous plant belonging to the genus Solanum of Solanaceae. It was previously cultivated in Guangdong and Yunnan provinces in China, and it now grows in the wild in Kunming (Chinese Botanical Society Editorial Board of Chinese Academy of Sciences 1978). In recent years, studies have shown that S. sisymbriifolium is resistant to many serious diseases and pests of Solanaceae (Collonnier et al. 2001), especially verticillium wilt (Fassuliotis and Dukes 1972; Wu et al. 2019), phomopsis blight (Kalda et al. 1977), bacterial wilt (Mochizuki and Yamakawa 1979), and nematodes (Fassuliotis and Dukes 1972; Dias et al. 2012). Because of this resistance, S. sisymbriifolium could be used as an important rootstock for tomato production (Baidya et al. 2017; Deb et al. 2019); it has also been used as a trap crop for potato cyst nematodes (Timmermans et al. 2009; Dias et al. 2017). However, although transcriptome-related research has been carried out (Wu et al. 2019), no genomic information is available for S. sisymbriifolium, which significantly limits its utilization and related research. Here, the complete chloroplast genome of S. sisymbriifolium is reported, providing genomic data for the phylogenetic analysis of the genus Solanum. The results also lay a foundation for conservation genetics and molecular research on this plant.
In this study, fresh leaves of S. sisymbriifolium were collected from the Horticultural Institute of Yunnan Academy of Agricultural Sciences (25°7′27″N, 102°45′46″E), Kunming, China. The specimen was deposited at the Herbarium of Kunming Institute of Botany of CAS (http://www.kun.ac.cn; contact: Xuedan Xie, xiexuedan@mail.kib.ac.cn) under the specimen code KUN184762. A modified CTAB method (Yang et al. 2014) was used to extract high-quality total genomic DNA. The quality and quantity of the extracted DNA were examined using a NanoDrop 2000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA), a Qubit dsDNA HS Assay Kit on a Qubit 3.0 Fluorometer (Life Technologies, Carlsbad, CA, USA), and electrophoresis on a 0.8% agarose gel. The genomic DNA was then sent to Shanghai Majorbio Biopharm Technology Company (Shanghai, China) for sequencing on the Illumina NovaSeq platform. Raw reads were filtered using the NGS QC Toolkit with default parameters to obtain high-quality clean reads (Patel and Jain 2012). The clean reads were trimmed and assembled with NOVOPlasty (Dierckxsens et al. 2017). The assembled sequences were then checked for possible assembly errors by assessing collinearity with related species using MUMmer (http://mummer.sourceforge.net/manual/). Finally, the assembled chloroplast genome was annotated with PGA (Qu et al. 2019).
The complete chloroplast genome of S. sisymbriifolium (GenBank accession number: OL597592) is 155,771 bp long, with an overall GC content of 37.76%. The genome has the characteristic quadripartite circular structure, comprising a large single-copy region (86,404 bp), a small single-copy region (18,525 bp), and a pair of inverted repeat regions (25,421 bp each). In total, the genome contains 83 protein-coding genes, 37 transfer RNA (tRNA) genes, and eight ribosomal RNA (rRNA) genes.
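The reported figures can be cross-checked with a few lines of arithmetic: the four regions of a quadripartite plastome should sum to the total length, and the three gene classes should sum to the 128 genes given in the abstract. The sketch below does this and also includes a small helper showing how a GC percentage is conventionally computed. The helper name `gc_content` and the toy sequence are illustrative assumptions, not part of the study's pipeline.

```python
# Region lengths (bp) as reported in the text.
LSC = 86_404   # large single-copy region
SSC = 18_525   # small single-copy region
IR = 25_421    # one inverted repeat; the genome carries two copies

total_length = LSC + SSC + 2 * IR
total_genes = 83 + 37 + 8  # protein-coding + tRNA + rRNA


def gc_content(seq: str) -> float:
    """Return the GC percentage of a sequence, counting only A/C/G/T bases.

    Hypothetical helper for illustration; ambiguous bases such as N are ignored.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


print(total_length)            # compare against the reported 155,771 bp
print(total_genes)             # compare against the reported 128 genes
print(gc_content("GGCCAATT"))  # toy sequence, not real plastome data
```

Applying `gc_content` to the sequence deposited under OL597592 would be the natural way to reproduce the 37.76% figure, although that step is not run here.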
To explore the phylogenetic position of S. sisymbriifolium within Solanum, 15 complete chloroplast genomes of Solanum species were used to construct a phylogenetic tree, and two species in Solanaceae (Capsicum annuum and Solanum lycopersicum) were selected as the outgroup. These 17 published sequences were obtained from NCBI GenBank. All chloroplast genome sequences were aligned using MAFFT (Katoh and Standley 2013). A neighbor-joining phylogenetic tree was then constructed in MEGA X with 100 bootstrap replicates. The phylogenetic analysis showed that, among the other 16 species, S. sisymbriifolium was most closely related to Solanum wrightii (Figure 1). Together, these results provide a reference for future studies of Solanum chloroplasts.
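Neighbor-joining works from a pairwise distance matrix, and bootstrap support comes from rebuilding that matrix on alignments whose columns have been resampled with replacement. The following is a minimal sketch of those two inputs only, assuming a simple p-distance (proportion of differing sites); it does not reproduce MEGA X, whose substitution model and settings are not stated in the text. The function names and the toy alignment are hypothetical.

```python
import random


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment: dict) -> dict:
    """Pairwise p-distances keyed by (name_i, name_j)."""
    names = list(alignment)
    return {(i, j): p_distance(alignment[i], alignment[j])
            for i in names for j in names}


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}


# Toy alignment for illustration; not real chloroplast data.
toy = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTAT",
    "taxon_C": "ACGAACCTAT",
}

rng = random.Random(0)  # fixed seed so the resampling is repeatable
replicates = [distance_matrix(bootstrap_alignment(toy, rng)) for _ in range(100)]
observed = distance_matrix(toy)
```

Each of the 100 replicate matrices would be passed to a neighbor-joining routine, and the bootstrap value of a clade is the percentage of replicate trees in which that clade appears.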

Ethical approval
The research on plants in this study, including the collection of plant materials, has been carried out in accordance with guidelines provided by the author's institution and national or international regulations.

Author contributions
M.Y.Y. and Y.N.Y. conceived and designed the research framework, and drafted the manuscript. Y.J.G., M.G., and Z.B.L. analyzed the data. R.B. and J.C. cultivated the seedlings and helped with sampling. G.H.D. and L.Y.W. conceived the study, participated in its design and coordination, and helped draft the manuscript. All authors read and approved the final manuscript.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported by the National Natural Science Foundation of

Data availability statement
The genome sequence data that support the findings of this study are openly available in GenBank of NCBI at https://www.ncbi.nlm.nih.gov, reference number OL597592. The associated BioProject, SRA, and BioSample numbers are PRJNA809910, SRR18136644, and SAMN26209660, respectively.