Chloroplast genome of silverleaf nightshade (Solanum elaeagnifolium Cav.), a Weed of National Significance in Australia

Abstract Solanum elaeagnifolium Cav. is a widely distributed weed and is recognized as a Weed of National Significance in Australia. This study sequenced the chloroplast (cp) genome of S. elaeagnifolium, which is 155,049 bp in length, including a large single-copy region of 85,426 bp, a small single-copy region of 18,419 bp and two inverted repeats of 25,602 bp each. A total of 130 genes were annotated. A phylogeny of S. elaeagnifolium and 42 other Solanum chloroplast genomes suggested that S. elaeagnifolium is closely related to Solanum species from the section Melongena.


Chloroplast genome; silverleaf nightshade; Solanum elaeagnifolium; Leptostemonum
Silverleaf nightshade (Solanum elaeagnifolium Cav.) is one of the worst agricultural weeds worldwide. It is native to North America and is widely distributed beyond its native range (Gopurenko et al. 2014). In Australia, S. elaeagnifolium is listed as a Weed of National Significance and causes up to 77% yield loss in cereals (Stanton et al. 2009). The lack of a reliable method to distinguish S. elaeagnifolium from other Solanum species, however, has often resulted in misidentification (X Zhu et al. 2018, XC Zhu et al. 2011), causing unnecessary delays in control and missed opportunities for early eradication (Hosking et al. 1996).
The chloroplast (cp) genome is a circular DNA molecule that generally ranges from 120 to 160 kbp and typically consists of a large and a small single-copy region (LSC and SSC) and two inverted repeats (IRs). The cp genome is highly conserved and lacks recombination. It has therefore been used to develop DNA barcodes for species identification (Raubeson and Jansen 2005). More complete cp genomes are becoming available as next-generation sequencing technology reduces the time and cost of sequencing.
In this study, we obtained the complete cp genome sequence of S. elaeagnifolium. The sequence provides information for the future development of molecular tools to improve identification, thereby contributing to the effective management of S. elaeagnifolium.
A fresh leaf sample of S. elaeagnifolium was collected from Wagga Wagga, New South Wales, Australia (N -35.101776, E 147.386991). This sample was preserved at Wagga Wagga Agricultural Institute (voucher ww19479). Genomic DNA was extracted using a modified CTAB method (Doyle 1987). The genomic DNA was sequenced on an Illumina HiSeq 2000 platform at Beijing Genomics Institute (BGI, Hong Kong). A total of 13,324,346 raw reads were generated with an average read length of 125 bp. The raw reads were quality-controlled with readfq v5 (https://github.com/lh3/readfq) to remove adapter sequences, duplicates and low-quality reads. The cp genome was obtained by de novo assembly using SOAPdenovo2 with the k-mer size optimized to 61 (Luo et al. 2012). The assembled S. elaeagnifolium cp genome was 155,049 bp in length, consisting of an 85,426 bp LSC region, an 18,419 bp SSC region and two IRs of 25,602 bp each. The GC content of the S. elaeagnifolium cp genome was 37.8%, with the highest GC content found in the IR regions (43.1%), followed by the LSC at 35.8% and the SSC at 32.0%. The assembled chloroplast genome was deposited in GenBank under accession number KX792501.
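The reported region lengths and GC values can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original analysis. The dictionary layout and the function names (`total_length`, `weighted_gc`, `gc_content`) are illustrative assumptions. Only the numbers are taken from the text.

```python
# Region lengths (bp) and GC fractions as reported in the text.
# The IR is present in two copies; names and layout here are illustrative.
regions = {
    "LSC": {"length": 85_426, "copies": 1, "gc": 0.358},
    "SSC": {"length": 18_419, "copies": 1, "gc": 0.320},
    "IR": {"length": 25_602, "copies": 2, "gc": 0.431},
}


def total_length(regions):
    """Sum of region lengths, counting each copy of a repeated region."""
    return sum(r["length"] * r["copies"] for r in regions.values())


def weighted_gc(regions):
    """Length-weighted mean GC fraction across all regions."""
    gc_bases = sum(r["length"] * r["copies"] * r["gc"] for r in regions.values())
    return gc_bases / total_length(regions)


def gc_content(seq):
    """GC fraction of a nucleotide string, ignoring symbols other than A, C, G, T."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


print(total_length(regions))                  # genome length in bp
print(round(100 * weighted_gc(regions), 1))   # genome-wide GC in percent
```

With the values above, the region lengths sum to the reported 155,049 bp, and the length-weighted GC content rounds to the reported 37.8%. `gc_content` shows how the per-region values could be computed from a sequence string, for example one extracted from accession KX792501.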
The S. elaeagnifolium cp genome was annotated with CpGAVAS under default settings, using S. bulbocastanum, S. lycopersicum and S. tuberosum as references, followed by verification using local BLAST and manual adjustments. A total of 110 distinct genes were annotated, including 4 ribosomal RNA genes, 28 transfer RNA genes, and 78 protein-coding genes. Duplicated genes in the cp genome include nine protein-coding genes, seven tRNA genes, and four rRNA genes, making a total of 130 genes. Introns were found in nine protein-coding genes atpF, rpoC1, ycf3, clpP, rpl2, ndhA and ndhB.
Sequences of S. elaeagnifolium and 42 published Solanum species were aligned using MAFFT v7 (Katoh and Standley 2013), and the phylogeny was inferred using MrBayes v3.2 with the GTR + I + G model (Ronquist and Huelsenbeck 2003). Capsicum annuum (MH559323) was used as an outgroup. Indels were excluded from the analysis. Four clades were formed in the phylogenetic tree (Figure 1). Solanum tuberosum, S. lycopersicum and related species (Potato major clade) were clustered into separate clades. Solanum nigrum (Morelloid major clade) and S. dulcamara (Dulcamaroid major clade) were grouped together. Solanum elaeagnifolium and S. melongena related species from the major clade of Leptostemonum formed a highly supported clade.

Figure 1. Phylogenetic relationships among S. elaeagnifolium (underlined) and 42 other Solanum species, inferred from the complete chloroplast genomes. * Branches with Bayesian posterior probability lower than 90%.
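The text does not state how indels were excluded before tree inference. One common interpretation is complete deletion, in which every alignment column containing a gap in any sequence is removed. The sketch below illustrates that interpretation only. The function name `drop_gap_columns` and the toy alignment are hypothetical and are not drawn from the study.

```python
def drop_gap_columns(alignment, gap_chars="-"):
    """Return a copy of the alignment without any column that contains a gap.

    `alignment` maps sequence names to aligned strings of equal length.
    This implements complete deletion, an assumed reading of
    "indels were excluded"; the original workflow may have differed.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns in which no sequence carries a gap character.
    kept = [col for col in zip(*seqs) if not any(c in gap_chars for c in col)]
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


# Toy alignment with hypothetical taxon names.
toy = {
    "taxon_a": "ATG-CA",
    "taxon_b": "ATGTCA",
    "taxon_c": "A-GTCA",
}
print(drop_gap_columns(toy))
```

In the toy alignment, columns 2 and 4 each contain a gap and are removed, so every sequence is reduced to the four gap-free columns.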

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported by Biosecurity SA, Primary Industries and Regions South Australia.

Data availability statement
The data that support the findings of this study are openly available in GenBank at http://www.ncbi.nlm.nih.gov, accession number KX792501.