Characterization of the complete chloroplast genome of the Solanum tuberosum L. cv. Favorita (Solanaceae)

Abstract
Potato (Solanum tuberosum L.), a species of the family Solanaceae, is the fourth most important food crop worldwide. Solanum tuberosum L. cv. Favorita is a long oval, smooth, yellowish-skinned potato variety with green and plump leaves. Its tubers have a dry matter content of 17.7% and a starch content of 12.4–14.01%. To provide more genetic data for the taxonomy of S. tuberosum, the complete chloroplast (cp) genome sequence of S. tuberosum L. cv. Favorita was determined using next-generation sequencing. In leaves, the chloroplast genome accounts for 5.17% of the total genome. The entire cp genome was determined to be 155,296 bp in length. It contained large single-copy (LSC) and small single-copy (SSC) regions of 85,737 and 18,373 bp, respectively, which were separated by a pair of 25,593 bp inverted repeat (IR) regions. The genome contained 132 genes in total, including 87 protein-coding genes, 37 tRNA genes, and eight rRNA genes. The overall GC content of the genome is 37.9%. A phylogenetic tree reconstructed from 60 chloroplast genomes reveals that S. tuberosum L. cv. Favorita is most closely related to S. tuberosum L. cv. Desiree and S. tuberosum L. cv. Atlantic.

Keywords: Solanum tuberosum; complete chloroplast genome; phylogenetic analysis; Solanaceae

Solanum tuberosum L. (family: Solanaceae) has high nutritional value, broad adaptability, and high yield. It is the largest non-cereal food crop worldwide and ranks as the world's fourth most important food crop after rice, wheat, and maize (Horton and Sawyer 1985; Zhang et al. 2017). Solanum tuberosum L. cv. Favorita (https://www.europotato.org/varieties/view/Favorita-E#/) is a long oval, smooth, yellowish-skinned potato variety with green and plump leaves. Compared with other varieties, its tubers have a medium dry matter content of 17.7% and a starch content of 12.4–14.01%. Since the main sites of starch synthesis are amyloplasts and chloroplasts, and the chloroplast genome contains many genes involved in starch synthesis, it is necessary to characterize the chloroplast genome of the potato Favorita. In addition, because Favorita is an important parent material for potato breeding, the chloroplast genome will enrich the genetic information available for sprout mutation breeding and cross-breeding of potato Favorita (Duan et al. 2019).
Healthy leaf samples were collected from a tissue culture plant (125.417353°E, 43.821995°N). Total genomic DNA was extracted from the fresh leaves of S. tuberosum L. cv. Favorita using the DNeasy Plant Mini Kit (Qiagen, Valencia, CA, USA). The voucher specimen (JAUHL07) was deposited at the Herbarium of the College of Vegetable Science, Jilin Agricultural University. After DNA isolation, 1 μg of purified DNA was fragmented and used to construct short-insert libraries (insert size ~350 bp) according to the manufacturer's (BGISEQ) instructions, as detailed previously (Huang et al. 2017). The DNA libraries were then sequenced by Hefei Bio&Data Biotechnologies Inc. (Hefei, China) on the BGISEQ-500 platform with PE150 read lengths. The filtered reads were assembled using NOVOPlasty version 3.8.3 (Dierckxsens et al. 2017). The cp genome was annotated with GeSeq (Tillich et al. 2017) and tRNAscan-SE (Schattner et al. 2005).
In leaves of S. tuberosum L. cv. Favorita, the chloroplast genome accounts for 5.17% of the total genome. Such a high proportion of the chloroplast genome may have contributed to its green and plump leaves and high starch production. The chloroplast genome was determined to be a double-stranded, circular DNA molecule of 155,296 bp containing two inverted repeat (IR) regions of 25,593 bp each, separated by large single-copy (LSC) and small single-copy (SSC) regions of 85,737 and 18,373 bp, respectively (GenBank acc. no. MW307948). The genome contained 132 genes in total, including 87 protein-coding genes, 37 tRNA genes, and eight rRNA genes. Seven protein-coding genes, six tRNA genes, and four rRNA genes were duplicated in the IR regions. Nineteen genes contained two exons, and four genes (clpP, ycf3, and two copies of rps12) contained three exons. The overall GC content of the S. tuberosum L. cv. Favorita cp genome is 37.9%, and the corresponding values in the LSC, SSC, and IR regions are 36.0, 32.1, and 43.1%, respectively. Heteroplasmy testing showed about 134 low-frequency SNP sites with minor allele frequency (MAF) ≥0.03 and ≥5× read coverage in the chloroplast genome of potato Favorita. Most of these SNPs are located between ycf15 and trnL-CAA.
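The reported figures can be cross-checked with simple arithmetic: the LSC, the SSC, and two IR copies should sum to the total genome length, the three gene classes should sum to the gene total, and a length-weighted mean of the regional GC values should approximate the overall GC content. The following Python sketch is an illustration only, not part of the authors' workflow. All variable and function names are hypothetical, and because the regional GC percentages are rounded to one decimal place, the weighted mean is an approximation.

```python
# Consistency check of the quadripartite structure reported in the text.
# All names here are illustrative; only the numeric values come from the paper.

REGION_LENGTHS_BP = {"LSC": 85_737, "SSC": 18_373, "IR": 25_593}
REGION_GC_PERCENT = {"LSC": 36.0, "SSC": 32.1, "IR": 43.1}
REGION_COPIES = {"LSC": 1, "SSC": 1, "IR": 2}  # the IR occurs twice

REPORTED_TOTAL_BP = 155_296
REPORTED_GC_PERCENT = 37.9

GENE_COUNTS = {"protein_coding": 87, "tRNA": 37, "rRNA": 8}
REPORTED_GENE_TOTAL = 132


def total_length(lengths, copies):
    """Sum region lengths, counting each region once per copy."""
    return sum(lengths[name] * copies[name] for name in lengths)


def weighted_gc(lengths, gc_percent, copies):
    """Length-weighted mean GC percentage across all region copies."""
    weighted_sum = sum(
        lengths[name] * copies[name] * gc_percent[name] for name in lengths
    )
    return weighted_sum / total_length(lengths, copies)


def main():
    length = total_length(REGION_LENGTHS_BP, REGION_COPIES)
    gc = weighted_gc(REGION_LENGTHS_BP, REGION_GC_PERCENT, REGION_COPIES)
    genes = sum(GENE_COUNTS.values())

    print(f"Summed length: {length} bp (reported {REPORTED_TOTAL_BP} bp)")
    print(f"Weighted GC: {gc:.2f}% (reported {REPORTED_GC_PERCENT}%)")
    print(f"Summed genes: {genes} (reported {REPORTED_GENE_TOTAL})")


if __name__ == "__main__":
    main()
```

With the values above, the regions sum to 155,296 bp, the gene classes sum to 132, and the weighted GC content is about 37.88%, which rounds to the reported 37.9%.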
To investigate its taxonomic status, a maximum-likelihood (ML) tree was reconstructed based on whole chloroplast genomes from 59 Solanum species and one outgroup species (Price 2010). The ML phylogenetic tree shows that S. tuberosum L. cv. Favorita is most closely related to S. tuberosum L. cv. Desiree and S. tuberosum L. cv. Atlantic, with bootstrap support values of 100% (Figure 1). The chloroplast genome of S. tuberosum L. cv. Favorita adds valuable information for understanding the phylogenetic position of S. tuberosum in the genus Solanum.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Data availability statement
The data that support the findings of this study are openly available in NCBI GenBank under accession number MW307948 (https://www.ncbi.nlm.nih.gov/nuccore/MW307948.1). The raw sequencing reads used in this study were deposited in the public repository SRA under accession number SRR13162919 (https://www.ncbi.nlm.nih.gov/sra/?term=SRR13162919).