The complete chloroplast genome of Camellia confusa Craib 1914, an economically valuable oil crop

Abstract Camellia confusa Craib 1914 is an industrially valuable oil crop from southern China for which little genetic information is available. Here, we report that its complete chloroplast genome is a circular sequence (156,905 bp) with a large single-copy region (LSC) of 67,724 bp, a small single-copy region (SSC) of 18,400 bp, and two inverted repeats (IRs). In total, 130 genes were identified, including 86 protein-coding genes, 36 tRNA genes, and 8 rRNA genes. Phylogenetic analysis showed that C. confusa is closely related to C. meiocarpa. These results provide valuable information for accelerating research on the evolution of camellias.

Camellia confusa Craib 1914, best known for its beautifully shaped flowers and industrially valuable oil, is a woody plant that is mainly cultivated in the Yunnan and Guizhou provinces of China, as well as in Laos, Thailand, and Vietnam (Liu et al. 2018). It belongs to the ancient genus Camellia, which contains about 200 species (Fan et al. 2021). Nonetheless, little genetic information is available for C. confusa. The chloroplast is an indispensable organelle in photosynthetic organisms, and its genome is an ideal model for taxonomic classification because of its small size and conserved structure. Here, we describe the complete chloroplast genome of C. confusa and compare it with those of its relatives.
Leaves of C. confusa were sampled at the Research Institute of Subtropical Forestry, Chinese Academy of Forestry (RISF), Hangzhou, China (119°95′E, 30°07′N). All collections were approved by the head of RISF. All samples were collected by Si Wu on May 10, 2021. The specimens were stored in the laboratory at the RISF under voucher number YL914711 (Xinlei Li, lixinlei2020@163.com). Total DNA was extracted using the MiniBEST Plant Genomic DNA Extraction Kit (Takara, Dalian, China) and sequenced on the Illumina HiSeq 4000 platform (Illumina, San Diego, California, USA) at Genesky Biotechnologies (Shanghai, China). We obtained a total of 25,129,048 reads, of which 23,938,766 clean reads remained after quality control with Trimmomatic (Bolger et al. 2014). SPAdes version 3.10.1 (Bankevich et al. 2012) was used to assemble the chloroplast genome, and Velvet Optimizer version 2.2.5 was used to optimize the assembly. We used CpGAVAS2 (Shi et al. 2019) to annotate the final genome. BLASTp at NCBI was used to confirm the annotation of the protein-coding sequences. The annotated chloroplast genome sequence was deposited in NCBI GenBank (accession number MW034673).
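The effect of the quality-control step can be summarized from the two read counts reported above. The following is a minimal sketch in Python; the function name `retention_summary` and its output fields are illustrative assumptions and are not part of the original analysis pipeline.

```python
def retention_summary(raw_reads: int, clean_reads: int) -> dict:
    """Summarize how many reads survived quality control.

    Hypothetical helper for illustration only; it is not part of
    Trimmomatic or of the authors' pipeline.
    """
    if raw_reads <= 0:
        raise ValueError("raw_reads must be positive")
    if not 0 <= clean_reads <= raw_reads:
        raise ValueError("clean_reads must lie between 0 and raw_reads")

    removed = raw_reads - clean_reads
    return {
        "removed": removed,
        "retained_pct": 100.0 * clean_reads / raw_reads,
        "removed_pct": 100.0 * removed / raw_reads,
    }


# Read counts as reported in the text.
summary = retention_summary(raw_reads=25_129_048, clean_reads=23_938_766)
print(f"removed reads: {summary['removed']:,}")
print(f"retained: {summary['retained_pct']:.2f}%")
print(f"removed:  {summary['removed_pct']:.2f}%")
```

With the reported counts, 1,190,282 reads were removed, so roughly 95.3% of the raw reads were retained for assembly.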
To investigate the evolutionary relationships of C. confusa with other members of the genus, we performed a phylogenetic analysis of thirteen Camellia species. MEGA v7.0.14 was used to construct the phylogenetic tree by the maximum likelihood method (Kumar et al. 2016). The resulting tree showed that C. confusa is closely related to C. meiocarpa (Figure 1).

Author contributions
Si Wu drafted the manuscript; Meiying Yang and Menglong Fan performed the data analysis; Ying Zhang and Xinlei Li designed the study and critically revised the manuscript for intellectual content; Hengfu Yin and Jiyuan Li carried out the literature search, data acquisition, and manuscript editing. All authors approved the final version to be published and agree to be accountable for all aspects of the work.

Disclosure statement
No potential conflict of interest was reported by the author(s).