The complete chloroplast genome sequence of Trapa kozhevnikoviorum Pshenn. (Lythraceae)

Abstract
Trapa (Lythraceae) is an economically important aquatic genus used for food and medicine, with a wide distribution in Asia, Africa, and Europe. Species identification, genetic studies, and utilization of Trapa are limited by a lack of molecular data. Herein, we report the complete chloroplast (cp) genome sequence of a wild species, Trapa kozhevnikoviorum Pshenn. The cp genome of T. kozhevnikoviorum is 155,545 bp in length, consisting of a pair of inverted repeat regions (IRa/IRb) of 24,388 bp each, separated by a small single copy (SSC) region of 18,275 bp and a large single copy (LSC) region of 88,494 bp. A total of 113 unique genes, including 79 protein-coding, 30 tRNA, and four rRNA genes, were annotated. Phylogenetic analysis based on 15 whole cp genomes of Lythraceae species supported the monophyly of Trapa and placed T. kozhevnikoviorum, T. bicornis, and T. natans together in one clade.


Background
Trapa L., commonly known as water chestnut, is a genus in the family Lythraceae comprising approximately 30 species distributed across the temperate and subtropical regions of Asia, Africa, and Europe (Chen et al. 2007). Trapa is an economically important genus used for food in China and India because of the high protein and starch content of its seeds. Its seed coat is used as an antimicrobial medicinal resource in many other countries (Karg 2006; Artyukhin et al. 2019). The morphological complexity of Trapa species has posed a taxonomic challenge for researchers worldwide, although previous studies agreed that fruit size is a crucial classification criterion (Takano and Kadono 2005; Fan et al. 2016, 2021). Wetland degradation due to human activity and climate change has reduced population sizes in some Trapa taxa, and a few of them are endangered (Batsatsashvili and Machutadze 2014; Frey et al. 2017). Among such Trapa taxa, T. kozhevnikoviorum is a four-horned water chestnut with large fruit that is sporadically distributed in the Tumen River Basin, on the border between Russia and China. Only four natural populations of T. kozhevnikoviorum were found in recent field investigations (Xue et al. 2016), which makes the conservation of extant populations and their genetic information a concern. Here, we report the complete chloroplast (cp) genome sequence of T. kozhevnikoviorum and its phylogenetic position among other species within Lythraceae.

Methods
Samples of T. kozhevnikoviorum were collected from Jixi City, Heilongjiang Province, China (N46°53′9.5″, E133°3′8.9″). The voucher specimen has been deposited in the Herbarium of Wuhan Botanical Garden (voucher number: yychen20180042; Yuanyuan Chen, yychen@wbgcas.cn). DNA was extracted from 0.3 g of silica-dried leaf tissue using the CTAB protocol (Doyle and Doyle 1987) with a minor modification: 3× CTAB buffer was used. DNA library construction and sequencing were performed at Novogene Co. Ltd. (Beijing, China). DNA libraries were prepared with an insert size of 350 bp using the NEBNext Ultra DNA Library Prep Kit. Paired-end sequencing (150 bp reads) was performed on an Illumina NovaSeq 6000 platform (San Diego, CA). The cp genome was assembled using GetOrganelle v1.7.1 with default parameters (Jin et al. 2020). The resultant genome was annotated using Plastid Genome Annotator (PGA) (Qu et al. 2019) with T. maximowiczii (NC037023) and T. bicornis (NC049010) as reference genomes. Geneious 2020.2.3 (www.geneious.com) was used for further manual annotation with reference to T. bicornis. The annotated cp genome was deposited in GenBank under accession number MW027640. Using DnaSP v.6 (Rozas et al. 2017) and a Python script (https://www.biostars.org/p/119214/), we identified the insertions/deletions (indels) and single nucleotide polymorphisms (SNPs) between T. kozhevnikoviorum and the three published Trapa species.
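To illustrate what counting indels and SNPs between two plastomes involves, the following is a minimal sketch, not the script or the DnaSP procedure used in this study. It assumes the two sequences have already been aligned to equal length with `-` as the gap character, counts a SNP as a column where both sequences carry different unambiguous bases, and counts each contiguous run of gaps in one sequence as a single indel event. The function name `count_snps_indels` and the toy sequences are hypothetical.

```python
def count_snps_indels(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Count SNPs and indel events in a pairwise alignment.

    Assumptions (illustrative only): sequences are pre-aligned, '-' marks a
    gap, and a run of consecutive gap columns in the same sequence is one
    indel event.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    bases = set("ACGT")
    snps = 0
    indels = 0
    gap_side = None  # which sequence carried the gap in the previous column

    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            # Column gapped in both sequences carries no pairwise signal.
            continue
        if a == "-" or b == "-":
            side = "a" if a == "-" else "b"
            if side != gap_side:
                indels += 1  # a new gap run starts here
            gap_side = side
        else:
            gap_side = None
            # Only unambiguous mismatches count; columns with N etc. are skipped.
            if a in bases and b in bases and a != b:
                snps += 1
    return snps, indels


if __name__ == "__main__":
    # Toy alignment: one substitution, one 2-bp gap run, one 1-bp gap run.
    toy_a = "ACGT--ACGTA"
    toy_b = "ACCTGGAC-TA"
    print(count_snps_indels(toy_a, toy_b))
```

Counts obtained this way depend on the alignment and on how gap runs and ambiguous bases are treated, so they need not match the figures produced by other tools.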

Results
The cp genome of T. kozhevnikoviorum exhibited a typical quadripartite structure of length 155,545 bp, consisting of a pair of inverted repeat regions (IRa/IRb) of length 24,388 bp each, separated by a small single copy (SSC) region of 18,275 bp and a large single copy (LSC) region of 88,494 bp. The GC contents of the IR, SSC, and LSC regions were 42.8%, 30.2%, and 34.2%, respectively, with a total GC content of 36.4%. The cp genome encoded a total of 130 genes, including 113 unique genes (79 protein-coding, 30 tRNA, and four rRNA) and 17 duplicated genes (six protein-coding genes (PCGs), seven tRNA genes, and four rRNA genes). Eleven PCGs contained one intron, and two PCGs (ycf3 and clpP) contained two introns. There were minor differences among large-fruit species, with 27 indels and 60 SNPs between T. kozhevnikoviorum and T. natans, and 23 indels and 66 SNPs between T. kozhevnikoviorum and T. bicornis. In contrast, substantial differences, with 236 indels and 1,412 SNPs, were found between T. kozhevnikoviorum and the small-fruit T. maximowiczii.
The phylogenetic tree supported the close relationship between Trapa and Sonneratia, which was reported in previous studies (Yu et al. 2018; Gu et al. 2019; Sun et al. 2020). Trapa was recovered as monophyletic, with the large-fruit Trapa species (T. kozhevnikoviorum, T. bicornis, and T. natans) forming a clade closely related to the small-fruit T. maximowiczii (Figure 1), suggesting distinct genetic divergence between the two clades and a basal position for the small-fruit species T. maximowiczii.
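The reported figures can be cross-checked arithmetically. The sketch below, using only the values stated above, sums the four regions of the quadripartite structure, estimates the total GC content as a length-weighted average of the per-region values, and adds up the gene counts. The dictionary layout and function names are illustrative assumptions, and because the per-region GC percentages are rounded, the weighted average is only approximate.

```python
# Region lengths (bp) and GC fractions as reported in the text.
# The IR is present in two copies (IRa and IRb).
REGIONS = {
    "LSC": {"length": 88_494, "copies": 1, "gc": 0.342},
    "SSC": {"length": 18_275, "copies": 1, "gc": 0.302},
    "IR": {"length": 24_388, "copies": 2, "gc": 0.428},
}


def genome_length(regions: dict) -> int:
    """Total length: LSC + SSC + 2 x IR."""
    return sum(r["length"] * r["copies"] for r in regions.values())


def weighted_gc(regions: dict) -> float:
    """Length-weighted mean GC fraction across all regions."""
    gc_bases = sum(r["length"] * r["copies"] * r["gc"] for r in regions.values())
    return gc_bases / genome_length(regions)


# Gene counts as reported: unique genes and duplicated copies.
unique_genes = 79 + 30 + 4      # protein-coding + tRNA + rRNA
duplicated_genes = 6 + 7 + 4    # duplicated PCGs + tRNA + rRNA
total_genes = unique_genes + duplicated_genes

if __name__ == "__main__":
    print("genome length (bp):", genome_length(REGIONS))
    print("weighted GC (%):", round(weighted_gc(REGIONS) * 100, 1))
    print("unique / total genes:", unique_genes, "/", total_genes)
```

With these inputs the region lengths sum to 155,545 bp, the weighted GC content comes to about 36.4%, and the gene counts give 113 unique and 130 total genes, all consistent with the reported values.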

Disclosure statement
No potential conflict of interest was reported by the author(s).