The complete chloroplast genome of Utricularia tenuicaulis Miki (Lentibulariaceae) isolated in Korea

Abstract
Utricularia tenuicaulis Miki 1935 is an aquatic carnivorous plant species found in East Asia including Korea and Japan. In this study, the chloroplast genome of U. tenuicaulis was successfully sequenced. The assembled genome (153,976 bp; GC ratio, 37.0%) contains four subregions, with the large single copy (LSC; 84,596 bp; 34.9%) and small single copy (SSC; 17,946 bp; 30.5%) regions separated by 25,718 bp of inverted repeat regions (42.7%), and includes 126 genes (81 protein-coding genes, 8 rRNAs, and 37 tRNAs). Phylogenetic analyses based on the whole-chloroplast genomes of 18 species, including 17 Lentibulariaceae species and one outgroup species, suggest a close relationship between U. tenuicaulis and Utricularia macrorhiza Leconte 1824. A comparison of genomic variation between U. tenuicaulis and U. macrorhiza confirmed the validity of the specific discrimination of U. tenuicaulis.

The U. tenuicaulis chloroplast genome (GenBank accession no. MN529625) is 153,976 bp in length, with a GC ratio of 37.0%, and has four subregions: the large single copy (LSC; 84,596 bp; 34.9%) and small single copy (SSC; 17,946 bp; 30.5%) regions are separated by two inverted repeats (IRs; 25,718 bp; 42.7%). The genome contains 126 genes (81 protein-coding genes, 8 rRNAs, and 37 tRNAs), of which 17 (6 protein-coding genes, 4 rRNAs, and 7 tRNAs) are duplicated in the IR regions. We delimited the four subregions by locating the junctions of the two IR regions with the program 'BLAST 2 Sequences', which runs a BLAST search to find the duplicated regions.
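The idea behind locating the IR junctions can be illustrated with a minimal sketch. It is not the BLAST-based procedure used in this study: it swaps in a brute-force longest-common-substring search between a sequence and its own reverse complement, which is only practical for short toy sequences. The function names and the toy sequences are hypothetical, the genome is treated as linear, and mismatches within the repeat are not tolerated.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_common_substring(s, t):
    """Return (start_in_s, start_in_t, length) of the longest exact shared substring."""
    best_len, best_s_end, best_t_end = 0, 0, 0
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                # Extend the match that ended at the previous character pair.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_s_end, best_t_end = cur[j], i, j
        prev = cur
    return best_s_end - best_len, best_t_end - best_len, best_len


def find_inverted_repeat(seq):
    """Return (copy_a_start, copy_b_start, length) of the longest exact inverted repeat.

    A substring shared by seq and its reverse complement is an inverted repeat;
    the second copy is recovered by mapping reverse-complement coordinates
    back onto the forward strand.
    """
    rc = reverse_complement(seq)
    a_start, rc_start, length = longest_common_substring(seq, rc)
    b_start = len(seq) - rc_start - length
    return a_start, b_start, length


# Hypothetical toy "genome" laid out as LSC + IR + SSC + reverse-complemented IR.
lsc = "AAACCCAAACCCAAACCC"
ssc = "CACACACAAC"
ir = "ATGGCTAGCTTACGGA"
genome = lsc + ir + ssc + reverse_complement(ir)

ira_start, irb_start, ir_len = find_inverted_repeat(genome)
print(ira_start, irb_start, ir_len)
```

The start and end coordinates of the two copies mark the four junctions, which in turn delimit the LSC and SSC regions. For a real chloroplast genome, an alignment tool such as BLAST is the appropriate choice, because the quadratic search above does not scale to sequences of about 150 kb.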
Seventeen species of Lentibulariaceae, with Lippia origanoides Kunth. 1818 (Verbenaceae) as an outgroup, were used for phylogenetic analysis. We used MEGAX (Kumar et al. 2018) to construct maximum-likelihood (ML) and neighbor-joining (NJ) trees and MrBayes v3.2.6 (Ronquist et al. 2012) to perform Bayesian inference (BI) after aligning the full chloroplast genomes using MAFFT v7.450 (Katoh and Standley 2013). We performed a heuristic search using nearest-neighbor interchange branch swapping, the Tamura-Nei model, and uniform rates among sites to construct ML and NJ phylogenetic trees, with default values for other options. To estimate node confidence, we performed bootstrap analyses with 1,000 and 10,000 pseudoreplicates for ML and NJ trees, respectively. For BI analysis, we used the general-time-reversible (GTR) model with gamma rates as the molecular model and ran a Markov chain Monte Carlo algorithm for 1,100,000 generations. To build the BI consensus tree, we sampled trees every 200 generations after removing the first 100,000 generations as burn-in. All phylogenetic trees inferred from the ML, NJ, and BI methods showed the same topology, with the three genera of Lentibulariaceae grouped with strong support (Figure 1). Our phylogenetic analysis indicated that U. tenuicaulis is closely related to but distinct from U. macrorhiza, as the phylogenetic distance between the two species is similar to or larger than those among Genlisea species (Figure 1).
These results suggest that U. tenuicaulis is an independent taxon, genetically distinguished from U. macrorhiza. Further sequencing analysis including a wider range of taxa is necessary to clarify the phylogenetic relationships of Utricularia species in greater detail.
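As a worked check of the BI sampling settings given in the phylogenetic analysis (1,100,000 generations, one tree sampled every 200 generations, the first 100,000 generations removed as burn-in), the short sketch below counts the trees left for the consensus. The function name is hypothetical, and the counting convention is an assumption: samples are taken at generation 0 and at every multiple of the sampling interval, and only samples strictly after the burn-in boundary are kept. The software's own convention may shift the count by one.

```python
def retained_sample_generations(total_generations, sample_every, burnin_generations):
    """List the sampled generations that remain after discarding burn-in.

    Assumption: sampling happens at generation 0 and every `sample_every`
    generations after that; samples at or before the burn-in boundary are dropped.
    """
    sampled = range(0, total_generations + 1, sample_every)
    return [g for g in sampled if g > burnin_generations]


# Settings taken from the text.
kept = retained_sample_generations(1_100_000, 200, 100_000)
print(len(kept), kept[0], kept[-1])
```

Under this convention, 5,000 sampled trees contribute to the consensus, that is, (1,100,000 - 100,000) / 200.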

Acknowledgments
We thank Ms. Kumsoon Lee for her advice on the sampling location in Korea.

Ethical approval
The authors declare that there was no ethical or legal violation in obtaining the study materials and performing the research. The species used in this study is not listed in the IUCN Red List, and plant materials were collected at a location that is not designated as a protected area in Korea. The authors confirmed that the plant materials for this study were not

Data availability statement
The chloroplast genome sequence data that support the findings of this study are openly available in GenBank of NCBI at https://www.ncbi.nlm.nih.gov/ under the accession no. MN529625. The associated BioProject, BioSample, and Sequence Read Archive numbers are PRJNA764600, SAMN21509661, and SRR15970505, respectively.