Complete chloroplast genome sequence of Heritiera angustata (Malvaceae): an endangered plant species

Abstract
Heritiera angustata (Malvaceae) is an evergreen tree distributed in the Chinese provinces of Hainan and Yunnan and in Cambodia. In China, it is listed as 'Endangered' (EN) in the China Red List of Higher Plants, although it is not protected by law. The complete chloroplast (cp) genome sequence of this threatened species is reported in this study, based on high-throughput (Illumina) sequencing. The complete cp genome is 168,953 bp in length and contains a pair of inverted repeat regions (IRs) of 34,491 bp each, a large single copy (LSC) region of 89,054 bp, and a small single copy (SSC) region of 10,917 bp. The cp genome contains 129 genes, consisting of 85 protein-coding genes, 36 tRNA genes and eight rRNA genes. The overall AT content of the H. angustata cp genome is 63.2%. Phylogenetic analysis indicates a close relationship between H. angustata and Firmiana pulcherrima.

Keywords
Heritiera angustata; chloroplast genome; Illumina sequencing; phylogenetic analysis

Heritiera angustata Pierre (Malvaceae) is an evergreen tree widely distributed in mountains and near coastal areas in the southeastern region of Hainan Province and in Yunnan Province, China, as well as in Cambodia (Tang et al. 2007). It is listed as 'Endangered' (EN) in the China Red List of Higher Plants (MEP-CAS 2013). Despite its threatened status, it is not protected by law in China, although some of its populations lie within nature reserves. Genetic diversity is a prerequisite for evolutionary adaptation; thus, preserving genetic diversity should be a key element of species conservation plans (Hamrick et al. 1991). To inform a comprehensive conservation strategy for this threatened species, an improved understanding of its genomic information is urgently needed. Here, the complete chloroplast genome of H. angustata (GenBank: this study) is determined and presented, based on Illumina paired-end sequencing data.
In this study, we sampled a healthy individual of H. angustata from Diaoluo Mountain (18.67°N, 109.88°E), a National Nature Reserve in Hainan, China. We used a modified CTAB method (Doyle and Doyle 1987) to extract total genomic DNA of H. angustata from silica gel-dried leaves. A voucher specimen of H. angustata (H.-F. Wang B38) was deposited at the Institute of Tropical Agriculture and Forestry, Hainan University (Haikou, China). Whole-genome shotgun sequencing was performed on the Illumina HiSeq 2500 platform with 150 bp paired-end reads. After filtering and trimming with NGSQC-Toolkit v2.3.3 (Patel and Jain 2012), the clean reads, with an average coverage of 300×, were used for reference-based chloroplast genome assembly with MITObim v1.8 (Hahn et al. 2013). All genes were annotated with GENEIOUS R8.0.2 (Biomatters Ltd., Auckland, New Zealand). The chloroplast genome of Theobroma cacao (HQ336404.2) (Jansen et al. 2011) was used as the reference for assembly and annotation. The annotation was corrected with the Dual Organellar Genome Annotator (DOGMA) software (Wyman et al. 2004), and a circular plastid genome map was generated with OGDRAW (Lohse et al. 2013).
The cpDNA of H. angustata was a circular molecule of 168,953 bp with a quadripartite structure: a pair of inverted repeats (IRs) of 34,491 bp each, separated by a large single copy (LSC) region of 89,054 bp and a small single copy (SSC) region of 10,917 bp. It contained 129 genes, including 85 protein-coding genes (76 PCG species), eight ribosomal RNA genes (four rRNA species) and 36 tRNA genes (29 tRNA species). Among these, 13 genes (trnA-UGC, trnI-GAU, trnK-UUU, trnL-UAA, trnV-UAC, atpF, ndhA, ndhB, petB, petD, rpoC1, rpl2, and rpl16) harboured a single intron and one gene (ycf3) had two introns. The majority of gene species in H. angustata occurred as a single copy. The overall AT content of the H. angustata chloroplast genome was 63.2%, while the corresponding values for the LSC, SSC and IR regions were 65.1%, 68.28% and 59.95%, respectively.
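The reported figures can be cross-checked with simple arithmetic: the region lengths should sum to the genome length (counting the IR twice), the gene categories should sum to 129, and the length-weighted mean of the regional AT contents should match the overall value. The following is a minimal sketch of that check; the data layout and function names are illustrative assumptions, and only the numbers come from the text.

```python
# Region lengths (bp) and AT percentages as reported in the text.
# The dictionary layout and function names are illustrative assumptions.
regions = {
    "LSC": {"length": 89_054, "copies": 1, "at_pct": 65.1},
    "SSC": {"length": 10_917, "copies": 1, "at_pct": 68.28},
    "IR": {"length": 34_491, "copies": 2, "at_pct": 59.95},
}

# Gene counts as reported in the text.
gene_counts = {"protein_coding": 85, "tRNA": 36, "rRNA": 8}


def total_length(regions):
    """Sum region lengths, counting each inverted repeat copy."""
    return sum(r["length"] * r["copies"] for r in regions.values())


def weighted_at_pct(regions):
    """Length-weighted mean AT percentage across all regions."""
    at_bases = sum(
        r["length"] * r["copies"] * r["at_pct"] / 100 for r in regions.values()
    )
    return 100 * at_bases / total_length(regions)


if __name__ == "__main__":
    print("Genome length (bp):", total_length(regions))
    print("Overall AT content (%):", round(weighted_at_pct(regions), 1))
    print("Total genes:", sum(gene_counts.values()))
```

Under these inputs, the region lengths reproduce the stated genome size of 168,953 bp and the weighted AT content rounds to the stated 63.2%, so the regional and whole-genome values are mutually consistent.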
A maximum likelihood (ML) phylogenetic tree of the nine published complete chloroplast genomes of Malvales (plus H. angustata) was built with RAxML (Stamatakis 2006), using Braya humilis (Brassicaceae, Brassicales) as the outgroup (Figure 1). The phylogenetic analysis indicated that all members of Malvales clustered together with high bootstrap support (BS) and that H. angustata is closely related to Firmiana pulcherrima. In this paper, we report the characterization of the complete chloroplast genome of H. angustata for the first time, which may provide a useful resource for conservation genetics studies of H. angustata and for phylogenetic studies of Malvales.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This study was funded by the National Natural Science Foundation of China (31660055 and 31660074) and a start-up grant from Hainan University (kyqd1633).