The complete chloroplast genome sequence of Populus rotundifolia, and a comparative analysis with other Populus species

ABSTRACT: Populus rotundifolia, which is endemic to the Himalayas and adjacent regions, is the species that occupies the highest habitat in the genus Populus. The complete chloroplast genome sequence of P. rotundifolia was characterized from Illumina paired-end sequencing. The chloroplast genome was 155,212 bp in length, containing a large single-copy region (LSC) of 84,545 bp, a small single-copy region (SSC) of 16,859 bp, and two inverted repeat (IR) regions of 26,904 bp each. The overall GC content was 36.80%, while the corresponding values of the LSC, SSC, and IR regions were 34.5%, 30.5%, and 42.3%, respectively. The genome contains 131 complete genes, including 86 protein-coding genes (62 protein-coding gene species), 37 tRNA genes (29 tRNA species), and 8 rRNA genes (4 rRNA species). Neighbour-joining phylogenetic analysis showed that P. rotundifolia and Populus davidiana clustered together as a sister group to other Populus species.


Introduction
Populus rotundifolia, whose habitat is at the highest elevation among all Populus species (Salicaceae), occurs in mountainous areas between 2300 m and 4500 m a.s.l. in the Himalaya and adjacent areas. This aspen is an ideal model for ecological and evolutionary studies, since it is a dominant species in high mountain ecosystems. However, largely because of anthropogenic cutting and climate change, its natural habitats have been fragmented, and nothing is known about its genetic background (Zheng et al. 2017; Hou et al. 2018). P. rotundifolia plays an important ecological role in boreal and temperate forests, providing wildlife habitat and protecting watersheds; it can dominate riparian forests but is ecologically adaptable. The species combines a wide geographic distribution, high intraspecific polymorphism, and adaptability to different environments with a relatively small genome size. Consequently, P. rotundifolia represents an excellent model for understanding how different evolutionary forces have sculpted the patterns of variation in the genome during population differentiation and ecological speciation (Neale and Antoine 2011). Moreover, knowledge of the genetic information of P. rotundifolia would make it easier to develop conservation strategies. In the present study, we assembled the complete chloroplast genome of P. rotundifolia and characterized its genomic variation, which will support population genetic studies of the species.
Fresh leaves of P. rotundifolia were collected from Xizang (29°30'N, 92°15'E). The leaves were silica-dried and stored in the laboratory until DNA extraction. The voucher specimen (YYY001) was deposited in the Herbarium of Chongqing University of Arts and Sciences, and the extracted DNA was stored in the −80 °C freezer of the Key Laboratory of College of Landscape Architecture and Life Science.
We extracted total genomic DNA from 25 mg of silica-gel-dried leaf tissue using a modified CTAB method (Doyle 1987). Whole-genome sequencing was then conducted by Biodata Biotechnologies Inc. (Hefei, China) on the Illumina HiSeq 2000 platform (Illumina, San Diego, CA). We used the software MITObim 1.8 (Hahn et al. 2013) and metaSPAdes (Nurk et al. 2017) to assemble the chloroplast genome, with P. tremula (GenBank: NC_027425) as the reference genome. We annotated the chloroplast genome with the software DOGMA (Wyman et al. 2004), and then corrected the results using Geneious 8.0.2 (Campos et al. 2016) and Sequin 15.50 (http://www.ncbi.nlm.nih.gov/Sequin/).
The complete chloroplast genome of P. rotundifolia (GenBank accession number MT482542) was characterized from Illumina paired-end sequencing. The genome was 155,212 bp in length, containing a large single-copy region (LSC) of 84,545 bp, a small single-copy region (SSC) of 16,859 bp, and two inverted repeat (IR) regions of 26,904 bp each. The overall GC content was 36.80%, while the corresponding values of the LSC, SSC, and IR regions were 34.5%, 30.5%, and 42.3%, respectively. The genome contains 131 complete genes, including 86 protein-coding genes (62 protein-coding gene species), 37 tRNA genes (29 tRNA species), and 8 rRNA genes (4 rRNA species).
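The reported figures can be cross-checked with simple arithmetic. The following Python sketch is an illustration only and is not part of the authors' workflow; the function names are hypothetical. It sums the quadripartite regions, compares a length-weighted GC estimate built from the rounded regional values with the reported overall value, and shows how GC content could be computed from a sequence string (a toy sequence is used here, not the MT482542 record).

```python
# Region lengths and GC percentages as reported in the text.
LSC_BP = 84_545
SSC_BP = 16_859
IR_BP = 26_904  # length of EACH of the two inverted repeats
GC_LSC, GC_SSC, GC_IR = 34.5, 30.5, 42.3


def quadripartite_length(lsc, ssc, ir):
    """Total plastome length: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


def weighted_gc(parts):
    """Length-weighted mean GC percentage from (length, gc_percent) pairs."""
    total = sum(length for length, _ in parts)
    return sum(length * gc for length, gc in parts) / total


def gc_content(seq):
    """Percent G+C among unambiguous A/C/G/T bases (hypothetical helper)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence has no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


total_bp = quadripartite_length(LSC_BP, SSC_BP, IR_BP)
# The IR contribution is counted twice because there are two copies.
gc_estimate = weighted_gc([(LSC_BP, GC_LSC), (SSC_BP, GC_SSC), (2 * IR_BP, GC_IR)])

print(total_bp)                    # 155212, matching the reported length
print(round(gc_estimate, 2))       # close to the reported 36.80%
print(gc_content("ATGCGCAT"))      # toy sequence, 50.0
print(86 + 37 + 8)                 # gene categories sum to 131
```

The weighted estimate differs slightly from 36.80% because the regional values are rounded to one decimal place.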
To confirm the phylogenetic position of P. rotundifolia within the genus Populus, we used the complete chloroplast genome sequences of P. rotundifolia and 13 other related species of Populus and Salix interior as the outgroup to construct a phylogenetic tree. The 14 chloroplast genome sequences were aligned with MAFFT (Katoh and Standley 2013), and the Neighbour-joining tree was then constructed with MEGA 7.0 (Kumar et al. 2016). The results confirmed that P. rotundifolia clustered with P. davidiana (Figure 1).
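Neighbour-joining works from a matrix of pairwise distances between aligned sequences. As a minimal sketch, the Python code below computes uncorrected p-distances from a toy alignment. This is an assumption made for illustration: the text does not state which distance model was used in MEGA, and the taxon names and sequences here are hypothetical placeholders, not the chloroplast genomes analysed above.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap or ambiguous base in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    valid = set("ACGT")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            if x != y:
                diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def distance_matrix(alignment):
    """Symmetric dict-of-dicts of p-distances for a {name: sequence} alignment."""
    names = list(alignment)
    dist = {name: {name: 0.0} for name in names}
    for m, n in combinations(names, 2):
        d = p_distance(alignment[m], alignment[n])
        dist[m][n] = d
        dist[n][m] = d
    return dist


# Hypothetical toy alignment (placeholder names and sequences).
toy_alignment = {
    "taxon_A": "ATGCATGCAT",
    "taxon_B": "ATGCATGCAA",
    "taxon_C": "ATGAAT-CTA",
}

dist = distance_matrix(toy_alignment)
for name, row in dist.items():
    print(name, {other: round(d, 3) for other, d in row.items()})
```

A neighbour-joining implementation would then repeatedly join the pair of taxa that minimises the method's Q criterion on such a matrix; that step is left to dedicated software here.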

Disclosure statement
No potential conflict of interest was reported by the author(s).

Data availability statement
The data that support the findings of this study are openly available in the National Center for Biotechnology Information (NCBI) at https://www.ncbi.nlm.nih.gov, accession number MT482542.