Two complete chloroplast genome sequences of genus Paulownia (Paulowniaceae): Paulownia coreana and P. tomentosa

Abstract The nucleotide sequences of the two chloroplast (cp) genomes of Paulownia coreana and P. tomentosa are the first to be completed in the genus Paulownia of the family Paulowniaceae. The structure of the two Paulownia cp genomes is similar to that of typical angiosperm cp genomes. The two cp genomes are 154,545 bp and 154,540 bp long, respectively. Each is divided into a large single-copy (LSC) region (85,241 bp and 85,236 bp) and a small single-copy (SSC) region (17,736 bp in both) by two inverted repeat (IR) regions (25,784 bp each in both genomes). Both cp genomes contain 113 genes (79 protein-coding genes, 30 tRNA genes and 4 rRNA genes), of which eight protein-coding genes, seven tRNA genes and four rRNA genes are duplicated in the IR regions. As in typical angiosperm cp genomes, 18 of the genes in each cp genome have one or two introns. The overall A-T content of both genomes is 62.0%, which is similar to that of angiosperms in general. The A-T content is higher in the non-coding regions (64.6%) than in the coding regions (60.1%). Seventy-one and 70 simple sequence repeat (SSR) loci were identified in the P. coreana and P. tomentosa cp genomes, respectively. In the phylogenetic analysis, the genus Paulownia shows a close relationship with Lindenbergia philippensis of Orobanchaceae.

Keywords Paulowniaceae; Paulownia coreana; Paulownia tomentosa; chloroplast genome

The genus Paulownia, which includes eight species, is one of the four genera in the family Paulowniaceae (Nakai 1949; Olmstead et al. 2001; APG 2016). Paulownia species are fast-growing trees used as ornamentals, as a material for instruments, and in agroforestry systems. We sequenced and analyzed the chloroplast (cp) genomes of Paulownia coreana Uyeki and P. tomentosa Steud. P. coreana is a controversial, taxonomically unresolved species that shows no significant morphological difference from P. tomentosa.
Plant material of P. coreana and of P. tomentosa was each collected from a single individual planted at Korea University. Voucher specimens (KUS 2014-1539, KUS 2014-1540) and DNA samples (PDBK 2014-1539, PDBK 2014-1540) were deposited in the Korea University Herbarium and the Plant DNA Bank in Korea (PDBK), respectively. The chloroplast genomes were sequenced using Illumina MiSeq (San Diego, CA) and assembled with Geneious 8.1.7 (http://www.geneious.com, Kearse et al. 2012). The complete cp genome sequences were deposited in the NCBI database under accession numbers KP718622 and KP718624, respectively.
The complete cp genome sequences of P. coreana and P. tomentosa are 154,545 bp and 154,540 bp long, respectively. The cp genome of P. coreana is composed of an LSC region of 85,241 bp, an SSC region of 17,736 bp and two IR regions of 25,784 bp each, whereas the cp genome of P. tomentosa is composed of an LSC region of 85,236 bp, an SSC region of 17,736 bp and two IR regions of 25,784 bp each. Both cp genomes consist of 113 individual genes, including 79 protein-coding genes, 30 transfer RNA genes and four ribosomal RNA genes. Among them, eight protein-coding genes, seven tRNA genes and four rRNA genes are duplicated in the IR regions. As in the cp genomes of other angiosperms such as Panax and Sesamum, 18 of the genes in each cp genome have one or two introns. Of these, rps12, clpP and ycf3 have two introns (Shinozaki et al. 1986; Kim & Lee 2004; Yi & Kim 2012).
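The reported region lengths can be cross-checked against the total genome lengths, since a quadripartite cp genome is the sum of the LSC, the SSC and two IR copies. The following is a minimal sketch of that arithmetic; the class and variable names are illustrative and are not taken from the original analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpGenome:
    """Region lengths (bp) of a quadripartite chloroplast genome."""
    name: str
    lsc: int  # large single-copy region
    ssc: int  # small single-copy region
    ir: int   # length of ONE inverted repeat copy

    def total_length(self) -> int:
        # The IR occurs twice (IRa and IRb), so it is counted twice.
        return self.lsc + self.ssc + 2 * self.ir


# Lengths as reported in the text above.
P_COREANA = CpGenome("P. coreana", lsc=85_241, ssc=17_736, ir=25_784)
P_TOMENTOSA = CpGenome("P. tomentosa", lsc=85_236, ssc=17_736, ir=25_784)

# Gene categories as reported in the text above.
GENE_COUNTS = {"protein_coding": 79, "tRNA": 30, "rRNA": 4}

if __name__ == "__main__":
    for genome in (P_COREANA, P_TOMENTOSA):
        print(f"{genome.name}: {genome.total_length():,} bp")
    print("unique genes:", sum(GENE_COUNTS.values()))
```

With these inputs the totals agree with the reported 154,545 bp and 154,540 bp, and the 5 bp difference between the two genomes lies entirely in the LSC region.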
The major portion of the P. coreana and P. tomentosa cp genomes consists of gene-coding regions (57.4% in both), made up of protein-coding regions (51.1% and 51.2%) and RNA regions (6.2% in both), whereas the intergenic spacers (including 23 introns) comprise 42.6% of both cp genomes. The overall A-T content of both genomes is 62.0%, which is similar to that of angiosperms in general and to that of other cp genomes of Lamiaceae and of some genera of Orobanchaceae (Shinozaki et al. 1986; Kim & Lee 2004; Yi & Kim 2012; Wicke et al. 2013; Zhu et al. 2014; Welch et al. 2016). In both genomes, the A-T content is higher in the non-coding regions (64.6%) than in the coding regions (60.1%). The A-T content of the IR regions is 52.8% in both cp genomes, and the A-T contents of the LSC and SSC regions are 64.0% and 67.6%, respectively. Seventy-one and 70 SSR loci with more than ten repeats were identified in the P. coreana and P. tomentosa cp genomes, respectively.
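As an illustration of how A-T content and SSR counts of this kind can be derived from a sequence, the sketch below computes the A-T percentage and lists mononucleotide repeats. It is not the procedure used in this study: the text does not name the SSR tool or the motif sizes considered, so the restriction to mononucleotide motifs, the function names and the default threshold of ten repeats are all assumptions.

```python
import re


def at_content(seq: str) -> float:
    """Return the A-T content of `seq` in percent, ignoring non-ACGT symbols."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("A") + seq.count("T")) / acgt


def find_mono_ssrs(seq: str, min_repeats: int = 10):
    """Find mononucleotide SSRs with at least `min_repeats` units.

    Returns (0-based start, base, run length) tuples. The threshold is an
    assumption; adjust it to match the definition used in a given study.
    """
    if min_repeats < 1:
        raise ValueError("min_repeats must be >= 1")
    # One base followed by (min_repeats - 1) or more copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [
        (m.start(), m.group(1), m.end() - m.start())
        for m in pattern.finditer(seq.upper())
    ]


if __name__ == "__main__":
    # Toy sequence, not real Paulownia data.
    toy = "GC" + "A" * 12 + "GGCC" + "T" * 9 + "C"
    print(f"A-T content: {at_content(toy):.1f}%")
    print("SSRs:", find_mono_ssrs(toy))
```

Applied to a full cp genome sequence, the same functions could be run per region (LSC, SSC, IR) by slicing the sequence at the region boundaries.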
For the phylogenetic analysis, we compiled 54 complete cp DNA sequences from the Lamiales clade and two outgroup sequences from Rubiaceae in Gentianales. A total of 79 protein CDSs, including rrn genes, were aligned for the 56 analyzed taxa. The aligned data matrix consists of a total of 85,408 bp. An ML tree was obtained with -lnL = 458,425.7176 using the GTR + G + I base substitution model (Figure 1). Consistent with the APG system, the genus Paulownia forms a monophyletic group that is closely related to Orobanchaceae (Olmstead et al. 2001, 2009; Bremer et al. 2002; APG 2016).