The complete chloroplast genome of Casuarina cunninghamiana (Casuarinaceae)

Abstract
Casuarina cunninghamiana Miq. occurs naturally in eastern Australia, from New South Wales to north Queensland. Since its introduction to China, it has become an important tree species in ecological shelter plantations along the coast of southern China. In this study, the complete chloroplast (cp) genome of C. cunninghamiana was sequenced on the Illumina NovaSeq 6000 platform and analyzed. The cp genome is 156,129 bp in length, comprising a large single copy (LSC) region of 86,200 bp and a small single copy (SSC) region of 18,457 bp, separated by two inverted repeats (IRs) of 25,736 bp each. It contains 132 genes: 87 protein-coding genes, 37 tRNA genes, and eight rRNA genes. The overall GC content is 36.34%. Phylogenetic analysis indicated that C. cunninghamiana is closely related to C. glauca and C. equisetifolia and clusters with four Betulaceae species.

Casuarina cunninghamiana Miq. is an important tree species of considerable ecological and economic value (Doran and Hall 1983; Jiang et al. 2012) and has been extensively planted as timber forest and windbreaks in the inland and coastal regions of China. It is one of the most successful species of Casuarinaceae introduced to China (Zhong and Bai 1996; Zhong et al. 2010). The phylogenetic relationship of Casuarinaceae to other plants remains controversial (Beadle 1981; John and Wilson 1989). The chloroplast genome is an effective tool for tracing the origin and migration of species and has been widely used in phylogenetic analysis (Mehmood et al. 2020a, 2020b, 2020c). In this study, we obtained the complete chloroplast genome sequence of C. cunninghamiana and analyzed its phylogenetic relationships with related species, which will help clarify its phylogenetic status and contribute to its further effective utilization.
Samples of C. cunninghamiana were collected from Raoping, Guangdong Province, China (23°35′15″N, 117°8′10″E). Total genomic DNA was extracted from young, fresh branchlets using E.Z.N.A.® Plant DNA kits (Omega Bio-Tek Inc., Norcross, GA, USA). A specimen was deposited at the Key Laboratory of National Forestry and Grassland Administration on Tropical Forestry Research, Research Institute of Tropical Forestry, Chinese Academy of Forestry (Zhen Li, lzlizhenlz@yeah.net, Guangzhou, China) under the voucher number RCCZCCRPLZ202009. A paired-end library with an insert size of 450 bp was constructed and sequenced on the Illumina NovaSeq 6000 platform (BIOZERON Co., Ltd, Shanghai, China). Approximately 5.01 Gb of raw data were generated for C. cunninghamiana as 150 bp paired-end reads. The cp genome sequence of C. cunninghamiana was assembled using NOVOPlasty v4.2 with a k-mer length of 39 and a genome size range of 120 kb to 200 kb (Dierckxsens et al. 2017). We then used GeSeq to annotate the chloroplast protein-coding and rRNA genes using BLAST and profile HMM hits, with the protein search identity set to 60 and the rRNA, tRNA, and DNA search identity set to 35 (Tillich et al. 2017). tRNAscan-SE v2.0.7 was used with default settings to verify the tRNA genes (Lowe and Chan 2016). To obtain a highly accurate gene set, we manually corrected the exon/intron boundaries and the start and end positions of genes against the reference genome (Ostrya japonica, MG386375). Finally, the annotated complete chloroplast genome of C. cunninghamiana was submitted to GenBank under accession number MZ474954.
The complete chloroplast genome of C. cunninghamiana showed a typical quadripartite structure with a length of 156,129 bp, comprising a large single copy (LSC) region of 86,200 bp, a small single copy (SSC) region of 18,457 bp, and two inverted repeat regions (IRs) of 25,736 bp each. The base composition of the genome was uneven: the contents of A, T, C, and G were 31.37%, 32.30%, 18.50%, and 17.83%, respectively. The overall GC content of the cp genome was 36.34%, and the corresponding values for the LSC, SSC, and IR regions were 34.16%, 29.72%, and 42.36%, respectively. A total of 132 genes were annotated, including 87 protein-coding genes, 37 tRNA genes, and eight rRNA genes. Most of them were present as a single copy, while 19 genes (eight protein-coding genes, four rRNA genes, and seven tRNA genes) were duplicated in the IR regions.
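The figures reported above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis; the variable and function names are illustrative assumptions, and the numbers are copied from the text.

```python
# Region lengths (bp) as reported in the text.
LSC = 86_200
SSC = 18_457
IR = 25_736  # length of ONE inverted repeat; the genome carries two copies

# Base composition (%) as reported in the text.
composition = {"A": 31.37, "T": 32.30, "C": 18.50, "G": 17.83}


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + 2 x IR."""
    return lsc + ssc + 2 * ir


def gc_percent(comp: dict) -> float:
    """GC content as the sum of the G and C percentages."""
    return comp["G"] + comp["C"]


total_length = quadripartite_length(LSC, SSC, IR)
gc = gc_percent(composition)
gene_total = 87 + 37 + 8      # protein-coding + tRNA + rRNA genes
duplicated_total = 8 + 4 + 7  # protein-coding + rRNA + tRNA genes duplicated in the IRs

print(total_length, round(gc, 2), gene_total, duplicated_total)
```

The region lengths sum to the reported genome length, and the gene counts sum to the reported totals. The G and C percentages sum to 36.33%, which differs from the reported 36.34% by 0.01, a gap consistent with rounding of the individual base percentages.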
To reveal the phylogenetic status of C. cunninghamiana and of Casuarinaceae, we reconstructed phylogenetic relationships based on the cp genomes of three species of the genus Casuarina and fifteen cp genomes from other species in Fagales. Rosa praelucens served as the outgroup. Genome sequences were downloaded from NCBI GenBank and aligned with MAFFT (Katoh and Standley 2013). Maximum likelihood analysis with 1000 bootstrap replicates was performed using MEGA (Kumar et al. 2016). As shown in the phylogenetic tree (Figure 1), C. cunninghamiana was closely related to C. glauca and C. equisetifolia and clustered with four Betulaceae species. In summary, this study provides essential data for phylogenetic and evolutionary analyses of C. cunninghamiana and of Casuarinaceae, and will be useful for studies of its population genetics, molecular-assisted breeding, and the evaluation and utilization of its genetic resources.
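For readers unfamiliar with the bootstrap step mentioned above, the resampling idea can be sketched as follows. This is an illustrative sketch only and not MEGA's implementation: the toy sequences are invented, the function name is hypothetical, and the tree inference that would be run on each replicate is omitted.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Build one bootstrap replicate by sampling alignment columns with replacement."""
    n_cols = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


# Toy alignment with invented sequences (NOT real chloroplast data).
toy = {
    "C_cunninghamiana": "ATGCCGTA",
    "C_glauca": "ATGCCGTT",
    "C_equisetifolia": "ATGACGTA",
    "Rosa_praelucens": "TTGACGAA",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), len(replicates[0]["C_glauca"]))
```

In a full analysis, a tree would be inferred from each replicate, and the support for a clade would be the percentage of replicate trees that contain it.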

Disclosure statement
No potential conflict of interest was reported by the author(s).

Data availability statement
The genome sequence data that support the findings of this study are openly available in NCBI GenBank under accession no. MZ474954 (https://www.ncbi.nlm.nih.gov/nuccore/mz474954). The associated BioProject, SRA, and BioSample numbers are PRJNA760639, SRR15729584, and SAMN21219931, respectively.