The complete chloroplast genome and phylogenetic analysis of Syringa reticulata subsp. amurensis (Rupr.) P.S.Green & M.C.Chang from Qinghai Province, China

Abstract Syringa reticulata subsp. amurensis (Rupr.) P. S. Green & M. C. Chang (Oleaceae) is a shrub or tree with high medicinal value and considerable ecological significance as an urban garden plant. To better understand the molecular genetics and evolution of S. reticulata subsp. amurensis, its complete chloroplast genome was sequenced and annotated. The assembled chloroplast genome is a circular 156,141 bp sequence, consisting of an 87,108 bp large single-copy (LSC) region and a 17,239 bp small single-copy (SSC) region, separated by a pair of 25,897 bp inverted repeats (IRs). The GC content of the chloroplast genome is 36.14%. A total of 132 functional genes were annotated, including 88 protein-coding, 36 tRNA, and eight rRNA genes. Phylogenetic analysis showed that S. reticulata subsp. amurensis is most closely related to S. reticulata subsp. pekinensis and that the genus Syringa is a paraphyletic group. This study provides important information for further phylogenetic studies on S. reticulata subsp. amurensis and its allies.

Syringa reticulata subsp. amurensis (Oleaceae: Syringeae) is a shrub or tree with high medicinal and horticultural value. Its flowers are white, luxuriant, and fragrant. Syringa reticulata subsp. amurensis can also be used to treat respiratory diseases (Zhu et al. 2021). In addition, because it is well adapted to the strong environmental stress of northwest China, this species is considered a high-quality garden plant with great potential to improve urban ecology. Wild individuals of S. reticulata subsp. amurensis grow mainly in mixed forests on slopes and grasslands, or near gullies, at 100-1200 m above sea level (Chang et al. 1996). The plant is usually cultivated as an ornamental in northern China (Chang et al. 1996). Because it is maternally inherited and does not undergo recombination, the chloroplast genome is particularly useful for studying the maternal evolutionary history of angiosperms (Nock et al. 2019). However, no studies on the complete chloroplast genome of S. reticulata subsp. amurensis have been published. In the present study, the complete chloroplast genome of S. reticulata subsp. amurensis was obtained using next-generation sequencing (NGS) technology, and a phylogenetic analysis of S. reticulata subsp. amurensis and its allies was carried out.
(Nanjing, China). Approximately 5 GB of clean data were obtained. The sequencing reads were mapped to reference chloroplast genomes using Bowtie2 (Langmead and Salzberg 2012). SPAdes version 3.10.1 (Bankevich et al. 2012) and SSPACE version 2.0 (Boetzer et al. 2011) were used to assemble the chloroplast genome. The chloroplast genes were annotated with CpGAVAS (Liu et al. 2012), and the gene coordinates were verified by BLAST search against the reference chloroplast genome of Syringa reticulata subsp. pekinensis (GenBank accession number: MN901632.1). Annotation errors were corrected manually.
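The read-mapping and assembly steps described above can be sketched as command lines built in Python. This is only an illustrative outline: the text does not report the parameters that were used, so the file names, thread count, and the choice to keep concordantly aligned read pairs (Bowtie2 `--al-conc`) as input for SPAdes are all assumptions, and the SSPACE scaffolding step is omitted.

```python
import shlex

def bowtie2_cmd(index, r1, r2, mapped_pattern, sam="mapped.sam", threads=4):
    """Build a Bowtie2 call that maps paired reads to a reference index.

    `mapped_pattern` should contain '%', which Bowtie2 replaces with 1 or 2
    when writing the concordantly aligned mates (--al-conc).
    All file names here are hypothetical placeholders.
    """
    return ["bowtie2", "-p", str(threads), "-x", index,
            "-1", r1, "-2", r2,
            "--al-conc", mapped_pattern, "-S", sam]

def spades_cmd(r1, r2, outdir, threads=4):
    """Build a SPAdes call that assembles the chloroplast-derived read pairs."""
    return ["spades.py", "-t", str(threads), "-1", r1, "-2", r2, "-o", outdir]

if __name__ == "__main__":
    # Print the commands instead of running them; pass the lists to
    # subprocess.run(..., check=True) once the tools and files exist.
    steps = [
        bowtie2_cmd("cp_reference", "clean_R1.fq", "clean_R2.fq", "cp_reads_%.fq"),
        spades_cmd("cp_reads_1.fq", "cp_reads_2.fq", "spades_out"),
    ]
    for step in steps:
        print(shlex.join(step))
```

Building the commands as lists keeps each argument separate, so paths containing spaces are passed safely if the lists are later handed to `subprocess.run`.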
The phylogenetic relationships of S. reticulata subsp. amurensis and its allies were inferred using the maximum-likelihood (ML) method based on the General Time Reversible (GTR) model (Nei and Kumar 2000). The sequences were aligned with MAFFT version 7.473 (Katoh and Standley 2013; online version: https://mafft.cbrc.jp/alignment/server/), and evolutionary analyses were performed in MEGA7 (Kumar et al. 2016). The tree with the highest log likelihood (-464764.75) is shown (Figure 1). Bootstrap percentages, based on 1000 replicates, of trees in which the associated taxa clustered together are shown at the branch nodes. Initial trees for the heuristic search were obtained automatically by applying the Neighbor-Joining and BioNJ algorithms to a matrix of pairwise distances estimated using the maximum composite likelihood (MCL) approach, and the topology with the superior log likelihood value was then selected. The tree was drawn to scale, with branch lengths corresponding to the number of substitutions per site.
The complete chloroplast genome of S. reticulata subsp. amurensis was 156,141 bp in length and had a typical quadripartite structure, containing a pair of IR regions of 25,897 bp each, a large single-copy (LSC) region of 87,108 bp, and a small single-copy (SSC) region of 17,239 bp. The two IRs were separated by the LSC and the SSC. The GC content of the complete chloroplast genome was 38.03%. A total of 132 functional genes were annotated, including eight rRNA genes, 36 tRNA genes, and 88 protein-coding genes. The rRNA, tRNA, and protein-coding genes accounted for 6.06%, 27.27%, and 66.67% of all annotated genes, respectively.
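The reported region lengths and gene proportions can be cross-checked with simple arithmetic, since a quadripartite plastome has length LSC + SSC + 2 × IR. The short sketch below uses only the numbers given in the text. The `gc_content` helper is an illustrative addition for use on a downloaded sequence, and it is not the authors' procedure.

```python
# Region lengths (bp) and gene counts as reported in the text.
LSC, SSC, IR = 87_108, 17_239, 25_897
GENE_COUNTS = {"protein-coding": 88, "tRNA": 36, "rRNA": 8}

def genome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir

def gene_shares(counts):
    """Percentage of annotated genes in each class, rounded to two decimals."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}

def gc_content(seq):
    """Percentage of G and C bases in a nucleotide string (case-insensitive)."""
    seq = seq.upper()
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)

if __name__ == "__main__":
    print(genome_length(LSC, SSC, IR))   # 156141
    print(gene_shares(GENE_COUNTS))
```

The computed total matches the reported genome length of 156,141 bp, and the three gene classes sum to the 132 annotated genes.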
The phylogenetic analysis fully resolved S. reticulata subsp. amurensis in a clade with S. reticulata subsp. pekinensis (Figure 1). The phylogenetic tree suggested that S. reticulata subsp. amurensis and S. reticulata subsp. pekinensis are polyphyletic with respect to Ligustrum spp. and other species classified in the genus Syringa. This result is consistent with previous studies (Li et al. 2002; Dupin et al. 2020; Wang et al. 2020). The present study provides important information for further molecular studies on S. reticulata subsp. amurensis and its allies.

Disclosure statement
No potential conflict of interest was reported by the authors.

Data availability statement
The genome sequence data obtained in this study are openly available in NCBI GenBank at https://www.ncbi.nlm.nih.gov/ under the accession number MW525283. The associated BioProject, SRA, and BioSample numbers are PRJNA692790, SRR13480493, and SAMN17717949, respectively.
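For readers who wish to retrieve the deposited sequence programmatically, the following minimal sketch builds an NCBI E-utilities `efetch` request for the accession given above. The function names are hypothetical, the download itself requires network access, and NCBI usage guidelines (for example request-rate limits) should be respected.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def efetch_url(accession, rettype="fasta"):
    """Build an efetch URL for a nucleotide record (rettype 'fasta' or 'gb')."""
    query = {"db": "nuccore", "id": accession,
             "rettype": rettype, "retmode": "text"}
    return f"{EFETCH}?{urlencode(query)}"

def fetch_record(accession, rettype="fasta"):
    """Download the record as text; needs an internet connection."""
    with urlopen(efetch_url(accession, rettype)) as response:
        return response.read().decode("utf-8")

if __name__ == "__main__":
    # Only the URL is printed here; call fetch_record("MW525283") to download.
    print(efetch_url("MW525283"))
```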