The complete chloroplast genome of Salix lindleyana (Salicaceae), a plateau plant species

Abstract
Salix lindleyana Wallich ex Andersson 1851 is a species of the genus Salix that grows mainly on mountains at elevations above 3000 m on the Qinghai–Tibetan Plateau (including the Himalayas and Hengduan Mountains). To determine its phylogenetic position within Salix, we reconstructed the complete chloroplast (cp) genome sequence of S. lindleyana by de novo assembly of whole-genome sequencing data. The complete chloroplast genome was 155,304 bp long, with a total GC content of 36.7%. It had the typical quadripartite structure, comprising a large single-copy (LSC) region of 84,539 bp, a small single-copy (SSC) region of 16,161 bp, and two inverted repeat (IR) regions of 27,302 bp each. A total of 132 functional genes were annotated in the chloroplast genome, including 87 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Phylogenetic analysis showed that S. lindleyana clustered with Salix dasyclados Wimmer 1849 and Salix variegata Franchet 1887. The complete chloroplast genome of S. lindleyana provides potential genetic resources for further phylogenetic studies.
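The reported region lengths can be checked against the reported genome size, since a quadripartite plastome consists of the LSC, the SSC, and two copies of the IR. The following is a minimal sketch of that arithmetic, using only the figures given above; the function names and the `gc_content` helper are illustrative assumptions and are not part of the original analysis.

```python
# Region lengths (bp) as reported in the abstract.
LSC = 84_539
SSC = 16_161
IR = 27_302            # length of ONE inverted repeat copy
REPORTED_TOTAL = 155_304


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


def gc_content(seq: str) -> float:
    """Percentage of G and C among A/C/G/T bases (hypothetical helper)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


total = quadripartite_length(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)
```

With the values above, the sum is 84,539 + 16,161 + 2 × 27,302 = 155,304 bp, which agrees with the reported genome size. Applying `gc_content` to the sequence deposited in GenBank would be one way to recompute the GC percentage; the result is not asserted here.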


Introduction
Salix lindleyana Wallich ex Andersson 1851, a member of the genus Salix, is a cushion-like shrub only a few centimeters high, with a prostrate, rooting main trunk. It is commonly found in wet rock crevices at elevations above 3000 m on the Qinghai–Tibetan Plateau (including the Himalayas and Hengduan Mountains). S. lindleyana grows in an environment with large temperature differences between day and night, intense ultraviolet radiation, and strong winds. In adapting to the complex and changeable environment of the plateau, the morphology and genome sequence of S. lindleyana have continued to evolve, which gives the species considerable ecological and research value. However, the genome of S. lindleyana has not been sequenced or assembled. In this study, the complete chloroplast genome of S. lindleyana was assembled and annotated for the first time to explore its genomic structure, and a phylogenetic tree of S. lindleyana and other willows was constructed to understand their evolutionary relationships.

Material and methods
We collected leaf samples from Kangding City, Ganzi Tibetan Autonomous Prefecture, Sichuan Province, China, at an altitude of 4010 m (29°54′18″N, 102°0′7″E; Figure 1). This study was licensed under the Regulations of Strategy of Sichuan Province on Biodiversity Conservation and approved by Beijing Forestry University. [...] (Sato et al. 1999) and CPGView (Liu et al. 2023) software were used to visualize the chloroplast genome. Annotation errors in the chloroplast genome were corrected manually using Apollo (Lewis et al. 2002). To assess the accuracy of the assembly, we mapped our clean reads back to the assembled chloroplast (cp hereafter) genome and examined the depth of coverage (Figure S1). To evaluate the phylogenetic position of S. lindleyana, another 31 complete Salix chloroplast genomes were downloaded from the GenBank database. The chloroplast genome of Populus trichocarpa (EF489041) (Tuskan et al. 2006) was used as the outgroup. Sequences were aligned with MAFFT v7.310 (Katoh et al. 2019) using default parameters, and the phylogenetic tree was constructed by the maximum likelihood method using PhyML v3.3 (Guindon et al. 2010) with the GTR + I + G model and 1,000 rapid bootstraps. The amino acid substitution model was calculated by ModelGenerator (Keane et al. 2006). The complete chloroplast genome of S. lindleyana was submitted to GenBank under accession number OM892926.

Figure 2. Schematic representation of the plastome features of Salix lindleyana. From the center outward, the first track shows the dispersed repeats. The dispersed repeats consist of direct (D) and palindromic (P) repeats, connected with red and green arcs. The second track shows the long tandem repeats as short blue bars. The third track shows the short tandem repeats or microsatellite sequences as short bars with different colors. The colors, the type of repeat they represent, and the description of the repeat types are as follows. Black: c (complex repeat); green: p1 (repeat unit size = 1); yellow: p2 (repeat unit size = 2); purple: p3 (repeat unit size = 3); blue: p4 (repeat unit size = 4); orange: p5 (repeat unit size = 5); red: p6 (repeat unit size = 6). The small single-copy (SSC), inverted repeat (IRa and IRb), and large single-copy (LSC) regions are shown on the fourth track. The GC content along the genome is plotted on the fifth track. The base frequency at each site along the genome is shown between the fourth and fifth tracks. The genes are shown on the sixth track. The optional codon usage bias is displayed in parentheses after the gene name. Genes are color-coded by their functional classification. The transcription directions for the inner and outer genes are clockwise and anticlockwise, respectively. The functional classification of the genes is shown in the bottom left corner.
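The read-mapping check described above amounts to counting, for every position of the assembly, how many reads cover it. The sketch below illustrates one way such a depth-of-coverage profile could be computed from alignment intervals. It is not the pipeline used in this study: the function name, the 0-based half-open interval convention, and the toy reads are all assumptions made for illustration.

```python
from itertools import accumulate


def depth_of_coverage(genome_length, alignments):
    """Per-base depth from 0-based, half-open (start, end) read alignments.

    Reads are treated as mapping to a linear sequence; reads spanning the
    origin of the circular genome are not handled in this sketch.
    """
    diff = [0] * (genome_length + 1)
    for start, end in alignments:
        if not (0 <= start < end <= genome_length):
            raise ValueError(f"alignment out of range: {(start, end)}")
        diff[start] += 1   # depth rises where a read begins
        diff[end] -= 1     # and falls just past where it ends
    # The running sum of the difference array gives the depth at each base.
    return list(accumulate(diff[:-1]))


# Toy data (invented for illustration): a 10 bp "genome" and three reads.
toy_reads = [(0, 4), (2, 8), (3, 10)]
depth = depth_of_coverage(10, toy_reads)
mean_depth = sum(depth) / len(depth)
uncovered = [pos for pos, d in enumerate(depth) if d == 0]
```

For the toy reads, `depth` is `[1, 1, 2, 3, 2, 2, 2, 2, 1, 1]` and no position is uncovered. In an assembly check, positions with zero or unusually low depth would be the ones to inspect.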
The phylogenetic tree constructed from chloroplast genome data resolved the relationships among the 32 Salix chloroplast genomes well (Figure 3). The results suggested that S. lindleyana is sister to Salix dasyclados Wimmer and Salix variegata Franchet with high bootstrap support.
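Bootstrap support for a clade is the fraction of replicate trees, each inferred from a resampled alignment, that contain that clade. The sketch below shows only the resampling step of a standard nonparametric bootstrap, in which alignment columns are drawn with replacement. It does not reproduce the specific bootstrap procedure of the software used here, and the taxon labels and sequences are invented placeholders.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    `alignment` maps taxon name -> aligned sequence (all of equal length).
    Column indices are drawn with replacement and applied to every taxon,
    so each resampled site keeps its original pattern across taxa.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_sites = lengths.pop()
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in cols)
        for name, seq in alignment.items()
    }


# Toy alignment (invented sequences; taxon labels are placeholders).
toy = {"taxon_A": "ACGTAC", "taxon_B": "ACGTTC", "taxon_C": "ACCTTC"}
rng = random.Random(42)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
```

Each replicate would then be passed to the tree-building step, and the support value of a clade is the proportion of the 1,000 resulting trees in which it appears.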

Discussion and conclusions
The chloroplast genome of S. lindleyana is the second reported for Sect. Lindleyanae Schneid., further complementing the genomic information available for plateau Salix. The complete chloroplast genome of S. lindleyana provides potential genetic resources for further evolutionary and genomic studies on Salix. Chloroplast genomes are highly conserved and have been widely used in species identification and phylogenetic analysis (Szymon et al. 2016). Wagner's research suggested that Salix had lower plastid variation than other angiosperms, making it unsuitable for phylogenetic reconstruction. However, Zhou's study on the whole-genome phylogeny and classification of chloroplasts in five Salix species found that the ycf1, psaI, ycf2-2, rpoC2, rpl22, atpF, and ndhF genes were under positive selection in 21 Salix species. rps7 was the most variable region in the 21 Salix chloroplast genomes and can be used as a molecular marker for species identification. These results indicate that the chloroplast genome can provide a reference for phylogenetic and evolutionary studies of Salix.
He et al. (2021) used restriction-site associated DNA (RAD) sequencing data to reconstruct the phylogeny and spatiotemporal evolution of Salix. In that study, S. lindleyana clustered in subclade I of the Hengduan Mountains clade as sister to S. cf. flabellaris Andersson 1860, which was consistent with the monophyly of taxonomic sections. The alpine dwarf willow lineages (S. lindleyana, S. cf. flabellaris, S. oreinoma C. K. Schneid. in Sargent 1916, and S. opsimantha C. K. Schneid. in Sargent 1916) differentiated at 6.93–8.1 Ma, and S. oreinoma clustered in subclade II, showing adaptability to the high-altitude niche.
This study constructed a phylogenetic tree from chloroplast genome data to reveal the relationships among 32 Salix chloroplast genomes. In the chloroplast phylogeny, shrub willows and tree willows were separated. The shrub willows were divided into two distinct branches. S. lindleyana, Salix variegata, and Salix dasyclados formed an independent clade. This was very similar to the clustering of subclades I and II of the Hengduan Mountains (HDM) clade. The placement of S. oreinoma on a different branch was consistent with the differentiation of the alpine dwarf willow lineages in the study of He et al. (2021). The phylogenetic results of this study were therefore partially consistent with those from the nuclear genome. However, because chloroplast genome data for members of the genus Salix are limited, it was difficult to fully resolve the phylogenetic relationships of the plastid genome, and the comparison with nuclear genome evolution remains incomplete.