The complete chloroplast genome of Mazus pumilus (Mazaceae)

Abstract Mazus pumilus (N. L. Burman) Steenis is a representative species of Mazus, a genus found mainly in China. Here, we report the complete chloroplast genome sequence of M. pumilus. The genome is 153,149 bp in length and contains 106 genes, comprising 79 protein-coding genes, 23 tRNA genes, and 4 rRNA genes. The overall GC content of the M. pumilus chloroplast genome is 37.8%. Maximum likelihood (ML) phylogenomic analysis suggested that M. pumilus forms a monophyletic group with Lancea, and that this group is closely related to the clade of Phrymaceae, Paulowniaceae, and Orobanchaceae.

Mazus Lour. (ca. 35 species) is mainly distributed in East Asia, Australia, and New Zealand, and about 25 species are found in China (Wu and Raven 1998). Mazus was first placed in Scrophulariaceae (Wettstein 1891). However, recent molecular-phylogenetic studies have altered its systematic position. Beardsley and Olmstead (2002) found that Mazus and Lancea form a well-supported clade, recognized as the subfamily Mazoideae within Phrymaceae. In contrast, the phylogenetic studies of Oxelman et al. (2005), Albach et al. (2009), Xia et al. (2009), and Schäferhoff et al. (2010) indicated that Mazus should be placed apart from Phrymaceae. Based on these molecular-phylogenetic studies, Reveal (2011) described a new family, Mazaceae, which includes Mazus, Lancea, and Dodartia. To date, studies based on different sequence fragments have not clearly resolved the phylogenetic relationships between Mazus and its related genera, and these relationships need to be further elucidated.
In the present study, we report the complete chloroplast genome of Mazus pumilus (N. L. Burman) Steenis, a representative species of Mazus. M. pumilus was collected in Luoyang, China (112°26′45.2″E, 34°38′3.9″N), and the specimen was deposited in the Qinghai-Tibetan Plateau Museum of Biology (HNWP). DNA was isolated from fresh leaves using the modified CTAB method (Doyle 1987). The complete chloroplast genome was sequenced at Novogene Biotech Co. (Tianjin, China) on the Illumina MiSeq platform. The genomic sequence was assembled with SOAPdenovo (Luo et al. 2012), and annotation was performed with CpGAVAS by comparison with the previously reported chloroplast sequences of Lancea (Chi et al. 2018). The complete chloroplast genome sequence of M. pumilus, together with those of 26 species from Lamiales and Lactuca sativa (outgroup), was aligned with MAFFT (Katoh and Standley 2013). Gblocks (Castresana 2000) was used to remove ambiguously aligned sites. A maximum likelihood (ML) analysis was performed using RAxML-HPC2 on XSEDE with 1000 replicates under the GTR + G + I nucleotide substitution model, as recommended by jModelTest2.
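The idea behind removing ambiguously aligned sites can be illustrated with a minimal Python sketch. This is not the Gblocks algorithm, which applies several block-based criteria; it is a simplified stand-in that only drops columns dominated by gaps. The function name `drop_gappy_columns`, the `max_gap_fraction` threshold, and the toy sequences are all hypothetical and are not taken from the study.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.5):
    """Keep only alignment columns whose gap fraction is <= max_gap_fraction.

    `alignment` maps a taxon name to its aligned sequence (gaps as '-').
    Simplified illustration only; Gblocks uses stricter block-based rules.
    """
    if not alignment:
        return {}
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")

    rows = list(alignment.values())
    n_seqs = len(rows)
    n_cols = lengths.pop()

    # Indices of the columns that pass the gap filter.
    keep = [
        i for i in range(n_cols)
        if sum(row[i] == "-" for row in rows) / n_seqs <= max_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Hypothetical toy alignment (not real data).
toy = {
    "taxon_A": "ATG-CA",
    "taxon_B": "ATG-CA",
    "taxon_C": "ATGTC-",
}
trimmed = drop_gappy_columns(toy)
for name, seq in trimmed.items():
    print(name, seq)
```

In the toy input, the fourth column is a gap in two of three sequences and is dropped, while the last column has a single gap and is kept.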
The M. pumilus chloroplast genome (GenBank Accession No. MF593117) is 153,149 bp in length and consists of a pair of inverted repeat (IR) regions (25,831 bp each), a large single copy (LSC) region (84,034 bp), and a small single copy (SSC) region (17,453 bp). The GC content of the genome is 37.8%, and the GC content of the IR regions (43.1%) is higher than that of the LSC region (35.7%) and the SSC region (32.1%). There are 106 predicted genes, including 79 protein-coding genes, 23 tRNA genes, and 4 rRNA genes. Among the protein-coding genes, 63 are located in the LSC region and 11 in the SSC region, while ndhB, rpl2, rpl23, rps7, and ycf2 are duplicated in the IR regions.
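As a consistency check, the reported figures can be verified with a short Python sketch. All numbers come from the paragraph above; the variable and function names are illustrative. The GC check assumes the 43.1% value applies to each IR copy, and because the regional percentages are rounded, it is only an approximate cross-check.

```python
# Reported region lengths in bp (the IR length is per copy).
GENOME_LENGTH = 153_149
LSC, SSC, IR = 84_034, 17_453, 25_831


def quadripartite_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
assert total == GENOME_LENGTH

# Length-weighted mean of the regional GC percentages.
gc_by_region = {"LSC": (LSC, 35.7), "SSC": (SSC, 32.1), "IR": (2 * IR, 43.1)}
weighted_gc = sum(length * gc for length, gc in gc_by_region.values()) / total
assert round(weighted_gc, 1) == 37.8

# Gene tallies: overall classes, and protein-coding genes by region.
gene_classes = {"protein_coding": 79, "tRNA": 23, "rRNA": 4}
pcg_by_region = {"LSC": 63, "SSC": 11, "IR": 5}
assert sum(gene_classes.values()) == 106
assert sum(pcg_by_region.values()) == gene_classes["protein_coding"]

print(f"total = {total} bp, weighted GC = {weighted_gc:.2f}%")
```

The region lengths sum exactly to the reported genome size, and the length-weighted GC content rounds to the reported 37.8%.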
The ML analysis showed that M. pumilus and the Lancea species form a monophyletic group, Mazaceae (Figure 1). Additionally, Mazaceae is closely related to Phrymaceae, Paulowniaceae, and Orobanchaceae rather than to Scrophulariaceae. These newly reported chloroplast data provide genomic information for Mazaceae and clarify its phylogenetic relationships. They will support genetic engineering, conservation genetics, and evolutionary studies involving this taxon.

Disclosure statement
No potential conflict of interest was reported by the authors.