Characterization of the complete mitochondrial genome of the lung fluke, Paragonimus heterotremus

Abstract In this study, the complete mitochondrial genome of the human lung fluke, Paragonimus heterotremus, was recovered from Illumina sequencing data. The complete mitochondrial genome of P. heterotremus is 13,927 bp in length and has a base composition of A (16.6%), T (41.8%), C (13.%), G (28.4%), showing a clear bias toward high AT content (58.4%). The mitochondrial genome has a typical conserved structure, encoding 12 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, 2 ribosomal RNA genes (12S rRNA and 16S rRNA), and a control region (D-loop region). All PCGs are located on the H-strand. The ND4 and ND4L genes overlap by 39 bp. The nucleotide sequences of the 12 PCGs of P. heterotremus and 10 other parasite species were used for phylogenetic analysis. The results indicate that P. heterotremus is relatively closely related to Paragonimus westermani (AF219379.2).

Paragonimus heterotremus is mainly distributed in Asia (China, Laos, Cambodia, and Thailand) and can cause paragonimiasis in humans and other crab-eating mammals. Until now, organelle genome information for P. heterotremus has been limited. In this study, the complete mitochondrial genome of P. heterotremus was recovered through Illumina Hiseq2500 sequencing. This complete mitochondrial genome can subsequently be used for clinical diagnosis and provides valuable insight into the phylogenetic relationships among Paragonimus species.
The eggs of P. heterotremus were collected from the sputum of patients in the Zhuang region, Guangxi Province, China (22°48′48.17″N, 108°19′15.61″E). Adult worms were obtained by feeding dogs with metacercariae. Genomic DNA was extracted from adult worms using the commercial QIAamp DNA extraction kit and DNeasy Tissue kit (supplied by Qiagen) according to the manufacturer's instructions. The isolated DNA was stored at −20 °C in the functional lab of Institute for Translation Medicine in Qingdao University. A portion of the genomic DNA was then subjected to standard Hiseq 2000 library construction. A total of 40 Gb of reads were obtained, with an average length of 100 bp. After quality filtering, the clean reads were assembled with SPAdes 3.6.1 (Bankevich et al. 2012) using default settings. We used the mitochondrial genome of Paragonimus westermani (AF219379.2) as a reference sequence to align the contigs and identify gaps. To fill the gaps, PRICE (Ruby et al. 2013) and MITObim version 1.8 (Hahn et al. 2013) were applied, and Bandage (Wick et al. 2015) was used to identify the circular topology. The complete sequence was initially annotated by ORF prediction in Unipro UGENE (Okonechnikov et al. 2012) combined with manual correction. All tRNAs were confirmed using the tRNAscan-SE search server (Lowe and Eddy 1997). Protein-coding genes were verified by BLAST search on the NCBI website (http://blast.ncbi.nlm.nih.gov/), and start and stop codons were corrected manually. The circular mitochondrial genome map was drawn using OrganellarGenomeDRAW (Lohse et al. 2007). The complete mitochondrial genome sequence, together with its gene annotations, was submitted to GenBank under the accession number MH059809.
The complete mitochondrial genome of P. heterotremus is 13,927 bp in length and has a base composition of A (16.6%), T (41.8%), C (13.%), G (28.4%), showing a clear bias toward high AT content (58.4%). The mitochondrial genome has a typical conserved structure, encoding 12 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, 2 ribosomal RNA genes (12S rRNA and 16S rRNA), and a control region (D-loop region). All PCGs are located on the H-strand. The ND4 and ND4L genes overlap by 39 bp.
Phylogenetic analysis was performed using the 12 mitochondrial protein-coding genes of P. heterotremus and 10 other closely related taxa. The whole-genome alignment was constructed with HomBlocks (Bi et al. 2018) and verified with MAFFT (Katoh and Standley 2013). Conserved regions were then selected with Gblocks 0.91b (Castresana 2002) to construct the concatenated nucleotide sequences. The phylogenetic tree constructed using RAxML version 8.1.12 (Stamatakis 2014) and MrBayes (Ronquist et al. 2012) is shown in Figure 1. The relationships among the 11 taxa were fully resolved, with support values of 100%. Paragonimus heterotremus clustered within the genus Paragonimus and showed a relatively close genetic distance to Paragonimus westermani (AF219379.2).
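The concatenation step described above, in which conserved aligned blocks are joined into one sequence per taxon, can be sketched in Python as follows. This is only an illustration of the general supermatrix idea under stated assumptions: it is not the HomBlocks or Gblocks implementation, the function name `concatenate_alignments` is hypothetical, and every block is assumed to contain the same taxa with sequences of equal aligned length.

```python
def concatenate_alignments(blocks: list[dict[str, str]]) -> dict[str, str]:
    """Join aligned blocks (taxon -> aligned sequence) into one supermatrix.

    Assumptions of this sketch: all blocks share the same set of taxa, and
    within a block every sequence has the same aligned length.
    """
    if not blocks:
        raise ValueError("no alignment blocks supplied")

    taxa = sorted(blocks[0])
    parts: dict[str, list[str]] = {taxon: [] for taxon in taxa}

    for index, block in enumerate(blocks):
        if set(block) != set(taxa):
            raise ValueError(f"block {index} has a different set of taxa")
        lengths = {len(seq) for seq in block.values()}
        if len(lengths) != 1:
            raise ValueError(f"block {index} is not a rectangular alignment")
        for taxon in taxa:
            parts[taxon].append(block[taxon])

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


if __name__ == "__main__":
    # Toy blocks with made-up taxon labels and sequences.
    toy_blocks = [
        {"taxon_A": "ATG-TT", "taxon_B": "ATGCTT"},
        {"taxon_A": "GGA", "taxon_B": "GGT"},
    ]
    for name, seq in concatenate_alignments(toy_blocks).items():
        print(f">{name}\n{seq}")
```

The resulting concatenated sequences could then be written to an alignment file and passed to a tree-building program such as RAxML or MrBayes.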

Disclosure statement
No potential conflict of interest was reported by the authors.