A novel human coronavirus OC43 genotype detected in mainland China

Dear Editor, Coronaviruses (CoVs) have a broad spectrum in humans and other animals, causing asymptomatic infections or respiratory tract infections, gastroenteritis, and neurological diseases of va...

From November 2014 to November 2016, a prospective study was conducted in children hospitalized with community-acquired pneumonia (CAP) at 13 hospitals in mainland China. A total of 2721 cases were enrolled. After RNA extraction from throat swabs or nasopharyngeal aspirates, specimens were screened for HCoV-OC43 nucleic acid using an RVP Fast V2 kit (Luminex, USA) on a Luminex Magpix. Total RNA from specimens was converted into cDNA using oligo(dT) primers and the SuperScript IV Reverse Transcription System (Invitrogen, Carlsbad, CA). The full-length S, RdRp, and N genes and the viral genome (from the 5′-end of the ORF1a gene to the 3′-end of the poly-A tail) were amplified from HCoV-OC43-positive samples by a genome walking method involving a total of 44 overlapping fragments and a set of specific primers (Supplementary Table S1) [8, 12]. The genome sequence was determined as previously described [8]. Sequences were aligned using the ClustalW program implemented in MEGA 5.03 (Sudhir Kumar, Arizona State University, Tempe, AZ, USA). Maximum likelihood (ML) trees of whole-genome sequences and of the full-length S, RdRp, and N genes were constructed in MEGA 5.03 with the best-fit general time reversible model, gamma-distributed rate variation across sites, and 1000 bootstrap replicates [14]. Neighbor-joining trees of 24 known genes and whole genomes were constructed in MEGA 5.03 with Kimura's two-parameter model and 1000 bootstrap pseudoreplicates [14]. To analyze potential recombination events, the complete  Table S2). The reference sequences were retrieved from GenBank in December 2017.
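As an illustration of the Kimura two-parameter model mentioned above, the following Python sketch computes the distance between two aligned sequences from the proportions of transitions (P) and transversions (Q), using d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. It is a minimal sketch and not the MEGA implementation; the function name `k2p_distance` and the choice to skip gaps and ambiguous bases are assumptions made for this example.

```python
import math

# Purine <-> purine and pyrimidine <-> pyrimidine substitutions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned nucleotide sequences.

    Hypothetical helper for illustration; sites containing a gap or an
    ambiguous base in either sequence are skipped (an assumption).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For closely related sequences, such as strains of a single virus species, the corrected distance is only slightly larger than the raw proportion of differing sites.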
HCoV-OC43 was detected in 1.5% (42/2721) of enrolled cases. A total of 15 whole genomes of HCoV-OC43 were obtained from the 42 respiratory specimens of OC43-positive cases. To identify the genotype of the OC43-positive samples, ML phylogenetic trees based on the full-length sequences of the S, RdRp, and N genes were constructed using representative strains of genotypes A-G (Supplementary Fig. S1). Phylogenetic analysis of the S gene clustered all reference strains into genotypes A-G, in agreement with previous reports [8, 10-13]. The 15 OC43 strains identified in the present study formed two clusters (Supplementary Fig. 1A). Eight  Fig. 1B). Importantly, the bootstrap values at several nodes, such as genotype D or C, were lower than 70% in the phylogenetic tree of the RdRp gene, which left the tree unresolved. This may be due to the high nucleotide conservation of the RdRp gene compared with the other genes and to the limited sequence information available in GenBank. Analysis of the N gene showed that the eight strains assigned to genotype G in the S gene tree clustered together, while the other strains, assigned to genotype B in the S gene tree, also clustered together (Supplementary Fig. 1C). The incongruence among the phylogenetic analyses of different genes suggested the occurrence of recombination.
To further explore the evolutionary characteristics of the 15 OC43 strains, an ML tree was generated using the whole-genome sequences and compared with other whole genomes of OC43 strains deposited in GenBank. These reference strains were divided into genotypes A to G as reported by Oong et al. [13]. Eight OC43 strains clustered with genotype G strains circulating in Malaysia, with high nucleotide similarity (99.2-99.6%). However, the other seven OC43 strains clustered into an independent novel lineage (Fig. 1a). Based on the estimated intergenotype pairwise genetic distances, the distances between the novel lineage and genotypes B, C, D, F, and G were <0.7%, whereas the distances to genotypes A and E were >0.9% (Fig. 1b). These results suggested that the novel lineage had a closer evolutionary relationship with genotypes B, C, D, F, and G. Genotype D was the descendant of recombination events between genotypes B and C. Genotypes G and F were both D-like genotypes, which showed recombination patterns similar to those of genotype D strains in most parts of the sequence, except for parts of the nsp10 gene. In a previous study, the lowest whole-genome sequence genetic distance between distinct genotypes (A-G) of HCoV-OC43 was 0.26 ± 0.02% (between genotypes F and D) [13]. By this criterion, the mean distances between the novel lineage and the seven previously identified genotypes, which ranged from 0.45 ± 0.02% to 0.99 ± 0.01%, suggested that a novel genotype of HCoV-OC43 had emerged. The novel genotype was designated genotype H. To further analyze the recombination structures of genotype H strains, neighbor-joining trees of 24 known genes and whole-genome sequences were constructed. Eighteen whole-genome sequences of HCoV-OC43 strains belonging to genotypes A-G were used as reference strains (Supplementary Fig. S2).
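The intergenotype distance comparison described above can be sketched in Python as a mean over all pairs of sequences drawn from two groups. This is a simplified illustration using the uncorrected proportion of differing sites (p-distance), not the procedure used in the study; the function names and the toy sequences are hypothetical, and real input would be aligned whole-genome sequences.

```python
from itertools import product


def p_distance(seq1, seq2):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap ('-') in either sequence are ignored (an assumption).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def mean_between_group_distance(group_a, group_b):
    """Mean p-distance over every pair (one sequence from each group)."""
    distances = [p_distance(a, b) for a, b in product(group_a, group_b)]
    if not distances:
        raise ValueError("both groups must contain at least one sequence")
    return sum(distances) / len(distances)


# Toy, hypothetical alignments standing in for two genotypes.
candidate_lineage = ["ACGTACGTAC", "ACGTACGTAT"]
known_genotype = ["ACGTACGTAC"]

mean_d = mean_between_group_distance(candidate_lineage, known_genotype)
print(f"mean between-group distance: {100 * mean_d:.2f}%")
```

Under the criterion stated in the text, a candidate lineage would be compared against each known genotype in turn, and its mean distances checked against the smallest distance separating two recognized genotypes (0.26% between genotypes F and D).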
The seven genotype H strains showed a close relationship with the reference strains belonging to genotypes D, G (D-like), and F (D-like) in the phylogenetic trees of most nonstructural protein genes (nsp1-nsp16), the NS2a gene, and the HE gene. However, the NS4, E, M, N, I and whole-genome sequences were clustered with the genotype B strains, and the S gene showed a close relationship with genotype B and E strains.
Subsequently, we constructed a similarity plot and performed bootscan analysis using full-length genome sequences. From the 5′-end of the genome to position 23,000 nt, genotype H strains showed a greater similarity to genotype F (D-like) strains. From positions 23,000 nt to 27,000 nt, genotype H strains were closely related to genotype E strains. From position 27,000 nt to the 3′-end of the genome, genotype H was closely related to genotype B. These findings were consistent with the phylogenetic analysis of the 24 genes and suggested that natural recombination events resulted in the emergence of the novel genotype H of HCoV-OC43 (Supplementary Fig. S3A and S3B). The demographic and clinical profiles of children infected with HCoV-OC43 genotype G or H in the present study are summarized in Table 1.
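The window-by-window comparison behind a similarity plot can be illustrated with a short Python sketch. It is not the software used in the study; the function names, the default window and step sizes, and the use of simple percent identity (rather than a model-corrected similarity) are assumptions chosen for illustration.

```python
def window_similarity(query, reference, window=200, step=20):
    """Percent identity of query vs. reference in sliding windows.

    Both sequences must come from the same alignment. Returns a list of
    (window_midpoint, percent_identity) tuples; gapped sites are ignored.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        pairs = [(a, b) for a, b in zip(q, r) if a != "-" and b != "-"]
        if not pairs:
            continue  # window consists only of gapped sites
        identity = 100 * sum(a == b for a, b in pairs) / len(pairs)
        results.append((start + window // 2, identity))
    return results


def closest_reference(query, references, window=200, step=20):
    """For each window, report the reference with the highest identity.

    `references` maps a label (e.g. a genotype name) to an aligned sequence.
    Ties are resolved in favor of the first label encountered.
    """
    profiles = {
        name: dict(window_similarity(query, seq, window, step))
        for name, seq in references.items()
    }
    positions = sorted(set().union(*[p.keys() for p in profiles.values()]))
    best = []
    for pos in positions:
        candidates = [n for n in profiles if pos in profiles[n]]
        name = max(candidates, key=lambda n: profiles[n][pos])
        best.append((pos, name))
    return best
```

A change in the closest reference along the genome, such as the shifts near positions 23,000 nt and 27,000 nt reported above, is the kind of pattern that points to candidate recombination breakpoints, which bootscan analysis then tests with phylogenetic support.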
In summary, the present study reported a novel HCoV-OC43 recombinant genotype H, which was detected among children with CAP in mainland China. The novel genotype H might have been generated by recombination events among putative parental genotype D-like, genotype E, and genotype B strains. Our results emphasize the need for continuous surveillance of HCoV-OC43 in mainland China to better understand the mechanisms underlying the phylodynamics of HCoV-OC43.
Severe pneumonia cases are indicated by the symbol "a".