Virus strain of a mild COVID-19 patient in Hangzhou representing a new trend in SARS-CoV-2 evolution related to Furin cleavage site

Among our 788 confirmed COVID-19 patients, we found a decreased rate of severe/critical type, increased liver/kidney damage and a prolonged period of nucleic acid positivity during virus dissemination, when compared with Wuhan. To investigate the underlying mechanism, we isolated one strain of SARS-CoV-2 (ZJ01) from a mild COVID-19 patient and found 35 specific gene mutations by gene alignment. Further phylogenetic analysis and RSCU heat map results suggested that ZJ01 may be a potential evolutionary branch of SARS-CoV-2. We classified 54 strains of viruses worldwide (C/T type) based on the base (C or T) at positions 8824 and 28247. ZJ01 had T at both sites, making it the only TT type currently identified in the world. The prediction of Furin cleavage site (FCS) and the sequence alignment of the virus family indicated that FCS may be an important site of coronavirus evolution. ZJ01 had mutations near FCS (F1-2), which caused changes in the structure and the electrostatic distribution of the S protein surface, further affecting the binding capacity of Furin. Single cell sequencing and ACE2-Furin co-expression results confirmed that the Furin level was higher in the whole body, especially in glands, liver, kidney and colon, while FCS may help SARS-CoV-2 infect these organs. The evolutionary pattern of SARS-CoV-2 towards FCS formation may bring its clinical symptoms closer to the influenza-like illness caused by HKU-1 and OC43 (the source of the FCS sequence PRRA), further showing potential for differentiation into mild COVID-19 subtypes.


Introduction
The outbreak of a novel coronavirus (SARS-CoV-2) and the disease it causes (COVID-19), which began in Wuhan at the end of 2019 and quickly spread over the whole country, has posed a severe threat to public health and the economy in China 1, 2 .
Through a quick response and drastic measures, including the quarantine of Wuhan city on Jan 23, 2020, the spread of SARS-CoV-2 was effectively controlled.
Therefore, it is urgent to continue characterizing the clinical and virological features of SARS-CoV-2 during its dissemination.
An important and common feature of viruses is that increased transmissibility is usually accompanied by decreased virulence, which also holds true for SARS-CoV-2 and is reflected in the disease trajectory. On one hand, COVID-19 was more severe in Wuhan at its early stage, with approximately 32% severe/critical types and 11% fatality 8,9 . In contrast, data from outside Wuhan showed more of the mild type of COVID-19, as presented in Zhejiang province 10 and nationwide 11 .
On the other hand, its transmissibility increased, from basic reproductive number (R0) estimates of 2.2 12 and 2.68 6 based on Wuhan data to 3.77 13 at the national level. Furthermore, the observation of similar viral loads in COVID-19 patients with and without symptoms revealed its capacity for occult transmission 14 .
It is well acknowledged that changes in the epidemiological and clinical features of COVID-19 stem from virological changes of SARS-CoV-2, in which its spike (S) surface envelope protein plays an important role 15 . Generally, its surface unit (S1) is responsible for host entry by binding to the cell receptor, while its transmembrane unit (S2) takes charge of the fusion of viral and cellular membranes 16 . Therefore, it is of great value to focus on sequence mutation and conformation change in the S protein during SARS-CoV-2 evolution in a finely established model, aiming to explain the related changes of COVID-19.
medRxiv preprint doi: https://doi.org/10.1101/2020.03.10.20033944; this version posted March 13, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
In this study, we identified 9.9% severe/critical type among 788 confirmed COVID-19 patients of Zhejiang province and a median of 11 days of nucleic acid positivity in 104 patients from our hospital, showing the tendency of COVID-19 to progress in a milder but more infective direction. Based on these clinical findings, we performed in-depth bioinformatics analysis by comparing the virological features of 52 previously reported strains of SARS-CoV-2, including Bat coronavirus, SARS-CoV and SARS-CoV-2 in Wuhan and ZJ01 (the SARS-CoV-2 strain we first reported in one patient with mild COVID-19 in Zhejiang province). We demonstrated that the specific mutations of ZJ01 may represent a de novo evolutionary trend of SARS-CoV-2, which might be related to Furin. Besides, the establishment of a novel SARS-CoV-2 categorization system may facilitate our understanding of virus evolution and its influence on disease severity and progression.
... 11 . The period of positive nucleic acid is defined as the interval from the date of confirmed nucleic acid positivity to the date of confirmed nucleic acid negativity.
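The period of nucleic acid positivity is a difference between two test dates. The following is a minimal sketch in Python, assuming the period is counted in whole days from the confirmed positive test to the confirmed negative test. The function name and the example dates are hypothetical and are not taken from the patient records.

```python
from datetime import date


def positivity_period_days(first_positive: date, first_negative: date) -> int:
    """Days from the confirmed positive test to the confirmed negative test."""
    if first_negative < first_positive:
        raise ValueError("negative result precedes positive result")
    return (first_negative - first_positive).days


# Hypothetical dates, for illustration only.
print(positivity_period_days(date(2020, 1, 25), date(2020, 2, 5)))  # 11
```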

Procedure and virus strain collection
SARS-CoV-2 was confirmed by real-time RT-PCR 8 from throat-swab and sputum samples in our hospital and the local center for disease control and prevention (CDC) of Zhejiang province. All patients received chest X-ray or CT scan on admission, and other respiratory viruses such as influenza A (H1N1, H3N2, H7N9), influenza B, respiratory syncytial virus, SARS-CoV and MERS-CoV were excluded. The epidemiological, anthropometrical, clinical and laboratory data were collected on admission, with specific attention paid to the period between symptom onset and outpatient visit/PCR confirmation/hospital admission. One strain of SARS-CoV-2 was successfully isolated from the sputum of a COVID-19 patient with mild type in our hospital, followed by whole genome sequencing with a previously reported method 18 .

Phylogenetic and Relative synonymous codon usage analysis
Phylogenetic analysis was performed on a total of 80 coronavirus strains, covering 6 host species (human, bat, mink, camel, rat and pig). The SARS-CoV-2 strains came from 17 cities in 7 countries, with collection dates spanning virus outbreak and dissemination (2019.12.23-2020.2.5). The evolutionary history was constructed based on the S protein of coronavirus by the Neighbor-Joining method. The bootstrap ... 20 .
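Relative synonymous codon usage (RSCU) is conventionally the observed count of a codon divided by the count expected if all synonymous codons for the same amino acid were used equally. The following is a minimal sketch under that standard definition. It is not the authors' pipeline; the function name and the toy sequence are hypothetical, and the standard genetic code is assumed.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict:
    """Return RSCU per codon for an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group codons into synonymous families by amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        # Stop codons and single-codon amino acids (Met, Trp) carry no signal.
        if aa == "*" or len(family) == 1:
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy sequence: alanine encoded three times by GCT and once by GCC.
print(rscu("GCTGCTGCTGCC")["GCT"])  # 3.0
```

An RSCU above 1 marks a codon used more often than expected under equal usage, and a value below 1 marks one used less often.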

Single cell transcriptome data analysis
The raw counts or processed data were downloaded from the Tissue Stability Cell Atlas (https://www.tissuestabilitycellatlas.org/) and Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/). Lung, colon and liver data were obtained from Tissue Stability Cell Atlas 21 , GSE116222 22 and HCA 23 , respectively, including samples of 5 lung, 3 colonic epithelium and 5 hepatic tissues from healthy volunteers and organ donors. Lung and liver data had been processed before downloading, and were directly used for data analysis and visualization. For liver data, cells with fewer than 100 expressed genes and 1500 UMI counts and with more than 50% mitochondrial genome transcripts were removed. Genes expressed in fewer than three cells were also removed. ... components resulted in harmony to perform cell clustering and nonlinear dimensionality reduction in the same way as for the liver data. Depending on the expression level of the cell markers provided in the original article corresponding to each scRNA-seq dataset, we further estimated which cell type each cell cluster belonged to.
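The quality-control thresholds stated for the liver data can be illustrated with a small filter over a count matrix. This is a sketch in plain Python, not the Seurat workflow used in the study. It assumes that a cell is dropped when it fails any one threshold, that mitochondrial genes are recognised by the "MT-" prefix of human gene symbols, and that the gene filter is applied after the cell filter; the function name is hypothetical.

```python
def filter_counts(counts, gene_names, min_genes=100, min_umis=1500,
                  max_mito_frac=0.5, min_cells=3):
    """Filter a count matrix given as one list of UMI counts per cell.

    Returns (filtered_matrix, kept_gene_names).
    """
    is_mito = [g.upper().startswith("MT-") for g in gene_names]

    kept_cells = []
    for cell in counts:
        n_genes = sum(1 for c in cell if c > 0)
        n_umis = sum(cell)
        mito_umis = sum(c for c, m in zip(cell, is_mito) if m)
        mito_frac = mito_umis / n_umis if n_umis else 0.0
        # Assumption: a cell must pass all three thresholds to be kept.
        if (n_genes >= min_genes and n_umis >= min_umis
                and mito_frac <= max_mito_frac):
            kept_cells.append(cell)

    # Keep genes detected in at least min_cells of the remaining cells.
    kept_idx = [j for j in range(len(gene_names))
                if sum(1 for cell in kept_cells if cell[j] > 0) >= min_cells]
    names = [gene_names[j] for j in kept_idx]
    matrix = [[cell[j] for j in kept_idx] for cell in kept_cells]
    return matrix, names
```

The defaults mirror the numbers quoted in the text (100 genes, 1500 UMI counts, 50% mitochondrial transcripts, three cells); smaller values are convenient for toy data.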
Annotated clusters were then visualized using UMAP plots with "DimPlot" function in Seurat. Normalized gene expression levels were presented in the UMAP and violin plots by R package ggplot2 26 .

Epidemiological and clinical characteristics of 788 enrolled COVID-19 patients and ZJ01 patient
As shown in Table 1, 51.65% of the 788 enrolled patients were male, with a low rate of smoking (6.85%). The top three co-existing conditions were hypertension (15.99%), diabetes (7.23%) and chronic liver disease (3.93%). The median periods from illness onset to outpatient visit, PCR confirmation and hospital admission were 2, 4 and 3 days, respectively. The most common symptoms were fever (80.71%) and cough (64.21%), while the most frequent CT/X-ray manifestation was bilateral pneumonia (37.56%). The rates of mild, severe and critical types of COVID-19 were 90.1%, 7.74% and 2.16%, respectively. The ZJ01 patient was a 30-year-old male with neither smoking history nor any co-existing condition. He visited the outpatient clinic 1 day after symptom onset and was admitted to hospital with COVID-19 confirmed by PCR test on the same day. He had no exposure history to Wuhan and none of his family members had been reported to be virus positive to date. Consistent with other COVID-19 patients, he had fever, cough and sputum production, with CT imaging showing bilateral pneumonia. This patient was categorized as mild type of COVID-19, with normal routine blood test and inflammation markers (CRP and PCT). The higher levels of ALT and serum creatinine in this patient indicated potential liver and kidney injury.
Unusually, the period of continuing nucleic acid positivity was 24 days, longer than in most patients reported from Wuhan 27 .
... differences between ZJ01 and other members of SARS-CoV-2 mainly resided in the S1/S2 and S2 (Fig 2B).
We proposed a novel categorization system for SARS-CoV-2, defining type C for 8824C, type T for 8824T and type TT for ZJ01 as a special case. According to this system, we further categorized 54 strains of SARS-CoV-2-related virus (Fig 3C).
We found 83.3% and 95.7% T type in China (n=30) and Wuhan (n=23); 60% and 100% C type in Japan (n=5) and Tokyo (n=3); 53.8% C type in the USA (n=13), 83.3% T type in California (n=6) and 100% C type in Washington ...
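The C/T categorization can be sketched as a lookup of two genome positions. The example below assumes each genome is already aligned to a common reference, so that the 1-based positions 8824 and 28247 point at the same sites in every strain. Following the definitions in the text, types C and T are decided by position 8824 alone, and TT is assigned when both positions carry T. The function name and the toy sequences are hypothetical.

```python
def ct_type(genome: str, pos_a: int = 8824, pos_b: int = 28247) -> str:
    """Classify a reference-aligned genome as 'C', 'T', 'TT' or 'unclassified'.

    Positions are 1-based, as quoted in the text.
    """
    base_a = genome[pos_a - 1].upper()
    base_b = genome[pos_b - 1].upper()
    if base_a == "T" and base_b == "T":
        return "TT"
    if base_a in ("C", "T"):
        return base_a
    return "unclassified"


# Toy sequences with the two sites moved to positions 2 and 5 for readability.
print(ct_type("ATGCTA", 2, 5))  # TT
print(ct_type("ATGCCA", 2, 5))  # T
print(ct_type("ACGCTA", 2, 5))  # C
```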
For RaTG13, the F2 locus was slightly changed (Furin score 0.333) and a novel FCS was formed in the F1 locus (Furin score 0.279). Though the changes in these two sites were inherited by SARS-CoV-2, there were still large differences in the F1 site between RaTG13 and SARS-CoV-2.
Strikingly, we found an additional PRRA sequence in the F1 site of SARS-CoV-2 when compared with RaTG13, forming a strong and reliable FCS (Furin score 0.62). Though the source of the insertion was unknown, the PRRA sequence was common in Avian Influenza virus 29 . We deduced that it might have been inherited from HKU1 and OC43, which had effective FCS in the F1 site (Furin scores 0.878 and 0.744) with respective amino acid sequences of SSRRKRR and TKRRSRR, highly similar to NSPRRAR in SARS-CoV-2. HKU1 and OC43 can cause human upper respiratory tract infection, but the symptoms are milder than those caused by SARS-CoV and SARS-CoV-2. Therefore, we reckoned that the ancestor of SARS-CoV-2 may have exchanged genes with HKU1 or OC43 to obtain the FCS in the F1 site during its evolution towards human transmission (Fig 4C). ZJ01 had a Glu702 to Lys702 substitution at the 18th amino acid after the F1 site, and a deletion (Ala771 to -) at the 37th amino acid before the F2 site. These mutations may influence the tertiary and quaternary structure of the S protein and finally change the Furin binding capacity. F1-3 sites were conserved in SARS-CoV-2 and SARS (Fig 4D), indicating the importance of mutations in these sites.

Protein structure analyses implied the mutation in F1 and F2 related area of ZJ01 may influence its binding capacity with Furin protein
Homology modeling clearly revealed the position of F1-3 sites in the S protein of SARS-CoV-2 (Fig 5A). Briefly, F1-3 were all located on the surface of the S protein and protruded outward, exhibiting high potential as substrate binding sites. F1 was located in the transition area of S1 and S2 (S1/S2) with obvious outward protrusion, F2 in the mid-lower position of S2, and F3 on the top of S1-NTD.
Further homology modeling of the S protein of GZ02, RaTG13, Wuhan-Hu-1 and ZJ01 showed large differences in the conformation of the F1 locus. From SARS and RaTG13 to SARS-CoV-2, the F1 site showed a tendency of outward protrusion (Fig 5B). Though Wuhan-Hu-1 and ZJ01 shared the same amino acid sequence in the F1 site, the mutation (Glu702 to Lys702) near the F1 site of ZJ01 may still change its conformation and result in further outward extension by 11.6 Å. Furthermore, RaTG13, Wuhan-Hu-1 and ZJ01 showed a high degree of consistency in the F2 site. The F2 site of GZ02 was deeply buried inside the S protein, which was the biggest difference from SARS-CoV-2, whose F2 site was on the surface of the S protein. Finally, RaTG13, Wuhan-Hu-1 and ZJ01 showed high similarity in the F3 site, which was missing in GZ02.
Adaptive Poisson-Boltzmann Solver (APBS) analysis revealed that furin was a protease with negative charge. Its substrate binding site (191-192, 253-258, 292-295) was covered with a large number of negative charges (Fig 6). The F1 sites of SARS-CoV-2 related viruses (ZJ01, Wuhan-Hu-1 and RaTG13) were predominantly covered with positive charge, while that of SARS was a mix of negative and positive charge. Compared with Wuhan-Hu-1, the F1 site of ZJ01 had more positive charge in its protruding head and more negative charge in its basal part.
The F2 site of GZ02 was covered with negative charge, while the F2 sites of Wuhan-Hu-1 and RaTG13 were covered with a low level of positive charge. ZJ01 had more positive charge in the F2 site than the other strains, probably caused by the nearby deletion (Ala771 to -). GZ02 had many negative charges in the F3 site, while few negative charges were identified in SARS-CoV-2 related viruses. Therefore, we speculated that, though ZJ01 shared gene similarity with Wuhan-Hu-1, its mutations near FCS changed protein conformation and surface electrostatic potential, which further influenced its binding capacity with Furin.
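APBS computes the electrostatic potential over a three-dimensional structure, which a sequence-level calculation cannot reproduce. As a much cruder illustration of why a Glu to Lys substitution shifts a region towards positive charge, the sketch below counts formal side-chain charges at neutral pH (Lys and Arg as +1, Asp and Glu as -1). Histidine, the termini and pKa shifts are ignored by assumption, and the function name and the three-residue toy peptides are hypothetical.

```python
# Formal side-chain charges at neutral pH (simplifying assumption).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_formal_charge(peptide: str) -> int:
    """Sum of formal side-chain charges over a peptide sequence."""
    return sum(CHARGE.get(aa, 0) for aa in peptide.upper())


# F1 motif of SARS-CoV-2 as quoted in the text: three arginines.
print(net_formal_charge("NSPRRAR"))  # 3

# A single Glu -> Lys substitution changes the net charge by +2.
print(net_formal_charge("AKA") - net_formal_charge("AEA"))  # 2
```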

Higher expression of Furin than ACE2 in different organs at single cell level
The protein and RNA expression levels of ACE2 and Furin in major human tissues were explored in The Human Protein Atlas (https://www.proteinatlas.org/). We found that ACE2 was predominantly expressed in small intestine, duodenum, colon, kidney and testis, while its expression in lung was relatively low (Fig. 7A). Furin was expressed in most tissues and organs of the human body, with the highest RNA expression levels in salivary gland, placenta, liver, pancreas and bone marrow (Fig. 7B). Of note, Furin had an extremely low protein expression level in lung compared with other tissues.
To further explore the correlation between ACE2 and FURIN expression, we reanalyzed single-cell RNA sequencing (scRNA-seq) data in lung, liver and colon as described in Methods (Figure 7C). Since ACE2 and TMPRSS2 co-expression was reported recently 30 , we also examined TMPRSS2 expression levels in these tissues.
In the scRNA-seq datasets, ACE2, FURIN and TMPRSS2 showed higher expression levels in liver or colon than in lung (Fig. 7E). Consistent with a previous report 31 , ACE2 was mainly expressed in alveolar type 2 cells in lung (Fig. 7D-E). ACE2 was highly expressed in liver cholangiocytes, liver hepatocytes, colon colonocytes and colon CT colonocytes compared with other liver or colon cell types, respectively. This expression pattern was the same as that of TMPRSS2, but TMPRSS2 had a higher expression level in each cell type. In contrast, FURIN was expressed in all cell types of the three tissues, but had little co-expression with ACE2.
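One simple way to quantify co-expression at single-cell level is the fraction of cells in which both genes are detected. The sketch below assumes two equally long lists of normalized expression values, one value per cell, and treats any value above a threshold (zero by default) as detected. It is an illustration of the idea, not the analysis performed in the study; the function name and toy values are hypothetical.

```python
def coexpression_fraction(expr_a, expr_b, threshold=0.0):
    """Fraction of cells in which both genes exceed the detection threshold."""
    if len(expr_a) != len(expr_b):
        raise ValueError("expression vectors must cover the same cells")
    if not expr_a:
        return 0.0
    both = sum(1 for a, b in zip(expr_a, expr_b)
               if a > threshold and b > threshold)
    return both / len(expr_a)


# Toy values for five cells: only the fourth cell expresses both genes.
ace2 = [0, 1, 0, 2, 0]
furin = [3, 0, 1, 1, 0]
print(coexpression_fraction(ace2, furin))  # 0.2
```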

Discussion
The outbreak of COVID-19 has spread rapidly over the past 2 months, causing enormous damage to China. With nationwide dissemination, its epidemiological and clinical features have changed. Accumulating evidence indicates the appearance of several characteristics distinct from Wuhan 8,9,32 , including a higher rate of mild type, lower rates of severe/critical type and mortality, and a longer period of nucleic acid positivity 10,11,27,33 . Moreover, additional transmission routes of SARS-CoV-2 have been gradually unmasked, from the previously recognized respiratory transmission to transmission through feces 34 and even tears and conjunctival secretions 35 . All these data indicate the possibility of virus evolution during its spread, towards decreased severity but increased transmissibility.
However, according to recently published virus sequencing results 18 ...
Through analysis of 788 COVID-19 patients in Zhejiang province, we found that there existed mild and severe types of SARS-CoV-2. Though we currently have no evidence to prove whether mild COVID-19 is directly caused by virus mutation or by other factors, we did find a significant difference between ZJ01 and other members of SARS-CoV-2. ZJ01 had a relatively high number of mutations (37) and its RSCU was closer to that of humans than most members of SARS-CoV-2. More importantly, ZJ01 was the only TT type among the total of 54 strains in our C/T categorization system. Though the sequence of ZJ01 was still close to Wuhan-Hu-1 (the earliest identified SARS-CoV-2) and its mutations were not sufficient to reach the threshold of forming an independent subtype, our evidence indicated that ZJ01 may represent a specific evolutionary direction of SARS-CoV-2.
In this study, we first developed the C/T categorization system of SARS-CoV-2, which revealed the occurrence of possibly inheritable mutations at the very early stage of its evolution and the potential for continuing C/T subtype formation. The TT type of ZJ01 was unique in our system. Though a similar categorization system has been recently proposed 36 , its authors did not report such a TT type in their 120 strains of SARS-CoV-2. Besides, the C/T pattern could also be used to trace the route of virus infection and evolution. For instance, we found 8 strains of T type with ... Figure 1).
Furin is well recognized as an important serine protease, which has a minimum enzyme recognition site of Arg(R)-X-X-Arg(R) and plays an essential role in influenza infection. Changes in the binding capacity of Furin in Avian Influenza may influence its pathogenicity 37 . Though Furin is not the most common protease in coronavirus, previous studies indicated its pivotal roles in SARS and MERS 28,38 .
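The minimal Arg-X-X-Arg pattern can be searched for in a protein sequence with a regular expression. The sketch below reports every occurrence, including overlapping ones, using the three F1 peptides quoted in the Results. It only tests for the minimal motif; it does not reproduce the Furin scores reported in this study, which come from a dedicated prediction tool. The function name is hypothetical.

```python
import re

# Lookahead so that overlapping R-X-X-R matches are all reported.
MINIMAL_MOTIF = re.compile(r"(?=(R..R))")


def find_minimal_furin_motifs(protein: str):
    """Return (1-based start, matched tetrapeptide) for each R-X-X-R site."""
    return [(m.start() + 1, m.group(1))
            for m in MINIMAL_MOTIF.finditer(protein.upper())]


for name, seq in [("SARS-CoV-2", "NSPRRAR"),
                  ("HKU1", "SSRRKRR"),
                  ("OC43", "TKRRSRR")]:
    print(name, find_minimal_furin_motifs(seq))
# SARS-CoV-2 [(4, 'RRAR')]
# HKU1 [(3, 'RRKR'), (4, 'RKRR')]
# OC43 [(3, 'RRSR'), (4, 'RSRR')]
```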
However, SARS-CoV-2 had a highly conserved PRRA insertion at amino acid site 690 of the S protein. This insertion may have been the critical point at which the host of RaTG13 turned from animal to human. Through sequence alignment, we found this inserted sequence may arise from the translocation between human coronavirus HKU1 and OC43 (Figure 4). SARS-CoV-2 had three FCS (F1-3), where cleavage at F1 hydrolyzes the S protein into S1 and S2 and promotes virus-cell fusion, cleavage at F2 hydrolyzes S2 and participates in virus pathogenicity after cell entry, while F3 functions through the NTD and promotes adhesion between virus and cell surface. However, whether the F3 site really exists and what its function is need further investigation. Furthermore, the target cell binding site of HKU1 and OC43 was on the S A segment of their S protein, while the corresponding site in SARS-CoV-2 was the NTD. Therefore, besides the potential interaction in the F1 site, there also exists the possibility of interaction in the NTD segment between SARS-CoV-2 and HKU1/OC43. Viruses frequently undergo mutation and adjust their RSCU under evolutionary selection pressure to adapt to the host, facilitating better replication and dissemination 40 . The choice of Furin cleavage site might be an outstanding marker for coronavirus evolution. Meanwhile, we found that the mutations in ZJ01 may change the binding force of FCS. All these data indicate that Furin might play a pivotal role in the pathogenicity of SARS-CoV-2. The evolutionary trend of increasing FCS in SARS-CoV-2 found in this study may make it more prone to influenza-like clinical manifestations, like those of human HKU1 and OC43 41 .
Single cell sequencing analysis showed that Furin had a higher expression level and wider organ distribution than ACE2, especially in salivary gland, lachrymal gland, colon, liver and kidney. Therefore, SARS-CoV-2 might evolve to utilize this specific feature by increasing FCS to become more infectious at the multi-organ level. Our hypothesis was consistent with the changed clinical characteristics of COVID-19 from published data and our observation, including detection of virus in feces 34 and conjunctival secretions 35 , decreased severity/fatality, increased liver/kidney damage and gastrointestinal symptoms, increased transmissibility and a prolonged period of nucleic acid positivity. Since ACE2 expression was quite low in the whole body including lung, we speculated that, on one hand, it might be the inflammatory reaction rather than the viral load itself that triggers severe respiratory damage; on the other hand, the utilization of Furin may help spread the virus attack from lung to other organs, contributing to the phenomenon of decreased severity but increased liver/kidney dysfunction. Nevertheless, all these speculations warrant further investigation.
To sum up, ZJ01, isolated from a mild COVID-19 patient of Zhejiang province, represents a potential branch in virus evolution. SARS-CoV-2 may adopt a mechanism similar to that of HKU1 and OC43 by depending on Furin for invasion. Such a potential evolutionary direction may promote the appearance of a mild subtype of COVID-19.