Epigenetic rewiring of pathways related to odour perception in immune cells exposed to SARS-CoV-2 in vivo and in vitro

ABSTRACT A majority of SARS-CoV-2 recoverees develop only mild-to-moderate symptoms, while some remain completely asymptomatic. Although viruses, including SARS-CoV-2, may evade host immune responses by epigenetic mechanisms including DNA methylation, little is known about whether these modifications are important in defence against and healthy recovery from COVID-19 in the host. To this end, epigenome-wide DNA methylation patterns from COVID-19 convalescents were compared to uninfected controls from before and after the pandemic. Peripheral blood mononuclear cell (PBMC) DNA was extracted from uninfected controls, COVID-19 convalescents, and symptom-free individuals with SARS-CoV-2-specific T cell-responses, as well as from PBMCs stimulated in vitro with SARS-CoV-2. Subsequently, the Illumina MethylationEPIC 850K array was performed, and statistical/bioinformatic analyses comprised differential DNA methylation, pathway over-representation, and module identification analyses. Differential DNA methylation patterns distinguished COVID-19 convalescents from uninfected controls, with similar results in an experimental SARS-CoV-2 infection model. A SARS-CoV-2-induced module was identified in vivo, comprising 66 genes of which six (TP53, INS, HSPA4, SP1, ESR1, and FAS) were present in corresponding in vitro analyses. Over-representation analyses revealed involvement in Wnt, muscarinic acetylcholine receptor signalling, and gonadotropin-releasing hormone receptor pathways. Furthermore, numerous differentially methylated and network genes from both settings interacted with the SARS-CoV-2 interactome. Altered DNA methylation patterns of COVID-19 convalescents suggest recovery from mild-to-moderate SARS-CoV-2 infection leaves longstanding epigenetic traces. Both in vitro and in vivo exposure caused epigenetic modulation of pathways thataffect odour perception. Future studies should determine whether this reflects host-induced protective antiviral defense or targeted viral hijacking to evade host defence.


Introduction
Since the end of 2019, the Corona virus disease 19  pandemic has claimed lives of millions world-wide, highlighting the global challenges in detecting, monitoring, and treating novel viral infections. While efficacious vaccines are available at present, still a lot remains to be uncovered regarding the underlying mechanisms of the interaction between the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus and its host. SARS-CoV-2 is a single stranded enveloped RNA virus belonging to the Coronaviridae family [1], which similarly to other viruses hijacks host functions for its own advantage [2][3][4][5]. Evidence suggest that epigenetic mechanisms, i.e., processes regulating transcriptional accessibility of genomes without altering the nucleic acid sequence, are involved in the hijacking process [6,7], also in SARS-CoV-2 infection. DNA methylation (DNAm) of CpG sites is considered to be the most stable epigenetic modification, as it ensures heritability throughout cell division, although it is at the same time highly dynamic in response to environmental stimuli [8]. The malleability and flexibility of the DNA methylome decreases with increasing age [9], and environmental factors such as smoking and nutrition may alter DNAm patterning in various cell types, including different immune cells [10,11]. This could have important implications in the course of COVID-19, as e.g., smoking status, BMI, sex, and age affect susceptibility to become severely ill if contracting SARS-CoV-2 [12][13][14]. Furthermore, DNAm patterns may also become altered upon microbial [15] or viral infection [16][17][18]. In line with this, we have previously observed that immune cells of asymptomatic, tuberculosis-exposed individuals carry a lasting DNAm signature that is linked to protection against mycobacterial infection [19][20][21].
A majority (40-80%) of individuals infected with SARS-CoV-2 show no or mild symptoms of COVID-19 and proceed into convalescence thereafter, while a smaller, but non-negligible, proportion of individuals show severe or life-threatening manifestations [22,23]. Tolerant immune responses have been observed in transcriptomic and immune profiling comparisons of asymptomatic and symptomatic COVID-19 patients [24]. Furthermore, as studies have shown the presence of SARS-CoV-2-specific T cell responses in mildly ill COVID-19 subjects [25], and epigenetic mechanisms regulate differentiation of e.g., T cells [8], it is conceivable that epigenetic mechanisms may be implicated in combating SARS-CoV-2 infection [26]. However, few studies have thus far addressed whether and how the epigenome is altered in subjects with a recent mild-to-moderate SARS-CoV-2 infection. In this study, we set out to examine epigenome-wide DNAm patterns in convalescent COVID-19 (CC19) subjects, after a mild-to-moderate disease course. Understanding how convalescent individuals have mounted an epigenetic response against new viruses such as SARS-CoV-2, for which no pre-existent immunity was present, may reveal how a functional defense strategy towards the virus is prepared, and guide the development of novel diagnostic and preventive measures. Indeed, we found that differential DNAm patterns separated those who have not been infected with SARS-CoV-2 from those who have recovered from mild COVID-19, suggesting that epigenetic mechanisms are at play during SARS-CoV-2 infection. The observations could be replicated in in vitro experiments, further underpinning our findings.

Study population
In this study, participants were enrolled between 29 May and 10 July 2020 during the first wave of the SARS-CoV-2 pandemic in Linköping, Sweden. Individuals who had recovered from and individuals who had not experienced COVID-19 were recruited after announcements with leaflets. Exclusion criteria were the existence of current active SARS-CoV-2 infection and/or other infectious disease symptoms, as well as being younger than 18 years. The study participants voluntarily entered the study in a consecutive manner. The study was conducted on blood and saliva samples from in total 38 individuals from three different groups ( Figure 1); non-infected controls (Con, n = 18), COVID-19 convalescents (CC19, n = 14), and symptom-free individuals with SARS-CoV-2-specific T cell responses (SFT, n = 6). Additionally, blood samples from anonymous healthy blood donors from the blood bank at Linköping University Hospital before 2020 were included as a separate group in the analyses (pre20, n = 5), collected between 2014 and 2019 prior to the outbreak of the pandemic. CC19 participants presented with either mild or asymptomatic initial infections, and none were admitted to hospital. Con participants were healthy individuals with no other known severe disease background who furthermore were defined as neither having any positive circulating IgG-antibody or T cell responses to SARS-CoV-2, while CC19s were defined by the presence of SARS-CoV-2-specific IgG antibodies in plasma using suspension multiplex immunoassay (SMIA), some of which were positive for IgG in saliva, rapid test, and in T cell responses as well. From the included individuals, the following information was retrieved using health questionnaires: self-reported COVID-19 symptoms (if applicable, one or several of the following: fever, headache, shortness of breath, loss of smell/taste, cough, fatigue, muscle pain, nausea, sinusitis/congestion), date of self-reported symptoms, weeks between symptoms and sampling, age, sex, smoking, weight, height, comorbidities, as well as medications. Blood and saliva from the study participants were processed in a Biosafety level-2 facility. The present study is an exploratory pilot study on the effects of mildto-moderate SARS-CoV-2 infection on DNA methylation patterns in PBMCs. The sample size could not be determined beforehand, as the SARS-CoV-2 infection rate in society was not known at the time of sample collection. Hence, all individuals fulfiling inclusion criteria, consenting to participation and providing both blood and saliva samples were included in the study. However, the sample size is similar to previous studies on the effect of BCG vaccination [19,27], where meaningful differences in DNA methylation upon tuberculosis infection were shown. Likewise, the belonging to the different groups described above was determined after the performance of the DNA methylation analyses, and hence handling, extraction, and experimental procedures performed on the samples were performed in a blinded fashion. For samples from the natural exposure cohort, all Outline of included participants, experimental procedures as well as statistical and bioinformatic approaches utilized in the present study. CC19 -convalescent COVID-19, Con -non-infected control, DMG -differentially methylated gene, Pre20 -Pre-2020 non-infected control, SFT -symptom-free individuals with SARS-CoV-2-specific T cell response, SMIA -suspension multiplex immunoassay.
participants provided written informed consent, and the present study was approved by the Regional Ethics Committee for Human Research in Linköping (Dnr. 2019-0618). Regarding the anonymous blood samples used for in vitro experiments, informed consent was given by the healthy donors at the time of blood donation, and the use of the donated blood for research purposes was guaranteed as per the guidelines of Regional Ethics Committee for Human Research in Linköping and the Helsinki Declaration.

PBMC and plasma isolation from whole blood
Peripheral blood was collected in three 10 ml EDTA tubes (BD Vacutainer, 10331254, Fisher Scientific, Sweden). Up to 20 ml of whole blood was used for PBMC isolation after Ficoll-Paque Plus gradient centrifugation (GE17-1440-03, GE Healthcare Life Sciences, Sigma-Aldrich, Sweden) with SepMateTM tubes (85450, StemCell Technologies, France) according to the manufacturer's protocol. Cells were frozen in 10% DMSO (10103483, Fischer Scientific, Sweden) in foetal bovine serum (FBS) (10270106, Gibco, Fisher Scientific, Sweden) and kept at −150°C until analysis. After thawing, the cells were washed twice in cell culture medium (RPMI medium 1640, 31870-025, 10% foetal bovine serum, 1% penicillin/streptomycin, 15140, 1% L-glutamine, 25030081, all from Gibco, Fisher Scientific, Sweden) further on termed as complete culture medium, prior to further processing. Up to 10 ml of whole blood was used for plasma separation by centrifugation (2000 g for 15 min, 4°C) and aliquots were stored at −80°C till further analysis.

Measurements of SARS-CoV-2-specific T cell responses using ELISpot
Peptides for the spike (S) protein of SARS-CoV-2 were obtained from Mabtech (3629-1, Sweden) and were reconstituted with dimethyl sulphoxide (DMSO) at a concentration of 200 µg/ml according to the manufacturer's instructions. The SARS-CoV-2 S1 scanning pool contains 166 peptides consisting of 15-mers, overlapping with 11 amino acids, covering the S1 domain of the spike S1 protein (amino acid 13-685). The peptides were combined into one pool. IFN-γ ELISpot Plus kit was purchased from Mabtech (3420-4HST-10, Sweden). Briefly, the pre-coated wells were plated with unfractionated PBMCs at counts of 300 000 cells/well, and the cells were cultured with peptides for the S protein of SARS-CoV-2 at a final concentration of 2 µg/ml (diluted in complete culture medium) for 20-22 hr in a 37°C, 5% CO 2 incubator. Cells cultured with medium alone were used as negative controls. Stimulation with anti-CD3 antibody at a concentration of 1 µg/ml was used as a positive control for each subject. Anti-CD28 antibody (3608-1-50, Mabtech, Sweden) was included at a final concentration of 0.1 μg/ml as a co-stimulator. All experiments were conducted in duplicates and the results represent the mean of the duplicates. The plates were then processed according to the manufacturer's protocol. Estimation of specific T cell numbers was expressed as spot-forming cells per 1 × 10 6 PBMCs (SFC). SFC were counted using an automated reading system (BioSys Bioreader 5000 Pro-F beta, Bio-Sys GmbH, Germany) and assessed with the Bioreader 5000 analyser. A stimulation index was calculated by dividing the SFC elicited by a SARS-CoV-2 stimulus by the SFC present in the negative control wells. An increment value was calculated by subtracting the SFC from the negative control wells from the SFC of the stimulated wells. A stimulus was considered to be positive when the stimulation index was >2, and the increment value was >10.

Saliva samples
Prior to saliva collection, participants were required to rinse their mouths with water and confirmed they did not show documented oral disease or injury, that they had fasted, refrained from smoking, chewing a gum, taking oral medication, tooth brushing for a minimum of 1 hour before sampling, and that no dental work had been performed within 24 hours prior to sample collection. Donors were asked to provide a 5 ml sample of saliva in a 50 ml sterile conical tube by passive drooling.
All saliva samples were stored/transported on ice upon receipt of the laboratory for processing to preserve sample integrity. Samples were centrifuged (2500 g for 20 minutes at 4°C) to pellet cells and insoluble matter. The supernatant was collected, and samples were complemented with cOmplete™ protease (#11836170001, Sigma) and PierceTM phosphatase inhibitor cocktails (#88667, Thermo Scientific), aliquoted, and frozen/stored at −80°C on the same day. On the day of the assay, samples were thawed and microcentrifuged (2500 g for 10 minutes at 4°C) prior to analysis.
For plasma samples, 50 µl of plasma diluted 1:1000, and for saliva samples 50 µl of sample diluted 1:2 in PBS-T containing and 1% (v/v) BSA (Sigma-Aldrich Sweden AB, Stockholm, Sweden, #Sigma-Aldrich-SRE0036) (PBS-T + 1% BSA) was added per well of a flat bottom, 96-well µClear non-binding microtiter plate (Greiner Bio-One GmbH, Frickenhausen, Germany, #Greiner-655,906). Fifty microlitres of a vortexed and sonicated antigen-coupled bead mixture suspended in PBS-T + 1% BSA (~50 beads/µl) was then added to each well. The plate was incubated in the dark at 600 rpm for 1 h at RT. The wells were then washed twice with 100 µl of PBS using a magnetic plate separator. The beads were resuspended in 100 µl of 1 µg/ml goat anti-human IgG-PE labelled antibody (Southern BioTech, Birmingham, AL, USA. Cat. #2040-09) in PBS-T + 1% BSA and incubated for 30 min at RT in the dark with rotation at 600 rpm. The beads were subsequently washed twice with PBS, resuspended in 100 µl of PBS, and analysed in a FlexMap 3D® instrument (Luminex Corporation, Austin, TX, USA) according to the manufacturer's instructions. A minimum of 100 events for each bead number was set to read and the median value was obtained for the analysis of the data. All sample analyses were repeated three times. A naked, non-antigen-coupled bead was included as a blank along with PBS-T + 1% BSA as a negative control.

In vitro stimulation with SARS-CoV-2
PBMC samples from four healthy blood donors, frozen in 2019 in −150°C in foetal bovine serum (FBS) with 10% DMSO, were thawed and added to 10 ml of Gibco Dulbecco's Modified Eagle Medium (DMEM) (Thermo Fisher Scientific, Waltham, US) containing 1% L-glutamine (Cat no: 25,030-024, Gibco, Waltham, Massachusetts, USA), 1% penicillin-streptomycin (Cat no: 15,140,148 Gibco) and 10% normal human serum (NHS) (pooled from 5 donors) filtered through a 40 µm strainer and pre-heated to 37°C. The cells were washed twice by centrifugation at 330 g for 10 min. The pellet was resuspended in 1.5 ml medium and 2 million per donor were seeded in six-well plates and incubated for 16-24 h. The cell culture media were collected and centrifuged at 330 g for 5 min to pellet the nonadherent cells.
For in vitro infection experiments, SARS-CoV-2 virus previously isolated in a Biosafety level 3 lab according to local safety regulations from the nasopharyngeal aspirate of a COVID-19 patient (early April 2020) was used [29]. The isolated virus was passaged five times in Vero E6 cells and for cell infection experiments, freeze-thawed medium supernatants of 4-5 days infected cells or mock supernatants were used. Virus titres were determined using immunoperoxidase assay. In brief, two-day-old confluent cells (in a 96-well plate) were first washed with DMEM (Gibco, Code: 13345364) containing 100 μg/ml gentamicin, and 100 μl of 10-fold serially diluted SARS-CoV-2 virus lysate was added in quadruplicate. SARS-CoV-2 or mock Vero cell supernatant was added to the PBMC cultures corresponding to a multiplicity of infection of 0.01. Two hours post infection, the cells were washed twice with DMEM and 100 μl of fresh DMEM (containing 2% FBS and 100 μg/ml gentamicin) was added, and the plate was incubated for 8 hours at 37°C in presence of 5% CO 2 . After incubation, the supernatant was discarded, and the cells were fixed for 2 hours with 4% formaldehyde. Next, Triton-X (1:500 in phosphate buffered saline, PBS) was added for 15 min, washed once with PBS and incubated for 2 hours at 37°C with PBS containing 3% BSA. Next, the cells were incubated with mouse-anti-dsRNA antibody (Scions, Code: J2 at 1:100 dilution) for 1.5 h followed by detection using horseradish peroxidase-conjugated goat anti-mouse IgG (heavy plus light chain) (Catalogue: 1,706,516, Bio-Rad Laboratories, Hercules, CA, USA) (1:1000) for 1 h. The plates were washed five times with PBS between every incubation, all incubations were done at room temperature, and the antibody dilutions were made in PBS containing 1% BSA. Finally, the SARS CoV-2-infected Vero E6 cells were identified using 3-aminoethylcarbazole (AEC) substrate. The spots representing virus-infected cells were counted under the light microscope, and the virus lysate was titrated to be 5 × 10 6 per ml.
Cells were monitored in the IncuCyte S3 live cell analysis system (Sartorius, Göttingen, Germany) to allow quantification of cell death in SARS-CoV-2 infected wells versus controls. After 48-h incubation, the cell culture media was collected from each well and centrifugated at 330 g for 5 min to collect the non-adherent cells. Lysis buffer (RLT from the AllPrep® DNA/RNA Mini Kit, Qiagen, Hilden, Germany) was added to the wells to lyse adherent cells, and the mixture was then added to the pelleted non-adherent cells in order to collect DNA (according to the manufacturer's instructions) from the entire PBMC fraction.

Descriptive analyses on demographic variables.
Initial descriptive analyses of demographic variables were performed on the available information about age, gender, smoking, and BMI (kg/m 2 ). Continuous variables were compared using an unpaired two-tailed t-test, and categorical variables were examined using the Pearson χ2 test or Fisher's exact test (if the number of observations was smaller than five), see Table S1.

DNA methylation analyses.
The resulting raw IDAT-files from the MethylationEPIC array analyses were processed in R programming environment (version 4.0.2). The analyses described below were identically performed for the clinical in vivo cohort and the in vitro experiment, unless stated otherwise.
Pre-processing and quality control. The resulting raw IDAT-files containing the raw DNA methylation profiles for each cell type were analysed in R (version 4.0.2) using the minfi package [30] (version 1.36.0), and the data were pre-processed in several steps. The following filters were applied: i) removal of probes with detection p-values above 0.01, ii) removal of non-CpG probes, iii) removal of multi-hit probes, and iv) removal of all probes in X and Y chromosomes. Pre-processing and quality control -In vivo. We removed the sex chromosomes from our data set, as female X-inactivation skews the distribution of beta values ( Figure S1). Of the initial 865918 probes, 841524 probes remained upon filtering. After filtering, quality control was performed, and normalization of the data was done with subset-quantile within array (SWAN) normalization method [31]. The β-values and M-values of the samples were calculated against each probe per sample. The quality of the data was assessed before and after the normalization ( Figure S2). Thereafter, we performed singular value decomposition (SVD) analyses using the ChAMP package [32] (version 2.19.3) to identify underlying components of variation within the filtered and normalized data set ( Figure S3). Significant components consisted of slide, batch, and sample groups that contributed to variation within the data set. Corrections were performed for the identified components using ComBat from the SVA package [33] (version 3.38.0). As PBMCs consist of multiple nucleated cell types in peripheral blood, we utilized the Houseman method to infer cell type proportions within the samples [34]. No differences could be determined in cell type proportions between any of the individuals or between sample groups (Table S2), motivating our choice of not correcting for these cell type proportions. We next performed Principle Component Analysis (PCA) using the normalized batch corrected β-matrix utilizing the R package pca3d (version 0.10.2). To determine which known sources of variation (i.e., disease phenotype, gender) explain the total variance of the data set, we applied regression of principal components (PCs) over the independent variables. Pre-processing and quality control -In vitro. In this dataset, we did not have any information on demographic variables, as the samples were derived from anonymous donors. However, we still removed the sex chromosomes from our data set, as female X-inactivation skews the distribution of beta values. Of the initial 861728 probes, 837694 probes remained upon filtering. After filtering, quality control was performed, and normalization of the data was done with SWAN normalization method [31]. The Houseman method was utilized to infer cell type proportions within the samples [34], yet again revealing no differences could be determined in cell type proportions between any of the individuals (Table S2), motivating our choice of not correcting for these cell type proportions. The β-values and M-values of the samples were calculated against each probe per sample. The quality of the data was assessed before and after the normalization ( Figure  S4). SVA package (version 3.40) was applied to correct the batch effect. Cell deconvolution was performed using FlowSorted.Blood.EPIC package (version 1.11).

Differential DNA methylation analysis
Differential DNA methylation -In vivo. As we were interested in studying CpGs that were differentially methylated between CC19s and noninfected controls from both before and after the start of the COVID-19 pandemic, we performed differential DNA methylation analyses, using the limma package (version 3.46.0). A linear model was fitted to the filtered, normalized and SVDcorrected DNA methylation data. Identifying sources of variation that were still present upon SVD correction provided the basis for the inclusion of these variables as co-variates in the models, in this case, gender and BMI ( Figure S3). For each investigated probe, moderated t-statistics, log2 Fold Change (logFC), and p-values were computed. The logFC values represent the average beta methylation difference (from hereon referred to as mean methylation difference, MMD) between the CC19s (n = 14) vs. non-infected controls (Cons + Pre20, n = 18 + 5). SFTs were excluded from these analyses as these individuals displayed SARS-CoV-2-specific T cell responses despite being reportedly healthy. Differentially methylated CpGs (DMCs) were defined as CpG sites having a nominal p-value of less than 0.01 along with an MMD of >0.2. As a means to ascertain the quality of the identified DMCs, genomic inflation and pertaining bias were estimated using the BACON package [35] (version 1.18.0). As the estimated genomic inflation for the comparison was close to 1 (genomic inflation: 1.20, bias: 0.01, Figure S5), this suggested that no major genomic inflation was present in the comparisons, and no correction for this was deemed necessary. The distribution of the DMCs among all investigated DNA methylation sites was illustrated by creating volcano plots (EnhancedVolcano, version 1.8.0).
A heatmap of the identified DMCs was generated using the normalized batch corrected β-matrix for all samples in each group (CC19, Con, Pre20, and SFT) using the ComplexHeatmap package in R (version 2.6.2). The clustering dendrograms in the heatmap were plotted using the Euclidean distance matrix. Thereafter, the DMCs were mapped to their corresponding DMGs. DMGs contained at least one DMC and were considered hyper-or hypomethylated if all DMCs within the gene were hyper-or hypomethylated, respectively. If both hyper-and hypomethylated genes were present in the same gene, the gene was considered having a mixed methylation pattern. Differential DNA methylation -In vitro. To evaluate the difference between the mock and infected sampeles, the fold change was calculated using the cut-off obtained from the density plot (M-value >| 2|; Figure S6) for each CpG site. Only those CpGs with higher values than the cut-off were selected for further analysis. Venn analysis was performed among the samples using the ggVennDiagram (version 1.1) package in R (version 4.0.3) and bioconductor (version 3.12).

Pathway over-representation analyses.
To make biological sense of the putatively SARS-CoV -2-induced DNA methylation differences, we performed PANTHER pathway over-representation test analyses using the PANTHER database (version 16.0). The Fisher's exact test was used for the generation of nominal p-values (significance level set to p-value of <0.05), in case the false discovery rate correction was too stringent. The significantly enriched pathways were displayed in dot plots generated in R using ggplot2 package (version 3.3.3).

Network analyses.
Network analyses were conducted to generate further and wider biological insight about the DMGs generated in the in vivo. An input object was constructed using the pre-2020 (Pre20, n = 5) and post-2020 (Con, n = 18) non-exposed controls and COVID-19 convalescents (CC19, n = 14), as a two-column data frame containing gene annotation and P-value of the significant DMGs (n = 54). The graph clustering algorithm MCODE [36] was used to identify molecular complexes and create a large disease module, which was then fitted to a protein-protein interaction network, and both were analysed and rendered in Cytoscape (version 3.8.0). High confidence interactions with a STRINGdb confidence value >0.7 were displayed in the network. Centrality measurements of degree, betweenness, and closeness were used to expose the most central nodes in the network. Finally, a functional enrichment of the genes present within the module was carried out using StringDB [37]. In addition, the inference of modules was performed with two other methods from the MODifieR package (DIAMOnD and WGCNA) [38] to study whether it was possible to condense the module genes to fewer genes of particular interest within the network, for both the in vivo and the in vitro setting.

Overlap to SARS-CoV-2 interactome.
A publicly available protein-protein interaction (PPI) network of SARS-COV-2 and human genes curated by BioGRID (version 4.4.197) was downloaded from the Network Data Exchange in Cytoscape (version 3.8.0). The DMGs from the in vivo and in vitro settings alongside the gene list from the module generated by MCODE were overlapped onto the PPI network to visualize their respective distributions.

COVID-19 convalescents display altered DNAm patterns compared to non-infected controls
We compared epigenome-wide DNAm patterning in peripheral blood mononuclear cells (PBMC) from non-infected controls (Con, n = 18), COVID-19 convalescents who had recovered from mild or moderate symptoms (CC19, n = 14), donor blood collected before the pandemic (Pre20, n = 5) and from asymptomatic individuals presenting with SARS-CoV-2-specific T cell responses (SFT, n = 6, Figure 1). Comparisons of demographic variables revealed no significant differences between any of the groups (Table S1). To examine any inherent differences in the DNA methylome between the different sample groups, principal component analyses (PCA) were performed. Three principal components (PCs) were identified as both contributing to the variation within the DNAm data and correlating with the sample groups ( Figure S7). A three-dimensional illustration of these three most contributing components revealed that the CC19 subjects are distinct from the Con, Pre20, and SFT subjects, whose centroids clustered more closely together (Figure 2(a), Figure S8). The observed methylome-wide differences prompted us to identify differentially methylated CpGs (DMCs), which we defined as CpG sites with a nominal p-value of <0.01 along with a mean methylation difference (MMD) of >0.2. We found 87 DMCs, when comparing the DNA methylomes of CC19s to the merged groups of Cons and Pre20s (Figure 2(b), Table 1, Table  S3a). This identified DMC signature could furthermore distinguish the CC19s from Cons, Pre20s, and SFTs (Figure 2(c)), suggesting that a past SARS-CoV-2 infection may have resulted in modulation of the epigenome that persists at least a couple of months after the virus is eliminated from the body. Interestingly, the majority of CC19s showed positive SARS-CoV-2-specific IgG responses both in the circulation and in saliva (Figure 2(c)). Individuals who showed T cell responses towards SARS-CoV-2 or presented with SARS-CoV-2-specific antibodies in saliva while being negative for antibodies in plasma, aligned with the uninfected controls in the PCA and unsupervised clustering analyses (Figure 2(a,c)).

Differential methylation of COVID-19 convalescents identifies a putatively SARS-CoV-2-induced module
To further explore the biological impact of SARS-CoV-2 exposure in the CC19 subjects, the identified DMCs were annotated to their respective differentially methylated genes (DMG), resulting in 54 unique genes ( Table 1, Table S3b). Subsequent pathway over-representation analyses revealed involvement in two significantly over-represented pathways (Wnt and integrin signalling pathways, Table S4).
As a means to elaborate on the wider interaction context in which the DMGs act with other proteins, the DMGs (n = 54) were used as seed genes in the identification of SARS-CoV -2-induced modules in network analyses. The resulting module consisted of 66 genes from the protein-protein interaction network, with 139 intra-network interactions, which is significantly more interactions than expected (34 interactions) for a network of that size (Figure 3, Table S5). Six of these genes were present in at least two module identification methods (INS, HSPA4, SP1, ESR1, TP53, and FAS), and they were all located in the centre of the module.
The four genes with the highest combined centrality scores were HSP90AA1, TP53, INS, and CFTR. Pathway over-representation analyses of the 66 module genes revealed involvement in pathways such as apoptosis signalling, muscarinic acetylcholine receptor 1 and 3 signalling, and gonadotropinreleasing hormone receptor pathway ( Figure S9).

SARS-CoV-2-stimulated PBMCs in vitro reveal overlaps with in vivo differential methylation, network analyses, and SARS-CoV-2 interactome
In the present study, we only had access to selfreported time-after-onset of COVID-19 symptoms (Table S6), thus making the immediate effects of SARS-CoV-2 exposure on the epigenome impossible to analyse. Moreover, as the virus-induced DNAm patterns in the CC19ʹs may fade over time, we set out to examine SARS-CoV-2-induced DNAm patterns in an in vitro setting. To this end, we exposed pre-2020 PBMCs collected from four blood donors in 2014-2019 to SARS-CoV-2 at a low multiplicity of infection (MOI = 0.01) for 48 h to mimic immediate in vivo exposure to the virus ( Figure S10), and compared genome-wide DNA methylation changes to non-infected mock samples from the same individuals. Exploring the intra-individual DNAm differences between stimulated and unstimulated cells, a set of DMCs (n = 3693), (Table 1), (Figure 4) were identified to be shared between all four individuals (Table S7a-b). These DMCs were mapped to in total 606 DMGs (542 unique genes, Table 1, Table  S7c), which were significantly over-represented in a number of pathways including several glutamate receptor pathways, muscarinic acetylcholine receptor 1 and 3 signalling pathway, as well as the Wnt and cadherin signalling pathways ( Figure S11).
As similar pathways were revealed in the findings from the in vivo study and the SARS-CoV-2 stimulations, we wanted to explore further similarities in DNAm between the in vivo and in vitro settings. Analyses of the overlap of shared DMGs identified Additionally, to understand the biological context of the genes identified in the in vitro comparison, we performed network analyses in the same manner as for the in vivo comparison. These analyses found a module consisting of six genes (Table 1) (TP53,  INS, HSPA4, SP1, ESR1, and FAS), which were among the previously identified module genes from the in vivo setting and were also identical to those that had been identified by more than two module identification methods ( Figure 3). Furthermore, explorations of the overlap between identified genes in the differential DNAm analyses and network module analyses of the genes from a publicly available SARS-CoV-2 interactome identified numerous interactions in the in vivo (n = 11/ Combined ranked scores of centrality quantification of degree, betweenness and closeness is visualized as a colour (light orange to dark red) continuum, with dark red nodes constituting the most central parts of the network. Nodes that were also found both when utilising two other module identifying methods (DIAMOnD and WGCNA) and when performing the same analyses on the in vitro data set using MCODE are enclosed with a black line.

Discussion
The epigenetic events triggered during a mild COVID-19 disease course are largely unexplored, despite the fact that these individuals make up a majority of all SARS-CoV-2-infected individuals.
The main finding of our study was an observed DNAm signature that was evident several months after recovery in CC19s compared to non-infected individuals. Although this has, to our knowledge, not previously been described, further investigations are needed to prove whether this particular signature is a remaining epigenetic mark from the time of active infection. Studies of DNA methylomes in circulating cells of COVID-19 patients have so far focused on the hospitalisation phase for moderate-to-severe disease or at discharge [39][40][41][42][43], and none of these studies report comparisons upon convalescence from a mild-to-moderate disease course. Not surprisingly, most of these studies mainly identify the engagement of several antiviral immune-related pathways as well as inflammatory responses in severely ill COVID-19 patients compared to controls [42,43]. In contrast, our pathway over-representation analyses revealed the involvement of distinct, previously unappreciated pathways such as the Wnt signalling and the muscarinic acetylcholine receptor 1 and 3 signalling pathways. The Wnt signalling pathway has been implicated in several aspects of COVID-19, including development of inflammation, cytokine storms, as well as pulmonary fibrosis [44]. Furthermore, potential viral hijacking of host Wnt targets has been suggested upon SARS-CoV-2 infection in multi-omics studies [45]. The muscarinic acetylcholine receptor 1 and 3 signalling pathway was present in the module identification analyses from both the natural in vivo exposure and the in vitro stimulations. In mice, it has been shown that blocking of the muscarinic acetylcholine receptor 1 and deletion of both muscarinic acetylcholine receptors 1 and 3 actually leads to deficits in olfactory perception [46,47]. Furthermore, in post-viral fatigue patients, including post-SARS-CoV and myalgic encephalomyelitis/chronic fatigue syndrome patients, this signalling pathway is dysfunctional, which has been tentatively attributed to the development of anti-muscarinic receptor autoantibodies [48,49]. This is interesting in terms of anosmia (loss of smell) in SARS-CoV-2 infected individuals, particularly in those who experience long-term symptoms, as it could indicate that epigenetic mechanisms are at play. Although we cannot draw any conclusions regarding expression from our data, the consequent immersion of these pathways suggests that they are indeed modulated. Future studies should elaborate on the role of Wnt and muscarinic acetylcholine signalling in the development of post-acute COVID-19 syndrome, as the effects we observe have persisted for months after the initial exposure to the virus.
In the present study, DNA methylome analysis of PBMCs identified a number of genes that were shared between the natural in vivo infection and following in vitro stimulation, which were further confirmed by several module identification methods. One of these genes was tumour protein 53 (TP53), an evolutionarily conserved protein that is one of the most well-studied hub genes in cell signalling due to its central role in cancer [50]. TP53 has in several other studies previously been identified as a hub gene, in whole blood from COVID-19 patients [51], and has been shown to interact with ACE2 in SARS-CoV-2-infected human induced pluripotent stem cell-derived cardiomyocytes [52]. Moreover, transcriptomic analyses of PBMCs from a small group of patients infected with SARS-CoV-2 revealed involvement of apoptosis and TP53 signalling pathways [53], a finding that was further supported by studies of the SARS-CoV-2 interactome, where TP53 was identified as a central player in apoptosismediated pathways [54]. Two additional genes, both members of the heat shock protein family, HSP90AA1 and HSPA4 stand out in the network derived from our in vivo and in vitro data. Interestingly, reports on differentially expressed genes overlapping between acute respiratory distress syndrome and venous thromboembolism datasets identified TP53 and HSP90AA1 as central genes, among the top ranked hub genes in their networks [55]. HSP90AA1 was previously shown to be upregulated in bronchial cells of patients with mild COVID-19 disease, as compared to those with a severe disease course [56], suggesting that this gene may be of particular importance in the mounting of a protective antiviral response. Although our study does not provide any evidence for a protective role of the observed epigenetic alterations, HSP70 family members have been discussed as both anti-viral defence components [57,58], and anti-viral drug targets, against SARS-CoV-2 [59]. Altogether, our findings on the network centrality of the hub genes that we derived from the in vivo and in vitro data suggest that they may be of particular importance in the interaction with epigenetically modulated genes upon SARS-CoV-2 infection. Nevertheless, further studies are needed to elucidate the mechanistic role of these genes during infection and recovery from COVID-19.
A limitation of this pilot study is the lack of validation of the DNAm findings on a transcriptional level. Since epigenetic alterations do not necessarily affect basal transcription levels, such studies need to address the transcriptome comparing epigenetically naïve and rewired samples with and without the exposure to a relevant stimulus [60,61]. Only then, when the need for an activation of defense systems seems apparent can differences be detected at the transcriptome level. Hence, whether the observed DNAm patterns are indeed associated or causally linked to host protective or host detrimental immune responses still needs to be addressed in future studies, with more well-designed, larger cohorts, and consecutive sample materials from the onset of SARS-CoV-2 infection. The investigation of epigenetic modifications in mild-to-moderately ill COVID-19 patients enabled us to discern DNAm differences that otherwise would have been masked by overriding inflammatory responses. Though these subtle changes may not primarily be relevant to immune response severity towards SARS-CoV-2, they may be insightful for the identification of both effective host protective mechanisms at play, or ensuing deliberating conditions such as long-COVID. The presentation of longstanding symptoms in long-COVID could be attributed to detrimental alterations in DNAm patterns, though originally triggered as a short-term anti-viral response.
In conclusion, we found epigenome-wide differences in DNAm patterns of individuals that had recovered from a mild-to-moderate disease course of COVID-19 compared to non-infected controls. This study suggests that DNAm is one of several epigenetic mechanisms that is altered upon SARS-CoV-2 infection. Presently, several clinical trials investigating how DNA methylation may impact and predict short-and long-term outcomes of COVID-19 are ongoing (ClinicalTrials.gov-ID: NCT04364828, NCT04411563, NCT04859894) and these studies will, along with our upcoming longitudinal studies of the epigenetic impact of SARS-CoV-2-infection (NCT04368013), further elaborate on whether our observed findings are induced by protective host responses or constitute virally induced hijacking processes. Pinpointing these matters will aid the development of efficacious diagnostic tools and treatments of COVID-19 in the future.