Identification of a promiscuous conserved CTL epitope within the SARS-CoV-2 spike protein

ABSTRACT
The COVID-19 disease caused by infection with SARS-CoV-2 and its variants is devastating to global public health and the economy. To date, over a hundred COVID-19 vaccines are known to be under development, and the few that have been approved to fight the disease use the spike protein as the primary target antigen. Although virus-neutralizing epitopes are mainly located within the RBD of the spike protein, the presence of T cell epitopes, particularly the CTL epitopes that are likely to be needed for killing infected cells, has received comparatively little attention. This study predicted several potential MHC-I and MHC-II T cell epitopes with web-based analytic tools and narrowed them down by ELIspot and cytolytic assays to a conserved MHC-I epitope. The epitope is highly conserved in current viral variants and compatible with presentation by most HLA alleles worldwide. In conclusion, we identified a CTL epitope suitable for evaluating the CD8+ T cell-mediated cellular response and potentially for addition to future COVID-19 vaccine candidates to maximize CTL responses against SARS-CoV-2.


Introduction
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) was first identified in Wuhan at the end of 2019, spread at unprecedented speed, and became a disaster to human beings worldwide [1]. Effective vaccines, antiviral drugs, and treatments are high priorities in meeting this challenge. SARS-CoV-2 has four main structural proteins that have been considered for inclusion in vaccines: the envelope, membrane, nucleocapsid, and spike proteins. The spike protein has a receptor-binding domain (RBD) that specifically binds to human angiotensin-converting enzyme 2 (hACE2) as a receptor and mediates virus entry into the host cell [2,3]. Neutralizing antibodies recognizing the RBD can block the spike protein from binding to hACE2 and inhibit virus entry [4,5]. Therefore, the spike protein has been the primary choice as the immunogen in candidate vaccines.
Although protection against disease via vaccine-induced neutralizing antibodies has been demonstrated, the elimination of SARS-CoV-2 infection within the host is also essential. The numbers of mild and asymptomatic cases have been rising dramatically in recent years, and such cases remain infective, prolonging viral dissemination [6]. To eliminate the viral infection, induction of a potent antigen-specific CD8+ T cell response by vaccination is probably critical [7,8]. T cell immunity is indispensable for viral clearance, as demonstrated in animal models infected with viruses like JEV, DENV, and recently Zika, among others [7,9,10]. Few of the currently available methods can monitor virus-specific CD8+ T cells, and consequently, few studies have investigated whether virus-specific CTLs influence the pathology of COVID-19 or contribute to the elimination of the virus. Identification of peptides recognized by CTLs would help address these issues by enabling analysis of the distribution, function, and phenotype of specific CD8+ T cells in SARS-CoV-2-infected mice and by facilitating studies of the effect of the effector T cell immune response on virus clearance in such models [11][12][13].
To activate a viral-specific CD8+ T cell response, the vaccine must contain highly active major histocompatibility complex class I (MHC-I) epitopes that MHC-I molecules can present to interact with CD8+ T cell receptors (TCR). The potentiation of viral-specific CD8+ T cell responses depends on the high affinity and avidity of MHC-I and TCR binding. There is a lack of information on the CD8+ T cell-recognized epitopes within the spike antigen; consequently, only overlapping peptide pools covering the whole region of spike antigen have been used routinely to evaluate cell-mediated immunity (CMI) of vaccine candidates [14][15][16]. A few reports have suggested that CTL epitopes are present within the spike protein, but only one epitope has been reported among the potential sequences discovered [17]. Identifying those CD8+ T cell epitopes would provide an important tool to evaluate the T cell immunity in vaccinated individuals or patients and was undertaken here.
This study utilized web-based tools to analyze the transporter associated with antigen processing (TAP) transport potential of the human MHC-I epitopes that were predicted by the Immune Epitope Database analysis (IEDB) resource [18] to be present in peptide pools covering the N-terminal domain (NTD) and receptor-binding domain (RBD) of the spike protein. We demonstrated that peptide 2 (YYVGYLQPRTFLLKY), although it did not give the highest score in the web-based analysis of immunogenicity, was the best epitope for inducing a robust antigen-specific IFN-γ-producing CD8+ T cell response as defined by ELIspot assay. This epitope sequence is also highly conserved among currently discovered SARS-CoV-2 variants.

Mice
Female Balb/c mice (6-8 weeks of age) were purchased from Beijing Vital Laboratory Animal Technology Co., Ltd. (Beijing, China) and Shanghai Jiesjie Laboratory Animal Co., Ltd. (Shanghai, China), and were kept in SPF conditions. All animal experiments were approved by the Experimental Animals Committee of SHMC, and all methods were carried out in accordance with relevant guidelines and regulations. This study was carried out in compliance with the ARRIVE guidelines. After testing, all mice were sacrificed by euthanasia with isoflurane treatment.

Peptide pool derived from SARS-CoV-2 spike protein
The spike receptor-binding domain (RBD) peptide pool (SARS-CoV-2 spike protein aa258-518) published previously [15] was used for the study (Table 1); it was pool 2 in our previous study and is renamed pool 1 in this study. Peptide pool 5 of our previous study, which covered the spike S2 region (SARS-CoV-2 spike protein aa1015-1275), is renamed pool 2 in this study. The peptides (Table 1 & sTable 2) were synthesized by Genescript (Nanjing, China).

Immunization
The mice were injected twice, with a two-week interval, via the intramuscular route (i.m.) with 25 μg of pVAX-S-WT, made from the wild-type sequence of the full-length spike protein of SARS-CoV-2 (SARS-CoV-2/WH-09/human/2020/CHN), or with pGX9501 expressing a synthetic, optimized sequence of the SARS-CoV-2 full-length spike glycoprotein [15]. Electroporation was applied with the Cellectro2000™ device. Serum samples and spleens were collected 14 days after the second immunization.

IEDB analysis for SARS-CoV-2 MHC-I epitope identification
An explorative panel of SARS-CoV-2-derived epitopes with the highest predicted affinity to MHC Class I molecules was defined by Immune Epitope Database analysis (www.IEDB.org). The selection was based on internal predictions using NetMHCpan Version EL4.1. All predicted epitopes with a percentile rank of < 2 were selected for a further MHC-I processing analysis using MHC-NP methods. In parallel, those epitopes were analyzed with the MHC-I immunogenicity tool to check whether the peptide sequence was consistent with the allele's site preference. After applying the above three analysis methods, peptides with a percentile rank of < 0.5, a TAP total score of > −1, and an immunogenicity score of > 0 were subjected to an ELIspot assay to evaluate their ability to elicit a T cell IFN-γ response.
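The final selection step above can be illustrated with a minimal sketch. It assumes the three IEDB outputs have already been collected into one record per peptide; the record type, field names, and function names are hypothetical and are not part of any IEDB tool. Only the three cut-offs are taken from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PeptidePrediction:
    """One peptide with its three predicted scores (hypothetical record layout)."""
    name: str
    percentile_rank: float   # MHC-I binding percentile rank (lower = stronger binder)
    tap_total_score: float   # processing / TAP total score (higher = more likely presented)
    immunogenicity: float    # MHC-I immunogenicity score


def passes_thresholds(p: PeptidePrediction,
                      max_rank: float = 0.5,
                      min_tap: float = -1.0,
                      min_immunogenicity: float = 0.0) -> bool:
    """Apply the cut-offs stated in the text: rank < 0.5, TAP total > -1, immunogenicity > 0."""
    return (p.percentile_rank < max_rank
            and p.tap_total_score > min_tap
            and p.immunogenicity > min_immunogenicity)


def select_candidates(predictions: Iterable[PeptidePrediction]) -> List[PeptidePrediction]:
    """Return the peptides that would go forward to the ELIspot assay."""
    return [p for p in predictions if passes_thresholds(p)]


if __name__ == "__main__":
    # Made-up scores for illustration only; these are not the study's values.
    demo = [
        PeptidePrediction("example_A", 0.30, 0.20, 0.10),
        PeptidePrediction("example_B", 0.30, -1.50, 0.10),
        PeptidePrediction("example_C", 1.20, 0.40, 0.25),
    ]
    for p in select_candidates(demo):
        print(p.name)
```

All comparisons are strict, so a peptide lying exactly on a cut-off is excluded; this follows the "<" and ">" signs used in the text.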

Cytotoxic lymphocyte (CTL) killing ability
A single-cell suspension of splenocytes from naïve syngeneic mice was diluted to 1.5 × 10^8/ml in RPMI1640 containing 10% FBS and 2% penicillin and streptomycin, and pulsed at 37°C with or without 5 μg/ml peptides as described previously [19]. After 4 h, eFluor450 (eBioscience, 65-0842-85) at 5 mM (high concentration) was used to label peptide-pulsed cells at room temperature in the dark. Non-peptide-pulsed cells were labelled with a low concentration of eFluor450 at 0.5 mM. After being rinsed three times with PBS, 4 × 10^6 labelled and peptide-pulsed cells and an equal number of labelled non-peptide-pulsed cells were adoptively transferred by tail vein injection into mice that had previously been immunized. Six hours later, the percentage of labelled cells in spleens was detected with LSRFortessa flow cytometry (BD) and analyzed by FlowJo (TreeStar). The specific cell lysis was calculated with the following formula: Specific cell lysis (%) = (1 − (percentage of cells incubated with peptide / percentage of cells incubated without peptide)) × 100.
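As a worked illustration of the lysis formula above, the following minimal sketch computes specific lysis from the two recovered percentages. The function name and the example percentages are hypothetical; only the formula comes from the text.

```python
def specific_lysis_percent(pct_with_peptide: float, pct_without_peptide: float) -> float:
    """Specific cell lysis (%) = (1 - with_peptide / without_peptide) * 100.

    pct_with_peptide:    percentage of recovered cells that were peptide-pulsed
                         (high dye concentration).
    pct_without_peptide: percentage of recovered cells that were not pulsed
                         (low dye concentration).
    """
    if pct_without_peptide <= 0:
        # The ratio is undefined without a non-pulsed reference population.
        raise ValueError("percentage of non-pulsed cells must be positive")
    return (1.0 - pct_with_peptide / pct_without_peptide) * 100.0


if __name__ == "__main__":
    # Illustrative numbers only: 2% pulsed cells recovered versus 8% non-pulsed cells.
    print(specific_lysis_percent(2.0, 8.0))  # 75.0
```

A value near 0% means that pulsed and non-pulsed targets were recovered equally, and a negative value means that more pulsed than non-pulsed cells were recovered.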

IFN-γ ELIspot
Splenocytes were collected from individual mice into RPMI1640 media supplemented with 10% FBS (R10, Gibco) and penicillin/streptomycin and processed into single-cell suspensions. ELIspot assays were performed using Mouse IFN-γ ELIspot plates (Dakewei Biotech Co., Ltd, 2210006, Shenzhen, China). The ELIspot plates were washed 5 times at RT with 100 μL of PBS per well and then incubated with 200 μL of R10 for 10 min before the cells were plated. Two hundred fifty thousand mouse splenocytes, CD4+, or CD8+ T cells were plated into each well and stimulated for 16 h with 15-mer peptides from the SARS-CoV-2 spike peptide pools that overlapped by nine amino acids, as previously described [15]. Each peptide was at a final concentration of 1 μg in 100 μl R10 per well. The spots were developed based on the manufacturer's instructions. R10 and cell stimulation cocktails (Invitrogen) were used as negative and positive controls, respectively. Spots were scanned and quantified by AID ELIspot READER (AID, Germany). After subtracting the negative control wells, spot-forming units (SFU) per million cells were calculated.

[...] into immunized mice. After 5 h, splenocytes were harvested, and the intensity of eFluor450 peptide-labelled target cells was compared with that of the non-peptide-labelled negative control cells by flow cytometry. pVAX1-s-WT was made from the wild-type sequence of the full-length spike protein of SARS-CoV-2 (SARS-CoV-2/WH-09/human/2020/CHN), which was subcloned into pVAX1. The sequence of the same region was optimized via SynCon technology, synthesized, and cloned into pVAX1 as pGX9501.
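The SFU normalization described in the ELIspot protocol can be sketched as follows. The function name is hypothetical, and clamping negative background-subtracted counts to zero is an assumption made for this sketch, not something stated in the text. The default of 250,000 cells per well is taken from the protocol.

```python
def sfu_per_million(stimulated_spots: float,
                    negative_control_spots: float,
                    cells_per_well: int = 250_000) -> float:
    """Convert a raw well count into spot-forming units (SFU) per million cells.

    The negative-control (R10 only) count is subtracted first; the result is then
    scaled from the number of cells plated per well to one million cells.
    """
    if cells_per_well <= 0:
        raise ValueError("cells_per_well must be positive")
    # Assumption for this sketch: counts below background are reported as zero.
    net_spots = max(stimulated_spots - negative_control_spots, 0.0)
    return net_spots * 1_000_000 / cells_per_well


if __name__ == "__main__":
    # Illustrative counts only: 60 spots with peptide, 10 spots with R10 alone.
    print(sfu_per_million(60, 10))  # 200.0
```

With 250,000 cells per well, the scaling factor is 4, so each net spot corresponds to 4 SFU per million cells.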

Statistical analysis
The statistical analysis methods and sample sizes (n) are specified in the results section or figure legends for all quantitative data. All values are reported as mean ± SEM with the indicated sample size. No samples were excluded from the analysis. All relevant statistical tests were two-sided, and p values less than 0.05 were considered statistically significant. All animal studies were performed with randomized animal selection. Statistics were performed using GraphPad Prism 7 software. In all data, * p < 0.05, ** p < 0.01, *** p < 0.001, and **** p < 0.0001.

Notes:
1. MHC-I binding score was between 0 and 2: < 0.5, strong binder; 0.5-2, weak binder; > 2, non-binder.
2. A high Immunogenicity score indicates that the degree of the peptide conformity to sequence preference was good.
3. The higher the TAP total score, the higher the likelihood that the peptide will be presented after being taken up by DCs.

Results
Strong CD8+ CTL epitope activity is embedded in overlapping peptide pool 1, which covers the NTD and RBD region of the spike protein
When Balb/c mice were immunized twice with the pGX9501 DNA vaccine expressing the spike protein of SARS-CoV-2, the ELIspot assay showed a higher level of IFN-γ expression by splenocytes stimulated in vitro with spike peptide pool 1 (Table 1) than by those stimulated with pool 2 (sTable 2 & Figure 1A). In addition, when an in vivo CTL assay was done with identically immunized animals, peptide pool 1 gave a strong CTL response in vivo (Figure 1B), suggesting that MHC-I epitope(s) were present within pool 1.

Screening and identification of an MHC-I epitope in peptide pool 1
To seek T cell-relevant epitopes, we submitted all 41 peptide sequences from peptide Pool 1 to the Immune Epitope Database analysis (IEDB, http://www.iedb.org/). An evaluation method was established by integrating MHC-I binding prediction, MHC-I immunogenicity, and MHC natural processing (MHC-NP) prediction for three H-2d MHC-I alleles to improve prediction results (Figure 2A & B). Consequently, the peptides (2, 11, 12, and 41) for which the MHC-I binding rank was < 2 and which showed the greatest TAP total score or Immunogenicity score in the various alleles were selected for the IFN-γ ELIspot assay. As shown in Figure 3A, Peptide 2 from Pool 1 was the strongest inducer of IFN-γ secretion among the selected peptides. Because Peptide 2 consists of 15 amino acids, it could stimulate CD8+ T cells via MHC-I and/or CD4+ T cells via MHC-II. To identify which T cell type was stimulated by Peptide 2, purified CD4+ T cells or CD8+ T cells were used (sFigure 1). Peptide 2 stimulated CD8+ T cells but not CD4+ T cells, indicating that it is presented by MHC-I rather than MHC-II (Figure 3B). To further investigate its sequence specificity, we mutated several predicted anchor amino acids of Peptide 2 according to the preferences of the H-2d MHC-I allele [20][21][22][23]. The mutated Peptide 2 was predicted by IEDB to bind MHC-I poorly (sTable 3) and showed a significantly reduced ability to stimulate IFN-γ secretion by CD8+ T cells (Figure 3C). Furthermore, we compared this peptide with the MHC-I peptides reported in a previous study [17] (e.g. S526-533, GPKKSTNL) and found that Peptide 2 was significantly more potent in the induction of IFN-γ-secreting T cells than the previously reported peptides (sFigure 2).

Analysis of Peptide 2 epitope conservation and HLA distribution
We compared the sequence of Peptide 2, YYVGYLQPRTFLLKY (amino acid 264-278), with the sequences in the current SARS-CoV-2 variants-of-concern (VOC) and variants-of-interest (VOI) posted by WHO, including the latest Omicron variant. We observed that this sequence is highly conserved among those variants (Figure 4A) and is located at the end of the NTD of the spike protein, upstream of the RBD (Figure 4B). Hence, this highly conserved epitopic sequence provides a valuable tool for evaluating CD8+ T cell-mediated responses in vaccine evaluation, both in animals and in humans.
Since MHC-I (HLA) expression patterns vary among populations worldwide, a peptide sequence that can be recognized by one population may not be recognized by others. To investigate whether this is the case with Peptide 2, we performed an HLA allele analysis for different regions to assess binding to one or more of the 27 prevalent HLA alleles (Figure 4C & Table 3). The frequency of each HLA allele was calculated with the online analysis tool at http://www.allelefrequencies.net/. We also evaluated the MHC-I binding ability, immunogenicity, and TAP potential of Peptide 2 on different HLA alleles by IEDB (Figure 4D & Table 4). An MHC-I binding score of < 0.2 together with an immunogenicity score of > 0 was used as the basis for the determination. The results indicated that Peptide 2 could be recognized by the HLA-A*02:01 allele (mostly in Europe and America), HLA-B*08:01 allele (in Europe and Australia), HLA-A*23:01 allele (in North Africa and Sub-Saharan Africa), HLA-A*02:03 allele (in Southeast Asia), HLA-A*24:02 allele (in Oceania), HLA-A*02:06 allele (in North America, North-East Asia, and Oceania), HLA-A*33:01 allele (in China and Pakistan), HLA-B*35:01 allele (in Oceania), and HLA-A*03:01 allele (in Europe). These findings suggest that Peptide 2 could be well recognized by the most frequent HLA alleles of the worldwide population and can therefore be considered promiscuous.
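The stated position of Peptide 2 (amino acids 264-278) can be cross-checked against the pool layout with a short sketch. It assumes that the Pool 1 peptides are numbered consecutively from aa258 and are 15-mers overlapping by nine residues, as described in the methods; the function name and this numbering scheme are assumptions of the sketch, not statements from the text.

```python
from typing import Tuple


def peptide_span(index: int,
                 pool_start: int = 258,
                 length: int = 15,
                 overlap: int = 9) -> Tuple[int, int]:
    """Return the (first, last) residue of the index-th peptide (1-based) in a pool.

    Consecutive peptides are offset by (length - overlap) residues.
    """
    if index < 1:
        raise ValueError("peptide index is 1-based")
    step = length - overlap              # 6 residues for 15-mers overlapping by 9
    first = pool_start + (index - 1) * step
    last = first + length - 1
    return first, last


if __name__ == "__main__":
    print(peptide_span(2))    # (264, 278)
    print(peptide_span(41))   # (498, 512)
```

Under these assumptions, the second peptide spans residues 264-278, matching the position given for Peptide 2, and the 41st peptide ends at residue 512, matching the Pool 1 range S258-512 given in the Discussion.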

Discussion
In this study, we have defined and characterized a potential CTL epitope of the spike protein conserved among all the SARS-CoV-2 variants and validated its capacity to elicit IFN-γ and CTL responses of CD8+ T cells in the Balb/c mouse model. Furthermore, we found that the epitope, Peptide 2, may be well recognized by HLA alleles in most populations worldwide.
In recent studies, CD8+ T cell immunity was found to make significant contributions to the protective efficacy of SARS-CoV-2 vaccines [24][25][26]. Additionally, lymphopenia was more pronounced in symptomatic COVID-19 patients with pneumonia than in those without pneumonia, consistent with T cell immunity playing a protective role in pre-existing immunity against SARS-CoV-2 [26][27][28]. However, the role of T cell immunity in the pathology of COVID-19 has not been fully clarified and needs further investigation of T cell epitopes and their functions. Our work has provided a tool to monitor virus-specific CD8+ T cells and assess the contribution of CTLs to the control and elimination of the virus. Because Pools 2 and 5 of our previous study showed strong T cell stimulation properties [15], we examined those two peptide pools, renamed here as Pool 1 (S258-512) and Pool 2 (S1015-1275), to determine their differences in stimulating IFN-γ expression. Other epitopes, such as the MHC-I epitope S526-533 (GPKKSTNL) reported in another study [17], were not further investigated since we found that they were much weaker epitopes than Peptide 2 in inducing IFN-γ-expressing T cells. A possible explanation for GPKKSTNL is that the sequence showed good MHC-I binding only in the H-2Dd allele, whereas Peptide 2 showed an excellent binding score in all three alleles. The SARS-CoV-2 virus was found to mutate rapidly. Accordingly, the development of vaccines protecting people from different virus variants is urgently needed. The neutralizing antibodies induced by vaccines were found to have variable efficacies against the different SARS-CoV-2 variants, and efficacies declined over time, whereas the protection represented by CD8+ T cell immunity remained unchanged [24]. Peptide 2 is a highly conserved epitope among all variants and is predicted to be well presented by MHC-I molecules of the most prevalent HLA alleles across the globe. Thus, the conserved Peptide 2 should be suitable for evaluating the T cell response to COVID-19 vaccines, particularly the CD8+ T cell-mediated functions. It is also possible to include such an epitope in new COVID-19 vaccines to induce a robust cellular response against all variants of SARS-CoV-2. A recent study supports the view that Peptide 2 probably has a strong cell-mediated immunological function in humans: a 9-mer peptide (YLQPRTFLL) contained within Peptide 2 could induce a high level of IFN-γ expression from PBMCs of patients who had recovered from COVID-19 and carried the HLA-A*02:01 allele [29]. The 9-mer peptide showed relatively good MHC-I binding ability only in the H-2Ld allele (sTable 1), and it stimulated a weaker IFN-γ T cell response than Peptide 2 in mice (sFigure 2).
In conclusion, our study utilized web-based tools to predict human MHC-I epitopes and found several sequences falling into this category. Among these, Peptide 2 (YYVGYLQPRTFLLKY) did not receive the highest TAP total score in the prediction, but overall it stimulated a more robust antigen-specific IFN-γ-expressing CD8+ T cell response than the other predicted epitopic sequences. This epitope sequence is located at the end of the NTD of the spike protein, is highly conserved among the currently known SARS-CoV-2 variants, and is recognizable by the diverse HLA alleles prominent in most world populations. This critical MHC-I epitope can be used to assess CMI induced by COVID-19 vaccines and may be strategically incorporated into vaccine designs to enhance the prospect of viral elimination by vaccination.