Design of a multi-epitope-based vaccine targeting M-protein of SARS-CoV2: an immunoinformatics approach

Abstract In the present study, one of the targets present on the envelopes of coronaviruses, membrane glycoprotein (M) was chosen for the design of a multi-epitope vaccine by Immunoinformatics approach. The B-cell and T-cell epitopes used for the construction of vaccine were antigenic, nonallergic and nontoxic. An adjuvant, β-defensin and PADRE sequence were included at the N-terminal end of the vaccine. All the epitopes were joined by linkers for decreasing the junctional immunogenicity. Various physicochemical parameters of the vaccine were evaluated. Secondary and tertiary structures were predicted for the vaccine construct. The tertiary structure was further refined, and various parameters related to the refinement of the protein structure were validated by using different tools. Humoral immunity induced by B-cells relies upon the identification of antigenic determinants on the surface of the vaccine construct. In this regard, the vaccine construct was found to consist of several B-cell epitopes in its three-dimensional conformation. Molecular docking of the vaccine was carried out with TLR-3 receptor to study their binding and its strength. Further, protein–protein interactions in the docked complex were visualized using LigPlot+. Population coverage analysis had shown that the multi-epitope vaccine covers 94.06% of the global population. The vaccine construct was successfully cloned in silico into pET-28a (+). Immune simulation studies showed the induction of primary, secondary and tertiary immune responses marked by the increased levels of antibodies, INF-γ, IL-2, TGF-β, B- cells, CD4+ and CD8+ cells. Finally, the vaccine construct was able to elicit immune response as desired. Communicated by Ramaswamy H. Sarma

SARS-CoV2 belongs to the genus -Betacoronavirus, Subfamily -Orthocoronavirinae, Family -Coronaviridae, Suborder -Cornidovirineae and Order -Nidovirales. MERS-CoV and SARS-CoV also belongs to the same genus Betacoronavirus (Gorbalenya et al., 2020). The genome of the SARS-CoV2 is a positive sense single stranded RNA that acts as mRNA immediately after its release into the host cell. The genome of coronavirus is approximately 20 kb-30 kb. Two-third of the genome of Coronavirus towards 5 0 -end is made up of two overlapping open reading frames, ORF1a and ORF1b that codes for the synthesis of several nonstructural proteins. ORF1a is translated and results in the synthesis of ORF1a polyprotein. However, due to ribosomal frame shifting, the ribosome continues to translate the ORF1b resulting in the synthesis of ORF1ab polyprotein. Cleavage of the mature peptide of ORF1ab gives rise to the following peptides -nsp2 to nsp4, nsp6 to nsp11, 3C-like proteinase, RNA-dependent RNA polymerase, helicase, 3 0 -to 5 0 -exonuclease, endoRNAse and 2 0 -O-ribose methyltransferase. Remaining one-third of the genome of Coronavirus towards 3 0 -end is made up of structural proteins consisting of Surface Glycoprotein (S), Envelope (E), Membrane Glycoprotein (M) and Nucleocapsid phosphoprotein (N). Interspersed between these four structural genes are ORF3a, ORF6, ORF7a-ORF7b, ORF8 and ORF10 genes (NCBI Accession MT434760) (Graham et al., 2008;Joshi et al., 2020;Kim et al., 2020;Prasad & Prasad, 2020).
Membrane glycoprotein is the most abundant protein in the Coronaviruses (Rottier, 1995). As the name indicates, the 'M' protein is embedded in the viral lipid bilayer. The N-terminal end is exposed outside and the C-terminal end is present inside the virus with three transmembrane helices embedded in the lipid bilayer (Bianchi et al., 2020;Kumar et al., 2020). It is an important glycoprotein that alone plays an important role in intracellular budding (Kumar et al., 2020;Rottier, 1995;Tang et al., 2009). It is different from the remaining glycoproteins of Coronavirus in terms of its structure and function (Rottier, 1995). It is flanked on its 5 0 -end by gene 'E' coding for Envelope protein and towards 3 0end by gene ORF6 (NCBI Accession MT434760).
Vaccines provide protection from infectious diseases (Thomas & Luxon, 2013). For the successful development of a vaccine candidate till its release into the market takes considerable amount of time as it needs to qualify in different phases in the clinical trials along with the huge investment involved. However, at times, the rate of success of the vaccine is also bleak. In such a scenario, Immunoinformatics approach is helpful in reducing the time taken for the development of vaccines compared with the conventional approaches and there are examples of the successful development of vaccine candidates by adopting in silico approaches (Ada et al., 2018;Bhatnager et al., 2020;He et al., 2010;Kim et al., 2019;Oli et al., 2020;Patronov & Doytchinova, 2013;Tordello et al., 2017 and the appropriate references cited therein). The whole genome sequence of the SARS-CoV2 is made available in the public domain by next generation sequencing. This enabled the design of several multi-epitope in silico vaccine candidates targeting different structural and nonstructural proteins of the SARS-CoV2 (Bhatnager et al., 2020;Bhattacharya et al., 2020;Devi & Chaitanya, 2020;Dong et al., 2020;Enayatkhani et al., 2020;Kar et al., 2020;Lizbeth et al., 2020;Peele et al., 2020;Samad et al., 2020). In the present study, we aimed to design a novel multi-epitope vaccine targeting membrane glycoprotein by immunoinformatics approach.

Methods
A flowchart summarizing the steps involved in the design of the multi-epitope vaccine in the present study is shown in Figure 1.

Retrieval of nucleotide sequences from NCBI GenBank
Amino acid sequences of the Membrane glycoprotein corresponding to different geographical regions with the following accession numbers viz. MT434760, MT259226, MT066156, MT292577 and MT339041 were retrieved from NCBI GenBank. Multiple sequence alignment of the retrieved sequences was carried out online using Clustal Omega available at https:// www.ebi.ac.uk/Tools/msa/clustalo/ to find out the extent of similarity among the sequences. It was observed that the extent of similarity among all the accessions is 100%. Subsequently, 222 amino acid sequences were used for the design of multi-epitope vaccine. Beta-Defensin 3 amino acid sequence was retrieved from UniProt database (Q5U7J2). were used in the present study for predicting the affinities of different peptides derived from the membrane glycoprotein of SARS-CoV2 for HLA -I & II. These alleles are considered to occur more frequently than others. (https://help.iedb.org/hc/en-us/ articles/114094151851 accessed on 05 July 2020 and the appropriate references cited therein).

Prediction of HLA-I restricted (CTL) epitopes
MHC class-I Epitopes were identified in the present study using NetCTLpan À 1.1 (https://services.healthtech.dtu.dk/ service.php?NetCTLpan-1.1) (Stranzl et al., 2010). It is considered to outperform any other CTL epitope predictor tools. Apart from this, it is more efficient in predicting new HLA-I/CTL epitopes compared to NetMHCpan and NetCTL tools (Stranzl et al., 2010). The output of the programme gives prediction scores for MHC-I binding affinity, TAP transport efficiency, Proteasomal C terminal cleavage, Ligand combined score and Percentage rank. Threshold for epitope identification was set at its default value of 1.0. The peptides whose %Rank was less than 1.0 were considered qualified for further analyses.

Prediction of HLA-II restricted epitopes
NetMHCIIpan-4.0 (https://services.healthtech.dtu.dk/service. php?NetMHCIIpan-4.0) was used in the present study to predict the peptides that bind to a HLA-II molecule using Artificial Neural Networks. NetMHCIIpan is superior in terms of prediction of ligand compared to any of the existing T H cell epitope prediction tools (Reynisson et al., 2020). The output of the programme gives information on the peptide sequence, core peptide region, eluted ligand prediction score, Percentile rank of eluted ligand prediction score and Bind level (binding affinity of the peptide). Threshold for strong binding peptides was set to default value. Only strong binding peptides, whose %Rank was less than the threshold value of 2.0, were included in the present study.

Population coverage analysis
HLA genotype frequencies vary across different populations in the world. The nature of HLA polymorphism has its influence on the binding of a given peptide derived from the multi-epitope vaccine to HLA-I/II molecules (Bui et al., 2006). Therefore, it is important to know the extent of coverage of world population by the multi-epitope vaccine constructed in the present study. To determine this, IEDB Analysis Resource available at http://tools.iedb.org/population/ (Bui et al., 2006) was used. A comprehensive account on the fundamental principles of the tool is available at http://tools.iedb.org/population/help/for further reading. In the present study, population (World) coverage analysis was carried for HLA class I and class II combined. A total of eight T-cell epitopes as well as their corresponding HLA alleles were given as input.

Prediction of B-cell epitopes
Linear B-cell epitopes were predicted using BepiPred-2.0 (https://services.healthtech.dtu.dk/service.php?BepiPred-2.0) at the default threshold of 0.5 (Jespersen et al., 2017). Only peptides which are of a minimum of 10 amino acids long are considered for further analyses.

Prediction of antigenicity, allergenicity and toxicity of HLA -I, II & B-cell epitopes
Initially each of the peptide was analyzed for important parameters viz. antigenicity, allergenicity and toxicity. Antigenicity of the peptides was determined using VaxiJen v2.0 (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen. html). Allergenicity of the peptides was determined using AllerTOP v2.0 (https://www.ddg-pharmfac.net/AllerTOP/) (Dimitrov et al., 2014) and Toxicity of the peptides was determined using ToxinPred (https://webs.iiitd.edu.in/raghava/toxinpred/design.php) (Gupta et al., 2013). Determination of all the above three properties for the individual peptides as well as for the entire vaccine construct was carried out at default settings (unless stated otherwise). However, for B-cell epitopes, the threshold for VaxiJen was set to 0.4 to accommodate for a greater number of B-cell epitopes in the final vaccine construct.

Construction of multi-epitope vaccine
For the construction of a multi-epitope vaccine, b -defensin was used as an adjuvant. PADRE sequence was also included in the final vaccine construct. The following four linkers viz. EAAAK, GPGPG, AAY and KK were used for joining the epitopes. EAAAK was used for joining adjuvant and PADRE sequence. CTL, HTL and B-cell epitopes were joined by AAY, GPGPG and KK, respectively. A His6 tag was added to the c-terminal end of the vaccine construct to aid in purification.

Interferon-c inducing epitope prediction
Interferon-c (IFN-c) inducing ability of the vaccine was determined using IFNepitope (https://webs.iiitd.edu.in/raghava/ ifnepitope/scan.php) (Dhanda et al., 2013). The online server generates overlapping peptides (15 amino acids long) and predicts the IFN-c inducing ability of each of the peptide generated. 'Motif and SVM hybrid' method of prediction was selected in the 'Scan' module.

Evaluation of physicochemical properties of the vaccine
ProtParam (https://web.expasy.org/protparam/) Gasteiger et al. (2005) was used to predict the physicochemical properties of the vaccine viz. molecular weight, theoretical pI, amino acid composition, atomic composition, extinction coefficient, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (GRAVY).

Prediction of discontinuous B-cell epitopes in the final vaccine construct
Linear and discontinuous B-cell epitopes in the final vaccine construct were predicted using BepiPred À 2.0 (https://services.healthtech.dtu.dk/service.php?BepiPred-2.0) and DiscoTope v2.0 (http://www.cbs.dtu.dk/services/DiscoTope/ & http://tools. iedb.org/discotope/) (Kringelum et al., 2012), respectively. Default threshold value of 0.5 and À3.7 were used for the prediction of epitopes in BepiPred and DiscoTope, respectively. For the prediction of discontinuous B-cell epitopes using DiscoTope, three dimensional structure of the vaccine construct in '.PDB' format was given as an input. The output of the programme consists of chain id, residue number, amino acid, contact number, propensity score, DiscoTope score and identified B cell epitope (shown against the residue).

Prediction, refinement and validation of tertiary structure
I-TASSER (Iterative Threading ASSEmbly Refinement) available online at https://zhanglab.ccmb.med.umich.edu/I-TASSER/ was used for the prediction of tertiary structure of the protein (Roy et al., 2010). In the first step, it employs LOMETS, a meta-server threading programme for template-based protein structure prediction. It contains 11 threading programs. Each threading programme generates many templates from PDB that are aligned with the query sequence. However, only one template that has shown the highest z-score from each of the threading programs is selected. Subsequently, 10 threading programs are ranked. Thus, the output from LOMETS contains the top 10 threading programs along with their respective high score template. Using the templates generated from LOMETS, in the next series of steps, I-TASSER generates top five final models. Each model is given a Confidence score (C-score). The quality of each of the predicted models is estimated by means of C-score. A higher Cscore implies higher confidence in the particular model generated. Apart from C-score, I-TASSER computes TM-score and RMSD value for all the five models. I-TASSER chooses the top model out of the five models generated which has a strong correlation between C-score and TM-score. The tertiary structure generated by I-TASSER was submitted to 'GalaxyRefine2' (http://galaxy.seoklab.org/cgi-bin/submit.cgi?type=REFINE2) available at GalaxyWEB server (http:// galaxy.seoklab.org/index.html) for the refinement of protein structure obtained from the top model generated by I-TASSER. GalaxyRefine2 is an advanced version of 'GalaxyRefine', utilizes local operators and global operators as well as local error estimation and homology structure information for the improvement of the accuracy of the structure of the input protein . It generates 10 refined models and provides information related to RMSD, MolProbity, Clash score, Poor rotamers, Rama favored and 'GALAXY energy' for all the ten models in comparison with the user submitted model. The refined model obtained from GalaxyRefine2 was used as a final vaccine candidate for docking and in silico cloning.
The refined tertiary structure obtained from GalaxyRefine2 was further validated using ProSA-web, a protein structure analysis tool available at https://prosa.services.came.sbg.ac.at/ prosa.php (Sippl, 1993;Wiederstein & Sippl, 2007). The overall quality of the protein is indicated by the z-score. The output of the programme is represented in the form of a graph consisting of z-scores (y-axis) of the structures of native proteins deduced from X-ray and NMR spectroscopy against the number of residues (x-axis).
The Ramachandran plot of the protein was obtained using RAMPAGE (accessed at http://mordred.bioc.cam.ac.uk/ $rapper/rampage.php). The output of the programme gives information on the number of residues in favored region, allowed region and in the outlier region. Overall quality factor of the unrefined and refined protein was computed using ERRAT web server available at https://servicesn.mbi.ucla.edu/ ERRAT/ (Colovos & Yeates, 1993).
2.14. Docking of the vaccine with TLR-3 TLRs (Toll Like Receptors) are immune receptors play an important role in eliciting immune responses against infection by bacteria and viruses. Adjuvants have an ability to induce innate and adaptive immune responses upon their interaction with TLRs. In the present study, bdefensin was included in the vaccine construct as an adjuvant for enhancing the efficacy of the vaccine. It acts as an agonist for TLR3. As a result, docking study was carried out between TLR3 (Receptor) and Vaccine (Ligand) to study the binding affinity of the ligand with the receptor (Bhatnager et al., 2020;Pandey et al., 2018).
DIMPLOT under LigPlot þ v2.2 was used for studying the interactions between the residues of Chain-A (TLR-3) and Chain-B (Vaccine construct) in the docked complex.

Codon adaptation and in silico cloning of vaccine candidate
JCat (JAVA Codon Adaptation Tool), an online tool available at http://www.jcat.de/Start.jsp (Grote et al., 2005) was utilized for the Codon-adaptation of the vaccine construct in Escherichia coli (Strain K12). Standard genetic code was used for the conversion of input amino acid sequence of the vaccine construct to the DNA sequence. Apart from this, additional options such as (i) avoid rho-independent transcription terminators, (ii) avoid prokaryotic ribosome binding sites and (iii) avoid cleavage sites of restriction enzymes, were selected for the generation of the optimized DNA sequence corresponding to the input amino acid sequence of the vaccine construct. SnapGeneV R 5.1.4.1 was utilized for the in silico cloning of the DNA sequence (obtained from JCat) into pET-28a (þ) vector. Prior to cloning, the DNA sequence of the vaccine construct was verified for the absence of specific restriction enzyme recognition sites that were to be utilized for cloning into the vector. After confirmation, those restriction enzyme sequences were added on N-terminal and C-terminal regions of the vaccine construct for successful cloning.

Immune-simulation
In the present study, C-IMMSIM an agent-based model available at http://150.146.2.1/C-IMMSIM/index.php?page=1 (accessed on 25 July 2020) was used to simulate the immune  These HLA molecules were randomly chosen from among the HLA alleles, whose interacting peptides were utilized for the construction of multi-epitope vaccine. Three injections of the vaccine (without LPS) were given at four weeks interval with the following time steps (one time step corresponds to 8 h): 1, 84 and 168. Shey et al., 2019). Rest of the parameters viz. Random seed, simulation volume, number of antigens to inject were set to their default values except for 'Simulation Steps' that was set to 1050.

Results and discussion
3.1. Selection of epitopes for vaccine design

Prediction of HLA-I & II epitopes
A total of 37 and 24 peptides which have binding affinity for chosen HLA-I (HLA-A & HLA-B) ( Table 1) and HLA-II alleles (Table 2) were identified by NetCTLpan and NetMHCIIpan, respectively. These peptides were selected from among many peptides predicted by NetCTLpan and NetMHCIIpan on the basis of their percentage rank.

Prediction of linear and discontinuous Bcell epitopes
BepiPred predicted a total of five epitopes. Of these, three epitopes were more than 10 amino acids long. Among the three epitopes, two were found to be nonallergic, nontoxic and antigenic. Hence, these two epitopes, KLGASQRVAGDS and RYRIGNYKLNTDHSSSSDNIA were included in the vaccine construct (Table 3). Unlike BepiPred, which predicts B-cell epitopes that are continuous in a linear peptide sequence, DiscoTope predicts B-cell residues present on the surface of the three-dimensional structure of a protein. This implies that those residues what are distant in a linear peptide sequence, can come close when the protein assumes three-dimensional conformation to form a Bcell epitope (Kringelum et al., 2012). DiscoTope at a threshold of À3.7 identified 25 B-cell epitope residues (residue numbers 172, 182, 186, 188, 192, 195-196, 202-203, 207-211, 213-223) out of 223 total residues (Figure 2).
The final vaccine construct consists of both linear and discontinuous epitopes as determined from the results obtained using BepiPred and DiscoTope, respectively, which can induce humoral immune responses. This was also evident from the results obtained from Immune simulation.

Antigenicity and allergenicity
All the T and B-cell epitopes in the present study were screened for their antigenicity, allergenicity and toxicity. Only those peptides which were antigenic, nonallergic and nontoxic were used in the present study for the design of vaccine construct. The final vaccine construct (223 amino acids) was found to be antigenic in all the three models viz. Tumor (threshold: 0.49), Virus (threshold: 0.4) and Bacteria (threshold: 0.4). It was also found to be nonallergic.

Design of vaccine construct
The final vaccine construct is 223 amino acids long (Table 4). It consists of a total of six HLA-I, two HLA-II and two B-cell epitopes. Apart from this, at the N-terminal end, b-defensin was included in the vaccine construct for increasing the immunogenicity of the multi-epitope vaccine. b-defensin acts as an adjuvant and elicits antiviral responses Lei et al., 2019). b-defensins are anti-microbial peptides involved in eliciting innate immune responses. Apart from this, b-defensin interact with the immune receptors such as TLRs or CCR6 and induces innate and adaptive immune responses (Lei et al., 2019). It is used as an adjuvant in the construction of several vaccines by Immunoinformatics approach (Ali et al., 2017;Bhatnager et al., 2020;Dong et al., 2020;Narula et al., 2018). Apart from b-defensin, PADRE (pan DR epitope) sequence was also included in the vaccine construct to increase the immunogenicity of the vaccine construct as well as to enhance the T-cell help (Alexander et al., 2000;Sarkar et al., 2020;Wu et al., 2010). PADRE sequence acts as a TLR agonist adjuvant. It is a 13 amino acid long synthetic peptide that has an ability to effectively induce CD4þ T cell help. It has got several advantages, e.g. ability to bind to most of the HLA-DR molecules, clinically safe for humans and more potent than other universally available T H cell epitopes (Ghaffari-Nazari et al., 2015). In the present study, it is placed at the N-terminal end between b-defensin and HLA-I epitopes and was connected to these two by EAAAK and AAY linkers, respectively.
In the present study, EAAAK linker was attached at the Nterminal end of the vaccine construct (Mittal et al., 2020;Sarkar et al., 2020) and was also used to join the b-defensin sequence with PADRE sequence (Sarkar et al., 2020). EAAAK is a rigid a-helix forming peptide linker, with intramolecular hydrogen bonding having a closed packed backbone. Rigid linkers possess several advantages when compared to the flexible linkers. EAAAK linkers offer efficient separation of the functional domains by keeping a fixed distance with minimal interference between the epitopes thereby maintaining their individual functional properties (Chen et al., 2013). This helps in the effective separation of domains in a bifunctional fusion protein (Arai et al., 2001).
GPGPG linker was used to join the HLA-II epitopes (Livingston et al., 2002;Sarkar et al., 2020) in the present study. GPGPG linker was designed by Livingston et al. (2002) as a universal spacer . GPGPG linkers were demonstrated to be able to induce T H lymphocyte (HTL) responses which is crucial for a multi-epitope vaccine. Further, GPGPG linker is a valuable tool in breaking the junctional immunogenicity, which leads to the restoration of immunogenicity of the individual epitopes. This was experimentally demonstrated by Livingston et al. (2002) in mice models.
HLA-I epitopes were joined by AAY linker (Bhatnager et al., 2020) in the present study. AAY (Ala-Ala-Tyr) linker is the cleavage site for the proteasomes in mammalian cells. Therefore, epitopes joined using AAY linker gets separated effectively within the cells; thereby reducing the junctional immunogenicity. AAY linker also increases the immunogenicity of the multi-epitope vaccine .
In the present study, KK linker was used to join the B-cell epitopes (Gu et al., 2017;Nain et al., 2020;Sarkar et al., 2020). The Lysine linker is the target for the Cathepsin B, a lysosomal protease involved in processing of the antigenic peptides for their presentation on the cell surface in an MHC-II restricted antigen presentation. It also plays a crucial role in reducing the junctional immunogenicity by avoiding the induction of antibodies for the peptide sequence that individual epitopes can form when they are joined linearly (Yano et al., 2005). KK linkers also increase the immunogenicity . Linkers play an important role in minimizing the junctional immunogenicity and also in preserving the identity of each individual epitope during the processing of the vaccine within the cells thereby ensuring the immunogenicity of each epitope (Bhatnager et al., 2020;Meza et al., 2017;Shey et al., 2019). A multi-epitope vaccine without linkers may result in a new protein with unknown properties or may result in the formation of neoepitopes/junctional epitopes (Livingston et al., 2002;Meza et al., 2017).

Population coverage analysis
Population coverage analysis using the T-cell epitopes used for the construction of the vaccine covers 94.06% of world's population (HLA-I & II combined).

IFN-c inducing epitope prediction
IFN-c is a cytokine that modulates various immune responses. It is mainly secreted by activated T-cells and Natural Killer cells. It plays an important role in promoting macrophage activation, enhancing the antigen presentation, activation of the innate immune system, promoting antiviral and antibacterial immunity, etc. (Tau & Rothman, 1999). IFNepitope generated a total of 215 epitopes. Out of these, 88 epitopes had positive score and 127 had negative score. The IFN-c inducing epitopes constitute around 41% of the total number of peptides. This is consistent with the results obtained from the Immune-simulation studies ( Figure 3) which had showed an increase in the levels of IFN-c at all the three stages of the immune response towards the vaccine.

Evaluation of physicochemical properties of the vaccine
Molecular weight and theoretical pI of the protein was 24.9 kDa and 10.16, respectively. Based on the theoretical pI of the vaccine, it is basic in nature. Extinction coefficient of the protein was found to be 45,840 M À1 cm À1 at 280 nm when all Cys residues are reduced. Total number of negatively charged residues (Asp þ Glu) is nine and the total number of positively charged residues (Arg þ Lys) is 37. Estimated half-life was 1 h in mammalian reticulocytes (in vitro), 30 min in yeast, in vivo and more than 10 h in Escherichia coli, in vivo. The instability index of the protein vaccine was computed to be 36.21; this classifies the vaccine as stable. A protein whose instability index is greater than 40 is considered unstable and vice-versa (https://web.expasy.org/protparam/ protparam-doc.html; accessed on 28 July 2020). Aliphatic index (defined as the relative volume occupied by aliphatic side chains) of the protein is 73.77. This indicates that the vaccine is thermo stable. The GRAVY value for the protein is À0.400. This indicates that the protein is hydrophilic

Prediction of the secondary structure
The secondary structure of the final vaccine construct contained 16% beta strand, 39% alpha helix and 44% coil (Figure 4).

Prediction
, refinement and validation of tertiary structure I-TASSER: PDB ID codes of the top threading templates generated in I-TASSER by LOMETS were as follows: 1kj6, 2z36A, 3r9cA, 1gh6B, 6gk5A, 5id6A (z score <1). I-TASSER reported five models which correspond to five large structure clusters. Model 1 was chosen as the best template out of the five models generated by I-TASSER. It has the C-score of À4.67, Exp.TM-Score: 0.23 ± 0.07 & Exp. RMSA value of 17.2. This model has shown comparatively strong correlation between C-score and TM-score. For the remaining models, a weak correlation was observed between C-score and TM-score.
ProSA-web: The z-value of the protein was À5.77, which fell within the range characteristic for native proteins of similar sizes as can be observed from Figure 6. This is an indication that the predicted tertiary structure probably does not  contain errors. Visual interpretation of z-value obtained for the tertiary structure predicted by I-TASSER (image not shown) as well as for the refined models generated using GalaxyRefine (image not shown) and GalaxyRefine2 ( Figure 6) indicated that Z-values fell within the region normally observed for native proteins of similar size.
RAMPAGE: Analysis of the Ramachandran plot of the refined protein obtained from GalaxyRefine2 showed that 88.7% of residues in favored region, 8.1% of residues in allowed region and 3.2% of residues in outlier region (Figure 7).   When the parameters of Ramachandran plot obtained for the model generated and refined using I-TASSER (60.6% of residues in favored region, 20.4% residues in allowed region and 19% of residues in outlier region) and GalaxyRefine (81% of the residues are in favored region, 13.1% of the residues in the allowed region and 5.9% of the residues are in outlier region), respectively, were evaluated, the model obtained from GalaxyRefine2 is better when compared with the models obtained from GalaxyRefine and I-TASSER. The ERRAT score for the refined protein obtained using GalaxyRefine2 was found to be 70% (data not shown); whereas, for the unrefined (I-TASSER) and refined (GalaxyRefine) models, the ERRAT score was found to be 54% and 46%, respectively (data not shown).

Docking of the vaccine with TLR-3
All the top 10 models obtained as a result of docking the vaccine with TLR-3 in PatchDock were refined further and ranked by FireDock. Solution number 6 was the refined docked complex ranked as number 1 by FireDock (Figure 8), with the following parameters: Global energy, 8.71; Attractive VdW, À47.04; Repulsive VdW, 27.04; ACE, 23.02; and HB, À6.25.
The docked complex showed interactions between the ligand and the receptor. The interacting residues between the  receptor and ligand in the docked complex was visualized using DIMPLOT module in LigPlot þ as shown in Figure 9.

Codon adaptation
The multi-epitope vaccine construct consisted of 223 amino acids and the total number of nucleotides present in the DNA sequence generated by JCat after codon-adaptation consisted of 669. CAI value and GC-content of the improved sequence were 0.96 and 52.31, respectively. The CAI value >0.8 is considered as a good value for high-level expression and the range for optimal GC content is 30-70% (Nezafat et al., 2016). Thus, the CAI value and GC-content obtained are highly satisfactory for the expression of the multi-epitope vaccine in the E coli (strain K12) (Nezafat et al., 2016;Pandey et al., 2018). 3.3.8. In silico cloning of vaccine candidate DNA sequence was cloned into pET-28a (þ) vector in between Pae71-PspXI-XhoI and MluI restriction sites. These restriction sites were not present at both the ends as well as within the DNA sequence of the vaccine construct. As a result, the XhoI and MluI restriction sites were added at the  N-terminal and C-terminal ends, respectively. After this, the insert was treated with these two restriction enzymes in silico in SnapGene and the insert DNA was successfully cloned into the vector (Figure 10). The vector along with the insert was 5079 bp in length.

Immune simulation
Immune simulation results obtained from C-ImmSim showed increased levels of IgM þ IgG, IgM, IgG1 and IgG1 þ IgG2 antibodies in the secondary and tertiary immune responses when compared to primary immune response. IgM and IgM þ IgG antibodies were seen in the primary immune response. However, the peaks depicting the titers of IgM and IgM þ IgG are distinctly separated in the secondary and tertiary responses than in primary response. There is a decrease in the levels of antigen at each stage of immune response with the increase in the levels of antibodies (Figure 11(a)). The decrease in antigen count is also attributed to the increase in the total count of B (Figure 11(b)) and T-lymphocytes. Apart from B-lymphocytes, active (Figure 11(c)) and total T H (helper) (Figure 11(d)) cell populations along with memory T H cell (Figure 11(d)) populations were increased during the secondary and tertiary immune responses. The T H cell count (active and total) in the secondary and tertiary immune responses is more when compared to primary immune response. There is also an increase in the active T C (Cytotoxic) cell population per state (Figure 11(e)).
Immune simulation studies showed an increase in the levels of antibodies, B-lymphocytes, T H (including memory cells) and T C cells owing to the administration of the vaccine designed in the present study. This is evident with the development of primary, secondary and tertiary immune responses' and subsequent clearance of antigen from the system.

Conclusion
In the present study, a vaccine was designed against membrane glycoprotein of SARS-CoV2 by different in silico tools. T-cell and B-cell epitopes were designed using different online servers that are efficient in predicting the epitopes. Only those epitopes which are antigenic, nonallergic and nontoxic were used for the construction of vaccine. Epitopes were joined by different linkers to reduce junctional immunogenicity and also for efficient separation and presentation to T-cell and B-cell receptors. Population coverage analysis (using the epitopes that were used for the construction of the vaccine) had shown that the vaccine can cover 94.06% of the world-wide population. Secondary and tertiary structures were predicted for the protein. Subsequently, the tertiary model generated was further refined and validated. Molecular docking was performed between the vaccine construct and TLR-3 receptor. Various molecular interactions, i.e. hydrogen bonding, hydrophobic interactions, etc. were visualized between the protein and TLR-3 receptors. Condon optimization was carried for cloning and expression of the vaccine in E. coli. CAI value of 0.9 was obtained for codonoptimization; indicating the high-level of protein expression in the E coli host. The vaccine was successfully cloned into pET-28a (þ) vector. The multi-epitope vaccine was efficient in eliciting primary, secondary and tertiary immune responses as evident from Immune simulation studies. The proposed vaccine needs to be further validated for its efficacy by in vivo studies.

Disclosure statement
No potential conflict of interest is reported by the authors.