Complete genome sequencing and assessment of mutation-associated protein dynamics of the first Indian bovine ephemeral fever virus (BEFV) isolate

Abstract Background Bovine ephemeral fever (BEF) is a re-emerging disease caused by bovine ephemeral fever virus (BEFV). Although it poses a huge economic threat to the livestock sector, complete viral genome information from any South Asian country, including India, lacks. Aim Genome characterization of the first Indian BEFV isolate and to evaluate its genetic diversity by characterizing genomic mutations and their associated protein dynamics. Materials and Methods Of the nineteen positive blood samples collected from BEF symptomatic animals during the 2018-19 outbreaks in India, one random sample was used to amplify the entire viral genome by RT-PCR. Utilizing Sanger sequencing and NGS technology, a complete genome was determined. Genome characterization, genetic diversity and phylogenetic analyses were explored by comparing the results with available global isolates. Additionally, unique genomic mutations within the Indian isolate were investigated, followed by in-silico assessment of non-synonymous (NS) mutations impacts on corresponding proteins’ secondary structure, solvent accessibility and dynamics. Results The complete genome of Indian BEFV has 14,903 nucleotides with 33% GC with considerable genetic diversity. Its sequence comparison and phylogenetic analysis revealed a close relatedness to the Middle Eastern lineage. Genome-wide scanning elucidated 30 unique mutations, including 10 NS mutations in the P, L and GNS proteins. The mutational impact evaluation confirmed alterations in protein structure and dynamics, with minimal effect on solvent accessibility. Additionally, alteration in the interatomic interactions was compared against the wild type. Conclusion These findings extend our understanding of the BEFV epidemiological and pathogenic potential, aiding in developing better therapeutic and preventive interventions.


Introduction
Bovine ephemeral fever virus (BEFV) is a widespread and economically significant bovine pathogen associated with bovine ephemeral fever (BEF) disease. The first report on the emergence of BEFV outbreaks dates back to the mid-nineteenth century in Africa, which became endemic to Australia, Africa, the Middle East, and major parts of the Asiatic continent, including East Asia and South-East Asia (Niwa et al. 2015). The outbreaks persist during monsoon onset, coinciding with vectors propagation (culicoid midges and mosquitoes) that spreads the disease (Venter et al. 2003). The clinical manifestations of BEF include polyphasic fever, synovitis, somnolence, muscle stiffness, shifting lameness, reluctance to move, recumbency, and inappetence (St. George b1986). Presently enzootic, the disease is characterized by high morbidity and occasional high case fatality (2-20%). The disease causes enormous economic loss arising from a reduction in milk production, impaired reproduction, traction decline, loss of body weight, and livestock-associated trade restrictions (Hsieh et al. 2005;Tonbak et al. 2013).
The BEFV belongs to the genus Ephemerovirus of the Rhabdoviridae family. It is an enveloped, bulletshaped virion that enwraps an $14.9 kb, non-segmented (-) RNA genome encoding ten transcription units in the order 3 0 -l-N-P-M-G-[G NS -a1-a2-b-c]-L-t-5 0 (McWilliam et al. 1997). The virion contains five structural proteins, namely, nucleoprotein (N, 52kD), phosphoprotein (P, 43 kDa), matrix protein (M, 29 kDa), glycoprotein (G, 81 kDa), large polymerase protein (L, 180 kDa). Additionally, genome codes for a non-structural glycoprotein (G NS , 90 kDa) and few accessory proteins a, b and c (each of <15kDa) (Walker and Klement 2015). Each gene junctions inhabit intergenic regions (IGRs), flanked by regulatory motifs termed transcription initiation (TI), transcription termination/polyadenylation (TTP) signals (Rose 1980; Barr et al. 1997). This arrangement likely regulates the sequential transcription by a "stopstart" process resulting in a 3 0 !5 0 polar gradient of mRNA synthesis (Dietzgen et al. 2017). The phylogenetic analyses based on the glycoprotein (G) gene delineates circulating BEFV into geographically distinct Australian, Middle Eastern, Southeast Asian, and newly emerging South African lineages (Walker and Klement 2015;Omar et al. 2020). Although BEFV is serologically monotypic (Walker and Klement 2015), its diverse distribution and differential pathogenicity indicate its genetic variability. The mutation accumulation probability among all RNA viruses is driven by multiple factors, such as poor proofreading by polymerase protein, replicative repairing, selective pressure, etc. (Sardar et al. 2020;Wang et al. 2020). Notably, mutations could lead to protein structural alterations that affect the viral immunological properties, transmission, and pathogenicity (Wo zniakowski et al. 2009;Sardar et al. 2020). Previous studies have suggested BEFV to possess high substitution rates (Trinidad et al. 2014). However, few recent reports on the generation of novel genogroups with higher viral virulence (Yanase et al. 2020) and limited vaccine effectiveness suggest BEFV evolution is enhancing variant's fitness landscape to help overcome the host immune system (Aziz-Boaron et al. 2014).
The new geographical expansions with enhanced case fatality rate and high production losses alarm BEFV as an evolving arbovirus that warrants closer scrutiny in surveillance, epidemiology, and ecological impact (Lee 2019;Yanase et al. 2020). Despite the multiple decades of episodic reporting on seasonal incursions from many South Asian countries, including India (Nandi and Negi 1999;Walker and Klement 2015), no complete sequence information was available until recently (Pyasi et al. 2020), which hinders progress for its detection and prevention. Field detection solely relies on the observed clinical signs accompanied by haematobiochemical analysis in the lack of any serological diagnostics kits. However, molecular diagnosis affirms the viral suspicion. Thus, the current study aimed at genome sequencing and characterizing the first complete Indian BEFV genome. It further analysed its genetic diversity by investigating genomic mutations and their associated protein dynamics that could provide better insight into the viral pathobiology.

Field sample collection
Blood samples (n ¼ 25) were collected from symptomatic animals during the 2018-19 outbreaks in Madhya Pradesh, India. Bovine aged between 8 months to 8 years of age showed prominent clinical signs in the study. Veterinary professionals collected the blood sample. Haematological and biochemical analysis was performed with/without anticoagulants as required. Subsequently, serum and peripheral blood mononuclear cells (PBMCs) were separated by centrifugation at 3,000 rpm for 15 minutes and kept at À80 C until future use.

RNA extraction and genome sequencing
Viral RNA was extracted from serum utilising TRIzol method (Invitrogen, Waltham, MA, USA) and converted to cDNA using Prime Script cDNA synthesis kit (Takara Bio Inc, Kusatsu, Japan). The viral presence was confirmed by reverse transcriptase PCR (RT-PCR) using published primers (Zheng and Qiu 2012) with previously described PCR conditions (Hsieh et al. 2005). Subsequently, overlapping RT-PCR using eleven primer pairs, designed as per the conserved regions of reference strain (accession no: NC_002526), was performed with optimized annealing between 55 C and 62 C as per the primer set (Table S1). Purified amplicons were directly sequenced using the ABI3100 platform (Applied Biosystems, USA). All sequences were subjected to Blastn search and assembled using BioEdit (Tom et al. 2011).
Alternatively, sequence accuracy was determined by performing Illumina sequencing, a commonly used next-generation sequencing (NGS) technology. Libraries preparation was done using total viral RNA with NEBNext mRNA Library Prep kit (NEB, Ipswich, MA, USA) and sequenced in a MiSeq instrument (150 cycles, paired-end sequencing). The sequenced raw data followed trimming and adapter removal using Trim Galore (Krueger 2015). Further, it reads classification by Kraken-2 utilizing a standard database (Wood et al. 2019). The only confirmed viral reads were snipped against other non-viral reads by BBMap (Bushnell 2014) and assembled into contigs using MEGAHIT v.1.1.3 (Li et al. 2015) to generate a de novo sequence.

Phylogenetic and genetic analysis
Phylogenetic analyses were inferred by the maximum likelihood method employing the Tamura-Nei model of nt substitution model using the MEGA X (1000 bootstrap replicates) (Kumar et al. 2018).
Furthermore, the new genome sequence was aligned with all available BEFV complete genomes archived from GenBank using the ClustalW (Larkin et al. 2007) module of BioEdit to determine terminal sequence conservancy, molecular characterization, and relative similarity evaluation. Eventually, genetic variations among the consensus sequences in addition to the novel nucleotide (nt) and amino acid (aa) substitutions were inspected as synonymous (SN) and nonsynonymous (NS). Lastly, the effect of NS mutations on structural alteration was evaluated.

Mutational assessment
The detected genomic mutations, emphasizing unique NS mutations, were examined to better understand the mutational consequences on protein structure and dynamics. P, L and G NS proteins of Indian BEFV isolate that possessed such mutations were extensively investigated utilizing several computational approach-based tools. We considered the first globally reported Australian isolate as wild type sequence (AF234533) for all comparisons.

Secondary structures prediction
The secondary structure of investigated proteins was predicted by CFSSP: Chou and Fasman secondary structure prediction server (Kumar 2013). The server predicts the probability of occurrence of regions such as a-helix, b-sheet and turns in secondary structure prediction. Eventually, NS mutation-driven relative alterations in structures and solvent accessibility were compared against the reference isolate.

Protein dynamics analysis
Analysing protein structure-based alterations due to mutation determines the energy change value, significant to drive functional alterations. To achieve this, structure elucidation was done relying on the Modeller 9v8 (http://salilab.org/modeller/download_ installation.html) (Sali and Blundell 1993), as no crystal structure of the studied (P, L and G NS ) proteins were available. The FASTA sequences of proteins of Indian and Australian isolates were retrieved from the NCBI database and modelled. Further validation to predict various stereochemical properties was done by PROCHECK (Laskowski et al. 1993) and verify 3 D (Colovos and Yeates 1993). All models were visualized using Chimera tool (Pettersen et al. 2004).
For investigating the mutational consequences on the protein structural conformations, DynaMut server was employed (Rodrigues et al. 2018). The server is based on graph-based signatures in a consensus predictor algorithm and outperforms predicting the effect of mutations on protein flexibility and stability (p-value < 0.001). It works by uploading the modelled proteins to evaluate the impact of mutation in native protein structures based on dynamicity and protein stability resulting from vibrational entropy changes (DDS Vib ENCoM) and differences in the free energy (DDG). It uses a well-established normal mode analysis (NMA) to interpret and visualize the practicality of protein structural dynamics.
Additionally, analysis on altered conformations, evaluating their interatomic interaction due to investigated mutations was done. By utilizing the DynaMut server, results of the interacting molecules demonstrate how often mutation disrupts the interacting bonds, using a pairwise alignment algorithm (Bauer et al. 2019) by comparing against the wild type.

Virus isolation
Virus isolation has also been attempted in Vero and Baby hamster kidney (BHK-21), cell lines, respectively. The cell monolayer has been inoculated by the extracted PBMCs and serum of BEFV positive samples as previously performed (Chen et al. 2010;Zaher and Ahmed 2011). These were eventually passaged every eighth day and observed routinely for visible cytopathic effect (CPE). The supernatant was further investigated by RT-PCR amplification for virus presence.

Clinical and haemato-biochemical evaluations
The outbreaks were episodically epizootic from July-October during 2019. The studied group comprised animals suspected of BEF based on the typical clinical signs and hemato-biochemical data. Most of the affected animals showed significant neutrophilia and lymphocytopenia in the hematological analyses. Similarly, blood biochemical analysis revealed decreased calcium concentrations and elevated creatinine kinase activities compared to normal, attributing to recumbency and lameness in these animals. Details of the hemato-biochemical analyses are depicted in Table S2. Although the infected animals almost exhibited full recovery within 3-10 days, they suffered significant milk losses as evidenced by lactation cessation in 22 out of 25 diseased animals extending up to 8-10 months.

Virus confirmation and sequencing
The PCR analysis showed expected-sized amplicons, confirming BEFV suspicion in 19 samples. Of these, a randomly picked sample was further utilized for complete genome sequencing. The PCR purified amplicons were sequenced and assembled, generating a near-complete genome of IND/IDR/BEFV/2019. Analogously, Illumina sequencing yielded similar outcomes with a de novo assembled 14,875 nt long contigs. Blastn search returned $95% identity to other published BEFV sequences. However, the sequence lacked a stretch of nt (15 at 5 0 and 13 at 3 0 ) as determined by consensus sequences of aligned genomes. Importantly, a perfect conservancy was displayed at 5'end, while 3 0 revealed an average of 96%, with a maximum 99% to Israeli isolate (with the insertion of two nt in the missing region) and hence adopted to the Indian isolate (Identity matrix as shown in Table  S3). Altogether, the first complete genome of IND/ IDR/BEFV/2019 revealed a length of 14,903 nt with 33.37% GC content and deposited in GenBank under accession number MN905763.

Sequence analysis
The aligned consensus sequences revealed a conserved genome pattern with each open reading frame (ORF) flanked by TI and TTP sequences and were separated by intergenic regions (Figure 1). A partial complementarity at 3 0 leader (l) of 52 nt (1-52) and 5 0 trailer (t) of 72 nt (14832-14903 nt) ends were observed, a characteristic property of all Rhabdoviruses (Dhillon et al. 2000). While few variations in the intergenic regions with nt insertions at G NS /a1 and b/c junctions in south-eastern isolates were observed, their significant difference in genome regulations requires experimental exploration. The details of each region are summarised in Table 1 and their sequences are depicted in Table S3.
The Indian sequence revealed the highest ($96.53%) identity to isolates of the Middle East, followed by East Asia ($91%) and lastly, Australia (89.84%). Further comparison of each terminus and the concatenated non-coding regions also followed a similar pattern. However, aa similarity ranged from 85.65% to 92.93%. All comparisons are represented in Table 2.

Phylogenetic analysis and genetic diversity
Phylogenetic analysis clustered IND/IDR/BEFV/2019 with the Israeli and Turkish isolates of Middle Eastern lineage (Figure 2). The pairwise genetic distance of all corresponding sequences, estimated to be 0.03% to 0.10%, revealed high-level identity suggesting the constant evolutionary rate of BEFV. Additionally, the genetic distance had no clear correlation with geographical distance.
Next, we comprehensively analysed the genomic mutational accumulation specific to the Indian isolate, in context to unique nt and aa variations, revealing 76 substitutions (34 in non-coding and 42 in coding regions) and 30 substitutions (20 in structural proteins and 10 in non-structural proteins), respectively (Table 3). Except for N, M, b, and c proteins with 100% identity, SN mutations were prominent in P, G, G NS , a1, a2, and L proteins, with few NS mutations observed in P, L and G NS proteins as well (Table 3). Interestingly, the third nt substitution of codon seemed most frequent for identified NS mutations. The results indicated the L, P, and G NS proteins to show the highest variation with 10, 9, and 8 mutations, respectively, that specifically included 4, 3, and 3 as NS mutations.

Assessment of genomic NS mutations on protein dynamics
The NS mutations stand unique to the particular isolate that profoundly impacts the overall structural properties and biological function (Lewin et al. 2018). Our study utilized the in-silico approach to analyse their impact on various parameters affecting protein secondary and tertiary structures. Additionally, the overall impact due to these NS mutations were evaluated to seek their biological impact.

Secondary structure and its mutational impacts
The impact of these mutations on the secondary structure of mutant proteins was investigated. Our findings demarcated the changes observed in the mutant protein. It shows that except for two mutations, including S152L and K173N in P protein, all the remaining eight sites, E45G, S115P, in P; K1635E, K1937N, E2072R in L and G490E, E515K, K520I in G NS proteins induced changes in secondary structure. The detailed analysis revealed that mutants attained considerable alterations specifically at mutation sites  Pairwise comparison of the complete genome of Indian BEFV isolate (Acc.no. MN905763) that was compared against all other complete sequences of BEFV is denoted. Light grey shade denotes for the coding regions whereas, darker shade for non-coding regions. Ã The percent nucleotide identity is represented by the number without the bracket and the amino acid identity with the number within it in all genes except for non-coding region. ÃÃ It is the assembled concatenated non coding region of the BEFV complete sequence. Above numbers represents the values obtained after performing pairwise sequence alignment that calculated the distances of aligned sequences as done by Bioedit. (E45G, S115P in P and E2072R in L) or in the neighbouring residues (remaining all other mutants), indicating alterations in the local protein conformations in the secondary structure as compared to the reference isolate (Figure 3). The significant alterations in P protein were depicted as few charged (E) and polar residues (S) were substituted by uncharged, non-polar, and hydrophobic residues (G NS , P, and L), though with little altered solvent accessibility. Likewise, in L protein, mutant replaced few positively charged residue with a negative or uncharged residue (K!E/N), with nonsignificant changes observed. In contrast, the substitution of negative to positively charged residue (E!R) may lead to the breakage of hydrogen bonds between the bases, thus altering their structures. All residues showed their solvent accessibility conversion from exposed to slightly buried (K!E/N) and converse (E!R). Similarly, the G NS protein displayed transitions only in the neighbouring residues resulting in altering the secondary structure. For solvent accessibility, a transformation from strongly exposed charged lysine to most buried hydrophobic isoleucine and negatively charged slightly exposed (K!I/E) residues was observed.
3.5.2. Stability dynamics comparison of tertiary structure By analyzing the 3 D protein models in DynaMut server, we correlated the interpretation of secondary structure alterations to tertiary structure protein dynamics. All the built models were refined and validated, assuring their high quality for dynamics analysis. Our data inferred, most of the identified NS mutations resulted in destabilization with negative DDG value and enhanced molecular flexibility with a positive DDS Vib ENCoM value. Except for K520I and E514K mutants of G NS protein with significantly higher positive DDG (each as, À0.282 and À0.504 kcal/mol) and negative DDS Vib ENCoM values (each as, 1.003 and 0.856 kcal.mol-1. K-1) demarcating enhanced stability and rigidity, respectively (Table 4) (Figure 4). The DDS Vib ENCoM value represents an average of the protein configurational entropies within a single minimum energy landscape (Goethe et al. 2015). Whereas DDG is the difference in DG of native and mutant type (Eriksson et al. 1992). Generally, DDG positive value induces stabilization, and DDS Vib ENCoM with negative value results in a decrease in flexibility. Additionally, other structure-based prediction tools such as mCSM, SDM, and DUET included as a module of the DynaMut server, further supported similar stabilization behavior, thus confirming our findings (Rodrigues et al. 2018).
Furthermore, the significant relative variations in the mutants' stability and flexibility can investigate changes in protein confirmations. The comparison revealed significant alteration in the interatomic interaction among the variant and the wild-type. Major changes were observed in E490G and K514I mutants in G NS protein and S115P in P protein, while the rest showed slight alterations. The findings may Table 3. Represents the number and type of unique variations/substitutions in the nucleotide and amino acid of all the genes with respect to their position. Additionally, Synonymous mutation (SN) and Non-Synonymous mutation (NS) that were observed within the Indian BEFV isolate in comparison to globally available sequences are also summarized.  K!Q Ã SN depicts synonymous mutation and NS shows non-synonymous mutation. All amino acid is denoted by its standard symbol used universally.
contribute to explore the mutational influence on interatomic interactions of the proteins with their surrounding environment. All the mutated residues influenced the side chains altering the bond types, including H-bonds, halogen bonds, hydrophobic bonds, etc. That determines the interactions among residues existing in the surrounding area. The difference in structures representing variations is depicted in Figure 5.
Subsequently, utilising the PROVEAN tool, we predicted their corresponding biological impacts as determined by the pairwise sequence alignment scores. Two mutations in P protein (S152L, K173N) showed deleterious effect, while the rest had a neutral effect (Table 5).

Virus isolation
The cell monolayer evidenced profound CPE from the fourth to fifth passage after 48 hours post-inoculation. However, PCR confirmation produced unclear results. Thus, an assumption of an unknown virus that co-exists and co-circulates predominates its adaptability in cell culture compared to BEFV. This inference requires future experimental research. Hence, virus isolation proved unsuccessful.

Discussion
The enhanced endemicity, combined with high morbidity and increasing case-fatality associated with Figure 3. Representation of the effects of non-synonymous (NS) mutations on protein secondary structure. Depicts the prediction of the secondary structure due to change in mutation comparing Indian BEFV isolate to the Australian isolate (reference sequence) in P, L and G NS proteins. The rectangular box shows the mutant residue against the wild type. The position of the dashed box in the corresponding panels highlights the difference in secondary structure between Indian and Australian isolates.
BEFV infection, seriously attributes to enormous economic losses to the global dairy sector. As the world's leading dairy producer, India is at immense risk to BEFV. This necessitates a thorough understanding of BEFV isolates and genome to assess the pathogenicity. This study performed RT-PCR and NGS analysis of prevalent BEFV isolate, accompanied by a mutation-linked assessment of protein structures, crucial in understanding the viral pathogenesis, transmission, evolution, and antiviral therapy development.
The nearly complete genome obtained through molecular approach was completed by utilizing the Table 4. Representing the analysis based on the differences in DDS Vib ENCoM and DDG value due to mutation affecting the stability and flexibility of the mutated protein of Indian BEFFV isolate (MN905763) as compared to wild type reference sequence (AF234533).  terminal sequence conservancy pattern of global BEFV sequences, which determined the IND/IDR/ BEFV/2019 isolate, the first Indian BEFV genome sequence. The 14,903 nt long full-genome was comparable to the Australian and Middle eastern isolates but recorded $ 40 nt shorter than East Asian isolates. All eight available complete genomes (including ours) maintain a conserved genome architecture pattern, length similarities, and deduced proteins suggesting essentiality for the propagation and dispersion of BEFV within the host populations. This isolate has an expected conserved TI and TTP sequences pattern, vital to recruiting RNA-dependent RNA polymerase (RdRp) for independent transcription (Zeng et al. 1998). Moreover, the IGRs separating each gene are crucial determinants in transcription initiation and termination events (Das and Pattnaik 2004;Leyrat et al. 2011). The current isolate showed a certain variation in nucleotide length in IGRs at G NS /a1 and b/c junctions compared to the South-Eastern isolates, which might influence viral gene expression and replication, which needs further exploration.
Intriguingly, sequence and phylogenetic analysis revealed a close relatedness to Israeli and Turkish isolates, implying a common origin. The branching patterns coincide with the previous P and G genes based phylogenetic analysis, projecting its relative high pathogenicity (Yeruham et al. 2010;Abayli et al. 2017;Pyasi et al. 2020). Interestingly, no significant correlation between genotypes and geographical distance was observed. Despite the geographical closeness to Eastern and South-Eastern countries, the Indian isolate revealed the highest similarity to Middle Eastern isolates. As this region shares a significant portion of the livestock trade with India, it indicates a probable reason for the close relatedness of this cattle pathogen.
As geographical traversion of viruses may prompt certain adaptive responses in the form of mutations generating new variants that are more potent and fit (Pfeiffer and Kirkegaard 2005). Henceforth, we comprehensively scanned the IND/IDR/BEFV/2019 genome for unique mutations. We observed a prevalence of unique nt substitutions, with high codon degeneration, majorly contributing to silent or SN mutations (n ¼ 20) and a few NS mutations (n ¼ 10). The sequence conservation pattern followed M > G > N > L > P for structural proteins and c > b>a2 > a1 > G NS for the non-structural proteins. Our results correlate well to other arboviruses (Domingo and Holland 1997;Pomeroy et al. 2008) that sustain constant viral evolution despite high substitution rates, thus similarly BEFV retained a single serotype characteristic worldwide (Dietzgen et al. 2017).
The highest 10 mutations were observed in L, followed by 9 in P, and 8 in G NS proteins that included  1.090 Neutral few NS mutations. The L protein is the sole enzymatically active protein and is essential for genome transcription and replication. Similarly, P protein showed the highest divergence degree (relative to its size), supporting its significant role in genome evolution (Tordo et al. 1988;Leyrat et al. 2011). Since the rhabdovirus N-P-L complex regulates viral biological processes in collaboration (Das and Pattnaik 2004), experiencing mutations in any of these may presumably interfere with overall virus biology. Likewise, mutations in the G NS protein with understudied functions indicate an uncertain outcome. We also examined unique NS mutations in the context of alteration to protein confirmations, contributing to the emergence of new genotypes. For instance, China, Taiwan and Turkey isolates (KY012742) demonstrated an elevated case fatality rate due to such mutations (Zheng and Qiu 2012;Abayli et al. 2017;Yanase et al. 2020). In total, 10 unique NS mutations comprising 4 in P, 3 in L, and 3 in G NS regions were structurally explored to infer any functional consequences (Cao et al. 2020) (Das and Pattnaik 2004). Studies suggest L protein to be highly conserved (Poch et al. 1989), while P protein shows the highest divergence degree (Leyrat et al. 2011). The computational analysis predicted two mutations in P (S152L and K173N) to be considerably deleterious and impact L protein's polymerase activities by interfering with L-P, protein-protein interaction functions (Cao et al. 2020) (Table 4). However, further in vitro analysis is needed to corroborate this observation. Similarly, mutations in the methyltransferase region of L could alter mRNA capping function.
Interestingly, our data on NS mutations in the corresponding protein revealed alteration to the structural stability. Most mutants' secondary structure analysis demonstrated the transition from charged to uncharged residues and hydrophilic to hydrophobic residue substitutions. Also, solvent accessibility decreased as substituted residues were buried deep in the protein structures compared to the earlier exposed ones. Eventually, we were able to link the aforesaid data to extrapolating the transition in the extremely vibrant protein tertiary structures by comparing it to the native type. The data revealed that most of the investigated NS mutations presumably resulted in the loss of rigidity and stability as is seen with positive DDS Vib ENCoM and negative DDG values, respectively. In contrast, G NS protein with E514K and K520I mutations are of concern as they stabilize conformations due to local helix backbone distortions (Cornish et al. 1994), leading to gain in rigidity. Further, most mutants showed significant alteration in the interatomic interaction compared to the wild-type. However, our finding has limitations due to the unavailability of any crystal structure and limited knowledge of the functions of three proteins. Nonetheless, the Indian BEFV genome has acquired mutations that could potentially influence the protein structure. However, their influence on protein functions and viral virulency or transmission remains to be elucidated by performing wet-lab experiments.

Conclusion
The study presents the first complete genome sequence of Indian BEFV isolate and confirms its relatedness to Middle Eastern countries. Genome characterization revealed considerable genetic diversity with several unique mutations. Most of these mutations in P, L and G NS proteins were predicted to potentially impact the protein conformation and dynamics. Except for E514K and K520I mutants of G NS protein that achieved stabilized conformations and increased rigidity, all other mutants exhibited increased flexibility and destabilized effect on the structure. Mutations in P protein (S152L and K173N) are predicted to be deleterious for structural stability. As our study focused primarily on in-silico analysis, the inference from this could pave the way for prospective studies centered on experimental validation. Integrated knowledge of all these aspects with a larger sampling size would enhance diagnostics and vaccine development efforts of this understudied pathogen.

Disclosure statement
No potential conflict of interest was reported by the authors.