Whole genome analysis unveils genetic diversity and potential virulence determinants in Vibrio parahaemolyticus associated with disease outbreak among cultured Litopenaeus vannamei (Pacific white shrimp) in India

ABSTRACT Vibrio parahaemolyticus has caused widespread mortality in Indian shrimp aquaculture in recent years. However, there are insufficient genome data for the isolates from Indian shrimp vibriosis to analyze genetic diversity and track the acquisition of genetic features that could be involved in virulence and fitness. In this study, we have performed genome analysis of V. parahaemolyticus isolated from moribund shrimps collected from shrimp farms along coastal Karnataka, India, for better understanding of their diversity and virulence. Five newly sequenced genomes of V. parahaemolyticus along with 40 genomes retrieved from NCBI were subjected to comparative genome analysis. The sequenced genomes had an overall genome size of 5.2 Mb. MLST analysis and core genome phylogenomic analysis revealed considerable genetic diversity among the isolates obtained from the moribund shrimps. Interestingly, none of the V. parahaemolyticus isolates possessed the classical features (PirAB) of the strains associated with Acute Hepatopancreatic Necrosis Disease (AHPND). This study also revealed the presence of multiple virulence attributes, including ZOT, ACE and RTX toxins, secretion systems, and mobile genetic elements. The findings of this study provide insights into the possible transition of an environmental V. parahaemolyticus to emerge as pathogens of aquaculture species by increasing its virulence and host adaptation. Future studies focusing on continuous genomic surveillance of V. parahaemolyticus are required to study the evolution and transmission of new variants in shrimp aquaculture, as well as to design and implement biosecurity programs to prevent disease outbreaks.


Introduction
Vibrio parahaemolyticus is a member of the harveyi clade that occupies various niches of the marine ecosystem [1]. This organism is associated with water, sediment, and various aquatic flora and fauna, such as plankters, invertebrates, fish, and marine mammals [1,2]. It is a well-established foodborne pathogen responsible for inflammatory gastroenteritis in humans following ingestion of contaminated seafood. Human infection is typically correlated with the production of heat-stable toxins such as thermostable direct hemolysin (TDH) and TDH-related hemolysin (TRH) coded by tdh and trh genes, respectively [3]. Outbreaks associated with seafood due to certain genogroups designated as pandemic clones of V. parahaemolyticus have been reported from Asia, North America, South America, and Europe [4,5]. Continued outbreaks from different parts of the world led the FAO/WHO to assess the risk associated with seafood contamination by V. parahaemolyticus, recommend monitoring of distribution of this organism in seafood globally, and prepare guidelines for seafood trade [2].
The shrimp aquaculture industry has expanded from 15% to 18% in the international seafood trade over a decade with an expected compound annual growth rate of 5.7% during 2017 to 2020 ( [6,7] Highlights no. [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17]. However, with the intensification of aquaculture and environmental changes, diseases have become a major problem in the Asia Pacific region and Latin American countries. While most of the shrimp diseases have been due to viral agents, the disease caused by V. parahaemolyticus has received considerable attention from the scientific community. Some strains of this bacteria have been identified as a shrimp pathogen causing Acute Hepatopancreatic Necrosis Disease (AHPND) resulting in mass mortality in farmed shrimps in several parts of the world [4,8]. Unusual disease outbreaks due to AHPND began in China during 2009, rapidly spreading to Vietnam, Malaysia, Thailand, and Mexico [9][10][11]. AHPND affects shrimps in the post-larval or juvenile stage within 30-35 days post stocking and can cause near 100% mortality [12,13]. Studies have revealed that AHPNDcausing V. parahaemolyticus possesses a unique 70kb plasmid (pVA1) encoding a binary toxin, photorhabdus insect-related (Pir) coded by pirA and pirB genes [8]. PirAB toxins induce pore formation in the hepatopancreatic cells, leading to tissue degradation and digestive dysfunction that finally results in cell necrosis [8,13]. The animals affected by AHPND are characterized by severe atrophy of the hepatopancreas with massive sloughing of epithelial cells. However, in recent years, reports of PirAB lacking V. parahaemolyticus infections have been reported [13,14]. AHPND has not been detected in India, but reports of V. parahaemolyticus strains lacking pirA and pirB genes have been associated with mortalities in farmed shrimp during 2013 [15].
Whole-genome sequence analysis of V. parahaemolyticus isolated worldwide has shown high genomic diversity with respect to their virulence and targeted hosts [16]. Many of the virulence attributes are acquired by horizontal gene transfer (HGT) through mobile genetic elements [17]. The genome plasticity and genetic diversity of mobile elements associated with V. parahaemolyticus may affect the survival of the pathogen and its infection capabilities. Also, understanding the relationship between epidemiological links among different pathogenic strains isolated worldwide and the genome complexity and diversity of V. parahaemolyticus [18] would be important. Recent studies have shown that environmental isolates of V. parahaemolyticus acquire the virulence attributes that would make them potentially pathogenic for humans and aquatic animals [12,19].
We performed whole-genome sequencing of V. parahaemolyticus isolated from moribund shrimps collected from shrimp farms along coastal Karnataka, India, and compared the sequences with global strains to better understand the virulence attributes and diversity of V. parahaemolyticus. The data would be useful to explore the pathogenesis mechanism of this opportunistic pathogen to cause pathological changes in the affected animals and the molecular factors that allow V. parahaemolyticus to adapt and grow in various niches.

Results and Discussion
The intensification of the shrimp aquaculture industry, and the consequent occurrence of diseases, is a constraint to the progress of the sector. V. parahaemolyticusassociated disease of cultured shrimp is an emerging disease in aquaculture that has seriously impacted food productivity, animal welfare, human health, and, consequently, the country's overall economic development. In late 2013, the first documented example of largescale mortalities among cultured Litopenaeus vannamei (Pacific white shrimp) owing to V. parahaemolyticus were recorded in grow-out farms within 40-50 days of stocking in India, and the disease continues to be a scourge for aquaculture [15,20,21]. Surprisingly, none of the V. parahaemolyticus isolates obtained from the diseased shrimps exhibited the classic characteristics of the strains associated with AHPND. To investigate the genomic characteristics of the disease-causing strains, we performed whole-genome sequencing and comparative genomic analysis on V. parahaemolyticus associated with an outbreak among cultured L. vannamei in coastal Karnataka grow-out ponds.
Severe mortalities among cultured L. vannamei were reported in coastal Karnataka grow-out ponds in early 2017. Infected animals exhibited erratic swimming, decreased feeding, cloudy hepatopancreas, brown to black spots on their exoskeleton, and strings of white feces floating in the water column ( Figure 1). Bacteriological analysis of hemolymph, aseptically drawn using a syringe from 100 moribund shrimps from 10 ponds and plated on thiosulfate citrate bile salt (TCBS) agar yielded greenish-blue colonies. Their biochemical reactions matched the phenotypic characteristics of standard V. parahaemolyticus culture. Eighty isolates of V. parahaemolyticus obtained were genotypically confirmed by the presence of toxR and tlh genes [22,23]. As all analyzed samples were negative for the shrimp viruses tested, they are not discussed further [24]. Four representative samples originating from different ponds were selected based on the extent of shrimp mortality in the ponds, for their ability to cause infection in laboratory trials on shrimps (results not shown) and for the presence of virulence gene markers. All four isolates were negative for the tdh, trh, and pirAB genes, but their T6SS profiles differed with two isolates possessing both T6SSs and the other two negative for T6SS1 and positive for T6SS2. Additionally, an oyster isolate positive for the tdh and trh gene was chosen to study the genetic composition and differences, if any, from V. parahaemolyticus recovered from the cases of shrimp mortality, through comparative analysis.

General genomic features and annotation of the sequenced isolates
The raw reads generated from the five V. parahaemolyticus isolates were evaluated for their quality and were found to be appropriate for downstream genome analysis. de-novo assembly of raw reads using Unicycler v0.4.8 generated high-quality draft genomes with an average coverage of 800X and >89% completeness estimated by QUAST. The V. parahaemolyticus isolates sequenced in this study had an overall genome size of approximately 5.2 Mb (5.09-5.35 Mbp) with an average GC content of 45%, which was similar to the reference strain RIMD2210633 (Table 1) and other V. parahaemolyticus genomes available in the NCBI GenBank [25][26][27][28]. The pairwise ANI percent and dDDH values revealed a high degree of sequence similarity to the reference strain RIMD2210633. The detailed information on pairwise ANI percent and dDDH values of each isolate is given in Table S1. The RAST annotation of the draft genomes revealed the presence of approximately 4,895 CDS in which an average of 3,798 proteins was assigned a functional category and 1,096 identified as hypothetical proteins (Table S2).

MLST-based phylogenetic analysis
The in silico MLST analysis of the five V. parahaemolyticus isolates sequenced from this study demonstrated five different sequence types (STs) and among them, three STs were identified with previously assigned STs in the pubMLST database (https:// pubmlst.org/), HP1 -ST428; SHP/2 -ST363 and 81TDH2 -ST675. The remaining two isolates, NUK/7  and VP32 showed unique MLST allelic profiles, hence was submitted to the PubMLST database to assign allele identifier number and sequence type. These two isolates were assigned with new sequence types, ST2429 for NUK/7 and ST2428 for VP32, by the pubMLST database curator ( Table 2). The previous studies have identified several clonal complexes in the V. parahaemolyticus, including CC3, comprising pandemic strains and six clonal complexes of clinical isolates [29,30]. It is important to note that none of the STs identified in this study belongs to any of the clonal complexes described for the highly pathogenic clones of V. parahaemolyticus [29].
A phylogenetic tree was constructed based on the concatenated sequences of seven housekeeping loci obtained for the five different V. parahaemolyticus strains sequenced in this study along with 40 global strains (retrieved from diverse sources, Table 1), whose MLST loci are available at PubMLST database ( Figure 2). The resultant phylogenetic tree consisted of two major lineages, lineage A and lineage B. Surprisingly, all of the V parahaemolyticus isolates from vibriosis-affected L. vannamei in India were found scattered throughout the phylogenetic tree without restricting to a particular cluster. The results obtained in this study are in agreement with the study of Chonsin et al. [31], wherein they demonstrated that V. parahaemolyticus affecting L. vannamei were genetically diverse and not derived from a single genetic lineage. However, most of the clinical isolates included in this study showed a unique branching pattern (Clade B.C2). The overall observation from the phylogenetic analysis shows that the genome of V. parahaemolyticus has remarkable genetic diversity which could be due to a relatively high frequency of recombination events [29,32,33]. In conclusion, the V. parahaemolyticus population shows a high level of genetic diversity, which has evolved as a result of mutations and recombination events. Exposure of V. parahaemolyticus to changing environmental conditions could further result in the selection of new sequence types.

Mining of virulence-associated genes and characterization of the genomic and pathogenicity islands
BLASTn and VFanalyzer were used to search for virulence genes in the genomes of the five V. parahaemolyticus isolates sequenced in this study. Except for 81TDH2, none of the sequenced isolates of this study possessed the classical virulence markers (tdh and trh) commonly associated with human pathogenic V. parahaemolyticus. Interestingly, none of the sequenced isolates possessed Pir toxin-like genes, despite the fact that four of them were isolated from Vibriosis affected shrimp from aquaculture ponds. Tran et al. [12] demonstrated that V. parahaemolyticus carrying Pir toxin genes were responsible for pathological changes in the hepatopancreas of AHPND -affected animals and they were negative for the tdh and trh genes. T3SS1 and T6SS2 genes were present in all five sequenced isolates, as these systems are conserved and present in both environmental and clinical isolates of V. parahaemolyticus [16]. It is noteworthy that T6SS1 was discovered in the genomes of V. parahaemolyticus NUK/7 and SHP/2, which had previously only been found in human clinical and AHPND-positive isolates [34,35]. The two loci of T6SS present in each chromosome play a role in both inter-bacterial and host-pathogen interactions [36]. The presence of T6SS among the isolates associated with disease outbreak in this study implies that acquisition of T6SS might offer a fitness advantage for this opportunistic pathogen in out-competing other microbes in the same ecological niche and thereby influencing the colonization outcomes [37]. Similarly, T3SS's are involved in the secretion of toxins that cause cytotoxicity and enterotoxicity [25]. In contrast to T6SS, the T3SS1, known to secrete cytotoxic proteins, is present in almost all isolates, while T3SS2 is identified mainly in clinical isolates encoding tdh and/or trh genes [38,39]. Both T3SS's play a prominent role in conferring virulence to V. parahaemolyticus [5,40] VFanalyzer was used to predict 147 virulence factors related to secretion systems, toxin production, quorum sensing, iron metabolism, cellular motility, and host immune evasion genes such as adhesion and antiphagocytic factors from five sequenced genomes. The pattern of virulence profile from this study showed variation in the number of genes identified in each category, with an exception of quorum sensing and iron uptake with reference to RIMD2210633. The details of the number of genes present in each category among the five sequenced isolate and reference strain RIMD2210633 is illustrated in supplementary Figure   S1. The quorum sensing genes luxS and cqsA were present in all isolates similar to reference strain. Similarly, the genes encoding for iron uptake and chemotaxis-motility showed minimum differences compared to reference strain. A capsular polysaccharide gene cluster cpsA-Z, which is responsible for capsular biosynthesis, was found on chromosome 2 in all of the isolates included in the study. The genes encoding capsular polysaccharides function as surface antigens, which play an important role in bacterial defense against the host immune system [41]. To establish itself within the host, a pathogen must overcome the host's phagocytosis mechanism. All environmental isolates had a higher number of anti-phagocytosis-related genes than the reference strain. This is an intriguing finding, and more research is needed to understand the distribution and dynamics of capsular polysaccharide genes, as well as their relationship to pathogenicity and fitness in V. parahaemolyticus isolates. Other than NUK/7, the four sequenced isolates had ORFs encoding Type IV pilus among the adhesion factor genes. The NUK/7 genome identified two ORFs encoding Type IV pili in Yersinia sp. and an ORF encoding Pseudomonas aeruginosa LPS O antigen, identifying more genes in adhesion factors than the reference strain RIMD2210633. Adhesion factors are essential for pathogens to attach to the host, allowing the bacterium to persist in a new environment and proliferate as a symbiont or a pathogen [42]. Overall, all of the sequenced isolates possessed virulence characteristics that could allow them to colonize the host and cause disease.
Using the BLAST atlas tool, we compared the chromosomes of the five V. parahaemolyticus isolates sequenced with the reference genome of V. parahaemolyticus RIMD2210633 (Figure 3). The chromosome comparison revealed that the genomes had congruent gene content, variability was found in the pathogenicity island (VPaI) region and other mobile genetic elements. These pathogenicity islands on the chromosome are horizontally acquired regions flanked by repeated regions that improve the bacterium's virulence and fitness in a specific environment [43]. Seven VPaIs have been identified in the genome of V. parahaemolyticus RIMD2210633 [43,44]; VPaI 5 was absent in all sequenced isolates; only SHP/2 and 81TDH2 were positive for VPaI 2; 81TDH2 harbored the complete VPaI 7 while all the other isolates had a partial sequence; partial sequences of VpaI 6 was present in all isolates; a partial island sequence of VPaI 1 in NUK/7 and 81TDH2; and a partial VPaI 4 sequence in SHP/2 and HP1 ( Figure 4). BLASTn results showing a sequence with >80% sequence coverage and >95% of sequence identity to reference sequence was considered to harbor V. parahaemolyticus pathogenicity island (VPaI). Among the seven islands, VPaI 1, 4, and 5 are associated with the pandemic clones of V. parahaemolyticus while VPaI 2, 3, and 6 are completely or partially present in all isolates regardless of their pandemic potential or pathogenicity, which was evident in this study as well [4,43,45,46]. VPaI 7 harboring tdh and/or trh genes and T3SS2 cluster within its genomic region [16] is usually associated with the highly pathogenic strains of V. parahaemolyticus, having the ability to induce cytotoxicity and enterotoxicity and cause inflammatory gastroenteritis in humans [20,29]. This indicates that the emergence of environmental isolates with the acquisition of pathogenicity and fitness genes is a potential threat to public health [19].

Toxins and antimicrobial resistance genes
We also identified the gene coding for ACE toxin in the chromosomes of V. parahaemolyticus isolates sequenced in this study (81TDH2, NUK/7), which may give them an advantage in their environment. Accessory cholera enterotoxin (ACE) is a known Ca2 + dependent symporter that aids bacteria in ion transport across the membrane encoded in the "virulence cassette" of Vibrio cholerae [47]. Another promising toxin encoded by the zot gene is zonula occludens toxin (ZOT), a secreted toxin that increases intestinal permeability and is found in many Vibrio spp. Among the sequenced isolates, zot was found in NUK/7, VP32, and 81TDH2, which may improve their pathogenicity to shrimps and humans. They are known to cause actin cytoskeleton disruption in non-toxigenic clinical V. parahaemolyticus, increasing the permeability and enterotoxigenic effect in host cells [47,48]. The RTX toxin was present in the genomes of all five sequenced V. parahaemolyticus isolates. These toxins associated with the Type I secretion system in Gram-negative bacteria have been shown to have a variety of effects, including pore formation, hemolysin, and cytotoxic activity. They are known to contribute to the virulence of several Vibrio species, including V. cholerae and Vibrio vulnificus. RTX toxins are reported in V. vulnificus and known to cause acute cytotoxicity in host cells upon contact [49]. Thadtapong et al. [50] have made similar observation wherein non-AHPND  V. parahaemolyticus carrying ACE and ZOT along with elements of secretion system showed virulence to shrimp with the characteristic features identical to AHPND pathology. Based on these observations, we surmise that the presence of ACE, ZOT, and RTX toxin genes along with the identified secretion systems and genomic islands might have enhanced the pathogenic potential of these V. parahaemolyticus to cause mass mortalities in L. vannamei.
Bacterial resistance mechanisms are critical for their survival in the environment. Antibiotic use in the aquaculture industry has also resulted in shrimp pathogens developing resistance to the majority of commonly used antibiotics [51]. PATRIC analysis of sequenced genomes revealed an average of 74 genes responsible for antibiotic and heavy metal resistance ( Figure S2). Multidrug resistance genes predominated in all the isolates. The genes responsible for resistance to acriflavine, fluoroquinolones, tetracycline, and β-lactamase were identified. In the presence of bactericidal heavy metals, environmental bacteria in the estuarine and marine environment tend to adopt mechanisms to overcome heavy metals [52]. All environmental strains tested in this study possessed resistance genes against heavy metals, such as zinc, cadmium, copper, cobalt, arsenic, and chromium, which are common heavy metals in the natural habitat where V. parahaemolyticus is present.

Presence of mobile genetic elements
MOB Suite found four plasmids in NUK/7, two each in SHP/2 and VP32, and no plasmids in HP1 or 81TDH2. Plasmids in NUK/7 contained toxin genes coding for a putative insecticidal toxin from the RhsA superfamily, ACE, and ZOT toxins; SHP/2 contained genes for ACE, and ZOT, and VP32 had plasmid-encoded ZOT toxin genes. None of the plasmids contained the PirA and PirB binary toxin genes, which are found in V. parahaemolyticus strains that cause AHPND. Plasmids, as mobile genetic elements encoding toxin genes in sequenced isolates of V. parahaemolyticus, may contribute to improved survival in their ecological niches and to genetic diversity. Other than 81TDH2, four of the five sequenced isolates showed the presence of phage sequences in their genome (Table S3). ORFs from f237 were found among the identified prophage sequences. The phage f237 has been linked to pandemic clones since 1996, and is thought to play a key role in increasing bacterial virulence [53]. Furthermore, VP32 had prophage sequences from the Myoviridae family, while the HP1 and SHP/2 genomes contained prophage sequences from the Myoviridae and Siphoviridae families. These phages are members of the order Caudovirales, which includes marine bacteriophages known to infect Vibrio sp [53,54]. Apart from being acquired horizontally, some of these prophages could have integrated into the sequenced bacterial genome and may have been transferred vertically through generations [55]. Prophage sequences may be responsible for some of the genetic diversity observed in the isolates studied. CRISPR plays a crucial role in the interaction of bacteria and mobile genetic elements. Six CRISPR sequences have been identified in V. parahaemolyticus to date. Clinical isolates have these CRISPR sequences in their genome [56], but very few CRISPR sequences have been identified in V. parahaemolyticus environmental isolates. CRISPR finder identified a few potential/dubious CRISPR sequences from the sequenced genomes; however, no confirmed CRISPR sequences were found in any of our sequenced isolates (Table  S3). CRISPR/Cas systems are immune genes that prevent the acquisition of mobile genetic elements in bacteria. Fu et al. [57] have shown that the absence of CRISPR/Cas elements contributes to genome plasticity due to the horizontal acquisition of new virulence and fitness genes in bacteria. In contrast to clinical isolates known to have CRISPR sequences, environmental isolates lacking CRISPR elements have a better chance of acquiring conjugative genomic elements encoding resistance or virulence genes.

Pan and core genome phylogeny analysis
The genome of five V. parahaemolyticus sequenced that was subjected to pan-genome analysis with 40 global isolates retrieved from NCBI GenBank using Roary revealed 19, 531 protein-coding genes. From the analysis, it is evident that as the V. parahaemolyticus genomes accumulated, the number of core genes reduced while the number of pan genes increased. This demonstrates the versatility of V. parahaemolyticus open pan-genome, which is a characteristic feature for bacteria with the capability to acquire new genes via HGT [58,59]. The open pan-genome also illustrates the diversity in the V. parahaemolyticus genome within the species. To determine the phylogenetic relationship of V. parahaemolyticus core genomes, a concatenated core genome-based phylogenetic tree was constructed using 2,321 identified core genes ( Figure 5). From the core genome phylogeny, it is evident that all the newly sequenced V. parahaemolyticus isolates were polyphyletic; whereas, the genome of the majority of clinical isolates exhibited monophyletic branching. A pan-genome phylogenetic tree constructed, along with a gene presence-absence matrix (Figure 6), also revealed polyphyletic branching of the isolates sequenced in this study. Although the analysis is preliminary in nature, the genetic variation observed in the V. parahaemolyticus genome could be ecologically significant as a strategy to expand and adapt to various ecological niches as well as the driving force in the evolution of this organism [60,61] In summary, this study unveils the genetic diversity and presence of potential virulence determinants in V. parahaemolyticus isolated from the disease outbreak among cultured L. vannamei in India. Despite the fact that these isolates were recovered from moribund shrimp and tested negative for the pVA plasmidencoded PirAB toxin, which is associated with AHPND and shrimp virulence, their pathogenicity to the shrimp could be due to other identified toxins such as ZOT, ACE, RTX, and secretory apparatus such as T6SSs. The acquired genomic islands and mobile genetic elements detected in these isolates might be responsible for their transition from an environmental niche to an aquaculture pathogen by increasing its host adaptation fitness. The results presented in this study is limited to few isolates recovered from the disease outbreak; hence, future genomic studies involving a large number of isolates recovered from vibriosis affected shrimp and their rearing environment is necessary to understand the evolution of V. parahaemolyticus pathotypes and novel genetic features responsible for causing mass mortalities in culture systems. Such studies could also provide valuable information on the design and implementation of biosecurity programs to prevent disease outbreaks and develop appropriate response strategies for shrimp aquaculture management.

Isolation and Identification of V. parahaemolyticus from moribund shrimp
A total of 100 moribund shrimps were randomly sampled from 10 ponds where mass shrimp mortalities were observed, and bacteriological analysis was performed to identify the causative agent associated with the disease. To isolate Vibrios, hemolymph and hepatopancreas from each animal was plated on thiosulfate citrate bile salt sucrose (TCBS) agar (HiMedia Laboratories Pvt. Ltd, India). Bluish-green colonies obtained after 24 h of incubation at 30°C were subjected to a battery of biochemical tests to identify the Vibrio sp., as described previously [15]. Polymerase chain reaction (PCR) was used to confirm the species of the vibrio isolates by targeting the toxR and tlh genes [22,23]. All samples were also tested for the important DNA and RNA viruses affecting shrimp aquaculture in Asia by PCR as described in the OIE Aquatic Manual [24].

Whole-genome sequencing and assembly
Whole-genome sequencing and comparative genomic analysis were performed on four confirmed cultures of V. parahaemolyticus isolated from moribund shrimp and one isolate from seafood (oyster) harvested from India's southwest coast. All isolates were grown overnight at 30 °C in Luria Bertani (LB) broth supplemented with 3% NaCl and the genomic DNA was extracted by the cetyl trimethyl ammonium bromide (CTAB)proteinase K method. Whole-genome sequencing library of genomic DNA was prepared using the Nextera XT kit (Illumina). The Illumina MiSeq platform was used to prepare the sequencing libraries. The quality of the raw reads was assessed using FastQC v0.11.8 and MultiQC v1.6 [62], and contamination was detected using Kraken v2 [63]. Trim Galore v0.6.2 [64] was used to perform low-quality base calling and adapter sequence trimming. The trimmed paired reads were subjected to de-novo assembly using Unicycler v0.4.8 [65], with output files generated in FASTA format. QUAST [66] was used to evaluate the quality of the assembled genomes. MOB-Suite v1.4.9 [67] was used to separate genomic and plasmid DNA. The genomic DNA contigs were reduced to single scaffolds as chromosomes 1 and 2 using MeDuSa v2/ 1.6 [68]. In addition, genome sequences of 40 V. parahaemolyticus isolates available in the NCBI database (Table 3) based on parameters, such as their global presence, source, and isolate characteristics (AHPND

In-silico taxonomic validation and genome annotation
The species identity was confirmed using JSpecies v1.2.1 [69] to calculate the pairwise average nucleotide identity (ANI) of the assembled genomes with V. parahaemolyticus RIMD2210633. In addition, the GGDC 2.0 server [70] was used to calculate digital DNA-DNA hybridization (dDDH). The Rapid Annotations using Subsystems Technology (RAST) server [71] was used to perform structural gene prediction and functional annotations. Using the BLAST Ring Image Generator (BRIG) [72], the annotated chromosomes were visualized in the BLAST atlas. Phigaro v2.2.0 [73] helped detect the presence of putative prophage sequences in assembled genomes. VFanalyzer [74] and NCBI BLASTn were used to compare the presence of putative virulence factors with V. parahaemolyticus RIMD2210633. To predict the presence of antibiotic resistance genes, PATRIC and the CARD web portal were used and CRISPRfinder [75] served to annotate the CRISPR elements in the V. parahaemolyticus sequenced genomes.

MLST and phylogenetic analysis
The stringMLST tool [76] helped perform in-silico MLST analysis on the sequenced strains. Furthermore, This study a neighbor-joining phylogenetic tree calculated with default parameters from concatenated sequence alignments of seven housekeeping genes from five sequenced isolates and 40 global sequences was constructed using iTOL v3 [77] for the MLST-based phylogenetic analysis.

Pan genome analysis
Roary v3.13.0 [78] was employed for the pan-genome analysis. To generate gff3 files, 40 NCBI GenBank reference strains and 5 newly sequenced strains were annotated using Prokka v1.13.4 [79]. Thereafter, the annotated files were subjected to pan genome analysis with a minimum BLASTP identity cutoff of 95%. The genes present in 99% of the total isolates were identified as core genes. The resulting core genome alignment was then used to build a core genome-based phylogenetic tree with IQ-TREE v 1.5.5.3 [80] with 1000 bootstrap replicates and visualized with iTOL v3 [77]. Similarly, a pan-genome-based phylogenetic tree was constructed and visualized with the gene presenceabsence profile of the 45 isolates analyzed.