Identification of Streptococcus suis putative zoonotic virulence factors: A systematic review and genomic meta-analysis

Abstract
Streptococcus suis is an emerging zoonotic pathogen. Over 100 putative virulence factors have been described, but it is unclear to what extent these virulence factors contribute to the zoonotic potential of S. suis. We identified all S. suis virulence factors studied in experimental models of human origin in a systematic review and assessed their contribution to zoonotic potential in a subsequent genomic meta-analysis. PubMed and Scopus were searched for English-language articles that studied S. suis virulence published until 31 March 2021. Articles that analyzed a virulence factor by knockout mutation, purified protein, and/or recombinant protein in a model of human origin were included. Data on virulence factors, strain characteristics, the human models used and experimental outcomes were extracted. All publicly available S. suis genomes with available metadata on host, disease status and country of origin were included in a genomic meta-analysis. We calculated the ratio of the prevalence of each virulence factor in human and pig isolates. We included 130 articles and 1703 S. suis genomes in the analysis. We identified 53 putative virulence factors that were encoded by genes which are part of the S. suis core genome and 26 factors that were at least twice as prevalent in human isolates as in pig isolates. Hhly3 and NisK/R were particularly enriched in human isolates after stratification by genetic lineage and country of isolation. This systematic review and genomic meta-analysis identified virulence factors that are likely to contribute to the zoonotic potential of S. suis.


Introduction
Streptococcus suis is an opportunistic pathogen in pigs and can cause zoonotic infections that often result in meningitis [1,2]. S. suis zoonotic infections occur worldwide, with the highest reported incidence in Thailand, Vietnam and The Netherlands [1]. Close contact with pigs and consumption of undercooked pork have been identified as important risk factors for zoonotic S. suis infections [1]. The emergence of zoonotic clones has been demonstrated and has led to new insights into the evolution of the S. suis population structure [3], but the virulence factors involved in the zoonotic potential of S. suis are not well understood.
S. suis of multiple serotypes from different phylogenetic groups (clonal complexes) are found in healthy and diseased pigs, but human infections are predominantly caused by strains from clonal complex 1 and serotypes 2 or 14 [1,4]. Distinct stages in the pathogenesis of S. suis infections in humans include adhesion to and translocation across mucosal surfaces, particularly in the case of foodborne infection, survival in blood, and translocation across the blood-brain barrier in the case of meningitis [5]. Over 100 putative S. suis virulence factors have been described that may contribute to the pathogenesis of infection in pigs [4,6]. Although many of these virulence factors were identified in in vitro models of human origin, their contribution to S. suis zoonotic potential has not been studied.
We performed a systematic review of S. suis virulence factors studied in in vitro models of human origin. In a subsequent genomic meta-analysis we determined if these putative virulence factors are encoded by the S. suis core or accessory genome and identified those virulence factors that may contribute to the pathogenesis of zoonotic infection, designated putative zoonotic virulence factors (PZVFs).

Definitions
Virulence factors can be defined as "molecules produced by pathogens that contribute to the pathogenicity of the organism by allowing its establishment, replication, dissemination and persistence in the host" [4]. Here, we define a PZVF as a virulence factor of a bacterial pathogen from an animal reservoir that contributes to pathogenicity in the human host specifically. We define human models as in vitro models of human origin, including cell lines in continuous culture of human origin, human primary cells, human blood, human blood components, human extracellular matrix proteins, and the zebrafish human streptococcal infection model [7].

Search strategy and selection criteria
The systematic review was performed according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [8]. TR searched PubMed and Scopus for primary research articles published until 31 March 2021 describing S. suis and virulence in the title and/or abstract using PubMed PubReMiner to generate the search query (appendix S1 p1) [9]. References were downloaded and duplicates were removed using EndNote (9.3.3), Mendeley (1.19.8) and a manual search. TR and KA independently screened all titles and abstracts and selected articles that mentioned a host (e.g. human or pig) and S. suis and both agreed on the final selection for full text screening, which was done by TR. Studies were included when a virulence factor was evaluated in a human model and the virulence factor was studied in an isogenic knockout (KO) mutant, as recombinant protein, and/or as purified protein. Articles were excluded when the full text was unavailable in English. Experimental outcomes included bacterial binding of host proteins, adhesion, invasion, translocation, survival and immune cell responses.

Data extraction
TR and ST extracted data from the included articles in a pre-specified table in Microsoft Excel 2016 (appendix S1 p2, appendix S2), followed by an overall curation of extracted data by KA. In short, we extracted information on virulence factor analysis approach (KO, recombinant protein and/or purified protein), S. suis strain characteristics, applied in vitro models, experimental outcomes and NCBI protein ID. If NCBI protein ID was not stated, the NCBI protein ID was searched manually using available data such as primer sequences, gene names or protein sequences. Experimental outcomes for single virulence factors studied in at least 5 articles were summarized and compared. As part of a critical appraisal, data on growth rates of wildtype, isogenic KO and complementation mutants were extracted from the articles or articles' references. In addition, the number of S. suis strains analyzed in each study was recorded.

Bacterial genome meta-analysis
We downloaded all BioSample records from NCBI mentioning "Streptococcus suis" (final date 31-01-2020). Missing metadata were searched for in the corresponding publications and pubMLST [10], and added where found. Genomes were included if at least metadata on host, host health status, and country of origin were available (see appendix S1 p3). The curated set of assembled genomes with corresponding metadata was deposited on Zenodo (10.5281/zenodo.4686597).
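The inclusion rule above can be illustrated with a minimal sketch. The field names (`host`, `health_status`, `country`) and the example records are hypothetical and are not the BioSample attribute names or accessions used in this study.

```python
# Hypothetical field names; real BioSample attributes may be named differently.
REQUIRED_FIELDS = ("host", "health_status", "country")


def has_required_metadata(record: dict) -> bool:
    """Return True if every required metadata field is present and non-empty."""
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            return False
    return True


def select_genomes(records: list) -> list:
    """Keep only the records that satisfy the metadata inclusion rule."""
    return [r for r in records if has_required_metadata(r)]


# Made-up records for illustration only.
records = [
    {"id": "example_1", "host": "human", "health_status": "diseased", "country": "Vietnam"},
    {"id": "example_2", "host": "pig", "health_status": "", "country": "China"},
    {"id": "example_3", "host": "pig", "health_status": "healthy"},
]

included = select_genomes(records)
print([r["id"] for r in included])
```

In this toy input only the first record carries all three fields, so it is the only one retained.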
The presence of a virulence factor in S. suis isolates was determined by mapping its protein sequence against the translated S. suis genome assemblies, requiring a minimal protein identity of 95% and a minimal query coverage of 60% (see appendix S1 p3 [11][12][13][14]). We defined the core genome as all genes present in ≥95% of the isolates; the remaining genes constituted the accessory genome.
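The two thresholds described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in this study: the function names are hypothetical, and the identity and coverage values are assumed to come from whatever alignment tool produced the hits.

```python
def is_present(identity: float, coverage: float,
               min_identity: float = 95.0, min_coverage: float = 60.0) -> bool:
    """Call a factor present if a hit meets both percentage thresholds."""
    return identity >= min_identity and coverage >= min_coverage


def classify_genes(presence: dict, core_fraction: float = 0.95) -> dict:
    """Label each gene 'core' if present in >= core_fraction of isolates, else 'accessory'.

    `presence` maps a gene name to a list of per-isolate presence calls (bools).
    """
    labels = {}
    for gene, calls in presence.items():
        fraction = sum(calls) / len(calls)
        labels[gene] = "core" if fraction >= core_fraction else "accessory"
    return labels


# Made-up presence calls for 20 hypothetical isolates.
presence = {
    "geneA": [True] * 20,                 # present in 20/20 isolates
    "geneB": [True] * 19 + [False],       # present in 19/20 isolates (95%)
    "geneC": [True] * 10 + [False] * 10,  # present in 10/20 isolates
}

print(classify_genes(presence))
```

With these toy calls, geneA and geneB fall in the core genome and geneC in the accessory genome.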
For each virulence factor, we calculated the ratio of its prevalence in S. suis isolates from humans to its prevalence in isolates from pigs (healthy and diseased). A virulence factor was considered a PZVF if its prevalence ratio was above 2. A stratified analysis was performed for the main zoonotic S. suis lineage (clonal complex 1) and the countries contributing most human isolates (China, Vietnam).
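A minimal sketch of the prevalence ratio and its stratified variant is given below. The record layout, the function names and all counts are hypothetical. Returning infinity when a factor is absent from all pig isolates is an assumption of this sketch, since the handling of that case is not described here.

```python
import math


def prevalence(n_positive: int, n_total: int) -> float:
    """Fraction of isolates carrying the factor."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_positive / n_total


def prevalence_ratio(human_pos: int, human_total: int,
                     pig_pos: int, pig_total: int) -> float:
    """Prevalence in human isolates divided by prevalence in pig isolates."""
    p_human = prevalence(human_pos, human_total)
    p_pig = prevalence(pig_pos, pig_total)
    if p_pig == 0:
        # Assumption of this sketch: undefined ratios are reported as inf or nan.
        return math.inf if p_human > 0 else math.nan
    return p_human / p_pig


def is_pzvf(ratio: float, threshold: float = 2.0) -> bool:
    """Apply the 'prevalence ratio above 2' rule."""
    return ratio > threshold


def stratified_ratio(isolates: list, factor: str, **stratum) -> float:
    """Prevalence ratio restricted to isolates matching the stratum, e.g. lineage='CC1'."""
    subset = [i for i in isolates if all(i.get(k) == v for k, v in stratum.items())]
    humans = [i for i in subset if i["host"] == "human"]
    pigs = [i for i in subset if i["host"] == "pig"]
    return prevalence_ratio(
        sum(factor in i["factors"] for i in humans), len(humans),
        sum(factor in i["factors"] for i in pigs), len(pigs),
    )


# Made-up isolates: "X" is a hypothetical factor name.
isolates = (
    [{"host": "human", "lineage": "CC1", "factors": {"X"}}] * 3
    + [{"host": "human", "lineage": "CC1", "factors": set()}]
    + [{"host": "pig", "lineage": "CC1", "factors": {"X"}}]
    + [{"host": "pig", "lineage": "CC1", "factors": set()}] * 3
    + [{"host": "pig", "lineage": "other", "factors": set()}] * 4
)

overall = stratified_ratio(isolates, "X")                   # all isolates
within_cc1 = stratified_ratio(isolates, "X", lineage="CC1")  # CC1 only
print(overall, is_pzvf(overall))
print(within_cc1, is_pzvf(within_cc1))
```

In this toy data the ratio drops when the comparison is restricted to one lineage, which is the kind of lineage effect the stratified analysis is meant to expose.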

Results
Titles and abstracts of 713 unique records were screened and 411 articles were selected for full text screening. Of these 411, 268 articles did not meet inclusion criteria and 13 were excluded due to unavailability of full text in English. The 130 included articles described 124 different putative virulence factors (Figure 1).
Putative virulence factors were studied as purified protein (3), as recombinant protein (51), as (partial) isogenic KO (152), by blocking protein function with antibodies (3) or as a combination of these. For 56/152 (37%) of the isogenic KO mutants, changes in growth rate compared to parental wildtype were not assessed. For 72/152 (47%) growth rate of KO mutants was reported as unaffected and for 24/152 (16%) impaired growth was observed for the KO mutant. In only 43 (28%) studies the KO mutant was genetically complemented and three articles (2%) described complementation with a recombinant protein.
Models used to evaluate putative virulence factors were grouped based on the human body sites from which the model originated (Figure 2). The human epithelial HEp2 cell line was used in 63 out of 72 articles that studied adhesion, invasion or cell lysis induced by S. suis in a human epithelial model. Adhesion to extracellular matrix was studied in 13 different articles that used laminin (3), collagen (1), fibronectin (10) and/or fibrinogen (7). Survival in blood was studied in 70 articles using a diverse set of models, of which human whole blood (19), human (polymorphonuclear) neutrophils (15) and zebrafish (15) were most frequent. Human brain microvascular endothelial cells (BMECs) were used in 14 out of 25 articles that studied the role of a virulence factor in crossing the blood-brain barrier (BBB).

Experimental outcomes
Five out of 124 (4%) putative virulence factors (appendix S1 p2, appendix S2) were studied in at least 5 articles and the experimental outcomes were summarized and compared for each factor to evaluate their contribution to zoonotic potential (appendix S1 p4-6).

Factor H binding protein (Fhb)
Fhb, also named streptococcal adhesin protein (SadP), is anchored in the cell wall and secreted [53]. Fhb can bind proteins from the host complement system as well as glycans [53][54][55]. Fhb can bind to Gb3 on human erythrocytes and a specific allele of Fhb (SadPn) can also bind to Gb4 [56]. Fhb contributes to S. suis adhesion to and translocation across a Caco-2 monolayer by binding to Gb3 [57]. Fhb can bind human factor H, which increases S. suis adherence to airway epithelial A549 cells [29]. An Fhb KO showed decreased binding to vascular endothelial EA.hy926 cells [56]. Fhb contributes to S. suis survival in whole blood [53] and intracellular survival in PMN [53,55]. Fhb can bind factor H [52,53,55] and C3 simultaneously [53] and an Fhb KO showed decreased factor H binding and increased C3b/iC3b deposition [53,55]. Secreted Fhb lowers C3b/iC3b deposition on an Fhb KO mutant and restores PMN intracellular survival of S. suis [53]. However, an Fhb KO in a different strain still bound factor H and degraded C3b, whilst THP-1 phagocytosis of the KO mutant was unaffected [29]. Translocation across and adhesion to hCMEC/D3 cells is decreased in an Fhb KO mutant [58] and factor H binding by Fhb increases adherence to BMEC cells [29]. Fhb was shown to bind fibrinogen [48].

Enolase
Enolase is a multifunctional protein with glycolytic functions and plasminogen binding abilities, and is found in many organisms [59]. In S. suis, enolase was found within the cytoplasm and on the cell surface, although it lacks an LPXTG motif [60]. Blocking enolase functioning with recombinant protein or polyclonal antibodies was shown to decrease adherence to HEp2 cells [61][62][63]. Enolase was shown to bind fibronectin [61], laminin [61] and factor H [52]. 40S ribosomal protein SA (RPSA), a protein involved in BBB integrity, was shown to increase at the cell surface of hCMEC/D3 cells when treated with enolase [64]. In transfected HEK-293T cells it was demonstrated that enolase can interact with RPSA [64,65]. Enolase can induce apoptosis in HEK-293T cells [65] and in hCMEC/D3 cells by interacting with RPSA [64,65]. The apoptosis induced by enolase is inhibited by caveolae, a type of lipid raft [64].
Out of 111 unique protein sequences, including multiple alleles for four proteins, 53 proteins were encoded by genes which are part of the S. suis core genome (appendix S1 p8). The remaining 58 proteins were encoded by genes which are part of the accessory genome. The presence of these 58 accessory proteins together with isolate metadata was plotted against a clustered core genome alignment [11][12][13][14] (Figure 3a) and the human-pig prevalence ratio was calculated (Figure 3b, appendix S1 p9). Six proteins (Atl1, Atl2, AtlAss, CPS9E, KAR and PK) had a prevalence ratio below 1. For 26 proteins, including MRP, Sly and CPS2B/E/F/G/J/L, which form a single operon [15], the prevalence ratio was above 2, and three of these proteins (Fhb_1, NisK and NisR) were at least ten times more prevalent in human isolates than in pig isolates.
Ninety percent of human S. suis isolates had the same genetic background (CC1) while the pig isolates were genetically more diverse (Figure 3a). To adjust for potential lineage effects, we repeated our analysis for the 52 proteins with prevalence ratio above 1 in the first analysis, but restricted to CC1 isolates. Of these 52 proteins, 35 were encoded by genes which are part of the CC1 core genome, including Sly and CPS2B/L. Four proteins (nisin-dependent two-component signal transduction system [NisK/R], putative hemolysin-III-related protein [Hhly3] and Fhb_1) had a prevalence ratio of at least 2 (appendix S1 p10-11). NisK/R and Hhly3 were initially discovered on the 89K pathogenicity island found in Chinese human S. suis outbreak isolates belonging to ST7 [66,67]. Outside ST7 but within CC1, both PZVFs were also present in 110 Vietnamese human isolates from ST1 (105), ST144 (3), ST869 (1) and ST951 (1), and in 4 Chinese human isolates from ST1 (2)

Figure 3. Presence of virulence factors in S. suis isolates and the corresponding virulence factor prevalence ratio in human isolates compared to pig isolates.
(A) The 1703 assemblies were clustered using IQ-TREE [11] based on a Roary [12] core gene alignment of a Prokka [13] annotated assembly. Presence of proteins in S. suis isolates and isolate metadata were visualized in Phandango [14]. (B) Prevalence ratio of virulence factors in human isolates over pig isolates was based on virulence factor presence in S. suis genomes.

isolates from certain countries. Therefore, we determined the prevalence ratio of these proteins per country of origin. The prevalence ratio within Chinese isolates was 6·4 for NisK/R and 5·5 for Hhly3. In addition, inclusion of multiple strains belonging to a single outbreak may cause confounding. When isolates from the Chinese outbreak in 2005 [68], all but one of which harbored NisK/R and Hhly3, were excluded, the prevalence ratio within Chinese isolates was 3·7 for NisK/R and 3·1 for Hhly3. In Vietnamese isolates the prevalence ratio was 1·5 for NisK/R and 1·4 for Hhly3. NisK/R and Hhly3 were not detected in human isolates from countries other than China and Vietnam.

Discussion
We identified 124 S. suis putative virulence factors studied in a human model in our systematic review. In our subsequent genomic meta-analysis, we identified 26 putative virulence factors with prevalence at least two times higher in human isolates than in pig isolates, which were therefore considered as PZVFs.
The five virulence factors most studied in in vitro models of human origin were CPS, Sly, MRP, Fhb and enolase. The contribution of these five virulence factors to S. suis virulence has also been studied in vivo in pig and mouse infection models. In a review of studies of Sly, MRP and Fhb, these putative virulence factors were found not to be critical for virulence in all models [4]. CPS was shown to contribute to S. suis virulence in vivo in pigs and mice [15,21,69,70,71]. Both Sly and MRP contributed to virulence in mice [50,51,28,72,73], but a Sly or MRP KO did not show decreased virulence in pigs [38,43,74]. Fhb was shown to contribute to virulence in pigs [55] and to be essential to cross the BBB via Gb3 in mice [58]. Enolase was only tested in mice, where it increased BBB permeability [75]. Pig and mouse in vivo infection models thus appear to yield different outcomes for certain virulence factors. A similar observation was made for the virulence of different S. suis serotype 2 strains, which differed between experimental infections in pigs and mice [2]. These data indicate that, although we can learn much from these in vivo models, the translation of mouse or pig infection studies to human S. suis pathogenesis can be challenging.
In our genomic meta-analysis, proteins involved in the serotype 2 capsular polysaccharide biosynthesis were more prevalent in zoonotic isolates, confirming epidemiological observations [1]. Sly, MRP, and Fhb were identified as PZVFs, while enolase was found to be part of the S. suis core genome and was therefore not identified as a PZVF. Only Fhb_1 remained more prevalent in human than in pig isolates within the CC1 lineage. In a previous genomic analysis, a comparison between human and pig isolates from Vietnam and pig isolates from the UK did not find a substantial enrichment of specific accessory genes in human isolates [76]. The prevalence of virulence factors was higher in clinical pig isolates than in non-clinical isolates from the UK [76]. Putative virulence factors were more abundant in Dutch zoonotic isolates than in non-zoonotic isolates [3]. Zoonotic and non-zoonotic strains could only be separated based on their accessory genome and not based on their core genome [3]. Moreover, zoonotic isolates with dissimilar core genomes showed similarity in their accessory genome [3], implying that PZVFs are most likely part of the accessory genome. Here, 53 of the putative virulence factors were encoded by genes which are part of the S. suis core genome. Given their function (appendix S2), many of these putative virulence factors are likely to be involved in S. suis metabolism, although a role in pathogenesis cannot be ruled out. As was noted before and was also observed in this study, many S. suis putative virulence factors have not yet been thoroughly characterized [4]. Most virulence factors were studied in a single isolate instead of multiple isolates, introducing potential bias [77]. An additional concern is that isogenic KO mutants used to study the virulence factors were not always properly characterized. In 37% of the studies that used an isogenic KO mutant, the impact of the mutation on growth rate was not verified. An effect of the KO on the experimental outcome that is due to changes in growth rate, instead of or in addition to a potential functional effect, therefore cannot be ruled out.
Independent parallel genomic acquisition events can introduce different PZVFs that could drive the emergence of a zoonotic S. suis lineage, as observed in the Dutch zoonotic CC20 lineage [3]. Such acquisition events could explain why NisK/R or Hhly3 are not present in all human S. suis isolates. These findings suggest that these specific PZVFs are not essential for zoonotic potential per se, as the acquisition of other genes could confer zoonotic potential as well. However, within the zoonotic CC1 lineage or after stratification by country of origin, NisK/R and Hhly3, as well as Fhb_1, are still more prevalent in human isolates than in pig isolates, suggesting that these PZVFs contribute to zoonotic potential.
Hhly3 is a cholesterol-independent hemolysin first discovered in the foodborne pathogen Bacillus cereus [78] and later also identified in the foodborne pathogen Vibrio vulnificus [79]. Hhly3 monomers bind in a temperature-dependent fashion to host cell membranes and form 3-3·5 nm pores after multimerization [80]. The cholesterol independence of Hhly3 could give S. suis the ability to form pores in membranes with low or unavailable cholesterol, such as endosomes [81]. The contribution of Hhly3 to S. suis virulence has not yet been studied in in vivo pig or mouse infection models.
Nisin is a lantibiotic produced by several Lactococcus and Streptococcus species with antimicrobial properties against Gram-positive and Gram-negative bacteria [82]. Operons conferring nisin resistance in strains that cannot produce nisin themselves have mainly been found in human pathogenic strains, including Streptococcus mutans and Streptococcus agalactiae [82]. In S. suis, three independent acquisitions of nisin resistance genes have been reported. A complete nisin production and resistance locus including NisK/R was found on two different pathogenicity islands in two unrelated strains [83,84]. NisK/R was also present on the 89K pathogenicity island in a CC1/ST7 strain from China [66]. Here we also detected NisK/R in CC1/ST1 strains from Vietnam. Besides conferring nisin resistance, NisK/R could potentially contribute to zoonotic potential by regulating gene expression [85]. NisK/R was demonstrated to contribute to S. suis virulence in mice [66]. A NisK/R KO mutant was shown to have decreased hemolytic activity and decreased adhesion to and invasion of HeLa cells [66].
Our study has several limitations. To determine the presence of the putative virulence factors in S. suis genomes, we mapped the proteins to the assembled genomes using a minimal identity of 95%. Although this cutoff can distinguish between virulent and avirulent MRP [50], it cannot distinguish small differences at the amino acid level. However, a single amino acid change can affect the function of a putative virulence factor, as for example recently shown for SadP [86]. Additionally, we included two articles in the systematic review that studied sRNAs [87,88], but our protein mapping approach did not permit meta-analysis of regulatory RNA molecules or regulatory non-coding DNA sequences that could contribute to virulence. Moreover, we determined the presence of single virulence factors and did not study a potential combined effect of virulence factors. For the proteins encoded by genes of the accessory genome we attempted to compare their prevalence in human and pig isolates per study, which would allow for a combined statistical analysis comparable to an individual patient data meta-analysis. However, only a single study systematically sampled both pig and human isolates [3], precluding such a meta-analysis. Whilst we included all S. suis genomes with accompanying metadata present in NCBI BioSample, for 31% of BioSample records metadata were lacking, likely introducing further bias. We tried to overcome this limitation partly by performing our analysis within genomic lineage CC1 and for individual countries.
Genomic determinants associated with particular bacterial traits are increasingly identified using genome-wide association studies. Such studies require confirmation of the biological relevance of genes with a significant association. Here, we used a different approach, starting with a systematic identification of functional proteins and subsequently estimating their relative frequency in genomes of strains representing different S. suis populations. The collected metadata with corresponding assembled genomes and the list of PZVFs are valuable tools for further research into the zoonotic potential of S. suis, the pathogenesis of zoonotic S. suis infections, and for early detection of emerging zoonotic lineages.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This research has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 727966 (https://cordis.europa.eu/project/id/727966).

Contributors
CS, TR and KA conceived the study. TR and KA did the abstract screening. Full text was read by TR and data were extracted by ST and TR. KA curated included articles and data extraction. TR collected metadata from BioSample records and divided records over groups. Genomic analysis was performed by BP. TR made visualizations and drafted the manuscript. All authors contributed to the final version of the manuscript.

Data availability statement
The data that support the findings of this study are available in Zenodo at https://doi.org/10.5281/zenodo.4686597. These data were derived from the following resources available in the public domain: NCBI (https://www.ncbi.nlm.nih.gov/).