Molecularly defined extraintestinal pathogenic Escherichia coli status predicts virulence in a murine sepsis model better than does virotype, individual virulence genes, or clonal subset among E. coli ST131 isolates

ABSTRACT Background: Escherichia coli ST131, mainly its H30 clade, is the leading cause of extraintestinal E. coli infections but its correlates of virulence are undefined. Materials and methods: We tested in a murine sepsis model 84 ST131 isolates that differed by country of origin (Spain vs. USA), clonal subset, resistance markers, and virulence genes (VGs). Virulence outcomes, including illness severity score (ISS) and “killer” status (>80% mouse lethality), were compared statistically with clonal subset, individual and combined VGs, molecularly defined extraintestinal and uropathogenic E. coli (ExPEC, UPEC) status, and country of origin. Results: Virulence varied widely by strain. Univariable correlates of median ISS and percent “killer” (outcomes if variable present vs. absent) included pap (ISS, 4.4 vs. 3.8; “killer”, 71% vs. 46%), kpsMII (4.1 vs. 2.3; 59% vs. 25%), K2/K100 (4.4 vs. 3.2; 77% vs. 41%), ExPEC (4.2 vs. 2.2; 62% vs. 17%), Spanish origin (4.3 vs. 3.1; 65% vs. 36%), and H30R1 subset (2.5 vs. 4.1; 35% vs. 59%). With multivariable adjustment, ExPEC status was the only consistently significantly predictive variable. Conclusion: Within ST131 the strongest predictor of experimental virulence was molecularly defined ExPEC status. Clonal subsets seemed to behave differently in the murine sepsis model by country of origin.


Background
The pandemic extraintestinal pathogenic Escherichia coli (ExPEC) clone ST131 is a major contributor to the increasing incidence of extraintestinal E. coli infections, mainly bloodstream and urinary tract infections, especially those caused by fluoroquinolone-resistant or extended-spectrum beta-lactamase (ESBL)-producing strains. ST131 also occurs in the gut microbiota of healthy and institutionalized persons [1].
Like other E. coli clones from virulence-associated phylogroup B2, ST131 exhibits a broad range of genes that encode known or suspected virulence factors, hence are called virulence genes (VGs). Such VGs, which contribute to adherence, colonization, invasion, and/or persistence in the host, include siderophores (iutA, fyuA, iroN), adhesins (fimH, pap, afa/dra, iha, yfcV), toxins (sat, vat), protectins (traT, iss, capsule variants), and miscellaneous elements (cvaC, ompT, usp, malX). ST131's VGs have been proposed as a possible reason for its dramatic dissemination and clinical emergence. Moreover, some lineages within ST131 have been associated with sepsis, worse clinical outcomes, and errors in empirical treatment [5,6]. Whether for ST131 particular VGs or combinations thereof are required for, or associated with, successful colonization, establishment of infection, or progression to severe disease is unclear.
Studies to date of the experimental virulence of ST131 in diverse animal hosts (mice, zebrafish, Caenorhabditis elegans, and Galleria mellonella) have yielded conflicting results [7-10]. Moreover, several authors have identified specific combinations of VGs, or "virotypes", within ST131 [8,11,12] that in some studies predicted experimental virulence [8]. However, interpretation of these studies is impeded by their small sample size, inconsistent selection of VGs, limited attention to ST131 clonal subsets, and diversity of animal models, including some of uncertain relevance to human infections.
Because of the importance of identifying potentially virulent ST131 strains, we sought here to identify among E. coli ST131 isolates associations of experimental virulence with diverse bacterial traits that could act as markers for said virulence, whether or not they directly determine it. For this, we used an established murine sepsis model and a comparatively large collection of well-characterized ST131 isolates of diverse ecological and geographical origins. We then compared experimental virulence results with the strains' country of origin, virulence genotype, ST131 clonal subset, ESBL genotype, and fluoroquinolone resistance status.

Study collection and subtyping
A convenience sample of 84 diverse ST131 E. coli isolates from various previously published collections from our group was analyzed [9,13-17]. Geographical origin, year of testing, and ecological source are shown in Table 1. Whereas all isolates from Spain were tested in 2014, 81% of the isolates from the USA were tested pre-2014. Because the control strains yielded consistent results across experiments and years (not shown), we assumed that any variation associated with the year of testing was due to geographical factors. Accordingly, to avoid possible bias, we also analyzed the variable "country", despite its close correlation with "year of testing".

Experimental virulence
In vivo virulence was assessed previously using a well-established murine subcutaneous sepsis model [13,26] at the Minneapolis Veterans Affairs Medical Center (MN, USA) according to animal use protocol 120,603, as approved by the local Institutional Animal Care and Use Committee. The sepsis model results for the present isolates were reported elsewhere [13,14].
For this model, female pathogen-free outbred Swiss-Webster mice were inoculated subcutaneously with approximately 10^9 CFU/mL log-phase bacteria in 0.2 mL saline, as described previously [13,26]. Mouse health was assessed twice daily for 3 days post-challenge by experienced researchers unaware of strain identity, following a strict protocol and using positive control strain CFT073 (high lethality) and negative control strain MG1655 (zero illness or lethality). Mice were classified daily as to maximal illness severity, which ranged from 1 (healthy) to 5 (dead), with intermediate scores 2 (barely ill), 3 (moderately ill), and 4 (severely ill). Results for the controls were consistent during all the experiments, regardless of year (data not shown). The variables used as metrics of the study isolates' virulence potential included overall mean illness severity score (ISS), a continuous variable obtained by averaging the daily illness severity scores for the mice challenged with a given isolate, and "killer" status, defined based on death of ≥80% of the challenged mice [15]. Each test strain was assessed initially in five mice, followed by another five mice if the initial testing did not yield a consistent result (i.e., lethality or survival for four or five of the initially challenged mice). To minimize the risk of a possible cohort bias, mice from a given shipment were assigned to different test strains using a formal randomization scheme.
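The two virulence metrics and the five-plus-five testing rule can be illustrated with a minimal Python sketch. It assumes that each mouse is represented by its list of daily severity scores (1 to 5), that ISS pools all mouse-days for an isolate into one mean, and that a mouse counts as dead once any daily score equals 5; the function names and the example scores are hypothetical and are not taken from the study data.

```python
from statistics import mean

DEAD = 5  # top of the 1 (healthy) to 5 (dead) illness severity scale


def illness_severity_score(daily_scores_per_mouse):
    """Mean of the daily severity scores, pooled over all mice for one isolate.

    Assumption: pooling all mouse-days; the source only says the daily
    scores are averaged for the mice challenged with a given isolate.
    """
    return mean(s for mouse in daily_scores_per_mouse for s in mouse)


def count_deaths(daily_scores_per_mouse):
    """Number of mice with at least one daily score of 5 (dead)."""
    return sum(1 for mouse in daily_scores_per_mouse if DEAD in mouse)


def is_killer(daily_scores_per_mouse, threshold=0.8):
    """'Killer' status: death of at least 80% of the challenged mice."""
    deaths = count_deaths(daily_scores_per_mouse)
    return deaths / len(daily_scores_per_mouse) >= threshold


def needs_second_batch(first_batch):
    """True if the first five mice gave no consistent result, i.e. neither
    lethality nor survival for at least four of them."""
    deaths = count_deaths(first_batch)
    survivors = len(first_batch) - deaths
    return not (deaths >= 4 or survivors >= 4)


# Hypothetical isolate: five mice, three daily scores each (illustrative only).
example = [[2, 4, 5], [3, 5, 5], [2, 3, 5], [4, 5, 5], [1, 2, 2]]
iss = illness_severity_score(example)
killer = is_killer(example)
```

In this illustrative case four of five mice die, so the isolate meets the ≥80% criterion and would not require a second batch of five mice.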
Categorical variables were summarized as frequencies and percentages and were compared using a chi-square test or Fisher's exact test, as appropriate. The criterion for statistical significance was p < 0.05, with Bonferroni correction for multiple comparisons. To avoid possible bias involving the variables "year of testing" and "country of origin", analyses were repeated after stratification by year and country. Univariable and multivariable regression analyses (simple regression, for ISS; logistic regression, for "killer" status) were used to assess the predictive power of independent variables with and without adjustment for collinearity between them. For multivariable analysis, only those bacterial traits were included that in univariable analyses predicted one or both of the experimental virulence outcomes, whether overall or after stratification by year or country. For use in multivariable modeling, the qualifying univariable predictors were divided into a core set (ExPEC status, belonging to the H30R1 clonal subset, and year of testing) and a supplementary set (VG score, genes pap, kpsMII, K2/K100).
Two methods were used for variable entry into the multivariable models, i.e., forced and stepwise. The forced-entry method was applied first to only the core set of candidate predictor variables, then to the core plus supplementary variable sets combined. The stepwise method was applied to both variable sets combined. Data were analyzed using STATA (Stata Statistical Software: Release 11. College Station, TX: StataCorp LP).
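As an illustration of the categorical comparisons described above, the following Python sketch computes a two-sided Fisher's exact p-value for a 2x2 table (for example, trait present/absent by "killer"/non-"killer") and a Bonferroni-adjusted significance threshold. It is not the authors' STATA code; the function names are hypothetical, the two-sided rule used (summing all tables no more probable than the observed one) is one common convention, and the number of comparisons passed to the Bonferroni function is an assumption left to the analyst.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical counts (illustrative only, not study data):
# rows = trait present / absent, columns = "killer" / non-"killer".
p = fisher_exact_two_sided(3, 1, 1, 3)
significant = p < bonferroni_threshold(0.05, 10)
```

The exact test is the natural choice when expected cell counts are small, as they are after stratification by year or country.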
Of the 49 studied VGs and variants, 35 were detected in at least one isolate each. The distribution of VGs differed significantly by clonal subset (Table 2). However, with stratification by year of testing and country of origin, the only statistically significant differences involved kpsMII and ibeA (Supplementary material Tables S1-S4). The mean VG score was 11.8 (SD 1.9) overall but was lower among H30R1 isolates (mean 10.2, SD 1.2) than among H30Rx (mean 12.6, SD 1.7) and non-H30 isolates (mean 12.1, SD 1.9) (p < 0.01, H30R1 vs. H30Rx or non-H30). Even with stratification by country of origin and year of testing, these differences remained statistically significant. By contrast, within a given clonal subset, VG scores did not differ significantly by year of testing or country of origin (Supplementary material Table S5).
The 35 detected VGs occurred in 38 distinct combinations (38 VG profiles; Figure 1). Whereas most profiles (66%, 25/38) occurred in a single isolate each, the two most common profiles were repeated 9 and 10 times, respectively. The heatmap based on VG content showed four main clusters (Clusters 1-4) of related VG profiles, which corresponded roughly with virotypes and clonal subsets. Clusters 1 and 2 grouped mainly H30Rx isolates and corresponded mostly with virotypes E and A, respectively. Cluster 3 split into two main subclusters; one grouped all non-H30 isolates (all virotype D3), whereas the other grouped a mix of isolates from different clonal backgrounds (virotype C). Finally, Cluster 4 grouped mainly H30R1 isolates and corresponded mostly with virotype C. These four clusters differed mostly in the presence/absence of pap, kpsMII, specific group 2 capsular variants, hra, afa/dra, ibeA, and traT (Figure 1).

Experimental virulence outcomes vs. bacterial characteristics
In the murine sepsis model, ISS was fairly high overall (median 3.9, on a 1-5 scale), but varied greatly by isolate (IQR 2.2), with approximately half (54%, 44/84) of isolates qualifying as "killers". ISS and "killer" status were significantly associated (median ISS: "killers", 4. By contrast, H30Rx isolates tested in 2014 or from Spain showed significantly higher ISS and were more often "killers" than H30Rx isolates tested pre-2014 or from the USA (Supplementary material Table S8). Neither ESBL production nor FQ resistance was associated significantly with ISS or "killer" status.

Virulence vs. individual VGs (split by year/country)
Overall, of the 49 studied individual VGs, pap, kpsMII, and K2/K100 were associated significantly with ISS and "killer" status; the median ISS and percent "killer" were significantly higher for isolates with vs. without the particular gene (Table 5). However, with stratification by year or country, many of these comparisons lost statistical significance or differed inconsistently by subset (not shown).  Table  S9). By contrast, UPEC status was unsuitable for statistical analysis due to its 98% overall prevalence. Overall, VG score was correlated only weakly with ISS (rho = 0.29, p = 0.008), and was slightly higher among "killer" isolates (median score: 12 ["killers"], vs. 11 [others], p = 0.03). With stratification by year of testing, VG score was not associated with either virulence endpoint in either subgroup. With stratification by country, the correlation of VG score with ISS was only marginally statistically significant among isolates from the USA (rho = 0.37, p = 0.03) and was not significant among isolates from Spain (rho = 0.18, p = 0.23). By contrast, VG score was not associated with "killer" status in either subgroup (data not shown).
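The rank correlations (rho) of VG score with ISS reported above are Spearman-type coefficients, which can be sketched in plain Python as the Pearson correlation of average ranks. This is an illustrative reimplementation under the assumption that ties receive average ranks; the function names and the toy values are hypothetical, no p-value is computed, and the code does not reproduce the study's results.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values (illustrative only): VG scores and ISS for four isolates.
vg_scores = [10, 12, 12, 14]
iss_values = [2.0, 4.5, 3.0, 5.0]
rho = spearman_rho(vg_scores, iss_values)
```

A rank-based coefficient suits these data because VG score is a count and ISS is bounded between 1 and 5, so neither is expected to be normally distributed.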
Virotype was not associated with experimental virulence (Table 4). Due to already-small group sizes, these analyses were not stratified by year and country. Finally, aggregate VG profiles (n = 38) grouped isolates with very different experimental virulence (Figure 1), without statistically significant virulence differences between profiles (p = 0.16). Additionally, the exploration of diverse combinations of VGs, as selected based on inspection of the heatmap, identified none other than ExPEC status that significantly predicted experimental virulence (not shown).
Year of testing correlated roughly with country of origin (rho = 0.84, p < 0.001), so can be considered a surrogate for that trait.

Multivariable analysis
Given the multiple significant univariable predictors of virulence, and these variables' associations with one another, multivariable analysis (with both forced entry and forward stepwise entry) was used to clarify primary associations and to allow adjustment for year of testing, which served as a proxy for the country of origin. Candidate predictor variables included a core set (H30R1, ExPEC status, year 2014) and a supplemental set based on VGs (pap, kpsMII, K2, VG score). For predicting ISS, the initial forced-entry multiple regression model, with candidate predictors ExPEC, H30R1, and year 2014, identified as significant predictors both ExPEC (beta 1.4, p < 0.001) and year 2014 (beta 0.61, p = 0.02); H30R1 lost significance (beta 0.22, p = 0.50). The extended forced-entry model, which included the four supplemental variables as additional candidate predictors, identified ExPEC as the only significant multivariable predictor (beta 1.72, p = 0.001). The stepwise model, in which all seven variables were included as candidate predictors, yielded substantially similar results: the only significant predictors retained in the final model were ExPEC status (beta 1.2, p < 0.001) and, with lower predictive power and marginal statistical significance, year 2014 (beta 0.57, p = 0.02).
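The forced-entry approach, in which every candidate predictor is kept in the model regardless of its significance, can be illustrated with a minimal ordinary least squares sketch in plain Python. This is not the authors' STATA analysis: the function names are hypothetical, the data are a noise-free toy design built from invented coefficients (not the reported betas), and the sketch returns point estimates only, without standard errors or p-values.

```python
from itertools import product


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_forced_entry(predictor_rows, outcome):
    """Least squares fit with an intercept and all supplied predictors.

    Returns [intercept, beta_1, ..., beta_k] from the normal equations
    (X'X) beta = X'y.
    """
    design = [[1.0] + [float(v) for v in row] for row in predictor_rows]
    p = len(design[0])
    xtx = [[sum(r[j] * r[k] for r in design) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * y for r, y in zip(design, outcome)) for j in range(p)]
    return solve_linear(xtx, xty)


# Toy design (illustrative only): every combination of three binary
# predictors, in the order (ExPEC, H30R1, year 2014).
rows = list(product([0, 1], repeat=3))
# Invented outcome: intercept 2.0, ExPEC effect 1.5, no H30R1 effect,
# year effect 0.5. These numbers are assumptions, not study estimates.
iss = [2.0 + 1.5 * expec + 0.0 * h30r1 + 0.5 * year for expec, h30r1, year in rows]
betas = fit_forced_entry(rows, iss)
```

With noise-free toy data the fit recovers the invented coefficients, including a zero coefficient for the predictor that has no effect; a stepwise procedure would instead add or drop predictors according to a significance criterion.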

Discussion
In this study we analyzed a large collection of well-characterized E. coli ST131 isolates for associations between experimental virulence, as assessed in a murine sepsis model, and diverse bacterial traits, including clonal subsets, resistance markers, and virulence genotype. For this, we analyzed virulence genotype in multiple ways, including as both individual VGs and various combinations of VGs, i.e., molecular ExPEC and UPEC status, virotype, aggregate VG profile, and VG score.
Despite all strains being ST131, their experimental virulence in the murine sepsis model varied widely, both overall and within most subsets of the population, as defined based on diverse bacterial characteristics (e.g., clonal subsets, virotypes or VG score). This provided an opportunity to search for bacterial traits that correspond with experimental virulence.
According to the univariable analyses, four types of bacterial traits (i.e., specific clonal backgrounds, individual VGs, VG combinations, and country of origin/testing in 2014) were significantly associated with ISS and "killer" status. To summarize: first, of the studied clonal subsets, H30R1 was associated negatively with both virulence endpoints, whereas a subset of H30Rx isolates (those from Spain or tested in 2014) was associated positively with these endpoints. Second, among the 49 studied VGs, pap, kpsMII and K2/K100 were associated positively with one or both virulence endpoints, although these associations varied by year and/or country. Third, of the studied VG combinations, only molecularly defined ExPEC status (robustly) and VG score (marginally) were associated (both positively) with the virulence endpoints. Lastly, experimental virulence also varied overall in relation to year of testing (a surrogate for country of origin), probably due in part to country-specific differences in clonal subset distribution. Indeed, with stratification by clonal subset, only H30Rx isolates exhibited this by-year (i.e., by-country) difference in experimental virulence.
Previously reported results for a subset of the present isolates showed highly variable experimental virulence, with a trend toward lower virulence for H30R1 isolates [13,14]. Other studies of experimental virulence and ST131 likewise have yielded inconsistent results, not only across different animal models (mice, zebrafish, C. elegans, and G. mellonella) but even within a given model. For example, although initial results using the murine sepsis model suggested that ST131 was highly virulent [27], subsequent studies showed marked virulence variability [13,14,28]. Our results confirm this overall variability, notwithstanding a relatively high average virulence level. Additionally, with our comparatively large strain set, we were able to document in a univariable analysis significantly lower virulence for H30R1 isolates and, among isolates from Spain, higher virulence for H30Rx isolates.
Notably, certain individual VGs (pap, kpsMII, and K2/K100) were associated with both virulence and the H30Rx subset (vs. the H30R1 subset), especially among isolates from Spain. This is consistent with (and may partially explain) the greater observed virulence of Spanish H30Rx isolates, especially because virulence associations were stronger for VGs than for clonal subsets. Indeed, previous studies support a possible direct virulence contribution from the K2 capsule in non-ST131 clonal backgrounds [12,29,30]. These findings suggest that VG content and/or combinations of VGs (i.e., ExPEC molecular definition and VG score) could predict, and may determine, experimental virulence.
By contrast with the univariable analysis, the results of the multivariable analysis showed molecular ExPEC status as the strongest and only consistently significant predictor of experimental virulence, followed distantly by year of testing (which is a surrogate for the country of origin) and K2/K100. This finding may explain the initial observation of differences in experimental virulence across clonal subsets, which also differ in their molecularly defined ExPEC fraction (i.e., higher in H30Rx and lower in H30R1).
With the multivariable adjustment, whereas the individual VGs that were univariable predictors of experimental virulence lost statistical significance, ExPEC status remained a significant predictor. This may be explained by ExPEC status accounting for the influence of the individual VGs on virulence outcomes because the molecular definition of ExPEC includes those genes [14]. The very wide confidence interval around the OR for ExPEC suggests a need for studies involving more isolates, ideally of different countries of origin to test geographical impact.
By contrast with molecularly defined ExPEC status, neither virotype nor aggregate VG profiles significantly predicted virulence, which conflicts with previous findings [8]. Such inconsistencies across studies indicate that the described ST131 virotypes or even more extensive VG profiles (as shown here), although associated with clonal background [12,31,32], are insufficient to reliably predict experimental virulence. Although some of our negative findings may reflect in part the small numbers per subgroup after stratification by virotype, extended VG profiles, year of testing, and/or country of origin, even our overall analysis failed to replicate certain associations noted in previous studies, despite our greater total number of isolates.
Our study has some limitations. First, the murine sepsis model mimics only partially the pathogenesis of sepsis in humans, despite being standard in the field; outbred mice may vary in their response to infection, possibly contributing to experimental variability, although also presumably improving generalizability; and the use of only female mice conceivably could bias the results.
Second, our isolates were tested at different times, although temporal effects were addressed analytically by stratification by year of testing, and seem unlikely given the temporal stability of results for the control strains and for clonal subsets other than H30Rx. Third, our virulence genotyping relied on DNA detection for a limited set of VGs. Conceivably, expression/regulation of these or other (unaddressed) VGs linked to these may underlie the observed associations between VG presence and experimental virulence, and/or account for the residual unexplained virulence variation.
Fourth, the molecular definitions used for ExPEC and UPEC refer to the presence/absence of specific VGs, so do not track reliably with the source of isolation. However, they predict biological ExPEC status, defined as a strain's intrinsic ability to cause extraintestinal infection, more accurately than does clinical source or presentation. Fifth, due to VG genotype variability within E. coli, strains that qualify as molecular UPEC do not necessarily qualify as ExPEC and vice versa, regardless of source; here, UPEC status was too prevalent for valid statistical analysis. Sixth, for some comparisons, statistical power was reduced by stratification by country and year of testing, and by the low prevalence of certain variables.
The study also has notable strengths. These include the large number of strains tested (making it, to our knowledge, the largest study to date of experimental virulence of ST131 strains in any animal model); the use of an established sepsis model that among those available most closely mimics human disease; attention to an extensive range of bacterial traits, including single and combined VGs and different ST131 subsets; and use of multiple statistical approaches, including multivariable modeling.
In conclusion, we found considerable variability in experimental virulence between and within the different ST131 clonal subsets, which differed significantly for VG content, ExPEC status, and virotype. With the multivariable adjustment, ExPEC status was the only consistently significant outcome predictor. Thus, composite markers such as ExPEC status are useful for identifying potentially virulent ST131 strains. These findings may help in devising screening tests and identifying targets for therapeutic or preventive interventions against infections caused by ST131 strains. They also indicate a need to study further the virulence determinants of ST131 (including possibly with in vitro assays such as serum resistance or survival in phagocytes), and to identify an explanation other than sepsis-causing ability for ST131's impressive epidemiologic success.