Genetic diversity and evolution of the virulence plasmids encoding aerobactin and salmochelin in Klebsiella pneumoniae

ABSTRACT Virulence plasmids of hypervirulent Klebsiella pneumoniae (hvKp) have the potential to transfer to drug-resistant strains or integrate with other plasmids, facilitating the genome evolution of threatening pathogens. We conducted an in-depth analysis of the publicly available 156 complete genome sequences of hvKp together with a multi-region clinical cohort of 171 hvKp strains from China to provide evidence for the virulence plasmid evolution. Virulence plasmids were frequently detected in the ST23 and ST11 K. pneumoniae strains. Multidrug-resistant hvKp (MDR-hvKp) occupied a large proportion of hvKp, and the coexistence of virulence and resistance plasmids may be the major cause. Virulence plasmids commonly possessed multiple replicons, of which IncFIBK was the most prevalent (84.6%). We identified 49 IncFIBK alleles among 583 IncFIBK plasmids, and they could be divided into Clades I, II, and III. We further observed that conjugative and non-conjugative virulence plasmids could be distinguished by IncFIBK genetic diversity, and IncFIBK subtyping could also indirectly indicate a chimeric preference of conjugative virulence plasmids. On this basis, we developed an open-access web tool called KpVR for IncFIBK subtyping. In conclusion, the genetic diversity of IncFIBK virulence plasmids could be used for tracking the evolution of virulence plasmids, and further preventing the emergence of MDR-hvKp strains.


Introduction
Two major pathotypes of Klebsiella pneumoniae pose serious hazards to public health: hypervirulent K. pneumoniae (hvKp) causes severe invasive infections while classic K. pneumoniae (cKp) are mostly hospital-acquired and multidrug-resistant (MDR) [1]. The mobilizable plasmids are the widely acknowledged cause of the dissemination of the resistance genes among cKp [2,3]. Multiple factors are required for hypervirulence of hvKp, such as iron acquisition and capsular polysaccharide [4]. The critical virulence factors with experimental support for conferring the hypervirulent phenotype are encoded by genes present on the virulence plasmids, which include iuc (coding for aerobactin), peg344 (a metabolic transporter), and rmpA and rmpA2 (regulators of capsule production) [5]. Virulence plasmid acquisition may be an important mechanism for the increased virulence of hvKp. Plasmids carrying resistance or virulence genes have the potential to be transferred between bacterial strains, conferring resistance and virulence.
However, patients infected with strains that display both hypervirulence and multidrug resistance have unexpectedly poor survival. Indeed, Gu et al. reported that carbapenem-resistant ST11 K. pneumoniae strains showed increased virulence after acquiring a pK2044-like plasmid, resulting in infections that were extremely difficult to treat and had high mortality [6]. Nevertheless, current thinking considers pK2044-like virulence plasmids, which lack conjugation and mobilization modules, to be nontransmissible [7]. How classical pK2044-like virulence plasmids achieve mobility remains to be determined. The escalating threat to global health posed by hypervirulent K. pneumoniae has warranted urgent investigation, and recent research has uncovered some important findings. Clonal Group 23 (CG23) has been described as having a strong association with hvKp strains causing primary liver abscess [8,9]. Efficient iron acquisition is required for the survival of hvKp in human ascites [10]. Thus, the ability of hvKp to produce more iron acquisition factors could enhance its virulence. Several molecular epidemiologic studies showed that the aerobactin and salmochelin siderophore genes iuc and iro encoded on plasmids are hvKp specific and could be useful as clinical biomarkers for hypervirulent K. pneumoniae strains [4,11]. Further, Russo et al. suggested that iuc and/or either rmpA or rmpA2 would be the best biomarkers. In addition, based on genetic diversity, Lam and Wyres et al. developed a framework for identifying and tracking key virulence loci encoding aerobactin and salmochelin [12]. The authors further noted that the most common pK2044-like classical virulence plasmid (KPVP-1) almost always harbored an IncFIB K replicon. In fact, IncFIB K plasmids are widespread in Klebsiella strains, and most of them are found to carry resistance genes [13,14].
The genetic diversity of IncFIB K replicons could help us understand the differences between the IncFIB K virulence plasmids and the IncFIB K resistance plasmids. Other types of virulence plasmids were also reported, such as IncFIB Mar [15]. However, comprehensive analysis of virulence plasmids is still lacking.
The virulence plasmid acquired by ST11 carbapenem-resistant K. pneumoniae was a classical non-conjugative pK2044-like virulence plasmid. Self-transmissible virulence plasmids have also appeared. For example, Yang et al. showed that a conjugative virulence plasmid from a K. variicola strain could be transferred to carbapenem-resistant K. pneumoniae [16], indicating the rapid evolution of virulence plasmids. Therefore, it has become increasingly urgent to understand the genetic diversity of conjugative and non-conjugative virulence plasmids and to quickly identify conjugative virulence plasmids in clinical isolates.
In this study, we observe that virulence plasmids are found in a wide range of host bacteria, not limited to CG23 strains, and that MDR-hvKp accounts for a considerable proportion of hvKp. The IncFIB K replicon is dominant in virulence plasmids, and the genetic diversity of this replicon can distinguish conjugative from non-conjugative virulence plasmids. We also describe a new open-access online tool, KpVR, for IncFIB K subtyping. Our results might deepen our understanding of the evolution of virulence plasmids of K. pneumoniae and provide insights for further research into hypervirulent K. pneumoniae.

GenBank data and clinical isolates
Completely sequenced virulence plasmids with lengths more than 50 kb were extracted from GenBank by searching iuc, iro, rmpA, and rmpA2 in July 2020 (https://www.ncbi.nlm.nih.gov/nuccore). Complete chromosome sequences were obtained for some of them (Datasets S1 and S2). We focused on the two most common specific virulence determinants for iron acquisition, iuc and iro, encoded on the plasmids of hvKp. Other virulence genes were not analyzed in this study.
According to previous studies, we used iuc and/or iro, rmpA, and rmpA2 to screen for clinical hvKp strains [4]. K. pneumoniae clinical isolates were collected in nine hospitals from seven Chinese provinces from January 2017 to February 2018. A total of 530 isolates were collected from Huashan Hospital

Core single nucleotide polymorphisms (SNPs) extraction and phylogenetic analysis
Chromosomal MLST was performed using the Institut Pasteur MLST database online (https://bigsdb.pasteur.fr/klebsiella/). To resolve the evolutionary relationships among the closely related ST11 strains, their genomes were used to construct a phylogenetic tree based on genome-wide SNPs. SNPs situated in the core genome region were automatically called and aligned by submitting the ST11 genomes to the web-based program 'CSI Phylogeny 1.4' with the prune selection disabled, using the NJST258_1 complete genome as a reference [17]. The alignments exported from CSI Phylogeny were then used to construct maximum-likelihood trees with 1000 bootstrap replicates using MEGA v7.0.26 software [18].
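To illustrate what is meant by core SNPs, the following minimal Python sketch takes sequences that are already aligned and keeps only the columns where every genome has an unambiguous base and at least two different bases occur. It is not the CSI Phylogeny pipeline, which maps each genome to the reference and applies its own quality filters; the function name and the toy sequences are hypothetical.

```python
def core_snp_columns(alignment):
    """Return (positions, per-sample SNP strings) for aligned sequences.

    alignment: dict mapping sample name -> aligned sequence (equal lengths).
    A column counts as a core SNP if every sample has A/C/G/T there
    and at least two distinct bases are observed.
    """
    names = list(alignment)
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    length = lengths.pop()

    positions = []
    for i in range(length):
        column = {alignment[n][i].upper() for n in names}
        # Skip columns with gaps or ambiguous bases, and invariant columns.
        if column <= set("ACGT") and len(column) > 1:
            positions.append(i)

    snps = {n: "".join(alignment[n][i].upper() for i in positions) for n in names}
    return positions, snps


# Toy data (hypothetical, not from the study)
toy = {
    "ref":  "ACGTACGTAC",
    "iso1": "ACGTACGAAC",
    "iso2": "ACCTAC-TAC",
}
positions, snps = core_snp_columns(toy)
```

In the toy data, the column containing a gap is excluded even though it varies, which is the sense in which only positions shared by all genomes contribute to the tree.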

Phylogenetic analysis of virulence plasmids
Twenty-five plasmids were selected according to the replicons and MLSTs of the host bacteria. The phylogenetic tree was constructed according to the replicon sequences (IncFIB and IncFIA) of the reference plasmids using the maximum-likelihood method with 1000 bootstrap replicates in MEGA7. Visualization, annotation, and management of the phylogenetic tree were performed by Evolview v3 (https://www.evolgenius.info/evolview).

Subtyping IncFIB K plasmids in Klebsiella bacteria
To obtain a comprehensive overview of IncFIB K plasmids, the complete sequences of IncFIB K plasmids of Klebsiella from GenBank were selected by BLASTn using the reference replicon "IncFIB K _1_JN233704" in the PlasmidFinder database (≥90% identity and ≥90% coverage) (Dataset S4). IncFIB K alleles were determined by nucleotide sequence identity using "IncFIB K _1_JN233704" as a reference, and any replicon sequence with single nucleotide variations or deletions after multiple-sequence alignment by ClustalW was defined as a new allele (Figure S1). We developed a user-friendly web-based tool, named KpVR (https://db-mml.sjtu.edu.cn/KpVR/), as a public resource for detecting replicons, T4SS gene clusters, virulence genes, and antibiotic resistance genes in K. pneumoniae plasmids. First, we developed a backend data set, repDB, based on our IncFIB K subtypes. Next, the online tool KpVR performs rapid similarity searches of a query plasmid sequence against repDB based on nucleotide sequence identity and coverage. Then, iuc and iro lineages based on the AbST and SmST typing schemes are also assigned by KpVR [12]. Finally, KpVR outputs a simple list and generates a graphic overview of the predicted replicon types together with the putative virulence modules or acquired antibiotic resistance genes. All IncFIB K virulence plasmids in this study were submitted to KpVR to obtain the corresponding alleles.
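The allele definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the KpVR or repDB implementation: the function names are hypothetical, the sequences are toy strings, and numbering alleles by order of first appearance (with the reference as allele 1) is an assumption of the sketch.

```python
def passes_thresholds(identity_pct, coverage_pct,
                      min_identity=90.0, min_coverage=90.0):
    """Apply the >=90% identity and >=90% coverage cutoffs stated in the text."""
    return identity_pct >= min_identity and coverage_pct >= min_coverage


def assign_alleles(reference, replicons):
    """Give each distinct replicon sequence an allele number.

    reference: reference replicon sequence (allele 1 in this sketch).
    replicons: list of (plasmid_name, sequence) pairs in processing order.
    Any sequence that differs from every known allele becomes a new allele.
    """
    allele_of_sequence = {reference.upper(): 1}
    calls = {}
    for name, seq in replicons:
        key = seq.upper()
        if key not in allele_of_sequence:
            allele_of_sequence[key] = len(allele_of_sequence) + 1
        calls[name] = allele_of_sequence[key]
    return calls


# Toy sequences (hypothetical, far shorter than a real replicon)
reference = "ATGCATGC"
replicons = [
    ("pA", "ATGCATGC"),   # identical to the reference
    ("pB", "ATGCATGA"),   # single nucleotide variation
    ("pC", "atgcatga"),   # same as pB, different case
    ("pD", "ATGCAGC"),    # deletion
]
calls = assign_alleles(reference, replicons)
```

Exact-match lookup is enough here because the text defines an allele by any single difference; a real workflow would first trim each hit to the aligned replicon region so that flanking sequence does not create spurious alleles.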
IncFIB K subtyping primers for clinical isolates were designed in conserved regions at least 50 bp away from either end of the "IncFIB K _1_JN233704" sequence. The sequences of the preselected primer regions of IncFIB K plasmids were extracted and then visualized using WebLogo to determine the primer sequences (Figure S2). Forward primer: GCCTTRATGACTTCGTCATA; reverse primer: CRGACGTTAAGATCACCGG. The positive PCR products were sequenced and then submitted to KpVR to determine the IncFIB K alleles.
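Both primers contain the IUPAC degenerate base R (A or G), so each is in practice a mixture of two oligonucleotides. The short Python sketch below expands a degenerate primer into its concrete sequences; the function name is hypothetical, and the code is offered only as a convenience for checking primers against a target sequence.

```python
from itertools import product

# Standard IUPAC nucleotide codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


forward = expand_primer("GCCTTRATGACTTCGTCATA")
reverse = expand_primer("CRGACGTTAAGATCACCGG")
```

Each primer has a single degenerate position, so `forward` and `reverse` each contain two sequences.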

Comparative analysis of conjugative virulence plasmid genomes
Four conjugal modules in the plasmid sequences were detected by using oriTfinder [22], including the origin of transfer site (oriT), relaxase gene, type IV coupling protein (T4CP) gene, and the type IV secretion system gene cluster (T4SS). BLAST Ring Image Generator (BRIG) was used to compare conjugative virulence plasmids with other similar plasmids to further generate circular plasmid maps.
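As a minimal sketch of how the four modules could be turned into a plasmid-level call, the Python function below labels a plasmid as predicted conjugative only when oriT, a relaxase gene, a T4CP gene, and a T4SS gene cluster are all detected. This decision rule is an assumption made for illustration; the text does not state the exact criterion that was applied, and the function does not parse oriTfinder output.

```python
CONJUGAL_MODULES = ("oriT", "relaxase", "T4CP", "T4SS")


def classify_plasmid(detected):
    """Return (is_conjugative, missing_modules) for one plasmid.

    detected: iterable of module names found on the plasmid.
    Assumed rule: all four modules must be present for a conjugative call.
    """
    found = set(detected)
    missing = [m for m in CONJUGAL_MODULES if m not in found]
    return (not missing), missing


# Hypothetical detection results
complete = classify_plasmid(["oriT", "relaxase", "T4CP", "T4SS"])
partial = classify_plasmid(["oriT", "relaxase"])
```

Returning the list of missing modules alongside the call makes it easy to see which plasmids are one module short, for example those that carry oriT but lack a complete T4SS.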

General characteristics of hypervirulent strains and virulence plasmids of K. pneumoniae
We identified 156 fully sequenced virulence plasmids with various lengths (70 to 479 kb; median, 213 kb) and GC contents (46.4% to 53.1%; median, 49.9%) (Figure S3), and the complete chromosomes of 117 of their host strains were then extracted for analysis (Datasets S1 and S2, Figure S1). The strains were isolated from 14 countries and regions between 2006 and 2019 (Figure S4A). More than 30 sequence types (STs) were identified from the 117 K. pneumoniae chromosomes, among which ST11 and ST23 accounted for 29.9% and 16.2%, respectively (Figure 1A). Given the high proportion and concentrated distribution of ST11 strains, we conducted geographical distribution and core SNP phylogenetic tree analyses of these strains to exclude the possibility of clonal dissemination (Figure S4B and Figure S4C). Besides the GenBank data, we also identified 171 hvKp strains from 573 non-duplicated K. pneumoniae clinical isolates to further investigate ST distribution trends. The overall distribution of STs detected in clinical hvKp isolates was similar to the global data (Figure 1B). ST11 still accounted for a large proportion of clinical hvKp isolates (12.3%; 21/171), second only to ST23 (26.3%; 45/171).

The emergence of MDR-hvKp strains
The dual phenotype of virulence and resistance may occur when the virulence plasmids carry resistance genes or exist in the same strain with the MDR plasmids. We observed that 90 of 117 hvKp strains were identified as MDR-hvKp, and 23 of them had resistance genes in their virulence plasmids (Figure S5, Dataset S3). The coexistence of virulence and resistance plasmids in the same strain may be the major cause of MDR-hvKp.
We further observed that the same plasmid type could be found in K. pneumoniae strains of various STs. For example, pK2044-like IncFIB K/IncHI1B virulence plasmids were not restricted to ST23 but also occurred in ST11, ST48, and other strains, indicating that the virulence plasmids may have been transferred (Figure 1F). Another important finding was that an IncFIB replicon (95.5%; 149/156), including IncFIB K, IncFIB Mar, and IncFIB pQil, was present in nearly every virulence plasmid (Figure 1F), suggesting that the IncFIB replicon likely plays a crucial role in the formation and replication of the virulence plasmid.

The replicons of IncFIB K virulence plasmids
Considering the widespread distribution of the IncFIB K replicon in Klebsiella, we collected 583 fully sequenced IncFIB K plasmids restricted to Klebsiella from the GenBank database to explore the genetic divergence between virulence and non-virulence IncFIB K plasmids (Dataset S4). We identified 49 IncFIB K alleles by nucleotide identity, and single nucleotide variations accounted for most of the variations in these alleles (Figure S1).
Phylogenetic tree analysis showed that the 49 IncFIB K replicon alleles were grouped into three clades (Clade I, Clade II, and Clade III) (Figure 2A). Of these, IncFIB K 1, IncFIB K 10, and IncFIB K 37 were the three most common. A total of 12 IncFIB K alleles were identified in 132 IncFIB K virulence plasmids by KpVR. Unexpectedly, nearly all virulence plasmids were found with IncFIB K 37, a branch in Clade III, while the general non-virulence IncFIB K plasmids belonged to Clade I (Figure 2A, Figure 2B). Plasmids in Clade I often carried IncFII replicons simultaneously (Figure 2A).
To verify the observation in the GenBank data that IncFIB K replicons in Clade III, especially IncFIB K 37, were highly associated with virulence plasmids, we identified IncFIB K alleles in 171 hvKp clinical isolates using the web tool KpVR. Indeed, 91.8% (157/171) of clinical hvKp isolates were found to harbor IncFIB K replicons, and 11 IncFIB K alleles were identified in IncFIB K-positive clinical hvKp isolates. Of these, the IncFIB K 37 allele accounted for the highest proportion (84.1%; 132/157) (Figure 2C). It is important to note that a small number of IncFIB K virulence plasmids belonging to Clade I were found in both GenBank data and clinical isolates (Figure 2B, Figure 2C).
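The two proportions reported for the clinical isolates use different denominators: all 171 hvKp isolates for IncFIB K positivity, and the 157 IncFIB K-positive isolates for the share of the IncFIB K 37 allele. The small Python helper below (a hypothetical convenience function, not part of the study's workflow) reproduces both values from the stated counts.

```python
def percent(numerator, denominator, digits=1):
    """Return numerator/denominator as a percentage rounded to `digits` places."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return round(100 * numerator / denominator, digits)


# Counts taken from the paragraph above
incfibk_positive = percent(157, 171)   # IncFIB K-positive among clinical hvKp isolates
allele37_share = percent(132, 157)     # IncFIB K 37 among IncFIB K-positive isolates
```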

Genetic diversity of conjugative and non-conjugative IncFIB K virulence plasmids
Virulence plasmids in Clade I generally carried IncFII replicons while those in Clade III carried IncHI1B replicons, and all IncFIB K/IncFII virulence plasmids in Clade I had the transfer region (Figure 3A). Therefore, almost all Clade I virulence plasmids were conjugative, while virulence plasmids in Clade III were non-conjugative. In addition, we identified 27 conjugative virulence plasmids, all but one of which were concentrated in Clade I, providing indirect evidence for significant genetic divergence between conjugative and non-conjugative virulence plasmids (Table S1). Collectively, these findings indicate that classical non-conjugative virulence plasmids and conjugative virulence plasmids belonged to Clade III and Clade I, respectively. The IncFIB K allele therefore discriminates, to a large extent, between conjugative and non-conjugative virulence plasmids. Some conjugative virulence plasmids also carried resistance genes (Figure 3A).
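The association described in this paragraph can be written as a simple screening heuristic. The Python sketch below is illustrative only: the function name, the replicon labels, and the exact rule are assumptions, and because at least one Clade III plasmid was conjugative, such a prediction would still need to be checked against the transfer modules themselves.

```python
def predict_mobility(clade, replicons):
    """Heuristic mobility call from the IncFIB K clade and co-resident replicons.

    clade: "I", "II", or "III" (clade of the plasmid's IncFIB K allele).
    replicons: iterable of replicon names found on the same plasmid.
    """
    replicons = set(replicons)
    if clade == "I" and "IncFII" in replicons:
        return "likely conjugative"
    if clade == "III" and "IncFII" not in replicons:
        return "likely non-conjugative"
    return "uncertain: check transfer modules directly"


# Hypothetical inputs
clade1_call = predict_mobility("I", ["IncFIBK", "IncFII"])
clade3_call = predict_mobility("III", ["IncFIBK", "IncHI1B"])
other_call = predict_mobility("II", ["IncFIBK"])
```

Returning an explicit "uncertain" label for combinations outside the two common patterns avoids forcing a call for atypical or chimeric plasmids.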
The loci and distribution of iuc and iro in IncFIB K virulence plasmids also differed between Clade I and Clade III. Almost all plasmids in Clade I harbored iuc genes, but few of them carried iro genes. In contrast, plasmids in Clade III tended to carry both iuc and iro genes. In addition, iuc lineages of plasmids in Clade I consisted largely of iuc3, while those in Clade III were mainly iuc1 (Figure 3B).

Chimeric preference of conjugative IncFIB K virulence plasmids
We observed above that except for plasmid pAP855, all conjugative virulence plasmids were concentrated within Clade I. We compared the conjugative plasmids p205880-1 (Clade I) and pAP855 (Clade III) to other related plasmids to explore the differences between them.
Plasmid pAP855 (Clade III) was a very large (357.8 kb) co-integrated plasmid that could be resolved into a pK2044-like plasmid backbone (Clade III) and an IncFII tra-trb conjugative transfer region (Figure 4B). In contrast, the Clade I conjugative plasmid p205880-1 may have evolved through the acquisition of virulence genes by a Clade I IncFIB K/IncFII plasmid (Figure 4A). Further, we found that other conjugative virulence plasmids in Clade I exhibited high similarity with p205880-1. All shared an almost identical aerobactin-related virulence module integrated into a classical non-virulence IncFIB K/IncFII plasmid, with the single exception of plasmid pINF078-VP, which carried 13 copies of iro genes (Figure S6). Together, these data indicate that Clade I conjugative plasmids likely evolved into conjugative virulence plasmids by acquiring an aerobactin-related virulence cluster, whereas pAP855 likely arose from a non-conjugative virulence plasmid that acquired a transfer region.

Discussion
HvKp infections are not only restricted to Asia but occur worldwide [8,23]. Since the report of a fatal outbreak of the carbapenem-resistant hypervirulent K. pneumoniae (CR-hvKp) ST11 strain in a Chinese hospital in 2016, MDR-hvKp infections, especially those caused by CR-hvKp, have been identified in many other countries including Singapore [24], Iran [25], France [26], and Russia [27], providing a global wake-up call for the widespread dissemination of MDR-hvKp strains. The infections caused by MDR-hvKp would exacerbate poor clinical outcomes, highlighting the need to understand the role of virulence plasmids in these "dual risk" hypervirulent superbugs.
Total siderophore production has been shown to strongly correlate with in vivo virulence. Aerobactin and salmochelin are high-affinity siderophores that are always encoded on virulence plasmids [9]. Although many factors may contribute to the virulence of hvKp, virulence plasmids encoding aerobactin appear to be less dispensable. Our study focuses on virulence plasmids encoding aerobactin and/or salmochelin, combining analysis of multi-center Chinese clinical isolates with global GenBank strains. Tracing the host genomes of these plasmids identified a large proportion of ST11 strains, even though ST11 has been reported as the most common type of carbapenem-resistant K. pneumoniae. Further phylogenetic analysis and clinical data confirmed that ST11 was indeed prevalent among hvKp strains, second only to ST23 (Figure S4). One previous study showed that ST11 strains possibly acquired a virulence plasmid and exhibited increased virulence [6]. A striking observation is that the same virulence plasmid could be found in various STs, indicating the likelihood of plasmid transfer between strains. However, virulence plasmids were previously considered to persist for long periods, and their lack of conjugation machinery was thought to favor clonal expansion [28]. The common presence of virulence plasmids in drug-resistant K. pneumoniae therefore highlights the urgency of countering this important global public health threat. In addition, the combination of MDR and virulence could also be caused by a "mosaic plasmid" carrying both antibiotic resistance and hypervirulence-associated features [29]. On the basis of these findings, we comprehensively dissected the composition of different virulence plasmids to look for possible evolutionary clues that could be used to prevent their dissemination.
In this study, IncFIB K virulence plasmids were found to be predominant. According to previous studies, IncFIB K replicons are mostly associated with MDR plasmids in K. pneumoniae [30,31]. A recent study found that plasmids encoding iuc or iro genes usually harbor an IncFIB K replicon [12]. Our study now provides evidence for the important role of the IncFIB K replicon in the formation and replication of virulence plasmids. We used the online tool KpVR to determine the genetic diversity of virulence and non-virulence IncFIB K plasmids together with additional information such as acquired resistance genes and Type IV secretion system (T4SS) gene clusters. The Center for Genomic Epidemiology (CGE) databases can also analyze replicons, virulence genes, and antibiotic resistance genes, but with three separate tools. Beyond this information, KpVR can also predict T4SS gene clusters and perform iuc and iro subtyping. More importantly, KpVR integrates these results with IncFIB K subtyping to explore the close connection between conjugative virulence plasmids and IncFIB K genetic diversity. It may further help predict the mobility of virulence plasmids in clinical hvKp strains. IncFIB K replicon sequences were divided among three clades according to the phylogenetic tree, but virulence plasmids mostly carried IncFIB K 37 (Clade III), a different evolutionary branch from that of the common IncFIB K plasmids, indicating that IncFIB K virulence plasmids evolved independently. The different distribution of iuc and iro supported the different evolution of virulence plasmids in Clade I and Clade III. Though the IncFIB K replicon always coexists with IncHI1B, it is possibly IncFIB K rather than IncHI1B that fulfills the crucial role in supporting the formation and replication of virulence plasmids. A previous study provided supporting evidence that repHI1B lacks a partitioning system and that other compatible replicons are required for replication [32].
Overall, these data provide important clues that help understand the dissemination mechanisms of the hypervirulent and drug-resistant K. pneumoniae as well as demonstrating the utility of the KpVR tool.
Our results confirm that the emergence of MDR-hvKp is largely caused by the coexistence of a virulence plasmid and a resistance plasmid in the same strain, while "mosaic plasmids" account for a small proportion [33]. However, we cannot ignore the serious consequences of virulence plasmids or "mosaic plasmids" acquiring the ability to transfer. IncFII plasmids are highly likely to harbor transfer regions and thereby gain the ability to undertake conjugative transfer [34]. We found that all IncFIB K/IncFII virulence plasmids were conjugative and therefore had the potential to promote the emergence of MDR-hvKp. Conjugative virulence plasmids have also been reported in other species such as Escherichia coli and Salmonella enterica [35,36]. Indeed, the virulence loci encoding aerobactin and salmochelin in K. pneumoniae share high homology with those that occur in chromosomes or plasmids of other species [15,35,36]. It seems reasonable to assume that these virulence factors could be transferred via mobile genetic elements, such as conjugative plasmids, giving rise to novel hypervirulent strains. One recent study demonstrated that a virulence plasmid with a type IV secretion system in K. variicola could be conjugated to a carbapenem-resistant K. pneumoniae and E. coli C600 [16]. Therefore, it is crucial to distinguish between conjugative and non-conjugative virulence plasmids. To this end, we established that IncFIB K subtyping could serve as an effective approach to recognize most conjugative virulence plasmids. This approach could reduce the need for costly de novo whole-genome sequencing and enable quick identification of emerging threats to public health.
Conjugative virulence plasmids are thought to possess two enabling characteristics: conjugative machinery and virulence-related genes. Conjugative plasmids and virulence plasmids each possess one of these properties. We found that almost all conjugative virulence plasmids were formed by the insertion of an aerobactin cluster into an IncFIB K/IncFII conjugative plasmid (Clade I). Plasmid pAP855 (Clade III) was possibly formed by a tra-trb conjugative element integrating into a classical pK2044-like virulence plasmid. Thus, IncFIB K subtyping could indirectly reveal the chimeric preference of conjugative virulence plasmids. Yang et al. have speculated that homologous recombination events possibly occurred when two plasmid regions were integrated into a mosaic plasmid [37]. We did find some homologous regions in several conjugative virulence plasmids, but the exact mosaic mechanism remains to be confirmed. This study has some limitations. We focused on the dominant IncFIB K virulence plasmids to track the evolution of conjugative and non-conjugative virulence plasmids by IncFIB K subtyping, while other virulence plasmids were not analyzed because of their small number. The potential bias of GenBank data was unavoidable, although we supplemented it with multi-center clinical isolates. More genome sequences are needed to validate our findings.
In conclusion, we observe that ST11 hvKp accounts for a high proportion of hvKp strains, second to ST23, and that the coexistence of virulence and resistance plasmids is the major cause of the emergence of MDR-hvKp. IncFIB K virulence plasmids are the most frequently detected and show an evolutionary path distinct from that of other IncFIB K plasmids. We further present a web tool, KpVR, for IncFIB K subtyping to distinguish conjugative virulence plasmids from classical non-conjugative pK2044-like virulence plasmids. This work might provide novel insights into the evolution of virulence plasmids and will support efforts to understand and limit the emergence of infections caused by MDR-hvKp strains.