Mutation accumulation and horizontal gene transfer in Escherichia coli colonizing the gut of old mice

ABSTRACT The ecology and environment of the microbes that inhabit the mammalian intestine undergo several changes as the host ages. Here, we ask if the selection pressure experienced by a new strain colonizing the aging gut differs from that in the gut of young adults. Using experimental evolution in mice after a short antibiotic treatment, as a model for a common clinical situation, we show that a new colonizing E. coli strain rapidly adapts to the aging gut via both mutation accumulation and bacteriophage-mediated horizontal gene transfer (HGT). The pattern of evolution of E. coli in aging mice is characterized by a larger number of transposable element insertions and intergenic mutations compared to that in young mice, which is consistent with the gut of aging hosts harboring a stressful and iron-limiting environment.


Introduction
The passage of time in the life of organisms is accompanied by a myriad of changes that ultimately culminate in death. In his extraordinary comparative physiology study "The prolongation of life: Optimistic studies" [1], Metchnikoff postulated that the intestine is the organ to blame for short lifespans; with time it is increasingly colonized by harmful bacteria instead of beneficial ones. Today's sequencing technologies, which permit the cataloging of many bacterial species in the intestines of hosts, indeed reveal significant compositional changes of gut bacterial species as the host clock ticks and its intestine ages [2]. However, another clock accompanies the ecological changes in the gut: the evolutionary time taken by each bacterial species to produce new strains [3]. Recently, we observed that under continuous antibiotic treatment, the common commensal Escherichia coli evolves differently in the gut of old mice relative to genetically similar but younger mice: shifts from metabolic adaptations to mutations targeting stress-related functions [4] were observed. As continuous antibiotic treatment can strongly affect the gut microbial ecosystem as well as host physiology [5], we now focus on the differences in evolutionary responses of this important human commensal after treatment is stopped in old and young mice.

Results and discussion
We colonized a cohort of old mice (18.5 months, n = 9) after a short treatment with streptomycin [6], with an invader E. coli strain, labeled with two neutral markers coding for either YFP or CFP fluorescence. These markers allow rapid detection of evolutionary adaptation to the gut through the emergence and spread of new mutations within the focal newly colonizing strains. The short treatment with streptomycin is able to break colonization resistance, allowing the invader E. coli to colonize the mouse gut under a complex gut microbiota, which may harbor a resident E. coli strain [6] and can affect its evolutionary trajectories [7].
To characterize differences in the colonization of the invader E. coli in the gut microbiota of young and old mice, we estimated its total loads. The invader E. coli can stably colonize the majority of old mice with loads similar to those in young mice [6] and is also able to colonize in the presence of a resident E. coli that is native to the mouse microbiota (mice C13 and C15) (Figure 1a), leading to strain co-existence. No differences in the mean abundances of the invader E. coli strain were found between old and young mice (one-way ANOVA with Bonferroni's multiple comparison test for log10(CFU/gram of feces), p > 0.05), both at d 8 and d 27 post-colonization (Figure 1a). We were not able to detect E. coli in old mouse A2 after 8 d of colonization, nor in old mice A2 and C11 after 27 d of colonization (Figure 1a).
To query the speed of evolution during the 27 d of colonization, we measured the changes in the frequency of the neutral markers within the invader strain. We observed that YFP frequency changes rapidly, with increases followed by declines over time (Figure 1b), a pattern that is expected under rapid evolution and intense clonal interference [4].
To determine the pattern of evolution and identify the mutational targets of adaptation to the gut of old mice, we performed whole-genome sequencing of the invader E. coli populations after 27 d of colonization. Old mice A2 and C11 were excluded from this analysis due to the low loads at d 27, as was C13 due to the emergence of streptomycin-resistant bacteria. We find that evolution in the gut of old mice involves accumulation of mutations in frlR, in srlR, in the intergenic region psuK/fruA, and in the intergenic region fimE/fimA, all of which were previously observed when this strain colonized young mice [3,6,8,9] (Figure 2a, Tables 1 and 2). Mutations in these targets are likely adaptive in both age groups as they emerge in multiple mice independently and/or with multiple alleles within the same mouse (e.g. the srlR mutation is detected only in mouse B6 of the old age cohort but with mutations leading to two distinct amino-acid changes). frlR (putative transcriptional regulator of the fructoselysine operon [10]), psuK/fruA (psuK is involved in pyrimidine metabolism [11] and fruA in fructose transport [12]), and srlR (repressor of sorbitol transport and utilization [13]) may all be related to the mouse diet, as they are also observed when E. coli colonizes germ-free mice [7]. FimE is known to regulate fimA, which encodes the major subunit of type 1 fimbriae. Importantly, we find mutations in the evolving invader E. coli that are specific to old mice. Some of these reached considerable frequencies, such as mutations in lrhA, lpoA, ydiM, and ydiP (Figure 2a). Mutations in lrhA were previously observed in old mice undergoing continuous antibiotic treatment and shown to increase E. coli motility [4], suggesting that their selection is independent of the antibiotic. Interestingly, LrhA is also known to regulate fimE [14]. LpoA plays a critical role in the in vivo function of penicillin-binding protein A [15] and ydiM is involved in isoprenol tolerance [16].
A remarkable signal of convergent evolution in the cohort of old mice was detected in the intergenic regions ymgF/ymgD and tdcA/tdcR (Figure 2a). The latter genes are involved in the anaerobic degradation of L-threonine to propionate [17], associated with stress response in Klebsiella pneumoniae [18], and iron limitation in a pathogenic strain of E. coli [19]. We observed IS5 insertions in the intergenic region tdcA/tdcR reaching frequencies as high as 65% in one mouse and above 5% in 4 out of 6 mice (Figure 2a, Table 1), suggesting that this region is under strong selection in this environment.
HGT occurs when this invader E. coli co-exists with the mouse resident E. coli [6]. In addition, environmental stress (e.g. inflammation) is known to influence the rate of HGT events [20,21]. Thus, we tested for this evolutionary mechanism by asking whether clones sampled from mice where coexistence was seen had acquired two known active prophages. Probing for the Nef and KingRac prophages over time in mice C13 and C15 revealed that, similar to the pattern in young mice [6], HGT occurs in the old mice as well (Figure 2b). Interestingly, the acquisition of the Nef prophage by the invader E. coli is detected earlier in the young mice relative to old mice (Figure 2b), suggesting that Nef phage integration occurs at a higher rate in young mice. Nef is known to provide a metabolic advantage for E. coli when colonizing young mice [6]. Since young and old mice differ in the concentration of a variety of gut metabolites [22], a smaller advantage of acquiring Nef could be a possible explanation for what we observe in the old mice. Another possibility is that in a more stressful environment with high inflammation levels [20], as in the gut of old mice, the Nef prophage could be more frequently induced, which leads to bacterial death and thus fewer bacteria carrying the Nef prophage.
As the gut of old mice shows higher levels of inflammation [4] and iron deficiency [23], E. coli evolving in the aging gut could exhibit a higher mutation rate and/or an altered spectrum of emerging mutations [24]. The mean number of mutations detected across the aging mice was not significantly higher than that of the young mice (5.8 ± 0.9 for old and 5.0 ± 1.1 for young mice, two-sided Mann-Whitney U = 17.5, p = 0.6; Figure 2c), suggesting that the rate of mutation accumulation is not significantly increased in the old, at least during the first month of colonization. However, the number of IS insertions that increased in frequency in the old mice was significantly higher than that observed in young mice (1.2 ± 0.2 for old and 0.4 ± 0.4 for young mice, two-sided Mann-Whitney U = 6.0, p = 0.02; Figure 2d), indicating that either the rate of spontaneous IS insertions or their adaptive value is higher in the aging intestine. The number of mutations in intergenic regions was also significantly higher in the aging gut (3.5 ± 0.4 for old and 1.7 ± 0.4 for young mice, two-sided Mann-Whitney U = 5.0, p = 0.02; Figure 2e), while the numbers of synonymous mutations (0.5 ± 0.5 for old and 0.6 ± 0.3 for young mice, two-sided Mann-Whitney U = 17.0, p = 0.6; Figure 2f), coding mutations (1.8 ± 0.3 for old and 2.7 ± 0.8 for young mice, two-sided Mann-Whitney U = 16.5, p = 0.6; Figure 2g) and insertion/deletion mutations (1.3 ± 0.3 for old and 1.7 ± 0.5 for young mice, two-sided Mann-Whitney U = 17.0, p = 0.6; Figure 2h) were not significantly different. Another signature of the pattern of molecular evolution being driven by positive selection is the high ratio of coding to synonymous mutations of 3.67 and 4.75 across the cohort of old and young mice, respectively (assuming 15 generations per day).
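The group comparisons above use the two-sided Mann-Whitney U test as run in GraphPad Prism. As a minimal sketch of how the U statistic itself can be obtained from per-mouse mutation counts, the following pure-Python example ranks the pooled observations (ties receive their average rank) and reports the smaller of the two U values. The function names and the input values in the usage note are illustrative assumptions, not the study's code or data, and no p-value is computed here.

```python
def average_ranks(values):
    """Return 1-based ranks for values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two Mann-Whitney U statistics."""
    pooled = list(group_a) + list(group_b)
    ranks = average_ranks(pooled)
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)
```

For example, with hypothetical counts, `mann_whitney_u([1, 1], [1, 2])` gives 1.0, because the two tied pairs each contribute 0.5. For a p-value, a statistics package such as SciPy (`scipy.stats.mannwhitneyu` with `alternative="two-sided"`) would be a natural choice; exact and asymptotic methods can give slightly different p-values for small samples with ties.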
The targets of evolution in old mice found here are similar to the pattern previously observed when old mice underwent continuous antibiotic treatment and are compatible with environmental stress and iron scarcity [4]. Since iron is essential for both bacterial proliferation and correct functioning of the host immune system, strict regulation of iron availability in the gut is key to preventing and controlling the proliferation of potential bacterial pathogens [25]. With the imbalance of iron content that emerges during aging, in conjunction with less effective immune surveillance [26], the host selective pressure against the emergence of potential pathogens should in principle be reduced. On the other hand, environmental stress (e.g. iron limitation and inflammation) may lead to an increased selective pressure for bacterial functions with pathogenic potential [4]. Interestingly, it is known that E. coli grown under iron limitation shows a mutational profile characterized by an eightfold increase in the rate of IS transpositions [24]. In addition, intergenic mutations are of major importance for within-host adaptation of Pseudomonas aeruginosa in cystic fibrosis patients, whose lungs are characterized by increased inflammation [27]. Therefore, the significant increase of IS transpositions and intergenic mutations in the aging mouse gut may be caused by differences in iron availability and inflammation in the elderly.

Conclusion
We have described the pattern of evolution of an invader commensal E. coli colonizing the gut of old mice in comparison with that observed in young adult mice. We showed that rapid adaptation and bacteriophage-mediated HGT occur in both old and young mice, but a higher number of IS transpositions and intergenic mutations is observed in old mice, consistent with higher stress and iron limitation in the aging gut.

Table 1 legend: The symbol Δ denotes a deletion event and a + symbol represents an insertion of the nucleotide that follows the symbol. IS denotes an insertion sequence element at the indicated position. Highlighted in bold are the target genes that are common to E. coli evolving in both young and old mice.

Table 2 legend: Highlighted in bold are the target genes that are common to E. coli evolving in both young and old mice. See Table 1 legend for further details.

Whole-genome sequencing for mutation detection
Frozen fecal pellets were serially diluted and plated on LB agar plates supplemented with streptomycin (100 μg mL⁻¹). After overnight incubation at 37°C, a mixture of >1000 clones was scraped from the plates and resuspended in 1X PBS. DNA was extracted with phenol-chloroform [28]. The library construction (Pico Nextera) and sequencing were performed at the IGC Genomics facility using an Illumina MiSeq Benchtop Sequencer and an Illumina NextSeq 500. Each sample was paired-end sequenced and standard procedures produced datasets of paired-end 250 bp read pairs. Sequencing adapters were autodetected and removed using fastp [29]. Raw reads were trimmed from both sides, using window sizes of 4 base pairs across which the average base quality had to reach a minimum value of 20 to be retained. Trimmed reads were retained if they reached a minimum length of 100 bp and consisted of at least 50% base pairs with phred scores of at least 20. The filtered fastq files were passed through the bbsplit utility of bbmap [30] with default parameters in order to resolve potential contamination issues between the invader and resident strains of E. coli coexisting in the gut, as well as other species from within the microbiome. Reads matching neither of the two E. coli strains were considered contaminating reads and stored in separate files. The fully assembled invader and resident (Accession Numbers SAMN15163769 and SAMN15163749, respectively) E. coli strains (previously discussed in Frazão et al. [6]) were used for this step. Plasmid sequences from the resident E. coli strain were used as additional potential sources of reads.
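The read-quality criteria described above (4-bp windows with a mean quality of at least 20, a minimum trimmed length of 100 bp, and at least 50% of bases with a phred score of at least 20) can be illustrated with a small sketch. This is a simplified re-statement of those criteria in plain Python operating on a list of per-base phred scores; it is not fastp's own algorithm, and the function names and the one-base-at-a-time trimming are assumptions made for illustration.

```python
def trim_ends(quals, window=4, min_mean=20):
    """Return (start, end) slice bounds after trimming low-quality read ends."""
    start, end = 0, len(quals)
    # Advance from the 5' end while the leading window averages below the cutoff.
    while end - start >= window and sum(quals[start:start + window]) / window < min_mean:
        start += 1
    # Retreat from the 3' end while the trailing window averages below the cutoff.
    while end - start >= window and sum(quals[end - window:end]) / window < min_mean:
        end -= 1
    return start, end


def keep_read(quals, min_length=100, min_base_q=20, min_fraction=0.5):
    """Apply the length and per-base quality criteria to the trimmed read."""
    start, end = trim_ends(quals)
    trimmed = quals[start:end]
    if len(trimmed) < min_length:
        return False
    good = sum(1 for q in trimmed if q >= min_base_q)
    return good / len(trimmed) >= min_fraction
```

A read whose windows pass the mean-quality check can still be rejected by the per-base criterion: a hypothetical read with the repeating quality pattern 40, 19, 19, 19 has a window mean above 20 but only 25% of its bases at phred 20 or higher, so `keep_read` returns `False` for it.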
A total of four file-pairs were generated from the original fastq input files in this step: one for reads best matching the invader reference, one for reads matching the resident, one for plasmid reads, and one pair containing all read pairs not properly matching any of the above categories. The mean coverage per invader E. coli population after the filtering amounted to 261x for populations evolved in young mice, and 413x for populations evolved in old mice. The reads sorting into invader and resident reference genomes were separately aligned to either the E. coli strain K-12 (substrain MG1655; Accession Number: NC_000913.2) reference or the newly assembled reference genome of the resident E. coli, respectively. Three separate alignment algorithms were used: BWA-sampe with default parameters [31], MOSAIK with default parameters [32], and Breseq/Bowtie [33,34]. Breseq provides both alignment and variant analysis, and was run in polymorphism mode requiring a minimum coverage of five reads per position, a variant frequency of at least 0.05, and the absence of significant strand bias. Additional variant calling approaches were employed on top of all three alignment methodologies to identify potential additional variation in the evolved genomes, and to verify the variants called by Breseq. For SNP and indel identification, a naïve pipeline using the mpileup utility within samtools [35] and a custom filter script written in Python was employed. This script filtered base calls to ensure a minimum read mapping quality of 20 and a base call quality of at least 30 for variant calling. Among these high-quality positions, initially, at least 3% of reads and a minimum of five quality reads had to support a putative SNP or indel on both strands with a strand bias (pos. strand/neg. strand) above 0.2 and below 5.0 for this mutation to be considered further.
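The support thresholds in the last step can be expressed as a small predicate. The sketch below is not the authors' custom filter script; it encodes one reading of the stated criteria, assuming that the five-read minimum applies to the total number of supporting reads, that "on both strands" means at least one supporting read per strand, and that the counts passed in have already been restricted to reads meeting the mapping-quality and base-quality cutoffs. The function and parameter names are hypothetical.

```python
def passes_support_filter(fwd_alt, rev_alt, depth,
                          min_fraction=0.03, min_reads=5,
                          bias_low=0.2, bias_high=5.0):
    """Check read support for a putative SNP or indel at one position.

    fwd_alt, rev_alt: supporting reads on the positive and negative strands.
    depth: total number of quality-filtered reads covering the position.
    """
    alt = fwd_alt + rev_alt
    if depth <= 0 or alt < min_reads:
        return False
    if alt / depth < min_fraction:
        return False
    # Support is required on both strands (assumed: at least one read each).
    if fwd_alt == 0 or rev_alt == 0:
        return False
    bias = fwd_alt / rev_alt  # pos. strand / neg. strand
    return bias_low < bias < bias_high
```

With hypothetical counts, a variant supported by 6 forward and 4 reverse reads at depth 200 passes (5% of reads, bias 1.5), whereas one supported by 10 forward reads and 1 reverse read fails on strand bias alone.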
Mutations were only retained if they did not fall within repeat regions or regions of sequence breakpoints (in which case clustering of false-positive SNPs was observed). The final list of mutations consists of those variants that were identified in more than one of the alignment approaches and reached an average frequency of at least 5% between the three alignments. Putative novel insertion sites of known mobile elements, 54 known bacterial insertion sequences (https://ecoliwiki.org/colipedia/images/a/a1/All_IS_seqs_fasta.txt), 9 known representative transposons (AY485150.   [37] and panISa [38], and compared to previous predictions from Breseq. Any insertion element movement predicted by at least two of these approaches was considered well supported. All variants identified were manually verified in IGV [39,40].
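The rule for the final mutation list (called in more than one alignment approach, with an average frequency of at least 5% across the three alignments) can be sketched as follows. This is an illustrative reading rather than the authors' pipeline: it assumes that a variant missing from one aligner's calls contributes a frequency of 0 to the average, and the data layout (a dictionary of per-aligner calls keyed by a hypothetical variant identifier) and function name are assumptions.

```python
def consensus_variants(calls_by_aligner, min_callers=2, min_mean_freq=0.05):
    """Keep variants called by enough aligners with a sufficient mean frequency.

    calls_by_aligner: {aligner_name: {variant_id: frequency}}.
    Returns {variant_id: mean frequency across all aligners}.
    """
    n_aligners = len(calls_by_aligner)
    all_variants = set().union(*(calls.keys() for calls in calls_by_aligner.values()))
    kept = {}
    for variant in sorted(all_variants):
        # Assumption: an aligner that did not call the variant counts as frequency 0.
        freqs = [calls.get(variant, 0.0) for calls in calls_by_aligner.values()]
        n_callers = sum(1 for f in freqs if f > 0)
        mean_freq = sum(freqs) / n_aligners
        if n_callers >= min_callers and mean_freq >= min_mean_freq:
            kept[variant] = mean_freq
    return kept
```

Averaging only over the aligners that called a variant would be an equally plausible reading of the text and would retain somewhat more low-frequency variants; the choice made here is the more conservative one.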

Detection of bacteriophage-mediated HGT events
Prophage (Nef or KingRac)-specific genes were PCR-amplified from clones or populations isolated at different time points as described in Frazão et al. [6].

Statistical analysis
All statistical analyses were conducted in GraphPad Prism (version 7.04). Detailed statistics for all the experiments can be found in the figure legends and/or in the manuscript together with the n and definitions of center and dispersion. In all figures, n represents the number of animals that were used. Statistical significance was defined for p < 0.05 in all comparisons and calculated as described in the manuscript and/or figure legends.

Data deposition
Genome sequencing data have been deposited with links to BioProject accession number PRJNA633524 and BioProject accession number PRJNA549246 in the NCBI BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/).