A microbial signature following bariatric surgery is robustly consistent across multiple cohorts

ABSTRACT Bariatric surgery induces significant shifts in the gut microbiota which could potentially contribute to weight loss and metabolic benefits. The aim of this study was to characterize a microbial signature following Roux-en-Y Gastric bypass (RYGB) surgery using novel and existing gut microbiota sequence data. We generated 16S rRNA gene and metagenomic sequences from fecal samples from patients undergoing RYGB surgery (n = 61 for 16S rRNA gene and n = 135 for metagenomics) at pre-surgical baseline and one, six, and twelve-month post-surgery. We compared these data with three smaller publicly available 16S rRNA gene and one metagenomic datasets from patients who also underwent RYGB surgery. Linear mixed models and machine learning approaches were used to examine the presence of a common microbial signature across studies. Comparison of our new sequences with previous longitudinal studies revealed strikingly similar profiles in both fecal microbiota composition (r = 0.41 ± 0.10; p < .05) and metabolic pathways (r = 0.70 ± 0.05; p < .001) early after surgery across multiple datasets. Notably, Veillonella, Streptococcus, Gemella, Fusobacterium, Escherichia/Shigella, and Akkermansia increased after surgery, while Blautia decreased. Machine learning approaches revealed that the replicable gut microbiota signature associated with RYGB surgery could be used to discriminate pre- and post-surgical samples. Opportunistic pathogen abundance also increased post-surgery in a consistent manner across cohorts. Our study reveals a robust microbial signature involving many commensal and pathogenic taxa and metabolic pathways early after RYGB surgery across different studies and sites. Characterization of the effects of this robust microbial signature on outcomes of bariatric surgery could provide insights into the development of microbiome-based interventions for predicting or improving outcomes following surgery.


Introduction
Bariatric surgery, the most effective treatment for severe obesity, induces rapid and durable weight loss and improves or resolves many obesityassociated comorbidities including Type II diabetes and cardiovascular diseases. 1 Mechanisms that have been proposed to contribute to weight loss and metabolic improvements following bariatric surgery include changes in diet, hormones, bile acids, energy metabolism, and the gut microbiome. 2 Recent literature has shown that the composition and function of the gut microbiome undergo significant changes following bariatric surgery and this may partly contribute to its beneficial effects; however, the underlying mechanisms for such effects remain unknown. [3][4][5] To clarify potential mechanisms by which the gut microbiome exerts its beneficial effects, several studies have evaluated how bariatric surgery alters the gut microbiome. While these studies identify taxa that change in response to surgery, the specific microbiome alterations reported across studies have been inconsistent. For example, both increases and decreases in the relative abundance of the phylum Bacteroidetes as well as Faecalibacterium and Bifidobacterium species following bariatric surgery have been reported. 3,[6][7][8][9][10][11][12][13] These differences can possibly be explained by variations in study design between research groups (e.g., demographics, geography, exclusion and inclusion criteria, sample size, and dietary intake) or differences in laboratory techniques (e.g., DNA extraction and sequencing platforms). 14 In areas such as obesity and colorectal cancer, prior meta-analyses have integrated microbiome data across multiple studies to resolve robust associations between the gut microbiome and health covariates. [15][16][17][18][19] Such studies apply consistent bioinformatic tools and statistical models to all datasets to eliminate analytical variability between investigations and, where possible, model the impact of study design and methodological factors on the results. However, no such meta-analysis has yet clarified whether consistent associations exist between changes in the gut microbiome and RYGB surgery outcomes across studies.
In this study, we generated 16S rRNA gene sequences and metagenomics data from fecal samples obtained from patients who underwent Rouxen-Y Gastric Bypass (RYGB) surgery and compared the microbial signature observed from our study with three publicly available 16S rRNA gene datasets and one metagenomic dataset obtained from similar RYGB-microbiota studies while controlling for sequencing and bioinformatic tools. 12,[20][21][22] Our study reveals that many of the microbial community of patients in different cohorts responds to surgery in very similar ways. Establishing this consistent microbial signature is a necessary prerequisite for the development of robust clinical tools that could use microbial changes to predict personalized outcomes in bariatric surgery.

Inference using linear mixed models reveals a consistent fecal microbiota signature associated with RYGB surgery across multiple 16S rRNA gene datasets
We generated 16S rRNA gene sequence data from fecal samples obtained from patients pre-(n = 30) and one month (n = 21) and six months (n = 10) post-RYGB surgery. These data were generated as part of our ongoing prospective study (referred to as the BS study in this manuscript), which aims to characterize the effect of the gut microbiome and behavioral variables on weight profile following bariatric surgery. 23 We combined these data with three smaller publicly available 16S rRNA gene sequence datasets from patients who also underwent RYGB surgery. [20][21][22] Demographics and number of samples at each timepoint are presented in (Table 1). Consistent with many previous studies showing large differences between cohorts analyzed by different groups, 24 Principal Coordinates Analysis (PCoA) using Bray-Curtis distance showed pronounced clustering by study ( Figure  1a). Using the PERMANOVA test with independent variables of study, time (pre-versus all timepoints post-surgery) and the interaction between time and study showed highly significant differences between the gut microbiota compositions across studies (R 2 = 0.33, p = .001, Figure 1a). Although the effect size was much smaller, we also observed a significant difference between pre-and post-surgical samples including all timepoints (R 2 = 0.03, p = .001, Figure 1b), while the interaction between study and time was not significant (R 2 = 0.01, p = .22). No significant difference in Shannon Diversity Index was observed between different timepoints (Supplementary Figure S1).
In order to provide a more detailed view of the taxa that were changed with surgery, for each study we compared the log 10 -normalized count of each genus at each timepoint after surgery to the same genus's normalized baseline abundance using linear mixed models. To enable the comparison of results across studies, we created "log 10 p-value vs. log 10 p-value" plots in which the results from the "pre-vspost" coefficients of our models at each timepoint were compared across pairs of studies (supplementary Figure S2). For example, (Figure 1c) compares the changes in genera from the baseline to postsurgery between our BS study dataset at one month and the Assal study at three months post-surgery, which had the greatest similarity across all pairs of studies. Genera in the upper right quadrant are enhanced post-surgery in both studies, while genera in the lower left-quadrant are reduced postsurgery in both studies and genera with unadjusted p < .05 in both studies are annotated. Using this comparison, there is a remarkable degree of similarity (adjusted p < .001 spearman r = 0.671) across the two studies. We made similar plots comparing   Figure S2) in which we compared changes in the log 10 -normalized count of genera at each timepoint compared to the baseline and then showed the consistency between those comparisons using "log 10 p-value vs. log 10 p-value" plots. All five pairwise comparisons within studies and the 23 pairwise comparisons across studies were significantly associated at a 5% FDR although as we would expect from our PCoA analysis, there was greater consistency comparing timepoints within studies than across studies (r = 0.63 ± 0.07 versus r = 0.41 ± 0.10) (Figure 1e). Among the four 16S rRNA gene datasets, the Ilhan study was somewhat of an outlier exhibiting weaker associations with the Afshar and our novel BS study (Figure 1f). The Ilhan study had the smallest sample size among the datasets (Table 1) and this may explain why its associations with other studies were less robust. Since the DADA2 pipeline allowed us to increase the resolution of classification of 16S rRNA gene sequences to sequence variants, we explored whether the high degree of similarity between studies is present at the high-resolution sequence variant level. For this, we compared changes in the log 10 normalized count of sequence variants between the BS and Assal studies as these studies included amplicons from the same region of the 16S rRNA gene with the same length and primers. We observed that changes in sequence variants post-surgery followed a similar pattern in both studies (Supplementary Figure S3) and that in some cases, sequence variants that were increased or decreased in both studies belonged to different genera, such as Veillonella, Atopobium, a c d e f b Figure 1. A Consistent Signature of Fecal Microbiota at the Genus Level is Associated with Roux-en-Y Gastric Bypass Surgery Across 16S rRNA Gene Studies. (a) Principal Coordinates Analysis using Bray-Curtis distance was performed on the fecal microbiota composition at the genus level. Samples are clustered by study, suggesting significant differences present across studies. (b) A similar ordination plot to A is shown with pre-surgical samples (0) and post-surgical samples (1) specified for each dataset. (c & d) log 10 p-value versus log 10 p-value plots are generated using the unadjusted p-values from linear mixed models comparing log 10 normalized count of taxa at each timepoint compared to baseline with patient IDs as random effects. Upper right-quadrant and lower left-quadrant show taxa that were increased and decreased, respectively, in two different studies (c) or at two different timepoints within a study (d). Spearman rankorder correlation was used to test the consistency of changes in taxa after RYGB between studies or within studies. Taxa with unadjusted p < .05 in both studies or at both timepoints within a study are annotated. All pairwise comparisons between and within studies are shown in Supplementary Figure S2. (e) Boxplots show the correlation coefficients from Spearman rank-order correlations between and within studies. Wilcoxon Rank-Sum test was used to compare the correlation coefficients (* adjusted p < .05). Postsurgery time for Afshar study was assumed to be 6 months based on Afshar et al 2018. (f) Boxplots show correlation coefficients between different studies or between different timepoints in a study.

Lachnoclostridium,
and Streptococcus (Supplementary Figure S4). Overall, we conclude that the genus and sequence variant view gives highly concordant results in the comparison of these two datasets.
Following our observation of significant correlations between studies, we next determined which genera were consistently increased or decreased across studies compared to the baseline. We observed that changes in the relative abundance of Veillonella, Streptococcus, Gemella, Fusobacterium, Escherichia/Shigella, and Akkermansia showed a similar trend and increased at least at one timepoint after surgery across the four 16S rRNA gene datasets. Rothia, Actinomyces, Atopobium, and Granulicatella showed a similar trend in changing after surgery across three datasets, but not in the Ilhan study. On the other hand, Blautia decreased at all the timepoints after surgery in all four 16S rRNA gene datasets ( Figure 2). Most significant changes relative to baseline were observed at onemonth post-surgery in the BS study and threemonth post-surgery in the Assal study ( Figure 2). These results may reflect greater shifts in the microbiota during the first few months after surgery due to surgery-associated factors, such as medication Figure 2. Several Taxa at the Genus Level were Increased or Decreased Consistently Across Datasets. A hierarchical clustering heatmap was performed on the Euclidian distances of log 10 p-values from the linear mixed models with timepoint as a fixed effect and patient ID as a random effect. Red indicates taxa were increased after surgery, while blue indicates taxa were decreased after surgery compared to the baseline. use, liquid diet, and hospitalization, or may be simply explained by the larger sample size at these timepoints for both the BS and Assal studies.

Metagenomic studies reveal a consistent and robust signature in microbial pathways associated with RYGB surgery across studies
The above results relied on the SILVA database contextualized with taxonomic annotations that stemmed from analysis of DADA2 clustered 16S rRNA gene sequences. In order to show that these results are independent of the database used for taxonomic classification as well as the 16S rRNA gene sequencing platform, we performed metagenomics on 135 samples from the BS study, taxonomically classified sequences using the Kraken2 classifier, and compared changes in the log 10 normalized count of species at each timepoint relative to the baseline with a smaller publicly available metagenomic dataset (Palleja, Table 1). 12 Similar to the results from pairwise comparisons of 16S rRNA gene datasets, a significant correlation was observed when we compared the changes in species from the baseline to post-surgery between our BS study and the Palleja study (r = 0.31 ± 0.04, p < .001; Figure 3a and Supplementary Figure S5), suggesting some taxa have similar patterns of changing post-surgery in both studies. For example, Streptococcus spp., Klebsiella spp., Veillonella parvula, Citrobacter freundii, and Enterobacter hormaechei were increased in both studies. As expected, a greater consistency in changes in species was observed when we compared different timepoints within the Palleja or BS studies (r = 0.76 ± 0.06 versus r = 0.31 ± 0.04; Figures 3b and 3c and Supplementary Figure S5). While significant, the correlation based on taxa from these metagenomic studies was smaller than what we observed for 16S rRNA gene studies.
In addition, changes in abundances of metabolic pathways profiled by the HUMAnN2 pipeline were a b c compared between these two metagenomic datasets. Interestingly, compared to the taxonomic composition, changes in abundances of metabolic pathways post-surgery relative to the baseline showed a greater degree of similarity between the Palleja and BS datasets (r = 0.70 ± 0.05; p < .001 across different timepoints between studies; r = 0.89 ± 0.05; p < .001 across different timepoints within studies; Figure 3d-f and Supplementary Figure S6). Notably, pathways involving gluconeogenesis, L-alanine biosynthesis, biotin biosynthesis, hexitol degradation, mycolate biosynthesis, N-acetylneuraminate degradation, and fatty acid elongation were enriched post-surgery in both studies, while dTDP-L-rhamnose biosynthesis I and starch degradation were decreased post-surgery in both studies (Supplementary Table S2). These results suggest that functional microbial genes respond similarly to RYGB surgery across cohorts.

Machine learning approaches confirm that the fecal microbiota can be used to discriminate pre-and post-surgery samples
Machine learning approaches have often been used in metagenomic analyses to determine whether a microbial signature in one study can be used to predict the phenotype of samples from another study. 25 In order to determine if such an approach could be extended to our analyses, we next used supervised classifiers to assess whether the signature of the gut microbiota after surgery generalizes across 16S rRNA gene studies. For this purpose, we used the Least Absolute Shrinkage and Selection Operator (LASSO) logistic regression ( Figure 4) and Random Forest (Supplementary Figure S7) classifiers implemented in a statistical workflow developed by Wirbel et al. 15 Within study crossvalidation performance, quantified by the area under the receiver operating characteristic curve (AUROC) was highest for the BS and Assal studies (above 0.90 for both LASSO and Random Forest classifiers) and was the lowest for the Ilhan study (Lasso: 0.67, Random Forest: 0.59) and Afshar study (LASSO: 0.67, Random Forest: 0.74) ( Figure  4a and Supplementary Figure S7A), which likely reflects the smaller sample size of these latter studies. The average study-to-study accuracy predictions ranged between 0.66 and 0.84 for the LASSO classifier, with the highest performance being associated with using Afshar or BS studies as training sets (0.84 and 0.83, respectively) and the lowest performance associated with training on the Ilhan study (0.66) (Figure 4a). Finally, we examined when a classifier is trained using all datasets but one, how well it generalizes in evaluation on the remaining hold-out study (leave-one-study-out: LOSO validation). The LOSO performance for all datasets was between 0.77 and 0.95 using the LASSO classifier ( Figure 4b) and was above 0.85 using the Random a b Forest classifier (Supplementary Figure S7B). All these results are consistent with the RYGB procedure having a replicable microbiota signature across studies.

Abundances of opportunistic pathogens increase post-surgery in a consistent manner across cohorts
Since we observed similar changes in the microbial community following surgery across studies, we next asked whether changes in the gastrointestinal environment post-surgery provide an opportunity for opportunistic pathogens that could be acquired from the hospital or could be associated with antibiotic treatment to flourish in the gastrointestinal tract. For this purpose, from our analysis from the Kraken2 pipeline, for both 16S rRNA gene and metagenomic sequences, we compared the log 10 normalized count of typical opportunistic pathogens Klebsiella pneumoniae, Klebsiella oxytoca, Enterococcus faecalis, Enterobacter cloacae complex sp, Enterobacter cloacae, Enterococcus faecium, and Clostridium perfringens between different timepoints using a linear mixed model for each study. Interestingly, our BS metagenomic and Palleja metagenomic datasets showed a largely consistent set of opportunistic pathogens that increased at an adjusted p < .05 one month or three-month postsurgery and remained high in abundance at later timepoints following surgery (Figure 5a,b). One exception to this agreement between studies was Enterococcus faecium which was decreased in the BS study following surgery but was unchanged in the Palleja study. In addition, an increase in Clostridium perfringens reached statistical significance (adjusted p = .001) at six-month postsurgery in our BS metagenomic study but not in the Palleja metagenomic study.
A similar trend was observed in the 16S rRNA gene datasets; however, only the increase in Klebsiella pneumoniae (BS study adjusted p = .007 at one-month post-surgery; Assal study adjusted p = .04 at three-month post-surgery) and Clostridium perfringens (BS study adjusted p < .001 at one and six-month post-surgery and Assal study adjusted p = .002 at three months and adjusted p = .04 at two years post-surgery) reached statistical significance (Figure 5c,d). This could be simply because Kraken2 may not have enough information to resolve many 16S rRNA gene sequences to this level of taxonomic assignment in many instances. Nevertheless, the increased trend a b c d e f in opportunistic pathogens post-surgery observed in both 16S rRNA gene and metagenomic datasets indicates that RYGB is associated with colonization of some of the opportunistic pathogens in the gut, which may remain abundant months after surgery.

Discussion
Previous studies have shown that bariatric surgery induces significant shifts in the gut microbiota. 2 A recent systematic review of longitudinal bariatric surgery studies reported that similar patterns in microbial profiles were observed after RYGB and Sleeve Gastrectomy (SG) across studies. 14 For example, at the phylum level, Proteobacteria were increased after RYGB in numerous studies. 3,4,10,26 In addition, at the genus level, Prevotella and Viellonella were increased following RYGB in multiple studies. 3,8,[26][27][28] Despite these similarities, substantial inconsistencies in microbial profiles following surgery have also been observed across different cohorts. For example, there are conflicting results regarding changes in the relative abundance of the phylum Bacteroidetes as well as Faecalibacterium and Bifidobacterium species following bariatric surgery. 3,[6][7][8][9][10][11][12][13] In addition, some studies have reported an increase in microbial diversity post-surgery, 5,11,12 while Paganelli et al. observed a short-term decrease in microbial diversity. 29 These conflicting results can be attributed to large variations in demographics, study designs, inclusion and exclusion criteria, data collection methods, and data analyses across studies. 14 Factors such as diet, medications, physical activity, and disease states are not always well controlled across clinical studies and cannot necessarily be corrected in a post-hoc analysis. However, if sequence data for these studies are publicly available, it is possible to eliminate analytical differences between studies. The results we report here remove variations in data analysis between studies by applying identical bioinformatic tools and statistical models.
Our results revealed a significant and surprisingly robust microbial signature in the weeks and months following RYGB. By integrating multiple studies, our results confirm and expand on some of the original observations made in previous publications. For example, when we compared changes in taxa across two of the 16S rRNA gene datasets (Assal and BS datasets), we found over 20 taxa that were significantly changed in both datasets with an unadjusted p <.05 with a remarkable degree of similarity in the magnitude of the changes across the two cohorts (Figure 1c). By contrast, in the original paper, Assal et. al reported only 10 taxa by Mann-Whitney paired test that were significantly changed. Therefore, our integrated analysis allowed us to achieve a better resolution of postsurgical changes in the microbial profile in a way that an individual study with a small sample size would be underpowered to detect.
In our analysis, we observed an increase in Veillonella and Streptococcus and a decrease in the Blautia (all from the Firmicutes phylum) across the four studies. These changes could have important clinical implications post-surgery. For example, Veillonella and Streptococcus metabolize lactate, 30,31 which consequently impacts butyrate metabolism and the integrity of the epithelial barrier. [32][33][34] Enhanced integrity of intestinal epithelium could decrease low-grade systemic inflammation and improve metabolic disorders. 35 Our study also revealed an increase in Akkermensia within the Verrucomicrobia phylum. Akkermensia contains mucin degrading microbes and has been shown to increase after bariatric surgery in several studies. 3,12,28,36 Previous animal studies have shown that Akkermensia muciniphila protects against obesity and diabetes by enhancing the intestinal epithelium barrier and potentially decreasing endotoxemia and low-grade inflammation. 37,38 Akkermensia muciphilia was also associated with improvements in insulin sensitivity markers in humans. 39 We also observed that Escherichia (found in the Proteobacteria phylum) was increased, which is in agreement with previous studies, 8,11 and a negative correlation between E. coli and serum leptin after RYGB was previously reported. 8 In addition to the four 16S rRNA gene datasets, we compared our novel metagenomic sequences with the previously published Palleja metagenomic study. This allowed us to examine the effect of RYGB on the function of the gut microbiome in addition to the taxonomic composition. Our results showed a more robust and replicable signature in metabolic pathways following surgery across the two cohorts compared to the taxonomic profile. Changes in carbohydrate, amino acid, and lipid metabolism by the gut microbiome post-surgery could be due to dietary changes, the adaptation of the gut microbiome to the new gastrointestinal environment, or the increased potential of the gut microbiome in energy harvest as a compensatory response to the reduced food intake after RYGB. 12 Changes in the metabolic pathways of the gut microbiome could have clinical implications on outcomes after surgery since metabolites produced by the gut microbiota, such as short-chain fatty acids, have beneficial health effects, and have been linked to weight loss, glycemic improvement, and regulation of food intake. 28,40 However, future studies that evaluate how changes to the metagenome predict post-surgery outcomes are needed to clarify the clinical implications of these observations. Moreover, our study revealed that some opportunistic pathogens associated with the hospital environment, such as Klebsiella pneumonia and Clostridium perfringens increased after RYGB. Changes in the gastrointestinal environment and routinely administered operative prophylactic antibiotics prior to surgery might be the reasons for the increase in opportunistic pathogens after surgery. Future metagenomic studies are warranted to confirm these results and to determine the extent to which these opportunistic pathogens persist longterm after surgery and whether they have any effect on weight outcomes and metabolic response to bariatric surgery. Nevertheless, it is worth mentioning that our observation of increased opportunistic pathogens following RYGB does not necessarily translate into an increased risk of infection. In fact, the increase in opportunistic pathogens may be transient and may revert to the baseline following overgrowth of normal flora at later timepoints following surgery. Furthermore, an increase in the observed relative abundance of opportunistic pathogens does not necessarily mean that the taxa in question will elicit pathogenesis.
This study has some limitations that should be noted. First, due to lack of availability of weight and metabolic data in some studies, we were unable to perform any association studies between the baseline microbiome and the post-surgical microbial signature with weight and metabolic outcomes after RYGB. Therefore, future studies are needed to characterize the effects of the taxa that were found to be commonly increased or decreased across studies on the outcomes of surgery. Second, again due to lack of data availability, we were not able to control for BMI, age, medications, diet, metabolic disorders, and other covariates that could potentially impact the gut microbiota composition. However, this limitation is less likely to have impacted our main findings since our outcomes show consistency across multiple studies. Finally, the microbial signature observed in this study might only reflect the immediate impact of surgery-related procedures, which could include a liquid diet or exposure to antibiotics. Future research is needed to determine if this consistent microbial signature persists over a longer term.
In conclusion, our study highlights a robust signature in both microbial composition and gene function following RYGB across different cohorts. Ongoing assessment of our cohort will allow us to determine in future studies if this microbial signature is predictive of weight outcomes after surgery. These studies may allow us to use the microbial community to guide decisions on which subset of patients will successfully respond to surgery. Our characterization of the robust effect of the microbial signature across cohorts is a necessary prerequisite for the development of novel microbiomebased interventions for personalized treatment of obesity, improving outcomes, and preventing weight regain following surgery. Such novel treatments will potentially have greater patient compliance, will be less invasive compared to surgery, and will decrease the need to revision of bariatric surgery for weight regain.

Publicly available 16S rRNA gene and metagenomic datasets
In this study, we generated novel 16S rRNA gene and metagenomic datasets from fecal samples from patients undergoing RYGB and performed a comparative analysis of changes in the fecal microbiota from pre-to post-RYGB using our novel dataset and previously published datasets. For this, we searched NCBI for Bioprojects that included clinical studies involving patients undergoing RYGB surgery as well as their associated 16S rRNA gene sequences, metagenomics and metadata (Supplementary Table S1). Studies that included longitudinal samples and were sequenced on the Illumina platform were included in our comparative analysis, while cross-sectional studies were excluded. Three 16S rRNA gene datasets and one metagenomic dataset met these requirements. 12,[20][21][22] The raw sequences for these studies were obtained from the Sequence Read Archive (SRA) with project numbers specified in ( Table 1). The dataset from the Afshar et al. study was part of Biomarkers Of Colorectal cancer After Bariatric Surgery study (www.isrctn.com/ ISRCTN95459522) and no publication for the microbiome dataset is available. Also, for this dataset, the time for collection of post-RYGB samples is not determined in the metadata, however, based on Afshar et al. 22 patients were followed up at 6 months after surgery. The Ilhan et al 2020 study included mucosal samples from post-RYGB patients and fecal samples from a cross-sectional cohort; however, these samples were removed from our analysis. Sample collection, DNA extraction, library preparation, and inclusion and exclusion criteria can be found in original publications and have been summarized in (Supplementary Table S1).

Clinical study (BS dataset)
We used data (BS dataset, Table 1) from our ongoing prospective study, which aims to examine the impact of biological and behavioral variables on weight outcomes following bariatric surgery. 23 This study is registered at www.clinicaltrials.gov with trial ID NCT03065426. Details regarding specific aims and study design for this study have been previously published. 23 Briefly, patients are recruited at two sites: Sanford Center for Biobehavioral Research (CBR), in collaboration with North Dakota State University, in Fargo, ND, and Cleveland Clinic Bariatric and Metabolic Institute, Cleveland, OH. This study was approved by the Institutional Review Boards (Cleveland Clinic: #16-1460, Sanford Health: #00001409 and North Dakota State University #PH17112) at both data collection sites and all patients provided written informed consent prior to enrollment. Female or male patients, aged 18-65 years undergoing the first bariatric surgery procedure, either a RYGB or Sleeve Gastrectomy (SG), were recruited and evaluated at baseline (pre-surgery) and multiple timepoints (one, six, 12, 18, and 24 months) following surgery. Exclusion criteria can be found in Heinberg et al. 23 For this study, samples at baseline, one, six, and 12 months post-RYGB were included as samples at 18 and 24 months are not yet available at the time of preparing this manuscript. We did not include SG samples since we have smaller number of SG samples compared to RYGB samples and also most publicly available datasets include patients who underwent RYGB (Supplementary  Table S1).

Microbiome analysis (BS dataset): 16S rRNA gene and metagenomics
Fecal samples were collected at baseline and each timepoint after surgery and stored at −80°C until analysis. DNA extraction was performed as previously described. 41 Briefly, a phenol/chloroform extraction method combined with physical disruption of bacterial cells and a DNA clean-up kit (Qiagen DNeasy Blood and Tissue extraction kit, Valencia, CA) was used to extract DNA from fecal samples. For 16S rRNA gene sequencing, the V4 variable region was amplified by polymerase chain reaction (PCR) using 16S rRNA gene primers (forward: 5ʹ-CAACGCGARGAACCTTACC-3ʹ; reverse: 5ʹ-CAACACGAGCTGACGAC-3ʹ) and sequenced on the Illumina MiSeq platform (Illumina, San Diego, CA) at the High-Throughput Sequencing Facility in the Carolina Center for Genome Sciences at the UNC School of Medicine as previously described. 42 For metagenomics, extracted DNA was subjected to 2 × 150 bp paired-end sequencing on the Illumina HiSeq 4000 platform at the UNC-Chapel Hill high throughput sequencing facility. The microbial profile of BS study was primarily characterized through metagenomic sequencing (n =135 metagenomes) and 16S rRNA gene sequencing was additionally performed for a subset of samples (n = 61) to validate the results from metagenomic sequencing.

Comparative, integrated analysis across multiple datasets
Forward reads from the four 16S rRNA gene datasets were individually run through the DADA2 pipeline to generate amplicon sequence variants (ASVs). 43 Sequences were filtered using the "filterAndTrim" function in DADA2 with default parameters and maxEE = 2 (reads with expected errors more than 2 were discarded). For the BS, Assal, and Afshar datasets forward reads were truncated at 200 base pairs and for Ilhan study reads were truncated at 150 base pairs. The SILVA132 database was used for the taxonomic classification of ASVs using the DADA2 "assignTaxonomy" function. The metagenomic datasets (Palleja and BS) sequences were analyzed with Kraken2 44 using the BioLockJ automated pipeline (https:// github.com/BioLockJ-Dev-Team/BioLockJ).
In order to compare changes in the abundance of opportunistic pathogens at the species level after surgery across cohorts, 16S rRNA gene datasets were taxonomically classified using the Kraken2 pipeline in addition to the metagenomic datasets. Recent work by Lu et al. 45 showed that classification of 16S rRNA gene microbial communities using Kraken2 is fast and accurate.
Each 16S rRNA gene and metagenomic dataset (from DADA2 or Kraken2) was normalized using the following formula to correct for different sequence depth across samples: 46 log 10 Raw count in sample i ð Þ # of sequences in sample i ð Þ � � �Average # of sequences per sample� þ 1Þ Metabolic pathways for the metagenomic datasets were profiled using the HUMAnN2 pipeline and the MetaCyc database. 47,48 Pathway abundances (reads per kilobase) were normalized to copies per million using the "humann2_renorm_table" function from HUMAnN2.
For each dataset, longitudinal fecal samples from patients who underwent RYGB were selected. Samples from other procedures such as SG or cross-sectional samples were removed from the downstream analysis. Taxa that were present in less than 10% of the samples were removed. Linear mixed models with patients as random effects and timepoints as fixed effects (taxa ~ timepoint, random = ~1 | patient ID) were constructed to compare the log 10 normalized count of taxa between pre-surgery and different timepoints postsurgery using the "lme" function from the nlme package in R. P-values were generated from the "summary" function in R for each post-surgery timepoint with a baseline reference. Linear mixed models were used rather than paired t-tests because not every patient had samples for all timepoints (Table 1). To compare results across studies,log 10 unadjusted p-values from linear mixed models were multiplied by the sign of the regression slope and "log 10 p-values vs. log 10 p-values" plots were generated. Therefore, positive and negative log 10 p-values indicate an increase and decrease, respectively, in log 10 normalized count of taxa after surgery compared to baseline. Spearman's rank-order correlation was used to examine the correlations of the log 10 p-values between studies or between different timepoints within a study. P-values from the Spearman's rank-order correlations were corrected for multiple hypothesis testing using the Benjamini-Hochberg procedure and were considered to be significant when the False Discovery Rate (FDR) was <5%. In addition, a hierarchical clustering heatmap based on the Euclidian distances of the log 10 p-values from linear mixed models were generated using the function "Heatmap" from the package ComplexHeatmap in R. Similar statistical analyses were performed to compare changes in metabolic pathways postsurgery between the BS and Palleja cohorts.
Principal Coordinates Analysis with Bray-Curtis distance was performed on log 10 normalized counts of taxa from the joined four 16S rRNA gene sequencing datasets using the "capscale" function from the Vegan package in R. The PERMANOVA test on the Bray-Curtis distances was further used to compare the microbiota composition between different datasets and timepoints (pre-versus post-surgery) after surgery (Microbiota ~ study * timepoint). Shannon diversity index, a measurement of richness and evenness, was performed using the "diversity" function from the Vegan package in R. Finally, supervised classification models, including Random Forest and Least Absolute Shrinkage and Selection Operator (LASSO) were used to examine the microbial signature following RYGB surgery. For this purpose, we used a statistical workflow developed by Wirbel et al. 15 In this workflow, taxa at the genus level that have no variance across samples are removed, relative abundances are log-transformed after adding a pseudocount of 1 × 10 −5 and standardized as z-scores. Each dataset was split into a train set and a test set with 8-fold cross validation and 10 repeats. Within the study, prediction of surgical status (pre-surgery versus post-surgery) in a test set was calculated as average area under the receiver operating characteristic curve (AUROC) from 10 trained models (one model from the 8-fold cross-validation * 10 repeats). Additionally, the trained models from all cross validations and repeats (180 models: 8 models from cross validation * 10 repeats) were then used to predict the surgical status in another dataset. In addition, all joined datasets except for one dataset were trained to predict the hold-out dataset (leave-one-study-out (LOSO) validation). Predictions were averaged across all models. All analyses and visualizations were performed using R studio (Version 1.3.1056).

Disclosure of potential conflicts of interest
The authors report no conflict of interest.

Funding
This project was supported by the National Institutes of Health 5R01 DK112585-05 (PI's: Kristine Steffen and Leslie Heinberg).

Ethics approval and consent to participate
This study was approved by the Institutional Review Boards (Cleveland Clinic: #16-1460, Sanford Health: #00001409 and North Dakota State University #PH17112) at both data collection sites and all patients provided written informed consent prior to enrollment. This study is registered at www.clinical trials.gov with trial ID NCT03065426.