Probiotic and commensal gut microbial therapies in multiple sclerosis and its animal models: a comprehensive review

ABSTRACT The need for alternative treatments for multiple sclerosis (MS) has triggered copious amounts of research into microbial therapies focused on manipulating the microbiota–gut–brain axis. This comprehensive review was intended to present and systematically evaluate the current clinical and preclinical evidence for various probiotic and commensal gut microbial therapies as treatments for MS, using the Bradford Hill criteria (BHC) as a multi-parameter assessment rubric. Literature searches were performed to identify a total of 37 relevant studies (6 human, 31 animal), including 28 probiotic therapy and 9 commensal therapy studies. In addition to presenting qualitative summaries of these findings, therapeutic evidence for each bacterial formulation was assessed using the BHC to generate summative scores. These scores, which encompassed study quality, replication, and other considerations, were used to rank the most promising therapies and highlight deficiencies. Several therapeutic formulations, including VSL#3, Lactobacillus paracasei, Bifidobacterium animalis, E. coli Nissle 1917, and Prevotella histicola, emerged as the most promising. In contrast, a number of other therapies were hindered by limited evidence of replicable findings and other criteria, which need to be addressed by future studies in order to harness gut microbial therapies to ultimately provide cheaper, safer, and more durable treatments for MS.


Multiple sclerosis
Multiple sclerosis (MS) is a chronic autoimmune disease of the central nervous system (CNS) characterized by neuroinflammation, myelin sheath degeneration, axonal loss, and blood-brain barrier (BBB) deterioration. 1,2 Globally, about 2.8 million people are estimated to live with MS. 3 The disease typically follows one of four clinical courses: clinically isolated syndrome (CIS), relapsing-remitting MS (RRMS), primary progressive MS (PPMS), and secondary progressive MS (SPMS), with progressive forms being the most severe and refractory to treatment. 1 As MS progresses, most patients are challenged with chronic pain and fatigue, gradual sensorimotor impairments, bowel and bladder dysfunction, cognitive changes, and overall diminished quality of life. 1,2 There are currently 15 FDA-approved disease-modifying therapies (DMTs) used to decrease the severity and frequency of MS relapses. [4][5][6] These DMTs are generally effective at mitigating MS pathology by suppressing various aspects of the immune system, but they are also expensive, 2 often accompanied by an array of side effects, 5 and demonstrate decreased efficacy over time. [6][7][8][9] As such, improved, alternative MS treatments are warranted.
Like most chronic diseases, susceptibility to MS is driven by both genetic and environmental components, with the latter including well-documented risk factors like Epstein-Barr virus infection, vitamin D insufficiency, and smoking. 10,11 An emerging putative pseudo-environmental risk factor for MS and other chronic diseases is an imbalance (dysbiosis) in the gut microbiome, a complex ecosystem of trillions of microorganisms inhabiting our intestinal tracts. A generalized mechanism proposed is the bidirectional communication between the CNS and gut by way of the so-called microbiota-gut-brain axis (MGBA). 1,[12][13][14] Dysbiosis within the gut can promote effector T cell phenotypes toward proinflammatory pathways that subsequently increase intestinal barrier permeability. [15][16][17][18][19][20] This enables the release of microbial antigens and intestinal immune cells into circulation, further promoting systemic, low-level inflammation which may contribute to the weakened BBB tight junctions and enhanced T cell autoreactivity observed in MS. 15,16,[18][19][20] Whether MS contributes to this dysbiosis or results from it, however, is still unclear. Nevertheless, multiple studies 16,[18][19][20] have characterized MS gut microbiomes as distinct from their healthy control counterparts, generally possessing an elevated relative abundance of microorganisms associated with inflammation that, when transplanted into mice, have been shown to exacerbate an experimental animal model of MS, experimental autoimmune encephalomyelitis (EAE). 19

Commensal and probiotic gut bacterial therapies
Given the putative role of gut dysbiosis in promoting MS susceptibility, an attractive therapeutic approach would be to restore balance of the microbiome, and/or to take advantage of the intimate cross-talk between the immune system and gut microorganisms to inhibit or skew autoimmune responses. [21][22][23][24] Hence, gut-targeted microbial therapies have been gaining traction as alternative or supplemental treatment options for a variety of conditions, including MS.
Gut bacterial commensals are generally beneficial organisms that naturally make up the gut microbiome and help maintain a healthy host environment. 22 Probiotics, in contrast, are defined by the International Scientific Association for Probiotics and Prebiotics (ISAPP) as "live microorganisms that, when administered in adequate amounts, confer a health benefit on the host." 25 For the purposes of this review, probiotics are defined as live bacteria that can be supplemented into the microbiome to elicit beneficial changes in the commensal microbial community structure, and/or exert direct beneficial effects on the host. 22,26,27 It should be noted, however, that the exact distinction between gut bacterial commensals and bacterial probiotics remains arbitrary and far from uniform across studies, organizations, and regulatory guidelines. Nevertheless, both commensals and probiotics have important roles in digestive and immune health, including nutrient and vitamin synthesis, metabolism of host dietary products, intestinal barrier reinforcement, prevention of pathogenic microbe colonization, and anti-inflammatory immunoregulation. 21,22,26,27 Consequently, both have the therapeutic potential to mitigate MS pathology through modulation of the MGBA. This, in combination with accessibility and relatively low costs, makes probiotic and commensal therapies attractive alternative MS treatment candidates.

Alternative MS therapies: where do we stand?
A number of recent reviews have attempted to compile the growing body of evidence for gut microbiome-targeted therapies, though most of these are far from comprehensive. 12,[26][27][28][29][30][31][32][33][34] To date, there have been two published reviews that evaluate the clinical utility of probiotics and explore possible underlying mechanisms, including one systematic review of two clinical and five preclinical studies, 31 and one recent study with a meta-analysis of three clinical studies and systematic review of 22 preclinical studies. 33 Though valuable for highlighting probiotic therapeutic efficacy, both of these reviews are exclusively focused on probiotics without consideration of commensal therapy, and neither ranked the current evidence for each specific gut bacterial formulation using a quantitative objective rubric. Addressing the latter is particularly important, given that it helps to identify stronger and weaker areas within the field, highlight discrepancies, and ultimately provide direction for future research.
Accordingly, in this comprehensive review, we attempt to (1) compile the current clinical and preclinical evidence of MS mitigation by probiotic and commensal therapies; and (2) systematically rank the evidence of each gut-bacterial formulation using the Bradford Hill criteria (see below). In doing so, we aim to identify the most promising emerging therapies, as well as to highlight existing shortcomings in the field and emphasize specific foci for future studies.

Methods
This comprehensive review was originally intended as a systematic review, and therefore registered in the International Prospective Register of Systematic Reviews (PROSPERO; ID# CRD42020206819) following the initial search, but prior to screening articles.

Search strategies
Searches were conducted by two authors (LB & TM) on August 27, 2020 and January 4, 2021 using four databases: OvidMEDLINE, CINAHL, PubMed, and Web of Science. One paper was identified separately by one author (DK) outside of the search strategy. 35 Search strategies were tailored to each database using keywords, MeSH and MH headings, truncation, and an English Language filter (Supp. File 1A).

Selection criteria
Studies from all years were included in this review if they (1) were written or available in English, (2) investigated the effects of probiotic and/or commensal therapy on the severity and progression of MS or an MS animal model, and (3) utilized an experimental/intervention-based study design. Studies were excluded if they did not meet the inclusion criteria and/or used a non-intervention/experimental study design, including cohorts, cross-sectional studies, case-control studies, case series, and case reports.

Operational definitions
As mentioned above, the distinction between bacterial probiotics and commensals is not well defined, particularly as it applies to MS. For the purposes of this review, a bacterial therapeutic was considered "probiotic" when meeting evidence level 1-2 based on World Gastroenterology Organization guidelines from the Oxford Center for Evidence-Based Medicine, and "commensal" if falling at evidence level 3 and below where RCTs are lacking. 36 Study interventions were therefore classified as "probiotic therapy" if researchers supplemented with the following putative probiotics: Lactobacillus spp., Bifidobacterium spp., Escherichia coli Nissle 1917 (E. coli Nissle 1917), Enterococcus faecium (E. faecium), or Streptococcus thermophilus (S. thermophilus); or "commensal therapy" if researchers supplemented with any other species of commensal bacteria, including Prevotella spp., Akkermansia spp., Pediococcus acidilactici (P. acidilactici), Clostridium butyricum (C. butyricum), and Bacteroides fragilis (B. fragilis).
There were two animal studies that were exceptions to these classifications, for the following reasons. Both studies introduced putative probiotic Lactobacillus spp. via stable colonization by a single inoculation rather than continuous treatment, which is more representative of commensal therapy than a probiotic therapy. 37,38 Additionally, the bacterial strains used in these two studies are not strains recognized as probiotics, but are instead isolates from commensal murine gut microbiota. These two studies were hence classified as commensal therapy.

Data extraction
Following screening studies for relevance against the selection criteria (LB, TM, & DK), data were extracted from the included studies by two authors (LB & TM) (Supp. File 1). The study metrics extracted included first author, year of publication, DOI, location of study, study design, sample, intervention, duration of study, MS model (for animal studies), measurements/outcomes, statistical methods, and power (for human studies). The study measurement/outcomes extracted included clinical parameters of MS/EAE severity and progression, immune and metabolic indices, microbiome and metabolome parameters, and mechanistic or correlative findings.

Evaluating quality and evidence of included studies
Included studies were subject to quality and risk of bias (ROB) assessments using the Cochrane ROB tool 39 for human studies and SYRCLE tool 40 for animal studies. High quality was assigned to studies with a low ROB, including randomized controlled trials (RCTs) and animal studies that explicitly stated using randomization and blinding measures. Medium quality was assigned to studies with an uncertain ROB, including non-RCT human studies and animal studies that did not explicitly state using randomization and/or blinding measures. Low quality was assigned to studies with a high ROB, including studies with considerable confounding, in addition to not explicitly stating the use of randomization or blinding. These quality assessments were factored into the summative evaluation of each bacterial therapy; therefore, no studies were excluded from analysis on the basis of ROB.
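The quality tiers described above can be illustrated with a minimal Python sketch. This is a simplification for illustration only, not the procedure used in this review: the actual assessments relied on the full Cochrane ROB and SYRCLE tools, whereas the sketch reduces each study to a few yes/no design features. The `Study` record, its field names, and the handling of a rigorous design that nonetheless shows considerable confounding (assigned "medium" here) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Hypothetical, minimal record of the design features used for tiering."""
    species: str                          # "human" or "animal"
    is_rct: bool = False                  # only meaningful for human studies
    states_randomization: bool = False    # explicitly stated in the paper
    states_blinding: bool = False         # explicitly stated in the paper
    considerable_confounding: bool = False


def quality_tier(study: Study) -> str:
    """Map design features to a quality tier (simplified, assumed rule)."""
    if study.species == "human":
        rigorous = study.is_rct
    else:
        rigorous = study.states_randomization and study.states_blinding

    if rigorous and not study.considerable_confounding:
        return "high"      # low risk of bias
    if study.considerable_confounding and not rigorous:
        return "low"       # high risk of bias
    # Assumption: everything else is treated as uncertain risk of bias.
    return "medium"
```

Under these assumptions, a human RCT without notable confounding maps to "high", an animal study that reports randomization but not blinding maps to "medium", and a non-randomized human study with considerable confounding maps to "low".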
The overall quality and strength of therapeutic evidence provided by each bacterial formulation was assessed using the Bradford Hill criteria (BHC), which include the following: temporal relationship, strength of relationship, dose-response relationship, replication of findings, biological plausibility, cessation of exposure, specificity of association, and coherence between multiple approaches. 41,42 The descriptions and numerical designations of each BHC can be found in Table 1. Sufficient evidence (Yes or No) was determined for each criterion (except for replication, see below) and assigned a score of 1 for yes, followed by summation across all criteria to yield a final "BH score" for each therapy. Replication of findings was the most heavily weighted criterion, and was scored as follows: 3 = replicated in human and animal studies, 2 = replicated by different groups, 1 = replicated by the same group, 0 = not replicated, −2 = conflicting findings (not considering lack of effect as conflicting with positive). The calculations are detailed in Supplemental File 2 and summarized in Table 5.
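The scoring rule above can be sketched in a few lines of Python. This is an illustrative sketch rather than the calculation in Supplemental File 2: the function name, the replication labels, and the criterion numbering (assumed here to follow the order in which the criteria are listed above, making replication criterion 4) are assumptions; Table 1 holds the actual numerical designations.

```python
from typing import Iterable

# Points for the replication criterion, as described in the text.
REPLICATION_POINTS = {
    "human_and_animal": 3,   # replicated in human and animal studies
    "different_groups": 2,   # replicated by different groups
    "same_group": 1,         # replicated by the same group
    "not_replicated": 0,
    "conflicting": -2,       # conflicting findings
}

# Assumed numbering: 1 temporal, 2 strength, 3 dose-response,
# 4 replication, 5 plausibility, 6 cessation, 7 specificity, 8 coherence.
YES_NO_CRITERIA = {1, 2, 3, 5, 6, 7, 8}


def bh_score(criteria_met: Iterable[int], replication: str) -> int:
    """Sum 1 point per satisfied yes/no criterion plus the replication points."""
    met = set(criteria_met)
    unknown = met - YES_NO_CRITERIA
    if unknown:
        raise ValueError(f"not a yes/no criterion: {sorted(unknown)}")
    return len(met) + REPLICATION_POINTS[replication]
```

For example, a formulation satisfying six of the seven yes/no criteria and replicated in both human and animal studies would score 6 + 3 = 9, while one satisfying three yes/no criteria but showing conflicting findings would score 3 − 2 = 1.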

A. Study characteristics
The study characteristics and major findings of the included studies are summarized in Tables 1-3. A total of 770 de-duplicated articles were found by the initial search, 55 additional articles were found by the second search, and one article 76 was found by an author (DK) outside of the search strategy. A total of 37 studies 35,37,38,43-76 (6 human, 31 animal) were included for analysis in this review based on the stated selection criteria (see Methods) (Figure 1; Supp. File 1B). Of these 37 studies, 28 (6 human, 22 animal) investigated the effects of probiotic therapy and 9 (0 human, 9 animal) utilized commensal therapy. Studies were conducted between 1998 and 2020 in the following countries: USA, Iran, Japan, England, Netherlands, Spain, Russia, Italy, Sweden, France, China, and Republic of Korea.

Risk of bias
Overall, 10 studies (4 human, 6 animal) were deemed "high quality" based on the low risk of bias determined using the Cochrane ROB and SYRCLE tools (Supp. File 1D and 1F). Most (n = 24) of the studies (0 human, 24 animal) were classified as "medium quality" due to study design limitations or failure to disclose randomization and/or blinding efforts. The remaining studies (n = 3) were classified as "low quality," including two human studies that had important baseline characteristic differences between groups, had substantial risk of confounding due to concurrent use of a DMT (glatiramer acetate), and compared the results to healthy controls rather than untreated MS patient controls; 46,47 and one animal study that was not powered to perform statistical analysis, lacked clarity regarding the control groups, and did not explicitly state the use of randomization or blinding measures. 72

Subjects
Human subjects were studied exclusively in the RRMS stage. Expanded disability status scores
All findings are reported with respect to control group(s) unless otherwise indicated.

Measurements and outcomes
The most common measurements of clinical parameters in human studies were EDSS and mental health and quality of life assessments (Beck Depression Inventory (BDI), General Health Questionnaire-28 (GHQ-28), Depression Anxiety Stress Scale (DASS), Fatigue Severity Scale (FSS), McGill Pain Questionnaire (MPQ)). Notably, none of the human studies assessed MRI lesions. For animal studies, the most common measurements were clinical signs of motor disability and associated quantitative variables (EAE incidence, onset, duration, and clinical scores; motor function, coordination, and activity for other MS models), histopathology (demyelination, CNS infiltration), BBB and intestinal permeability, and weight loss. Cytokine analysis, oxidative stress/antioxidant markers, and immunophenotyping were the most commonly measured immune/metabolic indices. Microbiome and metabolome parameters were primarily assessed using fecal microbiome analysis and fecal/serum short-chain fatty acid (SCFA) production. Gene expression and adoptive transfer experiments comprised additional mechanistic and correlative findings.

Major trends
The major findings of each probiotic therapy study included can be found in Tables 2 and 3 for human and animal studies, respectively. A qualitative summary of these studies is provided below, followed by a semi-quantitative ranked evaluation using BH criteria.

Clinical studies
Four of the human probiotic therapy studies included were double-blind, placebo-controlled RCTs, and thus all four were classified as "high quality" studies. 60,61,63,64 Probiotic therapy produced modest decreases in EDSS that, while sometimes statistically significant, were not found to be clinically significant based on the authors' designation of an EDSS change of ≥1.0 point for levels less than 5.5 or ≥0.5 point for levels greater than 5.5. 60,64 The impact on EDSS seemed more pronounced in the shorter, 12-week study, suggesting that the observed benefits may only be transient. 60 Probiotic therapy did, however, lead to marked improvements in quality of life as measured through the BDI, GHQ-28, DASS, FSS, and MPQ assessments. 60,63,64 The proinflammatory cytokines that were measured (IL-6, IL-8, and TNFα) were consistently reduced in the probiotic treatment groups, as were several oxidative stress markers (hs-CRP, MDA). 60,61,63,64 Anti-inflammatory cytokines and antioxidants were also measured, showing elevated IL-10 and plasma nitric oxide. 60,64 Two additional prospective cohort studies used therapy with the probiotic mixture VSL#3. 46,47 While neither study focused on clinical outcome, both found that VSL#3 elicited changes in the peripheral immune response consistent with an immune regulatory state, including phenotypic changes in monocytes and dendritic cells and decreased expression of the MS risk allele HLA-DQA1. Additionally, these studies found an increased relative abundance of Lactobacillus, Bifidobacterium, and Streptococcus spp. in stool, which is consistent with the species that were administered in probiotic form in the VSL#3 formulation. Further, one study found an increased relative abundance of Collinsella and Veillonellaceae family members that are typically depleted in MS gut microbiomes, as well as a decreased relative abundance of Akkermansia, Blautia, and Dorea genera, which are typically enriched. 46 Notably, these differences display an inverse relationship in MS patient cohort gut microbiomes, suggesting that VSL#3 may restore the MS-dysbiotic state.
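The EDSS clinical-significance threshold cited above can be written as a short Python sketch. The function name is hypothetical, and two points are assumptions rather than statements from the cited studies: that "levels" refers to the baseline EDSS, and that a baseline of exactly 5.5, which the stated rule does not cover, falls under the 0.5-point threshold.

```python
def is_clinically_significant(baseline_edss: float, followup_edss: float) -> bool:
    """Return True if the EDSS change meets the stated threshold.

    Threshold: >= 1.0 point when baseline is below 5.5, >= 0.5 point otherwise.
    Assumption: a baseline of exactly 5.5 uses the 0.5-point threshold.
    """
    threshold = 1.0 if baseline_edss < 5.5 else 0.5
    # EDSS moves in 0.5-point steps, so this comparison is exact in floating point.
    return abs(followup_edss - baseline_edss) >= threshold
```

Under this rule, a drop from 3.0 to 2.5 would not count as clinically significant even if it reached statistical significance, whereas a drop from 6.0 to 5.5 would.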

Preclinical studies
The majority (26 out of 30) of animal studies used the EAE model. This is an important consideration, since EAE is driven by an autoimmune response; immunomodulation is thus the most likely mode of action for any effects on clinical disease. More studies investigated the effects of Lactobacillus spp. 48 49 and the combinations of L. crispatus and L. rhamnosus 53 and B. animalis subsp. lactis strains 53 were able to elicit clinical benefits whether administered live or heat-killed, while Lacto-mix 48 was only effective when live probiotic organisms were used. Five studies demonstrated a dose-response relationship, 44,48,50,53,54 and three studies provided evidence for the combinatorial effects of probiotics. 43,45,69 Only ~32% of the animal studies (n = 22) reported primarily no effect or exacerbation of clinical disease. 51,59,68,[72][73][74][75] It should also be noted that only four of the studies were classified as high quality, all of which reported positive results with probiotics. 44,45,50,53 The majority of studies reported favorable secondary immunological findings, with elevated levels of anti-inflammatory cytokines (IL-10, IL-4, TGFβ) 43 49 and TH1 and TH17 cells. 43,54,55,67,69,74,75 Multiple Lactobacillus spp. and probiotic combinations also demonstrated decreased antigen-specific T cell proliferation. 44,48,50,53,69 Putatively beneficial microbiome changes included an increased relative abundance of Firmicutes, Bacteroidetes, and Proteobacteria phyla and Sutterella, Bifidobacterium, Streptococcus, Lactobacillus, and Prevotella spp. 43,44,75 Two studies measured SCFA production and found increased levels in both serum and feces. 43,45 Four studies reported increased expression of TH2 and Treg regulators (GATA3, Foxp3), 50 miR-25, 71 antimicrobial peptides (Reg3g, Reg3b), 55 and tight junction proteins (Claudin-8, ZO-1); 55 and decreased expression of TH1 and TH17 regulators (Tbet, RORγt), 44,50 miR-155, 71 and the IDO gene, 71 a potential marker of MS/EAE relapses. Furthermore, Lavasani et al. performed an adoptive transfer experiment of CD4+CD25+ T cells from mesenteric lymph nodes of the probiotic Lacto-mix group and found that the recipient mice had suppressed EAE symptoms and elevated IL-10 levels. 48 These effects, however, were eliminated when tested in IL-10-deficient mice. 48 Notably, Sanchez et al. also included an adoptive transfer experiment of splenic and mesenteric lymph node leukocytes of heat-killed L. paracasei-treated donor mice into recipient mice and found no such effects. 49 Additional mechanistic findings included decreased intestinal barrier permeability 55 and oligodendrocyte differentiation enhancement. 45

BHC scores and rankings
The BH score calculations and findings for probiotic therapy are detailed in Supplemental File 2, and summarized in Table 5. Given the large number of studies and treatments, we do not discuss each individually, but instead highlight and contrast some of the key findings below.
One probiotic treatment approach emerged as the most strongly supported (BH score = 9), namely the VSL#3 multi-species formulation, which was assessed in two human and three animal studies. Two out of three animal studies reported significant clinical improvement in the EAE model. The third study used the cuprizone demyelination model and reported a lack of clinical improvement, but some favorable histologic changes. The human studies did not measure clinical parameters, but reported immunological and microbiological changes that would be consistent with favorable immune modulation. Hence, this particular approach satisfied seven out of eight BH criteria (BHC #1, 3-8), with high evidence for replication (Table 5). In contrast, another combination treatment (L. acidophilus, L. casei, L. fermentum, and B. bifidum) was used in two high-quality human RCTs, but lacked supporting mechanistic and/or animal model studies (BH score = 5; Table 5). Additional promising probiotic treatments with high BH scores included B. animalis, L. paracasei, and E. coli Nissle 1917, each receiving a BH score of 7; and L. plantarum, Lacto-mix, L. crispatus & L. rhamnosus, and the B. animalis combination therapy, all of which received a BH score of 6.
The majority of the remaining microbial treatments were characterized by low BH scores, resulting from a paucity of studies, lack of mechanistic evidence, and/or presence of conflicting evidence. The latter is exemplified by L. casei (BH score = 1), which was examined in six animal studies, but showed evidence of disease exacerbation or lack of effect in four of those studies, resulting in a deduction of 2 points for BHC #4. We note that this interpretation is confounded by the fact that different strains/isolates were used across these different studies, highlighting the need for careful standardization and interpretation.

BHC deficiencies
Using the Bradford Hill criteria, several therapies had considerable evidence for strength of relationship, dose-response relationship, biological plausibility, and coherence. Future studies should focus on strengthening these areas further by investigating dosing effects, establishing more direct evidence of MGBA involvement with more probiotic organisms and combinations, addressing alternative explanations, and repeating interventions in various contexts. Additionally, more evidence is needed to fulfill the remaining Bradford Hill criteria categories (BHC #s 1, 4, 6, and 7), including more before-and-after analyses, using a standardized protocol to facilitate comparisons across studies and research groups, and investigating cessation effects (Table 6). Strengthening the specificity of association by comparing the effects of live versus heat-killed organisms and their soluble products, and the inclusion of more mechanistic experiments is also recommended. Future studies and reviews should also consider the taxonomic reclassification of Lactobacillus spp. when referring to those probiotic organisms. 77 Lastly, to move toward translational application, wherein probiotics are stringently defined as conferring a known benefit to human health, disease-specific usage should be assessed in well-powered RCTs to provide clinically relevant guidance. Notably, a defined benefit to MS patient health should not be limited to clinical outcome, but also include secondary parameters such as quality of life, since it is plausible that probiotic therapy may improve the well-known GI-associated MS symptomatology (e.g. constipation) rather than affecting overall disease progression directly; and mental health, since depression has been identified as a risk factor for RRMS disability and relapses. 78,79,80

Major trends
The major findings for each of the commensal therapy animal studies included in this review can be found in Table 4. A qualitative summary of these studies is provided below, followed by a semi-quantitative ranked evaluation using BH criteria.
No human studies were identified for commensal therapy in this review, so the trends below are limited to preclinical findings. All but two (L. reuteri 37,38 and Allobaculum 38) of the commensal organisms studied among the nine studies were shown to delay EAE onset and decrease clinical scores, incidence, inflammatory CNS infiltration, and demyelination. P. histicola was one of two commensals represented in more than one study and exhibited positive outcomes in each. [56][57][58] One study also reported reduced astrocytosis and microglial activation in the brain and spinal cord of P. histicola-treated mice, 58 while a sister study found that P. histicola helped to strengthen the MGBA by decreasing BBB permeability and restoring gut permeability. 56 Similar to probiotic therapy, reduced proinflammatory and increased anti-inflammatory immune responses were observed in each of the commensal studies. Specifically, studies found decreased IL-17- and IFN-γ-producing CD4+ T cells, 56,58,65 TH17 cells, 65 and IL-17, 56,62,70 IFN-γ, 56,70 IL-23, 56 and IL-12 70 cytokines; and increased IL-10, 56,62,70 TGF-β, 56 and Tregs. [56][57][58]65 These results were consistent across all three P. histicola treatments. 56,57,58 L. reuteri was also represented in multiple studies and was shown to exacerbate EAE in both, either when administered alone (in the context of a normal microbiome) 37 or in combination with Allobaculum (in a dual-colonization gnotobiotic model). 38 For studies that analyzed the microbiome, commensal therapy groups had a general microbiome shift toward pre-EAE states following treatment, including an increased relative abundance of Bacteroidetes, Firmicutes, Prevotella spp., and Lactobacillus spp. 56,57,65 Mechanistically, the adoptive transfer of splenocytes from P. histicola-treated mice led to decreased EAE incidence in recipient mice. 56 Similarly, the adoptive transfer of FoxP3+ cells from wild-type B. fragilis-treated mice resulted in decreased EAE clinical scores and increased levels of IL-10 in recipient mice. 62 These findings were not observed in the recipient mice receiving cells from polysaccharide A (PSA)-deficient B. fragilis-treated mice, suggesting that PSA is requisite for EAE protection. Separately, treatment with C. butyricum was reported to suppress phosphorylation of p38 MAPK and JNK signaling pathways (which are typically elevated in EAE) in the spinal cords of mice. 65 One commensal was also found to be at least as effective as two different DMTs (glatiramer acetate 57 and IFNβ 58).

BHC scores and rankings
The BH score calculations and findings for commensal therapy are detailed in Supplemental File 2, and summarized in Table 5.
Treatment with P. histicola had fairly strong evidence (BH score = 7), but fell short across several BH categories (BHC #s 4 and 6), as it lacked human studies and replication by independent groups (Table 5). Another promising treatment with a high BH score was B. fragilis (BH score = 5), which scored points in BHC # 1, 2, 5, 7, and 8 owing to an adoptive transfer experiment, but lacked replication and evidence of dose-response and cessation effects. The remaining treatments were characterized by low BH scores (ranging 1-4) comprised of points in BHC # 1, 2, 3, and/or 5, once again resulting from a paucity of studies, lack of mechanistic evidence, and/or presence of conflicting evidence, as observed with L. reuteri (BH score = 1), which was found to exacerbate the disease in three of the five studies, leading to a 2-point deduction for BHC #4. As noted with L. casei for probiotic therapy, this interpretation is confounded by the use of different strains/isolates and modes of treatment (stable commensal colonization vs. daily gavage) across studies and would benefit from careful standardization (Table 5).

BHC deficiencies
Using the Bradford Hill criteria, commensal therapy had strong evidence for temporal relationship, specificity of association, and biological plausibility, but was lacking in the remaining categories (BHC #s 2-4, 6, and 8; Table 6). Additional studies replicating current findings and testing more commensal organisms and combinations should be the main focus of future studies, since there were few commensal therapy studies overall and only P. histicola and L. reuteri were used in more than one study. Future studies should also focus on confirming colonization of the commensal organisms in the gut to reduce confounding and strengthen the specificity of association (BHC #7). Testing the effects of live versus heat-killed organisms and their products should also be prioritized, as these findings may contribute to elucidating the underlying therapeutic mechanisms. For instance, subsequent studies of B. fragilis by the same research group utilized only the B. fragilis PSA symbiosis factor rather than administering live, wild-type B. fragilis and found similar reductions in EAE severity, as well as protection against EAE demyelination and inflammatory responses, providing a key molecular mechanism in support of the action of the live bacterium. [81][82][83][84] Other recommendations reflect those of probiotic therapy, namely controlling for alternative explanations, supporting immunological and microbiological findings with mechanistic experiments, and adding standardization to promote study design consistency and ease of comparison across studies.

Discussion
The purpose of this comprehensive review was to compile, summarize, and systematically rank the current evidence for probiotic and commensal therapeutic efficacy in MS and its preclinical models in an effort to identify weaker areas that should be addressed in future studies. A total of 37 studies were evaluated, including 28 for probiotic therapy and 9 for commensal therapy. The probiotic formulations VSL#3 (BH score = 9), B. animalis, L. paracasei, and E. coli Nissle 1917 (BH scores = 7) ranked highest due to their fulfillment of at least six of the eight Bradford Hill criteria. For commensal therapywhich suffered from a complete absence of clinical studies -the highest rankings went to P. histicola (BH score = 7) and B. fragilis (BH score = 5).
Animal studies demonstrated generally higher efficacy for reducing disease severity and progression with probiotic therapy than did the human studies, which is not unexpected given the known shortcomings of the animal models and the expected difficulties in translating basic science findings into therapy. The disconnect between human and animal studies could also be due to differences in the type and extent of the clinical markers measured, as clinical studies only measured MS severity through EDSS and questionnaires, while the preclinical studies were able to investigate EAE and the other MS models more comprehensively. MRI evaluation is a powerful, unbiased, and quantitative surrogate for MS severity and progression, and this was conspicuously lacking in the human studies. Additionally, human studies were in all likelihood underpowered to detect potentially subtle effects of probiotic treatments, and confounded by multiple environmental variables (e.g. diet, host baseline differences) that are impossible to control in this setting. Replication of findings (BHC #4) was one of the most deficient Bradford Hill criteria across studies, with only 25% (n = 28) of the formulations receiving points and two of them (L. reuteri and L. casei) losing points. Another almost uniformly unfulfilled criterion was cessation of exposure, which was addressed by only six formulations. Both probiotic and commensal therapies would benefit from additional replication, testing of more organism combinations, improved mechanistic evidence, comparison of live versus heat-killed organisms and their soluble products, and protocol standardization to enable improved contextual comparison across studies.

Study limitations
There were several limitations to accurately assessing the efficacies of both therapies. First, there was widespread study design variability in the species and/or strain, dosage, duration of intervention, timeline, and sample characteristics. These variations would have been beneficial to external validation if the same strains were used across studies, but instead posed a challenge for assessing therapeutic utility. Standardized protocols outlining the optimal dosage, timeline, and duration for different organisms would be helpful for mitigating this issue, as was the focus of a review on probiotic therapy that concluded 10^9 CFU for 8-12 weeks duration produced the most favorable results. 31 Another study design issue was exclusion criteria and control of confounding variables. Some of the human studies did not account for diet or stress, which can alter gut microbial composition and subsequently influence MGBA interactions, or for concurrent DMT use, which could overshadow the true therapeutic efficacy if synergism exists between the two. 34 Additionally, genetic variability was mostly unaccounted for in both human and animal studies (since the latter for the most part used a single strain of mouse). Furthermore, none of the human studies were conducted long enough to span the average remission period of 12-18 months, so the true impact of each therapy on reducing the severity of MS cannot be revealed with certainty. 1,85 As for animal studies, none of these can accurately capture the complexity of spontaneous MS and its various forms in humans. [86][87][88] Only five studies tested the effects of live versus heat-killed organisms or their products. 48,49,53,56,62 This distinction is important, since equivalent efficacy with heat-killed organisms would help to reduce any associated risks of therapy posed by live microbiota and likely improve treatment uptake and adherence in patients. Furthermore, this effect likely differs across organisms.
For instance, L. paracasei, 49 a combination of L. crispatus and L. rhamnosus, 53 and a combination of two B. animalis subsp. lactis strains 53 did not need to be viable for EAE suppression, while P. histicola 56 and Lacto-mix 48 did. Additionally, protection from EAE elicited by B. fragilis required the expression of PSA by this bacterium, indicating that this bacterial product alone can play an important role in EAE protection. 62 Indeed, follow-up studies confirmed that live B. fragilis is not required, while PSA is sufficient to elicit a therapeutic effect. [81][82][83][84] Future studies should prioritize these distinctions to help optimize efficacy and therapeutic success.
Another limitation of this review was the quality and risk of bias for the studies included. Most of the animal studies included were classified as "medium quality" with "uncertain" bias due to the lack of explicitly stated randomization and blinding measures. A lack of randomization can subject the results to inadvertent confounding and chance findings. 39,40 Animals that live together in the same cage or area of a room may be more similar to each other than to animals in a different cage or area, importantly including in the basal composition of their gut microbiomes. Additionally, a lack of blinding can introduce both performance and detection bias, as a researcher's or caretaker's knowledge of the treatment group can cause them to subconsciously act differently toward one group, such as providing extra care to sicker animals. 39,40 It is entirely possible that these studies did in fact incorporate these measures into their protocols, but because the measures were not stated, they were considered absent for this review.
Other issues regarding quality include the premise that the positive findings observed across studies may simply reflect consistency of confounding variables and/or publication bias, rather than the therapy itself.

Review limitations
As for the risk of bias for this review, there were several methodological flaws in the assessment of therapeutic efficacy. Personal judgment was required for assessing each therapy's fulfillment of the Bradford Hill criteria and there were no pre-defined guidelines as to what should be considered "sufficient evidence." The scoring system was implemented as an arbitrary method to facilitate comparison of the efficacies in the context of the recommendation criterion and was not intended to be a comprehensive assessment of the therapies. Regardless, the aim of this review was to highlight areas within each therapy that should be strengthened in future studies, so the risk of bias in this sense seems low. Separately, a few of the Bradford Hill criteria may be less important for establishing efficacy, causing a therapy to be penalized for lacking evidence in a non-applicable category. For example, a threshold effect rather than a dose-response relationship might be necessary for observing beneficial effects. Cessation of exposure may also not be necessary to demonstrate, since these therapies would theoretically be lifelong, as is the case for DMTs. Accordingly, another avenue for future research could be establishing a minimum set of probiotic-specific criteria for comprehensively evaluating therapeutic strategies in both animal and human studies.
Other limitations of this review were related to the search and screening process for identifying relevant studies. First, the use of an English language filter in our search strategy imposed obvious restrictions on the number of studies included and extent of available evidence. Second, the operational definitions we established limited the scope of probiotic and commensal therapies to only live or heat-killed bacteria, excluding any mechanistic evidence that may have been generated in studies that used only probiotic/commensal strain-soluble products. The BH score for B. fragilis, for example, could have been improved had the follow-up studies that focused on the B. fragilis PSA symbiosis factor been eligible for inclusion. [81][82][83][84] Lastly, we did not contact the authors of studies classified as medium or low quality for clarification of missing methodological data (i.e. randomization, blinding). Attaining such information could have altered our quality assessments and evaluations for BHC #2. Regardless of these limitations, this review was intended as a resource to guide and optimize future probiotic and commensal therapy studies by highlighting both emerging therapies and study shortcomings, rather than to firmly conclude the therapeutic utility of specific formulations.

Conclusion
In this comprehensive review, we used a Bradford Hill criteria scoring approach to provide a multi-parameter assessment and ranking of evidence for specific gut microbial therapies, with the overall goal of identifying and highlighting areas of need for future research (see Tables 5 and 6). Several formulations emerged as having the most promise, including VSL#3, B. animalis, L. paracasei, and E. coli Nissle 1917 for probiotics; and P. histicola and B. fragilis for commensals. However, many other therapies fell short across a number of criteria, notably replication of findings. Other Bradford Hill criteria lacking evidence were temporal relationship and specificity of association for probiotic therapy, and strength of relationship, dose-response relationship, and coherence for commensal therapy. Future studies should prioritize addressing these shortcomings through better control of confounding, supporting immunological and microbiological findings with mechanistic experiments, improved standardization of protocols and therapeutic formulations, and the other suggestions discussed in this review. Focusing on these areas is necessary to make progress toward clinical implementation, since cheaper, safer, and more durable treatments for MS are in demand.