Guidelines for reporting on animal fecal transplantation (GRAFT) studies: recommendations from a systematic review of murine transplantation protocols

ABSTRACT Fecal microbiota transplant (FMT) is a powerful tool used to connect changes in gut microbial composition with a variety of disease states and pathologies. While FMT enables potential causal relationships to be identified, the experimental details reported in preclinical FMT protocols are highly inconsistent and/or incomplete. This limitation reflects a current lack of authoritative guidance on reporting standards that would facilitate replication efforts and ultimately reproducible science. We therefore systematically reviewed all FMT protocols used in mouse models with the goal of formulating recommendations on the reporting of preclinical FMT protocols. Search strategies were applied across three databases (PubMed, EMBASE, and Ovid Medline) until June 30, 2020. Data related to donor attributes, stool collection, processing/storage, recipient preparation, administration, and quality control were extracted. A total of 1753 papers were identified, with 241 identified for data extraction and analysis. Of the papers included, 92.5% reported a positive outcome with FMT intervention. However, the vast majority of studies failed to address core methodological aspects including the use of anaerobic conditions (91.7% of papers lacked information), storage (49.4%), homogenization (33.6%), concentration (31.5%), volume (19.9%) and administration route (5.3%). To address these reporting limitations, we developed theGuidelines for Reporting Animal Fecal Transplant (GRAFT) that guide reporting standards for preclinical FMT. The GRAFT recommendations will enable robust reporting of preclinical FMT design, and facilitate high-quality peer review, improving the rigor and translation of knowledge gained through preclinical FMT studies.


Introduction
The collection of microorganisms in the gastrointestinal tract, termed the gut microbiota, is growing in appreciation for its dynamic regulation of host function and disease. While large-scale sequencing studies have provided unprecedented insight into the range of conditions associated with the microbiome, they are unable to provide conclusive evidence for how the microbiota causally contributes to disease and how it can be exploited to modify disease risk or progression. 1 Fecal microbiota transplantation (FMT) is a powerful technique in which the microbial community is transferred from a donor to a recipient host, transferring a unique microbial enterotype to prevent, treat or (preclinically) induce disease, or modulate host physiology. Clinically, FMT is now second-line therapy for antibiotic-resistant Clostridioides difficile (C. difficile) and its scope is expanding. 2 Indeed, there is a growing list of emerging indications under investigation in a variety of preclinical models and pilot cohorts. 3,4 In addition to its therapeutic application, preclinical FMT is increasingly used to dissect causal microbiotadependent mechanisms and understand how unique microbial profiles dictate disease risk.
Although a powerful technique, the regulatory landscape for clinical use of FMT is challenging, largely due to the ambiguities regarding its classification (i.e. biological product equivalent to blood or organ versus drug). 5 Despite this, there are clear recommendations and guidelines for FMT preparation, administration, and quality control when used in human recipients. 6 In contrast, preclinical FMT protocols vary enormously, as recently highlighted, 7 with little to no recommendations on best practice and reporting standards. This profoundly hinders the ability to interpret and replicate preclinical FMT studies and the inconsistent application of experimental approaches compromises clinical translation.
The need for better guidance of preclinical FMT protocols is underscored by the additional layers of complexity that are introduced in a preclinical setting. For example, experimental design, preparation and administration are complicated by the coprophagic nature of rodents. While some studies have exploited this behavior (co-housing to induce microbial transfer), 8 there is significant variability in how this technique is applied and the omission of key methodological detail hinders experimental replication, thus undermining subsequent translation. 9 Similarly, while bowel preparation is recommended for colonoscopically administered FMT in humans, the necessity for an appropriate equivalent in recipient rodents remains unclear.
Germ-free (GF, i.e. those without any resident microorganisms) mice have often been used as recipients in FMT models, as their lack of existing gut microbiota represents a highly effective 'takeup' of the donor FMT. However, as previously highlighted, 10 barriers related to cost and logistics have prevented widespread use of this model, and concerns regarding how closely they mimic normal immune development have plagued interpretation of results generated. 11 Therefore, antibioticinduced depletion of the microbiota has become common practice to ablate the microbial community of the gut. However, there are vast differences in the antibiotic treatment specifications used in animal models, including FMT. These include type, dose and duration, which can introduce significant variability in ablative capacity, with persisting pathogens confounding results (for a comprehensive review of this topic, please see Gheorghe et al. 12 ).
While antibiotic treatment represents a particularly common area of variability, in reality, each step of preclinical FMT protocols can introduce bias. This was recently highlighted by Walter et al. (2020) who identified that 95% of preclinical FMT studies reported successful transfer of the clinical phenotype to the recipient rodenta figure deemed implausible by the authors. 13 While these findings also reflect publication biases, they underscore the need to advocate for standardization of approaches for preclinical FMT when inferring causality to prevent unrealistic expectations that may undermine the credibility of microbiome science and delay its translation.
A key element of this enhanced rigor must be clarity in the methodological standards and reporting to improve consistency and transparency within the field, both of which will strengthen the reproducibility of findings. As such, we systematically reviewed published literature on preclinical FMT use in mice to provide a snapshot of current reporting patterns and, in collaboration with key microbiome research sites and networks, developed a set of minimum reporting guidelines for future preclinical FMT studies.

Study Selection
Of 1753 screened studies, a total of 241 were included. One thousand one hundred and ninetysix were screened via title and abstract, with 728 excluded as not relevant. Four hundred and sixtyeight full-text articles were assessed. Two hundred and twenty-seven were excluded at the full-text stage ( Figure 1).

Study Characteristics
We studied papers evaluating FMT across a range of indications ( Figure 2a). The most common area of investigation was metabolism/metabolic disease, accounting for 23.7% of all papers reviewed. Other areas of investigation included infectious diseases (15.4%), gastroenterology/ inflammatory bowel disease (14.9%) and cognition/behavior/affect disorders (6.6%).
Studies ranged in the sample size used per experimental group (median [range]:8 ), reflecting varying power requirements for specific models. Disappointingly, 21.4% of the eligible studies did not clearly state the sample size of the recipient group. While the sample size for the donor FMT group was not extracted in our analysis due to low levels of reporting, it is also important to acknowledge that this should also be clearly reported alongside information on whether FMT contents are pooled across multiple donor samples. Donor sample size is particularly pertinent in the use of human donors, as recently outlined by Gheorghe et al. with the use of a single donor considered N = 1 12 .

Collection, processing and storage
There are several aspects of FMT preparation that must be acknowledged and highly protocolized for rigorous results: collection of donor stool, processing and storage. The vast majority of papers used fecal pellets to prepare the FMT product (73.0%; Figure 2b), with the remaining using cecal (12.9%) contents or other gastrointestinal products (e.g. duodenal aspirates and feces, mucosal scrapings, small and large intestinal contents collected from a culled mouse). A range of preparation techniques were used to produce the FMT, including filtrates, supernatants, and slurries. In the papers reviewed, a fecal slurry was most commonly used (50.6%; Figure 2c), with supernatants and filtered products used in 33.6% and 9.9% of papers, respectively.
In any FMT preparation, the vehicle/diluent must be carefully considered. In the studies included for analysis, phosphate buffered saline (PBS) was the most common solution (80.5%; Figure 2d), with a small number of studies including additives to the PBS (glycerol, cysteine hydrochloride) to improve microbial viability. The concentration of cysteine-hydrochloride was consistent across all studies (0.05%), whilst the concentration of glycerol ranged from 5% to 50%. While only 7.5% of studies failed to report the vehicle solution used for FMT preparation, 91.7% of studies failed to report whether this solution was reduced (i.e. de-oxygenated) or if the FMT was prepared under anaerobic conditions.
The high number of studies that failed to explicitly state whether FMT was prepared under anaerobic conditions is concerning as it has been reported that FMT prepared under aerobic conditions profoundly decreases microbial viability, altering microbial metabolite synthesis and abundance of many anaerobic commensals. [14][15][16] Similarly, the way in which the fecal/cecal contents were processed was poorly reported, with 33.6% of studies failing to report on homogenization. This methodological step was generally reported with limited detail, using broad terminology such as "dissolved", "mixed" or "suspended" (Figure 2e), with only 12% of studies providing sufficient detail for replication of the homogenization step. A similar observation was made for filtration methods used when preparing supernatants or bacterial preparations, with 31.5% of studies failing to report on any filtration or "clean up" steps. For clarity and replication, manual filtration should be defined by the size of the strainer used and centrifugation defined using standard metrics (x g, min, o C).
Once processed, the final FMT product can and should be quantified in terms of its concentration. Strikingly, 31.5% of studies failed to report a concentration, with the remaining studies using a wide range of units, including mg/ml (63.0%), pellets/ml (13.9%) and CFU/ml (9.6%). While we do not intend on recommending a specific unit to define concentration, it is critical that the final FMT product is defined in a standard unit of  measurement that can be replicated by others. Studies reporting pellets (but no volume) or milliliters (but no weight) were deemed irreproducible.
The final FMT product can then be used immediately (fresh) or stored and used at a later date. As such, clarity on this methodological detail must be clearly provided particularly in light of the evidence that shows storage conditions impact microbial preservation and viability. 17,18 Of the papers included in our analysis, 31.9% administered freshly prepared (i.e. not stored) FMT. A number of papers (18.7%) noted that the FMT product was frozen (−20°C and −80°C) prior to administration; however, close to half of the papers (49.4%) did not report any methodological detail on storage conditions (figure 2f).

Recipient preparation and FMT administration
Once the FMT has been prepared, there are many considerations in its administration related to both the product itself and the recipient, including typical reporting standards related to husbandry. Of the studies included, 30.2% failed to provide any detail on animal housing conditions (i.e. single vs co-housed). Given the coprophagic nature of rodents, it is critical that this be clearly reported in all studies in which FMT is used to acknowledge/exclude potential confounding impacts.
We also investigated how, if at all, recipient mice were prepared for FMT. As suggested previously, 19 there is some evidence that bowel lavage or cleansing could improve FMT efficacy; however, these remain speculative and not widely recommended. Accordingly, very few studies (N = 3) included in our analysis reported bowel preparation procedures in recipient mice. One study fasted mice the night prior to FMT administration and two studies provided PEG3350 as a laxative beforehand. Antibiotic-induced depletion was used in 60.5% of studies, most commonly administered in drinking water (61.4%) for a median of 14 days [1-91 range] (Figure 3a). The most common combination was a cocktail of ampicillin, neomycin, vancomycin, and metronidazole (ANVM ; Table S1).
While antibiotic-induced depletion has, to date, been an area of critical methodological consideration in optimal FMT administration, it remains an area of contentious debate. In fact, increasing evidence suggests that antibiotic depletion may not be necessary for FMT uptake; albeit the evidence is conflicting. While Ji et al. (2017) reported great  FMT administration can be achieved via oral or rectal routes. It has been previously speculated that as oral gavage inoculum needs to pass through the acidic stomach environment, rectal administration may be more efficient. 21 However, a previous study of FMT in mice showed that specific pathogen-free mice treated with antibiotics and then orally or rectally inoculated with donor mice gut microbiota had no differences in microbial community after inoculation. 22 As such, while rectal administration is preferable in efficacy and safety outcomes for clinical FMT use, oral gavage is often selected in mouse studies, presumably due to convenience. 23 In line with these findings, the overwhelming majority of studies included in our analysis opted for oral administration (90.4%; Figure 2g) and only a small number used rectal administration (2.9%). Three studies reported alternative methods of administration, including directly pipetting into the oropharynx, 24 which can be used when oral gavage is not permitted (note that other methods including co-housing and vertical transfer (generational transfer between mothers and pups) were excluded). The route of administration was the most consistently reported aspect of FMT methodology; only 5.3% of papers either did not report or did not clearly state how their FMT product was administered.
Administration volume is also critical to FMT replication, with our analyses identifying a large range of volumes administered to recipient mice (median[range]: 200 µl ; Figure 3b).
Volume was not reported in 19.9% of studies included in our analysis. Similarly, the frequency of FMT (or absolute number of treatments) was not reported in 13.2% of studies. Of the studies that did report this metric, there was again a significant range (1-120 treatments) with a median of 5 FMT treatments (Figure 3c).
In administering the FMT, adequate control procedures must be implemented to account for the impact of the procedure. This can be achieved by administering an autologous FMT or one prepared using sham/control animals. Alternatively, the vehicle solution can be administered. Forty percent of papers included in our analysis failed to use a control arm or report on what their control animals received. Of the studies that did report this detail, 56% used FMT prepared from sham/control animals and 34% used the vehicle solution.

Quality control and uptake confirmation/durability
The success of FMT relies on a number of complex and interacting factors, but central to its general efficacy is its viability (after collection and processing) and uptake ("durability").
We defined quality control (QC) as analysis of the FMT product before administration to the recipient, i.e. to identify the presence of potential pathogens and confirm viability of the product. No information regarding quality control was reported in 88.4% of studies. Of the few studies that did include QC, 16S rRNA sequencing was the most commonly used technique (53.6%) followed by standard culture (32.1%). Given the inability of 16S rRNA sequencing to determine the viability of the microbial community, these findings underscore the need to implement standardized preclinical FMT guidelines to ensure appropriate QC is incorporated at project inception.
Confirming uptake of the FMT is also critical to its efficacy. Le Roy et al. (2018) defined the durability of the FMT procedure as: 1. Establishment of high levels of bacterial taxa from the inoculum in recipients, 2. Relative abundance of bacterial taxa as similar as possible in the inoculum and recipients, and 3. The removal of a high amount of endogenous bacterial taxa in non-GF recipients. 7 This can be determined by microbial analysis of the FMT inoculum, and gut microbial contents of the recipient both before and after FMT occurs.
Overall, explicit reference to durability assessment was lacking with microbial analyses often reported in the study but rarely compared between the FMT donor, product, and recipient. In fact, 22.1% of papers did not report or did not confirm uptake of the FMT in any way. Of the papers that did report, 16S rRNA sequencing was the primary method (86.9%) with other studies reporting culture-(6.0%) or PCR-based approaches (4.3%).

Reproducibility and rigor
A recent systematic review searched scientific literature for studies suggesting a causal relationship between an altered human microbiome and disease or physiological condition. 13 Of the papers meeting the inclusion criteria, all but two (95%), suggested that fecal transfer from diseased donors resulted in a disease phenotype. Due to the wide range and complexity of pathologies studied in these papers, the authors suggested that the causal claims seem unlikely across this wide range. Similarly, in our study, we found that 92.5% of papers showed that FMT had an effect. This may reflect publication bias -a tendency to favor positive findings for publication, or it could also suggest that the introduction of any new complex combination of microbes via FMT may cause a protective immune response in the gut, that manifests as a change in symptoms or disease severity. Thus demonstrating the wide variety of potential uses for FMT. Regardless, as suggested by Walter et al. (2020), microbiome science would benefit from increased rigor and critique. 13 A key part of this scientific rigor is transparent and reproducible methodology. 13,25 Throughout our analysis, we found that many methods described in published manuscripts did not have sufficient detail to be completely replicated. Therefore, we developed a reproducibility index containing 10 key aspects of FMT methodology and assigned a score from 0 to 1 for each variable, where 0 = not reported, 0.5 = reported with insufficient detail and 1 = reported with sufficient detail for replication ( Figure 4). The median total value of this index was 6.5, with 23.6% of papers gaining a total value of 5 or more. Two papers had a reproducibility index of 9.5/10 in our review, the highest observed. The first paper (Krisko et al., 2020) paid particular detail to the preparation of donor mice, and the use of anaerobic conditions in FMT preparation. 26 The second paper (Zhou et al., 2019) provided good detail about the amount, and concentration of the administered FMT product. 27 In addition, Foligne et al. clearly outlined group numbers of both donors and recipient mice. 28 While this measure provides an objective assessment of the level of detail in reporting, it is important to recognize that this should be interpreted with caution as the index is not validated. Thorough review of the literature yielded no appropriate method for assessing methodological reporting in this way, and as such, the reproducibility index was developed specifically for this study.

The GRAFT recommendations and future steps
Our systematic review revealed an overall lack of clarity in the reporting of FMT methods. In almost all variables we investigated, there was not only a lack of consistency in FMT protocols, but also a lack of clarity and detail in methodological reporting. For example, for FMT concentration, as well as the actual concentration ranging widely, 7 different units were used to report this key step in FMT preparation. These findings point to a lack of authoritative guidance on preclinical FMT studies for both authors and reviewers.
Due to the low level of detail found in many papers and the low mean score from our reproducibility index scoring, we suggest that a minimum set of reporting standards for preclinical FMT studies would be useful. As such, we present here the GRAFT (Guidelines for Reporting on Animal Fecal Transplantation) recommendations (Table 1, Figure 5) along with a simple checklist (File S1) that can be used at project inception/design, manuscript preparation, and review. By providing these recommendations, we hope to increase the transparency and reproducibility of preclinical FMT procedures, thus elevating their translational strength. While our systematic review intentionally restricted our search to studies with mice, we argue that, given the similarity in FMT procedures across species, the GRAFT recommendations can also be used to guide FMT use in other species, and may offer guidance in human-animal transplantation if followed in combination with the recommendations presented by Walter et al. (2020). 13 While these guidelines provide the much-needed structure for preclinical FMT protocols, it is critical to emphasize that we do not aim to recommend what methods should be used, as different experimental endpoints and research questions will clearly need a different methodological design (as previously discussed by Gheorghe et al. 12 ). However, by consistently reporting the following set of guidelines, future studies will be more reproducible and thus be more likely to generate clinically relevant outcomes. Similarly, these guidelines will facilitate and structure the peer review process for preclinical FMT studies, which based on our analyses is poorly guided. We envisage that the GRAFT reporting recommendations will facilitate interpretation and experimental replication in    a. Define how engraftment/uptake of the FMT procedure was confirmed (e.g. 16S rRNA/shotgun sequencing, fecal culture) -It is recommended that the same analysis be applied to the final FMT product administered to compare composition of donor and recipient b. Timing of sample collection for engraftment assessment relative to FMT administration and outcome assessments c. Details on sample collection methods, as for donor: -Time of day of collection -Handling during collection -Method: Placing animal into clean cage until defecation or direct postmortem collection from colon, cecum or other site Durability/stability of donor profile a. Particularly for lengthy experimental designs, it may be informative to analyze the recipient microbiota at multiple timepoints after FMT administration to determine the long-term stability of the donor profile within the recipient future preclinical FMT studies, improving reproducibility, allowing better systematic review and metaanalysis, and future guidance on the most optimal methods to answer specific scientific questions.

Conclusions
This systematic review aimed to determine the most common protocols for FMT experiments in mice. Our key overarching finding was that many of the details required to reproduce these protocols were missing from the majority of papers, leading to the development of our minimum set of reporting guidelines. In the future, we urge researchers to clearly outline their protocols in order to provide transparency, increase reproducibility, and ultimately enhance the chances of producing clinically relevant and translatable knowledge.

Key findings:
• 92.5% of studies included reported a positive outcome of FMT intervention • Fecal slurries, containing both the microbiota and their metabolome, were the most common form of FMT product • 91.7% of studies did not report on the use of anaerobic conditions during FMT product preparation • Method of homogenization was not referred to in 33.6% of studies • 49.4% did not report storage conditions for FMT product • 21.4% of studies did not report on the sample size of the recipient group • Antibiotic-induced depletion was the most common form of recipient preparation • 5.3% of studies did not describe how the FMT was administered • 19.9% did not report the volume of FMT administered • FMT durability/uptake was not confirmed in 22.1% of studies • 88.4% of studies did not perform any quality control • 40% of studies did not report on control FMT conditions • The vast majority of studies were deemed irreproducible

Focus question
This systematic review aimed to answer the question: "what FMT protocols are being used in preclinical mouse models of health and disease?" FMT protocols were then used to define core aspects of preclinical FMT methodology and develop a set of minimum reporting standards.

Study design
The protocol for this systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines. 29

Search strategy
We completed a comprehensive search using the electronic databases PubMed, Ovid Medline and Embase on June 30, 2020 (no date restrictions). The search parameters were tailored to each database, and the full search string for each database can be found in the Supplementary Information. We searched for papers including fecal or cecal material transplant. In total, 1753 papers were identified from our database search.

Selection criteria
Two reviewers (KRS and HRW) conducted the initial literature search and removed duplicate articles. Following this, entries from prior to 2010 were removed to ensure only modern FMT protocols were included. Initially, abstracts and titles were screened using Covidence systematic review software web program to assess eligibility (Veritas Health Innovation, Melbourne, Australia). Available at www.covidence.org). After this abstract screen, full-text articles were again assessed by the same reviewers. Two hundred and forty-one articles were selected for data extraction.
We aimed to retrieve only full-text, peer-reviewed, original experimental studies performed in mice. Studies must have been published in English. To be included in the review, studies must have completed a fecal or cecal microbiota transplant where mice were both the donor and the recipient.
Studies were excluded if they: used human or other non-mice microbial material for transplant, utilized GF mice as recipients or utilized a cohousing only approach to FMT. Secondary studies such as review papers, methodological protocols and conference abstracts were also excluded.

Data extraction and analysis
Seven reviewers (KRS, HRW, GHA, CBS, JS, MS, CC) independently extracted relevant information from the selected papers using standard data collection templates. We extracted all available methodological data on FMT from the main paper or Supplementary Information. Key information included: donor and recipient characteristics (age, strain, antibiotic use), FMT preparation and storage methods, FMT administration (dosage, number of treatments, administration route) and use of quality control methods. To quantify the reproducibility of preclinical FMT protocols included in our analysis, we developed a reproducibility index based on 10 variables of preclinical FMT, irrespective of model or study goals. The criteria were as follows: buffer/vehicle, method of homogenization, filtration steps, storage (if applicable), concentration of final FMT product, pre-conditioning of the recipient, route of administration, volume administered, frequency of administration and the inclusion of anaerobic conditions. Reviewers marked each criterion as 0 = not reported, 0.5 = mentioned, 1 = mentioned with appropriate detail (to be able to effectively replicate the study). Importantly, studies were assessed based on whether these parameters were reported, not for how they were performed. This index was not developed to provide a statistically comprehensive measure of reproducibility, and as such there was not necessarily a linear relationship between the score and overall reproducibility of the study.