HT-SuMD: making molecular dynamics simulations suitable for fragment-based screening. A comparative study with NMR

Abstract
Fragment-based lead discovery (FBLD) is one of the most efficient methods to develop new drugs. We present here a new computational protocol called High-Throughput Supervised Molecular Dynamics (HT-SuMD), which makes it possible to automatically screen up to thousands of fragments and therefore represents a valuable new resource for prioritising fragments in FBLD campaigns. The protocol was applied to Bcl-XL, an oncological protein target involved in the regulation of apoptosis through protein–protein interactions. Initially, the performance of HT-SuMD was validated against a robust NMR-based screening, using the same set of 100 fragments. These independent results showed a remarkable agreement between the two methods. Then, a virtual screening of a larger library of 300 additional fragments was carried out and the best hits were validated by NMR. Remarkably, all the in silico selected fragments were confirmed as Bcl-XL binders. This represents, to date, the largest computational fragment screening entirely based on MD.


Content
Figure-S1: Structure of outlier fragments
Figure-S2: Overview of the ligand-based NMR experiments
Dataset-1: SMILES strings of the 100 fragments screened in the first round
Dataset-2: SMILES strings of the 300 fragments screened in the second round
Table S1: Mixture classification according to the ΔδNH
Table S2: NMR ligand-based deconvolution of the mixtures
Video-S1: Superposition of 300 recognition trajectories generated by HT-SuMD in the first Virtual Screening (100 fragments)
Video-S2: Recognition pathway of Fragment 2 obtained by HT-SuMD
Video-S3: Superposition of 900 recognition trajectories generated by HT-SuMD in the second Virtual Screening (300 fragments)

Figure S1 caption
The table summarizes the chemical structure and the computational and experimental NMR descriptors for all the fragments considered outliers. In detail, four fragments (167, 172, 200 and 261) were not included by HT-SuMD in the list of putative hits but were independently confirmed by NMR as Bcl-XL binders. It is worth noting that fragment 200 was correctly identified by the consensus scoring approach as a potential binder, since it belongs to the MMGBSAclust ∩ SIZEclust intersection (Figure 2 and Figure 4). However, because this specific convergence domain is rich in false positives, with only one of its twelve molecules (fragment 200) correctly anticipated as a true binder, it was decided to exclude this intersection from the hit fragment set. Fragments 167, 172 and 261, all characterized by a bicyclic scaffold, fall outside the top 10% of the best cluster despite their good geometric and energetic indicators. The choice of such a stringent cutoff can therefore be responsible for the failure to identify these three fragments; however, wider cutoffs could increase the noise in the selection of true binders. Compounds 163 and 164 were instead incorrectly predicted by the HT-SuMD selection protocol as Bcl-XL binders. It is worth noting, however, that these compounds belonged to third-class mixtures (ΔδNH of the mixtures < 0.025) and were therefore discarded from the subsequent screening phase without ever undergoing individual validation. The values reported in bold are those above the threshold (top 10%).
* These values do not refer to the single fragment but refer to the original mixture in which it was contained.
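The selection logic discussed in this caption can be illustrated with a minimal sketch. It assumes that each descriptor flags the fragments falling in its top 10%, that higher scores rank better, and that the hit set is the union of the pairwise intersections between descriptor selections, minus any intersection excluded by hand. These are assumptions made for illustration: the descriptor name `OTHERclust`, the fragment labels and the toy scores are hypothetical, and this is not the HT-SuMD analysis code.

```python
import math
from itertools import combinations


def top_fraction(scores, fraction=0.10):
    """Return the IDs of the best-scoring fraction of fragments.

    Assumption: higher score means better rank.
    """
    n_keep = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=lambda frag: scores[frag], reverse=True)
    return set(ranked[:n_keep])


def consensus_hits(selections, excluded_pairs=()):
    """Union of pairwise intersections between descriptor selections.

    Pairs listed in excluded_pairs (for example an intersection found to be
    rich in false positives) contribute nothing to the hit set.
    """
    excluded = {frozenset(pair) for pair in excluded_pairs}
    hits = set()
    for name_a, name_b in combinations(sorted(selections), 2):
        if frozenset((name_a, name_b)) in excluded:
            continue
        hits |= selections[name_a] & selections[name_b]
    return hits


# Toy ranking: 20 hypothetical fragments, so the top 10% keeps 2 of them.
toy_scores = {f"frag{i}": float(i) for i in range(1, 21)}
top = top_fraction(toy_scores)

# Toy selections; "OTHERclust" is a hypothetical third descriptor.
selections = {
    "MMGBSAclust": {"f1", "f2", "f3"},
    "SIZEclust": {"f2", "f3", "f4"},
    "OTHERclust": {"f3", "f5"},
}
hits = consensus_hits(selections, excluded_pairs=[("MMGBSAclust", "SIZEclust")])
print(sorted(top), sorted(hits))
```

In this toy case `f2` sits only in the excluded MMGBSAclust ∩ SIZEclust intersection and is therefore dropped, which mirrors how a true binder such as fragment 200 can be lost when a noisy intersection is removed.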

Figure S2
Figure S2 caption
Overview of the ligand-based NMR experiments for all the 12 mixtures selected after the initial screening using 1H-15N SOFAST-HMQC experiments. All these mixtures were subjected to Saturation Transfer Difference (STD) and WaterLOGSY experiments in the presence and in the absence (control experiment) of Bcl-XL, and the results are summarized in this figure.

Dataset-2. SMILES of the 300 fragments screened in the second round
Oc1c (

Table S1 legend: Average ΔδNH of the 20 mixtures of the 100 fragments NMR screening. The first-class mixtures are reported in green, the second-class ones in yellow and the third-class ones in white.

Table S2 legend: Deconvolution of the first- and second-class mixtures with the ligand-based STD (column STD) and WaterLOGSY (column WL) experiments. The sign "-" indicates the absence of binding; the sign "+" indicates the presence of binding.
Caption of Video-S1. Superposition of 300 recognition trajectories generated by HT-SuMD in the first Virtual Screening. All the recognition trajectories of the 100 fragments of the first screening (3 replicas each) were superposed based on the Cα protein atoms and aligned to the first trajectory frame. Only one protein trajectory is shown for clarity (gray molecular surface).
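Superposing trajectories on a common set of protein atoms is usually done with a least-squares rigid-body fit. The sketch below uses the Kabsch algorithm with NumPy as one standard way to do this; it is an illustration under that assumption, not the tool used to produce the videos, and the coordinates are random toy data standing in for protein atom positions.

```python
import numpy as np


def kabsch_superpose(mobile, reference):
    """Return (R, t) that best maps mobile onto reference in the RMSD sense.

    Both arrays have shape (n_atoms, 3) and list the same atoms in the same order.
    """
    mob_center = mobile.mean(axis=0)
    ref_center = reference.mean(axis=0)
    P = mobile - mob_center          # centered mobile coordinates
    Q = reference - ref_center       # centered reference coordinates
    H = P.T @ Q                      # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Correct for a possible reflection so that R is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = ref_center - R @ mob_center
    return R, t


def apply_transform(coords, R, t):
    """Apply rotation R and translation t to row-vector coordinates."""
    return coords @ R.T + t


# Toy check: rotate and shift a reference frame, then recover it.
rng = np.random.default_rng(0)
reference = rng.normal(size=(10, 3))          # stand-in for first-frame coordinates
angle = 0.7
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0,            0.0,           1.0]])
mobile = reference @ rot_z.T + np.array([1.0, -2.0, 3.0])

R, t = kabsch_superpose(mobile, reference)
aligned = apply_transform(mobile, R, t)
rmsd = float(np.sqrt(((aligned - reference) ** 2).sum(axis=1).mean()))
print(f"RMSD after superposition: {rmsd:.2e}")
```

In a real workflow the fit would be computed on the selected protein atoms only and the resulting transform applied to every atom of the frame, fragment included.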
Caption of Video-S2. Recognition pathway obtained for fragment 2 by HT-SuMD. The best replica according to the analysis protocol was chosen. The video is composed of four synchronized and animated panels that depict the molecular trajectory from different aspects of the simulation. The time evolution is reported in nanoseconds. In the first panel (upper-left), the molecular representation of the macromolecular system is shown. The Bcl-XL molecular surface is shown in light grey, while Fragment 2 is rendered as light-green sticks with a transparent molecular surface.
In the second panel (upper-right), the distance between the centers of mass of the fragment and Bcl-XL cleft is reported along the trajectory.
In the third panel (lower-left), the MMGBSA energy profile is reported. The animated red circle highlights the value for the corresponding frame. The trend is depicted by a continuous black line obtained by smoothing the raw data (gray circles) using a Bezier curve procedure.
In the fourth panel (lower-right), the cumulative electrostatic interactions are reported for the 10 Bcl-XL residues most contacted by the fragment during the whole simulation.
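The quantities plotted in the second and third panels can be sketched in a few lines. The example below computes a center-of-mass distance for a single frame and smooths a short series by treating the raw values as control points of one Bézier curve, evaluated with de Casteljau's algorithm. Reading "Bezier curve procedure" this way is an assumption, as are the function names and the toy coordinates, masses and values; this is not the script used to produce the video.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * xyz[axis] for xyz, m in zip(coords, masses)) / total
        for axis in range(3)
    )


def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Distance between the centers of mass of two atom groups."""
    return math.dist(center_of_mass(coords_a, masses_a),
                     center_of_mass(coords_b, masses_b))


def bezier_point(values, t):
    """Evaluate the Bezier curve with control points `values` at t in [0, 1]."""
    points = list(values)
    while len(points) > 1:
        # de Casteljau step: interpolate between neighbouring points.
        points = [(1 - t) * a + t * b for a, b in zip(points, points[1:])]
    return points[0]


def bezier_smooth(values, n_samples=50):
    """Sample the Bezier curve defined by the raw values at n_samples points."""
    if not values or n_samples < 2:
        raise ValueError("need at least one value and two samples")
    return [bezier_point(values, k / (n_samples - 1)) for k in range(n_samples)]


# Toy frame: a two-atom fragment and a one-atom stand-in for the cleft.
distance = com_distance([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0],
                        [(4.0, 4.0, 0.0)], [12.0])

# Toy noisy energy profile (arbitrary values) and its smoothed trend.
raw_energy = [-2.0, -9.0, -4.0, -12.0, -8.0, -15.0]
trend = bezier_smooth(raw_energy, n_samples=11)
print(distance, trend[0], trend[-1])
```

A curve built this way passes through the first and last raw values and only approximates the ones in between, which is why it reads as a trend line rather than an interpolation. De Casteljau evaluation costs quadratic time per sample, so it suits short series best.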
Caption of Video-S3. Superposition of 900 recognition trajectories generated by HT-SuMD in the second Virtual Screening. All the recognition trajectories of the 300 fragments of the second screening (3 replicas each) were superposed based on the Cα protein atoms and aligned to the first trajectory frame. Only one protein trajectory is shown for clarity (gray molecular surface).