Computational studies with flavonoids and terpenoids as BRPF1 inhibitors: in silico biological activity prediction, molecular docking, molecular dynamics simulations, MM/PBSA calculations

ABSTRACT The BRPF1 protein is encoded by the BRPF1 gene. In addition, the BRPF1 gene is known to be upregulated in leukaemia. Recent studies have shown that it is also overexpressed in hepatocellular carcinoma (HCC) as well. Therefore, BRPF1 is a significant target for anti-cancer drug development studies, especially on HCC. 40 terpenoids and flavonoids were chosen because of their anticancer properties given in the literature. In this study, the biological activity of molecules was also investigated with in silico structure-activity relationship analysis. In addition, interactions between a series of terpenoids and flavonoids and the BRPF1 protein were investigated by molecular docking and molecular dynamics simulations. The energy change caused by the interactions of BRPF1 with different compounds was also evaluated by MM/PBSA calculations. It has been revealed that compound 5 (−9.2 kcal/mol), a kind of secoclerodane type diterpenoid, has a higher affinity both compared to other flavonoids and terpenoids, and 9F9 (−7.9 kcal/mol), a selective BRPF1 inhibitor. The study presented in this article demonstrates that compound 5, as a natural product, could form a chemical scaffold for the development of selective BRPF1 bromodomain inhibitors.


Introduction
Cancer, which occurs due to poor living conditions and genetic factors, increases every day and remains among the leading causes of death, despite the new anticancer therapies developed. Among the many types of cancers, the most common type of cancer in children is leukaemia (28%) [1]. There are many mutations determined to cause this disease. Bromodomain and plant homeodomain finger containing 1 (BRPF1) protein has been experimentally proven to be effective in leukaemia in recent studies [2].
Bromodomains are protein interaction sites that provide acetylation to be read. The plant homeodomains (PHD fingers) also recognize and bind acetylation sites [3,4]. BRPF1 provides increased acetylation activity because it contains both regions (Bromodomain and PHD finger). Therefore, a significant phenotypic response has been obtained in leukaemia cells with BRPF1 inhibition [5]. In addition, it has been found that the BRPF1 gene, which encodes the BRPF1 protein, is up-regulated not only in leukaemia but also in hepatocellular carcinoma, and it has even been shown that this upregulation may be associated with the high mortality rate of liver cancer patients [6].
It has been clinically proven that the use of high doses of a single chemotherapy agent in cancer treatment can cause many side effects. For this reason, a treatment protocol in the form of a combination of different chemotherapeutics at low doses is generally applied today. The use of natural compounds with chemotherapy agents has been investigated, and many compounds have been reported to increase the chemotherapeutic effect [7][8][9], help to overcome cancer multidrug resistance [10] and protect cells against the side effects of chemotherapeutics [11][12][13].
Terpenoids are composed of isoprene units and are natural products of the mevalonate pathway [14]. Bicyclic diterpenoids of the clerodane type are called secoditerpenoids if they have opened rings [15]. Flavonoids, on the other hand, are a very large family of secondary metabolites containing hydroxylated forms of polyphenol structures [16]. Although there are many in silico studies about drug-receptor interactions with bromodomains in the literature [17][18][19][20][21], there are few studies on the interactions between flavonoids, terpenoids, and bromodomain proteins [22][23][24][25].
In this study, the ability of a group of natural compounds to inhibit BRPF1 protein was investigated computationally, and it was aimed to determine natural products that could be drug precursors for leukaemia and HCC. For this purpose, living cell conditions were mimicked biophysically, and molecular docking and molecular dynamics (MD) simulations were supported by energetic calculations.

Materials and methods
In silico biological activity prediction PASS (Prediction of Activity Spectra for Substances) online web tool [26] was used to determine in silico the biological activity properties of molecules downloaded from PubChem [27] in 'sdf' format and recorded in '.mol' format with the DSV program [28]. PASS online web tool makes a prediction based on the analysis of structure-activity relationships (SAR) of molecules. In this way, the drug-like nature of ligands is examined with the PASS online database. P a denotes the probability of molecules being a member of the active compound's family for that properties (antimetastatic, antineoplastic and anticarcinogenic), while P i denotes the probability of being a member of the inactive compound's family. The web tool suggests filtering methods such as P a >0.7, P a > Pi, and P a > 0.3 after estimation. If the compound with the highest P a values is selected in the estimation, the possibility of the chosen compound matching with a known pharmacological agent arises. Suppose the P a value is chosen between 0.5 and 0.7. In that case, we have a lower chance of accurately determining the experimental activity, but our probability of detecting a molecule analogous to a known drug molecule also decreases. If the P a value is chosen below 0.5, the chance of detecting experimental activity will be considerably reduced. In this study, since natural flavonoids and terpenoids that are not clinically active substances or analogues were selected, compounds with high P a values were chosen as the target, and P a > 0.8 values were emphasized.

Molecular docking
Rigid docking was performed employing the AutoDock Vina program [29]. Autodock Vina optimization algorithm is an algorithm that includes local optimization procedures, and stochastic global optimization approaches such as genetic algorithms, particle swarm optimization, and simulated annealing. Using this algorithm, the global minimum of the receptor-ligand complex and the low-score conformation of the ligands are found [29,30].

Receptor preparation
A comprehensive screening was made in the RCSB protein databank [31] for the BRPF1 protein, and 75 structures were found. These structures were especially examined according to their resolution and apo/halo conformation properties and 3 different X-ray crystallographic 3D structures (PDB ID: 4QYL [32], 4UYE [5], 5O55 [2]) were selected considering the modelling studies carried out with BRPF1 in the literature. Since it is more accurate to work on active conformation in inhibitor drug development studies, 3 halo structures have been emphasized. The properties of all BRPF1 structures in the RCSB protein databank, as well as the 3 selected 3D structures, are given in Table S1.
First, the structures have been simplified to include a single chain. All water and heteroatoms were then removed except the binding package. These two steps were carried out through the DSV program. Finally, structures recorded in PDB format in DSV were opened with the ADT program [33], hydrogens and Gasteiger partial charges were added, and they were recorded in PDBQT format with ADT's Grid/Macromolecules module.

Ligand preparation
19 terpenoids and 21 flavonoids (Table S2) extracted from plants growing in Turkey, New Zealand, Taiwan, Southeast Asia, and China were identified from the literature according to experimental data for molecular docking [34][35][36][37][38][39][40]. Molecules were downloaded from the PubChem database, and those that could not be found in PubChem were drawn with ChemSketch [41] and their hydrogens were added to the DSV program. All ligands were recorded in PDB format with the DSV program. Finally, 40 ligands were prepared by ADT's 'Ligand' module automatically, Gasteiger charges were added, non-polar hydrogens were merged, the number of rotatable bonds was determined, and TORSDOF values were adjusted accordingly, and ligands were recorded in PDBQT format.

Docking protocol
A grid box was created via the 'Grid/Grid Box' module of the ADT to determine the area to be scanned for docking. This box should contain the binding package of protein. In our study, box properties were determined by taking the ligands crystallized together with the protein in a 3D structure to the centre (Table 1).

Molecular dynamics simulations
Energy values and binding profiles of molecular docking were examined. In addition, molecules with higher docking scores than the reference molecule (9F9) were also considered in the ligand selection. For this reason, MD simulations were applied to 10 terpenoids, 9 flavonoids, and reference molecule (9F9) ( Table 2).
For MD simulations, the 3D structure of BRPF1 with PDB ID: 4UYE code was chosen. AMBER v20 [42] molecular dynamics package was used for MD simulations. The ANTECHAMBER module in the AMBER v20 package was used to create the topology files required for molecules. In this way, atomic equivalence, and missing force fields were examined, relevant regulations were completed, and the AM1-BCC charge model was used to calculate atomic point charges. The AM1-BCC model is a model that produces high-quality atomic charges that mimic the HF/6-31 G* electrostatic potential for organic molecules. Parameterization using the GAFF and AMBER14SB force fields [43], ionization, and solvation of the protein-ligand complexes were performed with the LEAP module of AMBER v.20. Two sodium ions (Na + ) were added to all complexes to neutralize the system. Around the complex, 10 Å TIP3P water buffer was placed in all directions and the proteinligand system was covered with explicit solvent in a rectangular box.

Minimization and MD protocol
SANDER modules of AMBER v.20 were used for the minimization phase. Minimization was carried out in two steps, in the first step, the positions of TIP3P molecules and ions were minimized by keeping the protein-ligand complex constant. 500 steps of steepest descent minimization were applied, followed by 500 steps of conjugate gradient minimization. In the second step, 25,000 steps of minimization were implemented to the whole system without any restraints.
PMEMD. CUDA module of AMBER v.20 was used for the heating and MD equilibration stages. In the heating stage, the system was heated to 300 K by applying weak restraint to the protein-ligand complex in a constant volume periodic boundary. In this process, the temperature was controlled by using Langevin dynamics at a collision frequency of 1.0 ps −1 . Completed at 50 ps using 50,000 molecular dynamics steps with a time step of 2 fs per step. In the MD equilibration stage, the SHAKE algorithm was used, which inhibits the effect of hydrogen on large-scale dynamics by restricting all hydrogen-containing bonds. In the simulation, isotropic position scaling with a relaxation time of 2 ps and constant pressure periodic boundary conditions were employed to keep the system pressure at 1 atm. This calculation was carried out for 150 ns with 150,000,000 steps of 1 fs for each system and results were recorded every 10 ps.

Post-trajectory analysis
MD simulation results were saved in trajectory files and using these files throughout the whole simulation, RMSD, Hydrogen bond analyses were performed with the CPTRAJ [44] module of AMBER v.20. Binding free energy calculations were performed by MM/PBSA, MM/GBSA modules [45] of AMBER v.20. While the salt concentration was determined as 0.150 M in the calculations, the ionic strength, the fill ratio, and the interior dielectric constant values were determined as 0.15, 4.0 and 1.0, respectively. The energy components of 10 ns trajectories between 140-150 ns for 19 different complexes were investigated. The equations that form the basis of MM/PB(GB)SA calculations were given in our previous study [46].

In silico biological activity prediction
After the literature review, the bioactivity scores of 40 molecules with known anticancer properties were calculated for their antimetastatic, antineoplastic and anticarcinogenic properties. Table 3 shows that the P a value of 34 compounds was higher than the P i value for anticancer properties, and compounds 30, 33 and 40 give a very high probability of activity in terms of anticancer, antineoplastic and antimetastatic properties.

Comparison of 3D structures of BRPF1 protein
Docking was implemented for 3 different structures of the BRPF1 protein (PDB ID: 4QYL, 4UYE, 5O55) and all ligands given in Table S2. The results are given in Table S3 for comparison. Examining the energy values and the binding profiles obtained with the 3 structures, it was determined that the highest scores were obtained with the 4QYL structure.
The resolution values of the first (4QYL), the second (4UYE), and the third (5O55) structures are 1.8 Å, 1.64 Å, and 1.45 Å, respectively. The highest goodness of fit with experimental data was reached in the second structure (4UYE) with a rate of 95%, the lowest was reached in the third structure (5O55) with a rate of 75%. Considering these features of the structures and docking results, it was decided to continue the MD simulations with the first structure (4QYL).

Energetic results of docking
Molecular docking results with the binding affinities of the 19 selected molecules are shown in Table 4. The lowest binding affinity value of −9.2 kcal/mol was obtained in molecule 5, a seco-clerodane-type diterpenoid. In addition, the second-lowest binding affinity value of −9.1 kcal/mol was obtained in molecule 8, a newly discovered clerodane diterpenoid, and molecule 13, a known clerodane structure. Molecular docking results for the 3 highest-scoring binding modes of 40 molecules are also given in Table S4.  The binding profiles of the reference molecule (9F9) and other molecules were also examined, and it was observed that there were significant interactions within the binding pocket of molecule 5. It is recognized that the reference molecule only makes 2 carbonhydrogen bonds with ILE 24, while compound 5 makes 3 conventional hydrogen bonds with ASN 23, SER 26, and GLU 27. It has been revealed that the reference molecule is tightly packed by π-interactions. In addition, it is given in Figure 1 that molecule 5 is surrounded by van der Waals interactions and unfavourable bumps.

MD simulations
MD simulations were conducted for 20 molecules given in Table 2 and RMSD values were calculated during the 150 ns simulation ( Figure S2). The average RMSD value for all molecules is given in Table 5. When both RMSD plots and average RMSD values are compared with the simulation carried out with the reference molecule, it was observed that the complex systems were largely stable (Average RMSD values ~ 2). Small fluctuations observed in RMSD graphs were mostly detected before 100 ns ( Figure S2). The protein was found to be stabilized after 100 ns.
To show the residues interacting with the ligands, the RMSF values of the Cα atoms were analysed from the time-averaged positions for the BRPF1 protein structure. Fluctuations in residues occur due to interactions with ligands. Changes in protein structure due to the binding of 20 compounds were expressed with RMSF graphs. In Figure 2, the RMSF curves of molecule 5 and the reference compound are given on the same graph. It has been shown that molecule 5 causes the highest atomic fluctuations between residues 20-25 (orange), 30-35 (pink), and 79-85 (blue), and the positions of the relevant residues on the protein are given in Figure 2.   Trajectory analysis was performed for the 150 ns MD to understand the interaction of the compounds with BRPF1 in detail. Although all frames within 150 ns were scanned, more emphasis was placed on simulation beyond 100 ns due to high stability. As a result, although all compounds remained in the binding pocket throughout the simulation, they had little interaction with the protein ( Figure S1). However, molecule 5 comes to the forefront by trajectory analysis, and it is revealed that it is packed in the active site with hydrogen bonds (Figure 3). While the binding profile of molecule 5 with BRPF1 was examined, both the docking result and the MD result showed a great interaction between molecule 5 and the acyl group. While 2 hydrogen bonds were detected between the -O atom of the acyl group and the residues of GLU 27, and SER 26 in the docking results (Figure 1b), it was determined that the acyl group made 2 bonds with ASN 23 and GLU 27 in the MD results. In addition, the 'linker -O atom' that connects the acyl group to the molecular skeleton makes 1 hydrogen bond with PHE 47 residue. 3 hydrogen bonds were also detected between the -O atom in the furan ring and the residue GLU 33, and between the -O atom in the γ-lactone ring and ASN 80, ILE 24, and ILE 85 ( Figure 3).
According to the trajectory analysis, the molecule that gives the best result in flavonoids is molecule 30. The binding profile between molecule 30 and BRPF1 is given in Figure 4. The compound 30 contains 2 methyloxane rings and 5 hydrogen bonds were detected between the hydroxyl groups attached to these rings and residues ASN 23, GLU 27, and PRO 30. Despite the existent 5 hydrogen bonds, 2 new hydrogen bonds were detected between the hydroxyl groups attached to the dihydrochromene regions and residues GLU 33 and ASN 80. Furthermore, there are 2 hydrogen bonds between the -O atom in the dihydrochromene group and the TYR 37, and TYR 79 residues. Finally, 3 hydrogen bonds were detected between the ASN 80 residue and the hydroxyl, -OCH 3 groups bound to the benzene ring (Figure 4).
Hydrogen bond analysis was applied for 15,000 frames during the 150 ns simulation performed with all ligands. In the literature, bonds with an average bond distance of less than 2.5 Å and an average bond angle between 90-180° have been reported as acceptable [47]. In the hydrogen bond analysis, the acceptable bonds were observed in 20 frames, and more were given in the filtered Table  S5. All the hydrogen bonds in Table S5 were examined and it was determined that the majority of the compounds bonded with GLU 27, GLU 33, and ASN 80 at different frame numbers. The properties of hydrogen bonds developed during the simulation between molecule 5 and BRPF1 are given in Table 6. According to the results presented in Table 6, it is observed that molecule 5 forms hydrogen bonds with ASN 23 in the 9144 frames and with ASN 80 in the 4580 frames. In addition, hydrogen bonds with an average bond distance of 2.16 Å with the GLY 27 residue were observed only in 542 frames, although they had the lowest bond length.  Determining the effectiveness of different functional groups in the formation of noncovalent interactions is a very important issue for drug development studies. In this study carried out with natural products, when the functional groups in the compounds are examined according to their hydrogen bond acceptor and donor properties, it is revealed that phenol, furan, methyloxane rings which have -O atoms in their structures are of key importance (Figures 3, 4, and S1). In addition, the functional groups that trigger interactions the most are the hydroxyl groups attached to these cyclic structures as -dihydroxy and -trihydroxy ( Figure S1).
As in the molecule 18 (Euphoroylean B), molecule 6 (12 β-hydroxydolabella-(3E,7E)diene) and molecule 1 (schistochilic acid D) it has been observed that the large crown structures of the compounds are not very effective in the ligand-BRPF1 interaction but lead to many unfavourable bumps (Figures 1 and S1). Unfavourable bumps are also known to lower affinity as they indicate thrusts that destabilize the complex [48]. Therefore, compounds are expected to bind in vitro with a higher affinity than is obtained in both docking and MM/PBSA results.
Another important functional group in diterpenoid structures is γ-lactone. Compound 5, which gives the highest scores in energetic, and trajectory analyses in our study, is a very important functional group for ligand-BRPF1 interaction ( Figure 3).
The most striking functional group in flavonoids is the benzopyrane (chromene) structure in the flavone skeleton. Another notable compound in this study is the molecule 30. When the hydrogen bonds of molecule 30 are examined, it is observed that methyloxane groups play a major role in the ligand-protein interaction as well as the dihydrochromene group (Figure 3).

MM/PB(GB)SA calculations
The interactions of the compounds with BRPF1 cause changes in the energetic values. To determine this change and compare it with the change created by the reference molecule, MM/PB(GB)SA calculations were performed and given in Figure 5. For the Poisson Boltzmann and Generalized Born solvation models, polar and non-polar energies were calculated separately for 20 different complexes. In addition, the van der Waals, electrostatic, and dispersion energies along with their standard errors of them are given in Table S6.
The effect of different energy components on total energy was also analysed. The energy composition in the MM/PB(GB)SA results is given in Table S6. Also, Table 7 demonstrates the energy composition of compounds 5, 30, and reference compounds. Accordingly, it has been determined that the highest effect on energy is provided by van der Waals interactions, while the lowest effect is achieved by the electrostatic effect. On the other hand, Generalized Born solvation has been observed to make the energy more negative, but this has led to quite different energetic results from the docking results. Tables S6 and S7 show that the MM/PBSA results are closer to the docking results.  Although docking and MM/PBSA results were found to be relatively close to each other in our study, MM/GBSA results were found to be quite different. Although it has been reported in the literature that MM/PBSA analyses give results very close to the experimental results [47], different opinions have also been encountered [45,49,50].

Conclusions
Diterpenoids and flavonoids are natural products whose anticancer properties have been extensively demonstrated in the literature [8,12,15,16,37,[51][52][53]. In this study, the biological activities of 40 natural compounds were investigated by in silico SAR analysis. It was observed that the P a value of 34 compounds for anticancer properties was higher than the P i value. In the second step, 19 diterpenoids and flavonoids showing inhibitory properties specific to BRPF1 protein were examined by in silico methods. Both docking results and MD simulation results with energetic analyzes showed that molecule 5 gave higher scores than the reference compound (9F9). In the RMSF analysis, fluctuations triggered by hydrogen bonds and hydrophobic interactions in the residues were investigated. All compounds were shown in Figure S3 to interact with the active site identified in the literature for the BRPF1 protein [2,54,55].
In conclusion, it is computationally shown that compounds 5 and 30 are potential novel BRPF1 inhibitors due to their natural anticancer agent properties against leukaemia and HCC. According to the MD simulation performed with the explicit solvent model revealed that the system was stabilized by the binding of compounds 5 and 30. When all diterpenoids and flavonoids used in the study are examined, it is revealed that different functional groups with the -O atom they contain trigger non-covalent interactions. Based on these results, although in vitro and in vivo analyzes are recommended to understand the true inhibitory potential of compounds 5 and 30, the molecules show promise as anticancer agents.