Virtual design of novel Plasmodium falciparum cysteine protease falcipain-2 hybrid lactone–chalcone and isatin–chalcone inhibitors probing the S2 active site pocket

Abstract We report computer-aided design of new lactone–chalcone and isatin–chalcone (HLCIC) inhibitors of the falcipain-2 (PfFP-2). 3D models of 15 FP-2:HLCIC1-15 complexes with known observed activity (IC 50 exp) were prepared to establish a quantitative structure–activity (QSAR) model and linear correlation between relative Gibbs free energy of enzyme:inhibitor complex formation (ΔΔG com) and IC 50 exp: pIC 50 exp = −0.0236 × ΔΔG com+5.082(#); R 2 = 0.93. A 3D pharmacophore model (PH4) derived from the QSAR directed our effort to design novel HLCIC analogues. During the design, an initial virtual library of 2621440 HLCIC was focused down to 18288 drug-like compounds and finally, PH4 screened to identify 81 promising compounds. Thirty-three others were added from an intuitive substitution approach intended to fill better the enzyme S2 pocket. One hundred and fourteen theoretical IC50 (IC 50 pre) values were predicted by means of (#) and their pharmacokinetics (ADME) profiles. More than 30 putative HLCICs display IC 50 pre 100 times superior to that of the published most active training set inhibitor HLCIC1.


Introduction
Malaria remains the main cause of death all over the world and we are unable to reverse this sad morbidity statistic. Indeed, during the year 2015, 212 million cases of malaria and 429 thousand death cases were reported by the WHO 1 . Plasmodium falciparum (Pf), the most virulent causative agent of malaria, is developing resistance rapidly so that the future of the Artemisinin Combined Therapy (ACT, available since 2006) is compromised 2 . For almost a decade the development of novel antimalarials concerned mainly ACT, as recently exemplified by introduction of the combination of artemether and lumefantrine referred to as novamether 3 . Luckily, this alarming dark picture has been illuminated by the reported new antimalarial candidates [4][5][6] . Relatively low level of structural knowledge of novel and validated pharmacological targets of the Pf on one side and the lack of inhibition pharmacophores on the other, remain the drawbacks of antimalarial drug design (ADD) and development. One of the recent ADD strategies was to design hybrids molecules containing artemisinin (ART) or other antimalarial combined with another structural fragment [6][7][8] . Falcipains have drawn the attention of ADD due to their important role in the haemoglobin degradation (along with plasmepsins) while their inhibition is lethal to the Pf 9-11 . Lactone-chalcone and isatin-chalcone (HLCIC) hybrid molecules have been mentioned in this context, their experimental FP-2 inhibitory activities were reported 12 . They essentially target the bipartite motifs of falcipain-2 (FP-2) and falcipain-3 (FP-3), the papain-family cysteine proteases of Pf, which are involved in the degradation of host's haemoglobin within the food vacuole during the blood-stage of the parasite 13,14 . Due to their specific features, falcipains represent promising targets for the development of next-generation antimalarials 10,14 .
In this work, we design new analogues of HLCIC starting from a series of 15 known HLCIC hybrids with determined experimental inhibition potencies (IC 50 exp ), which were used as a training set 12 . We build on our design and new analogue activity prediction relies on a reliable descriptor, namely standard Gibbs free energy (GFE) of enzyme:inhibitor (E:I) complex formation (DG com ), quantitative structure-activity relationships (QSAR), 3D structural model of the FP-2 and analysis of the inhibitor-enzyme interactions. Relative changes in the GFE of E:I complex formation were computed in order to build a linear regression QSAR model utilising the published IC 50 exp 12 . Each complex was carefully built by in situ modification of the reference crystal structure of FP-2 (3BPF) 15 in complex with epoxysuccinate E64 (Figure 1). The 3D models of inhibitors bound to FP-2, QSAR and pharmacophore (PH4) models derived for the training set compounds provided the necessary structural information needed to improve inhibitor interactions at pockets S1, S2, and S3 of the FP-2 active site. Screening of designed virtual library (VL) of analogues by the PH4 led to the identification of potent HLCIC, which are predicted to be hundreds of times more potent than the best training set inhibitor HLCIC1 (IC 50 exp ¼ 6.8 mM).

Training sets
The chemical structures of the training set compounds comprising 15 HLCIC hybrids and their experimental biological activities (IC 50 exp ¼ 6.8-90 mM) are taken from the literature 12 . The IC 50 exp values of the HLCIC compounds cover a relatively wide concentration range, which is needed to build a reliable QSAR model.

Model building and calculation of binding affinity
Molecular modelling was carried out for the E:I (FP-2:HLCIC) complexes, the free enzyme FP-2, and the free HLCIC inhibitors starting from the high-resolution crystal structure of FP-2 co-crystallised with epoxysuccinate E64 inhibitor (PDB code 3BPF, resolution 2.9 Å) using Insight II molecular modelling program 27 . Initially, all crystallographic waters were removed, then hydrogens were added to the residues of the FP-2 and FP-2:HLCIC complex with the protonisation/ionisation state corresponding to the pH of 7 keeping the N-and C-terminal groups neutral. Inhibitors were modelled from the 3BPF reference crystal structure by in situ modification of functional groups in the molecular scaffold of the endogenous E64 inhibitor. All rotatable bonds of the replacing fragments were subjected to an exhaustive conformational search coupled with a careful gradual energy-minimisation of the modified inhibitor and active-site residues of FP-2 located in the immediate vicinity (5 Å radius) in order to identify low-energy bound conformations of the modified inhibitors. The resulting low-energy structures of the E:I complexes were then carefully refined by energy-minimisation procedure of the entire complex to obtain stable structures of the binary FP-2:HLCIC complexes. The complete description of the computation of relative ligand binding affinity (DDG com ) is described in Ref. [25]: The DDH MM describes the relative enthalpic contribution to the GFE change corresponding to the intermolecular interactions in the E:I complex estimated by molecular mechanics (MM), DDG sol , and DDTS vib represent the relative solvation and vibrational entropy contributions to the GFE of the E:I complex formation, respectively.

Molecular mechanics
Modelling of the inhibitors and their complexes were carried out in all-atom representation using atomic parameters and charges of the class II consistent force field CFF91 22 . A dielectric constant of 4 was used for all MM calculations in order to take into account the dielectric shielding effect in proteins. Minimisations of the E:I complexes, free E and I were carried out by relaxing the structures gradually, starting with added hydrogen atoms, continued with residue side chain heavy atoms and followed by the protein backbone relaxation. Geometry optimisations were performed using the sufficient number of steepest descent and conjugate gradient iterative cycles and average gradient convergence criterion of 0.01 kcalÁmol À1 ÁÅ À1 .

Solvation GFE
The electrostatic component of the solvation GFE, which includes also the effect of ionic strength of the solvent by solving the nonlinear Poisson-Boltzmann equation 28,29 was computed by the DelPhi module of the Discovery Studio (DS 2.5) 30 . The program represents the solvent by a continuous medium of high dielectric constant (e ro ¼ 80) and the solute as a charge distribution filling a low dielectric cavity (e ri ¼ 4) with boundaries linked to the solute's molecular surface. The program numerically solves for the molecular electrostatic potential and reaction field around the solute using finite difference method. DelPhi calculations were done on a (235 Â 235 Â 235) cubic lattice grid for the E:I complexes and free E and on a (65 Â 65 Â 65) grid for the free I. Full coulombic boundary conditions were employed. Two subsequent focusing   Figure 2. Workflow describing the multistep approach to virtually designed novel HLCIC analogues with higher predicted potencies against the FP-2 of Pf. steps led to a similar final resolution of about 0.3 Å per grid unit at 70% filling of the grid by the solute. Physiological ionic strength of 0.145 molÁdm À3 , atomic partial charges and radii defined in the CFF force field parameter set 30 and a probe sphere radius of 1.4 Å were used. The electrostatic component of the Poisson-Boltzmann solvation GFE was calculated as the reaction field energy 21,24,29,31,32 .

Interaction energy
The molecular mechanic interaction energy (E int ) calculation protocol available in DS 2.5 30 was used to compute the non-bonded interactions (van der Walls and electrostatic interatomic potential terms) between two sets of atoms belonging either to the E or I in the E:I complexes. All pairs of interactions of the total enzyme-inhibitor interaction energy were evaluated using CFF force field parameters with a relative permittivity of 4 30 . In particular, the breakdown of E int into contributions from individual active site residues allows a quantitative analysis, which permits identification of residues with the highest contribution to the ligand binding. It also helps with the identification of favourable structural modifications and suggests molecular moieties in the inhibitor structure which are primarily responsible for the biological activity of the compound 22 .

3D-QSAR pharmacophore generation
Pharmacophore (PH4) modelling assumes that a set of key structural features responsible for biological activity of the compound is recognised by the active site during receptor binding. In this work, the pharmacophore was prepared by the 3D-QSAR pharmacophore protocol of Catalyst HypoGen algorithm 33 implemented in DS 2.5 30 . Bound conformations of HLCIC inhibitors taken from the refined E:I complexes were considered for construction of the PH4. The top scoring pharmacophore hypothesis was prepared in three stages: constructive, subtractive, and optimisation step, from a set of most active HLCIC inhibitors. The inactive compounds served for the definition of excluded volume. During the PH4 generation, five features available in the HypoGen algorithm were used: hydrophobic aromatic (HYdAr), hydrophobic aliphatic (HYd), hydrogen-bond donor (HBD), acceptor (HBA), and ring aromatic (Ar) feature. Default values of the adjustable parameters were kept during the PH4 generation, except the uncertainty on the biological activity, which was reduced to 1.25 instead of 3. This adjustment modified the uncertainty interval of experimental activity from a wide span [IC 50 exp /3, 3ÂIC 50 exp ] to a relatively narrow one] [4ÂIC 50 exp /5, 5ÂIC 50 exp /4], due to accuracy and homogeneity of the measured activities originating from the same laboratory 12 . The top ten pharmacophores were generated with the number of missing features set to zero. Finally, the best PH4 model was selected. Generally, a PH4 model, as the one described here, can be used to estimate the pIC 50 pre of new analogues on the basis of their mapping to its features. In this study, priority was given to PH4 based screening of ADME focused VL of HLCIC analogues.

ADME-related properties
Properties that determine the pharmacokinetics profile of a compound, besides octanol/water partitioning coefficient, aqueous solubility, blood/brain partition coefficient, Caco-2 cell permeability, serum protein binding, number of likely metabolic reactions and other 18 descriptors related to adsorption, distribution, metabolism and excretion (ADME properties) of the inhibitors were computed by the QikProp program 34 based on the methods of Jorgensen [35][36][37] . According to these methods, experimental results of more than 710 compounds including about 500 drugs and related heterocycles, were correlated with computed physicochemical descriptors, resulting in an accurate prediction of molecule's pharmacokinetic profile. Drug likeness (#stars) is represented by the number of descriptors that exceed the range of values determined for 95% of known drugs out of 24 selected descriptors computed by the QikProp 34 . Drug-likeness was used as the global compound selection criterion related to ADME properties. The selected ADME descriptors were calculated from 3D structures of compounds considered. They were used to assess the pharmacokinetics profile of designed compounds and served also for the VL focusing.

Virtual combinatory library generation
The analogue model building was performed with Molecular Operating Environment (MOE) program 38 . The library of analogues was enumerated by attaching R-groups (fragments, building blocks) onto HLCIC scaffold using the Quasar CombiDesign module of MOE 38 . Reagents and chemicals considered in this paper were selected from the directories of chemicals available from the commercial sources. Each analogue was built as a neutral molecule in the MOE program 38 , its geometry was refined by MM optimisation through smart minimiser of DS 2.5 30 meeting high convergence criteria (threshold on energy difference of 10 À4 kcalÁmol À1 and root mean square deviation (RMSD) of 10 À5 Å), dielectric constant of 4, using class II consistent force field CFF 39 .

ADME-based library focusing
Twenty-four pharmacokinetics-related molecular descriptors available in QikProp 34 , which characterise a wide spectrum of molecular properties as described in Section 2.7, were used. Optimum ranges of these 24 descriptors were defined in terms of upper and lower bounds according to QikProp 34 . Among them predicted drug-likeness (#stars, Section 2.7) was used to retain drug-like HLCIC analogues in the focused VL.

Pharmacophore-based library focusing
The pharmacophore model (PH4) described in Section 2.6 was derived from the bound conformations of HLCIC at the active site of FP-2. The enumerated and ADME-focused VL was further evaluated by using the pharmacophore mapping protocol available of DS 2.5 30 . Within this protocol, each generated conformer of the analogues was geometry optimised by means of the CFF force field for a maximum of 500 energy minimisation steps and subsequently aligned and mapped to the PH4 model in order to select the top ranking overlaps. Twenty best-fitting inhibitor conformers were saved and clustered into 10 conformational families according to their mutual RMSD by Jarvis-Patrick complete linkage clustering method 40 . The best representative of each cluster was considered for the virtual screening of analogues. Only those analogues mapping to all PH4 features were retained for the in silico screening.

In silico screening
The conformer with the best match to the PH4 pharmacophore in each cluster of the focused library subset was selected for in silico screening by the complexation QSAR model. The relative GFE of E:I complex formation in water DDG com was computed for each selected new analogue and then used for prediction of FP-2 inhibitory potencies (IC 50 pre ) of the focused VL of HLCIC analogues by inserting this parameter into the target-specific scoring function. The scoring function, which is specific for the FP-2 receptor of Pf inhibition, is given in Equation (2), was parameterised using the QSAR model of training set of HLCIC inhibitors 12 .

Results and discussion
A series of 18 [15 training set HLCIC and 3 validation set (VS) HLCIV] of HLCIC inhibitors and their experimental activities (IC 50 exp ) from the same laboratory 12 were selected (Table 1). These activities cover a relatively wide range 6.8 mM IC 50 exp 90 mM and allowed building of a valid QSAR model.

Results
The relative GFE of E:I complex formation DDG com was calculated for the FP-2: HLCIC complexes as described in Section 2. Table 2 shows GFE and their components (Equation (1)). The DDG com reflects the mutual affinity between the enzyme and the inhibitor. Since it is calculated via an approximate approach, the relevance of the binding model is evaluated by a linear regression with experimentally observed activity data (IC 50 exp ) 12 (Equation (2)), which led to a linear correlation and QSAR model for the training set of HLCIC inhibitors. Two correlation equations obtained for the GFE of E:I complex formation DDG com (Equation (B)) and its enthalpic component DDH MM (Equation (A)), respectively are presented in Table 3 with the relevant statistical data plotted in Figure 3. The relatively high values of the regression coefficient R 2 and the Fischer F-test of the correlation involving DDG com indicate that there is a strong relationship between the binding model and the experimental inhibitory potencies of the HLCIC.
The bound conformation of the most potent FP-2 inhibitor HLCIC1 12 in this QSAR model is displayed in Figure 4 ( Table 1). The enzyme-inhibitor overall intermolecular interaction energy E int Table 1. Training set (HLCIC) and validation set (HLCIV) of FP-2 inhibitors 12 used in the preparation of QSAR model of inhibitor binding to the FP-2 of Pf.
Chalcone scaffold linker-isatin (-lki) linker-lactone (-lkl) ]. e DDTS vib (kcalÁmol À1 ) is the relative entropic contribution of the inhibitor to the GFE related to protease-inhibitor complex formation: is the experimental half-maximal inhibitory concentration obtained from Ref. [12]. h Ratio of predicted and experimental half-maximal inhibition concentrations pIC 50 pre /pIC 50 exp (where pIC 50 pre ¼ Àlog 10 IC 50 pre was predicted from computed DDG com using the regression equation (B) shown in Table 3. in FP-2:HLClCx complexes is listed in Table 4 along with its correlation between that energy with observed inhibitory potency (IC 50 exp ) plotted in Figure 5. According to the statistical data in Table 5, some 85% of the IC 50 exp variation is explained by E int drop while moving from the best active HLClC1 to the less active one HLClC6. The quality of this correlation opens the gate to a deeper analysis of the intermolecular interaction energy E int variation in light with structural requirement of FP-2 inhibition namely the active site pockets filling.
The prominent role of the van der Waals (vdW) component of E int in the binding affinity of HLCIC to FP-2 of Pf is highlighted by the correlation between individual contributions to the overall E int . In addition, to assess the impact of the residues occupying individual active site pockets (S1, S2, S3, and S1'; Figure 4) we have analyzed their contribution to the overall E int (Table 6; Figure 6). The contribution of all the four pockets together explained 88% of the FP-2 inhibitory potencies of the training set inhibitors. This fell down to 31% when removing the contribution of the S2 pocket (67%). The filling of the S2 pocket by function groups of the inhibitors is therefore crucial for the FP-2:HLCIC1-15 affinity. Thus, our virtual FP-2 inhibitor design prioritised optimal filling of the S2 pocket by the HLCIC analogues.
The PH4 pharmacophore model of FP-2 inhibition elaborated from QSAR model and training set of HLCIC 12 is presented in Tables 7 and 8 and Figure 7. The 3D-QSAR PH4 generation was carried out in three steps: constructive, subtractive, and    13.6 mM) were used to generate the starting PH4 features and those matching these leads were retained. During the subsequent subtractive phase, features which were present in more than half of the inactive HLCIC were removed. The PH4 models which contained all features were retained. None of the training set compounds was found to be inactive (IC 50 exp > 6.8 Â 10 3.5 ¼ 21503.4 mM). During the final optimisation phase, the score of the PH4 hypotheses was improved. Hypotheses were scored via simulated annealing protocol according to errors in the activity estimates from the regression and complexity. At the end of optimisation, 10 best scoring unique hypotheses (Table 7) displaying five features were kept. The reliability of the generated PH4 models was then assessed using the calculated cost parameters ranging from 74.8 (Hypo1) to 94.7 (Hypo10). Their statistical data (costs, root-mean-square deviation 2.056 RMSD 2.777 and 0.89 R 2 0.94) are listed in Table 7. The PH4 Hypo1, with the best RMSD and highest R 2 was retained for screening. Its regression equation pIC 50 exp ¼ 0.9958ÁpIC 50 pre þ 0.0192 ( Figure 7; Table 8), both R 2 and R xv 2 greater than 0.9 and F-test of 111.25 attest the predictive capacity of the PH4. The fixed cost of Hypo1 (74.8), lower than the null cost (296.7) by D ¼ 221.9, is a chief indicator of the PH4 model  predictability (D > 70 corresponds to a probability higher than 90% that the model represents a valid correlation 22 ). The difference D ! 202 for the set of 10 hypotheses confirms the high quality of the PH4 model. The best-selected hypothesis Hypo1 represents a PH4 model with a similar level of predictive power as the QSAR model utilising the GFE of E:I complex formation with a probability of 98%. The substantial predictive power of the generated PH4 model was also checked through the computed ratio of PH4-predicted and experimentally observed activities (pIC 50 pre /pIC 50 exp ) for the VS, ( Table 1). The computed ratios are as follows: HLCIV1:1.006, HLCIV2:1.057, HLCIV3:0.97; all of them close to one.
We have built a VL of new HLCIC analogues compounds with a variety of substitutions in ortho, meta, and para positions of the benzene rings with the goal to identify more potent orally bioavailable inhibitors of the FP-2 of Pf.
To increase the content of drug-like and orally bioavailable analogues, the initial VL was filtered in an ADME-based focusing step. Only those molecules that satisfied the Lipinski's rule of five 41 computed using QikProp 16 , were kept.
From the initial set of 2,621,440 (1,310,720 Â 2) analogues, 18,288 (9144 Â 2) fulfilled the Lipinski test (except the restriction M w < 500 g/mol). Out of them, 141 analogues mapped to the 5 feature PH4 pharmacophore. The 81 best fitting analogues (PH4 hits) were retained and submitted to structure-based screening using the QSAR model and computed GFE of the FP-2:HLCIC complex formation. The calculated DDG com of the FP-2:HLCIC complexes of the hits, their components as well as predicted half-maximal inhibitory concentrations IC 50 pre estimated from the correlation equation (B) ( Table 3) are listed in Table 10. Thirty-three others HLClC new analogues were added from an intuitive substitution approach intended to fill better the enzyme S2 pocket; they are listed in Table 11. In the majority of new HLCIC analogues, the estimated inhibitory potencies are better than that for the most active training set inhibitor HLCIC1 (IC 50 exp ¼ 6.8 mM) 12 . In order to identify which substituents from Table 9 lead to new inhibitor candidates with high predicted potencies towards the FP-2, we have prepared histograms of the frequency of occurrence of the R 1 to R 6 groups in these 81 HLCIC virtual hits ( Figure 8). Analysis of the histograms showed that the highest frequency of occurrence among the R 1 -groups displayed the fragments 31, 33, and 53 (Table 9). In case of R 2 -groups fragments 1 and 2; for the R 3 -groups fragments 1, 2, and 106; for the R 4groups fragments 1, 2, 4, and 33 against 1 (4), 4 (4), 6 (7), and 7 (5) for R5 while the R6 group include chiefly fragments 1 (4), 29 (5), 31 (7), and 38 (4).

Discussion
Training set of 15 HLCIC inhibitors and observed half-maximal inhibitory concentrations IC 50 exp 12 were employed to derive QSAR model of FP-2 inhibition, which uses a single descriptor determined by molecular modelling (GFE of FP-2:HLCIC complex formation, DDG com ) and crystal structure of the FP-2 of Pf in complex with epoxysuccinate E64 (3BPF) 15 . This statistically significant Table 9. R-groups (fragments, building blocks, substituents) used in the design of the diversity VL of HLCIC analogues. R 1 , R 2 , R 3 , R 4 , R 5 and R 6 -groups a,b linker-isatin (-lki)  -lki a Fragments 1-128 were used in R 1 -groups and R 3 -groups; fragments 1-10 were used in R 2 -group, fragments 138 and H for R 4 , R 5 , and R 6 as scaffold SC1. Reversely fragments 1-128 were used in R 6 -groups and R 4 -groups; fragments 1-10 were used in R 5 -group, fragments 138 and H for R 3 , R 2 , and R 1 as scaffold SC2. Fragments 129-138 were used intuitively as R 6 -groups substituents for the best VL hits at P2 position according to Schechter and Berger notation 42 . b (-) bond indicates the attachment points of individual fragments. Table 10. GFE of FP-2:HLCIC complex formation and its components for the 81 virtually designed HLCIC exploring the pockets S1, S1', and S3. QSAR model confirmed the validity of our 3D models of HLCIC inhibitors and the mode of their binding to the active site of the FP-2 of Pf. Then a 3D pharmacophore (PH4) model of FP-2 inhibition was prepared for the bound conformations of HLCIC and was further used for screening of a large virtual combinatorial library of HLCIC analogues with the aim to identify more potent and bioavailable FP-2 inhibitors. The activities of the identified putative inhibitors and analogues proposed by structure-based design (IC 50 pre ) were predicted by linear QSAR regression equation (B) ( Table 3) and cross-checked by the PH4 regression equation (D) ( Table 8).

QSAR model
The robustness of the single descriptor QSAR model was analyzed by assessing the role of the individual components of GFE of the  Table 2). b DDG sol (kcalÁmol À1 ) is the relative solvation GFE contribution to DDG com . c DDTS vib (kcalÁmol À1 ) is the relative entropic (vibrational) contribution to DDG com . d DDG com (kcalÁmol À1 ) the relative GFE change of the FP-2:HLCIC complex formation DDG com ¼ DDH MM þ DDG sol þ DDTS vib . e IC 50 pre (nM) is the predicted half-maximal inhibitory concentration of HLCIC towards FP-2 of Pf calculated from DDG com using correlation equation (B) ( Table 3). f IC 50 exp is given for the reference inhibitor HLCIC1 12 instead of IC 50 pre (nM). Table 11. GFE of FP-2:HLCIC complex formation and its components for the 33 virtually designed HLCIC with intuitive P2 substitution (R 6 -group) exploring the S2 pocket in addition to S1, S1', and S3.  . Histograms of frequency of occurrence of individual R 1 -R 6 groups in the 81 selected analogues mapping to the five-feature pharmacophore hypothesis Hypo1 (for fragments numbering see Table 9).
FP-2:HLCIC complex formation DDG com , namely the enthalpic, solvation and approximate entropic contributions. The relevance of the enthalpic contribution DDH MM to GFE was well confirmed by the quality of the regression (A) ( Table 3), indicating that a large part (91%) of the variation of the IC 50 exp can be explained by intermolecular interactions, which can be traced back to contributions of active site pockets (Sn) as well as individual residues, E int (Table 6; Figure 6). Addition of the solvation term DDG sol maintained the relationship level between the experimental data and the modelling results. Finally, the validity of the model was increased by adding the DDTS vib term that describes the loss of the inhibitor vibrational entropy upon enzyme binding, which explained 93% of the variation of the IC 50 exp . This last contribution is considered to be one of the most reliable indicators of the predictive power of QSAR models as reported by Freire et al. 43 . The VS of 3 HLCICV inhibitors not included into the training set (Table 2) confirmed good correlation between the DDG com and the observed activities IC 50 exp 12 since the ratio between computed and experimental potencies pIC 50 pre /pIC 50 exp was close to one. Therefore, the QSAR correlation equation (B) and computed relative GFE DDG com can be used for prediction of inhibitory potencies IC 50 pre of new HLCIC analogues against the FP-2 of Pf, since they share the same binding mode as the training set 12 .

Binding mode of FP-2 inhibitors
Besides the robustness of the QSAR model, the analysis of the interactions between the HLCIC and active site residues revealed the key interactions responsible for the HLCIC affinity to FP-2, such as hydrogen bonds, van der Waals interactions, hydrophobic contacts, etc. As displayed in the 2D and 3D schemes of Figure 4, the binding of HLCIC1 to the active site of FP-2 is supported by the following interactions: HB with His174 and stacking interaction with Tyr78. To verify whether also other stronger interactions codetermine the binding mode of HLCIC to FP-2 active site and aid structure-based design of new analogues, interaction energies E int between active site residues and HLCIC were computed ( Table 6). The peptidyl structure of HLCIC shed some light on the structural       ).
g Number of non-trivial (not CX3), non-hindered (not alkene, amide, small ring) rotatable bonds (range for 95% of drugs: 0-15). h Estimated number of hydrogen bonds that could be donated by the solute to water molecules in solution, averaged over a number of configurations (range for 95% of drugs: 0.0-6.0). i Estimated number of hydrogen bonds that could be accepted by the solute from water molecules, averaged taken over a number of configurations (range for 95% of drugs: 2.0-20.0). j Logarithm of partitioning coefficient between n-octanol and water phases (range for 95% of drugs: À2 to 6.5).
k Logarithm of predicted aqueous solubility, log S. S in molÁdm À3 is the concentration of the solute in a saturated solution that is in equilibrium with the crystalline solid (range for 95% of drugs:  Table 3.
Ã Asterisk indicates that the property descriptor value falls outside the range of values for 95% of known drugs.
features required for the binding affinity improvement, taking advantage of the S pockets filling ( Figure 6). The computed overall interaction energy complexes FP-2:HLCICx correlated with the observed inhibitory potencies IC 50 exp of the training set HLCIC inhibitors ( Figure 5; Table 5). In particular, the correlation between the van der Waals component of E int and IC 50 exp pointed to the p-p stacking interaction involving the isatin moiety of the inhibitor and Tyr78.
The S2 pocket filling and interaction energy contribution shows dominant effect on the FP-2 inhibition as reported previously 10 . On the other hand, comparison of the contributions to E int between the most active inhibitor HLCIC1 and less active one HLCIC6, corresponds to the trend of activities, but cannot explain the large gap in their inhibitory potencies (1224%). However, in a recent work, we succeeded in justifying the observed 37.5% jump in experimental biological activity between methylphosphonic arginine and hydroxamic acid derivative, both Pf Leucyl aminopeptidase (PfA-M17) inhibitors by the enzyme active site residues contribution to E int at a level of 35% 44 . Therefore, essential structural information needed for the design of novel potent HLCIC analogues was derived from a more predictive descriptor DDG com . New inhibitor candidates were selected according to the workflow (Figure 2), by virtual screening from a diverse VL of analogues with the active site pockets filling as the central structural requirement displayed by the pharmacophore model of FP-2 inhibition provided by the one descriptor (GFE) QSAR model (Table 3; Figure 3).

Analysis of new inhibitors from in silico screening
An analysis of structural requirement for FP-2 inhibition at the level of hydrophobic contacts with the active site revealed that the P2 substituent, namely the R 6 -group in the training set insufficiently explored the S2 pocket of the FP-2 active site. Therefore, new HLCIC analogues that match the FP-2 inhibition pharmacophore and fill better the S2 pocket may form potent FP-2 inhibitors (Table 11) Table 10), the last set of new analogues reached IC 50 pre 500 times better (Table 11). Despite this exceedingly optimistic picture, our approach helped to identify interesting hydrophobic side chains (R 6 -groups) such as (tetrahydro-2H-pyran)Me (128), (piperidin-4yl)Me (129) and n-butane (137) for the S2 pocket filling with a bulkier group compared to the E64 which is decorated only with a short propyl chain at the P2 position. Indeed, from superimposition of the most active training set HLCIC (Figure 9(a)) and the less active ones (Figure 9(b)), we can see that E64 remains the only inhibitor to map the S2 pocket. The superimposition of the best-designed analogues ( Figure 10) highlights the significance of hydrophobic contact with the S2 pocket. In fact, an additional pharmacophore derived from these new HLCIC analogues suggests an additional hydrophobic feature located at the S2 pocket ( Figure 11). Figure 12 shows the interactions of one of the bestdesigned analogues 125-1-1-H-lki-128 (IC 50 pre ¼ 13 nM) with the FP-2 and the Connolly surface of the binding site shows the lipophilic S2 pocket accommodating a bulkier substituent contributing to better stabilisation and greater affinity. These results are in good agreement with the reported structural information from experimental structure-activity relationship on FP-2 and FP-3 pyrimidine-carbonitrile inhibitors 45 as well as QSAR model and in silico design of dipeptide nitriles inhibitors of FP-3 26 and FP-2 46 . These conclusions are also in line with the recent SAR study on synthesis and molecular docking of coumarin containing pyrazoline derivatives as promising inhibitors of in vitro development of a chloroquine-sensitive (MRC-02) and chloroquine-resistant (RKL-2) strain of Pf 47 .
The ADME-related properties were also computed for the best active designed HLCIC as well as for drug used for the treatment of malaria (Table 12). The chief descriptor, the number of stars ( Ã ) deviation of the computed values from the optimum ranges for 95% of known drugs is close to those for known antimalarials. It can be noticed that the human oral absorption in gastrointestinal tract (HOA) is low for the new HLCIC analogues suggesting nonoral delivery. The blood-brain barrier descriptor is in the appropriate range.

Conclusion
Natural-product-like hybrids design for pfFP-2 inhibition stems from the multiple need to provide naturally occurring, resistance overcoming, and favourable pharmacokinetic profile compounds mimicking in this way, how nature synthesise by exploring efficiently chemical space. HLCIC antimalarials structural requirement for falcipain 2 inhibition has been assessed from the complexation "one descriptor" QSAR correlating GFE upon pfFP-2:HLCIC complex formation with activity. Moreover the derived 3D QSAR Pharmacophore was augmented to S2 pocket filling in order to provide a complete PH4 able to guide synthesis of novel potent isatin-chalcone inhibitors. Virtual screening of large and diverse and 127-1-1-H-lki-137 (15 nM); the best ones according to our design strategy based on S2 hydrophobic contact display predicted potency reaching 200 times that of the most active training set HLCIC1. They are recommended for synthesis and evaluation to check the efficiency of our design approach and guide future discovery of non-peptide, natural product like hybrids FP-2 inhibitors.

Disclosure statement
No potential conflict of interest was reported by the authors.