Research Article

Computational evaluation of some indenopyrazole derivatives as anticancer compounds; application of QSAR and docking methodologies

Pages 16-32
Received 29 Jun 2009
Accepted 27 Aug 2010
Published online: 14 Oct 2011

A computational study was performed on a series of indenopyrazole derivatives. Two important procedures in computational drug discovery were employed: docking, for modeling ligand-receptor interactions, and quantitative structure-activity relationship (QSAR) analysis. MIA-QSAR analysis of the studied derivatives produced a model with high predictability. The developed model was then used to evaluate the bioactivity of 54 proposed indenopyrazole derivatives. To confirm the results obtained through this ligand-based method, docking was performed on the selected compounds. An ADME–Tox evaluation was also carried out to identify the most suitable compounds. Satisfactory bioactivities and ADME–Tox profiles for two of the compounds, namely 62 and S13, suggest that further studies should be performed on these chemical structures.

Introduction

Apoptosis inducers are of great importance in the design of new cancer chemotherapeutic agents. The selectivity of drugs for cancer cells is affected by regulation of the cell cycle1. Several regulatory agents affect the cell cycle in eukaryotic cells. These include positive and negative regulators that exert their effects in the G1 phase of the cell cycle. One of the most important positive regulators in eukaryotic cells is cyclin-dependent kinase2,3.

Cyclins, cyclin-dependent kinases (CDKs), and CDK inhibitors (CKIs) work together to regulate the cell cycle. Formation of complexes between cyclins and CDKs is necessary for progression through the cell cycle phases, and CKIs inhibit this process4.

The transition between the G1 and S phases of the cell cycle depends on the activation of CDK2, followed by phosphorylation of the retinoblastoma protein5.

Inhibition of CDK2 activity results in the blockade of cell proliferation. CDK2 inhibitors therefore have potential as anticancer agents.

Today the drug discovery process is accelerated by computational approaches that estimate the activity of candidate molecules prior to their synthesis. Most computational studies in drug design and discovery can be divided into two categories according to the approach employed: (i) protein-based studies, which model ligand-receptor interactions through docking6–9 or molecular dynamics (MD)10–12, and (ii) ligand-based approaches, namely quantitative structure-activity relationship (QSAR) studies13,14. The choice of method depends chiefly on whether information is available about the biological target and the molecules interacting with it. Each approach has advantages and disadvantages. For example, in parallel with advances in synthesis and screening methods, structural knowledge of potential target proteins has grown enormously over the last decade.

This considerable body of 3D structural information at atomic resolution, covering a range of relevant drug targets including receptors, channels, enzymes, and transporter proteins, provides a growing basis for structure-based drug design. On the other hand, in structure-based methods, as in most methods, accuracy comes at the expense of speed. The techniques and algorithms employed for predicting protein-ligand binding energies are too time-consuming for high-throughput use and are therefore applied to only a small number of molecules. These approaches also face other problems, such as insufficient sampling of protein flexibility and inaccuracies in the modeling functions used (for example, the scoring functions applied in docking).

QSAR studies imply that, once a correlation between structure and biological activity is found, any number of molecules, including those not yet synthesized, can be screened computationally to select those with the desired biological activity. The method offers several benefits: it can predict biological activity, and it provides information on the possible types of interaction forces and on the ‘nature’ of the receptor.

As a ligand-based method, QSAR also has recognized limitations. Although numerous different descriptors have been used in QSAR studies, they are still not suitable for explaining some of the main interactions, such as the membrane partition of drugs, the strength of hydrogen bonds, the influence of desolvation energies on drug-receptor binding, and steric interactions with a (most often unknown) binding site. QSAR methods are also of limited use in understanding the mechanism of action of a studied molecule in the whole body.

Hence, applying the two approaches together helps medicinal chemists to design more potent ligands as agonists or antagonists.

The purpose of QSAR methodology is to construct a relationship between physicochemical properties, as independent variables, and the bioactivity of ligands, as the dependent variable. Physicochemical properties are obtained as descriptors, and in silico methods are then applied to process the data, remove noise, and extract useful information. Many descriptors are physicochemically meaningful, so the developed models can be interpreted, whereas others have no direct physicochemical meaning but still contain useful information. Principal components belong to the latter type and can be used in QSAR. Various methodologies have been applied to construct QSAR models, including 2D and 3D methods. Some 3D model-building approaches, such as comparative molecular field analysis (CoMFA), are elaborate and become less reliable, or even impracticable, for large molecules with numerous bonds. Promising efforts to simplify these methods, while offering clear benefits over them, have been emerging. Multivariate image analysis applied to quantitative structure-activity relationships (MIA-QSAR) is one of the more attractive options for QSAR analysis because it requires neither specialized tools nor heavy computation.

Multivariate image analysis is a multivariate regression approach based on data sets obtained from 2D images. In MIA-QSAR, images of bioactive molecules are generated and used as rich sources of information15.

The aim of the present study is to rationalize the bioactivities and binding affinities of a set of CDK2 antagonists using MIA-QSAR and docking methods, in order to design and suggest novel CDK2 antagonists for further evaluation.

The following steps were performed in this computational design approach: (i) MIA-QSAR was applied to construct a QSAR model with ligands extracted from the literature. (ii) Several indenopyrazole derivatives, obtained by modifying the structures of the compounds used, were proposed in order to explore further potential CDK2 antagonists. (iii) The bioactivities of the proposed ligands were calculated using the MIA-QSAR model. (iv) Docking was employed to confirm the results of MIA-QSAR. (v) The pharmacokinetic properties of selected compounds were evaluated in order to identify compounds with high bioavailability.

In recent years, various QSAR methods and docking approaches have been widely and successfully used to explore the structural and conformational requirements for ligands to act as agonists or antagonists of target receptors. Nevertheless, the combination of MIA-QSAR and docking for the design of CDK2 inhibitors has received little attention and, to the best of our knowledge, no such study has been reported for CDK2 inhibitors.

Experimental

Descriptor generation and assigning of training and test sets

The in vitro biological activity data used in this study were the CDK2 inhibitory activities (expressed as −log IC50) of a set of 94 indenopyrazole derivatives selected from the literature16–20. The general chemical structures, the structural details of these compounds, and their activities are reported in Table 1.
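For orientation, the activity scale can be written out explicitly. The worked value below is purely illustrative and assumes that IC50 is expressed in mol/L, which is the usual convention but is not stated in the text.

```latex
\mathrm{pIC}_{50} = -\log_{10}\!\left(\mathrm{IC}_{50}\,[\mathrm{M}]\right),
\qquad
\mathrm{IC}_{50} = 10\ \mathrm{nM} = 10^{-8}\ \mathrm{M}
\;\Rightarrow\;
\mathrm{pIC}_{50} = 8
```

On this scale a higher value corresponds to a more potent inhibitor.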

Table 1.  Structures and details of the molecules used in this study.

The two-dimensional structures of the 94 molecules were built using ChemDraw 7.0 (ChemDraw Ultra, 1985–2001; CambridgeSoft, Cambridge, MA) and saved as bitmaps. Each bitmap was then set to a 940 × 600 pixel window with a resolution of 96 × 96 points per inch. Since the bitmaps must be superimposed in a 2D alignment, a pixel common to the whole series of molecular structures was selected and every molecule was fixed at that coordinate. The coordinate used for alignment, (414, 279), is shown in Figure 1. In MIA-QSAR the descriptors are the pixels of the bitmaps; hence, each 2D image must be converted into a numerical array suitable for statistical modeling. After the bitmaps were transferred to the Matlab environment (version 7.6, 2008; MathWorks, Natick, MA), each 2D image was converted into a double array. The 94 images of 940 × 600 pixels each were then grouped to give a 94 × 940 × 600 three-way array, which was unfolded to a 94 × 564,000 matrix. This matrix was correlated with the dependent variable, the vector of pIC50 values of the studied molecules. The least squares support vector machine (LS-SVM), a non-linear method, was used to build the regression model. Before the regression, columns with zero variance were deleted from the predictor matrix to minimize memory use, giving a final X matrix with 94 rows and 18,161 columns. Thus the rows and columns of the X matrix correspond to the molecules and the non-zero-variance pixels, respectively.

Figure 1.  Building of the 3D and 2D arrays of molecules in this study. The pixel shown by the arrow, common to all molecules, was fixed at the (414, 279) coordinate. The bitmap of each molecule was fixed in a 940 × 600 pixel window. The 3D array of molecules was unfolded to 94 × 564,000.
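The unfolding and zero-variance filtering described above can be sketched in a few lines. This is a minimal illustration in Python with NumPy rather than the Matlab workflow used in the study; the function name `unfold_and_filter` and the toy 3 × 5 pixel images are assumptions made for the example.

```python
import numpy as np

def unfold_and_filter(images):
    """Unfold an (n, h, w) image stack to (n, h*w) and drop constant pixels.

    Returns the filtered matrix and the boolean mask of retained columns.
    """
    images = np.asarray(images, dtype=float)
    n = images.shape[0]
    X = images.reshape(n, -1)                  # 94 x 564,000 in the study
    keep = X.max(axis=0) != X.min(axis=0)      # True where a pixel varies across molecules
    return X[:, keep], keep

# Toy stand-in: 4 "molecules" on 3 x 5 canvases, white (1.0) background.
toy = np.ones((4, 3, 5))
toy[0, 1, 1] = 0.0      # a dark pixel present only in molecule 0
toy[1, 1, 2] = 0.0      # a dark pixel shared by molecules 1 and 2
toy[2, 1, 2] = 0.0

X, keep = unfold_and_filter(toy)
print(X.shape, np.flatnonzero(keep))
```

The mask is worth keeping: images of newly proposed molecules have to be reduced to the same pixel columns before the model can be applied to them.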

After the X matrix was built, about 20% of the molecules (20 out of 94) were selected as the test set for evaluating the performance of the regression model. According to Tropsha et al., the best models are built using the Kennard and Stone algorithm21. This algorithm was applied in the present study22. The molecules assigned to the training and test sets by the Kennard and Stone algorithm are reported in Table 1.
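A minimal sketch of the Kennard and Stone selection is given below. It follows the textbook max-min formulation: start from the two most distant samples, then repeatedly add the sample farthest from its nearest already-selected neighbour. The function name, the use of squared Euclidean distance, and the convention that the selected samples form the training set are assumptions; the study may differ in these details.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Return (selected, remaining) sample indices by max-min distance."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    # Squared Euclidean distances from the Gram matrix (memory-friendly for wide X).
    sq = (X ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # Seed with the two most distant samples.
    i, j = np.unravel_index(np.argmax(d2), d2.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_select:
        # Distance from each remaining sample to its nearest selected sample.
        nearest = d2[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return selected, remaining

# Toy one-dimensional data (hypothetical values).
points = np.array([[0.0], [1.0], [2.0], [10.0], [5.0]])
train_idx, test_idx = kennard_stone(points, 3)
print(train_idx, test_idx)
```

For the data set described here, a call such as `kennard_stone(X, 74)` would leave 20 molecules as the test set.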

QSAR model development

Because of the similarities between the molecular structures used in this study, and since the descriptors used in model building are pixels of the images of the molecules, collinear and noisy descriptors are a serious problem. The least squares support vector machine was used to build the QSAR models.

The use of the principal components of the X matrix generated from the images of the molecules as input to the least squares support vector machine is referred to as PC-LS-SVM.

The support vector machine (SVM) is based on the statistical learning theory developed by Vapnik23. SVM has become very popular for solving classification as well as regression problems, largely because a kernel function allows it to model non-linear relationships between the original data set and the predicted variable24.

The theory of LS-SVM has been described by Suykens et al.; for a detailed theoretical background, readers are referred to their work25.

LS-SVM is also closely related to Gaussian processes and regularization networks, but employs an optimisation approach as in SVM. LS-SVM therefore retains the benefits of SVM, with the additional advantage that training requires solving only a set of linear equations rather than a quadratic programming problem, which is much easier and computationally simpler26. Determination of the optimal input variable subset, a proper kernel function, and the optimum kernel parameters are the key steps in LS-SVM. The kernel functions normally used are the polynomial function and the radial basis function (RBF), exp(−‖xi − x‖²/σ²), which is a Gaussian curve. Each kernel function is associated with a kernel-specific parameter. For the polynomial and RBF kernels, these parameters are the degree of the polynomial (d) and the width of the Gaussian function (σ²), respectively. So instead of calculating a specific mapping for each dimension of the data, the problem comes down to selecting a proper kernel function and optimising its specific parameter.
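The linear system behind LS-SVM regression can be sketched as follows. This is a minimal NumPy illustration of the standard formulation, in which the bias b and the coefficients α solve [[0, 1ᵀ], [1, K + I/γ]] [b; α] = [0; y]; it is not the toolbox code used in the study. The function names and the sine-curve toy data are assumptions, and the RBF kernel follows the form exp(−‖xi − x‖²/σ²) given in the text.

```python
import numpy as np

def rbf_kernel(A, B, sig2):
    """RBF kernel matrix exp(-||a - b||^2 / sig2) between rows of A and B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / sig2)

def lssvm_fit(X, y, gamma, sig2):
    """Solve the LS-SVM linear system; return bias b and coefficients alpha."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                              # constraint: sum(alpha) = 0
    A[1:, 0] = 1.0                              # bias column
    A[1:, 1:] = rbf_kernel(X, X, sig2) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, sig2, X_new):
    """Predict y(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sig2) @ alpha + b

# Toy one-dimensional regression problem (hypothetical data).
X = np.linspace(0.0, 3.0, 20)[:, None]
y = np.sin(X[:, 0])
b, alpha = lssvm_fit(X, y, gamma=1000.0, sig2=1.0)
y_hat = lssvm_predict(X, b, alpha, 1.0, X)
print(float(np.max(np.abs(y_hat - y))))
```

Here `gamma` is the regularization parameter and `sig2` the kernel width, the two quantities tuned by grid search later in the paper.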

The overall performance of LS-SVM is evaluated in terms of root mean square error cross-validation (RMSECV) according to the following equation:

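$$\mathrm{RMSECV} = \sqrt{\frac{\sum_{k=1}^{n_s} \left(y_k - \hat{y}_k\right)^2}{n_s}} \tag{11}$$

where $y_k$ is the experimental value of the biological activity, $\hat{y}_k$ is the activity predicted by the developed model under cross-validation, and $n_s$ is the number of compounds in the analyzed set.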
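As an illustration of how RMSECV is obtained under leave-one-out cross-validation, the sketch below leaves out each compound in turn, refits the model, and predicts the omitted compound. The `fit` and `predict` callables are placeholders for any regression method; the straight-line model and the toy data are assumptions used only to make the example runnable.

```python
import numpy as np

def rmsecv_loo(X, y, fit, predict):
    """Leave-one-out RMSECV for a model given as fit/predict callables."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    y_cv = np.empty(n)
    for k in range(n):
        mask = np.arange(n) != k                 # leave compound k out
        model = fit(X[mask], y[mask])
        y_cv[k] = predict(model, X[k:k + 1])[0]  # predict the omitted compound
    return float(np.sqrt(np.mean((y - y_cv) ** 2)))

# Stand-in model: a least-squares straight line (not the LS-SVM of the study).
def fit_line(X, y):
    return np.polyfit(X[:, 0], y, deg=1)

def predict_line(coef, X):
    return np.polyval(coef, X[:, 0])

X_demo = np.arange(6, dtype=float)[:, None]
y_demo = 2.0 * X_demo[:, 0] + 1.0               # exactly linear toy data
err = rmsecv_loo(X_demo, y_demo, fit_line, predict_line)
print(err)
```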

Docking methodology

The crystal structure of CDK2 protein retrieved from the Protein Data Bank (PDB code: 2BTS) was used.

Twenty-four of the more active compounds among the experimentally determined CDK2 inhibitors (Table 1) were selected for docking on the basis of pIC50. To prepare the ligands for docking, three-dimensional structures of the compounds were constructed and optimized using the Polak-Ribiere conjugate gradient algorithm and the AM1 semi-empirical method implemented in HyperChem. These optimized structures were used as input to AutoDockTools. The partial atomic charges were then calculated using the Gasteiger–Marsili procedure implemented in the AutoDockTools package27. Non-polar hydrogens of the compounds were merged and rotatable bonds were then assigned.

For the protein, all hetero atoms, including water molecules, were removed. All missing hydrogens were added and, after Kollman united-atom charges were assigned, non-polar hydrogens were merged into their corresponding carbons using AutoDockTools28. As the final step of protein preparation, desolvation parameters were assigned to each protein atom. The grid maps (one for each atom type in the ligand, plus one for electrostatic interactions) were constructed using AutoGrid, part of the AutoDock package, and were chosen to be large enough to include not only the active site of the protein but also significant regions of the surrounding surface. In all dockings, grid maps with 60 grid points in each Cartesian direction and a grid-point spacing of 0.375 Å (roughly a quarter of the length of a carbon-carbon single bond) were generated. Since the location of the ligand in the complex was known, the maps were centered on the ligand’s binding site in order to search for regions that could lead to favorable interactions with the functional groups.

Among the three different search algorithms offered by AutoDock 4, the commonly used Lamarckian genetic algorithm (LGA) was applied29, based on previous studies indicating that the two other methods, simulated annealing and the genetic algorithm, are less efficient than LGA.

For every docking procedure, 200 independent runs were performed with step sizes of 0.2 Å for translations and 5° for orientations and torsions. The following LGA parameters were used: an initial population of random individuals with a population size of 150; a maximum of 2.5 × 10⁶ energy evaluations; a maximum of 27,000 generations; an elitism value of 1; a mutation rate of 0.02; and a crossover rate of 0.8. Within the LGA, the pseudo-Solis and Wets local search method was applied30. The number of iterations per local search was 300, and the probability of performing a local search on an individual in the population was 0.06. The maximum number of consecutive successes or failures before doubling or halving the local search step size was 4, and the termination criterion for the local search was 0.01. AutoDockTools was employed to produce both the grid and docking parameter files (.gpf and .dpf files). A 2.0 Å clustering tolerance was applied to construct clusters of the closest conformations, and the initial coordinates of the ligand were used as the reference structure. For the internal validation phase, the ligand structure (the corresponding HETATM and CONECT records) was extracted from the PDB file of CDK2 (2BTS) using a plain text editor. After bond orders were assigned, missing hydrogen atoms were added and a short minimization (100 steepest descent steps using the MM+ force field with a gradient convergence value of 0.05 kcal/(mol Å)) was performed in HyperChem in order to release any internal strain. The same procedure as described above was then used for the more active compounds. Docking results (CDK2-ligand complexes) were visualized using VMD 1.8.6 (ref. 31).

Validation, predictability and robustness of QSAR model

To demonstrate that the resulting model can reliably predict the activity of the studied compounds, several measures of model performance were used. $R^2$, which represents the explained variance for a given set, was used to assess the goodness of fit. In addition, the predictability of a model must be estimated for it to be considered a successful QSAR model. In this investigation, the predictability of the developed model was evaluated using two parameters: the root mean square error (RMSE) and the prediction error sum of squares (PRESS, %).

To assess the predictability and to check the statistical significance of the developed model, it was used to predict pIC50 for an external set that was not used in model building.

Cross-validation is a technique used to explore the reliability of statistical models. Two criteria based on cross-validation were applied in this study: the root mean square error of cross-validation (RMSECV), a standard index of the accuracy of a modeling method, and $R^2_{\mathrm{LOO}}$, another criterion of the predictability of the developed models.

According to Tropsha, a high $R^2_{\mathrm{LOO}}$ does not automatically mean that the developed model has high predictability. A high value of $R^2_{\mathrm{LOO}}$ is thus a necessary but not a sufficient condition for high predictability. In addition to a high $R^2_{\mathrm{LOO}}$, a reliable model should also be characterized by a high $R^2$ between the calculated and experimental values of compounds from a test set32.

Tropsha has suggested several criteria; if they are satisfied, the model can be considered predictive21. These criteria are:

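$$R^2_{\mathrm{LOO}} > 0.5 \tag{12}$$

$$R^2 > 0.6 \tag{13}$$

$$\frac{R^2 - R_0^2}{R^2} < 0.1 \quad \text{or} \quad \frac{R^2 - R_0'^2}{R^2} < 0.1 \tag{14}$$

$$0.85 \le k \le 1.15 \quad \text{or} \quad 0.85 \le k' \le 1.15 \tag{15}$$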

$R^2$ is the correlation coefficient of the regression between the predicted and observed activities of compounds in the training and test sets. $R_0^2$ is the correlation coefficient for the regression of predicted versus observed activities through the origin, $R_0'^2$ is the correlation coefficient for the regression of observed versus predicted activities through the origin, and the slopes of these regression lines through the origin are denoted k and k′, respectively. Detailed definitions of parameters such as $R_0^2$, $R_0'^2$, k and k′ are given in the literature21.

In addition, according to Roy and Roy33, the difference between the values of $R^2$ and $R_0^2$ must also be examined. They suggested the following modified form of $R^2$:

$$r_m^2 = R^2\left(1 - \sqrt{\left|R^2 - R_0^2\right|}\right) \tag{16}$$

If the $r_m^2$ value for a given model is > 0.5, this indicates good external predictability of the developed model.
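The external validation statistics can be computed with a short routine such as the one below. It is a sketch under stated assumptions: the thresholds are the commonly quoted Golbraikh and Tropsha values and should be checked against ref. 21, the assignment of k and R0² to the predicted-versus-observed regression follows the wording of the text, and the function names and activity values are hypothetical.

```python
import numpy as np

def _through_origin(a, b):
    """Regress a on b through the origin; return (slope, R0^2)."""
    slope = float(a @ b / (b @ b))
    ss_res = float(((a - slope * b) ** 2).sum())
    ss_tot = float(((a - a.mean()) ** 2).sum())
    return slope, 1.0 - ss_res / ss_tot

def external_validation_stats(y_obs, y_pred):
    """R^2, through-origin statistics and r_m^2 for a test set."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    r2 = float(np.corrcoef(y_obs, y_pred)[0, 1] ** 2)
    k, r0_sq = _through_origin(y_pred, y_obs)              # predicted vs observed
    k_prime, r0_prime_sq = _through_origin(y_obs, y_pred)  # observed vs predicted
    rm_sq = r2 * (1.0 - np.sqrt(abs(r2 - r0_sq)))
    return {"r2": r2, "r0_sq": r0_sq, "r0_prime_sq": r0_prime_sq,
            "k": k, "k_prime": k_prime, "rm_sq": float(rm_sq)}

def passes_criteria(s, r2_loo):
    """Apply the four threshold criteria (assumed values, see lead-in)."""
    ratio_ok = ((s["r2"] - s["r0_sq"]) / s["r2"] < 0.1
                or (s["r2"] - s["r0_prime_sq"]) / s["r2"] < 0.1)
    slope_ok = 0.85 <= s["k"] <= 1.15 or 0.85 <= s["k_prime"] <= 1.15
    return bool(r2_loo > 0.5 and s["r2"] > 0.6 and ratio_ok and slope_ok)

# Hypothetical test-set activities, not taken from the paper's tables.
y_obs = [6.0, 6.5, 7.0, 7.5, 8.0]
y_pred = [6.1, 6.4, 7.1, 7.4, 8.1]
stats = external_validation_stats(y_obs, y_pred)
print(stats, passes_criteria(stats, r2_loo=0.83))
```

The leave-one-out value is passed in separately because it comes from the training set, whereas the other statistics are computed on the test set.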

Y-randomization, a technique for evaluating chance correlation in the developed models, was also applied in the present study. In this technique, the performance of the original model in describing the data (indicated by $R^2$) is compared with that of models constructed for randomized bioactivities, using the original X block and the original model-building process. If the $R^2$ values of the randomized models are negligible, this supports the robustness of the QSAR model for the specific modeling method and data.
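The Y-randomization test can be sketched as follows. The activity vector is shuffled, the model is rebuilt on the unchanged X block, and the resulting R² values are compared with that of the true model. The ordinary least-squares stand-in model, the function names, and the synthetic data are assumptions; in the study the PC-LS-SVM procedure would take the place of `r2_fit`.

```python
import numpy as np

def r2_fit(X, y):
    """R^2 of an ordinary least-squares fit with intercept (stand-in model)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    centered = y - y.mean()
    return float(1.0 - (resid @ resid) / (centered @ centered))

def y_randomization(X, y, fit_r2=r2_fit, n_rounds=50, seed=0):
    """R^2 values of models rebuilt on randomly permuted activities."""
    rng = np.random.default_rng(seed)
    return np.array([fit_r2(X, rng.permutation(y)) for _ in range(n_rounds)])

# Synthetic data with a genuine structure-activity relationship.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))
y = X @ np.array([1.0, -2.0]) + 0.1 * rng.normal(size=40)

r2_true = r2_fit(X, y)
r2_rand = y_randomization(X, y)
print(round(r2_true, 3), round(float(r2_rand.mean()), 3))
```

A large gap between the true R² and the distribution of randomized R² values is what indicates that the original model is not a chance correlation.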

ADME–Tox and Lipinski’s rule of five evaluation

The pharmacokinetic and toxicity profiles of some of the more potent investigated molecules and of all the proposed molecules were calculated by means of the ADME and Tox boxes of the Pharma Algorithms server34. The SMILES representations of the studied molecules were used as input to the server to predict various features and parameters, including solubility in water, percent oral bioavailability, absorption, distribution, and toxic effects on various parts of the body, such as the blood and the cardiovascular system.

The parameters described in Lipinski’s rule of five, namely logP (the logarithm of the octanol/water partition coefficient), the number of hydrogen bond donor groups, the number of hydrogen bond acceptor groups, and molecular weight, together with the total polar surface area (TPSA), have been shown to correlate with drug absorption and were calculated using the Molinspiration program35. These properties describe ‘drug-likeness’: poor oral absorption or permeation is predicted when a molecule has more than five H-bond donors (HBD), more than 10 H-bond acceptors (HBA), a molecular weight (MW) greater than 500 Da, and a calculated logP (cLogP) higher than 5. TPSA was predicted using the method published by Ertl et al.36 TPSA has proved to be a very good descriptor of drug absorption, including intestinal absorption, bioavailability, Caco-2 permeability, and blood-brain barrier penetration.
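The rule-of-five thresholds quoted above translate directly into a simple check. The sketch below only compares precomputed descriptor values against the four thresholds; it does not calculate the descriptors, and the function name and the example values are hypothetical rather than taken from the paper.

```python
def lipinski_violations(mw, clogp, hbd, hba):
    """Return the rule-of-five thresholds that a compound exceeds."""
    checks = [
        ("HBD > 5", hbd > 5),
        ("HBA > 10", hba > 10),
        ("MW > 500 Da", mw > 500.0),
        ("cLogP > 5", clogp > 5.0),
    ]
    return [label for label, exceeded in checks if exceeded]

# Hypothetical descriptor values for two compounds.
ok_compound = lipinski_violations(mw=389.4, clogp=3.1, hbd=3, hba=7)
flagged = lipinski_violations(mw=612.7, clogp=5.6, hbd=2, hba=9)
print(ok_compound)   # []
print(flagged)       # ['MW > 500 Da', 'cLogP > 5']
```

Returning the list of exceeded thresholds, rather than a single pass or fail flag, makes it easier to see which property of a proposed compound would need to be modified.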

Results and discussion

QSAR model

To find novel and potent CDK2 antagonists, we applied computational methods as valuable tools. Various software packages for molecular model building, docking, least squares support vector machine regression, and statistical analysis were employed to generate a predictive model of biological activity and to design novel and potent CDK2 inhibitors. Computational methods can help medicinal chemists identify and optimize lead agonists and antagonists, improving the overall efficiency of the drug design procedure.

As discussed previously, the descriptors generated in multivariate image analysis have no direct physicochemical interpretation because they are pixel values. As shown by Geladi and Esbensen37, multivariate image analysis can provide valuable information in chemistry, and 2D images of chemical structures contain useful chemical information38.

Hence the numerous descriptors obtained may be treated in a multivariate way in order to correlate the structures of compounds (as images) with the corresponding bioactivities (pIC50). As mentioned above, the bitmaps of the 94 molecules were matricized, yielding a large number of descriptors (columns of the X block) for each molecule. The logarithm of the inverse of the biological activity (log 1/IC50) of the 94 molecules was used as the dependent variable. After the molecules were divided into calibration and validation sets using the Kennard and Stone algorithm, models were built on the training set. The developed models were then used to predict the activity of the molecules in the test set in order to evaluate model performance.

Images can be considered rich sources of information with a wide variety of applications in different branches of chemistry, including QSAR. Here an MIA-QSAR method is used to build a model relating images of chemical structures to the bioactivity of molecules. The developed model is then applied to evaluate the bioactivity of proposed CDK2 inhibitors.

To confirm the results obtained through this ligand-based method, docking studies, which model ligand-receptor interactions, were performed for the proposed ligands and for 24 of the more active CDK2 inhibitor molecules.

Also, an ADME–Tox (absorption, distribution, metabolism, excretion and toxicity) evaluation was carried out to search for the best predicted compounds.

One of the most important issues in image analysis is collinearity. Because the columns of the X matrix used in model building are pixels of the images of the molecules, collinear descriptors are very likely in MIA-QSAR. Methods based on orthogonalization of the original variables, such as principal component analysis (PCA), can therefore be used.
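A minimal sketch of this orthogonalization step is shown below, using a singular value decomposition of the mean-centered pixel matrix. The function names and the random stand-in data are assumptions; the second function shows how images of new compounds would be projected with the training mean and loadings, so that their scores are comparable with those used to build the model.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return column mean, loadings and scores of the first principal components."""
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    loadings = Vt[:n_components].T               # shape: (n_pixels, n_components)
    scores = U[:, :n_components] * S[:n_components]
    return mean, loadings, scores

def pca_transform(X_new, mean, loadings):
    """Project new samples onto loadings estimated from the training set."""
    return (X_new - mean) @ loadings

# Random stand-in for a pixel matrix: 10 "molecules" x 50 "pixels".
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 50))
mean, loadings, scores = pca_fit(X, n_components=3)

gram = scores.T @ scores                         # diagonal when scores are orthogonal
print(np.round(gram, 6))
```

Because the score columns are mutually orthogonal, the collinearity among the original pixel columns is removed before regression.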

To study the relationship between the calculated PCs and the bioactivities of the compounds, the LS-SVM regression method was used. The number of PCs entering the model was selected on the basis of the lowest root mean square error of calibration (RMSEC) and root mean square error of cross-validation (RMSECV). On these criteria, 9 PCs were chosen to construct the regression model (Figure 2). Nonlinearity was handled by using an RBF kernel function. The quality of LS-SVM regression depends on the gamma and sig2 parameters: gamma is the regularization parameter and sig2 is the kernel parameter, and both must be optimized. To find the optimal values, a grid search based on RMSECV was performed on the training set over all combinations of gamma from 1 to 250 and sig2 from 1 to 200, in increments of 1 for both. These ranges were selected on the basis of previous studies. A robust model is obtained by selecting the parameters that give the lowest error. The mesh plot of RMSECV as a function of gamma and sig2 is shown in Figure 3. The results indicate that gamma = 63 and sig2 = 82 gave the optimum LS-SVM performance. The non-linear regression model was trained using the training objects and evaluated on the test molecules.

Figure 2.  Determination of the number of factors used in model building in PC-LS-SVM.

Figure 3.  Optimization of values of Gamma and Sig2.

The predicted pIC50 values for the training and test data and the relative error of prediction (REP) of the model are listed in Table 1. The low REP values confirm the high predictability of the model.

Figure 4 depicts the plot of observed versus predicted values for the training and test sets. The residuals of the PC-LS-SVM predicted values are plotted against the experimental values in Figure 5. Since the residuals are distributed on both sides of the zero line, there is no systematic error in the PC-LS-SVM model.

Figure 4.  The plot of PC-LS-SVM calculated against experimental values of bioactivities of investigated compounds.

Figure 5.  Plot of the residuals versus experimental activity.

The statistical parameters, such as $R^2$ between the calculated and experimental values obtained using PC-LS-SVM for the training and test sets, are shown in Table 2.

Table 2.  Various calculated statistical criteria for the developed model.

Inspection of the RMSE and PRESS values reveals that the PC-LS-SVM method is appropriate for predicting the inhibitory activity of the studied compounds. PC-LS-SVM also shows a good fit between predicted and experimental values in the various sets.

A leave-one-out cross-validation was carried out to confirm the practicability of the calibration model. In this procedure, 73 models were generated, each with a different molecule left out as the test molecule, and a good regression was obtained ($R^2_{\mathrm{CV}}$ of 0.83 and RMSECV of 0.28). Nevertheless, leave-one-out cross-validation is not recommended as the only method for validating a QSAR model; external validation is also necessary32. Thus the total data set was divided into training and test sets using the Kennard and Stone algorithm, and the bioactivities of the compounds in the test set were predicted using the regression model built from the molecules in the training set.

This external validation procedure yielded an $R^2$ of 0.85 and an RMSE of 0.23.

These results show that the generated QSAR model is predictive and stable.

As can be seen in Table 1, the REP values range between −0.063 and +0.071, which shows the high predictive capability of the developed QSAR model.

The generated model also passed the remaining criteria that we applied to assess its predictive ability (see equations 12–16 in the model validation section). The corresponding values for the criteria are as follows:

Suggesting new CDK2 inhibitors

To design novel indenopyrazole derivatives with high CDK2 inhibitory activity, we employed the developed MIA-QSAR model to calculate the inhibitory activities of suggested compounds, since there was a good correlation between the experimental and calculated activities of the compounds used in model development. Novel compounds were suggested by adding various substituents to the main scaffold of the general structure shown in Table 1. Images of these novel ligands were prepared and their principal components were generated. The bioactivities of the suggested ligands were then calculated from these PCs using the developed model.

The general structures and structural details of the 54 suggested compounds, together with their calculated activities, are reported in Table 3. Structurally, the suggested compounds are combinations of the most potent compounds of Table 1, especially compounds 62, 12, 17, 59 and 93. All suggested compounds, as well as the compounds selected from Table 1, were submitted to docking evaluation in order to corroborate the predicted trends by comparing their binding energies to the receptor.

Table 3.  Structures and details of the proposed molecules as novel CDK2 inhibitors. Predicted Activity is the theoretical inhibitory activity of each suggested compound calculated using the developed QSAR model. Estimated Free Energy of Binding is the ΔG of each suggested compound calculated by AutoDock.

Docking

The proposed method was further validated by docking a series of the more potent CDK2 inhibitors of Table 1, as well as the suggested compounds listed in Table 3, into the binding site. Docking studies of binding modes help to clarify key structural characteristics and interactions, and provide useful data for suggesting effective CDK2 inhibitors.

In the internal validation phase of docking, 4-[(5-isopropyl-1,3-thiazol-2-yl)amino]benzenesulfonamide (Figure 6) was docked into CDK2 according to the docking protocol described in the experimental section. The lowest-energy docked pose is shown in Figure 7. After the experimental and predicted conformations were superimposed, the RMSD was 1.02 Å, which is considered a successful docking39,40. This result shows that the parameters set for the AutoDock simulations are suitable for reproducing the X-ray structure, and that this in silico approach is fairly robust and suitable for assessing the interaction of such ligands with CDK2. After internal validation, docking analysis was performed on some of the more potent compounds selected from Table 1. The predicted ΔG (estimated free energy change of binding) values for each compound are presented in Table 4. The correlation coefficient between the theoretically predicted ΔG and pIC50 is 0.249.

Figure 6.  Chemical structure of the ligand (4-[(5-isopropyl-1,3-thiazol-2-yl)amino]benzenesulfonamide) used in the internal validation phase of docking.

Figure 7.  Internal validation phase result. CDK2 active site structure rendered as a solvent-excluded surface, and conformational comparison of 4-[(5-isopropyl-1,3-thiazol-2-yl)amino]benzenesulfonamide from the crystal structure (cyan) with that from the AutoDock model (ochre).

Table 4.  Estimated free energy change of binding (ΔG) (kcal mol−1) for some of more potent compounds of Table 1.

As discussed above, 54 compounds were suggested as CDK2 inhibitors, and their activities were calculated using the developed MIA-QSAR model. They were also docked into the CDK2 structure, and the resulting estimated free energies of binding are reported in Table 3. For these compounds, the predicted ΔG values correlate with the predicted pIC50 values with R² = 0.320. On the basis of these results, compounds S26, S14, S13, S51, and S52 were selected for further evaluation. As reported in Table 3, these compounds have higher predicted activity and relatively higher estimated free energy of binding than the other proposed compounds. In addition, five compounds of Table 1 (62, 17, 93, 59, and 12) that have relatively higher activities and estimated free energies of binding than the others were chosen for the next steps of CDK2 inhibitor development.

To summarize the activities and ΔG values of the compounds selected from Table 1 and Table 3, both are collected in Table 5. As can be seen in this table, the proposed compounds have fairly higher activities and ΔG of binding than the selected compounds of Table 1, so the docking results support the findings of the developed QSAR model. According to Table 5, the order of bioactivities of these compounds is: S26 > 62 > S14 > S13 > 17 > S51 > S52 > 12 > 59 > 93. On the basis of ΔG of binding, the order is: S13 > S14 > 17 > S26 > 62 > S51 > 12 > 93 > 59 > S52; that is, compound S13 interacts more strongly with the CDK2 binding site than the other compounds.
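The agreement between docking and QSAR predictions is summarized here by a correlation statistic. A minimal sketch of how a Pearson correlation coefficient and its square can be computed for paired ΔG and pIC50 values is shown below. The numbers are hypothetical placeholders, not entries from Table 3, and the function name is illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only (kcal/mol and pIC50 units).
delta_g = [-9.1, -8.4, -8.9, -7.6, -8.0]
pic50 = [8.2, 7.1, 7.9, 6.5, 7.4]

r = pearson_r(delta_g, pic50)
print(f"r = {r:.3f}, R^2 = {r * r:.3f}")
```

Because a more negative ΔG corresponds to tighter binding, a consistent pair of predictions gives a negative r against pIC50; R² removes the sign and reports only the strength of the linear relationship.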
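The two orderings quoted above can be compared quantitatively with a rank correlation. The sketch below computes Spearman's coefficient directly from the two rankings as listed in the text; it is offered only as an illustration of how the agreement between the QSAR and docking rankings could be measured, and it assumes no tied ranks.

```python
# Orderings as stated in the text, strongest first.
activity_order = ["S26", "62", "S14", "S13", "17", "S51", "S52", "12", "59", "93"]
binding_order = ["S13", "S14", "17", "S26", "62", "S51", "12", "93", "59", "S52"]


def spearman_rho(order_a, order_b):
    """Spearman rank correlation for two orderings of the same items (no ties)."""
    if sorted(order_a) != sorted(order_b) or len(order_a) < 2:
        raise ValueError("both orderings must contain the same items (at least 2)")
    rank_b = {name: rank for rank, name in enumerate(order_b, start=1)}
    n = len(order_a)
    # Sum of squared rank differences over all compounds.
    d_squared = sum(
        (rank_a - rank_b[name]) ** 2
        for rank_a, name in enumerate(order_a, start=1)
    )
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


rho = spearman_rho(activity_order, binding_order)
print(f"Spearman rho = {rho:.3f}")  # prints 0.721 for the orderings above
```

A value near 1 would mean the two methods rank the compounds almost identically; intermediate values indicate partial agreement.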

Table 5.  Activities and estimated free energy change of binding (ΔG) of selected compounds of Table 1 and Table 3.

Among the selected compounds, the final docked poses of the two most potent compounds, one with experimentally determined activity (compound 62) and one from the suggested set (compound S26), are shown in Figure 8. The protein coordinates of CDK2 bound to 4-[(5-isopropyl-1,3-thiazol-2-yl)amino]benzenesulfonamide were downloaded from the Protein Data Bank, and the ligand was separated from the protein. The compounds were docked into the ATP-binding site of CDK2.

The docking process tries to discover the correct binding poses within the binding site of the protein, while the scoring functions try to predict the binding affinity of the ligand for that site. The scoring functions serve three purposes: (i) ranking the conformations produced by the docking process for one ligand interacting with a specified protein, which is necessary to detect the binding mode that best approximates the experimentally observed pose; (ii) ranking different ligands with respect to binding to one protein, i.e. sorting ligands according to their affinity, which is necessary in virtual screening; and (iii) ranking one or more ligands with respect to their binding affinity to different proteins, which is essential for considering specificity and selectivity41.

As shown in Figure 8, the docked conformations of compound 62 and compound S26 in the binding pocket of CDK2 are very close to each other. In both, the sulfur atom of the ligand can form a hydrogen bond with CDK2. For example, as can be seen in Figure 8, the sulfur atom of compound 62 is hydrogen bonded to Leu83, and in compound S26 the sulfur atom is hydrogen bonded to Asn136. In Figure 8A, the H-bond donor (–NH) of the ligand also forms a hydrogen bond with the Glu81 carbonyl group of the CDK2 backbone. According to Sielecki et al., one of the most crucial considerations is that high activity and selectivity in CDK2 inhibitors require the formation of at least two hydrogen bonds between the ligand and the ATP-binding pocket42. Hence, several potential hydrogen-bonding sites must be considered in suggested CDK2 inhibitors.

Figure 8.  Docking simulation results for (A) the most potent compound with experimentally determined activity (compound 62 in Table 1) and (B) the most potent compound in the suggested set (compound S26 in Table 3). Hydrogen bonds are shown as green dashed lines, and the ligand structure is rendered as a colored, scaled ball-and-stick model.

The docking results are consistent with previously published descriptions of the CDK2 binding site43,44.

It is known that ideal drug-like compounds have suitable molecular properties for desirable absorption, distribution, metabolism, and excretion (ADME). Hence, in addition to the bioactivities of the compounds, their ADME properties are an essential feature that must be considered. The development of many drug-like compounds has been stopped because of poor absorption. Consequently, ADME–Tox features were calculated for the most potent compounds of the suggested and experimentally evaluated datasets to compare their suitability for inhibiting CDK2 as anticancer agents.

ADME–Tox evaluation

Diverse filters can be employed to eliminate “non-drug-like” compounds and keep only those that are similar to drugs. Poor pharmacokinetic properties are one of the major causes of terminated drug development programs45. The Lipinski rule-of-five is a popular procedure for evaluating the drug-likeness profile of molecules. The pharmacokinetic characteristics of the compounds in Table 5 were evaluated by determining their ADME features. In particular, we calculated the Lipinski rule-of-five parameters for each compound. In brief, this rule states that good absorption or permeation is more likely when a compound has a molecular weight (MW) of 500 or less, a log P no higher than 5, five or fewer hydrogen bond donor sites (NH and OH groups), and 10 or fewer hydrogen bond acceptor sites (N and O atoms). Compounds breaching more than one of these conditions may have poor oral bioavailability. Additionally, we calculated the topological polar surface area (TPSA). This feature has been shown to correlate well with passive molecular transport through membranes; TPSA therefore permits prediction of the transport properties of molecules and has been related to drug bioavailability46. Oral bioavailability is inversely related to TPSA: passively absorbed molecules with a TPSA higher than 140 Å² are considered to have low oral bioavailability36. The calculated rule-of-five parameters for these compounds are reported in Table 6. Among the considered compounds, compounds 62, 17, 59, and S13 did not breach any of Lipinski’s conditions and are thus expected to have high bioavailability.

Table 6.  Results for the calculated Lipinski’s rule of five and topological polar surface area (TPSA).
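The rule-of-five screen described above amounts to four threshold checks and a count of violations. A minimal sketch follows; the descriptor values are hypothetical (they are not the Table 6 entries), the class and function names are illustrative, and in a real workflow the descriptors would come from a cheminformatics package rather than being typed in by hand.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors for one compound."""
    mol_weight: float      # molecular weight
    log_p: float           # octanol/water partition coefficient
    h_bond_donors: int     # NH and OH groups
    h_bond_acceptors: int  # N and O atoms


def lipinski_violations(d):
    """Return the list of rule-of-five conditions that the compound breaches."""
    checks = {
        "MW > 500": d.mol_weight > 500,
        "logP > 5": d.log_p > 5,
        "H-bond donors > 5": d.h_bond_donors > 5,
        "H-bond acceptors > 10": d.h_bond_acceptors > 10,
    }
    return [name for name, violated in checks.items() if violated]


def passes_rule_of_five(d):
    """Breaching more than one condition flags likely poor oral bioavailability."""
    return len(lipinski_violations(d)) <= 1


# Hypothetical compounds for illustration only.
compliant = Descriptors(mol_weight=350.4, log_p=3.2, h_bond_donors=2, h_bond_acceptors=5)
flagged = Descriptors(mol_weight=612.7, log_p=5.8, h_bond_donors=3, h_bond_acceptors=9)

for label, compound in (("compliant", compliant), ("flagged", flagged)):
    print(label, lipinski_violations(compound), passes_rule_of_five(compound))
```

Returning the list of breached conditions, rather than only a pass/fail flag, makes it easy to see which property would need to be changed in a follow-up design.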

In addition, the ADME factors would help to predict the in vivo pharmacokinetics of a compound prior to any experiments in animals or man.

The values of the calculated parameters representing absorption, distribution, metabolism, and excretion, as well as toxic effects on various parts of the body, are reported in Table 7. In principle, all of the studied compounds show both benefits and drawbacks.

As can be seen in Table 7, the first rows report various features describing the absorption of the compounds. Absorption of drugs from the gastrointestinal (GI) tract is a very complex phenomenon. A large number of features, which can be divided into three groups (physicochemical, physiological, and formulation-related), affect the absorption of compounds from the GI tract.

Because formulation-related features are regularly optimized experimentally, while physiological features cannot be controlled, absorption is assumed to be a function of the physicochemical properties of the molecules47.

The GI tract consists of several segments with different pH values. Absorption in each segment depends on the pH and the solubility; hence, the first part of Table 7 reports the calculated solubility in the various segments of the GI tract. In this model, each intestinal segment is treated as a separate compartment, and the model describes fluid movement in the GI tract while calculating drug absorption in each intestinal segment over time. Summing the drug absorption calculated for each segment gives the % oral bioavailability. For each compound, the % oral bioavailability and the average absorption per cm of GI tract are reported in Table 7. As can be seen, compounds 62, 17, and 59 have high values of oral bioavailability.

Table 7.  ADME–Tox parameters and their results.

Tissue distribution is a crucial determinant of the pharmacokinetic profile of a compound, and the volume of distribution (Vd) is a marker used to measure it. A relatively high value of Vd means that the compound is retained in the body to a greater extent. In Table 7, compounds 62, S26, S14, S13, S51, and S52 have relatively higher values of Vd than the others.

The results indicate that the considered compounds are not mutagenic according to the Ames test and have more or less the same probability of toxic effects on the various parts of the body.

Conclusions

A computational procedure was performed on some indenopyrazole derivatives. Two important procedures in computational drug discovery were employed, namely docking for modeling ligand-receptor interactions and quantitative structure-activity relationships. The aim of this study was to rationalize the bioactivities and binding affinities of the studied compounds and to design and propose novel CDK2 antagonists. In silico evaluation of the bioactivity of these new compounds was also carried out.

The MIA-QSAR analysis of the compounds produced a model with high predictability, and the developed model was used to evaluate the bioactivity of some proposed indenopyrazole derivatives. The more potent compounds were considered for further steps of computational evaluation, including docking and ADME–Tox profiling. Docking of the more potent CDK2 inhibitors selected from Table 1, and of some of the suggested compounds listed in Table 3, into the binding site showed that hydrogen bonding is an important factor determining the interaction of ligands with the CDK2 binding site.

It is known that ideal drug-like compounds have suitable molecular properties for desired absorption, distribution, metabolism, and excretion. The present study showed that the compounds investigated as more potent have mostly good ADME properties and are not mutagenic according to the Ames test. These compounds have almost the same probability of toxic effects on the various parts of the body. The satisfactory bioactivities and ADME–Tox profiles of two of them, namely 62 and S13, suggest that further studies should be performed on these chemical structures.

Acknowledgment

The authors wish to thank Professor Matheus Freitas for his useful advice and comments.

Declaration of interest

The authors declare no conflicts of interest.

References

  • Lee JM, Bernstein A. Apoptosis, cancer and the p53 tumour suppressor gene. Cancer Metastasis Rev 1995;14:149–161.
  • Desai D, Gu Y, Morgan DO. Activation of human cyclin-dependent kinases in vitro. Mol Biol Cell 1992;3:571–582.
  • Hunt T. Maturation promoting factor, cyclin and the control of M-phase. Curr Opin Cell Biol 1989;1:268–274.
  • Xie T, Niu Y, Ge K, Lu S. Regulation of keratinocyte proliferation in rats with deep, partial-thickness scald: modulation of cyclin D1-cyclin-dependent kinase 4 and histone H1 kinase activity of M-phase promoting factor. J Surg Res 2008;147:9–14.
  • Zhou W, Takuwa N, Kumada M, Takuwa Y. Protein kinase C-mediated bidirectional regulation of DNA synthesis, RB protein phosphorylation, and cyclin-dependent kinases in human vascular endothelial cells. J Biol Chem 1993;268:23041–23048.
  • Bennion C, Connolly S, Gensmantel NP, Hallam C, Jackson CG, Primrose WU et al. Design and synthesis of some substrate analogue inhibitors of phospholipase A2 and investigations by NMR and molecular modeling into the binding interactions in the enzyme-inhibitor complex. J Med Chem 1992;35:2939–2951.
  • Noel JP, Bingman CA, Deng TL, Dupureur CM, Hamilton KJ, Jiang RT et al. Phospholipase A2 engineering. X-ray structural and functional evidence for the interaction of lysine-56 with substrates. Biochemistry 1991;30:11801–11811.
  • Ortiz AR, Pisabarro MT, Gallego J, Gago F. Implications of a consensus recognition site for phosphatidylcholine separate from the active site in cobra venom phospholipases A2. Biochemistry 1992;31:2887–2896.
  • Sessions RB, Dauber-Osguthorpe P, Campbell MM, Osguthorpe DJ. Modeling of substrate and inhibitor binding to phospholipase A2. Proteins 1992;14:45–64.
  • Hariprasad V, Kulkarni VM. A molecular dynamics study of the three-dimensional model of human synovial fluid phospholipase A2–transition state mimic complexes. J Mol Recognit 1996;9:95–102.
  • Thunnissen MM, Kalk KH, Drenth J, Dijkstra BW. Structure of an engineered porcine phospholipase A2 with enhanced activity at 2.1 Å resolution. Comparison with the wild-type porcine and Crotalus atrox phospholipase A2. J Mol Biol 1990;216:425–439.
  • Tomoo K, Yamane A, Ishida T, Fujii S, Ikeda K, Iwama S et al. X-ray crystal structure determination and molecular dynamics simulation of prophospholipase A2 inhibited by amide-type substrate analogues. Biochim Biophys Acta 1997;1340:178–186.
  • Ortiz AR, Pastor M, Palomer A, Cruciani G, Gago F, Wade RC. Reliability of comparative molecular field analysis models: effects of data scaling and variable selection using a set of human synovial fluid phospholipase A2 inhibitors. J Med Chem 1997;40:1136–1148.
  • Ortiz AR, Pisabarro MT, Gago F, Wade RC. Prediction of drug binding affinities by comparative binding energy analysis. J Med Chem 1995;38:2681–2691.
  • Eriksson L, Wold S, Trygg J. Multivariate analysis of congruent images (MACI). Journal of Chemometrics. 2005;19:393–403.
  • Nugiel DA, Etzkorn AM, Vidwans A, Benfield PA, Boisclair M, Burton CR et al. Indenopyrazoles as novel cyclin dependent kinase (CDK) inhibitors. J Med Chem 2001;44:1334–1336.
  • Nugiel DA, Vidwans A, Dzierba CD. Parallel synthesis of acylsemicarbazide libraries: preparation of potent cyclin dependent kinase (cdk) inhibitors. Bioorg Med Chem Lett 2004;14:5489–5491.
  • Nugiel DA, Vidwans A, Etzkorn AM, Rossi KA, Benfield PA, Burton CR et al. Synthesis and evaluation of indenopyrazoles as cyclin-dependent kinase inhibitors. 2. Probing the indeno ring substituent pattern. J Med Chem 2002;45:5224–5232.
  • Yue EW, DiMeo SV, Higley CA, Markwalder JA, Burton CR, Benfield PA et al. Synthesis and evaluation of indenopyrazoles as cyclin-dependent kinase inhibitors. Part 4: Heterocycles at C3. Bioorg Med Chem Lett 2004;14:343–346.
  • Yue EW, Higley CA, DiMeo SV, Carini DJ, Nugiel DA, Benware C et al. Synthesis and evaluation of indenopyrazoles as cyclin-dependent kinase inhibitors. 3. Structure activity relationships at C3(1,2). J Med Chem 2002;45:5233–5248.
  • Tropsha A, Gramatica P, Gombar V. The Importance of Being Earnest: Validation is the Absolute Essential for Successful Application and Interpretation of QSPR Models. QSAR & Combinatorial Science. 2003;22:69–77.
  • Kennard R, Stone L. Computer Aided Design of Experiments. Technometrics. 1969;11:137–48.
  • Vapnik VN. The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995.
  • Schölkopf B, Smola AJ. Learning with Kernels. Cambridge: MIT press, 2002.
  • Suykens JAK, Vandewalle J. Least Squares Support Vector Machine Classifiers. Neural Processing Letters. 1999;9:293–300.
  • Thissen U, Ustün B, Melssen WJ, Buydens LM. Multivariate calibration with least-squares support vector machines. Anal Chem 2004;76:3099–3105.
  • Gasteiger J, Marsili M. Iterative partial equalization of orbital electronegativity-a rapid access to atomic charges. Tetrahedron. 1980;36:3219–3228.
  • Weiner SJ, Kollman PA, Case DA, Singh UC, Ghio C, Alagona G, et al. A new force field for molecular mechanical simulation of nucleic acids and proteins. Journal of the American Chemical Society. 1984;106:765–784.
  • Morris GM, Goodsell DS, Halliday RS, Huey R, Hart WE, Belew RK, et al. Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. Journal of Computational Chemistry. 1998;19:1639–1662.
  • Solis FJ, Wets RJB. Minimization by random search techniques. Mathematics of Operations Research. 1981;6:19–30.
  • Humphrey W, Dalke A, Schulten K. VMD: visual molecular dynamics. J Mol Graph 1996;14:33–8, 27.
  • Golbraikh A, Tropsha A. Beware of q2! J Mol Graph Model 2002;20:269–276.
  • Roy PP, Roy K. On some aspects of variable selection for partial least squares regression models. QSAR and Combinatorial Science. 2008;27:302–313.
  • PharmaAlgorithms. 2008 [cited; Available from:
  • MolinspirationCheminformatics. [cited; Available from:
  • Ertl P, Rohde B, Selzer P. Fast calculation of molecular polar surface area as a sum of fragment-based. J Med Chem. 2000;43:3714–3717.
  • Geladi P, Esbensen K. Can image analysis provide information useful in chemistry? Journal of Chemometrics. 1989;3:419–429.
  • Freitas MP, Brown SD, Martins JA. MIA-QSAR: A simple 2D image-based approach for quantitative structure-activity relationship analysis. Journal of Molecular Structure. 2005;738:149–154.
  • Erickson JA, Jalaie M, Robertson DH, Lewis RA, Vieth M. Lessons in molecular recognition: the effects of ligand and protein flexibility on molecular docking accuracy. J Med Chem 2004;47:45–55.
  • Gohlke H, Hendlich M, Klebe G. Knowledge-based scoring function to predict protein-ligand interactions. J Mol Biol 2000;295:337–356.
  • Sotriffer C, Stahl M. In: Sotriffer C, Klebe G, Stahl M, Abraham D, editors. Burger’s Medicinal Chemistry and Drug Discovery. New York: John Wiley and Sons, 2003.
  • Sielecki TM, Boylan JF, Benfield PA, Trainor GL. Cyclin-dependent kinase inhibitors: useful targets in cell cycle regulation. J Med Chem 2000;43:1–18.
  • Kim H, Lee E, Kim J, Jung B, Chong Y, Ahn JH et al. A flavonoid gossypin binds to cyclin-dependent kinase 2. Bioorg Med Chem Lett 2008;18:661–664.
  • Lawrie AM, Noble ME, Tunnah P, Brown NR, Johnson LN, Endicott JA. Protein kinase inhibition by staurosporine revealed in details of the molecular interaction with CDK2. Nat Struct Biol 1997;4:796–801.
  • Prentis RA, Lis Y, Walker SR. Pharmaceutical innovation by the seven UK-owned pharmaceutical companies (1964-1985). Br J Clin Pharmacol 1988;25:387–396.
  • Freitas MP. MIA-QSAR modelling of anti-HIV-1 activities of some 2-amino-6-arylsulfonylbenzonitriles and their thio and sulfinyl congeners. Org Biomol Chem 2006;4:1154–1159.
  • Boobis A, Gundert-Remy U, Kremers P, Macheras P, Pelkonen O. In silico prediction of ADME and pharmacokinetics. Report of an expert meeting organised by COST B15. Eur J Pharm Sci 2002;17:183–193.
 
