Vanillin enones as selective inhibitors of the cancer associated carbonic anhydrase isoforms IX and XII. The out of the active site pocket for the design of selective inhibitors?

Abstract
New C-glycosides and α,β-unsaturated ketones incorporating the 4-hydroxy-3-methoxyphenyl (vanillin) moiety have been investigated as inhibitors of carbonic anhydrase (CA, EC 4.2.1.1) isoforms. The inhibition profile of these compounds is presented against four human CA (hCA) isozymes, comprising hCAs I and II (cytosolic, ubiquitous enzymes) and hCAs IX and XII (tumour-associated isozymes). Docking analysis of the inhibitors within the active sites of these enzymes has been performed and is discussed, showing that the observed selectivity could be explained in terms of an alternative pocket outside the CA active site where some of these compounds may bind. Several derivatives were identified as selective inhibitors of the tumour-associated hCA IX and XII. Their discovery might be a step in the strategy for finding an effective non-sulfonamide CA inhibitor useful in therapy/diagnosis of hypoxic tumours or other pathologies in which CA isoforms are involved.


Introduction
Carbonic anhydrases (CAs, EC 4.2.1.1) are members of a large family of metalloproteins found in most organisms, such as bacteria, archaea, fungi, protozoans, plants, algae, and vertebrates 1 . CAs catalyze one of the most important physiological reactions: the reversible hydration of carbon dioxide with the formation of a proton and bicarbonate 1,2 . To date, eight genetically distinct CA families are known: the α-, β-, γ-, δ-, ζ-, η-, θ-, and ι-CAs. Mammals possess only α-CAs, while many pathogenic organisms such as bacteria and fungi encode enzymes belonging to several families, among them α-, β-, γ-, and ι-CAs. Most of these enzymes contain a metal ion (usually Zn2+) in their active site, which is coordinated by three histidine residues and a water molecule/hydroxide ion in the α-CA class 1,2 , whereas the recently described ι-CAs seem to be devoid of metal ions and perform the catalysis by a different mechanism 3 . In mammals, 16 different α-CA isoforms are known to date, with human CA isozymes hCA I and II (cytosolic forms) being the most widespread ones throughout the human body 1 . Dysregulated CA activity is linked with numerous pathological states 1 . In recent years, the connection between the overexpression of the transmembrane isoforms hCA IX and XII and cancer progression has been investigated in detail and clarified 4 . These isozymes are overexpressed in solid tumours as a consequence of the hypoxia-inducible factor-1 (HIF-1) signalling cascade, being less expressed in several normal tissues 4, 5 . Thus, selective inhibition of the tumour-associated isoforms hCA IX and XII (over the off-target hCA I and II) holds great promise for the use of CA inhibitors in cancer therapy/diagnosis 4, 5 .
The intake of protective factors that fortify the body's natural defense capacity to reduce the risk of cancer is an approach called chemoprevention. Recently, natural products containing phenols have been recognised as cancer chemopreventive agents [6][7][8] . Ginger (Zingiber officinale) is one of the most widely consumed spices in the world 6,7 . Ginger contains several phenolic compounds possessing antimicrobial and analgesic activity, such as 6-gingerol and 6-paradol, which incorporate the vanillin (4-hydroxy-3-methoxyphenyl) moiety in their molecules. Related 6-dehydroparadols have shown antitumor-promoting activity in mouse skin carcinogenesis 9 . 6-Shogaol, another vanillin derivative found in ginger, suppresses the proliferation of non-small cell lung cancer 9 . Phenol derivatives, including some natural products, have also been studied as CA inhibitors (CAIs) [10][11][12][13] .
Several phenolic derivatives share a common moiety: an α,β-unsaturated carbonyl group, or enone. Incorporation of this functionality into a given molecular structure has been shown to increase antitumor activity 14 . For instance, the introduction of an enone moiety into a triterpenoid skeleton enhanced the cytotoxic activity against human breast cancer cells 15 .
Our group has developed several selective CAIs by the attachment of C-glycosidic enones to carbohydrates 16 . The use of carbohydrate moieties could induce the desired physicochemical properties such as water solubility, lower permeability in organs in which the enzyme is also present, etc 5 . Although several pharmacophores have been thoroughly investigated in the design of sugar-based CAIs 17 , there are still several functionalities that have been only poorly studied, such as the vanillin moiety. To the best of our knowledge, no alkyl or aryl enones of vanillin have been studied so far as CAIs 18 .

Materials and apparatus
All starting materials and reagents were purchased from commercial suppliers. Reactions were monitored by TLC, and TLC plates were visualised with short-wave UV fluorescence (λ = 254 nm) and a sulphuric acid stain (5% H2SO4 in methanol). Silica gel flash chromatography was performed using silica gel 60 Å (230-400 mesh). All melting points are uncorrected. 1H and 13C NMR spectra were recorded on a Bruker 600 (600 and 151 MHz, respectively). Chemical shifts were measured in ppm and coupling constants in Hz. High-resolution mass spectra were recorded using electrospray as the ionisation technique in positive ion or negative ion modes as stated. All MS analysis samples were prepared as solutions in methanol.

General procedure for the synthesis of compounds 1 and 2
A mixture of β-C-glucopyranosyl or β-C-galactopyranosyl ketone (1 mmol), 4-hydroxy-3-methoxybenzaldehyde (vanillin) (1.2 mmol), and L-proline (0.15 mmol)-TEA (0.3 mmol) in 3 ml of methanol was stirred at reflux for 48-96 h. The endpoint of the reaction was monitored by TLC. The solvent was evaporated under reduced pressure and the product was purified by column chromatography (eluant 8:2 DCM-MeOH) to afford material that was pure by 1H and 13C NMR spectroscopy. Yields 35-45%.

General procedure for the synthesis of compounds 3-7
A mixture of ketone (1 mmol) and 4-hydroxy-3-methoxybenzaldehyde (vanillin) (1 mmol) in 2 ml of anhydrous methanol was added to the catalyst (see Table 1). The reaction was stirred at the corresponding temperature until the starting material was consumed as evidenced by TLC, followed by dilution with ice-cold water, acidification with cold 1 M HCl, and extraction with DCM. The solvent was evaporated under reduced pressure. The product was purified by column chromatography (eluant 9:1 DCM-MeOH) to afford material that was pure by 1H and 13C NMR spectroscopy.

CA inhibition studies
An Applied Photophysics stopped-flow instrument was used for assaying the CA-catalysed CO2 hydration activity as reported by Khalifah 19 . Phenol red (at a concentration of 0.02 mM) was used as an indicator, working at the absorbance maximum of 557 nm, with 20 mM HEPES (pH 7.5) as a buffer and 20 mM Na2SO4 (to maintain constant ionic strength), following the initial rates of the CA-catalysed CO2 hydration reaction for a period of 10-100 s. The CO2 concentrations ranged from 1.7 to 17 mM for the determination of the kinetic parameters and inhibition constants. For each inhibitor, at least six traces of the initial 5-10% of the reaction were used for determining the initial velocity. The uncatalyzed rates were determined in the same manner and subtracted from the total observed rates. Stock solutions of inhibitor (0.1 mM) were prepared in distilled-deionized water, and dilutions down to 0.01 nM were made thereafter with distilled-deionized water. Inhibitor and enzyme solutions were preincubated together for 15 min at room temperature prior to assay, in order to allow for the formation of the E-I complex. The inhibition constants were obtained by non-linear least-squares methods using PRISM 3 and the Cheng-Prusoff equation as reported earlier 20 , and represent the mean from at least three different determinations.
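The Cheng-Prusoff conversion mentioned above can be sketched as follows. This is a minimal illustration assuming the usual competitive-inhibition form, K_i = IC50 / (1 + [S]/K_m). The function name and the numbers in the usage line are hypothetical and are not values from this work.

```python
def cheng_prusoff_ki(ic50, substrate_conc, km):
    """Convert an IC50 into an inhibition constant K_i (competitive inhibition).

    substrate_conc and km must share one concentration unit;
    the result is returned in the unit of ic50.
    """
    if ic50 < 0 or substrate_conc < 0 or km <= 0:
        raise ValueError("concentrations must be non-negative and K_m positive")
    return ic50 / (1.0 + substrate_conc / km)


# Hypothetical illustration values (not measured data)
ki = cheng_prusoff_ki(ic50=2.0, substrate_conc=10.0, km=10.0)
print(ki)  # 1.0
```

When the substrate concentration equals K_m, the IC50 is simply halved, which gives a quick sanity check on fitted values.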

Preparation of the molecular systems
The simulations were based on X-ray crystal structures of human carbonic anhydrases (CA I: PDB ID 2FW4, CA II: PDB ID 3KS3, CA IX: PDB ID 6FE2, CA XII: PDB ID 1JCZ). These structures were selected from the Protein Data Bank 21 , based on resolution, validation parameters, and missing residues.
The preparation of proteins was done with Chimera 22 . Water molecules and other ligands were removed. All Asp and Glu residues were considered to have a negative charge and all the Arg and Lys residues were considered to have a positive charge. Hydrogens were added following the hydrogen-bonding pattern. Ligand structures were built with Avogadro 23 and then optimised using the PM6 semiempirical method, implemented in OpenMOPAC 24 . The solvent effect was considered using the COSMO implicit model 25 , with a dielectric constant value of 78.4, corresponding to an aqueous medium. The geometry optimisation termination criterion was set to a gradient norm of 0.1 kcal/mol/Å.

Molecular docking
Molecular docking was carried out to find and score protein-ligand binding poses on carbonic anhydrase structures using Smina with Vina as the scoring function 26 . Protein and ligand PDBQT files were prepared with AutoDockTools software 27 . To unify the box set and simplify the analysis of results, chains A of the 2FW4, 6FE2, and 1JCZ structures were aligned to 3KS3 using PyMOL 28 . A docking box with size 27.75 Å × 27.00 Å × 28.50 Å was centred on the catalytic binding site. The ligands were docked using a flexible-ligand/rigid-receptor approach. The exhaustiveness value was increased to 20.
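A docking run with these settings could be assembled as in the sketch below. It only builds the smina argument list from the box size and exhaustiveness stated above. The file names and the box centre are hypothetical placeholders, since the text does not report the centre coordinates, and the helper name is introduced here for illustration.

```python
def build_smina_command(receptor, ligand, center, out,
                        size=(27.75, 27.00, 28.50), exhaustiveness=20):
    """Return the argument list for a smina run using the Vina scoring function."""
    cx, cy, cz = center
    sx, sy, sz = size
    return [
        "smina",
        "--receptor", receptor,
        "--ligand", ligand,
        "--center_x", str(cx), "--center_y", str(cy), "--center_z", str(cz),
        "--size_x", str(sx), "--size_y", str(sy), "--size_z", str(sz),
        "--exhaustiveness", str(exhaustiveness),
        "--scoring", "vina",
        "--out", out,
    ]


# Hypothetical file names and box centre (assumptions, not from the source)
cmd = build_smina_command("3KS3_A.pdbqt", "ligand_6.pdbqt",
                          center=(0.0, 0.0, 0.0), out="poses.sdf")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where smina is installed. Because the structures were aligned to 3KS3, the same centre would apply to all four isoforms.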
In order to have alternative poses, DOCK 6.8 29 was also used with grid score and flexible ligands and applying a subsequent minimisation. The spheres selected where the grid is calculated were the ones that fit inside the box used with Smina. Other parameters were set to their default values.
The post-docking analysis included visualisation of the ligand-receptor complexes with PyMOL to analyse the potential interactions with amino acid residues.

Binding energy estimated by semiempirical method
In order to obtain greater precision in the characterisation of the interaction of ligands with carbonic anhydrase isozymes, a calculation scheme based on the semi-empirical PM6 method 30 was adopted, employing OpenMOPAC 24 . A quantum method was chosen in view of the presence of Zn2+ in the carbonic anhydrase, which is difficult to treat with molecular mechanical force fields, in particular when estimating binding energies.
Due to the considerable size of the system, the MOZYME 31,32 approach was used, which has already been used to improve docking scoring functions or estimate binding energies [33][34][35] . The solvent effect was also considered using the COSMO implicit model.
A molecular optimisation of all the poses obtained by docking was performed, both on the complexes and on the ligands and the carbonic anhydrase separately. Then, binding enthalpies were calculated as the heat of formation differences, $\Delta H_{\mathrm{bind}} = \Delta H_f^{\mathrm{complex}} - (\Delta H_f^{\mathrm{CA}} + \Delta H_f^{\mathrm{ligand}})$.

Molecular dynamics
MD simulations were performed on poses resulting from molecular docking for CA I and CA II. The systems (CAs and ligands) have a positive net charge, so chloride anions were added as counterions with the Leap module to achieve electroneutrality. The neutralised ligand/CA complexes were immersed in a box of TIP3P waters which extended up to 15 Å from the solute. CAs were described using the Amber14SB force field 36 . The ligand was described using the Generalised Amber Force Field 37 with charges derived from RESP, which were calculated with the Antechamber module. Leap and Antechamber are included in the package AmberTools 20.0 38 . The Zn2+ cation, neighbouring residues, and a water molecule were modelled with MCPB.py 39 , also included in AmberTools 20.0. All MD simulations were run using the NAMD 2.13 software 40 . The van der Waals interaction cut-off distances were set at 12 Å, and long-range electrostatic forces were computed using the particle mesh Ewald summation method with a grid size set to 1.0 Å. The 1-4 contributions were multiplied by a factor of 0.83 to match the AMBER force field requirements. The system was subjected to 100,000 minimisation steps, heating from 0 to 300 K in 30 ps, and 10 ns of equilibration/production simulation. This trajectory length was chosen as a balance between the number of simulations to run and the sampling at equilibrium conditions, also considering that the CAs quickly reach equilibrium.
For all equilibration/production simulations, constant temperature (300 K) was maintained using Langevin dynamics with a damping coefficient of 5 ps⁻¹, while pressure was kept constant at 1 atm through the Nosé-Hoover Langevin piston method with a decay period of 200 fs and a damping time constant of 100 fs. A time step of 1 fs was used along with molecular mechanics. Bonds involving hydrogen atoms of waters were constrained using the SHAKE algorithm. RMSD values were plotted to determine the convergence and stability of the simulations (see Electronic Supplementary Material, Figures S2 and S3).

Binding energy estimated by MM/GBSA
Binding free energies of ligands with CAs were computed using the MM/GBSA method, where the binding free energy is calculated as the difference between the bound and unbound states of protein and ligand 41,42 .
The solvation free energy was calculated using the generalised Born (GB) model 43 implemented in the MMPBSA.py module 44 , with igb = 5 as the selected model. The hydrophobic contribution was determined using the solvent-accessible surface area. The CA-ligand binding free energies were calculated using a single trajectory (for ligand, protein, and complex) based on 500 snapshots taken from the last 5 ns portion of the MD simulation trajectories. Entropies were estimated using the quasi-harmonic approximation.
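For reference, the single-trajectory MM/GBSA estimate described above is commonly written in the form below. This is the standard textbook expression, given as a sketch. The symbols are notation introduced here rather than taken from the original text, and angle brackets denote averages over the selected snapshots.

```latex
\begin{align}
\Delta G_{\mathrm{bind}} &= \langle G_{\mathrm{complex}} \rangle
  - \langle G_{\mathrm{protein}} \rangle - \langle G_{\mathrm{ligand}} \rangle, \\
G &= E_{\mathrm{MM}} + G_{\mathrm{GB}} + G_{\mathrm{SA}} - T S
\end{align}
```

Here $E_{\mathrm{MM}}$ is the gas-phase molecular mechanics energy, $G_{\mathrm{GB}}$ the polar solvation term from the generalised Born model, $G_{\mathrm{SA}}$ the non-polar term derived from the solvent-accessible surface area, and $TS$ the entropic contribution, estimated here by the quasi-harmonic approximation.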
To obtain the detailed representation of interactions, free energy decomposition analysis was employed to decompose the total binding free energies into ligand-amino acid pairs. These calculations were performed using a pairwise energy decomposition scheme (idecomp option 3) also with the MMPBSA.py module.

Chemistry and CA inhibition
The enones 1-7 (Figure 1) have been prepared by aldol condensation of vanillin with the appropriate alkyl, aryl, or glycosidic methyl ketones (Scheme 1). Knoevenagel reaction of D-glucose or D-galactose with pentane-2,4-dione in the presence of aqueous sodium bicarbonate at 90 °C afforded the β-D-glycosyl-propan-2-ones in 50% and 54% yields, respectively. It should be noted that higher yields were described in the literature 45 , but the products had not been purified as in our report.
C-glycosides 1-2 have been prepared by reaction of β-C-glucosyl or β-C-galactosyl ketone with vanillin in the presence of L-proline/Et3N with moderate yields (Table 1). Next, we studied the aldol condensation of aliphatic or aromatic ketones and vanillin with different catalysts. The best reaction conditions found in the synthesis of the α,β-unsaturated ketones are shown in Table 1. Enones 1-7 were successfully purified by flash chromatography. The 1H NMR, 13C NMR, 2D COSY, and HSQC spectra were in full agreement with their structures (see Supplementary information). The large coupling constant (J ≈ 16 Hz) between the two olefinic protons was consistent with the E configuration of the double bond. In enones 1-2, the large coupling constant (J = 9.4 Hz) between H-1′ and H-2′ indicated a diaxial relationship and thus confirmed the β-configuration.
The inhibitory activity of compounds 1-7 against cytosolic isoforms hCA I and II as well as the membrane-associated isoforms hCA IX and XII was assayed by using a stopped-flow assay method and the results are shown in Table 2. A number of structure-activity relationships were identified in this study and are summarised as follows.

Off-target CA isozymes
i. C-glycosides 1 and 2 were micromolar inhibitors of hCA I. A similar trend was found for the aryl derivatives 6 and 7 and the methyl one 3. The alkyl derivatives 4 and 5 were very poor inhibitors of hCA I. Comparing the behaviour of these compounds towards hCA I, it can be concluded that attachment of a bulky scaffold, such as the carbohydrate one, to the enone functionality does not decrease the inhibitory potency of these compounds against hCA I.
ii. Vanillin derivatives showed an interesting inhibition profile against hCA II. It should be noted that the C-glycosidic derivatives 1-2 and the alkyl enones 3-5 were very poor inhibitors of hCA II. On the other hand, the aromatic derivatives 6-7 were inhibitors acting in the micromolar range. However, because hCA II is a ubiquitous, housekeeping isoform, activity against it may not be a valuable property in compounds intended to target other isoforms (such as hCA IX and XII). It is interesting to note that compounds 1-5 showed poor inhibition against hCA II while retaining effective inhibition against hCA IX and XII.

Cancer-associated CA isozymes
i. The tumour-associated target isoform hCA IX was inhibited in the submicromolar range by all enones except compound 7, which was a poor inhibitor of this isozyme. The best inhibitor was the C-galactoside 2, which was also a weak inhibitor of hCA II.
ii. The second tumour-associated isoform, hCA XII, was the most inhibited isoform by all vanillin derivatives. The phenyl enone 6 was the most active compound in the series.
iii. Selectivity for inhibiting the tumour-associated isoforms (hCA IX and XII) over the widespread cytosolic forms (hCA I and II) is a key issue when designing CAIs. As can be seen in Table 2, several compounds showed better activity profiles against hCA IX and XII than against hCA I and II, which is highly desirable when only the tumour-associated isoforms are to be targeted. Enones 1-5, which showed very good inhibition of isoforms IX and XII, were also highly selective. The most effective hCA IX inhibitor, 2, showed an excellent selectivity ratio over hCA II. Only aryl enones 6-7 showed almost no selectivity and are not useful in the design of selective inhibitors.
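The selectivity ratio discussed above can be made explicit with a short sketch. It assumes the common definition K_i(off-target) / K_i(target), so values above 1 favour the tumour-associated isoform. The K_i values below are placeholders for illustration only and are not the data of Table 2.

```python
def selectivity_ratio(ki_off_target, ki_target):
    """Ratio K_i(off-target) / K_i(target); values > 1 favour the target isoform."""
    if ki_target <= 0:
        raise ValueError("K_i of the target isoform must be positive")
    return ki_off_target / ki_target


# Placeholder K_i values in µM, for illustration only (not the values in Table 2)
ki = {"hCA I": 5.0, "hCA II": 100.0, "hCA IX": 0.5, "hCA XII": 0.25}

for target in ("hCA IX", "hCA XII"):
    for off_target in ("hCA I", "hCA II"):
        ratio = selectivity_ratio(ki[off_target], ki[target])
        print(f"{off_target}/{target}: {ratio:g}")
```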

Molecular docking
It was observed, from the poses generated using molecular docking, that the various ligands presented different binding patterns for the analysed CA isozymes. Thus, in addition to ligand poses within the catalytic active site, the location of some poses indicates that the inhibitors might bind at an adjacent pocket, at the entrance of the active site (Figure 2). This site was previously described for hCA II as an alternative site for inhibitor interaction, and only one carboxylic acid derivative was observed to bind in it by means of X-ray crystallography 51 . The hCA I active site is lined, among others, by residues Asp1, Gly3, Tyr4, Asp5, Asn8, His61, Lys167, Ser228, and Met238. In the case of hCA II, the site comprises, in addition to the previously reported residues (Gly6, Tyr7, Gly8, Asn11, His64, Phe231, Asn232, and Glu239), Trp5, Gly63, and Lys170 51 .
[Table 2 footnotes: a ... 46-50 ; b Errors in the range of 5-10% of the reported value, from three different determinations.]
The hCA XII pocket is characterised by the presence of residues Lys1, Lys166, and Arg238, which are bulkier than the equivalent residues in hCA I and hCA II. In the case of hCA IX (PDB Id 6FE2), this pocket is absent and some docking poses are located close to this, on the rear face of the structure, as shown in Figure 2.

Calculated binding enthalpies
In order to obtain more precise energies of the interaction with CAs, semi-empirical calculations were performed on the obtained docking poses. The binding enthalpies calculated with PM6 and the MOZYME approach are listed in Table S1. The energies were classified with a Python script according to the location of ligands, at the catalytic site or in the adjacent pocket, reporting the most stable energy value for each category when it was observed. In most cases, it was found that the poses in the alternative pocket have a lower (more favourable) binding enthalpy with respect to the location within the catalytic site. The exception to this observation is the CA IX isoform, where poses located at the catalytic site were more favourable than the out of the active site binding 51 .
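The bookkeeping described above, a binding enthalpy from heats of formation followed by selection of the most stable value per binding location, might look like the sketch below. The authors' script is not shown in the text, so this is an assumed reconstruction. The site labels are taken as given inputs, and all heats of formation are placeholder numbers, not results from Table S1.

```python
def binding_enthalpy(hf_complex, hf_protein, hf_ligand):
    """Binding enthalpy as a difference of heats of formation (kcal/mol)."""
    return hf_complex - (hf_protein + hf_ligand)


def most_stable_per_site(poses):
    """Map each site label to the lowest (most favourable) binding enthalpy.

    `poses` is an iterable of (site_label, binding_enthalpy) pairs.
    """
    best = {}
    for site, dh in poses:
        if site not in best or dh < best[site]:
            best[site] = dh
    return best


# Placeholder heats of formation in kcal/mol (illustrative, not from Table S1)
poses = [
    ("catalytic site", binding_enthalpy(-1050.0, -1000.0, -20.0)),
    ("catalytic site", binding_enthalpy(-1045.0, -1000.0, -20.0)),
    ("adjacent pocket", binding_enthalpy(-1060.0, -1000.0, -20.0)),
]
print(most_stable_per_site(poses))
```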

Molecular dynamics and binding free energies
Although the ligand structures and nearby residues are optimised when applying the semiempirical methodology, this represents a "snapshot" of the enzyme-inhibitor complex. To analyse the "film" of the system, molecular dynamics simulations and subsequent estimation of the binding energy on the hCA I and hCA II isozymes were thereafter performed. The calculated binding free energies, including the entropy estimated by the quasi-harmonic approximation, are shown in Table 3. It was observed in most cases that the interaction in the alternative pocket was more stable than poses placing the inhibitor inside the catalytic cavity with the methoxyphenol moiety close to the Zn2+ cation, this situation in turn being more favourable than the poses localising the inhibitor inside the catalytic cavity with the methoxyphenol moiety facing outward.
A correlation was observed between a more favourable calculated binding energy and a better inhibitory capacity in the experimental assays (compounds 3, 6, and 7 with isoform hCA I, and compounds 6 and 7 with isoform hCA II). In the case of compounds 1 and 2, the correlation was not good, probably due to the difficulty in modelling the carbohydrate fragment, which is structurally very flexible. It should be noted that entropy had to be included in the calculations despite the increased computational cost, because of which the entropic term is often neglected. From preliminary tests that were performed (data not shown), it was found that the relative order of stabilities between the possible binding sites was altered when only the enthalpic term was considered.
Figure 3. Per-residue decomposition of binding energy (only the enthalpy term considered) of ligand-bound hCA I and II, located in the pocket adjacent to the catalytic site. These calculations were performed with the MMPBSA.py module and plotted using Python's Matplotlib library. The residues shown were those that evidenced at least one interaction of < 0.3 kcal mol⁻¹ with a ligand.
In order to analyse which residues could interact with the compounds, a per-residue decomposition of the MM-GBSA binding energy was performed from the molecular dynamics trajectories. Figure 3 shows this analysis for ligands located in the alternative pocket of CA I and CA II. In the case of the CA I pocket, there is a pronounced interaction with the residues Tyr7, Asp8, Gly63, His64, Lys170, Met241, and His243, while the interaction in the CA II pocket occurs mainly with the residues Gly6, Tyr7, Gly63, and His64. It should be noted that 7, in contrast to the other compounds, also interacts with the residues Leu239, Met240, Val241, Asp242, and Asn243. Figure 4 shows the three-dimensional structures of compound 7 in the alternative CA I and CA II pocket.
A possible explanation that we initially put forward is that this alternative pocket competes with binding within the catalytic site. Thus, those compounds that fit within this alternative pocket do not block the catalytic site and therefore do not inhibit CA by the canonical inhibition mechanism. Along these lines, the size of this pocket is reduced in the order CA II, CA I, CA XII, and it is absent in CA IX. Precisely in this last isoform, all compounds except 7 produced inhibition. However, computationally estimated binding energies do not support this hypothesis. From these results, the experimentally observed inhibition differences could be correlated with the interaction of the compounds with this alternative binding pocket at the entrance of the active site. Further studies, both computational and experimental, are warranted to fully reveal the mode of interaction and inhibition of these compounds with these CA isoforms.

Conclusions
In conclusion, a small series of enones 1-7 has been prepared in moderate yields by aldol condensation of vanillin with methyl ketones, including C-glycosylated ones. The compounds have been investigated as inhibitors of four CA isozymes, comprising the cytosolic, ubiquitous isozymes hCA I and II as well as the transmembrane, tumour-associated isoforms hCA IX and XII, which are validated antitumor targets. In this study, C-glycosyl and alkyl enones derived from vanillin have been identified as selective inhibitors of hCA IX and XII. Molecular docking studies, quantum semiempirical calculations, and molecular dynamics simulations have been carried out in order to understand the inhibition profile of compounds 1-7 with the different CA isozymes. A good correlation was found between the calculated binding free energies and the experimental inhibitory activity. An alternative pocket close to the active site entrance, already discovered for the binding of a carboxylic acid derivative several years ago 51 , is proposed as the site of interaction with these CAIs and could be useful in the design of potent and selective inhibitors. Further studies, both computational and experimental, are needed to fully validate the mode of interaction and inhibition of these compounds with these as well as other CA isoforms of pharmacological interest.

Disclosure statement
CT Supuran is Editor-in-Chief of Journal of Enzyme Inhibition and Medicinal Chemistry and he was not involved in the assessment, peer review, or decision-making process of this paper. The authors have no relevant affiliations or financial involvement with any organisation or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.