Discovery of new 1H-pyrazolo[3,4-d]pyrimidine derivatives as anticancer agents targeting EGFRWT and EGFRT790M

Abstract New 1H-pyrazolo[3,4-d]pyrimidine derivatives were designed and synthesised to act as epidermal growth factor receptor inhibitors (EGFRIs). The synthesised derivatives were assessed for their in vitro anti-proliferative activities against A549 and HCT-116 cancer cells. Compounds 8, 10, 12a, and 12b showed potent anti-proliferative activities. Compound 12b was the most promising member with IC50 values of 8.21 and 19.56 µM against A549 and HCT-116, respectively. Compounds 8, 10, 12a, and 12b were evaluated for their kinase inhibitory activities against wild EGFR (EGFRWT). Compound 12b was the most potent member showing an IC50 value of 0.016 µM. In addition, compound 12b showed noticeable activity against mutant EGFR (EGFRT790M) (IC50 = 0.236 µM). Flow cytometric analyses revealed that compound 12b is a good apoptotic inducer and can arrest the cell cycle at S and G2/M phases. Furthermore, it produced an 8.8-fold increase in BAX/Bcl-2 ratio. Molecular docking studies were carried out against EGFRWT and EGFRT790M.


Introduction
According to the World Health Organisation International Agency for Research on Cancer (IARC), GLOBOCAN estimates confirm a dramatic increase in cancer incidence and mortality, with about 19.3 million new cancer cases in 2020 1 . In 2022, 1,918,030 new cancer cases and 609,360 cancer deaths are projected to occur in the United States, including approximately 350 deaths per day from lung cancer, the leading cause of cancer death 2 . Cancer is also a serious health issue in Africa, as almost half of cancer cases occur in developing countries 3 . Consequently, various anticancer drug innovations have been recorded; despite that, the real cause of cancer remains unclear. Cancer is mainly characterised by uncontrolled cell proliferation and, ultimately, metastasis 4,5 . Cancer treatment regimens have been greatly modified by growing knowledge of molecular and tumour biology 6 . Notably, current anticancer approaches have a low selectivity margin 7 . It is therefore important to develop new cancer treatment strategies that provide a high selectivity margin.
Regarding molecular targeted therapy against cancer, receptor tyrosine kinases (RTKs) play a vital role in cellular programs, e.g. proliferation, migration, apoptosis, survival, and differentiation 8 . RTKs phosphorylate tyrosine residues by transferring the gamma phosphate group of ATP to them, so that normal physiological cellular functions are maintained [9][10][11] . The general architecture of RTKs includes an extracellular ligand-binding region, a transmembrane helix, and a cytoplasmic region that contains the protein tyrosine kinase domain besides the carboxy (C-)terminal and juxtamembrane regulatory regions. Abnormalities can alter the regulation of RTKs, which become mutated or aberrantly activated, leading to different pathological conditions such as cancer 8 .
Epidermal growth factor receptor (EGFR) is one of the most important RTKs and plays a key role in cell growth 12,13 . Many types of tumours show a high level of EGFR overexpression, such as breast cancer 14 , lung cancer (NSCLC) 15 , and hepatocellular carcinoma (HCC) 16 . EGFR was found to act as a strong prognostic indicator in head and neck, ovarian, cervical, bladder, and oesophageal cancers. In these cancers, increased EGFR expression was associated with reduced recurrence-free or overall survival rates in 70% of studies 16 . EGFRs are therefore considered attractive targets for developing novel anticancer drugs [17][18][19][20][21] .
There are many FDA-approved EGFR-tyrosine kinase inhibitors (EGFR-TKIs). First-generation inhibitors such as erlotinib I 22 are effective against wild-type EGFR (EGFRWT). However, this class has many side effects 23,24 in addition to the acquired drug resistance caused by EGFR-TK mutation 25 . The second generation was developed to overcome the resistance induced by EGFRT790M. Neratinib II 26 is one of the best-known drugs of this generation. Unfortunately, this class of drugs has a low maximal-tolerated dose, resulting in inadequate clinical efficacy 27,28 . Third-generation EGFR-TKIs such as olmutinib III and osimertinib IV 29 showed enhanced activity against mutant EGFR (EGFRT790M). However, toxic epidermal necrolysis has been associated with these drugs 30 . Hence, there is an urgent need to optimise the approved drugs to reach efficient and less harmful candidates.
EGFR-TKIs must possess some pharmacophoric features to bind efficiently the ATP binding site and hence exert their inhibitory activities. The first pharmacophore is the flat heteroaromatic system which can occupy the adenine binding pocket of the ATP binding site 31 . The second feature is the terminal hydrophobic head which can occupy the hydrophobic region I of the ATP binding site 32 . The third feature is the spacer moiety which is mainly an amino derivative to form a hydrogen bond in the linker region of the ATP-binding site 33 . The fourth feature is the hydrophobic tail which can occupy the hydrophobic region II of the ATP-binding site 34,35 . The fifth feature is the ribose binding moiety which can occupy the ribose binding pocket. Till now, there are limitations in research that target the ribose binding pocket 36 (Figure 1).
1H-Pyrazolo [3,4-d]pyrimidine moiety is an important scaffold in the field of medicinal chemistry as it is a building block in many anticancer agents 36 including EGFR-TKIs 30 . Compound V was approved as an ATP-competitive inhibitor showing EGFR inhibitory effect at a nanomolar concentration 37 . Compound VI is another example of 1H-pyrazolo [3,4-d]pyrimidine derivative with anti-EGFR activity 38 . Furthermore, our team synthesised 1H-pyrazolo [3,4-d]pyrimidine derivative (compound VII) as EGFR inhibitor. This compound showed good anti-proliferative activity with high inhibitory effect against wild and mutant EGFR 39 (Figure 1). Due to the high similarity of this scaffold with the adenine moiety of ATP, it was used as a backbone for the design and synthesis of ATP competitive inhibitors, especially the compounds that target RTKs 38,40 .
Based on the previous reports, including the high importance of EGFR as an anticancer target, the resistance that has emerged against the FDA-approved anticancer drugs, and the attractiveness of the 1H-pyrazolo[3,4-d]pyrimidine moiety, it was decided to design and synthesise new 1H-pyrazolo[3,4-d]pyrimidine derivatives that may have good inhibitory activities against EGFR. The synthesised compounds were designed to have the pharmacophoric features of EGFR inhibitors.

Rationale of molecular design
In this work, new 1H-pyrazolo[3,4-d]pyrimidine derivatives were designed and synthesised to have the main pharmacophoric features of EGFR-TKIs. In these compounds, many structural modifications of the reported EGFR-TKIs were carried out. The modifications were made at five positions (Figure 2).
First, the 1H-pyrazolo[3,4-d]pyrimidine moiety was used as a heteroaromatic system to occupy the adenine binding region 42,43 . Second, different substituted phenyl or aliphatic structures were utilised as a hydrophobic head to occupy the hydrophobic region I of the ATP-binding site. The third modification was performed on the linker moiety. We used different linkers: an imino group (compounds 7a,b, 8, and 9), a hydrazone derivative (compounds 11a,b and 12a,b), and a thiosemicarbazide moiety (compounds 13a,b). For the hydrophobic tail, we used a phenyl ring to occupy the hydrophobic region II of the ATP-binding site. To occupy the ribose-binding pocket, we used an aniline structure. The diversity of modifications provided useful insight into the structure-activity relationship of the synthesised compounds as antiproliferative agents targeting EGFR. All modifications are illustrated in Figure 3.
Compound 6 was heated with hydrazine hydrate to afford 4-hydrazinyl-N,1-diphenyl-1H-pyrazolo[3,4-d]pyrimidin-6-amine 10. The IR spectrum of 10 demonstrated stretching bands at 3444, 3352, and 3190 cm⁻¹ corresponding to NH2 and NH, respectively. Moreover, the 1H NMR spectrum of this compound showed two exchangeable signals at δ 4.73 and 9.89 ppm corresponding to NH2 and NH, respectively. Refluxing of 10 with commercially available aromatic aldehydes or acetophenones in the presence of glacial acetic acid afforded the target compounds 11a,b, and 12a-c. The 1H NMR spectra of the hydrazones revealed D2O-exchangeable singlet signals of the hydrazinyl NH in the range δ 11.76-12.24 ppm and an increase in the integration of the aromatic protons, indicating the presence of an additional aromatic ring.

In vitro antiproliferative activities
The cytotoxic activities of the synthesised compounds were assessed against two human cancer cell lines, A549 (lung) and HCT-116 (colon), using an MTT assay. These two cell lines were selected because they have been reported to overexpress EGFR 48,49 .
As presented in Table 1, the tested compounds showed a wide range of anti-proliferative activities, from potent through moderate to weak cytotoxic effects. Comparing to erlotinib (IC50 27.03 µM, respectively. On the other hand, compounds 9, 11a, and 11b showed weak activities against the two tested cell lines while compounds 7a, 7b, 12c, and 12a exhibited weak activity against A549 cells.

Structure-activity relationship
Examination of the cytotoxicity screening results (Table 1) showed that the introduction of aliphatic amines such as ethyl (compound 7a), propyl (compound 7b), and cyclohexyl (compound 9) at the 4-position of the pyrazolo[3,4-d]pyrimidine scaffold was not beneficial for cytotoxic activity, particularly against A549 cells. On the contrary, the introduction of an aniline moiety at the same position afforded compound 8 with enhanced anticancer activity. Additionally, a remarkable decline in cytotoxic activity against the two tested cancer cell lines was detected upon the condensation of hydrazine derivative 10 with aromatic aldehydes in compounds 11a,b. Conversely, the condensation of hydrazine 10 with acetophenone and p-chloroacetophenone furnished compounds 12a and 12b, respectively, with better anticancer activity against A549 cells compared to their parent hydrazine derivative 10. Compound 12b, bearing a p-chlorophenyl moiety, stood out as the most potent derivative among the tested compounds, presenting excellent cytotoxic activity, comparable and equipotent to that of erlotinib against A549 and HCT-116 cells, respectively. On the other hand, the p-tolyl derivative 12c revealed a noticeable decrease in cytotoxic activity relative to its parent hydrazine 10 as well as its p-chlorophenyl analog 12b. Finally, in agreement with the poor cytotoxic activity elicited by aliphatic amine derivatives 7a,b, the addition of aliphatic isothiocyanates bearing ethyl (compound 13a) and propyl (compound 13b) groups to hydrazine derivative 10 was not beneficial for anticancer activity (Figure 4).

EGFRWT kinase inhibitory assay
The most promising compounds in the cytotoxicity assay (8, 10, 12a, and 12b) were further evaluated for their ability to inhibit the kinase activity of EGFRWT, employing erlotinib as a positive control. As shown in Table 2, the screened compounds significantly inhibited EGFRWT at low IC50 values ranging from 0.016 to 0.026 µM, relative to erlotinib (IC50 = 0.006 µM). Compound 12b was the most potent member, showing an IC50 value of 0.016 µM.
Consistent with the results obtained from the cytotoxicity assay, it was observed that the introduction of a chlorine atom at the 4-position of the phenyl ring (compound 12b, EGFRWT IC50 = 0.016 µM) resulted in a noticeable enhancement of EGFRWT inhibitory activity compared to its unsubstituted analog 12a (IC50 = 0.021 µM).

EGFRT790M kinase inhibitory assay
To assess the inhibitory activity of the synthesised compounds against the mutant EGFR (EGFRT790M), the most promising member 12b was further screened against EGFRT790M utilising erlotinib as a positive control. Noticeably, it was found that compound 12b (IC50 = 0.236 µM) was 2.4-fold more potent than erlotinib (IC50 = 0.563 µM) against EGFRT790M (Table 2).

Cell cycle analysis
To determine the phase at which the synthesised compounds interfere with cell growth, cell cycle analysis was carried out for the most active member 12b in A549 cells. The tested cells were treated with 12b at a concentration of 8.21 µM, equal to its IC50, for 48 h. The results revealed a remarkable interference with the normal cell cycle distribution. The treated cells showed about a 2-fold decrease in the percentage of cells in the G1 phase (from 53.87 to 28.04%) compared to untreated cells. Moreover, compound 12b induced a 1.5-fold increase in the percentage of cells in the S phase (from 28.70 to 42.39%) in addition to a 1.7-fold increase in % G2/M (from 16.56 to 28.55%). From these findings, it can be concluded that compound 12b can arrest the cell cycle at S and G2/M phases (Table 3 and Figure 5).
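The fold changes quoted above follow directly from the phase percentages. A minimal sketch of that arithmetic is given here; the variable and function names are illustrative and not taken from the original analysis.

```python
# Phase percentages quoted in the text for A549 cells (untreated vs. 12b-treated).
control = {"G1": 53.87, "S": 28.70, "G2/M": 16.56}
treated = {"G1": 28.04, "S": 42.39, "G2/M": 28.55}


def phase_ratios(before: dict, after: dict) -> dict:
    """Return after/before for every cell-cycle phase present in `before`."""
    return {phase: after[phase] / before[phase] for phase in before}


ratios = phase_ratios(control, treated)

for phase, ratio in ratios.items():
    # A ratio below 1 is reported as a fold decrease (inverse of the ratio).
    direction = "increase" if ratio >= 1 else "decrease"
    fold = ratio if ratio >= 1 else 1 / ratio
    print(f"{phase}: {fold:.2f}-fold {direction}")
```

Rounded to one decimal place, these ratios agree with the approximately 2-fold, 1.5-fold, and 1.7-fold changes stated in the text.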

Annexin V-FITC apoptosis assay
As displayed in Table 4 and Figure 6, the treatment of A549 cells with compound 12b for 48 h resulted in a 3-fold decrease in the ratio of viable cells (lower left quadrant) from 93.43 to 31.57%. In addition, it produced an 11-fold increase in the early apoptosis ratio (lower right quadrant) from 6.03 to 67.69% and a 1.5-fold increase in the late apoptosis ratio (upper right quadrant) from 0.43 to 0.64% compared to untreated A549 cells. These results indicate that compound 12b is a good apoptotic inducer and that apoptosis is most probably the main mechanism by which the compound causes cell death.

BAX/Bcl-2 ratio
The effect of the most active compound 12b on the expression levels of the apoptotic (BAX) and anti-apoptotic (Bcl-2) genes was evaluated. As shown in Table 5 and Figure 7, the treatment of A549 cells with compound 12b for 48 h resulted in a 3.3-fold increase in the level of BAX gene expression in addition to a 2.5-fold decrease in Bcl-2 gene expression. As a result, an 8.8-fold increase in the BAX/Bcl-2 ratio was observed for 12b-treated A549 cells compared to untreated A549 cells.
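As a rough check on these figures, the sketch below derives the Bcl-2 fold decrease and a BAX/Bcl-2 ratio from the mean relative expression values reported for 12b-treated cells (3.33 and 0.40, control set to 1). Dividing the two means gives about 8.3 rather than the reported 8.80; one plausible explanation, which is an assumption here and not stated in the text, is that the published ratio was averaged over individual replicates.

```python
# Mean relative expression in 12b-treated A549 cells (control = 1.00).
BAX_FOLD = 3.33
BCL2_FOLD = 0.40

# A relative expression below 1 corresponds to a fold decrease of 1/x.
bcl2_fold_decrease = 1 / BCL2_FOLD

# Ratio of the two mean fold changes; this is NOT necessarily the same as
# the mean of per-replicate ratios (assumed to underlie the reported 8.80).
ratio_of_means = BAX_FOLD / BCL2_FOLD

print(f"Bcl-2 fold decrease: {bcl2_fold_decrease:.1f}")
print(f"BAX/Bcl-2 (ratio of means): {ratio_of_means:.2f}")
```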

In vitro cytotoxicity against normal cell and selectivity index
The in vitro cytotoxic effect of the most active compound 12b against a normal cell line (WI-38) was assessed (Table 6). The results revealed that compound 12b has low toxicity against the tested cells, with an IC50 value of 39.15 µM. Erlotinib as a reference drug showed an IC50 value of 33.75 µM against the WI-38 cell line.
The selectivity index (SI) values of compound 12b against tumour cells are shown in Figure 8. This compound showed SI values of 4.77 and 2.00 against A549 and HCT-116, respectively. These indices are comparable to those of erlotinib (4.99 and 1.76 against A549 and HCT-116, respectively).
The results revealed that compound 12b has low toxicity against WI-38 cell line compared to erlotinib. In addition, it showed a high selectivity against the tumour cell lines.
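The reported selectivity indices can be reproduced from the IC50 values given in the text. The sketch below assumes the common definition SI = IC50 (normal cells) / IC50 (tumour cells); the text does not state the formula explicitly, but this definition matches the reported numbers. Function and variable names are illustrative.

```python
def selectivity_index(ic50_normal_um: float, ic50_tumour_um: float) -> float:
    """SI = IC50 against normal cells / IC50 against tumour cells (both in µM)."""
    if ic50_tumour_um <= 0:
        raise ValueError("tumour IC50 must be positive")
    return ic50_normal_um / ic50_tumour_um


# IC50 values of compound 12b quoted in the text (µM).
IC50_WI38 = 39.15
IC50_TUMOUR = {"A549": 8.21, "HCT-116": 19.56}

si_12b = {
    cell: round(selectivity_index(IC50_WI38, ic50), 2)
    for cell, ic50 in IC50_TUMOUR.items()
}
print(si_12b)
```

An SI above 1 means the compound is more toxic to the tumour line than to WI-38 cells; higher values indicate a wider margin.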

Docking studies
To investigate the manner of binding with the hypothesised target, docking studies were performed for the synthesised compounds against the active site of the wild type (EGFRWT, PDB: 4HJO) 50 and the mutant type (EGFRT790M, PDB: 3W2O) 51 . The co-crystallised ligands erlotinib and TAK-285 of EGFRWT and EGFRT790M, respectively, were used as reference compounds. The docked compounds showed good binding affinities towards EGFRWT, with binding free energies ranging from −19.63 to −23.67 kcal/mol. For the docking against the mutant type, the synthesised compounds showed binding energies ranging from −16.09 to −21.66 kcal/mol (Table 7).
In these investigations, MOE 2019 software was used. The output figures were further visualised using Discovery Studio software 3.0. The docking protocol was first validated by redocking each protein's co-crystallised ligand (erlotinib and TAK-285) against the active sites of EGFRWT and EGFRT790M, respectively. The RMSD values between the re-docked conformers and the co-crystallised conformers were 1.18 and 1.66 Å for erlotinib and TAK-285, respectively. As reported, an RMSD value of less than 2 Å suggests that the docking procedure is valid. The obtained RMSD values therefore confirmed the validity of the docking protocol (Figure 9).
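The redocking criterion can be illustrated with a short sketch of an in-place RMSD between two poses of the same ligand. It assumes the atoms are listed in matching order and that no superposition is applied, as is usual when comparing a redocked pose with the crystal pose in the same binding site. The coordinates below are hypothetical toy values, not data from this study, and the function is not MOE's implementation.

```python
import math


def pose_rmsd(ref, docked):
    """RMSD (in the input units) between two poses with matched atom order."""
    if not ref or len(ref) != len(docked):
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (a - b) ** 2
        for atom_ref, atom_docked in zip(ref, docked)
        for a, b in zip(atom_ref, atom_docked)
    )
    return math.sqrt(squared / len(ref))


# Hypothetical heavy-atom coordinates in Å (illustration only).
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
# Toy "redocked" pose: every atom displaced by 1 Å along x.
redocked = [(x + 1.0, y, z) for x, y, z in crystal]

rmsd = pose_rmsd(crystal, redocked)
is_valid = rmsd < 2.0  # the 2 Å acceptance threshold cited in the text
print(f"RMSD = {rmsd:.2f} Å, within threshold: {is_valid}")
```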
The co-crystallised ligand (erlotinib) of EGFRWT produced a binding score of −22.59 kcal/mol. The binding mode of erlotinib against EGFRWT is shown in Figure 10. The heterocyclic quinazoline moiety was oriented into the adenine pocket, forming a hydrogen bond with Met769. In addition, it formed four hydrophobic interactions with Leu694, Ala719, and Leu820. The ethynylphenyl moiety was oriented into the hydrophobic pocket I, forming three hydrophobic interactions with Ala719, Lys721, and Val702. The two 2-methoxyethoxy groups occupied the hydrophobic region II, forming a hydrogen bond with Cys773 in close contact with Gly772 and Leu694.
Taking compound 12b as a representative example, it showed a binding pattern similar to that of erlotinib. Compound 12b exhibited a binding score of −23.07 kcal/mol. The 1H-pyrazolo[3,4-d]pyrimidin-6-amine moiety occupied the adenine pocket to form two hydrogen bonds with Met769 and Lys721. In addition, it formed four hydrophobic interactions with Val702, Ala719, and Leu820. The p-chlorophenyl moiety occupied the hydrophobic pocket I, forming four hydrophobic interactions with Lys721, Leu764, and Leu834. In addition, it formed an electrostatic attraction with Asp831. The hydrazinyl linker formed one hydrogen bond with Thr830. The phenyl ring at the 1-position of 1H-pyrazolo[3,4-d]pyrimidine occupied the hydrophobic region II, forming two hydrophobic interactions with Val702 and Cys773 (Figure 11).
Docking of the synthesised compounds against the mutant EGFR (EGFRT790M) gave a good insight into their binding pattern. The synthesised compounds showed binding scores ranging from −16.09 to −21.66 kcal/mol (Table 7).
Compound 12b, as a representative example, showed a binding mode like that of TAK-285 against the mutant EGFR, with a binding score of −20.59 kcal/mol. The 1H-pyrazolo[3,4-d]pyrimidin-6-amine moiety occupied the adenine pocket to form one hydrogen bond with Met793. Also, it formed five hydrophobic interactions with Leu718, Leu844, Val726, and Ala743. The p-chlorophenyl moiety occupied the hydrophobic pocket I, forming seven hydrophobic interactions with Ala743, Met790, Leu788, Lys745, and Ile759. The phenyl ring at the 1-position of 1H-pyrazolo[3,4-d]pyrimidine occupied the hydrophobic region II, forming two hydrophobic interactions with Gly796 and Leu718 (Figure 13).

In silico ADMET analysis
Discovery Studio 4.0 was used to predict ADMET descriptors for all compounds. The predicted descriptors are listed in Table 8. Blood-brain barrier (BBB) penetration studies predicted that compounds 8, 11a, 11b, 12a, 12b, 12c, 13a, and 13b have very low BBB penetration levels. Accordingly, such compounds are expected to be safe for the CNS. All the tested compounds showed low to very low levels of ADMET aqueous solubility and good to moderate intestinal absorption levels. Additionally, all compounds were predicted to be cytochrome P450 2D6 non-inhibitors. Consequently, liver dysfunction is not expected as a side effect upon administration of these compounds. Due to the high planarity of the synthesised compounds, all of them are expected to bind plasma protein over 90% (Table 8 & Supplementary data).

In silico toxicity studies
In this work, six toxicity parameters were estimated computationally using the toxicity models built into Discovery Studio software. The results of the in silico toxicity studies are presented in Table 9. In general, most of the synthesised compounds showed decreased toxicity potential. In detail, all compounds were predicted to be non-mutagenic and non-toxic against the Ames mutagenicity and developmental toxicity potential models. In addition, all compounds were anticipated to be non-irritant and mild irritant against the Skin Irritancy and Ocular Irritancy models, respectively. Compounds 7a, 7b, 10, 13a, and 13b showed carcinogenic potency TD50 values ranging from 18.673 to 34.965 mg/kg body weight/day, which were higher than that of erlotinib (8.057 mg/kg body weight/day). The other compounds showed lower carcinogenic potency TD50 values. In addition, the tested compounds showed rat maximum tolerated dose values ranging from 0.139 to 0.735 g/kg body weight. This range is higher than the rat maximum tolerated dose value of erlotinib (0.083 g/kg body weight).

Molecular dynamic simulations
To study the stability and the binding strength of the protein-compound 12b complex, GROMACS 2021 was used to run a 100 ns classical molecular dynamics simulation, and the trajectory was analysed using VMD. RMSD for the protein alone, compound 12b alone, and the complex (Figure 14(A)), RMSF (Figure 14(B)), SASA (Figure 14(C)), RoG (Figure 14(D)), and the change in the hydrogen bonds for the protein in the protein-compound 12b complex (Figure 14(E)) were calculated. The distance between the centre of mass of the protein and the centre of mass of compound 12b (Figure 14(F)) was measured throughout the trajectory.
Figure 6. Flow cytometric analysis of apoptosis in A549 cells exposed to compound 12b.
Table 5 a
Sample            BAX              Bcl-2           BAX/Bcl-2 ratio
Control (A549)    1.00 ± 0.22      1.00 ± 0.13     1.00 ± 0.21
12b / A549        3.33 ± 0.37 **   0.40 ± 0.08 *   8.80 ± 1.41 **
a Values are given as changes from the corresponding control (A549) group, which is set to '1'. * p < 0.05 and ** p < 0.01 indicate statistically significant differences from the corresponding control in unpaired t-tests.
RMSD values show that the system was stable throughout the trajectory with no drastic fluctuation and an average of 2.26 Å for the protein alone. For compound 12b alone, the RMSD showed a stable trend over almost all of the trajectory, with two exceptions: the intervals from 44.5 ns to 54.6 ns and from 86 ns to 91.6 ns show RMSD values larger than 2 Å. The RMSD of the complex showed a trend similar to that of the protein alone, with slightly larger values. In addition, the RMSF values showed that most of the amino acids have fluctuations of less than 2 Å, except for the C-terminal (around 6 Å) and the loop E842-P853, which reaches a maximum of 3.5 Å. The SASA (average = 15301 Å²), RoG (average = 19.58 Å), and the change in the number of H-bonds (average = 58 bonds) showed that the protein is stable and did not undergo a change in its folded state. The change in the distance between the centre of mass of compound 12b and that of the protein showed stable binding with an average distance of 9.95 Å.
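Intervals such as the two ligand RMSD excursions above 2 Å can be located programmatically from an RMSD time series. The following is a minimal sketch using a hypothetical toy series; the function name and the data are illustrative and do not come from the reported trajectory or from any GROMACS or VMD tool.

```python
def windows_above(times_ns, values, threshold):
    """Return (start, end) time pairs of contiguous runs where value > threshold."""
    windows = []
    start = None   # start time of the run currently open, if any
    prev_t = None  # time of the previous sample
    for t, v in zip(times_ns, values):
        if v > threshold and start is None:
            start = t                        # a new run begins at this sample
        elif v <= threshold and start is not None:
            windows.append((start, prev_t))  # the run ended at the previous sample
            start = None
        prev_t = t
    if start is not None:                    # the series ended inside a run
        windows.append((start, prev_t))
    return windows


# Hypothetical ligand RMSD samples (Å) every 10 ns, for illustration only.
times = [0, 10, 20, 30, 40, 50, 60]
rmsd = [1.2, 1.4, 2.3, 2.5, 1.6, 2.1, 1.5]

excursions = windows_above(times, rmsd, threshold=2.0)
print(excursions)
```

With a real trajectory, `times` and `rmsd` would be read from the RMSD output of the analysis tool, and the same function would report the excursion windows directly.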
The trajectory was clustered using TTClust to obtain the different clusters and a representative frame for each one. To know the different types and numbers of interactions, PLIP was utilised to detect the interactions between compound 12b and the protein in the representative frames for each cluster. Table 10 showed the types and numbers of interactions produced from PLIP. Most of the interactions are hydrophobic with only one amino acid forming a hydrogen bond with the compound 12b. Figure 17 showed the 3D conformations for the complex in representative frames of each cluster.

In vitro cytotoxic activity
In vitro cytotoxicity was carried out for the synthesised compounds against A549, HCT-116, and WI-38 cell lines using the MTT assay protocol 52-55 as described in Supplementary data.

In vitro EGFR kinase assay
In vitro EGFR inhibitory activity was assessed using a Homogeneous time-resolved fluorescence (HTRF) assay 56 as described in Supplementary data.

Apoptosis analysis
The effect of compound 12b on cell apoptosis was investigated as described in Supplementary data 59 .

Quantitative Real-Time Reverse-Transcriptase PCR (qRT-PCR) technique
The effect of compound 12b on the expression of BAX and Bcl-2 was determined using qRT-PCR as described in Supplementary data 60-62 .

Toxicity studies
The toxicity parameters of the synthesised compounds were calculated using Discovery studio 4.0 65,66 as described in Supplementary data.

MD simulations and MM-GBSA
The CHARMM-GUI web server was employed to prepare the system, and GROMACS 2021 was used as the MD engine, as outlined in Supplementary data. The gmx_MMPBSA package was used for the MM-GBSA calculations, as outlined in Supplementary data [67][68][69][70] .

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This paper is based upon work supported by Science, Technology & Innovation Funding Authority (STIFA) under grant number 43327.