Receptor Chemoprint Derived Pharmacophore Model for Development of CAIX Inhibitors

Background: Carbonic anhydrase IX (CAIX) is an attractive target for anticancer therapy because it is selectively overexpressed in tumor cells. Various CAs’ inhibitors (sulfonamides/sulfaumates and coumarins) are reported as promising anti-cancer agents, showed appreciable affinity and selectivity. Novel chemical scaffolds with improved pharmacological properties are essential for the development of safe and potent CAIX inhibitors.


Introduction
Carbonic anhydrase (CA) enzymes are zinc containing metalloproteins, which efficiently catalyze the reversible conversion of carbon dioxide (CO 2 ) to bicarbonate (HCO 3 -) and release proton (H + ) [1]. In human, all 16 different isoforms of CA belong to α-class, and show a considerable degree of three-dimensional structural similarity with a conserved catalytic domain [2]. However, they differ widely in their cellular localization and physiological process [3]. CAs have been a potential therapeutic target for numerous pathological conditions [1,4]. Based on the Structure-Activity Relationship (SAR) studies, various α-CAs' inhibitors were designed and synthesized. These inhibitors can be broadly classified into (i) inorganic anions, (ii) sulfonamides and their bioisosteres (sulfamates, sulfamides, etc.), (iii) phenols and (iv) coumarins [5][6][7]. These inhibitors are widely used as drug for treatment of neurological disorders, glaucoma, epilepsy, ulcer, cancer and obesity [1,5,6]. However, most of the currently used CA inhibitors showed cellular toxicity and scarce selectivity including various side effects [6,8]. Thus, it is important to develop a potential and selective CA inhibitors as a therapeutic anti-cancer agent [4,9].
Among all CAs, CAIX is very peculiar, due to its limited expression in normal tissues and predominant expression in varieties of tumor cells [3]. CAIX actively participates in the regulation of pH in tumor cells, cell proliferation, cell adhesion, cell invasion, etc. CAIX is considered as a suitable target for cancer therapy [4,9]. Furthermore, extracellular location of this isozyme is favourable for designing selective inhibitor which can inhibit membrane associated CAs without interacting to other cytosolic and mitochondrial CAs. Recently, a high resolution crystal structure (2.20 Ǻ) of human CAIX has been determined that can be used for designing a selective inhibitor using structure based rational drug design approach [3,9].
Advancements in computational techniques [8][9][10] played a vital role in High Throughput Screening (HTS) to design new chemical entities [11,12]. Pharmacophore modelling, Quantitative Structure Activity Relationship (QSAR), virtual screening, docking simulation and pharmacokinetics based analyses are used for development of new chemical entities [10,13]. However, the ligand based pharmacophore models are significantly depend on selected training set and conformation generation method [14]. This technique is more suitable, when protein structure or its complex with ligand has limited information [15]. Therefore, the physicochemical properties and spatial position of the active site residues of a receptor, so called "chemoprint" could be a new endeavour for the development of receptor specific potential inhibitors [16].
various drug targets [17][18][19][20]. In the present study, docking simulations are carried out by various sets of experimentally known CAIX inhibitors, and interacting active site residues were sorted accordingly to pharmacophore features with LigandScout 3.1 [21]. Virtual screening was found to be a successful method especially when combined with molecular docking studies [10]. The predicted pharmacophore has been employed in virtual screening with a ZINC database to identify a plausible potent lead for CAIX inhibitor. The docking analysis elucidated spatial interaction of novel hit molecules. ADMET analysis is carried out for their probable pharmacokinetic profile in terms of lipophilicity (logP), water solubility (logS W ) and polar surface area (PSA). We have used TOPKAT tool to examine the lead molecules. Analyses of these results are quite impressive for further investigation of ZINC03363328, ZINC08828920, ZINC12941947, ZINC03622539 and ZINC16650541 as a possible lead in development of CAIX inhibitors in cancer therapy.

Pharmacophore modeling
Discovery studio 3.5 (Accelrys San Diego, USA) [22,23] and LigandScout 3.1 [21] were used to generate pharmacophore models. Based on the published literature, 25 diverse set of experimentally known CAIX inhibitors were collected from literature [1,6,7,24,25] ( Figure S1) and structures were drawn with Discovery studio 3.5. Conformers of all compounds were generated using the 'best quality' option with an energy threshold of 20.0 kcal mol −1 and a maximum of 250 conformers sorted according to the Poling algorithm [26]. These conformational models were used in subsequent docking simulations, carried for CAIX-inhibitor complex formation according to LigandScout 3.1 manual. LigandScout 3.1 [21] uses algorithm that allows automatic generation of 3D-pharmacophore from structural data of proteinligand complex that was used to generate 25 individual complexes based pharmacophore models. All pharmacophores were aligned together to create multi-complex-based comprehensive map to define a common and best interacting features corresponding to interacting residues and binding affinity of inhibitors with CAIX. All pharmacophore features identified by LigandScout were clustered according to their interaction pattern with the receptor. Hydrogen bond donor (HBD) and hydrogen bond acceptors (HBA) were clustered for polar interaction in terms of residues. Density-based clustering methods were used for hydrophobic, positive and negative ionizable and ring aromatic features. Amino acid residues present in vicinity of active site were clustered as excluded volume. The distance of neighboring residues from their ideal position were summed to define absolute geometric deviations. The absolute geometric deviation (d a ) for n neighbors, a list of ideal points I 0 …(n-1), and a list of observed points O 0 …(n-1) are expressed as follows: The relative geometric deviation (d r ), which can be used to compare geometric deviations of different geometric bodies as shown below.
Pharmacophore features showing spatial interaction with important residues of binding pocket and consistent with the crystal structure data were merged to form dynamic structure-based pharmacophore.

Molecular docking
Atomic coordinates of crystal structure of CAIX (PDB ID: 3IAI) was retrieved from Protein Data Bank (www.rcsb.org). The optimized co-ordinates of inhibitors were saved in .pdbqt format with babel-2.2.3 to carry docking analysis. All inhibitors in this study were docked into binding site of CAIX using AutoDock 4.2 [27] with standard protocol. The Lamarckian genetic algorithm (LGA) was applied to deal with protein-inhibitors interactions [28,29]. Polar hydrogen atoms were added geometrically. Kollman united atom charges were assigned to protein and PDBQT file was created. The 3-D affinity grid fields with grid map of 60 × 80 × 60 points were created using the auxiliary program AutoGrid. A grid-point spacing of 0.375 Å (roughly a quarter of the length of a carbon-carbon single bond) and a distance-dependent function of the dielectric constant was used for the calculation of energetic map. Other parameters were defined accordingly to our previous report [30,31]. PASS method with Perl script was applied to calculate the center of mass (COM) of the active site [32]. For evaluating binding energy in the docking step, Columbic electrostatic potential, van der Waals interaction represented as a Lennard-Jones12-6 dispersion/repulsion term and hydrogen bonding represented as a directional 12-10 term were taken into account. The most favorable free energy of binding were attained by considering docking orientations lying within the range of 2.0 Å in root-mean square deviation (rmsd) tolerance and clustering each other to get result. Top posed docking conformations obtained were subjected to post-docking energy minimization on Discovery Studio 3.5. All calculations were carried out on workstation machines running Linux x86 as operating systems. The resultant structure files were analyzed using PyMOL visualization programs [33]. The receptor-inhibitors complexes were used to define the pharmacophore model.

Database search and hits selection
Best aligned CAIX-pharmacophore model was used as a 3-D query for searching potent compounds from ZINC chemical database having approximately 2 × 10 8 compounds. Database screening was performed using ligand pharmacophore mapping protocol [34]. Compounds which fitted minimum (n-2; n is total feature of present pharmacophore) pharmacophore features were considered as hits. Docking simulation was carried to validate and reduce hit molecules. These molecules were further filtered with ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity properties) and TOPKAT (TOxicity Prediction by Komputer Assisted Technology) tools of DS3.5 [22,23].

Docking simulation
The docking simulations were carried out with CAIX inhibitors to identify the active site residues involved in inhibitors interaction to define a chemoprint used for comprehensive mapping of pharmacophore (Supplementary Table S1). Docking results showed that CAIX active site is located in a large conical cavity which spans from the surface to the center of the protein and inhibitors spatially fits horizontally (12-14Å). The zinc ion (Zn +2 ) is located at the bottom of this cavity surrounded by His-64, His-94 and His-119. The two distinct hydrophobic ends with core having hydrophilic amino acids restrict the active site of CAIX. The lower hydrophobic domain is defined with Leu-91, Leu-93, Leu-198, Val-121, Leu135, while upper hydrophobic region consists of hydrophobic amino acid residues (Leu-141, Val-143, Pro-201, Pro-202). Trp-5, Tyr-7, Trp-209 and Phe-245 are involved in ring aromatic, π-π interaction ( Figure 1A). In the crystal structure, Leu-91, Val-121, Val-131, Leu-135, Leu-141, Val-143, Leu-198, and Pro-202 define a hydrophobic region, while Asn-62, His-64, Ser-65, Gln-67, Thr-69, and Gln-92 have been identified as hydrophilic active site residues [3]. However, docking orientation of CAIX inhibitors also showed many polar interactions with His-94, His-119, Glu-106, Thr-199 and Thr-200. In these His-94 and His-119 were defined as HBD, and Glu-106, Thr-199 and Thr-200 were designated as HBA. Docked poses of CAIX inhibitor acetazolamide (AAZ) resembled spatial orientation as in crystal structure (PDB ID: 3IAI) ( Figure 1B) and showed a reasonable rmsd value of 0.23. Correlation coefficient graph of binding affinity of CAIX inhibitors is shown in Figure 2. A high level of correlation (r 2 =0.92) is observed between the experimental and predicted binding affinity of CAIX inhibitors. The docking score and pharmacophoric feature of all selected inhibitors are given in Supplementary Table S2. It was observed that docked scores are very close to the experiments value [1,3,4]. Residues involved in ligand interactions were sorted according to pharmacophoric feature. The best-scored docking pose of all CAIX inhibitors were imported to LigandScout for the generation of pharmacophore.

Structure based pharmacophore modeling
Resulting CAIX inhibitor(s) complex was (were) used for identification of pharmacophore features. These features were obtained as a collection of all possible combinations of six features: HBD, PI (positive ionisable), HBA, H (hydrophobic), AR (aromatic ring) and NI (negative ionisable) in three-dimensions. The best quantitative pharmacophore model showed HBA (3), HBD (2), PI (1), H (2) and AR (1) as essential 3-D features of potent CAIX inhibitors ( Figure 3A). AAZ mapped well to CAIX pharmacophore with pharmacophore fit score 0.91 ( Figure 3B). The methy (-CH 3 ) of acetamide preferentially mapped with hydrophobic feature, oxygen and amino of acetamide mapped with HBA and HBD. Thiadiazol ring fitted with AR and sulfonamide moiety mapped with HBD and HBA features of pharmacophore. The pharmacophore mapped 2-D feature description of AAZ is shown in Figure 3D. Thiadiazol ring 3, 4 nitrogens (N), both are accurately fitted with HBA and amino (-NH 2 ) of acetamide as well as sulfonamide exhibited HBD feature. For a full pharmacophore map, it is also important to include excluded volume features which reflect potential steric restriction and correspond to the positions that are inaccessible to ligands. Among four selected residues one hydrogen bond donor (Asn-62) and three hydrogen bond acceptor (Gln-67, Glu-106 and Thr-199) are defined as pharmacophore features.
The LigandScout approach allows the fast generation of active site residues based pharmacophore models for development of CAIX inhibitors. The method captures CAs inhibitors spatial fitting in the active site and their interaction with residues as pharmacophore feature. Recently, ligand based CAIX pharmacophore model has suggested that HBA-(3) and HBD-(1) are essential features for CAIX inhibitor, which is consistent with the present pharmacophore. However, other essential features like PI, AR and H of our model were completely missing there [35]. Moreover, active site of CAIX is characterized by two distinct hydrophobic domains (extended ≈ 12Å) with core electron rich region involving various polar interactions, which is well described in the present model. Results suggest that hydrophobic feature may act as a decisive component to enhance the activity and selectivity of CAIX inhibitors. Furthermore, orientation and localization of CAIX active site residues (Gln-67, Thr-69, Leu-91, Val-131, Asp-132, Leu-135, Gly-136, and Ala-204) that differed significantly among other CAs, [2,3] are well described in our model (Figures 1 and 3). Residues, Gln-67 and Thr-69 defined as HBA, and Asp-132, is designated with PI feature. Leu-91, Val-131, Asp-132, Leu-135 are define as hydrophobic feature. Physiochemical properties of the active site residues and selected pharmacophore features are consistent with CAIX crystal structure [4,6,8] ( Figure 1A and Supplementary Table S1).
This pharmacophore is a true representative of CAIX active site, and chemoprint based method showed superior efficiency than earlier ligand based pharmacophore model [8,35], where quality of model affected significantly with training data set selection and conformation generation method [14,15]. Moreover, Catalyst tool includes ligand information only to define the pharmacophoric feature constraint, whereas in LigandScout, the spatial complementarities of ligands with physiochemical properties of active site residues are used to determine the pharmacophore features [36]. All selected novel hits are well spatially fitted with pharmacophore and showed relative pharmacophore fit hit ≥ 0.95.

Virtual screening
Best fitting pharmacophore model was used as a 3-D structural query for retrieving potential CAIX inhibitors from ZINC chemical databases, which consists of 2 × 10 8 compounds. As a result, a total of 1242 compounds which showed good mapping with pharmacophore (relative pharmacophore fit score>0.95), are selected and subsequently subjected to molecular docking analysis. A total of 321 hits showed an appreciable binding affinity with CAIX. The top ranked 10 molecules, have binding affinity (∆G ≤ 10 kcal/mol) and estimated inhibition constant (Ki ≤ 10 nM) are shown in Supplementary Table S2. The best 10 screened compounds (Figure 4) with their pharmacophore fit scores are given in Supplementary Table S3). All novel hits are accurately fitted within CAIX active site and their interactions were comparable to those of AAZ [3]. The predicted hits are further evaluated with ADMET and TOPKAT tools of Discovery Studio 3.5.

Pharmacokinetics and toxicity
Most of drug molecules fail in clinical trials due to weak pharmacokinetic profile and cellular toxicity. Therefore, in silico pharmacokinetic profile of selected hits was evaluated to address the putative bioavailability to CAIX inhibitors (Table 1). Physicochemical properties, especially aqueous solubility (logS), lipophilicity (clogP), polar surface area (PSA), and molecular weight (MW) are directly associated with absorption and bioavailability of a drug molecule [13]. These properties directly affects the movement of a drug from the site of administration into the blood. Other descriptor CYPs (cytochrome P450), an important family of biotransformation enzymes, play a crucial role in drug metabolism and are equally important for disposition of drugs, their pharmacological and toxicological effects [37]. Here, ADMET (DS3.5) tool was employed to compute the probable pharmacokinetic profile of molecules. It uses QSAR models to estimate a range of ADMET related properties for small molecules. Results showed that all virtual hit molecules have ideal AlogP value ≤ 3, except ZINC14183954 (logP=3.69). However, logP value upto 4 is considerable for a drug molecule. Similarly, all hits showed good to moderate range of solubility level (solubility level=3 to 4). The best novel hit (ZINC03363328) showed better solubility with level 3, AlogP=0.013, having good absorption level 0 and minimum hepatotoxicity probability, i.e., 0.05. Only three compounds (ZINC00284720, 32118378, 16601509) showed hepatotoxicity probability score ≥ 0.5. The observed human intestinal absorption (HIA) value is good for all molecules, except ZINC16650541 having HIA level=1. The blood-brain-barrier (BBB) penetration ability of compounds are categorized in four prediction levels 0-4, having high penetration to no penetration respectively. All hit molecules showed better ability, except ZINC16650541 (BBB=4). The known CA inhibitors, AAZ, methazolamide (MZA), ethoxzolamide (EZA), dichlorophenamide (DCP) and dorzolamide (DZA), all are in clinical application showed good logP value ≤ 3, except DZA. All have good human intestinal absorption value, however, AAZ, MZA and DZA showed higher probability score of hepatotoxicity. The antiglaucoma drug brinzolamide (BRZ) having a good ADMET profile (logP=2.6, BBB=3, solubility=2 and HIA=0), showed hypotoxicity effect and is also reactive to CYP2D6. The CYP2D6 probability of all hits showed<0.5, except ZINC00284720 and ZINC16601509, demonstrated that all compounds were non-inhibitor to CYP2D6 enzyme. For good druggability, the ideal PPB (plasma protein binding) level is 0. All hit compounds, except 4, 6, 8, and 10, showed better PPB activity and come under level-0, whereas DCP, DZA and BRZ showed higher score values (level-1 and 2). PSA is dependent on the conformation and possible internal hydrogen-bonding. It implies the single low-energy Probability; Mut: Ames mutagen prediction; Prob.: Ames probability; Enrichment (Ames enrichment); WOE-_Prediction (weight of evidence); M: (Mutagen); NM: (Non-Mutagen); C: (Carcinogen); NC: (Non-Carcinogen). The BBB (blood brain barrier) level 0-4, having high penetration to no penetration, absorption level ideal value range from 0-1 as good to moderate, Ideal value of solubility level is 3, Hepatotoxicity probability<0.5 is ideal, similarly probability value for CYP2D6<0.5 is good and denoted with level 0, plasma protein binding value 0 is good and a compounds to accessible with BBB, AlogP value should not be greater than 3.0 and polar surface area ≤ 100 is ideal. conformer of the molecule. For drug activity, the optimum value of PSA ≤ 90 Å 2 is well defined [38]. The hydrogen bonding and logP are the main two descriptors to define the PSA of molecule. All predicted compounds showed appreciable PSA, whereas AAZ and EZA showed higher PSA value ≥ 90 Å 2 .
The computer-aided toxicity predictor, TOPKAT was used to predict the cellular toxicity of novel hits. Our primary emphasis was to compute the mutagen and carcinogenic effect of molecules with Ames Prediction and WOE Prediction (weight of evidence). However, it includes varieties of models and toxicity endpoints (teratogenicity, irritation, sensitization, immunotoxicology and neuro-toxicity) that are often used in drug development. All selected molecules showed Ames probability score ≤ 7 and are non-mutagen except ZINC32118378 and ZINC14183954 having probability score ≥ 7 and are anticipated as mutagen. All CA inhibitors under study are predicted as non-mutagen, except nilotinib. Other toxicity predictor WOE (weight of evidence) was used to determine the relative level of certainty of compounds/agents that may cause cancer in human. All compounds were found as noncarcinogenic, except three virtual hits ZINC00494463, ZINC32118378 and ZINC16601509. MZA and DCP are predicted as carcinogenic. The comparative ADMET score and TOPKAT property data of virtual hits with standard drug suggested that selected molecule may be exploited as bioactive molecules.

Molecular properties of lead hits
HTS of ZINC database and their docking analysis led to the identification of 10 novel hits for CAIX inhibitor. In spite of having a good ADMET score, two compounds showed mutagen effect and three are anticipated as a carcinogen in TOPKAT test (Table 1). Among 10 hits, only five hits (ZINC03363328, ZINC08828920, ZINC12941947, ZINC03622539 and ZINC16650541) showing better ADMET and TOPKAT values, are considered as possible lead molecules ( Figure 4). All novel hit compounds preferably fitted with CAIX pharmacophore with fit score ≥ 0.95. The best novel hit ZINC03363328 with fit score 0.97 observed here ( Figures 3C and 3E), showed most favourable score for all computational analyses. It is because of the presence of quinazoline derivative having piperazine scaffold at one end and azepane ring at the other end. The quinazoline is one of the most widespread chemical scaffolds amongst bioactive compounds and shown a promising results in the development of novel anti-cancer drug molecules [39]. Whereas azepane derivatives possesses anti-cell-proliferation activity [40]. Other a azepan derivative hit ZINC12941947, also showed promising pharmacokinetic score. Piperazine ring is generally substituted with compounds to enhance their water solubility. The best docking pose of ZINC03363328, having docking score (∆G=-11.82 kcal/mol; Ki=2.32 nM) showed polar interaction with Asn-62, His-94, Thr200 and Zn +2 . Methylpiperazine ring is oriented towards lower hydrophobic domain of the active site, surrounded with Leu-91, Leu-93, Leu-198, Val-121 and Trp-209 ( Figure 5). Azapane ring enclosed with Trp-5, Leu-141, Val-143, Pro-201, Pro-202 belong to upper hydrophobic domain CAIX active site. The core quinazolin ring having aromatic interaction with Trp-5, Tyr-7 and Phe-245, is also involved in polar interaction with Asn-62, His-94. The geometric constrains for hydrogen bond donor/ acceptor (sp 3 donor/acceptor and sp 2 donor/acceptor) are described in Figure 6. Hence, our results suggest that a conceptual adjustment of these hits may lead to design a novel and potent CAIX inhibitor.

Conclusions
In this study, 3D pharmacophore hypotheses were generated from active site residues involved in CAIX inhibitors interaction with the protocol implemented in LigandScout 3.1. This method have enough potency to generate pharmacophore model with single docked pose of ligand. However, multiple inhibitors are taken here to capture all possible active site residues of CAIX involve in inhibitor interaction. Docked pose of the best pharmacophore characterized by PI (1), HBD (2), HBA (3), H (2) and AR (1) are essential features for CAIX inhibitor. All selected novel hits against ZINC database are well mapped with n-2 feature with relative hit score>.95, showed a good predictive ability of pharmacophore. The observed hits are precisely sorted according to their binding affinity (ΔG ≥ 10 kcal/mol; Ki ≥ 10 nM) and spatial fitting within the active site. The best ten lead molecules, having energetically favorable interaction with active site, His-94, His-119, Glu-106, Thr-199, Thr-200 and Zn 2+ , are selected as novel hits for CAIX. ADMET analysis was used to check the lead candidates for their drug likeness and bioavailability. Finally, based on the consensus scoring values, critical interactions with the active site residues, and predicted activity values, five compounds (ZINC03363328, ZINC08828920, ZINC12941947, ZINC03622539 and ZINC16650541) are proposed as potent CAIX inhibitors. These results suggest that the application of chemoprint based pharmacophore could assist in selection of potential leads for rational design of CAIX inhibitors in cancer therapy.