Received date: February 26, 2014; Accepted date: April 02, 2014; Published date: April 10, 2014
Citation: Sehgal SA, Tahir RA, Shafique S, Hassan M, Rashid S (2014) Molecular Modeling and Docking Analysis of CYP1A1 Associated with Head and Neck Cancer to Explore its Binding Regions. J Theor Comput Sci 1: 112. doi:10.4172/2376-130X.1000112
Copyright: © 2014 Sehgal SA, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Visit for more related articles at Journal of Theoretical & Computational Science
Cytochrome P450, family 1, subfamily A, polypeptide 1 is a phase I enzyme of cytochrome super family P-450 (CYP) involved in detoxification or conversion of carcinogens into a more electrophilic form, metabolized by phase II enzymes. These detoxifying enzymes have been widely studied in association with head and neck cancer. Multiple bioinformatics tools are applied for CYP1A1 modeling and its assessment. Homology based modeling from 2HI4 template was carried out by MODELLER 9v10 bioinformatics software. All evaluation tools confirmed the reliability of predicted model. The binding pockets were revealed for binding studies. Inhibitor (C6H13FN2O2) showed maximum binding affinity against CYP1A1. Docking studies revealed that Leu-21, Val-22, Phe-23, Gly-42, Pro-43, Gly-45, His-51, Gln-75 and Ile-76 are critical residues for receptor-ligand interaction. We propose that predicted structure is reliable for the structural insights and functional studies and selected inhibitor might be more potent for Head and Neck Cancer. Further analysis of this inhibitor through site-directed mutagenesis could be helpful for exploring the details of ligand binding pockets. Overall, findings of this study may be helpful in designing the novel therapeutic targets to cure Head and Neck cancer.
Homology modeling; CYP1A1; Head and neck cancer; Bioinformatics; Molecular docking
Carcinomas of Head and Neck (HNC) include oral cavity, larynx and pharynx. HNC is the sixth most occurring cancer in all over the world , and there are about half million of diagnosed cases in every year . In Pakistani population, HNC is the second most prevalent cancer .
Regular in-taking of alcohol and smoking are the main risk factors for HNC [4-7].Mainly, cigarettes and alcohol have many carcinogens that metabolized to active forms having deleterious effects on the body . There are two types of Xenobiotic metabolism enzymes i.e. mediated oxidative metabolism (Phase I), and enzyme conjugate (Phase II). Phase I enzymes are mainly of cytochrome super family P-450 (CYP) that convert many compounds into highly reactive metabolites. Many oxidation and some reduction reactions, involving thousands of substrates are catalyzed by these enzymes . A procarcinogen becomes carcinogenic after the introduction of one or more hydroxyl groups on substrate. Phase II reactions conjugate with endogenous substrates by UDP-Glucoronosil Tranferases, Glutathione S-Transferases (GSTs), and N-Acetyl Transferases (NATs), that catalyze the conversion of reactive electrophilic to inactive, easily removable water soluble conjugates . Gene polymorphisms, encoding Xenobiotic metabolism enzymes may change the activation or detoxification associated with function or expression of carcinogenic compounds [10-13].
Individuals with these polymorphic genes have greater risk of cancer when carcinogens are exposed to them . Androtsopoulos et al. (2009) reviewed and concluded that constituents of dietary products suppress the progression of cancer by the CYP1A1 through enzyme induction of carcinogens and inhibiting the activation of CYP1A1-catalyzed metabolic pathway. Recent in vivo investigation confirmed that CYP1A1 gene may function as a carcinogen detoxification enzyme .
CYP1A1 in human encodes  Cytochrome P450, family 1, subfamily A, polypeptide 1 protein , a member of cytochrome P450 superfamily of enzymes . The association of CYP1A1 and CYP2E1 confirmed the increased risk of oral cancer related to cervical cancer and esophagal cancer associated respectively [19-22].
Bioinformatics is an emerging scientific field that utilizes computational, mathematical and statistical approaches for in silico solutions of biological problems. The number of proteins with known sequences is increasing with the advent of sequence technologies. A large number of protein structure predictions are very expensive and time consuming by X-Ray crystallography and NMR methods. Bioinformaticians apply different tools for comparative modeling and docking of proteins in a fast and easy way. Homology Modeling, Threading and Ab Initio in silico approaches were utilized to solve the structure prediction problems. The three dimensional (3D) structure of the CYP1A1 is not reported in PDB. In this work, a computational approach is applied to predict the 3D structure of the CYP1A1 by homology modeling and to reveal the insights of CYP1A1 3D structure. 3D structure is necessary for drug designing because without 3D structure, it is so hard to elucidate the inhibitor binding sites. 3D structure is an effort to create an environment of protein and then analyze the drugs. MODELLER 9.10 is reliable and most cited software for homology modeling. It predicts the authenticated loops, beta sheets, coils, helices, phi and psi angles by utilizing template structure. This method generates valid and reliable structures by using suitable template having appropriate amino acids identity.
Sequence retrieval and 3D model building
The complete amino acid (512 aa) sequence of CYP1A1 encoding protein was retrieved from Uniprot Knowledgebase database with accession number A0N0X8  in FASTA format. NCBI Basic Local Alignment Search Tool (Psi-BLAST)  was used against Protein Databank (PDB) for suitable template search with query sequence. The top prioritized template with query sequence (PDB ID: 2HI4) was selected having 73% identity score and E value 0.00. Query sequence and template structure were aligned and 3D structure was built by MODELLER 9v10 . The MODELLER assists in 3D structure prediction of proteins by satisfaction of spatial restraints .
The recognition of errors in experimental and theoretical models of protein structures is a major problem in structural bioinformatics. There is no single method that consistently and accurately predicts the errors in 3D structure . Different evaluation tools were used for the assessment of protein structure. From models generated by MODELLER, the model was selected on the basis of MODELLER evaluation score.
The model was further accessed by ProCheck . ProCheck generates Ramachandran plot for the assessment of models along with distribution of residues in favoured, allowed and outlier regions.
The quality of 3D model was verified by ERRAT , ProSA (Protein Structure Analysis), Verify3D and ANOLEA (Atomic Non- Local Alignment Environment Assessment) tools. ERRAT generated a plot indicating the confidence and overall quality of model. ProSA (Protein Structure Analysis)  calculated an overall quality score of the predicted structure. Verify3D checked the compatibility of model (3D) with its own amino acid sequence (1D) and assigned a score for each residue . Energy calculations were performed by ANOLEA (Atomic Non-Local Environment Assessment) server  that gave Non-local normalized energy Z-score of model. The binding pockets were explored by Site Hound  and CastP.
Numerous tools and servers were utilized to analyze for inhibitor that might potentially inhibit CYP1A1 by interacting with its predicted structure such as AutoDock , Chimera 1.6 , VegaZZ , Chemdraw , mCule , Molinspiration  and Osiris Property Explorer. Docking studies were done by AutoDock tools. The number of rotatable bonds, H-bond acceptors and H-bond donors were obtained using MCule and Molinspiration. The online tool Osiris Property Explorer was employed to estimate their possible tumorigenic, reproductive or mutagenic risks and to calculate the drug like properties of inhibitor. Lipinski’s rule of five was analyzed by mCule server. The drug score percentage calculated by Osiris software. The mCule and Osiris programs were employed to estimate the mutagenesis of novel molecules and no mutagenic risks were detected.
C6H13FN2O2 inhibitor  was found for HNC in literature. The aim of docking analysis was to identify the binding pattern and the relative binding specificities. The geometry optimization and energy minimization of inhibitor was performed by Vega ZZ and ChemDraw Ultra. Results were analyzed by using AutoDock tools and Chimera 1.6. Binding interactions were elucidated using AutoDock. Default parameters with generating 100 poses of complexes were utilized. The grid box was used to define the screening site. Lowest binding energy complex was selected and visualized by Chimera 1.6.
MODELLER generated 3D models by using the 2HI4 template having 73% similarity with query sequence. DOPE scores of models were obtained from MODELLER log file and model having least DOPE score and lowest MODELLER objective function was selected (Figure 1).
For further evaluation of the predicted 3D structure, model was submitted into RAMPAGE tool. Ramachandran plots showed Φ and Ψ distributions of non-Glycine, non-Proline residues (Figure 2) and gave residues distribution (Table 1). The phi and psi angles originated were plotted against each other to differentiate the favorable and unfavorable regions. These angles were used to evaluate the quality of regions.
|Residues Distribution||% Residues Expected||Residues in Model|
|% residues in Favoured region||~98.0%||496 (97.3%)|
|% residues in Allowed region||~2.0%||13 (2.5%)|
|% residues in Outlier region||1 (0.2%)|
Table 1: Ramachandran plot calculations of 3D model of CYP1A1 by RAMPAGE.
Model validation by Errat
3D predicted model of CYP1A1 was subjected into the Errat protein structure verification server. Errat provided an Overall quality factor of model as 82.121%, which is very much satisfactory (Figure 3). On the error axis, two lines were drawn to indicate the confidence with which it is possible to reject the regions that exceed the error value. It expressed as the percentage of the protein for the calculated error value falls below the 95% of rejection limit. Good high resolution structure generally produce values around 95% higher. For lower resolutions (2.5 to 3.0 Å), the average overall quality factor is around 91%. Errat produces a plot that gives the value of error function which showed confidence limits by comparing with statistical analysis from highly refined predicted structures. So, 82% overall value is much reliable and satisfactory for further analysis.
Model validation by ProSa
ProSa gave Z-score plot (Figure 4a) of protein structure to determine the overall model quality. The Z-score of the predicted model was -8.76 which represents a good quality model. The local model quality was judged by plotting energies as functions of amino acids in ProSa residue score plot (Figure 4b).
Verify3D showed that 90% of the amino acids exhibited scores ≥ 0.2 in the 3D/1D profile. Predicted model was further assessed by other evaluation tools, namely PROCHECK and ANOLEA which gave satisfactory results suggesting reliability of model (Table 2).
Table 2: Evaluation results of Model by PROCHECK, VERIFY3D and ANOLEA.
Masood et al. (2011) demonstrated that genetic polymorphisms of CYP1A1 gene along with environment factors are the main cause of HNC . The most prevalent area of HNC is oral cavity. These results further supporting the earlier findings but differ only in increased rate of HNC occurrence [41-43].
The top ten binding pockets of CYP1A1 were identified which ranked on the basis of energy (Table 3). The volume of the binding pockets was also analyzed in x-axis, y-axis and z-axis. The binding residues of the binding cavities explored for fruitful binding of novel ligands. The binding pockets of CYP1A1 are not reported yet. So, the in silico approaches utilized for the prediction of binding pockets in this study. The energy range of predicted cavities also elaborates the efficacy of pockets. The mutational study of binding residues suggested that these residues could be used as clinical prospectus against cancer study. The predicted binding residues lead to the drug designing of lead compounds against HNC.
|1||-1325.45||(-18.09, -8.97)||102.00||0.868||17.702||17.070||ILE-115, Ser116, Ser-122, Phe-123, Asn-221, Phe-224, Gly-225, Leu-254, Asn-255, Phe-258, Leu-312, asp-313, Gly-316, Ala-317, Asp-320, Thr-321, Ile-386, Leu-496, Thr-497|
|2||-1313.77||(-17.14,-8.90)||113.00||6.054||25.393||18.356||Arg-106, Arg-106, Mat-121, Ser-122, Arg-135, Asp-313, Ala-317, Thr-321, Phe-381, Val-382, Thr-385, Ile-386, Gln-411, Ile-449, Phe-450, Gly-451, Arg-455, Lys-456, Cys-457, Ile-458, Ala-463, Leu-496|
|3||-922.44||(-16.53,-8.90)||83.00||14.743||33.033||24.806||Leu-89, Arg-93, Leu-373, Phe-376, Val-442, Ser-444, Glu-445, Lys-446, Val-447, Ile-448, Ile-449, Phe-450, Gly-451, Met-452, Lys-456, Glu-460, Arg-464|
|4||-856.76||(-15.77,-8.97)||73.00||18.180||24.881||-5.248||Phe-23, Trp-24, Ile-26, Arg-27, Ala-28, Ser-29, Asn-39, Gln-75, Arg-77, Thr-81, Pro-82, Pro-402, Lys-403, Gly-404, Arg-405, Cys-406|
|5||-756.33||(-17.34,-8.93)||64.00||27.587||22.959||-2.826||Leu-21, Val-22, Phe-23, Arg-27, Lys-38, Asn-39, Pro-40, Pro-41, Gly-42, Pro-43, Trp-44, Gly-45, His-51, Gln-75, Ile-76, Arg-77, Phe-399|
|6||-735.99||(-19.40,-8.90)||61.00||-9.884||17.397||34.957||Lys-165, Glu-166, Val-169, Val-197, Cys199, Ala-200, Gly204, Arg-205, Tyr-207, His-209|
|7||-720.08||(-18.90,-8.93)||61.00||3.903||18.004||52.343||Ile-171, Arg-353, Ser-354, Arg-355, Arg-356, Ile-473, Gln-476, Arg-477, Leu-510, Arg-511, Ser-512|
|8||-674.77||(-16.19,-8.90)||60.00||15.741||9.268||11.292||Met-52, Leu-53, Leu-55, Gly-56, Lys-57, Pro-59, Glu-226, Gly-229, Ser-230, Gly-231, Asn-232, Pro-233, Ser-247, Ile-493, Tyr-494|
|9||-599.82||(-17.11,-9.00)||52.00||29.742||28.345||16.146||Ser-67, Gln-68, Gln-69, Gly71, Asp-72, Ser-87, Gly-88, Leu-89, Asp-90, Gln-413, Asp-417, Gln-418, Lys-419, Leu-420, Lys-446|
|10||-573.59||(-13.44,-8.91)||56.00||10.125||19.961||-0.815||Trp-24, Ser-80, Thr-81, Pro-82, Arg-106, Pro-107, Asp-108, Leu-109, Tyr-110, Pro-125, Asp-235, Phe-236, Pro-238, Arg-241, Ser-389, Gly-404, Cys-406|
Table 3: Top ten binding pockets with binding residues, energy and volume.
AutoDock 4.0.2 tool was employed to explore how the ligand binds to the respective protein, best structural information, functionally interacting residues and the binding conformation. The ligand retrieved for CYP1A1 (Figure 5) from Tahir et al. (2013)  is described in Table 4. The protein-ligand docked complex post docking analysis and amino acids found in binding pocket of protein were identified by Chimera 1.6 (Figure 6). In an endeavor to inspect, It was observed that Leu-21, Val-22, Phe-23, Gly-42, Pro-43, Gly-45, His-51, Gln-75, Ile-76 residues exhibit good binding interactions with inhibitor. It was also analyzed and observed that the inhibitor bind at the binding residues between Leu-21to Ile-76. The inhibitor bind in the binding pocket-5 (Table 3) having volume 64.00 Å.
|Molecular Weight (g/mol)||164.10|
|Hydrogen Bond Acceptor||04|
|Hydrogen Bond Donor||02|
|Binding Energy (Kcal/mol)||-3.99|
|Inhibition Constant (mM)||1.19|
|Interacting residues||Leu-21, Val-22, Phe-23, Gly-42, Pro-43, Gly-45, His-51, Gln-75, Ile-76, Arg-77|
Table 4: Inhibitor properties and binding residues.
In this study, in silico methodologies such as homology modeling and docking analysis were carried out. Three dimension structure of CYP1A1 was modeled by employing the crystal structure template. The predicted structure of CYP1A1 has a good degree of accuracy. The final refined model was assessed by different evaluation programs. Ramachandran plot values indicated ideal results of predicted model as residues in favoured regions which are 97.3%, while only 1 (0.2%) residue was in the outlier region. Overall model quality was measured by ERRAT showing reliable model having 82% overall quality factor. Similarly, Verify3D gave 90% score of predicted model. The molecular docking analyses were conducted by automated docking Tools (AutoDock) that has allowed to conclude the ligand-receptor interactions of selected inhibitor. The analyses of the interactions between CYP1A1 and inhibitor have pointed out the best complex with least binding energy. The results of docking analysissuggest that inhibitor must be the one that comes under the designed parameters of having a less docking score and for satisfying its drug likeness. By analyzing drug score and Lipinski’s rule of five, it is suggested that inhibitor (C6H13FN2O2) is proven as a potential inhibitor for HNC treatment by targeting CYP1A1. Our docking results revealed the involvement of Leu-21, Val-22, Phe-23, Gly-42, Pro-43, Gly-45, His- 51, Gln-75, Ile-76 residues and mutational studies of these residues could be highly effective in further studies. Predicted protein model described in this work may be further used for finding interactions with other proteins involved in HNC or in different cancers.
In conclusion, this analysis suggests that the selected inhibitor is efficacious in the treatment of HNC. Though numerous differences exist between baseline population, computational analysis and trail methodology studies, it seems to be justified to conclude that selected inhibitor may be a good option for the treatment of HNC. Further studies and synthesis of novel compounds considering these findings can expect similar response rates and cure the HNC.
We are grateful to Miss Saima Younas, Functional Informatics Lab, Quaid-i- Azam University Islamabad for kind assistance.