Role of the Cation-π Interaction in Therapeutic Proteins: A Comparative Study with Conventional Stabilizing Forces

The cation-π interaction is an important and general force for molecular recognition in biological receptors. In this study, we analyzed the energy contribution of cation-π interactions in a set of therapeutic proteins. The cation-π interacting residues were evaluated with respect to secondary structure, solvent accessibility, stabilization centers, stabilizing residues and conservation score. Secondary structure analysis of the residues involved in cation-π interactions shows that Arg and Lys prefer to be in strand. Among the π residues, Phe prefers to be in coil, Tyr in strand and Trp in helix. Among the cation-π interacting residues, Arg and Lys were found in exposed regions, Phe and Tyr in partially buried regions and Trp in fully buried regions. Analysis of the stabilization centers of these proteins showed that all five residue types found in cation-π interactions are important in locating one or more such centers. The contribution of stabilizing residues to the cation-π interactions was also analyzed. Further, the study shows that 43 percent of the amino acid residues involved in cation-π interactions might be conserved in therapeutic proteins. The comparison between the conventional and nonconventional interactions in the data set clearly depicts the significance of the cation-π interaction in the stability of therapeutic proteins. On the whole, the results presented in this work will be useful for understanding the contribution of cation-π interactions to the stability of therapeutic proteins.


Introduction
J Comput Sci Syst Biol Volume 2(1): 051-068 (2009). ISSN: 0974-7230. JCSB, an open access journal.
The importance of therapeutic proteins has grown rapidly since the emergence of the biotechnology industry more than 30 years ago. There are approximately 140 therapeutic proteins approved in the United States and Europe, and an additional 500 in clinical trials (Walsh, 2003), with an even larger number in preclinical development. In recent years, the number of recombinant proteins used for therapeutic applications has increased dramatically. This trend has driven a variety of improvements in protein expression and stability analysis. Protein stability is determined by several interactions in the protein structure, such as salt bridges, di-sulfide bonds, conventional hydrogen bonds, electrostatic interactions, van der Waals interactions and hydrophobic interactions. These interactions are crucial in many areas of modern chemistry, especially in the field of molecular recognition and for structural stability (Hunter et al., 1990; Wintjens et al., 2000). In addition, the cation-π interaction (Dougherty, 1996; Ma and Dougherty, 1997; Scrutton and Raine, 1996) is increasingly recognized as an important noncovalent binding interaction relevant to structural biology.
An understanding of these interactions is essential for rational drug design and lead optimization in medicinal chemistry (Meyer et al., 2003). In proteins, cation-π interactions occur between the cationic side chain of lysine (K) or arginine (R) and the aromatic side chains of phenylalanine (F), tyrosine (Y) and tryptophan (W) (Chakravarty and Varadarajan, 2000). Previous studies on cation-π interactions have focused on various aspects such as their role in ligand recognition (Zacharias and Dougherty, 2002; Zhong et al., 1998; Scrutton and Raine, 1996) and protein-drug interactions (Liu et al., 2002). There are several instances where cation-π interactions have been shown to play a significant role. For example, the active site of horseradish peroxidase contains an arginine that interacts with the adjacent tyrosine residue to allow aromatic donor binding (Ma and Dougherty, 1997).
The importance of this interaction has been stressed by several investigators for its role in enhancing the stability of thermophilic proteins (Chakravarty and Varadarajan, 2000; Gromiha et al., 2002), in the folding of polypeptides (Shi et al., 2002; Burghardt et al., 2002) and in the stability of membrane proteins (Mulhern et al., 2000; Gromiha, 2003). The influence of cation-π interactions in protein-DNA complexes was studied by Gromiha et al. (2004). There is also a report on these kinds of interactions in a set of 62 non-redundant DNA-binding proteins by the same author (Gromiha, 2005). Recently, our group published work on cation-π interactions in interleukins (Anand et al., 2006) and in RNA-binding proteins (Anand et al., 2007).
One of the most commonly cited examples of cation-π interactions is the acetylcholine-binding site of acetylcholinesterase (Scrutton and Raine, 1996). The active site of this enzyme is divided into two subsites: the 'esteratic' site and the 'anionic' site. Access to the active site of the enzyme is via the deep and narrow 'aromatic gorge' which consists of 14 highly conserved aromatic residues. Studies have shown that docking of the substrate acetylcholine, at the base of the gorge, results in the cation-π binding of choline to Trp-84 in the 'anionic' site (Dougherty, 1996).
To the best of the authors' knowledge, such interaction data for a therapeutic protein data set are not yet available. Hence, in this work an effort has been made to collect information on conventional and nonconventional interactions, namely traditional hydrogen bonds, di-sulfide bonds, salt bridges and cation-π interactions, in a therapeutic protein data set. We emphasize that 43 therapeutic proteins in our data set showed a significant number of cation-π interactions, which indicates that cation-π interactions play a major role in the structural stability of these proteins. The knowledge gained from this study is important for detecting the interplay of conventional and nonconventional interactions in therapeutic proteins. This may facilitate the design of more potent, less toxic and personalized drugs using these proteins.

Data Set
We considered a set of 49 therapeutic proteins from the Protein Data Bank (Berman et al., 2000) for our investigation, the details of which are given in Table 1. According to the structural classification of proteins, 42% of the proteins in the data set belong to the alpha class, 29% to the beta class, 11% to the alpha and beta class and the remaining 18% to the small proteins class.

Computation of Cation-π Interaction Energy
The cation-π interaction energy in each protein was calculated using the program CaPTURE (Gallivan and Dougherty, 1999). Cation-π interactions were first identified with approximate distance-based criteria, and the energetically significant interactions were then selected with CaPTURE. This program provides meaningful statistics for cation-π interactions in structures within the PDB, and its simple and unambiguous protocol makes it a natural choice for the computation of cation-π energies.

The percentage composition of a specific amino acid residue contributing to cation-π interactions is obtained by the equation

$$\mathrm{Comp}_{\text{cat-}\pi}(i) = n_{\text{cat-}\pi}(i) \times \frac{100}{n(i)} \qquad (1)$$

where $i$ stands for the five residues Lys, Arg, Phe, Trp and Tyr, $n_{\text{cat-}\pi}(i)$ is the number of residues of type $i$ involved in cation-π interactions and $n(i)$ is the number of residues of type $i$ in the considered protein structures.

We computed the energetic contribution of cation-π interactions for each protein in the data set and for all possible pairs of positively charged and aromatic amino acids. The total cation-π interaction energy ($E_{\text{cat-}\pi}$) was divided into electrostatic ($E_{es}$) and van der Waals ($E_{vw}$) terms and was computed using the program CaPTURE, which implements a subset of the OPLS force field (Jorgensen et al., 1996) to calculate the energies. The electrostatic energy is calculated using the equation

$$E_{es} = \sum \frac{q_i q_j e^2}{r_{ij}} \qquad (2)$$

where $q_i$ and $q_j$ are the charges of atoms $i$ and $j$, respectively, and $r_{ij}$ is the distance between them. The van der Waals energy is given by

$$E_{vw} = \sum 4\varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] \qquad (3)$$

where $\sigma_{ij} = (\sigma_{ii}\sigma_{jj})^{1/2}$ and $\varepsilon_{ij} = (\varepsilon_{ii}\varepsilon_{jj})^{1/2}$; $\sigma$ and $\varepsilon$ are the van der Waals radius and well depth, respectively.

The electrostatic component of the OPLS binding energies ($E_{es}$) was compared with the total ab initio binding energy, and the two correlate well. A force field-based method was developed to reproduce the trends in the ab initio data, and this method was also used to select energetically significant cation-π interactions.
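The composition formula and the two energy terms described above can be illustrated with a minimal Python sketch. It is not the CaPTURE implementation. The function names, the atom tuple layout and the Coulomb conversion constant (332.06 kcal Å mol⁻¹ e⁻², a value commonly used with OPLS-style force fields) are assumptions made for illustration, and the van der Waals term is written in the standard 12-6 Lennard-Jones form with the geometric combining rules quoted in the text.

```python
import math

# Assumed conversion factor so that charges in units of e and distances in
# angstroms give energies in kcal/mol (the text writes this factor as e^2).
COULOMB_KCAL = 332.06


def composition_percent(n_cat_pi, n_total):
    """Percentage of residues of one type that take part in cation-pi pairs."""
    if n_total == 0:
        return 0.0
    return n_cat_pi * 100.0 / n_total


def pair_energy(atoms_a, atoms_b):
    """Return (E_es, E_vw) in kcal/mol between two groups of atoms.

    Each atom is a hypothetical tuple (x, y, z, charge, sigma, epsilon).
    """
    e_es = 0.0
    e_vw = 0.0
    for xa, ya, za, qa, sig_a, eps_a in atoms_a:
        for xb, yb, zb, qb, sig_b, eps_b in atoms_b:
            r = math.dist((xa, ya, za), (xb, yb, zb))
            # Electrostatic term: q_i * q_j * e^2 / r_ij
            e_es += COULOMB_KCAL * qa * qb / r
            # Geometric combining rules for sigma and epsilon
            sigma = math.sqrt(sig_a * sig_b)
            eps = math.sqrt(eps_a * eps_b)
            sr6 = (sigma / r) ** 6
            # 12-6 Lennard-Jones term
            e_vw += 4.0 * eps * (sr6 * sr6 - sr6)
    return e_es, e_vw
```

For example, `composition_percent(5, 20)` gives 25.0, meaning that a quarter of the residues of that type participate in cation-π interactions. The total energy of a pair is the sum of the two values returned by `pair_energy`.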

Secondary Structure and Solvent Accessibility Studies
Secondary structure and solvent accessibility are important for understanding the biochemical activity of proteins. Hence, a systematic analysis of each residue forming cation-π interactions was performed based on its location in the different secondary structures of the proteins and on its solvent accessibility. Solvent accessibility was divided into three classes, buried, partially buried and exposed, indicating, respectively, the least, moderate and high accessibility of the amino acid residues to the solvent. We used the program DSSP (Kabsch and Sander, 1983) to obtain the information about secondary structures and solvent accessibility. According to the Science Citation Index (July 1995), the program has been cited in the scientific literature more than 1000 times. Hence, we chose DSSP for assigning the secondary structure and solvent accessibility.
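As an illustration of how per-residue DSSP assignments can be turned into a helix, strand or coil preference, the following Python sketch tallies the assignments for each residue type. The three-state reduction used here (H, G, I to helix; E, B to strand; everything else to coil) is a common convention and is an assumption, since the mapping used in this work is not stated. The function names and the input layout are hypothetical.

```python
from collections import Counter

# Assumed three-state reduction of the DSSP one-letter codes.
DSSP_TO_STATE = {
    "H": "helix", "G": "helix", "I": "helix",
    "E": "strand", "B": "strand",
}


def three_state(dssp_code):
    """Map a DSSP code to helix, strand or coil (coil is the fallback)."""
    return DSSP_TO_STATE.get(dssp_code, "coil")


def preferred_states(assignments):
    """Return the most frequent state for each residue type.

    assignments: iterable of (residue_name, dssp_code) pairs, one per
    residue involved in a cation-pi interaction.
    """
    tallies = {}
    for residue, code in assignments:
        tallies.setdefault(residue, Counter())[three_state(code)] += 1
    return {res: counts.most_common(1)[0][0] for res, counts in tallies.items()}
```

With a hypothetical input such as `[("ARG", "E"), ("ARG", "E"), ("ARG", "H")]`, the function reports strand as the preferred state for Arg.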

Computation of Stabilization Center
Stabilization centers are clusters of residues that are involved in medium- or long-range interactions (Dosztanyi et al., 1997). Residues can be considered part of a stabilization center if they are involved in medium- or long-range interactions and if two supporting residues can be selected from both of their flanking tetrapeptides, which together with the central residues form at least seven out of the nine possible contacts. We used the SCide server, available at http://www.enzim.hu/scide (Dosztanyi et al., 2003), for this purpose.
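The definition above can be sketched in Python as follows. This is a reading of the stated criterion and not the SCide code. It assumes that one supporting residue is taken from each flanking tetrapeptide of each central residue, so that two residue triplets are compared, and that contacts are supplied as a precomputed set. The function name, the `min_sep` parameter and its default value are hypothetical.

```python
from itertools import product


def is_stabilization_center_pair(i, j, contacts, min_sep=5, needed=7):
    """Check whether residues i and j satisfy the stated criterion.

    contacts: set of frozenset({a, b}) residue-number pairs that are in
    contact (how a contact is detected is outside this sketch).
    min_sep: assumed minimum sequence separation for a medium- or
    long-range interaction.
    """
    def in_contact(a, b):
        return frozenset((a, b)) in contacts

    if abs(i - j) < min_sep or not in_contact(i, j):
        return False

    # One supporting residue from each flanking tetrapeptide (assumption).
    flanks = (range(i - 4, i), range(i + 1, i + 5),
              range(j - 4, j), range(j + 1, j + 5))
    for a, b, c, d in product(*flanks):
        triplet_i = (a, i, b)
        triplet_j = (c, j, d)
        # Count how many of the 3 x 3 = 9 possible contacts are present.
        n_contacts = sum(in_contact(p, q) for p in triplet_i for q in triplet_j)
        if n_contacts >= needed:
            return True
    return False
```

The search over supporting residues is exhaustive (at most 256 combinations per residue pair), which keeps the sketch simple at the cost of speed.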
Stabilizing residues were computed using parameters such as surrounding hydrophobicity, long-range order, stabilization center and conservation score, as described by Gromiha et al. (2004a). We used the server SRide (Gromiha et al., 2004a) for this purpose. A conservation score of ≥ 6 is the cutoff value used to identify the stabilizing residues.

Computation of Short, Medium and Long-range Contacts in Therapeutic Proteins Data Set
The residues within a sphere of radius 8 Å around a given residue were computed as described earlier (Gromiha et al., 2004b). For a given residue, the composition of the surrounding residues is analyzed in terms of their location at the sequence level. Contributions from residues at a sequence separation of <±4 are treated as short-range contacts, those from >±4 to <±20 as medium-range contacts and those from >±20 as long-range contacts. This classification enables us to evaluate the contribution of long-range contacts to the formation of cation-π interactions.
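A minimal Python sketch of this classification is given below. It assumes that one representative point per residue (for example the Cα atom) is used for the 8 Å sphere, which is not stated in the text. The separations of exactly 4 and exactly 20 are not assigned by the ranges given above, so the sketch places both in the medium-range class as a further assumption. The function names are hypothetical.

```python
import math
from collections import Counter


def contact_range(i, j):
    """Classify a contact between residues i and j by sequence separation."""
    sep = abs(i - j)
    if sep < 4:
        return "short"
    if sep <= 20:  # boundary values 4 and 20 are assigned here by assumption
        return "medium"
    return "long"


def count_contacts(coords, center, cutoff=8.0):
    """Count short-, medium- and long-range contacts around one residue.

    coords: dict mapping residue number to an (x, y, z) tuple, assumed to
    hold one representative atom per residue.
    """
    counts = Counter()
    for residue, xyz in coords.items():
        if residue == center:
            continue
        if math.dist(xyz, coords[center]) <= cutoff:
            counts[contact_range(center, residue)] += 1
    return counts
```

Calling `count_contacts` for each residue involved in a cation-π interaction gives the numbers needed to compare the short-, medium- and long-range contributions.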

Conservation Score
We computed the conservation score of cation-π interacting amino acid residues in each therapeutic protein using the ConSurf server (Glaser et al., 2003). This server computes the conservation based on the comparison of the sequence of a PDB chain with the proteins deposited in Swiss-Prot (Boeckman et al., 2003) and finds the ones that are homologous to the PDB sequence. The number of PSI-BLAST iterations and the E-value cutoff used in all similarity searches were 1 and 0.001, respectively. All the sequences that are evolutionarily related to each of the proteins in the data set were used in the subsequent multiple alignments. Based on these protein sequence alignments, the residues are classified into nine categories from highly variable to highly conserved. Residues with a score of 1 are considered highly variable and residues with a score of 9 are considered highly conserved.

Interplay of Conventional and Nonconventional Interactions in Therapeutic Protein
The conventional interactions, namely optimal hydrogen bonds, salt bridges (a negative atom, i.e. a side chain oxygen in Asp or Glu, and a positive atom, i.e. a side chain nitrogen in Arg, Lys or His, with an inter-atomic distance of less than 7.0 Å) and di-sulfide bonds (two cysteines are called a bridged pair if the distance between their sulphur atoms is between 1.5 and 2.5 Å), were computed with the help of WHAT IF (Vriend, 1990). The nonconventional cation-π interactions, as described earlier in this study, were calculated using the program CaPTURE (Gallivan and Dougherty, 1999). The comparison of these interactions with the conventional interactions on a therapeutic protein data set is probably the first such report available in the literature.
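The two distance criteria quoted above can be expressed as a short Python sketch. It is not the WHAT IF implementation. The atom sets use the standard PDB side chain atom names, the choice of which nitrogen atoms count as positive is an assumption, and the function names and tuple layout are hypothetical.

```python
import math

# Side chain oxygens of Asp and Glu (standard PDB atom names).
ACIDIC_OXYGENS = {("ASP", "OD1"), ("ASP", "OD2"), ("GLU", "OE1"), ("GLU", "OE2")}
# Side chain nitrogens of Arg, Lys and His (assumed set).
BASIC_NITROGENS = {("ARG", "NE"), ("ARG", "NH1"), ("ARG", "NH2"),
                   ("LYS", "NZ"), ("HIS", "ND1"), ("HIS", "NE2")}


def is_salt_bridge(atom_a, atom_b, cutoff=7.0):
    """True if one atom is an acidic oxygen, the other a basic nitrogen,
    and they are closer than the cutoff (in angstroms).

    Each atom is a tuple (residue_name, atom_name, (x, y, z)).
    """
    res_a, name_a, xyz_a = atom_a
    res_b, name_b, xyz_b = atom_b
    key_a, key_b = (res_a, name_a), (res_b, name_b)
    opposite = ((key_a in ACIDIC_OXYGENS and key_b in BASIC_NITROGENS) or
                (key_a in BASIC_NITROGENS and key_b in ACIDIC_OXYGENS))
    return opposite and math.dist(xyz_a, xyz_b) < cutoff


def is_disulfide(sg_a, sg_b, low=1.5, high=2.5):
    """True if two cysteine sulphur positions are 1.5 to 2.5 angstroms apart."""
    return low <= math.dist(sg_a, sg_b) <= high
```

Both functions take coordinates directly, so they can be applied to atoms read from a PDB file by any parser.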

Preference of Cationic and Aromatic Residues for Forming Cation-π Interaction in Therapeutic Proteins
The preference of amino acid residues involved in cation-π interactions was analyzed and the results are presented in Table 2. We observed that in these proteins Phe has the highest occurrence among the aromatic residues involved in cation-π interactions. Moreover, only 50% of the Trp residues are involved in these cation-π interactions, in comparison with Phe and Tyr. Among the cationic residues, Lys occurs more often than Arg in the set of therapeutic proteins studied. This trend is similar to those observed in transmembrane and globular proteins.

The number of cation-π interactions per therapeutic protein in the present data set ranges from 1 to 6. The study shows that 41, 37 and 10% of the proteins had one, two and more than two interactions, respectively. Almost 10% of the therapeutic proteins did not show any cation-π interactions. These results are shown in Table 3. There are six possible cation-π interacting pairs, namely Arg-Phe, Arg-Tyr, Arg-Trp, Lys-Phe, Lys-Tyr and Lys-Trp. The PyMol view of the Arg-Phe, Arg-Trp and Lys-Trp interacting pairs for the protein with PDB id 1BML is shown in Fig. 1. Among the cation-π interactions involving Arg, the Arg-Tyr pair showed a higher percentage of interactions than Arg-Phe and Arg-Trp. Among those involving Lys, the Lys-Tyr pair showed a higher percentage than Lys-Phe and Lys-Trp. These results are shown in Fig. 2. It is interesting to note that although Phe and Lys individually exhibited more cation-π interactions, as pairs Arg-Tyr and Lys-Tyr were involved in more cation-π interactions than the other four pairs. Hence, the Arg-Tyr and Lys-Tyr interactions may be quite important in the stability of these therapeutic proteins. Of the total 49 proteins investigated, 43 had significant cation-π interactions and the remaining 6 did not show any significant interaction. The therapeutic protein 1PGG had a maximum of six energetically significant cation-π interactions.

Cation-π Interaction Energies in Therapeutic Protein
The specific residue pairs involved in cation-π interactions and their positions for all the therapeutic proteins studied are given in Table 3. It can be seen from the table that the therapeutic protein with PDB code 1PGG had the maximum cation-π energy of -24.05 kcal/mol. The pairwise cation-π interaction energy between the cationic and aromatic residues shows that the Arg-Phe energy is the strongest and the Lys-Trp energy is the weakest among the six possible pairs, as shown in Fig. 3. The strength of the cation-π interaction energy differs significantly among the therapeutic proteins. For instance, it was -24.05 kcal/mol for 1PGG and -2.33 kcal/mol for 1M4C. Of the 49 proteins investigated, 69% showed a cation-π energy weaker than -10 kcal/mol, 27% showed an energy between -10 and -20 kcal/mol and 4% showed an energy stronger than -20 kcal/mol. We observed an average energetic contribution of -4.53 kcal/mol in the group of therapeutic proteins investigated in this work. The decomposition of the cation-π interaction energy into electrostatic and van der Waals terms showed that, among the 49 therapeutic proteins, 43 had a stronger electrostatic energy than van der Waals energy.

Proteins such as LIVBP, MBP, RBP and Trx have been used as model systems for studying the contribution of cation-π interactions to protein stability (Prajapati et al., 2006), because these proteins can be expressed to high levels in E. coli. In a separate series of experiments, the aromatic amino acid in each cation-π pair was replaced by leucine. The stabilities of the wild-type (WT) and mutant proteins were characterized by both thermal and chemical denaturation. The experimental results suggest that cation-π interactions can make a significant contribution to the structural stability of proteins.

[Figure: Cation-π interacting residue pairs in therapeutic proteins]

Secondary Structure Prediction of Amino Acid Residues in the Therapeutic Proteins
The propensities of the amino acid residues to favor a particular conformation are well known. Such conformational preference depends not only on the amino acid itself but also on the local amino acid sequence. We computed the preference of the residues forming cation-π interactions for the different secondary structures, and the results are shown in Table 4. It was found that the cationic residues Arg and Lys preferred to be in strand. In the aromatic group, Phe preferred to be in coil, Tyr in strand and Trp in helix.

Solvent Accessibility of the Cation-π Interacting Residues in Therapeutic Proteins
We used DSSP (Kabsch and Sander, 1983) to estimate the solvent accessibility of the residues involved in cation-π interactions. The average solvent accessibilities of the residues Arg, Lys, Phe, Tyr and Trp involved in cation-π interactions are 52.13, 74.47, 37.44, 34.86 and 13.72, respectively, as shown in Fig. 4. The solvent accessibility of the Arg and Lys residues is significantly higher than that of the other cation-π forming residues. The normalized ASA was divided into three categories, buried, partially buried and exposed, for ASA ranges of <20, 20-50 and >50, respectively (Gromiha et al., 1999; Gilis and Rooman, 1996; Gilis and Rooman, 1997). From this classification, we observed that Arg and Lys preferred to be in exposed regions. Among the aromatic residues, Phe and Tyr preferred to be in partially buried regions, while Trp preferred to be in fully buried regions. This observation is reasonable, since the aromatic residues are in principle nonpolar and tend to be buried, whereas Arg and Lys are polar in nature and tend to be exposed to the solvent.
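The three-class assignment can be checked against the reported averages with a few lines of Python. The thresholds and the average values are taken from the text. The function name is hypothetical, and placing the boundary values 20 and 50 in the partially buried class follows the stated range of 20-50.

```python
def burial_class(normalized_asa):
    """Assign a burial class from a normalized ASA value."""
    if normalized_asa < 20:
        return "buried"
    if normalized_asa <= 50:
        return "partially buried"
    return "exposed"


# Average solvent accessibility of cation-pi residues reported in the text.
MEAN_ASA = {"Arg": 52.13, "Lys": 74.47, "Phe": 37.44, "Tyr": 34.86, "Trp": 13.72}

classes = {residue: burial_class(asa) for residue, asa in MEAN_ASA.items()}
for residue, label in classes.items():
    print(residue, label)
```

Applied to the reported averages, this places Arg and Lys in the exposed class, Phe and Tyr in the partially buried class and Trp in the buried class, in agreement with the observations above.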

Stabilization Centers of Cation-π Interacting Residues in Therapeutic Proteins
We computed the stabilization centers for all residues forming cation-π interactions in the therapeutic proteins using the program SCide, and the results are depicted in Fig. 5. It was found that 32% of the cationic residues and 24% of the π residues have one or more stabilization centers. Cationic residues thus have more stabilization centers than π residues. This trend differs from the earlier report on RNA-binding proteins (Anand et al., 2007). It is interesting to note that all five residue types found in cation-π interactions are important in locating one or more stabilization centers. These observations indicate that these residues may contribute significantly to the structural stability of these proteins in addition to participating in cation-π interactions.

Stabilizing Residues
We considered it useful to identify any patterns of correlation between the cation-π interactions in a given therapeutic protein and the theoretically predicted stabilizing residues (Gromiha et al., 2004a). Stabilizing residues were computed using parameters such as surrounding hydrophobicity, long-range order, stabilization center and conservation score. We used the server SRide for this purpose. Stabilizing residue information was available for 48 out of the 49 therapeutic proteins, and the results are presented in Table 5. It shows that 0.93% of these stabilizing residues were also involved in cation-π interactions. From this we infer that these residues might also contribute additional stability to therapeutic proteins.

Conservation Score
We used the ConSurf server to compute the conservation score of the amino acid residues involved in cation-π interactions in therapeutic proteins, and the results are shown in Fig. 7. Of these residues, 57 percent had a conservation score in the range below 5, while 43 percent had a conservation score of 6-9. A conservation score of 6 is the cutoff value used to identify the stabilizing residues. From these observations, we infer that 43 percent of the amino acid residues involved in cation-π interactions might be conserved in therapeutic proteins.

Interplay of Conventional and Nonconventional Interaction in the Stability of Therapeutic Proteins
The conventional interactions studied in this work were computed with the help of WHAT IF (Vriend, 1990). We undertook these studies to infer the role of the conventional and the cation-π interactions in individual proteins and in the whole data set as well. Table 6 shows the number of hydrogen bonds, salt bridges, di-sulfide bonds and cation-π interactions, and the total number of conventional and nonconventional interactions, in individual proteins. It is quite reasonable that the number of hydrogen bonds is much higher than the number of salt bridges and di-sulfide bonds on the conventional side and the number of cation-π interactions on the nonconventional side, except for one protein (PDB id 1YY9). This protein incidentally also has the highest total number of interactions. The protein with PDB id 1PGG shows a total of 516 interactions, of which 6 are cation-π interactions. There were a total of 8251 interactions for the whole data set, of which 5392 were hydrogen bonds, 2612 salt bridges, 172 di-sulfide bonds and 75 cation-π interactions. The individual interactions, namely hydrogen bonds, salt bridges, di-sulfide bonds and cation-π interactions, are given as percentages in Table 7. The protein with PDB id 1L6X had the highest percentage of conventional hydrogen bonds and showed a cation-π interaction percentage of 0.36%. However, the highest percentage of cation-π interactions was shown by 2GOO, even though it had only 2 cation-π interactions. Hence, we could not generalize or come to any conclusion from these individual interactions.

Table 6 (continued). Number of hydrogen bonds, salt bridges, di-sulfide bonds, cation-π interactions and total interactions in individual proteins.
PDB id   Hydrogen bonds   Salt bridges   Di-sulfide bonds   Cation-π   Total
2B5I     64               29             1                  1          95
3BMP     71               11             3                  2          87
2ERJ     45               11             5                  1          62
2GMF     61               21             2                  1          85
2GOO     47               5              3                  2          57
2H62     50               10             3                  2          65
2H64     50               9              3                  2          64
2IWG     95               61             2                  1          159
2OSL     110              41             2                  1          154
3INK     71               29             1                  1          102
Total    5392             2612           172                75         8251
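The whole-data-set totals quoted above can be restated as percentages of all counted interactions with a short Python sketch. The counts are those reported in the text. The function name is hypothetical, and the sketch only converts raw counts into shares without weighting any interaction by its energy.

```python
def interaction_shares(counts):
    """Return each interaction type as a percentage of all counted interactions."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


# Totals reported for the whole data set.
DATASET_COUNTS = {
    "hydrogen bond": 5392,
    "salt bridge": 2612,
    "di-sulfide bond": 172,
    "cation-pi": 75,
}

dataset_shares = interaction_shares(DATASET_COUNTS)

# Counts reported for the protein 2GOO in Table 6.
shares_2goo = interaction_shares(
    {"hydrogen bond": 47, "salt bridge": 5, "di-sulfide bond": 3, "cation-pi": 2}
)

for name, share in dataset_shares.items():
    print(f"{name}: {share:.2f}%")
```

The four totals sum to the 8251 interactions reported for the data set, and the same function can be applied row by row to the per-protein counts of Table 6.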
Hence, we undertook a calculation to find the relation between the hydrogen bond, salt bridge and di-sulfide interactions and the cation-π interactions. These are shown in Figure 8 to Figure 10. It is observed that the significance of cation-π interactions is greater than that of conventional interactions such as hydrogen bonds, salt bridges and di-sulfide bonds for the whole data set. Hence, we calculated the percentage contribution of each of these interactions for the whole data set. This result is shown in Figure 11. It is clear from Fig. 11 that the percentage of cation-π interactions is higher than that of all the conventional interactions, namely hydrogen bonds, salt bridges and di-sulfide bonds, for the whole data set of proteins studied in this work. Based on all the results, in general, and the results of the interplay between conventional and non-conventional forces in particular, we emphasize that,

Conclusions
We have systematically analyzed the influence of cation-π interactions on the stability of therapeutic proteins. Among the cationic residues, the side chain of Lys is more likely to be involved in a cation-π interaction than that of Arg. Phe has a higher occurrence in this interaction than the other two π residues, Tyr and Trp. In the data set, 43 of the 49 therapeutic proteins showed significant cation-π interactions. Among the cation-π residue pairs, the Arg-Tyr pair showed the maximum number of cation-π interactions and the Lys-Trp pair showed the minimum number. The cation-π interaction energy shows that the Arg-Phe energy is the strongest and the Lys-Trp energy is the weakest among the six possible pairs in the 49 therapeutic proteins investigated. In terms of secondary structure, the cationic residues Arg and Lys preferred to be in strand. In the aromatic group, Phe preferred to be in coil, Tyr in strand and Trp in helix. The cationic residues Lys and Arg preferred to be in exposed regions. Among the aromatic residues, Phe and Tyr preferred to be in partially buried regions, while Trp preferred to be in fully buried regions. We found that all five residue types found in cation-π interactions are important in locating one or more stabilization centers. Of the amino acid residues involved in cation-π interactions, 43 percent might be conserved in therapeutic proteins. These residues might contribute additional stability to therapeutic proteins. The contribution of the cation-π interaction to stability for the whole therapeutic protein data set is much higher than that of the conventional interactions such as hydrogen bonds, salt bridges and di-sulfide bonds. More specifically, 57% of the proteins exhibited a higher cation-π interaction than hydrogen bond, almost 59% exhibited a higher cation-π interaction than salt bridge and 67% showed a higher cation-π interaction than di-sulfide bond. Hence, we conclude that the contribution of the cation-π interaction is an important factor for the structural stability of the therapeutic proteins studied in this work. On the whole, the results presented in this work will be useful for further investigations on the specificity and selectivity of therapeutic proteins in pharmaceutical applications.

Future Perspectives
Although a great deal of progress has been made in the field of systems biology, there is still a long way to go in understanding the structural stability of proteins and in docking studies. This may become possible with a better understanding of the various interactions within the protein molecule. Among the different interactions, reports on cation-π interactions in polypeptides and proteins are scarce. Hence, the computation of cation-π interaction energies may be considered important for protein stability, specificity and protein-protein interfaces, and potentially useful for protein docking studies. The majority of the protein complexes analyzed contained at least one such interaction. Therefore, the presence of cation-π interactions could be used as a means of discriminating chemically relevant docking results from false positives. This scrutiny will assist structural biologists and medicinal chemists in designing better and safer drugs.