Prediction of Antigenic MHC Peptide Binders and TAP Binder of COX1 Protein through In Silico Approach

In the current analysis, the Cytochrome c oxidase subunit I (CO1 or MT-CO1) protein sequence from Dracunculus medinensis, the causative agent of Guinea worm disease (GWD), was used to study MHC-binding antigenic peptides, antigenic peptide prediction by different B cell prediction methods, protein solvent accessibility, and polar and nonpolar residues, in order to identify the regions that are probably exposed on the protein surface. Peptide fragments of the protein can be analyzed and specific nonamers selected for rational vaccine design. In this investigation, PSSM and SVM algorithms were applied to find MHC I and MHC II binding peptides. We also predicted the high-affinity TAP binding peptides of the CO1 protein, which has 205 amino acids and yields 197 nonamers. From these results we predict that an antigenic peptide of the cytochrome c oxidase subunit I (mitochondrion) protein might play a major role and could be a suitable candidate for subunit vaccine development, on the basis that a single epitope can generate an immune response in a large population.


Introduction
COX1 (Cytochrome c Oxidase I) is commonly known as Mitochondrially encoded Cytochrome c Oxidase I (MT-CO1). The protein is encoded by the MT-CO1 gene in humans, whereas in other eukaryotes the gene is referred to as COX1, CO1 or COI [1]. COX1 is the main subunit of the cytochrome c oxidase complex. Subunit I of Cytochrome c Oxidase (CO1 or MT-CO1) is one of three mitochondrial DNA (mtDNA) encoded subunits (MT-CO1, MT-CO2, MT-CO3) of respiratory complex IV. Complex IV is the third and final enzyme of the electron transport chain (ETC) of mitochondrial oxidative phosphorylation. Cytochrome c oxidase (EC 1.9.3.1) is a key enzyme in aerobic metabolism. In prokaryotes the enzyme complex consists of three to four subunits, and in mammals of up to thirteen polypeptides, of which only the catalytic subunit is found in all heme-copper respiratory oxidases. The enzyme complexes vary in heme and copper composition, substrate type and substrate affinity. The different respiratory oxidases allow cells to tailor their respiratory systems to a variety of environmental growth conditions [2]. The catalytic activity of the COX1 protein in oxidative phosphorylation is: 4 ferrocytochrome c + O2 + 4 H+ = 4 ferricytochrome c + 2 H2O. COX1 is involved in the oxidative phosphorylation pathway, which is part of energy metabolism. The COX1 gene is frequently used as a DNA barcode to identify animal species and to analyze closely related species, because of its fast mutation rate and because its sequence is conserved among conspecifics. Contrary to the objection raised by skeptics that MT-CO1 sequence differences are too small to be detected among closely related species, more than 2% sequence divergence is generally detected between such organisms [3].
MTCO1 is encoded by the guanine-rich heavy (H) strand of the mtDNA and is situated between nucleotide pairs (nps) 5904 and 7444 [4,5]. It is maternally inherited along with the mtDNA [6,7]. The predicted molecular weight (MW) of MTCO1 is 57 kD [4,5]. However, its apparent MW on SDS-polyacrylamide gels (PAGE) is somewhat less. Using Tris-glycine buffer it runs at 39.5 kD [8-10], whereas in the urea-phosphate system it gives an apparent MW of 45 kD [11,12]. Investigations suggest that this protein is highly expressed in the cytoplasm of colonic crypts (intestinal glands) of the human colon (large intestine). However, with age it is frequently lost in human colonic crypts, and it is often absent in field defects that give rise to colon cancers or in portions of colon cancers [13]. The protein (also abbreviated CCOI) is coded for by the mitochondrial chromosome. The chromosome occurs in multiple copies, between two and six per mitochondrion [14][15][16]. If a mutation takes place in CCOI in one chromosome of a mitochondrion, random segregation of the chromosomes during mitochondrial fission can generate a new type of mitochondria. There are 100-700 mitochondria per cell, depending on the cell type [15,16]. The average half-life of mitochondria depends on cell type: in rats it is between 9 and 24 days [17], in mice about 2 days [18], and in humans it varies from days to weeks. A deficiency of CCOI in a mitochondrion leads to lower reactive oxygen production (and less oxidative damage), which provides a selective advantage in competition with other mitochondria within the same cell and can give rise to homoplasmy for CCOI deficiency [13]. The phylogenetic analysis conducted by Ngui et al. [19] on the cytochrome c oxidase subunit 1 (cox 1) sequence of A. ceylanicum from positive human and animal fecal samples suggests that the considerable level of genetic variation within the cox 1 sequence of A. ceylanicum might reflect haplotype-linked differences in zoonotic, epidemiological and pathobiological characteristics, a hypothesis that still needs in-depth investigation [19]. In another study, analyses of multiple sequence alignments of mitochondrial 16S rDNA (ribosomal DNA) and cox 1 of Trichuris skrjabini revealed high homology with those of Trichinella species; this was the first time that mitochondrial DNA gene sequences of a species of trichurid nematode had been reported [20]. MT-CO1 may play a role in the pathogenesis of acquired idiopathic sideroblastic anemia, a disease characterized by inadequate formation of heme and excessive accumulation of iron in mitochondria. Mitochondrial iron overload may be attributable to mutations of mitochondrial DNA, because these can cause respiratory chain dysfunction and thereby impair the reduction of ferric iron to ferrous iron; the reduced form of iron is essential in the last step of mitochondrial heme biosynthesis. COX deficiency causes a clinically heterogeneous variety of neuromuscular and non-neuromuscular disorders in childhood and adulthood. Mutations in COX1 are associated with several disorders, such as Leber hereditary optic neuropathy (LHON) (primary mitochondrial DNA mutations affecting the respiratory chain complexes), mitochondrial complex IV deficiency (MTC4D), recurrent mitochondrial myoglobinuria (RM-MT), mitochondrial sensorineural deafness and colorectal cancer. Considering the importance of COX1, we selected this protein to investigate its antigenicity, its solvent accessibility, and its polar and nonpolar residues.
Regions that are likely to be exposed on the surface of a protein are potentially antigenic; identifying them helps to locate active sites as potential drug targets against infection and to design and develop effective drugs. Cytochrome c Oxidase subunit I (mitochondrion), comprising 205 amino acid residues, was obtained from Dracunculus medinensis for the study of MHC class I and II binding peptides, antigenicity, solvent accessibility, and polar and nonpolar residues, in order to analyze the regions that are likely to be exposed on the protein surface. D. medinensis ("little dragon from Medina") is the only species of the genus Dracunculus [21][22][23][24] that causes dracunculiasis in humans, commonly known as "Guinea Worm Disease (GWD)". The other Dracunculus species generally reside in the internal tissues and body cavities of non-human mammals and reptiles (snakes and turtles) [25]. The parasite has a unique life cycle comprising six developmental stages, with a long incubation period of approximately one and a half years. It is also one of the most neglected tropical parasites; it is clinically important and needs to be eradicated completely [26]. Once the worms reach maturity they copulate, and millions of eggs are formed in the uterus of the adult female, whereas the male worm dies after copulation. Once the incubation period is over, the larvae emerge from the blister when it bursts on contact of the infected individual with water; in most reported cases (80%-90%) the blister is localized in the lower extremities. The symptoms in the infected individual are slight fever, local skin redness, swelling and severe pruritus around the blister. Other symptoms include diarrhea, nausea, vomiting and dizziness [27].
Immersing the blister in water or pouring water over it provides pain relief, but this is the point at which the adult female is exposed to the external environment [28]. When the affected limb is immersed in an open water source, the worm senses the temperature difference and releases into the water a milky white liquid containing millions of immature larvae. The released larvae are ingested by copepods, in which they moult twice and become infective larvae within two weeks [29]. The antigenic peptides of D. medinensis could be the most desirable segments for the development of a subunit vaccine, on the basis that an immune response can be generated in a large population with a single epitope. This approach is generally based on the phenomenon of cross-protection, whereby an individual infected with a mild strain is protected against a more severe strain of the same pathogen. The phenotype of the resistant transgenic host includes fewer centers of initial infection, a delay in symptom development and low accumulation. It is possible that the predicted antigenic peptides from D. medinensis could play a major role in drug formulation (or a peptide vaccine) and disease eradication [30], because a single protein subunit can generate a sufficient immune response. In the present work we applied an in silico approach to identify MHC class I and class II binding antigenic peptides. MHC molecules are cell surface proteins that bind peptides derived from host or antigenic proteins and present them at the cell surface for recognition by T cells. T cell recognition is an important mechanism of the adaptive immune system by which the host distinguishes and responds to foreign antigens [30,31]. Both forms of MHC molecule are extremely polymorphic. MHC class I molecules present peptides from proteins synthesized within the cell, whereas MHC class II molecules present peptides derived from endocytosed extracellular proteins.
MHC molecules take an active part in host immune reactions; they are involved in the immune response to almost all antigens and act at specific sites. MHC-I binds some of the peptide fragments generated by proteolytic cleavage of the antigen [32]. Identification of MHC-binding peptides and T-cell epitopes helps to improve our understanding of the specificity of immune responses [33][34][35][36]. Antigenic peptides are the most suitable candidates for peptide or synthetic vaccine development.

Database searching
The cytochrome c oxidase subunit I (mitochondrion) protein sequence of Dracunculus medinensis was retrieved from the NCBI (www.ncbi.nlm.nih.gov) and UniProt databases, an essential first step for the further investigation [37,38].

Prediction of the physico-chemical properties of the protein
Physico-chemical properties such as molecular weight, theoretical pI, amino acid composition, atomic composition, extinction coefficient [39][40][41], estimated half-life [42,43], instability index [44], aliphatic index [45] and Grand Average of Hydropathicity (GRAVY) [46] were analyzed with ProtParam (http://www.expasy.org/). Predicting these properties helps to understand the effect of temperature on protein solubility and denaturation, the effect of pH on protein solubility and the isoelectric point, the interactions between the protein and water molecules, hydrogen bonding, protein-ligand binding affinity and the function of the protein. These properties play an important role and are taken into consideration in drug design and development.
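Two of the indices listed above are simple functions of amino acid composition and can be sketched in a few lines. The following is a minimal illustration, not the ProtParam implementation: it assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula (mole percent of Ala + 2.9 × Val + 3.9 × (Ile + Leu)). The function names and the toy sequence are hypothetical, and the sequence is not the D. medinensis protein.

```python
# Kyte-Doolittle hydropathy values per residue (standard published scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    seq = sequence.upper()

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


if __name__ == "__main__":
    toy_sequence = "MLLVAIGSTKDE"  # hypothetical example, not the real protein
    print(f"GRAVY: {gravy(toy_sequence):.3f}")
    print(f"Aliphatic index: {aliphatic_index(toy_sequence):.2f}")
```

A positive GRAVY value indicates an overall hydrophobic protein and a negative value an overall hydrophilic one.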

Prediction of MHC binding peptide
Major Histocompatibility Complex (MHC) binding peptides of the Dracunculus medinensis protein were predicted using neural networks trained on the C-termini of known epitopes. The RANKPEP tool uses Position Specific Scoring Matrices (PSSMs) to predict peptide binders to MHC-I whose C-terminal end is likely to result from proteasomal cleavage. Peptides that bind to a given MHC molecule share sequence similarity. Traditionally, sequence patterns have been used to predict peptides that bind to MHC molecules; such patterns, however, have proven to be too simple, because the complexity of the binding motif cannot be represented precisely by the few residues present in the pattern [60]. RANKPEP overcomes this limitation by using PSSMs, or profiles, built from sets of aligned peptides known to bind to a given MHC molecule as the predictor of MHC-peptide binding. A Support Vector Machine (SVM) based method, on the other hand, was used to predict promiscuous MHC class II binding peptides from the protein sequence; the SVM was trained on a binary encoding of the amino acid sequence [61][62][63][64].
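The idea of profile-based scoring can be sketched as follows. This is a generic log-odds PSSM under stated assumptions (uniform background frequencies, a fixed pseudocount, hypothetical function names and a made-up training set); it does not reproduce RANKPEP's matrices or its exact scoring scheme.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned_peptides, pseudocount=1.0):
    """Build a log-odds profile from equal-length peptides known to bind."""
    length = len(aligned_peptides[0])
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for pos in range(length):
        column = [pep[pos] for pep in aligned_peptides]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score_peptide(pssm, peptide):
    """Sum the position-specific scores of each residue in the peptide."""
    return sum(pssm[i][aa] for i, aa in enumerate(peptide))


def rank_windows(pssm, protein):
    """Score every overlapping window of the protein, best score first."""
    width = len(pssm)
    scored = []
    for i in range(len(protein) - width + 1):
        peptide = protein[i:i + width]
        scored.append((score_peptide(pssm, peptide), i + 1, peptide))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Hypothetical aligned binders, for illustration only.
    binders = ["AAGIGILTV", "ALGIGILTV", "AAGIGILTL"]
    profile = build_pssm(binders)
    for score, start, peptide in rank_windows(profile, "MKAAGIGILTVQWERT")[:3]:
        print(f"{start:3d} {peptide} {score:7.3f}")
```

Each tuple returned by `rank_windows` holds the score, the 1-based start position and the peptide, so the top entries correspond to the highest-scoring candidate binders.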

Prediction of antigenic peptides by cascade SVM based TAPPred method
We predicted TAP binders with the cascade SVM-based TAPPred method, which is based on the sequence and the features of the amino acids [65]. We also identified the MHC-I binding regions and the binding affinity of cytochrome c oxidase subunit I (mitochondrion).

Solvent accessible regions
We also investigated the solvent accessible regions of the protein, that is, the regions with the highest probability of lying on the protein surface, using the surface accessibility method of Emini et al. [66] and the backbone or chain flexibility method of Karplus and Schulz [67]. Using different scales, we predicted the hydrophobic and hydrophilic characteristics of the amino acids in regions that are rich in charged and polar residues [68][69][70] (Figure 1).

Results and Discussion
The Dracunculus medinensis protein Cytochrome c oxidase subunit I (mitochondrion) consists of 205 amino acid residues, which yield 197 nonamers.
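The nonamer count follows from sliding a window of nine residues along the sequence one position at a time: a protein of length L contains L - 9 + 1 overlapping nonamers, so 205 - 9 + 1 = 197. A minimal sketch (the function names are hypothetical, and the placeholder sequence only stands in for the real one):

```python
def count_windows(protein_length: int, window: int = 9) -> int:
    """Number of overlapping windows of the given width in a sequence."""
    return max(protein_length - window + 1, 0)


def nonamers(sequence: str):
    """All overlapping 9-residue peptides of a sequence, in order."""
    return [sequence[i:i + 9] for i in range(len(sequence) - 8)]


if __name__ == "__main__":
    print(count_windows(205))  # 205 residues -> 197 nonamers
    placeholder = "A" * 205    # stand-in for the real 205-residue sequence
    print(len(nonamers(placeholder)))
```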

Prediction of the physicochemical properties of the protein
The physico-chemical properties of Cytochrome c oxidase subunit I, namely molecular weight, theoretical pI, amino acid composition, atomic composition, extinction coefficient, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (GRAVY), were analyzed with ProtParam (http://www.expasy.org/). The instability index is 32.56, which is below the threshold of 40 and therefore indicates a stable protein, and the predicted GRAVY is 0.876 (Table 1).

Prediction of antigenic peptides
In the antigenic peptide study, we detected antigenic determinants by finding the regions of greatest local hydrophilicity, on the assumption that antigenic determinants are displayed on the surface of the protein and are therefore located in hydrophilic regions. In the Hopp-Woods hydrophilicity analysis, the maximum was found at position 29 (score: 0.433), i.e., 23-RAELCKP-32 in the protein sequence (Figure 2). We also examined the HPLC / Parker [49] hydrophilicity plot, in which the highest peak is at position 110 (residue C) with a score of 5.971 (107-DSSCGTS-113) (Figure 4).
Ten antigenic determinant sequences were found with the Kolaskar and Tongaonkar [50] antigenicity scale (Figures 5a and 5b).
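A scale-based scan of this kind averages a per-residue value over a sliding window and reports the window with the highest mean. The sketch below uses the published Hopp-Woods hydrophilicity values; the window width of 7 is an assumption inferred from the seven-residue segments reported above, the function names are hypothetical, and the scores produced will not necessarily match those of the web tool used in this study, which may normalize differently.

```python
# Hopp-Woods hydrophilicity values per residue (published scale).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "P": 0.0, "T": -0.4,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def hydrophilicity_profile(sequence: str, window: int = 7):
    """Mean hydrophilicity of each window as (center, segment, mean)."""
    seq = sequence.upper()
    half = window // 2
    profile = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        mean = sum(HOPP_WOODS[aa] for aa in segment) / window
        # Report the 1-based position of the central residue of the window.
        profile.append((start + half + 1, segment, mean))
    return profile


def most_hydrophilic(sequence: str, window: int = 7):
    """Window with the highest mean hydrophilicity."""
    return max(hydrophilicity_profile(sequence, window), key=lambda item: item[2])


if __name__ == "__main__":
    toy_sequence = "MLLVAIGSTKDERKDEWFYLL"  # hypothetical, not the real protein
    center, segment, mean = most_hydrophilic(toy_sequence)
    print(center, segment, round(mean, 3))
```

Swapping the dictionary for another scale (for example a hydrophobicity or flexibility scale) gives the corresponding profile with the same windowing logic.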

Solvent accessible regions
The solvent accessible regions of the protein were also predicted. Different measures were used to predict antigenic activity and the surface regions of peptides. The method of Emini et al. [66] (Figure 7) predicts the probability that a given region lies on the surface of the protein and is used to identify antigenic determinants on the protein surface; the highest probability was found at position 201 (residue S), in the sequence 199-DRSFNT-204, with a score of 4.77. The method of Karplus and Schulz [67] (Figure 8) predicts backbone or chain flexibility on the basis of the known temperature B factors of the α-carbons. Its highest score (1.079) was found at position 127 (residue G) in the sequence 124-GHPGNSV-130, and the second highest peak at position 126 (residue P) in the sequence 123-SGHPGNS-129 (score: 1.078). The hydrophobic and hydrophilic characteristics of the amino acids were determined with different scales to find regions rich in charged and polar residues: the Bull and Breese [68] scale was highest at position 125 (score: 0.509, 122-TSGHPGN-128) (Figure 9), and the Roseman [69] scale was highest at position 180 (score: 1.491, 177-TVFLLI-183) (Figure 10).

Prediction of MHC binding peptide
Peptides binding to a number of different alleles were identified with PSSMs from the cytochrome c oxidase subunit I (mitochondrion) protein of Dracunculus medinensis, which has 247 nonamers. We found predicted MHC-I peptide binders for the 8mer_H2_Db allele, with the consensus sequence QNWNCCTI yielding the maximum score of 52.494; for 9mer_H2_Db, with the consensus sequence FCIHNCDYM yielding the maximum score of 50.365; for 10mer_H2_Db, with the consensus sequence SGYYNFFWCL yielding the maximum score of 58.858; and for 11mer_H2_Db, with the consensus sequence CGVYNFYYCCY yielding the maximum score of 79.495 (Table 2).
The MHC-II alleles tested were I_Ab, with the consensus sequence YYAPWCNNA yielding the maximum score of 35.632; I_Ad, with the consensus sequence QMVHAAHAE yielding the maximum score of 53.145; and I_Ag7, with the consensus sequence WYAHAFKYV yielding the maximum score of 40.873. High-affinity binders were predicted with the cascade SVM-based TAPPred method, which identified more than 63 high-affinity TAP transporter peptide regions. These high-affinity binders represent predicted TAP binder residues that occur at the N and C termini of the Dracunculus medinensis antigen cytochrome c oxidase subunit I (mitochondrion). TAP is an important transporter that carries antigenic peptides from the cytosol to the ER. TAP binds and translocates selected antigenic peptides for binding to specific MHC molecules. The efficiency of TAP-mediated translocation of an antigenic peptide is directly proportional to its TAP binding affinity. Understanding the nature of the peptides that bind to TAP with high affinity is therefore an important step in understanding endogenous antigen processing. A correlation coefficient of 0.88 was obtained in the jackknife validation test. T cell immune responses are driven by antigenic epitopes; hence, their identification is important for the design of a synthetic peptide vaccine. T cell epitopes presented by MHC-I molecules are recognized, producing a strong defensive immune response against the Dracunculus medinensis antigen cytochrome c oxidase subunit I (mitochondrion).
Prediction of peptide binding to MHC-I molecules must therefore also take into account the processing of antigenic peptides that precedes their binding to the relevant MHC molecules. Because the C-terminus of MHC-I-restricted epitopes results from cleavage by the proteasome, proteasome specificity is important for determining T-cell epitopes. RANKPEP also focuses on the prediction of conserved epitopes. Sequences whose C-terminus is predicted to be generated by the proteasome are highlighted in purple in the output. Proteasomal cleavage predictions are carried out using three optional models, obtained by applying statistical language models to a set of known epitopes restricted by human MHC-I molecules. Predicted binders for the I_Ab.p, I_Ad.p, I_Ag7.p and I_Ak.p alleles are highlighted in red (Table 3). RANKPEP reports a PSSM-specific binding threshold, which is obtained by scoring all the antigenic peptide sequences included in the alignment from which the profile is derived and is defined as the score value that includes 85% of the peptides within the set. Peptides whose score is above the binding threshold are highlighted in red, and peptides produced by the cleavage prediction model are highlighted in violet. We also used the cascade SVM-based TAPPred method, which found 63 high-affinity TAP transporter peptide regions. These represent predicted TAP binder residues that occur at the N and C termini of Dracunculus medinensis cytochrome c oxidase subunit I (mitochondrion).
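The binding threshold described above can be illustrated with a short sketch. It assumes that "includes 85% of the peptides" means the lowest score among the top-scoring 85% of the peptides used to build the profile; this reading, the function names and the toy scores are assumptions for illustration and are not taken from RANKPEP itself.

```python
import math


def binding_threshold(training_scores, coverage_percent: int = 85) -> float:
    """Lowest score among the top `coverage_percent` of the profile's own peptides."""
    if not training_scores:
        raise ValueError("at least one training score is required")
    ranked = sorted(training_scores, reverse=True)
    n_covered = math.ceil(coverage_percent * len(ranked) / 100)
    return ranked[n_covered - 1]


def predicted_binders(scored_peptides, threshold):
    """Keep the (peptide, score) pairs whose score reaches the threshold."""
    return [(pep, score) for pep, score in scored_peptides if score >= threshold]


if __name__ == "__main__":
    # Hypothetical scores of the aligned peptides used to derive a profile.
    training = [12.0, 30.5, 18.2, 25.1, 9.4, 27.3, 21.0, 16.8, 33.9, 14.6]
    threshold = binding_threshold(training)
    # Hypothetical candidate peptides with their profile scores.
    candidates = [("QNWNCCTIA", 28.0), ("AAAAAAAAA", 5.0)]
    print(threshold, predicted_binders(candidates, threshold))
```

Deriving the threshold from the profile's own training peptides makes it specific to each PSSM, so scores from different alleles are not compared against a single global cut-off.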

Conclusion
MHC molecules are cell surface proteins that take an active part in host immune responses against pathogens; they are involved in the response to almost all antigens and act at specific sites. From the above results we conclude that the peptides that bind to MHC molecules are antigenic, that is, hydrophilic in nature. An increase in the affinity of MHC binding peptides may therefore enhance the immunogenicity of the Dracunculus medinensis antigen cytochrome c oxidase subunit I (mitochondrion) and could be helpful in designing a synthetic peptide vaccine. This approach can help reduce the time and cost of experiments for determining the functional properties of the Dracunculus medinensis antigen cytochrome c oxidase subunit I (mitochondrion). Overall, the results are encouraging: both the 'sites of action' and the 'physiological functions' can be predicted with high accuracy, which ultimately helps to minimize the number of validation experiments.