ICMR Virus Unit, GB4, ID and BG Hospital, 57, Dr. S. C. Banerjee Road, Beliaghata, Kolkata, West Bengal, India
Received date: May 02, 2016; Accepted date: May 17, 2016; Published date: May 24, 2016
Citation: Sarkar A, Chatterjee A, Ansari S, Chakraborty N (2016) Characterization of Molecular Mimicry Between UL18 Glycoprotein of Human Cytomegalovirus [HCMV] and Class-I MHC Molecule through Pattern-based Analysis: An In-silico Approach. J Health Med Informat 7:230. doi: 10.4172/2157-7420.1000230
Copyright: © 2016 Sarkar A, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Viruses replicate using the synthetic machinery of the host cell, exploiting its nucleic acid replication and protein translation mechanisms. Viral proteins must be targeted to the appropriate subcellular compartments of the host cell, including the nucleus. Human Cytomegalovirus [HCMV] downregulates expression of classical class-I MHC [Major Histocompatibility Complex] molecules at the infected cell surface, which allows infected cells to avoid recognition by cytotoxic T-cells. In the present study, we used generally accessible online and offline computational tools to examine the functional and structural characteristics of the most conserved domain [CD] of the UL18 gene in comparison with the class-I MHC molecule. Six open reading frames [ORF] were reanalyzed by selecting the start codon. Site-specific homology to the MHC class-I molecule was determined [residues 19 to 197; ID: pfam00129, E-value: 3.26e-14]. The predicted protein architecture contained about 28.90% helices [107 residues] and 10.32% strands [38 residues]. In the tertiary structure, 276 residues [75% of the sequence] were modeled by the single highest scoring template with 100% confidence, and the structure represents a peptide-binding viral MHC mimic bound to a host inhibitory receptor [pdb code-3d2u]. Thus, our analysis suggests that the sequence homologous to the MHC class-I gene lies between residues 19 and 200 of the UL18 ORF, and the recognition domain was identified with a significant E-value. Our study also showed that ORF18 is homologous to the Ig superfamily at positions 229-289. These domains, found to be homologous in HCMV proteins, suggest a particular functional role during infection. In light of the above, further experimental work is needed to elucidate the exact role of UL18 during infection.
Keywords: Human cytomegalovirus; UL18; MHC-I; Protein prediction; Transmembrane domain
Viruses replicate using the synthetic machinery of the host, having evolved means to exploit the host's nucleic acid replication and protein translation machinery. Viral proteins need targeting signals to localize to different cellular compartments [2,3]. Human Cytomegalovirus [HCMV] is a widely distributed, host-specific member of the Herpesviridae family, classified under the Betaherpesvirinae subfamily [4-6]. Mature HCMV virions range in diameter from 200 to 300 nanometers. HCMV is a large double-stranded DNA virus with a genome of about 235 kb that encodes over 200 open reading frames [ORF] [7-9]. Serological surveys have shown that the global prevalence of maternal antibody ranges from 30% to nearly 100%, reflecting wide variation in infection rates between populations. In India, serological studies have indicated an 80-90% prevalence of CMV IgG antibodies in women of childbearing age [10-12]. The risk of seroconversion during pregnancy has been estimated at 2.0-2.5% [13,14]. HCMV is an important agent of numerous diseases, including pneumonitis, hepatitis, retinitis, and gastrointestinal infections, especially in organ transplant recipients, immunocompromised patients, and fetuses or newborn infants [17,18].
HCMV may downregulate expression of classical class-I major histocompatibility complex [MHC-I] molecules at the surface of infected cells. This allows the infected cells to avoid recognition by cytotoxic T cells. HCMV encodes MHC-I heavy chain homologs that may function in immune evasion. The first CMV gene recognized as homologous to the MHC class-I antigen was HCMV-H301 [later known as the UL18 glycoprotein]. Fahnestock et al. hypothesized that UL18 of HCMV expressed in Chinese hamster ovary [CHO] cells is similar to class-I molecules.
Sequence alignments and comparisons suggested that HCMV-UL18 contains the characteristic groove that serves as the binding site in MHC molecules [22,23]. In an uninfected cell, peptides derived from self-proteins are bound to MHC molecules. In an infected cell, by contrast, MHC molecules are occupied by peptides from viral proteins, to which T cells respond by killing the cell. Research in the last few years has indicated that the role of these homologs in the virus-infected cell is to engage NK cell inhibitory receptors, thereby preventing the lysis that would normally occur because of down-regulation of MHC class-I molecules [25-27]. Of the more than 200 ORFs, not every gene has yet been analyzed through bioinformatics approaches. Consequently, the actual function and structure of many of these genes still await confirmation. For comparative genomic approaches, conserved ORFs and conserved domains are necessary to understand the genetic diversity and coding capacity of HCMV strains. In the present study, we used generally accessible online and offline bioinformatics tools to examine the functional and structural properties of the most conserved domain [CD] of the UL18 protein of HCMV in comparison with the class-I MHC molecule and sequences of the Ig superfamily.
Sequence retrieval and ORF selection
The HCMV-UL18 sequence of the laboratory strain AD169 and the human MHC-I sequence were retrieved from the NCBI protein database using the accession numbers X17403.1 and ACR55720.1, respectively (Table 1). The accession numbers of the HCMV-ORF18 homologs identified by BLASTp are summarized in Table 2. Six open reading frames [ORF] of UL18 were reanalyzed by selecting the start codon through ORF finder [28,29]. The most conserved domain was identified using the NCBI conserved domains database [NCBI-CDD] and BLASTp algorithms [30-32].
| Common name | Source | Accession No. | Size [bp] |
| MHC class-I | Homo sapiens | ACR55720.1 | 365 |
Table 1: Retrieved HCMV homologs of human MHC class-I molecules.
| No. | Description [organism] | E-value | Accession No. |
| 1 | MHC class-I antigen [Homo sapiens] | 1e-15 | ACR55720.1 |
| 2 | MHC class-I antigen [Homo sapiens] | 2e-15 | ACN89845.1 |
| 3 | MHC class-I antigen [Homo sapiens] | 3e-15 | CCB78856.1 |
| 4 | MHC class-I antigen [Homo sapiens] | 3e-15 | CAL85437.2 |
| 5 | MHC class-I antigen [Homo sapiens] | 3e-15 | CBL87902.1 |
| 6 | MHC class-I antigen [Homo sapiens] | 3e-15 | BAG32141.1 |
| 7 | MHC class-I antigen [Macaca mulatta] | 3e-15 | ABU68109.1 |
| 8 | MHC class-I heavy chain [Equus caballus] | 3e-15 | NP_001075975.1 |
| 9 | MHC class-I heavy chain [Equus caballus] | 3e-15 | CAA56263.1 |
| 10 | MHC class-I antigen [Pan troglodytes] | 3e-15 | AAF72771.1 |
| 11 | MHC class-I antigen [Alouatta seniculus] | 3e-15 | AKE50309.1 |
| 12 | MHC class-I antigen [Saguinus labiatus] | 5e-15 | AEL31262.1 |
| 13 | MHC class-I antigen [Bos taurus] | 6e-15 | AAZ73464.1 |
| 14 | MHC class-I antigen [Phoca vitulina] | 6e-15 | AFU81672.1 |
| 15 | MHC class-I antigen [Macaca nemestrina] | 6e-15 | AAO84306.1 |
| 16 | MHC class-I antigen [Homo sapiens] | 7e-15 | CCP46976.1 |
| 17 | MHC class-I alpha chain [Bison bison] | 1e-14 | ABJ53222.1 |
| 18 | MHC class-I antigen [Tupaia belangeri] | 1e-14 | AFN37215.1 |
| 19 | MHC class-I antigen [Elaphurus davidianus] | 2e-14 | AFX84568.1 |
| 20 | MHC class-I antigen [Macaca fascicularis] | 2e-14 | BAI40345.1 |
| 21 | MHC class-IB antigen [Chlorocebus sabaeus] | 4e-14 | AEE37103.1 |
Table 2: Homologs of the AD169 strain sequence [X17403.1] identified in the nonredundant protein database through BLASTp.
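The six-frame ORF reanalysis described under sequence retrieval can be illustrated with a minimal Python sketch. It is not the NCBI ORF finder used in the study: the function names, the `min_len` parameter, and the toy sequence are hypothetical, and it assumes the standard genetic code with ATG as the only start codon.

```python
# Standard genetic code in the canonical TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_orfs(dna, min_len=2):
    """List (frame, protein) for each ATG-to-stop ORF in all six frames.

    Frames +1..+3 are on the given strand, -1..-3 on the reverse
    complement. ORFs that run off the end without a stop are skipped.
    """
    dna = dna.upper()
    results = []
    for strand, seq in ((1, dna), (-1, reverse_complement(dna))):
        for offset in range(3):
            protein = []  # residues of the ORF currently being read
            for i in range(offset, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                aa = CODON_TABLE.get(codon, "X")  # X for ambiguous codons
                if protein:
                    if aa == "*":
                        if len(protein) >= min_len:
                            results.append((strand * (offset + 1),
                                            "".join(protein)))
                        protein = []
                    else:
                        protein.append(aa)
                elif codon == "ATG":
                    protein = ["M"]
    return results


# Toy sequence (hypothetical, not UL18): one ORF in frame +3.
print(six_frame_orfs("CCATGAAATGGTAACC"))
```

In practice the same scan would be run on the UL18 nucleotide region, and the translated ORF of interest would then be passed to the domain and homology searches.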
Computational analysis of sequences
A variety of openly accessible online and offline bioinformatics programs were used for the analysis of the shortlisted sequences. Unless stated otherwise, default settings were used. Multiple sequence alignment [MSA] was generated using ClustalW2 and BLASTp algorithms. Homologous proteins were identified using BLASTp, position-specific iterated [PSI]-BLAST, and GenThreader [31,33,34]. Protein sequences of all organisms were searched against the nonredundant [nr] protein database with BLASTp and PSI-BLAST.
Predictions, analysis and homology modeling of secondary structure
Secondary structure predictions were made using Phyre2.0 [35,36]. Transmembrane segments and helices were predicted using TMHMM2.0 [37,38]. Various physicochemical parameters of the selected proteins were evaluated with the ProtParam server on the ExPASy platform [http://web.expasy.org/protparam/]. Homology modeling and analysis were performed using Swiss-Model on the ExPASy platform [http://swissmodel.expasy.org/]. Structural homology between UL18 and MHC-I was compared through the iPBA web server [http://www.dsimb.inserm.fr/dsimb_tools/ipba/]. Ig sequences were also manually compared for structural examination on the basis of similarity in local backbone conformations [42,43].
Sequence retrieval and UL18 domain analysis
Understanding protein function requires identifying the potential domains and folds present in the polypeptide. The most conserved domain was obtained from six ORFs of the HCMV-UL18 region, and the domain architecture within the amino acid sequence was analyzed through SMART [Simple Modular Architecture Research Tool: http://www.smart.embl-heidelberg.de/] in normal mode. SMART identifies the repeats and motifs of the sequence on the basis of hidden Markov models. For the full-length HCMV-UL18 envelope glycoprotein, only the most significant recognizable domains are reported. Figure 1 shows the residue positions of sequence homology to the MHC class-I molecule and the Ig superfamily, together with the transmembrane region of UL18. Site-specific homology was determined to the MHC class-I molecule [residues 19 to 197; ID: pfam00129, E-value: 3.26e-14] and to the Ig superfamily [residues 229 to 289; ID: pfam07654, E-value: 1.38e-06] (Table 3) (Figure 2). The outlier region and homologs of known structure were determined via SCOP [Structural Classification of Proteins] based on similarities of their structures and amino acid sequences. These results support our hypothesis that the MHC class-I and Ig superfamily domains lie between residues 19 and 300 of the UL18 ORF, and the recognition domains were identified with highly significant E-values (Table 4). These domains, found to be homologous in HCMV proteins, suggest a particular functional role during infection. The transmembrane helix was predicted with the TMHMM server v 2.0 (Figure 3). Out of a total of 368 amino acids, 41.56% were predicted in transmembrane helices, of which 18.84% fell within the first 60 amino acids [outside: 1-323; TM helix: 320-347; inside: 347-368], and 0.46% was predicted as N-terminal signal sequence. We also extended our search for conserved domains through a BLASTp search for proteins homologous to HCMV ORF18.
In total, 21 homologous proteins were identified with highly significant E-values (Table 2). All of these hits to the herpesvirus protein were MHC class-I molecules of various organisms. A PSI-BLAST search with the full-length ORF was performed to find more distant homologs. The PSI-BLAST search identified human MHC-I as the homolog of HCMV-UL18. Therefore, HCMV UL18 is more likely to possess a viral MHC class-I domain than any other currently characterized protein fold.
Figure 1: Sequence homology to the MHC class-I molecule and the Ig superfamily, together with the transmembrane region of UL18 of HCMV. Repeats and motifs of the sequence were identified by SMART on the basis of hidden Markov models. Domains with scores less significant than the established cutoffs are not shown in the diagram.
| Domain | Start | End | E-value |
| MHC class-I [pfam00129] | 19 | 197 | 3.26e-14 |
| C1-set [Ig superfamily] [pfam07654] | 229 | 289 | 1.38e-06 |
Table 3: Confidently predicted domains, repeats, motifs and features.
Table 4: Outlier homologs and homologs of known structure.
Figure 3: Prediction of the transmembrane region of the UL18 domain through TMHMM server v2.0. [A] Feature predictions are colour coded onto the sequence: transmembrane regions are shown as red lines, inside regions as blue lines, and outside regions as pink lines. [B] Residues 320 to 347 form the transmembrane segment, which connects the N-terminal region located outside the host cell with the C-terminal region located inside.
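As a rough cross-check on a transmembrane prediction such as the one in Figure 3, a hydropathy scan can be sketched in Python. This is a different and much simpler technique than the hidden Markov model used by TMHMM: it averages Kyte-Doolittle hydropathy values over a sliding window and flags windows above a cutoff. The 19-residue window and 1.6 cutoff are the commonly quoted Kyte-Doolittle settings, while the function names and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence.

    Raises KeyError on non-standard residue letters.
    """
    scores = [KD[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """1-based (start, end) windows whose mean hydropathy >= threshold."""
    return [(i + 1, i + window)
            for i, mean in enumerate(hydropathy_profile(seq, window))
            if mean >= threshold]


# Toy sequence (hypothetical): a hydrophobic stretch between charged flanks.
toy = "D" * 20 + "L" * 25 + "K" * 20
hits = candidate_tm_windows(toy)
print(hits[0], hits[-1])
```

Overlapping flagged windows would normally be merged into a single candidate segment and compared with the TMHMM helix boundaries rather than treated as an independent prediction.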
The secondary structure of HCMV ORF18
A characteristic pattern of secondary structure elements reflects how a protein folds into domains, and secondary structure predictions can give insight into the potential protein architecture. The likely secondary structure of HCMV ORF18 was determined using PSIPRED [Protein Sequence Analysis Workbench] (Figure 4). In our study, the predicted protein architecture contained about 28.90% helices [107 residues] and only 10.32% strands [38 residues], connected by 60.59% coils [223 residues]. The architecture of UL18 therefore combines helices and strands, consistent with its function as a glycoprotein with a transmembrane region. The molecular model of UL18 was created on the basis of its sequence homology to classical MHC-I. Identifying proteins with homologous tertiary structure may provide clues about protein function and mechanism. HCMV UL18 was previously reported to be a homolog of the class-I MHC molecule. We sought to extend these observations to the whole protein and generated the tertiary structure of ORF18 to give a global perspective. The tertiary structure was built using Phyre2.0 [Protein Homology/analogY Recognition Engine V 2.0], where 276 residues [75% of the sequence] were modeled by the single highest scoring template with 100.0% confidence (Figure 5A). Thus, we conclude that a consistent fold prediction was achieved for the conserved domain of UL18 proteins.
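Helix, strand, and coil percentages of this kind are simple tallies over a per-residue prediction string. A minimal Python sketch, assuming a three-state string of `H`, `E`, and `C` characters such as PSIPRED-style output, is given here; the function name and the toy string are hypothetical and are not the UL18 prediction.

```python
from collections import Counter


def ss_composition(ss):
    """Return {state: (count, percent)} for a three-state H/E/C string."""
    if not ss:
        raise ValueError("empty secondary structure string")
    counts = Counter(ss.upper())
    total = len(ss)
    return {state: (counts.get(state, 0),
                    round(100.0 * counts.get(state, 0) / total, 2))
            for state in "HEC"}


# Toy 20-residue prediction (hypothetical): 8 helix, 4 strand, 8 coil.
print(ss_composition("CCHHHHHHCCEEEECCCCHH"))
```

The percentages depend on the denominator, so the same residue counts give slightly different values when divided by the full chain length or by the number of residues the predictor actually scored.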
Figure 5: Tertiary structure of [A] HCMV-UL18, [B] Class-I MHC, and [C] Ig molecules predicted by Phyre2.0. Alpha helices are coloured purple and shown as “rockets”; beta strands are coloured yellow and shown as planks. The entire protein chain is shown as a smoothed backbone trace. [B] The class-I MHC molecule contains helices and strands linked by coils. [C] The Ig molecule lacks helices and contains only strands.
Prediction of secondary and tertiary structure of MHC-I and Ig superfamily
The secondary structure of the class-I MHC molecule [ACR55720.1] was determined; the template represents a heavy chain complexed with β2 microglobulin and pbm8 peptide [pdb ID: c2clvA]. A total of 278 residues [76% of the sequence] were modeled with 100.0% confidence by the single highest scoring template, and the structure consists of helices and strands (Figure 5B). The UL18 amino acid sequences were compared to test the concept that the structure resembles the MHC-I molecule and is able to bind a host inhibitory receptor. Ig sequences were manually retrieved from the 229-289 residue region, as this region of HCMV UL18 was found to be similar to the Ig superfamily (Figure 2) (Table 3). In the present study, we compared the 19-197 region of UL18 with MHC-I and the 229-289 region of UL18 with the Ig superfamily. High-confidence structural similarity was found between UL18 and the MHC-I molecule. A total of 175 residues [99%] were modeled with 100% confidence by the single highest scoring template [pdb code-3d2u], as were 59 residues [97% of the sequence] of the Ig superfamily region. The Ig structure is dominated by strands [37.3%] connected by coils, and lacks helices (Figure 5C).
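Retrieving the two regions compared here is a matter of slicing the protein sequence by 1-based, inclusive residue coordinates. A minimal Python sketch follows, using the coordinates from Table 3; the `extract_region` helper is hypothetical, and the placeholder string stands in for the 368-residue UL18 sequence, which is not reproduced here.

```python
# Domain coordinates as reported in Table 3 (1-based, inclusive).
DOMAINS = {
    "MHC class-I [pfam00129]": (19, 197),
    "C1-set [Ig superfamily] [pfam07654]": (229, 289),
}


def extract_region(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


# Placeholder only: 368 'A' residues, NOT the real UL18 sequence.
placeholder = "A" * 368
for name, (start, end) in DOMAINS.items():
    region = extract_region(placeholder, start, end)
    print(name, len(region))
```

The off-by-one handling matters because database coordinates count from 1 while Python slices count from 0; the two regions are 179 and 61 residues long, respectively.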
Structure comparison of UL18 with MHC I molecules and Ig superfamily
iPBA was used to align the protein structures, and the structure of UL18 was compared to define the structural homology through the iPBA web tool [http://www.dsimb.inserm.fr/dsimb_tools/ipba/]. In this approach, local backbone conformations are characterized as pentapeptide dihedrals using Protein Blocks [PBs], which are aligned with a PB substitution matrix. The results were cross-checked with the TM-align tool, which performs sequence-independent protein structure comparisons. The predicted structural superposition of UL18 with the MHC-I and Ig molecules showed that the MHC-I molecule superposed closely on the HCMV UL18 protein with a highly significant TM-score (Figure 6A), whereas the Ig molecule aligned only with the strands of UL18, as expected (Figure 6B).
Figure 6: Visualization of the structural superposition of UL18 with [A] MHC I and [B] Ig molecule. Structures were compared using the TM-align tool. [A] MHC I [red] superposed on the UL18 protein [blue]: aligned length=158, RMSD=2.20, TM-score for UL18=0.80781, TM-score for MHC I=0.52373. [B] The Ig molecule aligned only with the strands of UL18: aligned length=46, RMSD=3.43, TM-score for UL18=0.19726, TM-score for Ig=0.41689.
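Each superposition in Figure 6 carries two TM-scores because the score is normalized by the length of one chain at a time. The Python sketch below evaluates the TM-score formula, the sum of 1/(1 + (d_i/d0)^2) over aligned pairs divided by the normalizing length, with d0 = 1.24 (L - 15)^(1/3) - 1.8, for a single fixed superposition. TM-align additionally searches for the superposition that maximizes this value, which is not attempted here. The distances and chain lengths are hypothetical, and the 0.5 Å floor on d0 for short chains is an assumption.

```python
import math


def tm_d0(l_norm):
    """Distance scale d0 (in angstroms) for a normalizing length l_norm."""
    if l_norm <= 15:
        return 0.5  # assumed floor; the cube-root formula needs L > 15
    return max(0.5, 1.24 * (l_norm - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, l_norm):
    """TM-score of one fixed superposition, normalized by l_norm.

    distances: C-alpha distances (angstroms) of the aligned residue pairs.
    """
    d0 = tm_d0(l_norm)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / l_norm


def rmsd(distances):
    """Root-mean-square deviation over the aligned pairs."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


# Hypothetical alignment: 158 aligned pairs, each 2.0 angstroms apart,
# between chains of 180 and 280 residues (lengths are assumptions).
aligned = [2.0] * 158
print(round(rmsd(aligned), 2))
print(round(tm_score(aligned, 180), 3), round(tm_score(aligned, 280), 3))
```

Normalizing by the shorter chain gives the higher score, which is why one alignment can look strong from the point of view of one structure and weaker from the other. By the usual rule of thumb for TM-score, values above about 0.5 suggest the same fold and values below about 0.2 are close to random similarity.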
Probably because of the selective pressure exerted by the immune system, numerous viruses have evolved proteins that interfere with antigen presentation by major histocompatibility complexes, using a wide variety of strategies to inhibit the MHC class-I pathway. Diefenbach et al. and others reported that some viral MHC-I-like molecules either downregulate or impair the recognition of specific ligands [45,46]. Quantitative measurement of the direct binding interactions between viral MHC-I-like proteins and their ligands reflects the strength with which these immune-evasion proteins can compete with host defensive or inhibitory components. To gain better knowledge of the molecular basis of UL18-mediated downregulation of MHC-I, we performed a computational analysis and characterized structural and functional motifs. Our analysis suggests that HCMV-UL18 disrupts MHC-I signaling of host cells through its structural homology with MHC-I at residues 19-197. A similar hypothesis was suggested by Chensue in 2001, who proposed that these domains and motifs may be involved in host-protein interactions. Our UL18/MHC-I model suggests that the UL18 stretch of residues 19-197 plays a central role in the increased binding. Side chains are involved in a network of hydrogen bond interactions. Lucjan and Rychlewski suggested that most of the recognized MHC-fold proteins are involved in the binding of peptides in order to present either internal (Class-I MHC) or external (Class-II MHC) antigens in the process of acquired immune response. The majority of proteins belonging to the MHC fold contain additional immunoglobulin-like domains. We found the same at residues 229-289 of the UL18 ORF. The model of UL18 was superposed on MHC-I and Ig in this study. The entire UL18/MHC-I and UL18/Ig complexes were subjected to energy minimization. The presence of an MHC class-I homologue in the CMV genome, encoded by the UL18 gene, was previously reported [49,20].
Here, we investigated the extent to which viral proteins exploit host cell inhibitory components, and show that the homology of HCMV-UL18 with the class-I MHC molecules of the host cell may downregulate the immune response. The HCMV UL18 ORF encodes a 368-residue type-I glycoprotein whose extracellular region shares 25% amino acid sequence identity with the extracellular region of human class-I molecules. A comparative alignment between UL18 and MHC class-I sequences revealed that UL18 is likely to adopt a fold with an MHC-like peptide-binding groove (Figures 6A and 6B). Peptide termini fit conserved residues at each end of the groove. Previous studies indicated that UL18, the HCMV class-I homolog, binds the MHC class-I and is associated with endogenous peptides [25,27]. A similar comparative alignment of m144 and UL18 with class-I MHC sequences reported that UL18 is more likely than m144 to adopt a fold that includes an MHC-like peptide-binding groove.
Bioinformatic prediction of protein subcellular localization has been widely studied for prokaryotes and eukaryotes. However, this is not the case for viruses, whose proteins are often involved in extensive interactions with host proteins at various subcellular localizations. In this study, the amino acid sequence of UL18 was examined to predict its secondary and tertiary structure through bioinformatics, computational modeling, and comparative sequence analysis, in order to understand its mode of action in relation to infection.
Taken together, these investigations should help advance efforts to treat infectious diseases and improve our understanding of how these proteins act in host cells, and could therefore prove valuable for the design of improved therapeutic interventions. Such predictions provide a framework for rapidly annotating viral proteomes with subcellular localization information. On the whole, these observations show that numerous HHV5 [Human Herpesvirus 5] ORFs share a common origin as a consequence of duplication, and raise the possibility that selective pressures have maintained the duplicated genes. HCMV encodes class-I MHC homologs. In light of the above, further experimental work is needed to elucidate the exact role of the UL18 homologues during the pathogenesis of Betaherpesvirinae infections.
We are grateful to ICMR Virus Unit, Kolkata- 700 010, West Bengal, India for providing high-speed internet and computational laboratory facilities. We also thank University Grants Commission (UGC), Govt. of India, New Delhi for financial support for the Postdoctoral research of Dr. Agniswar Sarkar.