Characterization of Molecular Mimicry Between UL18 Glycoprotein of Human Cytomegalovirus [HCMV] and Class-I MHC Molecule through Pattern-based Analysis: An In-silico Approach

Introduction
Viruses replicate by exploiting host machinery, having evolved mechanisms to hijack the host's nucleic acid replication and protein translation apparatus [1]. Viral proteins must be targeted to specific cellular compartments [2,3]. Human Cytomegalovirus [HCMV] is a widely distributed, host-specific member of the Herpesviridae family, classified under the Betaherpesvirinae subfamily [4][5][6]. Mature HCMV virions range from 200 to 300 nanometers in diameter, and HCMV is one of the largest double-stranded DNA viruses, with a genome of about 235 kb that encodes over 200 open reading frames [ORFs] [7][8][9]. Serological surveys have shown global prevalence rates of maternal antibody ranging from 30% to nearly 100%, reflecting wide variation in infection rates between populations. In India, serological studies have indicated an 80-90% prevalence of CMV IgG antibodies in women of childbearing age [10][11][12]. The risk of seroconversion during pregnancy has been estimated at 2.0-2.5% [13,14]. HCMV is an important agent of numerous diseases [15], including pneumonitis, hepatitis, retinitis, and gastrointestinal infections [16], especially in organ transplant recipients, immunocompromised patients, and fetuses or newborn infants [17,18].
HCMV may down-regulate expression of classical class-I major histocompatibility complex [MHC-I] molecules at the surface of infected cells. This allows the infected cells to avoid recognition by cytotoxic T cells. HCMV encodes MHC-I heavy chain homologs that may function in immune evasion [19]. The first CMV gene recognized as homologous to an MHC class-I antigen was HCMV-H301 [later known as the UL18 glycoprotein] [20]. Fahnestock et al. [21] proposed that UL18 of HCMV, expressed in Chinese hamster ovary [CHO] cells, is similar to class-I molecules.
Sequence alignments and comparisons suggest that HCMV-UL18 contains the characteristic groove that serves as the binding site in MHC molecules [22,23]. In an uninfected cell, peptides derived from self-proteins are bound to MHC molecules. In an infected cell, by contrast, MHC molecules are occupied by peptides from viral proteins, to which T cells respond by killing the cell [24]. Research over the last few years has indicated that the role of these homologs in the virus-infected cell is to engage NK cell inhibitory receptors, thereby preventing the lysis that would normally occur because of down-regulation of MHC class-I molecules [25][26][27]. Although HCMV has over 200 ORFs, not every gene of every ORF has yet been analyzed through bioinformatics approaches. The validation of the actual function and structure of many of these genes therefore still awaits further confirmation. For comparative genomic approaches, conserved ORFs of individual genes and conserved domains are necessary to understand the genetic diversity and coding capacity of HCMV strains. In the present study, we used freely available online and offline bioinformatics tools to examine the functional and structural properties of the most conserved domain [CD] of the UL18 protein of HCMV, together with the class-I MHC molecule and sequences of the Ig superfamily.

situated between residues 19 and 300 of the UL18 ORF, and the recognition domain was identified with a highly significant E-value (Table 4). These domains have thus been found to be homologous in HCMV proteins, suggesting a particular functional role during infection. Prediction of the transmembrane helix was performed with the TMHMM server v 2.0 (Figure 3). Of the 368 amino acids, 41.56% were predicted to lie in transmembrane helices, with 18.84% found in the first 60 amino acids [outside: 1-323; TM helix: 320-347; inside: 347-368], and 0.46% was predicted as N-terminal signal sequence.
We also extended our search for conserved domains with a BLASTp search for proteins homologous to HCMV UL18. A total of 21 homologous proteins were identified with highly significant E-values (Table 2). All of these proteins were of herpesvirus origin or homologous to MHC class-I molecules of various organisms. A PSI-BLAST search with the full-length ORF was performed to find more distant homologs. The PSI-BLAST search identified human MHC-I as the homolog of HCMV-UL18. Therefore, HCMV UL18 is most likely to possess a viral MHC class-I domain rather than any other currently characterized protein fold.

Sequence retrieval and ORF selection
The sequences of HCMV-UL18 from the laboratory strain AD169 and of human MHC-I were retrieved from the NCBI protein database using the accession numbers X17403.1 and ACR55720.1, respectively (Table 1). The accession numbers of the HCMV-UL18 homologs identified by BLASTp are summarized in Table 2. Six open reading frames [ORFs] of UL18 were reanalyzed by selecting the start codon with ORF finder [28,29]. The most conserved domain was identified using the NCBI conserved domains database [NCBI-CDD] and BLASTp algorithms [30][31][32].
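The idea of scanning a nucleotide sequence for ORFs in all six reading frames can be illustrated with a minimal Python sketch. This is not the NCBI ORF finder algorithm itself: the function names, the restriction to ATG start codons, the standard stop codons, and the `min_codons` parameter are all assumptions made for illustration.

```python
# Illustrative six-frame ORF scan; a simplified stand-in, not NCBI ORF finder.
STOP_CODONS = {"TAA", "TAG", "TGA"}          # standard genetic code (assumed)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def orfs_in_frame(dna, frame, min_codons=1):
    """Yield (start, end) indices of ATG...stop ORFs in one frame (0, 1 or 2).

    min_codons is the minimum number of codons before the stop codon,
    counting the ATG start codon.
    """
    start = None
    for i in range(frame, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if start is None and codon == "ATG":
            start = i                         # first ATG opens the ORF
        elif start is not None and codon in STOP_CODONS:
            if (i - start) // 3 >= min_codons:
                yield start, i + 3            # include the stop codon
            start = None


def six_frame_orfs(dna, min_codons=1):
    """Return (strand, frame, orf_sequence) for all six reading frames."""
    dna = dna.upper()
    results = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            for start, end in orfs_in_frame(seq, frame, min_codons):
                results.append((strand, frame, seq[start:end]))
    return results
```

For the toy input `"CCATGAAATAGGG"`, the scan returns a single ORF, `("+", 2, "ATGAAATAG")`. A real analysis would also need to handle alternative start codons and ORFs that run off the end of the sequence.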

Computational analysis of sequences
A variety of freely available online and offline bioinformatics programs were used to analyze the shortlisted sequences. Unless stated otherwise, default settings were used. Multiple sequence alignment [MSA] was generated using ClustalW2 and BLASTp algorithms. Homologous proteins were identified using BLASTp, position-specific iterated [PSI]-BLAST, and GenThreader [31,33,34]. Protein sequences of all organisms were searched in the non-redundant [nr] protein database with BLASTp and PSI-BLAST.
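Selecting homologs by E-value can be sketched as a small filter over BLAST tabular output. The sketch below assumes the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start and end, subject start and end, E-value, bit score). The function name and the cutoff of 1e-5 are illustrative assumptions and are not taken from this study.

```python
# Illustrative filter for BLAST tabular hits; the cutoff value is an assumption.
def significant_hits(tabular_text, max_evalue=1e-5):
    """Return (subject_id, percent_identity, evalue) tuples, best E-value first.

    Expects tab-separated lines in the standard 12-column BLAST tabular
    layout; blank lines and comment lines starting with '#' are skipped.
    """
    hits = []
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        subject = fields[1]                   # column 2: subject identifier
        identity = float(fields[2])           # column 3: percent identity
        evalue = float(fields[10])            # column 11: E-value
        if evalue <= max_evalue:
            hits.append((subject, identity, evalue))
    return sorted(hits, key=lambda hit: hit[2])
```

Sorting by E-value puts the most significant hits first, which matches how a homolog table is usually read.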

Sequence retrieval and UL18 domain analysis
One way to understand a protein's function is to identify the domains and folds present in the polypeptide. The most conserved domain was obtained from six ORFs of the HCMV-UL18 region, and the domain architecture within the amino acid sequences was analyzed with SMART [Simple Modular Architecture Research Tool: http://smart.embl-heidelberg.de/] in normal mode. SMART identifies repeats and motifs in the sequence using hidden Markov models. For the full-length HCMV-UL18 envelope glycoprotein, only the most significant recognizable domain is reported. Figure 1 shows the residue positions with sequence homology to the MHC class-I molecule and the Ig superfamily, together with the transmembrane region of UL18. Site-specific homology to the MHC class-I molecule was determined at residue position [

The secondary structure of HCMV ORF18
Proteins fold into domains with characteristic patterns of secondary structure, so secondary structure predictions can give insight into the likely protein architecture. The likely secondary structure of HCMV UL18 was determined using PSIPRED [Protein Sequence Analysis Workbench] (Figure 4). In our study, the predicted protein architecture contained about 28.90% helices [107 residues] and only 10.32% strands [38 residues], connected by 60.59% coils [223 residues]. The architecture of UL18 thus combines helices and strands, which is consistent with its function as a glycoprotein with a transmembrane region.

The molecular model of UL18 was built on the basis of its sequence homology to classical MHC-I. Identifying proteins with homologous tertiary structure may provide clues about protein function and mechanism. HCMV UL18 was previously reported to be a homolog of the class-I MHC molecule [1]. We sought to extend these observations to the whole protein and generated the tertiary structure of UL18 to give a global perspective. The tertiary structure was built using Phyre2 [Protein Homology/analogY Recognition Engine V 2.0], where 276 residues [75% of the sequence] were modeled on the single highest-scoring template with 100.0% confidence (Figure 5A). Thus, we conclude that a consistent fold prediction was achieved for the conserved domain of UL18 proteins.
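Composition figures of this kind are computed from a per-residue string of predicted states, as the following minimal Python sketch shows. It assumes the common three-state encoding used by predictors such as PSIPRED (H for helix, E for strand, C for coil). The function name and the toy input are illustrative and are not the actual UL18 prediction.

```python
# Illustrative secondary-structure composition from a three-state string.
from collections import Counter


def ss_composition(states):
    """Return {state: (count, percent)} for helix (H), strand (E) and coil (C).

    Characters other than H, E and C are ignored when computing the total.
    """
    counts = Counter(states.upper())
    total = sum(counts[s] for s in "HEC")
    if total == 0:
        raise ValueError("no H/E/C states found in input")
    return {s: (counts[s], 100.0 * counts[s] / total) for s in "HEC"}
```

For the toy string `"CCHHHHEECC"`, the function reports 4 helix residues (40%), 2 strand residues (20%), and 4 coil residues (40%).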

Prediction of secondary and tertiary structure of MHC-I and Ig superfamily
The secondary structure of the class-I MHC molecule [ACR55720.1] was determined; it represents the heavy chain complexed with β2-microglobulin and pbm8 peptide [pdb ID: c2clvA]. A total of 278 residues [76% of the sequence] were modeled with 100.0% confidence on the single highest-scoring template, and the structure consists of helices and strands (Figure 5B). The UL18 amino acid sequence was compared with it to examine whether the viral structure resembles that of the MHC-I molecule and could bind a host inhibitory receptor. Ig sequences were manually retrieved from the region of residues 229-289, as this region was found to be similar to the corresponding UL18 region of HCMV (Figure 2; Figure 5C).

Structure comparison of UL18 with MHC I molecules and Ig superfamily
IPBA was used to align the protein structures, and the structure of UL18 was compared to define structural homology through the IPBA web tool [http://www.dsimb.inserm.fr/dsimb_tools/ipba/]. In this approach, the local backbone conformation is described by pentapeptide dihedral angles encoded as Protein Blocks [PBs], and structures are aligned using a PB substitution matrix [42]. The results were then re-examined with the TM-align tool, an algorithm for sequence-independent protein structure comparison [43]. The predicted structural superposition of UL18 with the MHC-I and Ig molecules showed that only the MHC-I molecule superposed across the HCMV UL18 protein with a highly significant TM value (Figure 6A), whereas the Ig molecule aligned only with the strands of UL18, as expected (Figure 6B).
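The TM value reported by TM-align is based on the TM-score, which sums a distance-dependent term over aligned residue pairs and normalizes by the length of the reference protein. The Python sketch below evaluates that sum for a fixed list of Cα-Cα distances, using the usual length-dependent scale d0 = 1.24 (L - 15)^(1/3) - 1.8. It is a simplification: TM-align additionally searches for the superposition that maximizes the score, and short chains need a special-cased d0. Both are omitted here, and the function names are illustrative.

```python
# Illustrative TM-score for a fixed alignment; the superposition search that
# TM-align performs to maximize this value is deliberately left out.
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) used by the TM-score."""
    if target_length <= 21:
        # Short chains need a special-cased d0, which this sketch omits.
        raise ValueError("target_length must be greater than 21 in this sketch")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM-score from distances (angstroms) between aligned C-alpha pairs.

    distances holds one value per aligned residue pair; target_length is the
    length of the protein used for normalization.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length
```

As a sanity check, 276 perfectly superposed pairs normalized by a 368-residue chain give a score of 0.75, and the score falls as the distances grow. Scores above roughly 0.5 are commonly read as indicating the same overall fold.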

Discussion
Probably because of the selective pressure exerted by the immune system, many viruses have evolved proteins that interfere with antigen presentation by major histocompatibility complexes, using a whole range of strategies to inhibit the MHC class-I pathway [44]. Diefenbach et al. and others reported that some viral MHC-I-like molecules either downregulate or impair the recognition of specific ligands [45,46]. Quantitative measurement of the direct binding interactions between viral MHC-I-like proteins and their ligands reflects the strength with which these evasion molecules can compete with host defensive or inhibitory components. To gain a better understanding of the molecular basis of UL18-mediated downregulation of MHC-I, we performed a computational analysis and characterized structural and functional motifs. Our analysis indicates that HCMV-UL18 may disrupt MHC-I signaling in host cells through its structural homology with MHC-I at residues 19-197. Chensue proposed a similar hypothesis in 2001, suggesting that these domains and motifs may be involved in host-protein interactions [47]. Our UL18/MHC-I model suggests that the UL18 stretch of residues 19-197 plays a central role in the increased binding. Side chains are involved in a network of hydrogen bond interactions.

Lucjan and Rychlewski suggested that most of the recognized MHC-fold proteins are involved in binding peptides in order to present either internal (Class-I MHC) or external (Class-II MHC) antigens in the acquired immune response [48]. The majority of proteins belonging to the MHC fold contain additional immunoglobulin-like domains; we found one at residues 229-289 of the UL18 ORF. The model of UL18 was superposed on MHC-I and Ig in this study, and the entire UL18/MHC-I and UL18/Ig complexes were subjected to energy minimization.
The presence of an MHC class-I homolog in the CMV genome, encoded by the UL18 gene, was previously reported [20,49]. Here, we investigated the extent to which viral proteins exploit inhibitory components of human cells, and show that the homology between HCMV-UL18 and the class-I MHC molecules of host cells may down-regulate the immune response. The HCMV UL18 ORF encodes a 368-residue type-I glycoprotein whose extracellular region shares 25% amino acid sequence identity with the extracellular region of human class-I molecules [20]. A comparative alignment of UL18 and MHC class-I sequences revealed that UL18 is likely to adopt a fold with an MHC-like peptide-binding groove (Figures 6A and 6B). Peptide termini fitted with conserved residues at each end of the groove. Previous studies indicated that UL18, the HCMV class-I homolog, binds the MHC class-I light chain [β2-microglobulin] and associates with endogenous peptides [25,27]. A similar comparative alignment of m144 and UL18 with class-I MHC sequences reported that UL18 is more likely than m144 to adopt a fold that includes an MHC-like peptide-binding groove [19].
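A percent identity value such as the one quoted here depends on how the two sequences are aligned and on which positions are counted. The minimal Python sketch below assumes two pre-aligned sequences of equal length with `-` as the gap character, and counts identities over columns where neither sequence has a gap. That choice of denominator is one convention among several, and the function name is illustrative.

```python
# Illustrative percent identity over gap-free columns of a pairwise alignment.
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues among columns without a gap in either row."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0                            # no comparable columns
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)
```

For the toy alignment `"MKT-AY"` against `"MRTLAW"`, three of the five gap-free columns match, giving 60.0.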
The bioinformatic prediction of protein subcellular localization has been widely studied for prokaryotes and eukaryotes. However, this is not the case for viruses, whose proteins are often involved in extensive interactions with host proteins at various subcellular localizations [50]. Here, the amino acid sequence of UL18 was analyzed to predict its secondary and tertiary structure through bioinformatics, computational modeling, and comparative sequence analysis, in order to understand its mode of action in relation to infection.

Conclusion
Taken together, these investigations should support efforts to treat infectious diseases and improve our understanding of how these proteins act in host cells, and could therefore prove valuable for the design of improved therapeutic interventions. Such predictions provide a framework for rapidly annotating viral proteomes with subcellular localization information. Overall, these observations indicate that many HHV5 [Human Herpesvirus 5] ORFs share common origins as a result of duplication, and raise the possibility that selective pressures have maintained the duplicated genes in these sequences. HCMV encodes class-I MHC homologs. In light of the above, further experimental work is needed to elucidate the exact role of the UL18 homologs during the pathogenesis of Betaherpesvirinae infections.