ISSN: 2153-0602
Journal of Data Mining in Genomics & Proteomics

Homology Modelling of Conserved rbcL Amino Acid Sequences in Leguminosae Family

Sagar S Patel*, Megha B Vaidya and Dipti B Shah

G. H. Patel Post Graduate Department of Computer Science and Technology, Sardar Patel University, Vallabh Vidyanagar, Gujarat-388120, India

*Corresponding Author:
Sagar Patel
G. H. Patel, Department of Computer Science and Technology
Sardar Patel University, Gujarat-388120, India
Tel: 02692-226802
E-mail: [email protected]

Received date: March 20, 2014; Accepted date: April 26, 2014; Published date: April 29, 2014

Citation: Patel SS, Vaidya MB, Shah DB (2014) Homology Modelling of Conserved rbcL Amino Acid Sequences in Leguminosae Family. J Data Mining Genomics Proteomics 5:154. doi:10.4172/2153-0602.1000154

Copyright: © 2014 Patel SS, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Abstract

This study focuses on homology modelling of rbcL proteins from a few species of the Leguminosae family found in Gujarat state, India. The Leguminosae family has three subfamilies: Fabaceae (Papilionaceae), Caesalpiniaceae and Mimosaceae. Multiple sequence alignment was carried out on the rbcL protein sequences of a few species in each subfamily, and the conserved amino acids were considered for homology modelling. Evolutionarily related proteins have similar sequences, and naturally occurring homologous proteins have similar structures. It has been shown that three-dimensional protein structure is evolutionarily more conserved than would be expected on the basis of sequence conservation alone. We found that a few amino acids are common, with the same base pairs, within each subfamily even though the species belong to different genera. No protein structure for these conserved amino acid sequences is available in the PDB database, so we carried out homology modelling of three rbcL protein sequences (one from each subfamily) that were found to be conserved in the multiple sequence alignment. The predicted structures were validated with Ramachandran plots, the CASTp server was used to find active sites in the predicted structures, and finally the function of each predicted protein is reported.

Keywords

Homology modelling; Bioinformatics; Leguminosae family; rbcL

Introduction

Leguminosae family

The Leguminosae family includes herbs, shrubs and trees. Legumes are used as crops, forages and green manures; they also synthesize a wide range of natural products such as flavours, drugs, poisons and dyes. Legumes are able to convert atmospheric nitrogen into nitrogenous compounds useful to plants [1]. This is achieved through root nodules containing bacteria of the genus Rhizobium. These bacteria have a symbiotic relationship with legumes: they fix free nitrogen for the plants, and in return the legumes supply the bacteria with a source of fixed carbon produced by photosynthesis. This enables many legumes to survive and compete effectively in nitrogen-poor conditions [2,3]. The Leguminosae family is further classified into three subfamilies: 1. Fabaceae (Papilionaceae), 2. Caesalpiniaceae and 3. Mimosaceae.

rbcL gene

The most common gene used for plant phylogenetic analyses is the plastid-encoded rbcL gene. This single-copy gene is approximately 1430 base pairs in length and is free from length mutations except at the far 3’ end. It has a fairly conservative rate of evolution. The rbcL gene codes for the large subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (RUBISCO or RuBPCase) [4].

Protein structure

Recent genome sequencing projects have provided massive amounts of data; however, many of these genomes are still not fully annotated and contain genes and proteins of unknown function and structure. This is due to several limitations, such as the cost and time required for experimental approaches [5]. An alternative to laboratory-based methods is a bioinformatics approach that uses algorithms and databases to estimate protein function. As these algorithms and databases are based on experimental results, they can be an effective means of performing functional and structural annotation of hypothetical proteins. Structures are more evolutionarily conserved than sequences; therefore, analysis of three-dimensional (3D) structures holds great potential. The present study describes 3D models of three rbcL protein sequences that were found to be conserved in multiple sequence alignment and whose structures were predicted through homology modelling. In addition, sequence and structural analysis and functional annotation were carried out [6].

Methodology

In the current research, we considered around 266 species found in Gujarat state, India [7,8]. We then searched for each species in the NCBI database and found information (DNA, protein and other useful data) for around 149 species of the Leguminosae family [9]. Only rbcL protein sequences were considered for analysis. ProtParam was used to calculate physico-chemical properties. The parameters computed by ProtParam included the molecular weight, theoretical pI, amino acid composition, atomic composition, extinction coefficient, estimated half-life, instability index, aliphatic index, and grand average of hydropathicity (GRAVY) [7]. Secondary structure (helices, sheets and coils) was predicted using PSIPred [8].
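Two of the ProtParam parameters listed above are simple enough to compute by hand, which can help when checking the reported values. The following is a minimal sketch, not ProtParam's own code: it assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula (mole percentages of Ala, Val, Ile and Leu weighted by 1, 2.9, 3.9 and 3.9). The function names are illustrative, and the three input sequences are the conserved fragments reported in Table 1.

```python
# Kyte-Doolittle hydropathy scale (assumed here as the scale behind GRAVY).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    def mole_percent(aa):
        return 100.0 * sequence.count(aa) / len(sequence)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Conserved fragments from Table 1 of this study.
FRAGMENTS = {
    "Fabaceae": "ASKWSPELAAACEVWK",
    "Caesalpiniaceae": "SVGFKAGVKDYK",
    "Mimosaceae": "RGGLDFTKDDENVNSQPFMR",
}

if __name__ == "__main__":
    for subfamily, seq in FRAGMENTS.items():
        print(subfamily, round(gravy(seq), 3), round(aliphatic_index(seq), 2))
```

Under these assumptions the sketch should reproduce the GRAVY and aliphatic index values listed in Table 2 for all three fragments. The instability index and theoretical pI need larger lookup tables and are left to ProtParam itself.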

Homology modelling

A homology modelling approach was used to determine the 3D structure of the three conserved rbcL protein sequences. A BLASTP search (Altschul et al. [9]) with default parameters was performed against the Brookhaven Protein Data Bank (PDB) to find suitable templates for homology modelling. The best templates were 1RLD for Fabaceae (Papilionaceae), 1WDD for Caesalpiniaceae and 1EJ7 for Mimosaceae. SPDBV (Swiss-PdbViewer) was then used for homology model construction.

Protein structure validation

SPDBV generated three structures, which were validated using the Structural Analysis and Verification Server (SAVS) [10-12]. Each of the three structures was assessed with a Ramachandran plot.
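A Ramachandran plot places each residue at its backbone dihedral angles phi and psi. As an illustration of what the validation server measures, the sketch below computes a signed dihedral angle from four atom positions using only the standard library. It is an assumption-labelled example rather than the SAVS implementation; the function name and the toy coordinates are hypothetical. For residue i, phi uses the atoms C(i-1), N(i), CA(i), C(i) and psi uses N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    axis = tuple(x / length for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, axis) * x for x in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * x for x in axis))
    return math.degrees(math.atan2(_dot(_cross(axis, v), w), _dot(v, w)))


if __name__ == "__main__":
    # Toy coordinates (hypothetical): first and last atoms on the same side
    # of the central bond, then on opposite sides.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

Applying this to every residue of a model gives the (phi, psi) pairs, and the percentages in the core, allowed and disallowed regions then follow from counting how many pairs fall inside each region of the plot.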

Active site prediction

The constructed PDB file was then used to find cavities in the protein with the Computed Atlas of Surface Topography of proteins (CASTp) server. CASTp provides an online resource for locating, delineating and measuring concave surface regions on three-dimensional structures of proteins.

The pipeline for the methodology followed is shown in Chart 1.


Chart 1: Pipeline for modelling and characterization of the unknown protein.

Results and Discussion

The present study focused on sequence and structural analysis of rbcL protein sequences that are conserved within the subfamilies of the Leguminosae family. The conserved sequence was found in 38% of the species in Fabaceae, 60% in Caesalpiniaceae and 54% in Mimosaceae; the sequences are shown in Table 1. ProtParam was then used to compute physico-chemical properties from the amino acid sequences, which are listed in Table 2.

| Subfamily | rbcL protein sequence |
|---|---|
| Fabaceae | ASKWSPELAAACEVWK |
| Caesalpiniaceae | SVGFKAGVKDYK |
| Mimosaceae | RGGLDFTKDDENVNSQPFMR |

Table 1: Information of conserved rbcL protein sequences considered for Homology modelling.

| Physico-chemical property | Fabaceae | Caesalpiniaceae | Mimosaceae |
|---|---|---|---|
| Molecular weight (Daltons) | 1776.0 | 1298.5 | 2326.5 |
| Theoretical pI | 6.18 | 9.52 | 4.68 |
| Molecular formula | C81H122N20O23S1 | C60H95N15O17 | C98H152N30O34S1 |
| Instability index | 75.16 | -12.83 | 32.77 |
| Aliphatic index | 67.50 | 56.67 | 34.00 |
| Estimated half-life (mammalian reticulocytes, in vitro) | 4.4 hours | 1.9 hours | 1 hour |
| Estimated half-life (yeast, in vivo) | >20 hours | >20 hours | 2 min |
| Estimated half-life (Escherichia coli, in vivo) | >10 hours | >10 hours | 2 min |
| Grand average of hydropathicity (GRAVY) | -0.131 | -0.425 | -1.290 |
| Classification of protein | Unstable | Stable | Stable |

Table 2: Physico-chemical properties as calculated by the ProtParam tool.

The ProtParam results indicate that the Fabaceae protein is unstable (instability index above 40), whereas the proteins of the other two subfamilies are stable. The estimated half-life of the Mimosaceae protein was much shorter than that of the other two subfamilies, as shown in Table 2.
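The half-life difference follows from how ProtParam estimates this value: it applies the N-end rule, which depends only on the N-terminal residue of the sequence. The sketch below is a deliberately partial lookup, restricted to the three N-terminal residues that occur in this study (Ala, Ser and Arg) with the values taken from Table 2; the function name is illustrative and the full N-end rule table is not reproduced here.

```python
# Half-life estimates keyed by N-terminal residue, as
# (mammalian reticulocytes in vitro, yeast in vivo, E. coli in vivo).
# Only the residues seen in Table 2 are included (partial table).
N_END_HALF_LIFE = {
    "A": ("4.4 hours", ">20 hours", ">10 hours"),
    "S": ("1.9 hours", ">20 hours", ">10 hours"),
    "R": ("1 hour", "2 min", "2 min"),
}


def estimated_half_life(sequence):
    """Look up the half-life estimates for the first residue of a sequence."""
    first = sequence[0].upper()
    if first not in N_END_HALF_LIFE:
        raise KeyError(f"N-terminal residue {first!r} is not in this partial table")
    return N_END_HALF_LIFE[first]


if __name__ == "__main__":
    for seq in ("ASKWSPELAAACEVWK", "SVGFKAGVKDYK", "RGGLDFTKDDENVNSQPFMR"):
        print(seq[0], estimated_half_life(seq))
```

The Mimosaceae fragment begins with arginine, which is why its estimated half-life is so much shorter than those of the fragments beginning with alanine or serine.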

Secondary structure analysis was performed using PSIPred, and the three rbcL proteins were predicted to contain helices, coils and beta sheets, as shown in Figures 1a-1c.

Figure 1a: PSIPred result of Fabaceae subfamily.


Figure 1b: PSIPred result of Caesalpiniaceae subfamily.


Figure 1c: PSIPred result of Mimosaceae subfamily.

Homology modeling and protein structure validation

Homology or comparative modelling is one of the most common structure prediction methods in structural genomics and proteomics. Numerous online servers and tools for homology or comparative modelling of proteins have become available in recent years [13]. Despite minor differences, one initial step is common to all modelling tools and servers: finding the best matching template by performing a sequence homology search with BLASTP [14]. Templates are experimentally determined 3D structures of proteins that share sequence similarity with the query sequence. The template sequence and the protein sequence whose structure is to be determined are aligned using multiple sequence alignment algorithms [15]. A well-defined alignment is very important for predicting a reliable 3D structure. A BLASTP search was performed for each protein sequence against the PDB to identify templates for homology modelling. The query sequence and template ID were then given as input to SPDBV for homology modelling. It generated three predicted protein models, which are shown in Figures 2a-2c. The Ramachandran plots of the selected models are shown in Figures 3a-3c respectively [16-24]. The final model was selected by checking various parameters, which are shown in Table 3. These parameters included the percentage of amino acids in the core, allowed and disallowed regions, along with the number of bad contacts.

| Subfamily | Core region (%) | Allowed region (%) | Disallowed region (%) | Bad contacts |
|---|---|---|---|---|
| Fabaceae | 100.0 | 0.0 | 0.0 | 0.0 |
| Caesalpiniaceae | 100.0 | 0.0 | 0.0 | 0.0 |
| Mimosaceae | 80.0 | 20.0 | 0.0 | 0.0 |

Table 3: Information of Ramachandran Plot.

Figure 2a: Protein model of Fabaceae subfamily.


Figure 2b: Protein model of Caesalpiniaceae subfamily.


Figure 2c: Protein model of Mimosaceae subfamily.

Figure 3a: Ramachandran Plot of Fabaceae subfamily.


Figure 3b: Ramachandran Plot of Caesalpiniaceae subfamily.


Figure 3c: Ramachandran Plot of Mimosaceae subfamily.

Active site prediction

The active site is the functional region of a protein. During active site prediction with CASTp, a few pockets were predicted in the Caesalpiniaceae and Mimosaceae protein structures, but no pocket was found in the Fabaceae protein structure. Some of the predicted pockets are shown in Figures 4a-4c.


Figure 4a: Result of CASTp server of Caesalpiniaceae subfamily.


Figure 4b: Result of CASTp server of Mimosaceae subfamily.

Figure 4c: Result of CASTp server of Fabaceae subfamily.

Conclusion

We used a homology modelling approach to propose 3D structures and possible functions for the conserved rbcL protein sequences found in the Leguminosae family. The function of a protein can be better understood from its structure. Because the structure of the rbcL protein is already known, the function of these conserved sequence fragments was confirmed using the following templates for homology modelling: 1RLD for the Fabaceae (Papilionaceae) subfamily, 1WDD for the Caesalpiniaceae subfamily and 1EJ7 for the Mimosaceae subfamily. SPDBV was then used for homology model construction. From these findings, each conserved protein appears to be involved in an important function. In the Caesalpiniaceae subfamily, the predicted protein structure has a site that is a heterodimer interface [polypeptide binding], and a disulfide bond was found within the structure. In the Mimosaceae subfamily, the predicted protein structure has a few sites: a heterodimer interface [polypeptide binding], an active catalytic residue site and a metal binding site [ion binding]. No active site was found in the predicted protein structure of the Fabaceae subfamily. The predicted protein structures therefore have several important features, as described above, and the underlying sequences are common to the selected species in each subfamily of the Leguminosae family. Because these protein sequences are conserved within each subfamily, they can be used for classification of Leguminosae species: if a protein sequence contains one of the conserved sequences described in this study, the species may fall within the corresponding subfamily of the Leguminosae family.
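The classification idea in the conclusion can be sketched as an exact substring search for the three conserved fragments of Table 1. This is only an illustration under stated assumptions: the function name and the flanking residues in the usage example are hypothetical, exact matching ignores substitutions that a real alignment would tolerate, and a match suggests a candidate subfamily rather than confirming one.

```python
# Conserved rbcL fragments reported in Table 1 of this study.
CONSERVED_FRAGMENTS = {
    "Fabaceae": "ASKWSPELAAACEVWK",
    "Caesalpiniaceae": "SVGFKAGVKDYK",
    "Mimosaceae": "RGGLDFTKDDENVNSQPFMR",
}


def candidate_subfamilies(protein_sequence):
    """Return the subfamilies whose conserved fragment occurs in the sequence."""
    sequence = protein_sequence.strip().upper()
    return [name for name, fragment in CONSERVED_FRAGMENTS.items()
            if fragment in sequence]


if __name__ == "__main__":
    # Hypothetical query: a Table 1 fragment with made-up flanking residues.
    query = "MSPQTET" + "SVGFKAGVKDYK" + "LTYYTPE"
    print(candidate_subfamilies(query))
```

An empty result means only that none of the three fragments occurs verbatim; it does not exclude membership of the family, since the fragments were found in only part of the species studied in each subfamily.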

Acknowledgements

We are heartily thankful to Prof. (Dr.) P.V. Virparia, Director, GDCST, Sardar Patel University, Vallabh Vidyanagar, for providing us facilities for the research work. We are also thankful to DST-PURSE program and Center for Interdisciplinary Studies in Science and Technology (CISST), Sardar Patel University, Vallabh Vidyanagar, Gujarat (India) for providing financial assistance in the form of fellowship.

References
