In-Silico Analysis, Structural Modelling and Phylogenetic Analysis of Acetohydroxyacid Synthase Gene of Oryza sativa

Acetohydroxyacid synthase (AHAS; EC 2.2.1.6), also known as acetolactate synthase (ALS), belongs to a family of thiamine diphosphate (TPP)-dependent enzymes and catalyzes the first reaction in the biosynthesis of the essential amino acids isoleucine, leucine and valine. AHAS is present in plants, algae, fungi and bacteria and is a vital target of multiple herbicides. We built a homology model of the Oryza sativa AHAS (OsAHAS) protein using the structure of Arabidopsis thaliana AHAS (PDB ID: 3E9Y) as template. The resulting model was validated with PROCHECK, ProSA, RMSD and Verify3D, which indicated that the model is reliable, with 76% amino acid sequence identity to the template. Validation gave an RMSD of 1.75 Å, a Verify3D score of 86.02% and a Z-score of -9.55, and Ramachandran plot analysis showed that 81.6% of the amino acid residues have conformations within the most favoured regions. The phylogenetic tree revealed distinct AHAS clusters for bacteria, fungi, algae and plants. Multiple sequence alignment of AHAS protein sequences from different organisms showed conserved regions at different stretches, with homology in amino acid residues. Motif analysis revealed that a conserved AHAS domain is found in all AHAS proteins, suggesting a possible role in cellular and metabolic functions.
Medicinal & Aromatic Plants, ISSN: 2167-0412. Citation: Yaqoob U, Kaul T, Nawchoo IA (2016) In-Silico Analysis, Structural Modelling and Phylogenetic Analysis of Acetohydroxyacid Synthase Gene of Oryza sativa. Med Aromat Plants (Los Angel) 5: 272. doi: 10.4172/2167-0412.1000272


Introduction
Acetohydroxyacid synthase (AHAS; EC 2.2.1.6), also known as acetolactate synthase (ALS), is a plastid enzyme [1] that catalyzes the first reaction in the biosynthesis of the branched-chain essential amino acids isoleucine, leucine and valine [2][3][4], and it is a vital target of multiple herbicides. AHAS belongs to a family of thiamine diphosphate (TPP)-dependent enzymes present in plants, algae, fungi and bacteria [5]. A metal ion cofactor, typically Mg2+ [6], anchors TPP to AHAS. A third cofactor, flavin adenine dinucleotide (FAD), is also required by AHAS. Commercially available herbicides that inhibit AHAS include sulfonylureas (SU), imidazolinones (IMI), triazolopyrimidines (TP), pyrimidinyl-thiobenzoates (PTB) [also known as pyrimidinylsalicylic acids or pyrimidinyloxybenzoic acids] and sulfonyl-aminocarbonyl-triazolinones (SCT) [7,8]. Of these, the sulfonylureas and imidazolinones are the most significant, with the sulfonylureas being the leading group on an active-ingredient basis. AHAS inhibition leads to plant death through amino acid starvation [9]. Mammals lack the pathway for branched-chain amino acid biosynthesis, and the ALS-inhibiting herbicides are therefore thought to be non-toxic to them [10]. These herbicides are highly selective for plants and have a broad range of weed control activity [11][12][13]. The most common naturally occurring mutations in AHAS that confer herbicide resistance are at amino acids Ala122 [14,15], Pro197 [16][17][18], Trp574 [14,16,19] and Ser653 [15,20]. Understanding the structural details of AHAS would therefore be a major step towards engineering new herbicides, developing herbicide-resistant crops and designing antimicrobial drugs.

Materials and Methods
Homology modelling and structural analysis: The Oryza sativa AHAS (OsAHAS) sequence was retrieved from the NCBI database (http://www.ncbi.nlm.nih.gov). Homology modelling was performed by searching the PDB for known protein structures with the target sequence as the query [21]. The target sequence was searched for similar sequences using BLAST (Basic Local Alignment Search Tool) [22] against the Protein Data Bank (http://www.rcsb.org). The BLAST search identified the X-ray structure of AHAS from Arabidopsis thaliana (AtAHAS), which has 76% sequence identity to the target protein (OsAHAS). All AHAS sequences were aligned with ClustalW [23] to assess the similarity among them. 2D and 3D structure alignments were carried out using ClustalW [24] and MATRAS 1.2 [25], respectively. The AHAS sequences were further analysed for the presence of specific AHAS domains and motifs using Motif Scan (myhits.isb-sib.ch/cgi-bin/motif_scan) and ScanProsite (Prosite.expasy.nlm.nih.gov). Conserved motifs were analysed with MEME version 3.5.7 [26] using minimum and maximum motif widths of 20 and 50 residues, respectively, and a maximum of 7 motifs, with all other settings left at their defaults. The theoretical structure of OsAHAS was generated by comparative modelling with Modeller 9.12.

Model validation of OsAHAS: The model was evaluated on the basis of geometrical and stereochemical constraints using the RAMPAGE server (http://mordred.bioc.cam.ac.uk/~rapper/rampage.php), PROCHECK [33], Verify3D [34] and ProSA-Web [35]. The model with the fewest residues in the disallowed region was selected for further study. The RMSD value between the template and the target was calculated using MOE [36]. The best model structure was then compared with the template protein by superimposition using SuperPose Version 1.0 [37].
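As an illustration of the RMSD measure mentioned above, the following minimal Python sketch computes the root-mean-square deviation between two sets of paired atomic coordinates. It assumes the two structures have already been superimposed and that atoms are paired by index (for example, matching Cα atoms). The function name `rmsd` and the toy coordinates are hypothetical; the procedure used by MOE, including how it superimposes structures, is not reproduced here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired coordinate lists.

    Assumes both structures are already superimposed and that atoms are
    paired by index (e.g. matching C-alpha atoms).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared Euclidean distance between the paired atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example with hypothetical coordinates (in angstroms):
# every atom of the model is shifted by 1 angstrom along y.
template = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]
print(round(rmsd(template, model), 2))  # prints 1.0
```

A lower value means the paired atoms lie closer together, which is why a small RMSD between model and template is read as structural similarity.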
Phylogenetic analysis: Phylogenetic analysis of the sequences was performed with Molecular Evolutionary Genetics Analysis (MEGA) software Version 4.1 [38] using the UPGMA method. Each node was tested with the bootstrap approach using 5,000 replicates.
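To make the UPGMA clustering step concrete, the sketch below implements the basic algorithm in plain Python on a small distance matrix: the two closest clusters are merged repeatedly, and distances to the merged cluster are the size-weighted average of the distances to its two members. This is only an illustration under stated assumptions. The function name `upgma`, the taxon labels and the distance values are hypothetical, and neither MEGA's implementation nor the bootstrap test is reproduced.

```python
def upgma(names, matrix):
    """Cluster taxa by UPGMA.

    names:  list of taxon labels
    matrix: symmetric list-of-lists of pairwise distances, same order as names
    Returns (tree, merges): tree is a nested tuple, merges is a list of
    (merged_subtree, node_height) in the order the merges happened.
    """
    clusters = [(name, 1) for name in names]  # (subtree, number of leaves)
    dist = [row[:] for row in matrix]         # work on a copy
    merges = []
    while len(clusters) > 1:
        n = len(clusters)
        # find the closest pair of current clusters (i < j)
        i, j = min(((a, b) for a in range(n) for b in range(a + 1, n)),
                   key=lambda ab: dist[ab[0]][ab[1]])
        (ti, si), (tj, sj) = clusters[i], clusters[j]
        merges.append(((ti, tj), dist[i][j] / 2.0))
        keep = [k for k in range(n) if k not in (i, j)]
        # size-weighted average distance from the merged cluster to the rest
        new_row = [(dist[i][k] * si + dist[j][k] * sj) / (si + sj) for k in keep]
        # rebuild the cluster list and distance matrix with the merged cluster last
        clusters = [clusters[k] for k in keep] + [((ti, tj), si + sj)]
        dist = [[dist[a][b] for b in keep] + [new_row[idx]]
                for idx, a in enumerate(keep)]
        dist.append(new_row + [0.0])
    return clusters[0][0], merges


# Hypothetical four-taxon example with ultrametric distances
names = ["A", "B", "C", "D"]
matrix = [
    [0.0, 2.0, 6.0, 10.0],
    [2.0, 0.0, 6.0, 10.0],
    [6.0, 6.0, 0.0, 10.0],
    [10.0, 10.0, 10.0, 0.0],
]
tree, merges = upgma(names, matrix)
print(tree)                      # ('D', ('C', ('A', 'B')))
print([h for _, h in merges])    # [1.0, 3.0, 5.0]
```

In a bootstrap analysis, the same clustering would be repeated on resampled alignments and each node scored by how often it recurs; that step is omitted here.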

Results and Discussion
Homology modelling and structural analysis: The Oryza sativa AHAS (OsAHAS) protein sequence consists of 644 amino acid residues. The OsAHAS protein sequence was used as the query for a homology-based search for a template structure with the BLAST program against the PDB structural database (http://www.rcsb.org) [30,31]. Sequences that showed maximum identity with a high score and low e-value were aligned, and the alignment was used to build a 3D model of OsAHAS. According to the BLAST search against the PDB [39], three reference proteins (PDB: 3E9Y, 1YBH, 1NOH) showed sequence identities of 76%, 75% and 41%, respectively. Arabidopsis thaliana AHAS (PDB ID: 3E9Y), with 76% sequence identity and an e-value of 0.0, was selected as the template for comparative modelling. Multiple sequence alignment of the AHAS sequences highlighted the conservation of amino acid residues among different species (Supplementary File S1). Structurally conserved regions (SCRs) between the OsAHAS model (target) and the homologous proteins (PDB: 3E9Y, 1YBH, 1NOH) were determined by multiple sequence alignment (Figure 1). SCRs between the OsAHAS model and the template (PDB: 3E9Y) were also determined (Figure 2). An extensive search for motifs and their positions with MEME identified several conserved motifs in the AHAS protein sequences (Figure 3). Multilevel consensus sequences for the MEME-defined motifs, along with their e-values, are shown in Figure 4.
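The sequence identity figures quoted above come from BLAST. As a rough illustration of what such a percentage measures, the Python sketch below counts identical residues in a pairwise alignment. Tools differ in the denominator they use; this sketch assumes identity is computed only over columns where neither sequence has a gap, which is one convention among several and not necessarily the one BLAST reports. The function name `percent_identity` and the short example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must be rows of the same pairwise alignment (equal length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 6 ungapped columns, 5 of them identical
target = "MKT-AYL"
template_seq = "MRTGAYL"
print(round(percent_identity(target, template_seq), 2))  # prints 83.33
```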
The initial model of OsAHAS was built by homology modelling using Modeller 9.12 [40]. The predicted 3D structure of OsAHAS was generated and its N-terminal and C-terminal domains were identified (Figure 5). Each subunit consists of three domains (α, β and γ) plus a C-terminal tail. In Arabidopsis thaliana, each subunit consists of three domains, α (residues 86-280), β (281-451) and γ (463-639), plus a C-terminal tail (646-668) that loops over the active site [41]. The secondary structural features of the Arabidopsis thaliana and OsAHAS sequences were calculated using SOPMA [42] with default parameters (Table 1). In rice, the AHAS protein is composed of 31.52% α-helices, 22.52% extended strands and 9.94% β-turns. In Arabidopsis thaliana, the AHAS protein is composed of 33.22% α-helices, 23.63% extended strands and 9.76% β-turns. Thus α-helices and extended strands account for comparatively large portions of both the rice and the Arabidopsis thaliana AHAS enzymes. Similar results have been observed by McCourt et al. [41] in Arabidopsis thaliana. Several physico-chemical properties of the AHAS sequences were calculated using Expasy's ProtParam server [31]. The results are shown in Table 2. The computed isoelectric point (pI) will be useful for developing a buffer system for protein purification. The very high aliphatic index of the AHAS enzyme sequences indicates that these enzymes may be stable over a wide temperature range. The high extinction coefficient of the rice enzyme indicates the presence of more Cys, Trp and Tyr residues. The instability index values for the AHAS proteins ranged from 36.51 to 41.61 (values below 40 predict a stable protein), indicating that the Arabidopsis thaliana AHAS protein is stable and the rice AHAS protein is unstable, respectively.
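Two of the ProtParam quantities discussed above are simple enough to sketch in code. The aliphatic index is commonly defined as X(Ala) + 2.9 × X(Val) + 3.9 × (X(Ile) + X(Leu)), where X is the mole percent of each residue, and an instability index below 40 is conventionally read as predicting a stable protein. The Python sketch below applies both rules. The function names and the toy peptide are hypothetical, and the instability index itself (which depends on a dipeptide weight table) is not computed here, only classified.

```python
def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")

    def mole_percent(residue):
        return 100.0 * seq.count(residue) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def is_predicted_stable(instability_index, threshold=40.0):
    """Conventional reading: an index below 40 predicts a stable protein."""
    return instability_index < threshold


# Toy peptide (hypothetical): A, V, I, L and G each make up 20% of residues
toy = "AVILGAVILG"
print(round(aliphatic_index(toy), 1))   # prints 234.0

# The two extreme instability index values reported in the text
print(is_predicted_stable(36.51))       # True  (Arabidopsis thaliana AHAS)
print(is_predicted_stable(41.61))       # False (rice AHAS)
```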
Using STRING, the interacting partners of AHAS and its co-expressed genes were predicted in both rice and Arabidopsis (Figure 6). Proteins such as ketol-acid reductoisomerase, dihydroxyacid dehydratase, 2-isopropylmalate synthase and 3-isopropylmalate dehydrogenase were found to be common interacting partners of AHAS in both rice and Arabidopsis thaliana. These proteins are involved in the branched-chain amino acid (BCAA) biosynthesis pathway, which is conserved in prokaryotes, algae, fungi and plants.
Validation of OsAHAS structure: RAMPAGE and PROCHECK analysis of the generated model revealed that 81.6% of residues fall in the most favoured region, 11.7% in the allowed region and 6.7% in the outlier region of the Ramachandran plot (Figure 7). ProSA-Web analysis gave a Z-score of -9.55 for the target model OsAHAS, which lies within the range of scores for proteins determined by NMR and X-ray crystallography. This Z-score is close to that of the template 3E9Y (-11.49), which suggests that the model is reliable and close to experimentally determined structures (Figure 8a). Verify3D showed a score greater than 0.2 for 86.02% of the residues, indicating that the quality of the OsAHAS model is acceptable and reliable. The RMSD value indicates the degree to which two three-dimensional structures are similar: the lower the value, the more similar the structures. The Cα RMSD and backbone RMSD between the OsAHAS model and the AtAHAS template were 1.03 Å and 1.10 Å, respectively, and the overall RMSD was 1.75 Å. These values support the reliability and accuracy of the OsAHAS model generated by Modeller 9.12. The superimposition of the template and the model structure is shown in Figure 8b. The helix and sheet regions of the template and the model superimpose well, whereas large deviations are observed mainly in loop regions. Loop regions are reported to be the main regions where a model protein structure deviates from its template [43].
Phylogenetic analysis: The phylogenetic tree constructed from the AHAS sequences revealed different clusters for bacteria, fungi, algae and plants (Figure 9). The algae A. flos-aquae and O. nigro-viridis differ from the others. Similarly, the bacterium C. botulinum differs from the others. B. distachyon showed the highest sequence similarity to OsAHAS. The results suggest that the AHAS gene family is conserved and has evolved from bacteria and algae.

Conclusions
A homology model of the OsAHAS protein was built using the structure of Arabidopsis thaliana AHAS (PDB ID: 3E9Y) as template. The resulting model was validated with PROCHECK, ProSA, RMSD and Verify3D, which indicated that the model is reliable, with 76% amino acid sequence identity to the template. Multiple sequence alignment of AHAS protein sequences from different organisms showed conserved regions at different stretches, with homology in amino acid residues. Motif analysis revealed that a conserved AHAS domain is found in all AHAS proteins, suggesting a possible role in cellular and metabolic functions.