Bacterial Serine Proteases: Computational and Statistical Approach to Understand Temperature Adaptability

ISSN: 0974-276X

Journal of Proteomics & Bioinformatics

  • Research Article   
  • J Proteomics Bioinform, Vol 10(12)
  • DOI: 10.4172/jpb.1000459

Tilak Raj1, Nikhil Sharma1, Savitri2 and Tek Chand Bhalla2*
1Bioinformatics Centre, Himachal Pradesh University, Summer Hill, Shimla-171005, Himachal Pradesh, India
2Department of Biotechnology, Himachal Pradesh University, Summer Hill, Shimla-171005, Himachal Pradesh, India
*Corresponding Author: Tek Chand Bhalla, Department of Biotechnology, Himachal Pradesh University, Summer Hill, Shimla-171005, Himachal Pradesh, India, Tel: +91-177-2832153

Received Date: Nov 10, 2017 / Accepted Date: Dec 29, 2017 / Published Date: Dec 30, 2017


Abstract

Proteases are hydrolases that cleave the peptide bond joining two amino acids. The discovery of chymotrypsin in the pancreas, the first serine protease to be identified, transformed the study of these enzymes both in living systems and in industrial applications. Computational tools for identifying and analysing proteases from organisms inhabiting extreme habitats have opened avenues for studying which sequence and structural features allow these enzymes to withstand extremes of pH or temperature. With this in view, sixteen amino acid sequences of serine proteases from mesophilic, thermophilic, hyperthermophilic and psychrophilic organisms were critically analysed to identify the variation in physicochemical properties and amino acid composition responsible for their adaptation to various extreme conditions. Analysis of the physicochemical properties showed the negatively charged residues (Asp+Glu) to be statistically significant and to contribute to the stability of the proteases. Multiple sequence alignment of the amino acid sequences showed the catalytic triad (Asp-130, His-163 and Ser-315) to be conserved in all four groups. The amino acids Ala (A), Arg (R), Asn (N), Asp (D), Cys (C), Gly (G), Phe (F), Tyr (Y) and Val (V) were found to be statistically significant. Cysteine (C) content was exceptionally high in the psychrophilic serine proteases in comparison with their counterparts. Phylogenetic analysis using the Neighbour Joining (NJ) method separated the thermophilic, mesophilic, hyperthermophilic and psychrophilic serine proteases into their respective groups.

Keywords: Proteases; Thermophiles; Mesophiles; Hyperthermophiles; Psychrophiles; Physicochemical properties; Amino acids


Introduction

Serine proteases are the most studied class of proteases and have a histidine, an aspartic acid and a serine residue at the catalytic centre. Microbial serine proteases have attracted growing interest in the last decade because they find applications mainly in leather tanning, detergent formulation and diagnostics [1-4]. Given their wide acceptability and high industrial demand, efforts are being made either to look for novel proteases [5] or to tailor these proteins to withstand extremes of pH and temperature [6]. Conventional methods, which involve isolating microbes and screening them for desired products, are popular and widely followed in industrial microbiology, yet they are time consuming, tedious and cost intensive [7,8]. Newer tools and techniques in computational biology have generated a large amount of data in biological databases, which has opened new opportunities for researchers to analyse the attributes of proteins responsible for their stability at extreme pH and temperature [9,10]. A comparative study of the important properties and amino acid variation of proteins from organisms thriving in extreme conditions is an expensive venture when traditional in vitro approaches are used. Advances in computational biology and bioinformatics make it possible to analyse and compare gene and protein sequence data and to predict the site-specific amino acids, motifs or domains of proteins responsible for their stability under extremes of temperature, pH, salt, pressure and organic solvent concentration [11-13]. Although some information on serine proteases of microbes from various environments is available, an overall comparison of psychrophilic, mesophilic, thermophilic and hyperthermophilic proteases has not been carried out to date [14,15].

Some important physicochemical properties of enzymes, e.g. molecular mass, theoretical pI, amino acid composition, number of negatively and positively charged residues, extinction coefficient, instability index and grand average of hydropathicity, strongly influence their applications and need to be carefully studied. Besides these properties, variation in the total count of amino acids has been found to play a significant role in the stability, selectivity and reactivity of enzymes [11,16,17]. In view of the above, a systematic comparative in silico analysis of the amino acid sequences and physicochemical properties of psychrophilic, mesophilic, thermophilic and hyperthermophilic microbial serine proteases was undertaken and is reported in this communication. The observations will be useful for predicting whether a given serine protease is mesophilic, thermophilic or psychrophilic in terms of its temperature stability.

Material and Methods

Data collection and tools

The amino acid sequences of some microbial serine proteases from thermophiles, hyperthermophiles, mesophiles and psychrophiles were retrieved from NCBI, the UniProt proteomic server and the MEROPS database, and were downloaded in FASTA format. The ProtParam tool, available on the ExPASy proteomic server, was used to compare various physicochemical parameters among the different serine proteases.

To identify and highlight the conserved catalytic triad in the amino acid sequences of the proteases, multiple sequence alignment of the sequences from the various organisms was performed using Clustal Omega, and a phylogenetic tree was generated.
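The kind of sequence-derived parameters compared in this study can be illustrated with a minimal Python sketch. It is not the ProtParam implementation: it assumes the commonly cited aliphatic index formula (Ala + 2.9 Val + 3.9 (Ile + Leu), in mole percent) and the Kyte-Doolittle hydropathy scale for GRAVY, and the function name `protein_parameters` and the toy sequence are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale for GRAVY).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def protein_parameters(sequence):
    """Return simple composition-based parameters for a protein sequence."""
    seq = sequence.upper()
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if not seq or unknown:
        raise ValueError(f"empty sequence or non-standard residues: {sorted(unknown)}")

    n = len(seq)
    counts = Counter(seq)
    # Mole percent of each of the twenty standard amino acids.
    mole_pct = {aa: 100.0 * counts[aa] / n for aa in KYTE_DOOLITTLE}

    # Aliphatic index: relative volume occupied by Ala, Val, Ile and Leu side chains.
    aliphatic_index = (
        mole_pct["A"] + 2.9 * mole_pct["V"] + 3.9 * (mole_pct["I"] + mole_pct["L"])
    )
    # GRAVY: mean hydropathy value over all residues.
    gravy = sum(KYTE_DOOLITTLE[aa] for aa in seq) / n

    return {
        "length": n,
        "composition_pct": mole_pct,
        "negatively_charged": counts["D"] + counts["E"],  # Asp + Glu
        "positively_charged": counts["R"] + counts["K"],  # Arg + Lys
        "aliphatic_index": aliphatic_index,
        "gravy": gravy,
    }


if __name__ == "__main__":
    # Hypothetical toy sequence, not one of the sixteen proteases studied.
    params = protein_parameters("AVILDEKR")
    print(params["negatively_charged"], params["positively_charged"])
    print(params["aliphatic_index"], params["gravy"])
```

For real work the sequences would be read from the downloaded FASTA files and passed to the function one at a time; molecular weight, theoretical pI, extinction coefficient and instability index need additional lookup tables and are left out of this sketch.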

Statistical analysis

Analysis of variance (ANOVA) was used to compare each physicochemical parameter among the groups, using the statistical package ‘Assistat version-7.7 beta 2016’. F-tests were applied to determine statistical significance. The Tukey test was applied to all significant effects for pairwise comparison of the mean responses.
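The F-test behind a one-way ANOVA can be sketched in a few lines of plain Python. This is an illustration only and not the Assistat procedure: the function name `one_way_anova_f` is hypothetical, the input values are the Asp + Glu counts listed in Table 2, and the critical values 3.49 (5 %) and 5.95 (1 %) for 3 and 12 degrees of freedom are taken from standard F tables.

```python
# Asp + Glu counts per group, copied from Table 2.
NEGATIVELY_CHARGED = {
    "Thermophiles": [40.0, 29.0, 35.0, 30.0],
    "Mesophiles": [56.0, 57.0, 56.0, 47.0],
    "Hyperthermophiles": [40.0, 68.0, 34.0, 37.0],
    "Psychrophiles": [59.0, 51.0, 80.0, 48.0],
}


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


if __name__ == "__main__":
    f_value, df_b, df_w = one_way_anova_f(list(NEGATIVELY_CHARGED.values()))
    # Critical values for F(3, 12) from standard tables (assumed here).
    significant_5 = f_value > 3.49
    significant_1 = f_value > 5.95
    print(round(f_value, 2), df_b, df_w, significant_5, significant_1)
```

With these inputs the statistic is about 4.13 on (3, 12) degrees of freedom, which lies between the 5 % and 1 % critical values and is consistent with the single asterisk shown for Asp + Glu in Table 2. The Tukey pairwise comparison is not covered by this sketch.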


Results

Computational analysis of physicochemical parameters of various proteases

In the present study, some important physicochemical parameters of the various groups of serine proteases were compared and the significant differences recorded. Overall analysis revealed only the negatively charged residues (Asp+Glu) to be statistically significant among all the groups of serine proteases (Tables 1 and 2). In pairwise comparisons, negatively charged residues (Asp+Glu) were significantly higher (1.61 fold) in mesophiles than in thermophiles. On the other hand, the aliphatic index, which is defined as the relative volume occupied by the aliphatic side chains of a protein, was significantly higher (1.05 fold) in thermophiles. Molecular weight and negatively charged residues were higher (1.16 and 1.35 fold, respectively) in mesophiles than in hyperthermophiles, whereas the aliphatic index was higher (1.16 fold) in hyperthermophiles than in mesophiles. Mesophilic and psychrophilic proteases also showed some significant differences: the molecular weight of the psychrophilic proteases was 1.12 fold higher than that of the mesophilic proteases, whereas positively charged residues and theoretical pI were 1.16 and 1.24 fold higher, respectively, in mesophiles than in psychrophiles. The instability index, which estimates the stability of a protein in a test tube, was the only parameter found to be significantly higher (1.34 fold) in thermophiles than in hyperthermophiles. Negatively charged residues (Asp+Glu) were significantly higher (1.32 fold) in psychrophiles than in thermophiles, and the aliphatic index was significantly higher (1.19 fold) in hyperthermophilic proteases than in psychrophilic proteases.

Group               Sr. No.   Accession number   Microorganism
Thermophiles        1.        Q9AER6             Thermoanaerobacter yonseii
                    2.        P41363             Bacillus halodurans
                    3.        P08594             Thermus aquaticus
                    4.        P80146             Thermus sp. (strain Rt41A)
Mesophiles          1.        P30199             Staphylococcus epidermidis
                    2.        Q8KH46             Enterococcus faecalis
                    3.        H2JJ14             Clostridium sp. BNL1100
                    4.        MER016986          Streptococcus mutans
Hyperthermophiles   1.        F4HL71             Pyrococcus sp. NA2
                    2.        Q5JIZ5             Thermococcus kodakarensis ATCC BAA-918
                    3.        G0EG32             Pyrolobus fumarii
                    4.        B8D5T9             Desulfurococcus kamchaatkensis
Psychrophiles       1.        B8CU08             Shewanella piezotolerans
                    2.        K4M7H8             Methanolobus psychrophilus R15
                    3.        Q480E3             Colwellia psychrerythraea ATCC BAA-681
                    4.        Q8GB52             Vibrio sp. PA-44

Table 1: Sources of some microbial proteases from various environmental conditions and their accession numbers.

Parameters
  Microorganisms       1          2          3          4          Significance
Number of amino acids
  Thermophiles         412.0      361.0      513.0      410.0      ns
  Mesophiles           461.0      412.0      564.0      447.0
  Hyperthermophiles    422.0      663.0      401.0      411.0
  Psychrophiles        608.0      529.0      789.0      530.0
Molecular weight (Da)
  Thermophiles         44503.2    38115.8    53913      42876.4    ns
  Mesophiles           51813.9    45570.2    59331.1    49196.3
  Hyperthermophiles    44986.0    70955.1    42709.8    44143.0
  Psychrophiles        61541.0    55101.8    80857.1    55682.5
Theoretical pI
  Thermophiles         9.2        6.6        6.9        6.2        ns
  Mesophiles           9.4        4.9        5.2        4.9
  Hyperthermophiles    5.3        4.8        9.0        5.2
  Psychrophiles        4.7        4.9        4.4        4.6
Negatively charged residues (Asp + Glu)
  Thermophiles         40.0       29.0       35.0       30.0       *
  Mesophiles           56.0       57.0       56.0       47.0
  Hyperthermophiles    40.0       68.0       34.0       37.0
  Psychrophiles        59.0       51.0       80         48.0
Positively charged residues (Arg + Lys)
  Thermophiles         49.0       27.0       35.0       27.0       ns
  Mesophiles           75.0       43.0       45.0       34.0
  Hyperthermophiles    33.0       46         43.0       30.0
  Psychrophiles        35.0       36.0       44.0       31.0
Extinction coefficient (M-1 cm-1) at 280 nm
  Thermophiles         45965      30370      109585     56060      ns
  Mesophiles           49405      33810      57300      60740
  Hyperthermophiles    81835      123540     55030      79315
  Psychrophiles        44975      63050      78325      54945
Instability index
  Thermophiles         31.24      29.93      34.86      28.35      ns
  Mesophiles           23.67      28.57      22.52      32.65
  Hyperthermophiles    20.33      18.1       30.02      23.82
  Psychrophiles        22.79      24.68      30.32      40.18
Aliphatic index
  Thermophiles         95.17      90.8       73.68      90.98      ns
  Mesophiles           80.3       90.87      81.15      80.94
  Hyperthermophiles    98.08      81.21      93.42      100.78
  Psychrophiles        73.45      83.53      76.92      79.09
Grand average of hydropathicity (GRAVY)
  Thermophiles         -0.121     -0.111     -0.121     0.054      --------
  Mesophiles           -0.683     -0.333     -0.165     -0.456
  Hyperthermophiles    0.113      -0.186     -0.029     0.155
  Psychrophiles        -0.02      -0.013     -0.115     -0.181

Thermophiles: 1) Thermoanaerobacter yonseii 2) Bacillus halodurans 3) Thermus aquaticus 4) Thermus sp. (strain Rt41A)
Mesophiles: 1) Staphylococcus epidermidis 2) Enterococcus faecalis 3) Clostridium sp. BNL1100 4) Streptococcus mutans
Hyperthermophiles: 1) Pyrococcus sp. NA2 2) Thermococcus onnurineus 3) Pyrolobus fumarii 4) Desulfurococcus kamchaatkensis
Psychrophiles: 1) Shewanella piezotolerans 2) Methanolobus psychrophilus R15 3) Psychroflexus gondwanensis 4) Vibrio sp. PA-44

Table 2: Physicochemical parameters of serine proteases from various microorganisms, calculated using the ProtParam tool on the ExPASy proteomic server.
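How a fold difference between two groups can be derived from Table 2 is sketched below. The text does not state how the fold values were calculated; this sketch assumes they are ratios of group means, which is consistent with three of the reported figures (1.61, 1.05 and 1.19). The dictionary `TABLE2` and the function `fold_difference` are hypothetical names, and only two parameters are transcribed.

```python
from statistics import mean

# Values transcribed from Table 2 (four proteases per group).
TABLE2 = {
    "negatively_charged": {
        "Thermophiles": [40.0, 29.0, 35.0, 30.0],
        "Mesophiles": [56.0, 57.0, 56.0, 47.0],
        "Hyperthermophiles": [40.0, 68.0, 34.0, 37.0],
        "Psychrophiles": [59.0, 51.0, 80.0, 48.0],
    },
    "aliphatic_index": {
        "Thermophiles": [95.17, 90.8, 73.68, 90.98],
        "Mesophiles": [80.3, 90.87, 81.15, 80.94],
        "Hyperthermophiles": [98.08, 81.21, 93.42, 100.78],
        "Psychrophiles": [73.45, 83.53, 76.92, 79.09],
    },
}


def fold_difference(parameter, higher_group, lower_group):
    """Ratio of group means (assumed definition of 'fold') for one parameter."""
    groups = TABLE2[parameter]
    return mean(groups[higher_group]) / mean(groups[lower_group])


if __name__ == "__main__":
    comparisons = [
        ("negatively_charged", "Mesophiles", "Thermophiles"),
        ("aliphatic_index", "Thermophiles", "Mesophiles"),
        ("aliphatic_index", "Hyperthermophiles", "Psychrophiles"),
    ]
    for parameter, high, low in comparisons:
        print(parameter, high, "vs", low, round(fold_difference(parameter, high, low), 2))
```

A ratio of means is only one possible definition; a ratio of medians or of totals would give different numbers, so the choice is worth stating explicitly whenever fold values are reported.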

Computational analysis of the twenty amino acids of bacterial proteases

Overall comparison of the amino acid composition of the various serine proteases showed Ala (A), Arg (R), Asn (N), Asp (D), Cys (C), Gly (G), Phe (F), Tyr (Y) and Val (V) to be statistically significant (Table 3). Comparative analysis between mesophilic and thermophilic serine proteases revealed Ala (1.70 fold), Gly (1.30 fold), Pro (1.8 fold), Arg (1.2 fold) and Val (2.2 fold) to be significantly higher in thermophiles, whereas Asp (1.6 fold) was significantly higher in mesophiles. Ala (A), Arg (R), Gly (G) and Val (V) were significantly higher (1.5, 2.0, 1.4 and 1.8 fold, respectively) in hyperthermophiles than in mesophiles, which had more Asn (N) and Phe (F) (2.2 and 1.3 fold, respectively). The amino acid residues Cys (C), Gly (G) and Val (V) were significantly higher (9.5, 1.5 and 1.28 fold, respectively) in psychrophilic serine proteases, whereas Glu (E), Ile (I) and Phe (F) were significantly higher (1.7, 1.5 and 1.3 fold, respectively) in mesophilic bacteria.

Amino acid composition
  Microorganisms       1          2          3          4          Significance
Ala (A)
  Thermophiles         8.7        11.6       12.5       13.9       *
  Mesophiles           4.6        5.8        10.1       6.5
  Hyperthermophiles    10.2       10.0       11.2       9.5
  Psychrophiles        14.1       10.8       10.8       7.9
Arg (R)
  Thermophiles         2.9        3.9        5.3        4.6        *
  Mesophiles           1.7        2.7        1.1        1.8
  Hyperthermophiles    4.5        1.1        4.0        4.1
  Psychrophiles        2.5        1.1        1.6        3.4
Asn (N)
  Thermophiles         6.1        7.8        4.1        3.9        *
  Mesophiles           10.2       8.7        5.3        10.5
  Hyperthermophiles    4.5        5.3        4.0        4.9
  Psychrophiles        6.9        5.7        5.8        6.6
Asp (D)
  Thermophiles         5.3        2.5        4.3        4.4        **
  Mesophiles           6.5        8.3        6.9        6.5
  Hyperthermophiles    5.9        7.7        5.2        6.1
  Psychrophiles        6.9        6.4        7.1        7.2
Cys (C)
  Thermophiles         0.5        0.0        1.4        1.2        *
  Mesophiles           0.4        0.0        0.0        0.2
  Hyperthermophiles    0.5        0.0        1.2        0.5
  Psychrophiles        1.6        0.9        1.3        1.9
Gln (Q)
  Thermophiles         1.5        3.6        3.1        3.9        ns
  Mesophiles           2.8        1.7        2.7        5.8
  Hyperthermophiles    1.9        3.6        2.7        1.5
  Psychrophiles        1.6        1.7        2.9        5.7
Glu (E)
  Thermophiles         4.4        5.5        2.5        2.9        ns
  Mesophiles           5.6        5.6        3.0        4.0
  Hyperthermophiles    3.6        2.6        3.2        2.9
  Psychrophiles        2.8        3.2        3.0        1.9
Gly (G)
  Thermophiles         9.0        9.1        12.1       10.0       **
  Mesophiles           6.9        7.0        8.3        8.3
  Hyperthermophiles    11.4       10.3       10.2       10.0
  Psychrophiles        13.5       9.8        12.0       10.8
His (H)
  Thermophiles         1.5        2.8        1.2        1.7        ns
  Mesophiles           1.1        1.2        1.2        1.3
  Hyperthermophiles    1.2        1.5        1.5        1.0
  Psychrophiles        1.8        1.3        1.3        0.9
Ile (I)
  Thermophiles         9.2        6.4        2.7        3.4        ns
  Mesophiles           5.9        9.7        6.4        8.5
  Hyperthermophiles    5.2        5.4        6.5        7.1
  Psychrophiles        5.1        6.0        4.6        4.2
Leu (L)
  Thermophiles         7.5        6.9        7.6        10.0       ns
  Mesophiles           8.2        7.8        6.6        6.3
  Hyperthermophiles    6.4        6.3        7.7        8
  Psychrophiles        4.6        5.9        6.1        7.9
Lys (K)
  Thermophiles         9.0        3.6        1.6        2.0        ns
  Mesophiles           14.5       7.8        6.9        5.8
  Hyperthermophiles    3.3        5.9        6.7        3.2
  Psychrophiles        3.3        5.7        3.9        2.5
Met (M)
  Thermophiles         1.5        1.9        1.4        1.2        ns
  Mesophiles           1.7        2.2        0.5        1.3
  Hyperthermophiles    1.4        1.7        1.2        1.7

Thermophiles: 1) Thermoanaerobacter yonseii 2) Bacillus halodurans 3) Thermus aquaticus 4) Thermus sp. (strain Rt41A)
Mesophiles: 1) Staphylococcus epidermidis 2) Enterococcus faecalis 3) Clostridium sp. BNL1100 4) Streptococcus mutans
Hyperthermophiles: 1) Pyrococcus sp. NA2 2) Thermococcus onnurineus 3) Pyrolobus fumarii 4) Desulfurococcus kamchaatkensis
Psychrophiles: 1) Shewanella piezotolerans 2) Methanolobus psychrophilus R15 3) Psychroflexus gondwanensis 4) Vibrio sp. PA-44
** Significant at a level of 1 % of probability (P<0.01)
* Significant at a level of 5 % of probability (0.01 ≤ P<0.05)
ns non-significant (P ≥ 0.05)

Table 3: Comparative analysis of amino acid residues in thermophiles, mesophiles, hyperthermophiles and psychrophiles.

Multiple sequence alignment and phylogenetic analysis

Multiple sequence alignment (MSA) showed the presence of the conserved catalytic triad D-130, H-163 and S-315 (Figure 1), which is responsible for the catalytic activity of serine proteases. A phylogram was generated using the Neighbor Joining method to study the evolutionary relationship among the bacteria for the four groups of serine proteases (Figure 2).
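Checking an alignment for fully conserved Asp, His and Ser columns can be sketched as follows. This is a simplified illustration, not the Clustal Omega output processing used in the study: the function `conserved_columns` and the three aligned toy sequences are hypothetical, and the positions it reports are alignment column numbers, not residue numbers such as D-130.

```python
def conserved_columns(aligned_sequences, residues="DHS"):
    """Return (column, residue) pairs where every aligned sequence carries the
    same residue and that residue is one of `residues` (gaps never qualify)."""
    if not aligned_sequences:
        raise ValueError("no sequences given")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("aligned sequences must all have the same length")

    hits = []
    for index in range(length):
        column = {seq[index] for seq in aligned_sequences}
        if len(column) == 1:
            residue = next(iter(column))
            if residue in residues:
                hits.append((index + 1, residue))  # 1-based column number
    return hits


if __name__ == "__main__":
    # Hypothetical toy alignment; '-' marks a gap.
    toy_alignment = [
        "MKD-AHGTS",
        "MRDLAHG-S",
        "MKDIGHGTS",
    ]
    print(conserved_columns(toy_alignment))
```

On a real alignment many Asp, His and Ser columns may be conserved, so the catalytic triad would still have to be identified among the hits, for example from the annotated active-site residues of a reference sequence.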


Figure 1: Multiple sequence alignment (MSA) of bacterial amino acid sequences of serine proteases from thermophilic, mesophilic, hyperthermophilic and psychrophilic microorganisms with their catalytic triad of D-130, H-163 and S-315.


Figure 2: Phylogenetic tree of bacterial serine protease sequences from thermophilic, mesophilic, hyperthermophilic and psychrophilic organisms constructed by NJ-method of CLC workbench software.


Discussion

Research into the fundamentals of protein stability and the discovery of enzymes that tolerate extremes of temperature and pressure have led to many practical applications in industry and in the scientific community. Understanding how these enzymes tolerate extreme conditions could help in designing proteins with better selectivity, reactivity and stability. In the present study, the amino acid sequences of four groups of serine proteases, i.e. mesophilic, thermophilic, hyperthermophilic and psychrophilic, were distinguished using sequence analysis and statistical methods. Analysis of the physicochemical properties and amino acid composition of the different groups revealed a clear-cut segregation indicating what allows these proteins to work at extremes of temperature. Detailed comparative and statistical analyses confirmed the separation of the mesophiles from the other three classes, i.e. psychrophiles, thermophiles and hyperthermophiles, in terms of amino acid usage. The broad industrial applications of serine proteases have drawn considerable interest from researchers in engineering and producing proteases with better stability and selectivity [6,18], which would bring economic and environmental benefits [19-21]. The diversity of the twenty amino acids and their combinations makes proteins differ in their physicochemical properties as well as in substrate specificity [11,18,22]. Alanine (A) and proline (P), which predominate in both thermostable and hyperthermostable proteases, expose less nonpolar surface area and tend to be buried in the core [23]. Glycine (G) and valine (V) are responsible for compact core packing and functional regulation [24,25]. The hydrophobic core is necessary for folding and stability, so the more hydrophobic interactions a protein has, the more stable it is, i.e. the higher its thermostability [26].

Another important amino acid, proline (P), which was higher in thermophilic proteases, provides rigidity and reduces the free energy of the main chain [27]. Proline is reported to be highly prevalent in thermophilic proteins because its distinctive cyclic side chain locks the backbone and gives exceptional conformational rigidity to turns and loops [28]. Cysteine (C) content was exceptionally high (9.5 fold) in psychrophilic proteases compared with their hyperthermophilic, thermophilic and mesophilic counterparts. Cysteine residues tend to provide flexibility and are capable of forming cavities in the core of the psychrophilic protein structure [29,30], which imparts extra stability to psychrophilic proteins. Cysteine residues also play a dual role: they increase thermostability by forming disulphide bridges, but decrease it when present in free form, as free cysteine is highly sensitive to oxidation at elevated temperature [31]. In keeping with this, the present study found the maximum frequency of Cys (C) in psychrophilic proteases in comparison with their counterparts. Such natural variation, or changes introduced through mutagenesis under controlled temperature conditions, could be used to tailor proteases, which would be a great boon for the food industry and for humankind.


Conclusion

The higher content of Ala (A), Gly (G), Pro (P), Arg (R) and Val (V) in thermophiles and of Asp (D) in mesophiles clearly discriminates the thermophiles from the mesophiles. The significantly higher content of Ala (A), Arg (R), Gly (G) and Val (V) in hyperthermophiles and of Asn (N) and Ser (S) in mesophilic bacteria demarcates the mesophiles from the hyperthermophiles. Similarly, the exceptionally high Cys (C) content of psychrophiles differentiates them from their counterparts. The results of the present study will help in understanding the role of amino acids, especially cysteine, in developing practical strategies for engineering serine proteases and for their potential use in different industries and in biological and bioremediation processes.

Conflict of Interest

The authors declare that they have no conflict of interest.


Acknowledgements

The authors are thankful to the Department of Biotechnology (DBT), New Delhi for the continuous support to the Bioinformatics Centre, Himachal Pradesh University, Summer Hill, Shimla, India.


Citation: Raj T, Nikhil S, Savitri, Bhalla TC (2017) Bacterial Serine Proteases: Computational and Statistical Approach to Understand Temperature Adaptability. J Proteomics Bioinform 10: 329-334. Doi: 10.4172/jpb.1000459

Copyright: © 2017 Raj T, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
