ISSN: 2379-1764
Advanced Techniques in Biology & Medicine

A Short Review of Deep Learning Neural Networks in Protein Structure Prediction Problems

Kuldip Paliwal*, James Lyons and Rhys Heffernan
Signal Processing Laboratory, School of Engineering, Griffith University, Brisbane, Australia
Corresponding Author: Kuldip Paliwal
Griffith School of Engineering, Griffith University,
Brisbane, QLD 4111, Australia
Tel: +61-7-3735 6536
Fax: +61-7-3735 5198
E-mail: [email protected]
Received: September 09, 2015; Accepted: September 17, 2015; Published: September 24, 2015
Citation: Paliwal K, Lyons J, Heffernan R (2015) A Short Review of Deep Learning Neural Networks in Protein Structure Prediction Problems. Adv Tech Biol Med 3:139. doi: 10.4172/2379-1764.1000139
Copyright: © 2015 Paliwal K, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Determining the structure of a protein from its sequence is a challenging problem. Deep learning is a rapidly evolving field that excels at problems with complex relationships between input features and desired outputs, and deep neural networks have become popular for solving problems in protein science. Various architectures have been proposed, including deep feed-forward neural networks, recurrent neural networks, and, more recently, neural Turing machines and memory networks. This article provides a short review of deep learning applied to protein prediction problems.

Keywords
Deep neural networks; Recurrent neural networks; Protein structure prediction
Introduction
There is a complex dependency between a protein’s sequence and its structure, and determining the structure given a sequence is one of the greatest challenges in computational biology [1]. In recent years Deep Neural Networks (DNNs) and related deep architectures have become popular tools for machine learning, and DNNs are currently state-of-the-art in many problem domains, including speech recognition, image recognition and natural language processing [2]. Conventional machine learning techniques such as support vector machines, random forests and neural networks with a single hidden layer are limited in the complexity of the functions they can efficiently learn, and they often require careful hand-design of features before patterns can be classified. DNNs have recently been shown to outperform these conventional methods in some areas because they are capable of learning intermediate representations, with each layer of the network learning a slightly more abstract representation than the previous layer [3,4]. With enough layers, very complex patterns can be learned. This ability to learn features automatically is particularly helpful for proteins, as it is not known what the ideal mid- or high-level features are for protein structure prediction problems.
DNNs are neural networks with multiple hidden layers (usually more than two) that can efficiently learn complex mappings between features and labels. DNNs excel at problems where the relationship between inputs and labels is very complex and where large amounts of training data are available. These characteristics have made DNNs popular for solving problems in protein science. The basic DNN consists of layers of hidden units connected by trainable weights. The weights are trained using the backpropagation algorithm to minimise the error between the network output and the true output on a training set. Various architectures have been tailored to specific problems: the simplest is the multi-layer feed-forward neural network, convolutional neural networks are used for images, and recurrent neural networks are used for sequence problems.
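As an illustration, the training loop described above can be sketched in a few lines of NumPy. The XOR task, layer sizes, learning rate and sigmoid activations below are illustrative choices only, not taken from any of the predictors reviewed here:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy task: XOR, a mapping a network with no hidden layer cannot learn.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Two hidden layers of 8 units each -- a DNN in miniature.
W1 = rng.normal(0.0, 1.0, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0.0, 1.0, (8, 8)); b2 = np.zeros(8)
W3 = rng.normal(0.0, 1.0, (8, 1)); b3 = np.zeros(1)

lr = 0.5
losses = []
for _ in range(3000):
    # Forward pass: each layer computes a more abstract representation.
    h1 = sigmoid(X @ W1 + b1)
    h2 = sigmoid(h1 @ W2 + b2)
    out = sigmoid(h2 @ W3 + b3)
    losses.append(float(((out - y) ** 2).mean()))
    # Backpropagation: push the output error back through the layers.
    d3 = (out - y) * out * (1.0 - out)
    d2 = (d3 @ W3.T) * h2 * (1.0 - h2)
    d1 = (d2 @ W2.T) * h1 * (1.0 - h1)
    W3 -= lr * h2.T @ d3; b3 -= lr * d3.sum(axis=0)
    W2 -= lr * h1.T @ d2; b2 -= lr * d2.sum(axis=0)
    W1 -= lr * X.T @ d1;  b1 -= lr * d1.sum(axis=0)
```

Each gradient-descent step moves the weights against the error gradient, so the training error typically falls as the loop runs.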
Having a large amount of training data is a requirement for DNNs, as more trainable parameters usually require more training data to learn reliably. When designing neural networks, input features need to be normalised, especially when the features are heterogeneous. Feature selection is also beneficial for achieving good classification and regression performance [5,6].
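A common way to normalise heterogeneous features is to standardise each feature column to zero mean and unit variance, so that no single feature dominates early training. The feature values below are invented purely for illustration:

```python
import numpy as np

# Hypothetical heterogeneous residue features: a profile-style value in
# roughly [-10, 10] alongside a physicochemical property in [0, 1].
features = np.array([[ 5.0, 0.12],
                     [-3.0, 0.90],
                     [ 1.0, 0.45],
                     [ 7.0, 0.33]])

# Z-score standardisation: each column to zero mean, unit variance.
mean = features.mean(axis=0)
std = features.std(axis=0)
normalised = (features - mean) / std
```

The same per-column mean and standard deviation computed on the training set would then be applied unchanged to the test data.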
Feed-forward DNNs
Many current state-of-the-art protein predictors are based on feed-forward DNNs that use a fixed-width window of amino acids centered on the residue being predicted. The window is moved along the protein so that a prediction can be made for each residue.
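The windowing scheme can be sketched as follows; the window width and the 'X' padding character used at the termini are illustrative choices, not those of any particular predictor:

```python
def residue_windows(sequence, half_width=3):
    """Return a fixed-width window centred on each residue.

    Positions past either terminus are filled with 'X' so that every
    residue, including the first and last, gets a full-width window.
    """
    pad = 'X' * half_width
    padded = pad + sequence + pad
    width = 2 * half_width + 1
    return [padded[i:i + width] for i in range(len(sequence))]

windows = residue_windows("MKTAYIA")
# One window per residue; e.g. the first is "XXXMKTA".
```

Each window (or the feature vectors of the residues inside it) is then fed to the network as one input example, and the central residue's label is the target.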
PSIPRED was an early protein secondary structure predictor based on a neural network with a single hidden layer [7]. It achieved accuracies of around 80% when predicting three secondary structure states: helix, sheet and coil. Later predictors, including SPINE-X, Scorpion, DNSS and SPIDER-2, are based on deeper neural networks and increase secondary structure prediction accuracy to around 82% [8-11]. In addition to three-state secondary structure, other protein properties have also been predicted using deep neural networks, including Accessible Surface Area (ASA), phi and psi angles, theta and tau angles, and disorder [8,11-16].
Other Architectures
In addition to standard feed-forward DNN architectures, Recurrent Neural Networks (RNNs) are tailored to sequence prediction problems. RNNs were developed to handle time series such as speech signals. These networks pass information from one time step to the next, so context from earlier in the sequence can be utilized later in the sequence. Bidirectional Recurrent Neural Networks (BRNNs) were later introduced to utilize information from both directions along the entire sequence [17]. RNNs can be considered very deep neural networks, since information may be passed through many time steps. Early RNNs had difficulty learning when required to remember information over long time spans, because gradients propagated through many time steps tend to vanish or explode. Long Short-Term Memory (LSTM) RNNs were proposed to circumvent these problems and have become widely used for sequence prediction tasks [18].
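A minimal forward pass of a vanilla RNN, sketched below in NumPy with arbitrary untrained weights, shows how the hidden state carries context from earlier positions to later ones; real predictors use trained LSTM or bidirectional variants:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes: 4 input features per position, 8 hidden units.
n_in, n_hidden = 4, 8
W_xh = rng.normal(0.0, 0.1, (n_in, n_hidden))      # input-to-hidden
W_hh = rng.normal(0.0, 0.1, (n_hidden, n_hidden))  # hidden-to-hidden
b_h = np.zeros(n_hidden)

# A made-up sequence of 10 positions with 4 features each.
sequence = rng.normal(0.0, 1.0, (10, n_in))

h = np.zeros(n_hidden)
states = []
for x_t in sequence:
    # Each step mixes the current input with the previous hidden state,
    # so h at position t depends on everything seen at positions <= t.
    h = np.tanh(x_t @ W_xh + h @ W_hh + b_h)
    states.append(h)
states = np.array(states)
```

A bidirectional variant would run a second pass over the sequence in reverse and concatenate the forward and backward hidden states at each position, giving each prediction access to context from the entire sequence.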
RNNs have been applied to secondary structure prediction with some success [19-25]. The recurrent connections in an RNN remove the need for large context windows, as the surrounding context is provided by the network itself. RNNs have also been applied to protein disorder prediction [26]. Basic RNNs handle arbitrary-length 1-dimensional input and output sequences, but they can be modified to handle arbitrarily sized 2-dimensional (and higher) inputs and outputs. These 2-D RNNs have been applied to protein contact map prediction, in which a prediction is made for every pair of residues in a protein, as well as to the prediction of disulfide bridges [27-32]. The latest area of research in neural network architectures is the addition of external memory to RNNs, in so-called neural Turing machines and memory networks [33-35]. These networks can be trained to solve problems that basic RNNs are incapable of solving, e.g., given examples of sorted and unsorted data, learning to sort new unseen data. These architectures have not yet been applied to protein prediction problems, and it remains to be seen whether they will succeed where simpler architectures have not.
Conclusion
This article has attempted to give a short non-exhaustive overview of the applications of DNNs to protein structure prediction problems. Deep learning is a rapidly evolving field which excels at problems where there are complex relationships between input features and desired outputs, problems that simpler classifiers are incapable of solving. The main strength of deep learning is the ability to easily take advantage of increases in the amount of data and computational power. One of the catalysts for the success of deep learning for speech and image recognition problems was the emergence of large datasets and sufficient computational power to process them. As more protein data becomes available we hope that deep learning can provide similar improvements to protein structure prediction problems. New deep learning architectures will only accelerate this progress.
References