Author(s): Faraggi E, Zhang T, Yang Y, Kurgan L, Zhou Y
Abstract: Accurate prediction of protein secondary structure is essential for accurate sequence alignment, three-dimensional structure modeling, and function prediction. The accuracy of ab initio secondary structure prediction from sequence, however, has only increased from around 77% to 80% over the past decade. Here, we developed a multistep neural-network algorithm by coupling secondary structure prediction with prediction of solvent accessibility and backbone torsion angles in an iterative manner. Our method, called SPINE X, was applied to a dataset of 2640 proteins (25% sequence identity cutoff) previously built for the first version of SPINE and achieved an accuracy (Q3) of 82.0% based on 10-fold cross-validation. That SPINE X surpasses 81% accuracy is further confirmed on an independently built test dataset of 1833 protein chains, a recently built dataset of 1975 proteins, and 117 CASP 9 (Critical Assessment of Structure Prediction techniques) targets, with accuracies of 81.3%, 82.3%, and 81.8%, respectively. The prediction accuracy for the dataset of 2640 proteins improves further to 83.8% if the DSSP assignment used above is replaced by a more consistent consensus secondary structure assignment method. SPINE X is compared to the popular PSIPRED and to CASP-winning structure-prediction techniques. SPINE X predicts the number of helices and sheets correctly for 21.0% of the 1833 proteins, compared to 17.6% by PSIPRED. The comparison further shows that SPINE X consistently makes more accurate predictions for helical residues (6%) without overprediction, while PSIPRED makes more accurate predictions for coil residues (3-5%) and overpredicts them by 7%. The SPINE X server and its training/test datasets are available at http://sparks.informatics.iupui.edu/. Copyright © 2011 Wiley Periodicals, Inc.
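For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch, not the authors' evaluation code. It assumes three-state strings over H (helix), E (strand), and C (coil), and computes Q3 as the percentage of residues whose predicted state matches the observed one. The function names `q3_accuracy` and `count_segments` and the example strings are hypothetical. Counting contiguous runs of H or E is only one plausible reading of "number of helices and sheets"; the paper may define that count differently.

```python
from itertools import groupby


def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label (H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def count_segments(states: str, state: str) -> int:
    """Number of contiguous runs of `state` (assumed segment definition)."""
    return sum(1 for key, _ in groupby(states) if key == state)


# Hypothetical 16-residue example; not data from the paper.
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHHCCCEEEECC"

print(q3_accuracy(predicted, observed))   # one mismatch out of 16 residues
print(count_segments(predicted, "H"), count_segments(observed, "H"))
print(count_segments(predicted, "E"), count_segments(observed, "E"))
```

In this toy case the prediction shortens one helix by a residue, which lowers Q3 but leaves the helix and strand counts unchanged. Per-residue accuracy and segment counts can therefore diverge, which is presumably why the abstract reports both.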
This article was published in J Comput Chem
and referenced in Advanced Techniques in Biology & Medicine