Author(s): Fechteler T, Dengler U, Schomburg D
Abstract The prediction of protein structure in insertion/deletion regions (referred to as indels) is an important part of protein model building by homology. Here we combine cluster analysis with database search procedures. Initially, databases of representative protein fragments are constructed using two different clustering algorithms. A protein fragment consists of two anchoring regions and a central region. In the HCAPD (hierarchical clustering after preliminary division) approach, all protein fragments are first divided into classes with similar anchor region structures; within these classes, the fragments are then further clustered using a hierarchical cluster algorithm. The DCANN (deterministic clustering by assignment of all nearest neighbours) approach is a variant of the k-nearest neighbours cluster algorithm. Only geometric scoring criteria are used for database searching. The main advantage of a non-redundant database is its ability to provide structurally different fragments during the search process, which leads to an improvement in structure prediction. Both methods have been tested on 71 insertions and 74 deletions with lengths between one and eight residues.
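The two-stage idea behind HCAPD (a preliminary division by anchor region, followed by hierarchical clustering within each class) can be illustrated with a minimal sketch. This is not the authors' implementation. The field names `anchor` and `central`, the binning of a single anchor descriptor, the complete-linkage criterion, the distance cutoff, and the use of a plain Euclidean distance as the geometric measure are all assumptions made for illustration only.

```python
from itertools import combinations
from math import dist


def preliminary_division(fragments, bin_width=1.0):
    """Stage 1 (assumed form): group fragments whose anchor descriptor
    falls into the same bin of width `bin_width`."""
    classes = {}
    for frag in fragments:
        key = int(frag["anchor"] // bin_width)
        classes.setdefault(key, []).append(frag)
    return classes


def hierarchical_clusters(fragments, cutoff):
    """Stage 2 (assumed form): complete-linkage agglomerative clustering.
    Merging stops once the closest pair of clusters exceeds `cutoff`."""
    clusters = [[f] for f in fragments]

    def linkage(a, b):
        # Complete linkage: largest pairwise distance between two clusters.
        return max(dist(x["central"], y["central"]) for x in a for y in b)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) > cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


def medoid(cluster):
    """Pick the member with the smallest summed distance to all others."""
    return min(
        cluster,
        key=lambda f: sum(dist(f["central"], g["central"]) for g in cluster),
    )


def representative_fragments(fragments, bin_width=1.0, cutoff=0.5):
    """Build a non-redundant list: one representative per cluster per class."""
    reps = []
    for _, members in sorted(preliminary_division(fragments, bin_width).items()):
        for cluster in hierarchical_clusters(members, cutoff):
            reps.append(medoid(cluster))
    return reps
```

Keeping one medoid per cluster is one simple way to obtain a non-redundant set of structurally different fragments; a real fragment database would compare superimposed backbone coordinates rather than the toy feature tuples assumed here.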
This article was published in J Mol Biol and referenced in Medicinal Chemistry