Author(s): Basu S, Pan A, Dutta C, Das J
Abstract The present report proposes a new method for the chaos game representation (CGR) of different families of proteins. Using concatenated amino acid sequences of proteins belonging to a particular family and a 12-sided regular polygon, each vertex of which represents a group of amino acid residues leading to conservative substitutions, the method can generate the CGR of the family and allows pictorial representation of the pattern characterizing the family. An estimation of the percentages of points plotted in different segments of the CGR (grid points) allows quantification of the nonrandomness of the CGR patterns generated. The CGRs of different protein families exhibited distinct visually identifiable patterns. This implies that different functional classes of proteins follow specific statistical biases in the distribution of different mono-, di-, tri-, or higher order peptides along their primary sequences. The potential of grid counts as the discriminative and diagnostic signature of a family of proteins is discussed.
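The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not list the 12 residue groups, the contraction ratio, or the grid layout, so the `GROUPS` table, the `ratio=0.5` midpoint rule borrowed from the classical CGR, and the `n` x `n` grid over the bounding square are all assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter

# ASSUMPTION: an illustrative split of the 20 amino acids into 12
# conservative-substitution groups; the abstract does not give the real one.
GROUPS = ["A", "G", "C", "DE", "NQ", "KR", "H", "ILV", "M", "FWY", "P", "ST"]

# Vertices of a regular 12-sided polygon inscribed in the unit circle.
VERTICES = [
    (math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
    for k in range(12)
]

# Map each residue letter to the vertex of its group.
RESIDUE_TO_VERTEX = {
    residue: VERTICES[k] for k, group in enumerate(GROUPS) for residue in group
}


def cgr_points(sequence, ratio=0.5):
    """Return the CGR points for one (possibly concatenated) sequence.

    Starting from the polygon centre, each residue moves the current point
    a fraction `ratio` of the way toward that residue's vertex.
    ASSUMPTION: ratio=0.5 is the classical midpoint rule, not a value
    taken from the paper. Unrecognised letters are skipped.
    """
    x, y = 0.0, 0.0
    points = []
    for residue in sequence.upper():
        vertex = RESIDUE_TO_VERTEX.get(residue)
        if vertex is None:
            continue
        x += ratio * (vertex[0] - x)
        y += ratio * (vertex[1] - y)
        points.append((x, y))
    return points


def grid_percentages(points, n=4):
    """Percentage of points in each cell of an n x n grid over [-1, 1]^2.

    ASSUMPTION: a square grid over the bounding box of the polygon; the
    paper's segmentation of the CGR may differ.
    Returns a dict mapping (row, col) to a percentage; empty cells are omitted.
    """
    counts = Counter()
    for x, y in points:
        col = min(int((x + 1) / 2 * n), n - 1)
        row = min(int((y + 1) / 2 * n), n - 1)
        counts[(row, col)] += 1
    total = len(points)
    return {cell: 100.0 * c / total for cell, c in counts.items()} if total else {}


if __name__ == "__main__":
    # Hypothetical toy "family": sequences are concatenated before plotting.
    family = ["MKVLAAGIVGL", "MKTLLAGAVGL", "MRVLSAGIVAL"]
    pts = cgr_points("".join(family))
    for cell, pct in sorted(grid_percentages(pts).items()):
        print(cell, round(pct, 1))
```

Comparing the resulting percentage tables between families, or against the table for a shuffled sequence, gives one simple way to express how far a family's pattern departs from a random one.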
This article was published in J Mol Graph Model and referenced in Journal of Computer Science & Systems Biology.