Author(s): Jiménez-Montaño MA, Ebeling W, Pohl T, Rapp PE
Abstract: The paper is devoted to the analysis of digitized sequences of real numbers and discrete strings by means of the concepts of entropy and complexity. Special attention is paid to the random character of these quantities and their fluctuation spectrum. As applications, we discuss neural spike trains and DNA sequences. We consider a given sequence as one finite-length realization of a certain random process. The other members of the ensemble are defined by appropriate surrogate sequences and surrogate processes. We show that n-gram entropies and the context-free grammatical complexity have to be considered as fluctuating quantities, and we study the corresponding distributions. Different complexity measures reveal different aspects of a sequence. Finally, we show that the diversity of the entropy (which takes small values for pseudorandom strings) and the context-free grammatical complexity (which takes large values for pseudorandom strings) nonetheless give consistent results when the rankings of sample sequences taken from molecular biology, neuroscience, and artificial control sequences are compared.
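The idea of treating an n-gram entropy as a fluctuating quantity can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not specify how the surrogates are built, so the sketch assumes the simplest choice, random shuffles that preserve symbol frequencies, and uses the plain plug-in (frequency-count) entropy estimate. The function names `ngram_entropy` and `shuffled_surrogate_entropies` are hypothetical.

```python
import math
import random
from collections import Counter


def ngram_entropy(seq, n):
    """Plug-in Shannon entropy (in bits) of the overlapping n-grams of seq."""
    total = len(seq) - n + 1
    if total <= 0:
        raise ValueError("sequence is shorter than n")
    counts = Counter(tuple(seq[i:i + n]) for i in range(total))
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def shuffled_surrogate_entropies(seq, n, n_surrogates=200, seed=0):
    """n-gram entropies of random shuffles of seq.

    Assumption: shuffle surrogates (same symbol counts, order destroyed)
    stand in for the surrogate ensemble mentioned in the abstract.
    """
    rng = random.Random(seed)
    symbols = list(seq)
    values = []
    for _ in range(n_surrogates):
        rng.shuffle(symbols)
        values.append(ngram_entropy(symbols, n))
    return values


if __name__ == "__main__":
    sequence = "AB" * 20  # strongly ordered toy sequence
    observed = ngram_entropy(sequence, 2)
    surrogates = shuffled_surrogate_entropies(sequence, 2)
    mean = sum(surrogates) / len(surrogates)
    spread = math.sqrt(sum((v - mean) ** 2 for v in surrogates) / len(surrogates))
    print(f"observed 2-gram entropy: {observed:.3f} bits")
    print(f"surrogate mean and spread: {mean:.3f} +/- {spread:.3f} bits")
```

Comparing the observed value with the spread of the surrogate values gives a rough sense of whether the sequence differs from its shuffled ensemble; for short sequences the plug-in estimate is biased, which is one reason the distribution rather than a single number is of interest.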
This article was published in Biosystems
and referenced in Journal of Data Mining in Genomics & Proteomics