Author(s): Lisnić B, Svetec IK, Šarić H, Nikolić I, Zgaga Z
Abstract Palindromic sequences are important DNA motifs involved in the regulation of different cellular processes, but are also a potential source of genetic instability. In order to initiate a systematic study of palindromes at the whole genome level, we developed a computer program that can identify, locate and count palindromes in a given sequence in a strictly defined way. All palindromes, defined as identical inverted repeats without spacer DNA, can be analyzed and sorted according to their size, frequency, GC content or alphabetically. This program was then used to prepare a catalog of all palindromes present in the chromosomal DNA of the yeast Saccharomyces cerevisiae. For each palindrome size, the observed palindrome counts were significantly different from those in the randomly generated equivalents of the yeast genome. However, while the short palindromes (2-12 bp) were under-represented, the palindromes longer than 12 bp were over-represented, AT-rich and preferentially located in the intergenic regions. The 44-bp palindrome found between the genes CDC53 and LYS21 on chromosome IV was the longest palindrome identified and contained only two C-G base pairs. Avoidance of coding regions was also observed for palindromes of 4-12 bp, but was less pronounced. Dinucleotide analysis indicated a strong bias against palindromic dinucleotides that could explain the observed short palindrome avoidance. We discuss some possible mechanisms that may influence the evolutionary dynamics of palindromic sequences in the yeast genome.
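To make the definition used above concrete, the following is a minimal Python sketch of how palindromes "defined as identical inverted repeats without spacer DNA" could be located and counted. It is not the authors' program. The function names (`find_palindromes`, `count_by_size`, `gc_content`) are hypothetical, and the choice to report only the maximal palindrome at each center, rather than every nested one, is an assumption; the abstract does not state which counting rule was used.

```python
from collections import Counter

# Watson-Crick complements; any other symbol (e.g. N) never matches.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def find_palindromes(seq, min_len=2):
    """Return (start, palindrome) for each maximal spacer-free DNA palindrome.

    A DNA palindrome equals its own reverse complement, so it always has
    even length and is centered between two adjacent bases.
    Assumption: only the longest palindrome per center is reported.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for center in range(1, n):
        left, right = center - 1, center
        # Expand outward while the flanking bases are complementary.
        while left >= 0 and right < n and COMPLEMENT.get(seq[left]) == seq[right]:
            left -= 1
            right += 1
        length = right - left - 1
        if length >= min_len:
            hits.append((left + 1, seq[left + 1:right]))
    return hits


def count_by_size(hits):
    """Tally palindromes by length in bp."""
    return Counter(len(pal) for _, pal in hits)


def gc_content(pal):
    """Fraction of G and C bases in a palindrome."""
    return (pal.count("G") + pal.count("C")) / len(pal)


if __name__ == "__main__":
    hits = find_palindromes("TGCATTTAATT")
    for start, pal in hits:
        print(start, pal, len(pal), gc_content(pal))
    print(count_by_size(hits))
```

Center expansion takes time proportional to the sequence length plus the total palindrome length found, which is practical for a genome of yeast size. Comparing the resulting counts against randomly generated sequences of the same base composition, as the abstract describes, would require a separate shuffling step that is not shown here.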
This article was published in Curr Genet
and referenced in Journal of Data Mining in Genomics & Proteomics