Genomic Methylation Status for Discrimination Among Helicobacter Species: A Bioinformatics Approach

The genus Helicobacter comprises several species of both gastric and enterohepatic bacteria. H. pylori, the type species of the genus, is associated with gastritis, peptic ulcer and gastric cancer in humans. The H. pylori genome has a high number of restriction and modification (R-M) systems, and their diversity is useful for strain typing. To analyse whether such a high number of expressed methyltransferases is a characteristic of the genus Helicobacter, the genomic methylation of five non-pylori Helicobacter spp. (H. canadensis, H. canis, H. felis, H. mustelae and H. pullorum) was determined. The results revealed that the number of R-M systems among non-pylori Helicobacter spp. is smaller than that observed among a group of 221 H. pylori strains (p<0.001), but greater than the mean observed for all sequenced bacterial genomes (p=0.005). 16S ribosomal RNA analysis of sequenced H. pylori strains and the five non-pylori Helicobacter spp. clearly isolates the H. pylori species. Surprisingly, analysis of the genomic methylation status by the MCRM algorithm performs similarly. This suggests that R-M systems do not appear to be spread in a haphazard manner: even though these genes may be subject to acquisition and loss, their expression still allows discrimination among Helicobacter spp.


J Proteomics Bioinform, Volume 1(5): 258-266 (2008). ISSN: 0974-276X. JPB, an open access journal.

Introduction
Helicobacter pylori colonizes the stomach of about half of the human population and is associated with several disease outcomes, such as gastritis, peptic ulcer and gastric cancer (Dunn et al., 1997; Kusters et al., 2006). Other similar spiral bacteria have been isolated from several animals, such as cats, dogs and mice. Helicobacter species can be subdivided into gastric Helicobacter species and enterohepatic (nongastric) Helicobacter species. The two lineages demonstrate a high level of organ specificity, such that gastric Helicobacter spp. in general do not colonize the intestine or liver, and vice versa (Kusters et al., 2006). The genomic DNA of H. pylori is characterised by an unusually high number of restriction and modification (R-M) systems (Nobusato et al., 2000; Lin et al., 2001; Takata et al., 2002). Type II R-M systems are composed of at least two genes: one coding for a restriction endonuclease (REase) that recognizes a specific DNA sequence and cuts both strands, and another coding for a cognate methyltransferase (MTase) that methylates the same DNA sequence, thus protecting the genomic DNA from cleavage by the companion REase (Roberts et al., 2003). Type II R-M systems have been referred to as selfish genetic elements, because the descendants of cells that had lost these genes appeared unable to modify a sufficient number of recognition sites in their chromosomes to protect them from lethal attack by the remaining restriction enzyme molecules (Naito et al., 1995).
Recently we demonstrated that the diversity of R-M systems among H. pylori strains is high enough to be used as a typing method (Vale and Vitor, 2007). Cluster analysis by conventional methods does not consider the propensity for R-M systems to be conserved after acquisition, owing to their "selfish behavior" (Naito et al., 1995). Considering this, we recently developed a new clustering algorithm, the Minimum Common Restriction Modification (MCRM) algorithm, which takes into account the pressure of REases on MTases and is based on the hypothesis that each strain evolves by acquiring new R-M systems without losing those already acquired (Vale et al., 2008). The algorithm assumes that: i) the strain with the fewest R-M systems carries the core set of the most abundant R-M systems expressed among the typed strains; ii) this core set of R-M systems was the first to be acquired by H. pylori, so that its members are widely disseminated (expressed) among the daughter strains (Vale et al., 2008). MCRM analysis of the genomic methylation data from H. pylori strains isolated in different geographic regions revealed clustering according to each strain's continent of origin (Vale et al., 2008), which is in agreement with the coevolution of H. pylori with its human host (Covacci et al., 1999; Linz et al., 2007; Vale et al., 2008). This observation led to the suggestion that R-M systems may trace the geographic distribution of H. pylori and, by extension, the migrations of its human host (Vale et al., 2008).
(Megraud, 1996). Non-pylori Helicobacter spp. were cultured on Mueller-Hinton agar (Oxoid, UK) supplemented with 10% (v/v) defibrinated horse blood (Probiologica, Portugal) and incubated under similar conditions. Genomic DNA was extracted by standard methods.

Genomic methylation status comparison
The mean number of active MTases in non-pylori Helicobacter spp. was compared with: i) the mean number of expressed MTases in the 221 H. pylori strains tested by us; ii) the mean number of M genes predicted by REBASE for all sequenced bacteria (Roberts et al., 2007). The Kruskal-Wallis test was performed using the statistical package SPSS v.15 (SPSS Inc., Chicago, IL).
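For readers without access to SPSS, the Kruskal-Wallis comparison described above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `kruskal_wallis` and the MTase counts in the usage lines are hypothetical, not the study data, and the p-value relies on the chi-square approximation, written here in closed form for two or three groups only. It is not a reproduction of the SPSS procedure used in the study.

```python
import math
from itertools import chain


def kruskal_wallis(*groups):
    """Return (H, p) for 2 or 3 groups; H includes the standard tie correction."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)
    rank_of = {}   # value -> average rank of that value in the pooled sample
    tie_term = 0   # sum of (t^3 - t) over runs of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j
    if tie_term == n**3 - n:
        raise ValueError("all observations are identical")
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    h /= 1 - tie_term / (n**3 - n)
    df = len(groups) - 1
    if df == 1:
        p = math.erfc(math.sqrt(h / 2))  # chi-square survival function, 1 d.f.
    elif df == 2:
        p = math.exp(-h / 2)             # chi-square survival function, 2 d.f.
    else:
        raise ValueError("closed-form p-value implemented for 2 or 3 groups only")
    return h, p


# Hypothetical MTase counts per strain or genome (illustrative values only).
non_pylori = [6, 7, 8, 9, 10]
pylori = [14, 15, 16, 17, 18, 19]
h, p = kruskal_wallis(non_pylori, pylori)
print(f"H = {h:.3f}, p = {p:.4f}")
```

With small groups the chi-square approximation is rough, so an exact or permutation-based p-value may be preferable in practice.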

Ribosomal RNA alignment
16S rRNA sequences of the Helicobacter spp. available in public databases (Table 1) were aligned using ClustalW, producing a cladogram (Chenna et al., 2003).

MCRM clustering
Several dendrograms were produced using the MCRM algorithm after genomic methylation analysis of 7 Helicobacter spp. Most of the dendrograms produced by the MCRM algorithm are similar, but distinct dendrograms can be generated, because different choices of strain or R-M system at ties may result in different clustering. Thus, 10 different dendrograms were produced from the same data in order to increase confidence in the clustering results.
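One way to summarize a set of replicate dendrograms such as these is to count how often a given group of strains forms a clade of its own. The sketch below is an assumption-laden illustration and not part of the MCRM algorithm: it assumes each dendrogram is available as nested tuples of leaf names, and the functions `clades` and `clade_support` as well as the leaf labels are hypothetical.

```python
def clades(tree):
    """Collect every clade (frozenset of leaf names) in a nested-tuple dendrogram."""
    found = set()

    def leaves(node):
        if isinstance(node, str):          # a leaf is a plain strain label
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)                 # record the clade rooted at this node
        return members

    leaves(tree)
    return found


def clade_support(trees, group):
    """Fraction of dendrograms in which `group` forms its own clade."""
    target = frozenset(group)
    return sum(target in clades(t) for t in trees) / len(trees)


# Hypothetical replicate dendrograms (labels are placeholders, not study strains).
run_1 = ((("pylori_1", "pylori_2"), "felis"), ("canis", "pullorum"))
run_2 = (("pylori_1", ("pylori_2", "felis")), ("canis", "pullorum"))
print(clade_support([run_1, run_2], {"pylori_1", "pylori_2"}))
```

A support value computed this way corresponds to statements of the form "X% of the dendrograms cluster these strains together".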

Helicobacter spp. genomic methylation
After genomic DNA hydrolysis with the selected REases it was observed that among non-pylori Helicobacter spp.  Table 1 summarizes the number of active methyltransferases in the tested Helicobacter spp. A statistically significant difference was found between the mean number of active MTases in non-pylori Helicobacter spp. and in H. pylori (p<0.001), and also between non-pylori Helicobacter spp. and the overall sequenced genomes (p=0.005). In decreasing order of the number of expressed MTases, the organisms or groups of organisms ranked as follows: H. pylori, non-pylori Helicobacter spp., and all sequenced bacteria available at REBASE (Roberts et al., 2007).
Table 2.

MCRM clustering analysis
After genomic DNA hydrolysis with the selected REases, the coded data were analysed using the MCRM algorithm (Vale et al., 2008). The Simpson index of diversity, which reflects the capacity of the method to distinguish unrelated strains, was 100% (Hunter and Gaston, 1988). The dendrogram produced is presented in Figure 2. Surprisingly, this analysis based on the genomic methylation status clearly isolated the H. pylori species from non-pylori Helicobacter spp., as was observed when the analysis focused on the 16S rRNA. Out of the 10 dendrograms produced, 60% cluster H. pylori together and 40% of the dendrograms also discriminate between H. pylori and non-pylori Helicobacter spp. All of these last-mentioned dendrograms gathered H. pylori strains (data not shown  i.e. one MTase common to all Helicobacter species used).
This MTase common to all tested Helicobacter spp. was M.NaeI (Table 3).
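The two quantities reported in this section, the Simpson index of diversity and the MTase shared by all species, can both be derived from a table of methylation profiles. The following is a minimal sketch, assuming each strain is represented by the set of MTases found active in it. The index is written in the Hunter and Gaston (1988) form, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where N is the number of strains and n_j the number of strains sharing profile j. Function names, strain names and every MTase label except M.NaeI are hypothetical placeholders.

```python
from collections import Counter


def simpson_diversity(profiles):
    """Simpson's index of diversity over identical-profile groups (types)."""
    counts = Counter(frozenset(p) for p in profiles.values())
    n = len(profiles)
    if n < 2:
        raise ValueError("need at least two strains")
    return 1 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def shared_mtases(profiles):
    """MTases active in every strain (intersection of all profiles)."""
    return set.intersection(*(set(p) for p in profiles.values()))


# Hypothetical profiles: only M.NaeI is taken from the text above.
profiles = {
    "strain_A": {"M.NaeI", "M.X1", "M.X2"},
    "strain_B": {"M.NaeI", "M.X1"},
    "strain_C": {"M.NaeI", "M.X3"},
}
print(simpson_diversity(profiles))  # 1.0, since every profile is unique
print(shared_mtases(profiles))      # {'M.NaeI'}
```

An index of 1.0 (100%) means that no two strains share an identical profile, which is what the reported value indicates for the strains tested here.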

Discussion
When comparing the number of MTases expressed in each Helicobacter spp., it was found that H. pylori expresses a higher number of MTases than the tested non-pylori Helicobacter spp. (p<0.001), and that non-pylori Helicobacter spp. express a higher number than all sequenced bacteria analysed by REBASE (p=0.005). This analysis indicates that the increased number of expressed MTase genes is probably a characteristic of the Helicobacter genus and not only of H. pylori. To our knowledge, the expressed MTases in non-pylori Helicobacter spp. have only been reported for the sequenced genomes of H. acinonychis Sheeba (Eppinger et al., 2006) (Edmonds et al., 1992) found that the GATC methylation is absent in H. mustelae (Table 3).
The only MTase found to be expressed among all tested Helicobacter spp. is M.NaeI (Table 3). Previously we reported that this MTase is probably conserved in all H. pylori strains (Vale and Vitor, 2007; Vale et al., 2008). To confirm whether this MTase is common to the Helicobacter genus, other non-pylori Helicobacter spp. should be included in the study, and for each species several strains should be characterized.
Analysis of 16S rRNA gene sequences has become the primary method for determining prokaryotic phylogeny, which is the current basis for prokaryotic systematics. Although it has been described that Helicobacter is susceptible to horizontal transfer of the 16S rRNA gene, so that this gene can be misleading in Helicobacter spp. identification (Vandamme et al., 2000; Dewhirst et al., 2005), the 16S rRNA cladogram clearly discriminates the Helicobacter spp. used in the present study, as expected. Moreover, Figure 1 presents gastric Helicobacter spp. and enterohepatic Helicobacter spp. in different clusters, as described elsewhere (Dewhirst et al., 2005). However, the cladogram obtained from the 60 kDa heat-shock protein (HSP60), referred to as a better marker for Helicobacter species phylogeny (Mikkonen et al., 2004), was similar to the one obtained with 16S rRNA (data not shown).
A surprising result was the capacity of genomic methylation and the MCRM algorithm to cluster gastric Helicobacter spp. and enterohepatic Helicobacter spp. separately in 40% of the dendrograms produced. This methodology also discriminates H. pylori from non-pylori Helicobacter spp. in 60% of the dendrograms produced. When genomic methylation and MCRM analysis are compared with the current phylogeny approach, the results are remarkably similar. The genomic methylation status appears to be an interesting new tool to characterize genomes with a high number of expressed MTases. It has been described that R-M systems are subject to horizontal transfer (Jeltsch and Pingoud, 1996; Gressmann et al., 2005). The separation between gastric Helicobacter spp. and
We postulate that, similarly to H. pylori, non-pylori Helicobacter spp. may also present a high diversity of expressed MTases which could be used for strain typing, but this still needs to be confirmed by further investigation. R-M systems probably play an important role in the biology of the Helicobacter genus that has not yet been ascertained.
In conclusion, a high number of expressed MTases was observed in non-pylori Helicobacter spp., as was previously determined for H. pylori. The dendrogram produced with the MCRM algorithm and the 16S rRNA alignment discriminated Helicobacter species in a similar way. The results suggest that some R-M systems do not appear to be spread in a haphazard manner, since genomic methylation analysis allows discrimination among Helicobacter spp. Future work should include increasing the number of Helicobacter species analysed, and also the number of strains tested from each species, in order to confirm the results of the present study.