Diversity Study of Nitrate Reducing Bacteria from Soil Samples – A Metagenomics Approach

Introduction
Soils cover almost all of Earth's terrestrial area and have indispensable ecological functions in the global carbon, nitrogen and sulfur cycles. Because of their physico-chemical complexity and many micro-niches, they teem with biodiversity, both phylogenetically and functionally [1,2]. A single gram of soil has been estimated to contain thousands to millions of different bacterial, archaeal and eukaryotic species [2], interwoven in extremely complex food webs. Communities of soil microbes carry out a multitude of very small-scale processes that underlie many environmentally important functions [3].
Because nitrogen is an essential element for all living organisms, the availability of a suitable nitrogen source often limits primary productivity in both natural environments and agriculture. Nitrogen levels in the environment are affected by an interacting web of processes, including the oxidation of ammonium and nitrite (nitrification), the dissimilatory reduction of nitrate (NO₃⁻) to ammonium (NH₄⁺) (nitrate ammonification), and the dissimilatory reduction of nitrate via nitrite (NO₂⁻) and gaseous nitrogen oxides (NOₓ) to dinitrogen gas (N₂) (denitrification) [4]. Nitrogen (N) occurs in several oxidation states, from +5 in the most oxidized compound (nitrate, NO₃⁻) to -3 in the most reduced form (ammonium, NH₄⁺), but in biological compounds it is almost exclusively present in the fully reduced state [5].
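The oxidation states quoted above can be checked with simple bookkeeping. The following is a minimal sketch, assuming the usual convention that oxygen counts as -2 and hydrogen as +1; the function name and the species table are illustrative and not part of the original study.

```python
def nitrogen_oxidation_state(n_nitrogen, n_oxygen, n_hydrogen, charge):
    """Average oxidation state of N, assuming O = -2 and H = +1."""
    # The oxidation states of all atoms must sum to the net charge.
    contribution_of_others = n_oxygen * (-2) + n_hydrogen * (+1)
    return (charge - contribution_of_others) / n_nitrogen


# (number of N, number of O, number of H, net charge) for each species
species = {
    "NO3-": (1, 3, 0, -1),
    "NO2-": (1, 2, 0, -1),
    "NO": (1, 1, 0, 0),
    "N2O": (2, 1, 0, 0),
    "N2": (2, 0, 0, 0),
    "NH4+": (1, 0, 4, +1),
}

states = {name: nitrogen_oxidation_state(*args) for name, args in species.items()}

for name, state in states.items():
    print(f"{name:5s} {state:+.0f}")
```

Listing the species in this order shows that each step from nitrate towards dinitrogen or ammonium lowers the oxidation state of nitrogen, which is why these conversions are reductions.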
The nitrogen cycle is one of the most important nutrient cycles in terrestrial ecosystems. Nitrogen cycling involves four key microbiological processes: nitrogen fixation, mineralization (decay), nitrification and denitrification [1]. Microorganisms play very important roles in the nitrogen cycles of various ecosystems. Research has revealed that a greater diversity of microorganisms is involved in the nitrogen cycle than was previously known [6]. It is becoming clear that denitrifying fungi, anammox bacteria, nitrifying archaea [7], aerobic denitrifying bacteria and heterotrophic nitrifying microorganisms are key players in the nitrogen cycle [1].
Environmental bacteria maintain the global nitrogen cycle by metabolizing organic as well as inorganic nitrogen compounds.
Denitrification, through which nitrate (NO₃⁻) or nitrite (NO₂⁻) is reduced to gaseous nitrogen forms such as N₂ and nitrous oxide (N₂O), is critical for maintenance of the global nitrogen cycle [5]. It is thought that most microbial taxa cannot be cultured outside their natural environment; thus, microbial diversity remains poorly described. The explicit functional and ecological roles of individual taxa remain unknown because most microbes resist laboratory cultivation [8]. Therefore, the most basic questions in microbial ecology are still ''who'' is there and ''what'' they are doing. While soils seem to harbor the most complex microbial communities [3], these considerations apply to many other environments as well, such as oceans and sediments [9]. The recently developed metagenomic techniques [10] have therefore greatly extended our knowledge of microbial genetic diversity [11]. With metagenomic technologies, new dimensions in the characterization of complex microbial communities have been reached [12]. Large-scale shotgun sequencing approaches will enable the discovery of many novel genes from environmental samples, independent of cultivation techniques [13].
In this study we used the web-based server MG-RAST (Metagenomics RAST; http://metagenomics.anl.gov/), an automated analysis platform for metagenomes that provides quantitative insights into microbial populations based on sequence data [14]. The taxonomic composition of the metagenome samples was analyzed with the MG-RAST server using similarity to a large non-redundant protein database, M5NR. Using the same non-redundant database, the sequences were also tested for affinity to known metabolic functions against both SEED subsystems and KEGG metabolic pathways, with a maximum e-value of 1e-5 [15]. Although many metabolic functions can be tested, this study focused on microbial contributions to nitrate reduction in nitrogen metabolism, so only enzymes relevant to this area were selected.

Selection of metagenomes
For this study, metagenomes were selected from the public data of the MG-RAST server (http://metagenomics.anl.gov/). Three parameters were considered in the selection: environment (material), environment (biome) and sequence type. Soils from rain forest, temperate broadleaf and temperate grassland biomes were chosen. Whole-genome shotgun was selected as the sequence type to give an overview of the microbes present in the metagenomes.

Analysis of the taxonomic abundance
Abundance is a measure of the number of individuals in a given sample. The taxonomic abundance of the selected metagenomes was analyzed using the following parameters: annotation database (M5NR), e-value cutoff, percentage identity cutoff and alignment length. The e-value and percentage identity cutoffs were set at 1e-5 and 60%, respectively. The minimum alignment length was set to 45, measured in amino acids for protein databases and in base pairs for RNA databases.
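The effect of these three cutoffs can be illustrated with a short sketch. It is not MG-RAST's implementation: the `Hit` record layout, the function names and the example hits are hypothetical, and only the threshold values (1e-5, 60%, 45) come from the text.

```python
from collections import Counter
from dataclasses import dataclass

# Threshold values taken from the text; everything else is illustrative.
MAX_EVALUE = 1e-5
MIN_IDENTITY = 60.0   # percent identity
MIN_ALIGN_LEN = 45    # amino acids (protein) or base pairs (RNA)


@dataclass
class Hit:
    """Hypothetical record for one read-to-database similarity hit."""
    read_id: str
    taxon: str
    evalue: float
    identity: float
    align_len: int


def passes_cutoffs(hit):
    """True if the hit satisfies all three cutoffs at once."""
    return (hit.evalue <= MAX_EVALUE
            and hit.identity >= MIN_IDENTITY
            and hit.align_len >= MIN_ALIGN_LEN)


def taxon_abundance(hits):
    """Count the hits per taxon among those that pass the cutoffs."""
    return Counter(h.taxon for h in hits if passes_cutoffs(h))


# Made-up example hits, for illustration only.
example_hits = [
    Hit("r1", "Bradyrhizobium", 1e-20, 85.0, 60),  # passes
    Hit("r2", "Bradyrhizobium", 1e-3, 90.0, 80),   # e-value too high
    Hit("r3", "Pseudomonas", 1e-8, 55.0, 70),      # identity too low
    Hit("r4", "Pseudomonas", 1e-10, 72.5, 45),     # passes (at length limit)
    Hit("r5", "Rhodobacter", 1e-12, 95.0, 30),     # alignment too short
]

abundance = taxon_abundance(example_hits)
print(dict(abundance))
```

A hit is counted only when it meets all three criteria, so tightening any single cutoff can only lower the reported abundances.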

Characterization of functional attributes
To characterize the functional attributes related to nitrogen metabolism, the MG-RAST server was used with the hierarchical classification option. Databases such as COG, NOG and SEED were selected to determine functional abundances and the relationships between functions. The subsystems database was chosen for comparing annotated sequences. The e-value, percentage identity cutoff and alignment length were the same as those used for taxonomic abundance.

Pathway detection
For the detection of pathways related to nitrate metabolism in a metagenome, the KEGG map tool of the MG-RAST server was used with the following parameters: annotation database for sequence comparison (subsystems); maximum e-value, i.e. the maximum probability of a sequence with higher similarity to the target sequence than the one provided (1e-5); minimum percent identity between the selected metagenome and existing sBLAT sequences (60%); and minimum alignment length, in amino acids for proteins and base pairs for RNA databases (45).

Results and Discussion
Using the selection parameters described in the Materials and Methods section, three metagenomes (Table 1) from different soil samples were selected. Soil-borne microorganisms are one of the earth's greatest sources of biodiversity [16], with estimates ranging between 3,000 and 11,000 microbial genomes per gram of soil [17]. One gram of soil may contain up to 4,000 different species [18]; however, current estimates indicate that less than 1% of these organisms are readily culturable with known cultivation techniques [19]. Because of the huge diversity of soil and its history as a source of commercially important molecules for the agricultural, chemical, industrial and pharmaceutical industries, it remains the most common target for studies of functional metagenomics [20][21][22].

Metagenome analysis
The metagenome analysis determined the occurrence of domains, phyla and lower taxonomic ranks. All three metagenomes (Table 1) were analyzed on the MG-RAST server. This open-source metagenomics RAST service provides a new paradigm for the annotation and analysis of metagenomes [14]. With built-in support for multiple data sources and a back end that houses abstract data types, the metagenomics RAST is stable, extensible and freely available to all researchers. The service has removed one of the primary bottlenecks in metagenome sequence analysis: the availability of high-performance computing for annotating the data. The taxonomic analysis of the bacterial community was carried out down to the species level using the M5NR database in MG-RAST. A suitable reference using similar parameter cut-off values has also been cited. Table 2 summarizes the sequence statistics, including those of the third metagenome, 4453261.3, in which 0 sequences failed quality control (indicating that all reads are close to the mean read length). Dereplication identified 0 sequences (0.0% of total) as artificial duplicate reads (ADRs). Of the 1,130,719 sequences (totaling 465,558,160 bps) that passed quality control, 1,103,922 (97.6%) produced a total of 1,159,527 predicted protein coding regions. Of these 1,159,527 predicted protein features, 629,343 (54.3% of features) were assigned an annotation using at least one of the protein databases (M5NR) and 530,184 (45.7% of features) had no significant similarity to the protein database (ORFans). 574,253 features (91.2% of annotated features) were assigned to functional categories.
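The percentages reported for metagenome 4453261.3 follow directly from the counts. The sketch below recomputes them as a consistency check; the variable names and the `pct` helper are illustrative, while the counts are those given in the text.

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Counts reported in the text for metagenome 4453261.3
passed_qc = 1_130_719            # sequences that passed quality control
reads_with_features = 1_103_922  # sequences yielding predicted coding regions
predicted_features = 1_159_527   # predicted protein features
annotated = 629_343              # features annotated against M5NR
unannotated = 530_184            # features without significant similarity
functional = 574_253             # features assigned to functional categories

# Annotated and unannotated features should add up to all predicted features.
assert annotated + unannotated == predicted_features

summary = {
    "reads with features (% of QC-passed reads)": pct(reads_with_features, passed_qc),
    "annotated (% of features)": pct(annotated, predicted_features),
    "unannotated (% of features)": pct(unannotated, predicted_features),
    "functional (% of annotated features)": pct(functional, annotated),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

Note that the denominators differ: the functional-category percentage is relative to annotated features, not to all predicted features.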

Taxonomic hits distribution
The taxonomic classification of protein-coding genes was assigned against the M5NR (non-redundant protein database) annotation source using the best-hit classification of MG-RAST [14] (Table 3; Figure 2). In the first metagenome, 4446153.3, 140,207 sequences failed quality control (reads more than two standard deviations away from the mean read length are discarded). Of those, dereplication identified 83,075 sequences (10.6% of total) as artificial duplicate reads (ADRs). Of the 642,197 sequences (totaling 279,379,947 bps) that passed quality control, 637,914 (99.3%) produced a total of 677,007 predicted protein coding regions. Of these 677,007 predicted protein features, 341,249 (50.4% of features) were assigned an annotation using at least one of the protein databases (M5NR) and 335,758 (49.6% of features) had no significant similarity to the protein database (ORFans). 314,106 features (92.0% of annotated features) were assigned to functional categories (Table 2). The failure rate appears high for the first metagenome. However, the class distribution of the microbial community shows that Alphaproteobacteria are dominant in this metagenome.
In metagenome 4508941.3, 261,170 sequences failed quality control. Of those, dereplication identified 82,191 sequences (0.7% of total) as artificial duplicate reads (ADRs). Of the 10,805,789 sequences (totaling 1,895,810,832 bps) that passed quality control, 10,223,265 (94.6%) produced a total of 10,166,026 predicted protein coding regions. Of these 10,166,026 predicted protein features, 3,272,265 (32.2% of features) were assigned an annotation using at least one of the protein databases (M5NR) and 6,893,761 (67.8% of features) had no significant similarity to the protein database (ORFans). 2,570,983 features were assigned to functional categories. The observed dominance of B. japonicum is similar to the results of VanInsberghe et al. and Ormenno-Orrilli et al. [26,27]. Similarly, Delmont et al. describe a statistical view of the functional distributions of the Rothamsted soil metagenome, which added to the knowledge of soil microbial communities at the metagenomic level [28].
Nitrate is a major nitrogen source for many bacteria. In the general assimilatory pathway, nitrate is converted via nitrite to ammonia, which is then assimilated into nitrogen metabolism [29]. This metabolic route functions both aerobically and anaerobically and involves assimilatory nitrate reductases, which are repressed by ammonia. Nitrate can also serve as an electron acceptor for anaerobic respiration in the absence of oxygen. In this case nitrate is reduced to nitrite by respiratory nitrate reductases, and the end product of the pathway is dinitrogen. In addition to the respiratory nitrate reductase, this denitrification pathway involves further respiratory reductases for nitrite, nitric oxide and nitrous oxide [24,30]. Assimilatory nitrate reductases have been found in bacteria such as Azotobacter chroococcum, Clostridium perfringens and Ectothiorhodospira shaposhnikovii (ferredoxin-dependent Nas), and Klebsiella pneumoniae and Rhodobacter capsulatus (NADH-dependent).
Three main types of nitrate-reducing systems have been described in bacteria [31][32][33]. The first type is a cytoplasmic assimilatory nitrate reductase, which enables the utilization of nitrate as the nitrogen source for biosynthesis. This enzyme is repressed by ammonium but is not affected by oxygen [33]. The second type is a membrane-bound respiratory nitrate reductase, which catalyses nitrate respiration and the first step of denitrification, allowing ATP synthesis by using nitrate as an alternative electron acceptor under anaerobic conditions. This enzyme is repressed by oxygen but is insensitive to ammonium [31]. Membrane-bound nitrate reductases are associated with denitrification and anaerobic nitrate respiration in Escherichia coli and Paracoccus denitrificans. The membrane-bound dissimilatory nitrate reductase has been shown to be involved in anaerobic nitrate reduction in Paracoccus denitrificans, Pseudomonas aeruginosa, Pseudomonas denitrificans and Pseudomonas stutzeri. The third nitrate-reducing system is a periplasmic nitrate reductase found in some Gram-negative bacteria. This enzyme is repressed by neither ammonium nor oxygen and probably participates in redox balance and/or aerobic nitrate respiration [25]. Nitrate reductases located in the periplasmic compartment have also been described in   denitrificans, and Pseudomonas putida. Reduction of nitrate in the periplasm is not sensitive to the oxygen inhibition of nitrate transport across the cytoplasmic membrane that prevents reduction by the membrane-bound enzyme. Table 4 also lists Escherichia coli, Rhodobacter capsulatus and Rhodobacter sphaeroides, which reduce nitrate to ammonium. Pseudomonas putida has a membrane-bound nitrate reductase with an active site in the cytoplasm. This enzyme allows the oxidation of quinol by nitrate to be coupled to the generation of a transmembrane proton electrochemical gradient and thus has an important role in energy generation under anoxic conditions.

Pathway detection related to nitrate reduction in a metagenome
Interest in nitrate reduction exists for several reasons. First, it is a major mechanism of loss of fertilizer nitrogen, resulting in decreased efficiency of fertilizer use. Second, it has great potential application in the removal of nitrogen from high-nitrogen waste materials such as animal residues. Third, nitrate reduction is an important process contributing N₂O to the atmosphere, where it is involved in stratospheric reactions that result in the depletion of ozone. Fourth, it is the mechanism by which the global nitrogen cycle is balanced. Figures 3-5 show that three types of nitrate-reducing pathway are present in all three selected metagenomes: the dissimilatory nitrate reduction pathway, the assimilatory nitrate reduction pathway and the denitrification pathway.
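One way to reason about pathway presence is to ask which reduction steps of each pathway have been detected. The following is a minimal sketch of that idea and not the KEGG map tool itself: the step labels are descriptive names chosen for illustration (they are not MG-RAST or KEGG identifiers), and the detected set is made up.

```python
# Illustrative step labels only; these are not MG-RAST or KEGG identifiers.
PATHWAY_STEPS = {
    "denitrification": [
        "respiratory nitrate reductase (NO3- -> NO2-)",
        "respiratory nitrite reductase (NO2- -> NO)",
        "nitric oxide reductase (NO -> N2O)",
        "nitrous oxide reductase (N2O -> N2)",
    ],
    "dissimilatory nitrate reduction": [
        "respiratory nitrate reductase (NO3- -> NO2-)",
        "dissimilatory nitrite reductase (NO2- -> NH4+)",
    ],
    "assimilatory nitrate reduction": [
        "assimilatory nitrate reductase (NO3- -> NO2-)",
        "assimilatory nitrite reductase (NO2- -> NH4+)",
    ],
}


def pathway_coverage(detected):
    """Fraction of each pathway's steps that appear in `detected`."""
    return {
        name: sum(step in detected for step in steps) / len(steps)
        for name, steps in PATHWAY_STEPS.items()
    }


def complete_pathways(detected):
    """Names of pathways for which every step was detected."""
    coverage = pathway_coverage(detected)
    return sorted(name for name, frac in coverage.items() if frac == 1.0)


# Hypothetical example: the last denitrification step is missing.
detected_steps = {
    "respiratory nitrate reductase (NO3- -> NO2-)",
    "respiratory nitrite reductase (NO2- -> NO)",
    "nitric oxide reductase (NO -> N2O)",
}

coverage = pathway_coverage(detected_steps)
print(coverage)
print(complete_pathways(detected_steps))
```

Because the first step (nitrate to nitrite) is shared between denitrification and dissimilatory nitrate reduction in this sketch, detecting a respiratory nitrate reductase alone does not distinguish the two pathways; the later steps do.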

Conclusion
Metagenomics can provide valuable insights into the functional ecology of environmental communities. Using metagenome sequences to fully understand how complex microbial communities function and how microbes interact within these niches represents a major challenge for microbiologists today. Microorganisms play important roles in the nitrogen cycles of various ecosystems. Research has revealed that a greater diversity of microorganisms is involved in the nitrogen cycle than previously understood. It is becoming clear that denitrifying fungi, nitrifying archaea, anammox bacteria, aerobic denitrifying bacteria and heterotrophic nitrifying microorganisms are key players in the nitrogen cycle. In this study, the potential taxonomic diversity of nitrate-reducing bacteria and their probable activity were studied from soil metagenomes. The results reveal this diversity, with Bradyrhizobium japonicum dominant in the soil samples. The nitrate-reducing metabolic pathways were also studied, and all three pathways of nitrate reduction, i.e. assimilatory nitrate reduction, respiratory (dissimilatory) nitrate reduction and denitrification, were found to be present in the given metagenomes.