A Comparison of Three Bioinformatics Pipelines for the Analysis of Preterm Gut Microbiota using 16S rRNA Gene Sequencing Data
*Corresponding Author:
Erica Plummer
Murdoch Childrens Research Institute
The Royal Children’s Hospital, Flemington Rd
Parkville, Victoria 3052 Australia
Tel: +61 1300 766 439
E-mail: [email protected]
Received date: November 17, 2015; Accepted date: December 22, 2015; Published date: December 28, 2015
Citation: Plummer E, Twin J, Bulach DM, Garland SM, Tabrizi SN (2015) A Comparison of Three Bioinformatics Pipelines for the Analysis of Preterm Gut Microbiota using 16S rRNA Gene Sequencing Data. J Proteomics Bioinform 8: 283-291. doi: 10.4172/jpb.1000381
Copyright: © 2015 Plummer E, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Objective and Methods: Analysis of massively parallel 16S rRNA gene sequencing data requires sophisticated bioinformatics pipelines. Several pipelines are available; however, little literature compares their features, advantages and disadvantages, so the choice of method is often unclear. Using gut microbial read data collected from a cohort of very preterm babies, we compared three pipelines commonly used for 16S rRNA gene analysis: MetaGenome Rapid Annotation using Subsystem Technology (MG-RAST), Quantitative Insights into Microbial Ecology (QIIME) and mothur. Using primarily default parameters, we compared the three pipelines in terms of taxonomic classification, diversity analysis and usability.
Results: Overall, the three pipelines detected the same phyla in similar abundances (P>0.05). The pipelines differed in the taxonomic classification of genera from the Enterobacteriaceae family, specifically Enterobacter and Klebsiella (P<0.0001 and P=0.0026, respectively). Analysis time was shortest with QIIME: approximately 1 hour, compared with 10 hours for mothur and 2 days for MG-RAST.
Conclusion: This study showed that QIIME, mothur and MG-RAST produce comparable results. Regardless of which pipeline or algorithm is selected for the analysis of 16S rRNA gene sequencing data, the analysis is likely to generate a reliable high-level overview of sample composition for faecal samples. The genus-level differences we observed highlight a key limitation of 16S rRNA gene analysis for genus- and species-level classification: related bacterial species may be indistinguishable because their 16S rRNA gene sequences are nearly identical. This should be kept in mind when analysing 16S rRNA gene sequencing data.
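The genus-level limitation can be illustrated with a minimal Python sketch. It is not the analysis performed in this study: the read counts are invented toy values, the labels `pipeline_a`, `pipeline_b` and `pipeline_c` are hypothetical placeholders rather than MG-RAST, QIIME or mothur, and the helper names are assumptions made for illustration. The sketch shows how classifiers that split reads differently between Enterobacter and Klebsiella can still agree once the counts are collapsed to the family level.

```python
# Toy illustration only: the counts below are invented, not data from the study.

# Genus-to-family lookup for the genera used in this sketch.
GENUS_TO_FAMILY = {
    "Enterobacter": "Enterobacteriaceae",
    "Klebsiella": "Enterobacteriaceae",
    "Staphylococcus": "Staphylococcaceae",
}

# Hypothetical genus-level read counts from three hypothetical pipelines.
toy_counts = {
    "pipeline_a": {"Enterobacter": 10, "Klebsiella": 70, "Staphylococcus": 20},
    "pipeline_b": {"Enterobacter": 60, "Klebsiella": 20, "Staphylococcus": 20},
    "pipeline_c": {"Enterobacter": 40, "Klebsiella": 40, "Staphylococcus": 20},
}


def relative_abundance(counts):
    """Convert raw read counts into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute relative abundance of an empty sample")
    return {taxon: n / total for taxon, n in counts.items()}


def collapse_to_family(genus_counts):
    """Sum genus-level counts into their parent families."""
    family_counts = {}
    for genus, n in genus_counts.items():
        family = GENUS_TO_FAMILY[genus]
        family_counts[family] = family_counts.get(family, 0) + n
    return family_counts


genus_level = {name: relative_abundance(c) for name, c in toy_counts.items()}
family_level = {
    name: relative_abundance(collapse_to_family(c))
    for name, c in toy_counts.items()
}

for name in toy_counts:
    print(name, "genus:", genus_level[name], "family:", family_level[name])
```

In this toy data the genus-level proportions disagree across the three placeholder pipelines, while the family-level proportions are identical, which mirrors the pattern described above for closely related genera.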