Visual Mining Methods for RNA-Seq Data: Data Structure, Dispersion Estimation and Significance Testing
Tengfei Yin1*, Mahbubul Majumder2, Niladri Roy Chowdhury2, Dianne Cook2, Randy Shoemaker3 and Michelle Graham3
*Corresponding Author:
Tengfei Yin
Department of GDCB
Virtual Reality Applications Center
Iowa State University, 1620 Howe Hall 2274
Ames, IA 50011-2274, USA
E-mail: [email protected]
Received Date: May 16, 2013; Accepted Date: August 20, 2013; Published Date: August 28, 2013
Citation: Yin T, Majumder M, Chowdhury NR, Cook D, Shoemaker R, et al. (2013) Visual Mining Methods for RNA-Seq Data: Data Structure, Dispersion Estimation and Significance Testing. J Data Mining Genomics Proteomics 4:139. doi: 10.4172/2153-0602.1000139
Copyright: © 2013 Yin T, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
In an analysis of RNA-Seq data from soybeans, initial significance testing with one software package produced a very different gene list from that produced by another. How can this happen? This paper shows how the disparities between the results were investigated and how they can be explained. Contradictions of this type can occur more generally in high-throughput analyses. To explore the model fitting and hypothesis testing, we implemented an interactive graphic for exploring the effect of dispersion estimation on the overall estimation of variance and on differential expression tests. In addition, we propose a new procedure to test for the presence of any structure in biological data.