Experimental Validation of a Probabilistic Framework for Microarray Data Analysis
Claudio A. Gelmi1, Purusharth Prakash2, Jeremy S. Edwards3,4 and Babatunde A. Ogunnaike2*
*Corresponding Author:
Babatunde A. Ogunnaike
Department of Chemical Engineering
University of Delaware, Newark
DE 19716, USA
E-mail: [email protected]
Received date: April 26, 2011; Accepted date: August 01, 2011; Published date: September 25, 2011
Citation: Gelmi CA, Prakash P, Edwards JS, Ogunnaike BA (2011) Experimental Validation of a Probabilistic Framework for Microarray Data Analysis. J Biomet Biostat 2:114. doi:10.4172/2155-6180.1000114
Copyright: © 2011 Gelmi CA, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
With the primary objective of developing fundamental probability models for drawing rigorous statistical inference from microarray data, we presented in a previous publication theoretical results for characterizing the entire microarray data set as an ensemble. Specifically, we established from first principles that, under reasonable assumptions, the distribution of microarray intensities follows the gamma model, and consequently that the underlying theoretical distribution for the entire set of fractional intensities is a mixture of beta densities. This probabilistic framework was then used to develop a rigorous statistical inference methodology whose outcome, for each gene, is an ordered triplet: a raw computed fractional (or relative) change in expression level; an associated probability that this number indicates lower, higher, or no differential expression; and a measure of confidence associated with the stated result. In this paper we validate the probabilistic framework and the associated statistical inference methodology through confirmatory experimental studies of gene expression in Saccharomyces cerevisiae using Affymetrix GeneChips®. The array data were analyzed using the probabilistic framework, and 9 genes (with indeterminate expression status according to the standard 2-fold change criterion, but for which our probabilistic method indicated high expression status probabilities) were selected for higher-precision characterization. In particular, for genes CGR1, GOS1, ICS2, PCL5 and PLB1, the high probabilities of being differentially expressed (up or down) were found to be in excellent agreement with the expression status determined by the independent, high-precision confirmatory experiments.
These confirmatory experiments, carried out with the high-precision, medium-throughput polonies technique, showed that the probabilistic framework performs quite well in correctly identifying the expression status of genes in general, and especially of differentially expressed genes that would otherwise not have been identifiable using the standard 2-fold change criterion.
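The link between gamma-distributed intensities and beta-distributed fractional intensities can be illustrated with a short simulation. The sketch below is not the authors' implementation. It assumes that the fractional intensity of a gene is test / (test + reference), that the two intensities are independent gamma variates sharing a common scale, and it uses arbitrary illustrative shape parameters (2 and 3). Under those assumptions the fraction follows a single Beta(2, 3) density, which is one component of the kind of mixture described above. The function names are hypothetical.

```python
import random
import statistics


def simulate_fractional_intensities(shape_test, shape_ref, scale=1.0,
                                    n=50_000, seed=0):
    """Draw independent gamma intensities and return test / (test + ref).

    Assumption: both channels share the same scale parameter, so the
    fraction is Beta(shape_test, shape_ref) distributed.
    """
    rng = random.Random(seed)
    fractions = []
    for _ in range(n):
        test = rng.gammavariate(shape_test, scale)  # (shape, scale)
        ref = rng.gammavariate(shape_ref, scale)
        fractions.append(test / (test + ref))
    return fractions


def beta_mean_var(a, b):
    """Theoretical mean and variance of a Beta(a, b) density."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


def fold_change_to_fraction(fold):
    """Map a fold change test/ref to the fraction test / (test + ref)."""
    return fold / (1.0 + fold)


# Illustrative shape parameters only; they are not taken from the paper.
fractions = simulate_fractional_intensities(2.0, 3.0)
sample_mean = statistics.fmean(fractions)
sample_var = statistics.pvariance(fractions)
theory_mean, theory_var = beta_mean_var(2.0, 3.0)

print(f"sample mean {sample_mean:.3f} vs Beta mean {theory_mean:.3f}")
print(f"sample var  {sample_var:.4f} vs Beta var  {theory_var:.4f}")

# Under the assumed definition, the 2-fold change thresholds correspond to
# fractional intensities of 2/3 (up) and 1/3 (down).
print(fold_change_to_fraction(2.0), fold_change_to_fraction(0.5))
```

With the assumed definition of fractional intensity, a gene whose fraction lies between 1/3 and 2/3 is indeterminate under the 2-fold change criterion. That is the region in which a probability statement about expression status adds information.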