k-Means Walk: Unveiling Operational Mechanism of a Popular Clustering Approach for Microarray Data

Victor Chukwudi Osamor; Ezekiel Femi Adebiyi; Ebere Hezekiah Enekwa

doi:10.4172/jcsb.1000098

Abstract

Because data analysis with a computational model profoundly influences the interpretation of the final results, users of such tools need a basic understanding of the underlying model in order to design experiments well. Despite the wide variety of clustering techniques, cluster analysis has become a generic term in bioinformatics for discovering the natural grouping(s) of a set of patterns, points or sequences. The aim of this paper is to analyze k-means through a step-by-step "k-means walk", using graphic-guided analysis to give a clear understanding of the operational mechanism of the algorithm. A scatter plot was created from theoretical microarray gene expression data, a simplified view of the data from a typical microarray experiment. We designated the first three data points as the initial centroids and applied the Euclidean distance metric in the k-means algorithm, so that these three data points served as the reference points for forming the clusters. Before each subsequent iteration, a test is conducted to determine whether any centroid has shifted. After convergence, we were able to trace the data points that belong to the same cluster. We observed that, as both the dimension of the data and the gene list increase for the hybridization matrix of microarray data, computational implementation of the k-means algorithm becomes more demanding. Furthermore, an understanding of this approach should stimulate new ideas for further development and improvement of the k-means clustering algorithm, especially within the biology of diseases and beyond. The major advantage will be improved cluster output for the interpretation of microarray experimental results, together with a better understanding that helps bioinformaticians and algorithm experts tweak the k-means algorithm for improved clustering run time.
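
The procedure summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `k_means_walk`, the `max_iter` cap, the handling of empty clusters, and the six two-dimensional sample points are all hypothetical. Only the seeding with the first three data points, the Euclidean distance metric, and the centroid-shift test come from the abstract.

```python
import math


def k_means_walk(points, k=3, max_iter=100):
    """Cluster points, seeding the centroids with the first k data points."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):  # max_iter is an assumed safety cap
        # Assignment step: each point joins its nearest centroid (Euclidean).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Shift test: stop once no centroid has moved.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical expression values for six genes under two conditions.
points = [(1.0, 1.0), (5.0, 5.0), (9.0, 1.0),
          (1.5, 1.2), (5.5, 4.8), (8.5, 1.5)]
labels, centroids = k_means_walk(points)
print(labels)
print(centroids)
```

Each pass scans every point against every centroid in every dimension, so the work per iteration grows with the number of genes, the number of clusters, and the dimension of the data, which is consistent with the abstract's remark about larger hybridization matrices.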

Journal of Computer Science & Systems Biology
