Combination of Ant Colony Optimization and Bayesian Classification for Feature Selection in a Bioinformatics Dataset
Mehdi Hosseinzadeh Aghdam1*, Jafar Tanha2, Ahmad Reza Naghsh-Nilchi3 and Mohammad Ehsan Basiri3
- *Corresponding Author:
- Mehdi Hosseinzadeh Aghdam
Computer Engineering Department
Technical & Engineering Faculty of Bonab
University of Tabriz, Tabriz, Iran
Phone: +98 311 7932671
Fax: +98 311 7932670
E-mail: [email protected], [email protected]
Received date: March 31, 2009; Accepted date: June 14, 2009; Published date: June 15, 2009
Citation: Aghdam MH, Tanha J, Naghsh-Nilchi AR, Basiri ME (2009) Combination of Ant Colony Optimization and Bayesian Classification for Feature Selection in a Bioinformatics Dataset. J Comput Sci Syst Biol 2:186-199. doi:10.4172/jcsb.1000031
Copyright: © 2009 Aghdam MH, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Feature selection is widely used as the first stage of a classification task: by eliminating irrelevant or redundant features, it reduces the dimensionality of the problem, decreases noise, improves speed, and relieves memory constraints. One approach to feature selection employs population-based optimization algorithms, such as particle swarm optimization (PSO) and ant colony optimization (ACO). The ACO algorithm is inspired by observations of real ants searching for the shortest paths to food sources. Protein function prediction is an important problem in functional genomics. Typically, protein sequences are represented by feature vectors, and a major problem with protein datasets is their large number of features, which increases the complexity of classification models. This paper extends the ACO algorithm so that it selects features for a Bayesian classification method. The naive Bayesian classifier is a straightforward and frequently used method for supervised learning; it is based on probability theory and provides a flexible way of dealing with any number of features or classes. The paper then compares the performance of the proposed ACO algorithm with that of a standard binary particle swarm optimization algorithm on the task of selecting features on the Postsynaptic dataset. The criteria used for this comparison are maximizing predictive accuracy and finding the smallest subset of features. Simulation results on the Postsynaptic dataset show that the proposed method reduces the feature set effectively and obtains higher classification accuracy than other feature selection methods.
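To make the two comparison criteria concrete, the following is a minimal sketch of how a candidate feature subset might be scored with a naive Bayesian classifier. It is an illustration under stated assumptions, not the implementation used in this paper: the function names (`train_naive_bayes`, `predict`, `subset_fitness`), the weighted-sum form of the score, the `weight` value, the Laplace smoothing, and the toy data are all hypothetical and are not taken from the Postsynaptic experiments.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, features):
    """Fit a categorical naive Bayes model using only the given feature indices."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, class) -> value -> count
    values_seen = defaultdict(set)       # feature -> distinct training values
    for row, label in zip(rows, labels):
        for f in features:
            value_counts[(f, label)][row[f]] += 1
            values_seen[f].add(row[f])
    return class_counts, value_counts, values_seen


def predict(model, row, features):
    """Return the class with the highest log-posterior for one row."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        count = class_counts[label]
        score = math.log(count / total)  # log prior
        for f in features:
            # Laplace (add-one) smoothing: an assumption of this sketch.
            numerator = value_counts[(f, label)][row[f]] + 1
            denominator = count + len(values_seen[f])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def subset_fitness(train_rows, train_labels, test_rows, test_labels,
                   features, n_total, weight=0.9):
    """Score a subset: reward accuracy and, with lower weight, fewer features.

    The weighted sum and the default weight are assumptions for illustration.
    """
    model = train_naive_bayes(train_rows, train_labels, features)
    hits = sum(predict(model, row, features) == label
               for row, label in zip(test_rows, test_labels))
    accuracy = hits / len(test_rows)
    reduction = 1.0 - len(features) / n_total
    return weight * accuracy + (1.0 - weight) * reduction


# Toy data (invented): feature 0 matches the label, feature 1 is noise,
# feature 2 is constant.
train_rows = [(1, 0, 0), (1, 1, 0), (0, 0, 0), (0, 1, 0), (1, 0, 0), (0, 1, 0)]
train_labels = [1, 1, 0, 0, 1, 0]
test_rows = [(1, 1, 0), (0, 0, 0), (1, 0, 0), (0, 1, 0)]
test_labels = [1, 0, 1, 0]

for subset in ([0], [1], [0, 1, 2]):
    score = subset_fitness(train_rows, train_labels,
                           test_rows, test_labels, subset, n_total=3)
    print(subset, round(score, 4))
```

A search procedure such as ACO or binary PSO would call a scoring function of this kind on each candidate subset it constructs; on the toy data, the single informative feature scores higher than the full feature set because both reach the same accuracy while the smaller subset earns the size bonus.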