Information Theory Based Feature Selection for Multi-Relational Naïve Bayesian Classifier
Vimalkumar B Vaghela1*, Kalpesh H Vandra2 and Nilesh K Modi3
*Corresponding Author:
Vimalkumar B Vaghela
Department of Computer Science & Engineering
E-mail: [email protected]
Received date: May 09, 2014; Accepted date: June 30, 2014; Published date: July 08, 2014
Citation: Vaghela VB, Vandra KH, Modi NK (2014) Information Theory Based Feature Selection for Multi-Relational Naïve Bayesian Classifier. J Data Mining Genomics Proteomics 5:155. doi:10.4172/2153-0602.1000155
Copyright: 2014 Vaghela VB, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Today, data are commonly stored in relational structures. The usual approach to mining such data is to join several relations into a single relation through foreign key links, a step known as flattening. Flattening can be time consuming and can introduce data redundancy and statistical skew. The critical question is therefore how to mine data directly across multiple relations, and multi-relational data mining (MRDM) is the approach that addresses it. A further issue is that irrelevant or redundant attributes in a relation may contribute nothing to classification accuracy. Feature selection is thus an essential pre-processing step in multi-relational data mining: by filtering out irrelevant or redundant features from the relations, we improve classification accuracy, improve time performance, and make the resulting models easier to comprehend. We propose an entropy-based feature selection method for the Multi-relational Naïve Bayesian Classifier. The method uses the InfoDist and Pearson’s correlation measures to filter irrelevant and redundant features out of the multi-relational database and thereby enhance classification accuracy. We evaluated our algorithm on the PKDD financial dataset and achieved better accuracy than existing feature selection methods.
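To make the filtering idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not define InfoDist, so the sketch assumes the common entropy-based information distance d(X, Y) = H(X|Y) + H(Y|X) = 2H(X, Y) - H(X) - H(Y). The function names, the two thresholds, and the toy data are hypothetical, and the sketch works on the features of a single relation rather than a full multi-relational database.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def info_distance(x, y):
    """Assumed InfoDist: d(X, Y) = 2*H(X, Y) - H(X) - H(Y).

    A small distance to the class label suggests a relevant feature.
    """
    joint = entropy(list(zip(x, y)))
    return 2 * joint - entropy(x) - entropy(y)


def pearson(x, y):
    """Pearson correlation coefficient of two numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(table, label, max_dist, max_corr):
    """Hypothetical filter: keep features that are close to the label
    (relevant) and not strongly correlated with an already kept
    feature (non-redundant)."""
    ranked = sorted(table, key=lambda f: info_distance(table[f], label))
    kept = []
    for name in ranked:
        if info_distance(table[name], label) > max_dist:
            continue  # too far from the class label: treated as irrelevant
        if any(abs(pearson(table[name], table[k])) >= max_corr for k in kept):
            continue  # near-duplicate of a kept feature: treated as redundant
        kept.append(name)
    return kept


# Toy, made-up data: numeric-coded features of one relation.
label = [1, 1, 0, 0, 1, 0, 1, 0]
table = {
    "a": [1, 1, 0, 0, 1, 0, 1, 0],      # carries the label information
    "b": [2, 2, 0, 0, 2, 0, 2, 0],      # scaled copy of "a" (redundant)
    "noise": [1, 0, 1, 0, 1, 0, 0, 1],  # independent of the label
}
selected = select_features(table, label, max_dist=1.0, max_corr=0.9)
print(selected)
```

In this sketch the distance handles relevance and the correlation handles redundancy; in a multi-relational setting the same filter would presumably be applied to the attributes of each relation reachable from the target relation, which is an assumption rather than something the abstract states.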