Classifying Y-Short Tandem Repeat Data: A Decision Tree Approach
*Corresponding Author:
Ali Seman
Center for Computer Sciences
Faculty of Computer and Mathematical Sciences
Universiti Teknologi MARA (UiTM)
40450 Shah Alam, Selangor, Malaysia
E-mail: [email protected]
Received date: October 13, 2013; Accepted date: November 14, 2013; Published date: November 18, 2013
Citation: Seman A, Othman IR, Sapawi AM, Bakar ZA (2013) Classifying Y-Short Tandem Repeat Data: A Decision Tree Approach. J Proteomics Bioinform 6: 271-274. doi: 10.4172/jpb.1000290
Copyright: © 2013 Seman A, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Classification of Y-Short Tandem Repeat (Y-STR) data has recently been approached with both supervised and unsupervised methods. This study continues that effort by classifying Y-STR data with four decision tree models: Chi-squared Automatic Interaction Detection (CHAID), Classification and Regression Tree (CART), Quick, Unbiased, Efficient Statistical Tree (QUEST) and C5. A data mining tool, IBM Statistical Package for the Social Sciences Modeler 15.0 (IBM® SPSS® Modeler 15), was used to evaluate the performance of the models on six Y-STR data sets. Overall, the results showed that the decision tree models were able to classify all six Y-STR data sets significantly well. Among the four models, C5 was the most consistent, producing the highest accuracy score of 91.85%, sensitivity score of 93.69% and specificity score of 96.32%.
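The three scores reported above can be illustrated with a minimal Python sketch. It is not the authors' procedure (the study used IBM SPSS Modeler), and the abstract does not say how sensitivity and specificity were aggregated over several classes, so the one-vs-rest macro-average used here is an assumption. The function name `evaluate` and the class labels "A", "B" and "C" are hypothetical and serve only as an illustration.

```python
from collections import Counter
from statistics import mean


def evaluate(y_true, y_pred):
    """Return (accuracy, macro sensitivity, macro specificity).

    Assumption: sensitivity and specificity are computed one-vs-rest
    for each class and then averaged with equal weight per class.
    """
    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    # Count how often each (true label, predicted label) pair occurs.
    pairs = Counter(zip(y_true, y_pred))

    # Accuracy: share of samples whose predicted label matches the true one.
    accuracy = sum(pairs[(c, c)] for c in labels) / n

    sensitivities, specificities = [], []
    for c in labels:
        tp = pairs[(c, c)]
        fn = sum(pairs[(c, other)] for other in labels if other != c)
        fp = sum(pairs[(other, c)] for other in labels if other != c)
        tn = n - tp - fn - fp
        if tp + fn:  # class occurs among the true labels
            sensitivities.append(tp / (tp + fn))
        if tn + fp:  # at least one sample belongs to another class
            specificities.append(tn / (tn + fp))

    return accuracy, mean(sensitivities), mean(specificities)


# Hypothetical toy labels, not data from the study.
y_true = ["A", "A", "A", "B", "B", "C"]
y_pred = ["A", "A", "B", "B", "B", "C"]
accuracy, sensitivity, specificity = evaluate(y_true, y_pred)
```

Applied to the predictions of each decision tree model on each data set, a function of this kind would yield one accuracy, sensitivity and specificity triple per model and data set, which is the form in which the scores above are compared.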