Implementation of Decision Tree Using Hadoop MapReduce
Tianyi Yang1* and Anne Hee Hiong Ngu2
*Corresponding Author: Tianyi Yang
Texas Center for Integrative Environmental Medicine
Tel: +1 512-245-2111
E-mail: [email protected]
Received Date: December 21, 2016; Accepted Date: February 04, 2017; Published Date: February 11, 2017
Citation: Yang T, Ngu AHH (2017) Implementation of Decision Tree Using Hadoop MapReduce. Int J Biomed Data Min 6: 125. doi: 10.4172/2090-4924.1000125
Copyright: © 2017 Yang T, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Hadoop is one of the most popular general-purpose computing platforms for the distributed processing of big data. HDFS is Hadoop's implementation of a distributed file system; it stores large amounts of data reliably while serving that data to Hadoop's processing components. MapReduce is the main processing engine of Hadoop. In this study, we used HDFS and MapReduce to implement a well-known learning algorithm, the decision tree, in a way that scales to large input problem sizes. Computational performance is evaluated as a function of node count and problem size.
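To make the idea of a MapReduce-style decision tree concrete, the following is a minimal sketch of one commonly used decomposition, not necessarily the one adopted in this study. It assumes categorical attributes and an information-gain (ID3-style) split criterion, and it simulates the map and reduce phases in memory in plain Python rather than on a Hadoop cluster. All function names (`map_record`, `reduce_counts`, `best_split`) and the toy records are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from math import log2


def map_record(record):
    """Map phase (sketch): emit ((attribute_index, value, label), 1) per attribute.

    Assumes the class label is the last field of the record.
    """
    *features, label = record
    for index, value in enumerate(features):
        yield (index, value, label), 1


def reduce_counts(pairs):
    """Reduce phase (sketch): sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def entropy(counts):
    """Shannon entropy (in bits) of a collection of class counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)


def best_split(counts, n_attributes):
    """Pick the attribute with the highest information gain from reduced counts."""
    # Every record contributes exactly once to attribute 0, so its counts
    # give the overall class distribution.
    class_totals = defaultdict(int)
    for (index, _value, label), n in counts.items():
        if index == 0:
            class_totals[label] += n
    total = sum(class_totals.values())
    base = entropy(class_totals.values())

    gains = {}
    for attribute in range(n_attributes):
        by_value = defaultdict(lambda: defaultdict(int))
        for (index, value, label), n in counts.items():
            if index == attribute:
                by_value[value][label] += n
        remainder = sum(
            sum(labels.values()) / total * entropy(labels.values())
            for labels in by_value.values()
        )
        gains[attribute] = base - remainder
    return max(gains, key=gains.get), gains


# Hypothetical toy data: (attribute 0, attribute 1, class label).
records = [
    ("sunny", "hot", "no"),
    ("sunny", "mild", "no"),
    ("rainy", "hot", "yes"),
    ("rainy", "mild", "yes"),
]

# Simulate the shuffle by chaining all mapper outputs into one reducer.
emitted = [pair for record in records for pair in map_record(record)]
counts = reduce_counts(emitted)
best, gains = best_split(counts, n_attributes=2)
print(best, gains)
```

In this sketch the mappers only emit counts, so the work that touches every record is spread across nodes, while the split selection operates on the much smaller reduced table. Growing a full tree would repeat this job once per node or per tree level, which is an assumption about the overall design rather than a detail stated here.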