
Journal of Computer Science & Systems Biology

ISSN: 0974-7230

Open Access

Sentiment analysis of twitter data using parallel write approach of replica placement in Hadoop cluster


International Conference on Big Data Analysis and Data Mining

May 04-05, 2015 Kentucky, USA

Divyesh Patel

Scientific Tracks Abstracts: J Comput Sci Syst Biol

Abstract:

In recent years, social networking has become very popular. Twitter, a micro-blogging service, is estimated to have about 200 million registered users, who create about 65 million tweets a day. Twitter users usually express their views about topics of interest to them. The challenge is that each tweet is limited to 140 characters and is therefore very short; it may also contain jargon and misspelled text. It is thus hard to apply traditional NLP techniques, which are designed for formal language, to the Twitter domain. A further challenge is that the total volume of tweets is extremely high and takes a long time to process. In this project, we describe a Hadoop-based distributed system for real-time Twitter sentiment analysis. Our system consists of three components: a lexicon builder, a sentiment classifier, and a new HDFS replica-placement scheme. These components can run on a large-scale distributed system because they are implemented using Hive, Flume, the MapReduce framework, the HBase database model, and other parts of the Hadoop environment. The sentiment classifier and lexicon builder therefore scale with the number of machines and the size of the data. Our experiments also show that the lexicon has good quality for opinion extraction, and that the accuracy of the sentiment classifier can be improved by combining the lexicon with a machine learning technique.

The second part of the project concerns HDFS. We are living in an era of information explosion, in which huge amounts of distributed data are generated and put into storage, and applications manage such data with a distributed file system. For replica placement, HDFS is used to save time and to improve reuse. Building on the current state-of-the-art design and implementation of HDFS, we implement a new parallel-write approach to replica placement in the Hadoop DFS that can improve throughput and data transfer rate.
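As a rough illustration of the lexicon-based classification step, the sketch below shows a mapper that could run under Hadoop Streaming. It is not the implementation described in the abstract: the lexicon file name, its tab-separated format, and the assumption that each input line is a raw tweet are choices made only for this example.

#!/usr/bin/env python
# mapper.py -- minimal sketch of a lexicon-based sentiment mapper for Hadoop Streaming.
# Assumptions (not taken from the abstract): the lexicon is a tab-separated file
# "lexicon.txt" (word<TAB>polarity score) shipped alongside the job, and each
# input line on stdin is the text of one tweet.

import re
import sys

def load_lexicon(path="lexicon.txt"):
    """Load word -> polarity score pairs from a tab-separated file."""
    lexicon = {}
    with open(path) as f:
        for line in f:
            parts = line.strip().split("\t")
            if len(parts) == 2:
                lexicon[parts[0].lower()] = float(parts[1])
    return lexicon

def score(tweet, lexicon):
    """Sum the polarity of every lexicon word that appears in the tweet."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return sum(lexicon.get(tok, 0.0) for tok in tokens)

if __name__ == "__main__":
    lexicon = load_lexicon()
    for line in sys.stdin:
        tweet = line.strip()
        if not tweet:
            continue
        s = score(tweet, lexicon)
        label = "positive" if s > 0 else "negative" if s < 0 else "neutral"
        # Emit label<TAB>1 so a downstream reducer (or a Hive query over the
        # output) can aggregate tweet counts per sentiment class.
        print("%s\t%d" % (label, 1))

A simple counting reducer (not shown) would then aggregate tweets per class; in the architecture described above, components such as Hive and HBase would consume that aggregated output, while the machine learning technique mentioned in the abstract would refine the purely lexicon-based labels.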

Biography:

Divyesh Patel completed his BTech at Charotar University of Science and Technology and is currently pursuing an MTech in Computer Engineering at the same university. His research interests are Data Mining and Business Analytics. He served as General Secretary of his university during his undergraduate studies, has expertise in areas including management and engineering, and has completed various projects with the Government of Gujarat, India.
