Analysis of Log Data and Statistics Report Generation Using Hadoop
Siddharth Adhikari, Devesh Saraf, Mahesh Revanwar and Nikhil Ankam
B.E. Students, Department of Computer Engineering, Vishwakarma Institute of Information Technology, Pune, India
A web log analyzer is a tool for deriving statistics about websites. With the web log analyzer, web log files are uploaded into the Hadoop distributed framework, where they are processed in parallel in a master-slave structure. Pig scripts are written over the classified log files to answer specific queries. The log files are maintained by the web servers. Analysing them gives an idea about users, such as which IP addresses have generated the most errors and which users visit a web page frequently. This paper discusses these log files, their formats, access procedures and uses, the additional parameters that can be recorded in log files to enable more effective mining, and the tools used to process the log files. It also presents the idea of creating an extended log file and learning user behaviour. Analysing user activities is particularly useful for studying user behaviour in highly interactive systems. The paper presents the details of the methodology used, which focuses on studying the information-seeking process and on finding log errors and exceptions. The next part of the paper describes the working of the web log analyzer and the techniques it uses.
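As an illustration of the kind of query mentioned above (which IP addresses have generated the most errors), the following is a minimal single-machine sketch in Python, not the Pig script or Hadoop job used in this work. It assumes the server writes the Common Log Format and treats any 4xx or 5xx status as an error; the names `LOG_PATTERN`, `map_line`, `errors_per_ip` and the sample log lines are hypothetical and introduced only for illustration.

```python
import re
from collections import Counter

# Assumption: Common Log Format, i.e.
# host ident authuser [date] "request" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def map_line(line):
    """Map step: return (ip, 1) for an error request, otherwise None."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # malformed line: skip it
    if int(match.group("status")) >= 400:  # assumption: 4xx/5xx are errors
        return (match.group("ip"), 1)
    return None


def errors_per_ip(lines):
    """Reduce step: sum the emitted counts for each IP address."""
    counts = Counter()
    for line in lines:
        pair = map_line(line)
        if pair is not None:
            ip, count = pair
            counts[ip] += count
    return counts


# Hypothetical sample data, made up for this sketch.
SAMPLE_LOG = [
    '192.168.1.10 - - [10/Oct/2015:13:55:36 +0530] "GET /index.html HTTP/1.1" 200 2326',
    '192.168.1.11 - - [10/Oct/2015:13:56:01 +0530] "GET /missing.html HTTP/1.1" 404 512',
    '192.168.1.11 - - [10/Oct/2015:13:56:09 +0530] "POST /login HTTP/1.1" 500 0',
    '192.168.1.12 - - [10/Oct/2015:13:57:45 +0530] "GET /admin HTTP/1.1" 403 128',
    'this line is not a valid log entry',
    '192.168.1.10 - - [10/Oct/2015:13:58:02 +0530] "GET /about.html HTTP/1.1" 200 1874',
]

if __name__ == "__main__":
    # List IP addresses by number of error responses, highest first.
    for ip, count in errors_per_ip(SAMPLE_LOG).most_common():
        print(ip, count)
```

The map and reduce steps are kept as separate functions to mirror how the same count would be split across nodes in a distributed run; on a cluster the grouping by IP address would be done by the framework rather than by a local `Counter`.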