Clustering Of Web Usage Data Using Chameleon Algorithm
T. Vijaya Kumar¹, Dr. H. S. Guruprasad²
Clustering is a discovery process in data mining that groups a set of data items so as to maximize the similarity within each cluster and minimize the similarity between different clusters. The discovered clusters depict the characteristics of the underlying data distribution. Clustering is useful, for example, in characterizing customer groups based on purchasing patterns and in categorizing web documents that have similar functionality. In this work, graph-based clustering is proposed to form clusters based on web usage patterns. First, sessions are constructed using a time-oriented approach. An adjacency matrix is then created from the constructed sessions and page requests, and data points are generated using the adjacency matrix as input. The Chameleon clustering algorithm takes these data points as input and forms clusters using a two-phase approach. In the first phase, it uses a graph partitioning algorithm to cluster the data items into several relatively small sub-clusters. In the second phase, it uses an agglomerative algorithm to form the genuine clusters by repeatedly combining these sub-clusters. The resulting clusters are then plotted on a plane using MATLAB, with different clusters distinguished by distinct colours and symbols. The server log files of the website www.enggresources.com are used for the overall study and analysis.
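The first two steps described above (time-oriented session construction and adjacency matrix creation) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the log is assumed to be already parsed into hypothetical (user, timestamp in seconds, page) tuples, a session is assumed to end when the gap between two consecutive requests exceeds a 30-minute threshold, and the adjacency matrix is assumed to count transitions between consecutive pages within a session. The function names are illustrative only.

```python
from collections import defaultdict

# Assumed inactivity threshold in seconds; the abstract does not state a value.
SESSION_TIMEOUT = 30 * 60


def build_sessions(requests, timeout=SESSION_TIMEOUT):
    """Split each user's requests into sessions at gaps longer than `timeout`.

    `requests` is an iterable of (user, timestamp_seconds, page) tuples
    (a hypothetical pre-parsed form of the server log).
    """
    by_user = defaultdict(list)
    for user, ts, page in requests:
        by_user[user].append((ts, page))

    sessions = []
    for user in sorted(by_user):
        hits = sorted(by_user[user])  # order each user's requests by time
        current = [hits[0][1]]
        for (prev_ts, _), (ts, page) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                # Gap too long: close the current session and start a new one.
                sessions.append(current)
                current = []
            current.append(page)
        sessions.append(current)
    return sessions


def adjacency_matrix(sessions):
    """Count transitions between consecutive pages within each session.

    Returns the sorted page list and a square matrix where
    matrix[i][j] is the number of times pages[j] directly followed pages[i].
    """
    pages = sorted({p for s in sessions for p in s})
    index = {p: i for i, p in enumerate(pages)}
    matrix = [[0] * len(pages) for _ in pages]
    for s in sessions:
        for a, b in zip(s, s[1:]):
            matrix[index[a]][index[b]] += 1
    return pages, matrix


if __name__ == "__main__":
    # Made-up requests for illustration only.
    log = [
        ("u1", 0, "A"), ("u1", 60, "B"), ("u1", 4000, "A"),
        ("u2", 10, "B"), ("u2", 20, "C"),
    ]
    sessions = build_sessions(log)
    pages, matrix = adjacency_matrix(sessions)
    print(sessions)
    print(pages)
    print(matrix)
```

Other time-oriented heuristics exist, such as capping the total duration of a session instead of the gap between requests; the threshold and the matrix definition here would need to be replaced with whatever the full paper specifies before feeding the result to the clustering step.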