Fuzzy Logic Approach to Combat Web Spam with
Amit Prakash1, Debjani Mustafi2
Web spam refers to techniques that manipulate the ranking algorithms of web search engines and cause them to rank certain pages higher than those pages deserve. Spam web pages may pretend to provide assistance or facts about a particular subject, but the help is often meaningless and the information shallow. Recently, the amount of web spam has increased dramatically, degrading the quality of search results. Today's search engines use variations of the fundamental ranking methods that offer some degree of spam resilience. PageRank is one such method: it not only counts the number of hyperlinks referring to a web page but also takes the PageRank of each referring page into account. This concept, however, has proven vulnerable to manipulation. TrustRank overcomes these problems of PageRank but relies on human operators to judge each page in a seed set as spam or not spam. There are situations where an operator cannot assign a crisp value to a page. In such cases, human sentiment enters the decision about whether a page is spam. Our work reveals the human sentiment involved in the judgment of the seed set. We also propose a model that minimizes the involvement of human sentiment by employing fuzzy logic in the seed selection process.
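To make the contrast between crisp and fuzzy seed judgments concrete, the following is a minimal sketch and not the model proposed in this article, whose details are not given in the abstract. It assumes a TrustRank-style biased PageRank in which each seed page carries a trust degree in [0, 1] rather than a crisp 0 or 1. The function name `propagate_trust`, the four-page graph, and the 0.3 trust degree are all hypothetical and chosen only for illustration.

```python
def propagate_trust(links, seed_scores, damping=0.85, iterations=50):
    """Biased PageRank (TrustRank-style) over a small link graph.

    links:       dict mapping each page to the list of pages it links to.
    seed_scores: dict mapping seed pages to a trust degree in [0, 1];
                 a crisp judgment uses only 1.0, a fuzzy one any degree.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    total = sum(seed_scores.values())
    # Normalised seed distribution; non-seed pages receive zero seed trust.
    seed = {p: seed_scores.get(p, 0.0) / total for p in pages}
    trust = dict(seed)
    for _ in range(iterations):
        # Each page gets back a fraction of its own seed trust ...
        nxt = {p: (1.0 - damping) * seed[p] for p in pages}
        # ... plus an equal share of the trust of every page linking to it.
        for src in pages:
            targets = links.get(src, [])
            if not targets:
                continue  # dangling page: its trust is not passed on
            share = damping * trust[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        trust = nxt
    return trust


# Hypothetical graph: two pages the operator is sure about ("news", "blog")
# and two pages the operator cannot judge crisply ("forum", "farm").
links = {
    "news": ["blog"],
    "blog": ["news"],
    "forum": ["farm"],
    "farm": ["forum"],
}

# Crisp seed set: the uncertain page is left out entirely.
crisp = propagate_trust(links, {"news": 1.0})

# Fuzzy seed set: the uncertain page is kept with a partial trust degree.
fuzzy = propagate_trust(links, {"news": 1.0, "forum": 0.3})
```

With the crisp seed set, the pages reachable only from the unjudged page receive no trust at all. With the fuzzy seed set they receive some trust, but less than the pages reachable from the fully trusted seed, so the operator's uncertainty is carried into the ranking instead of being forced into a yes-or-no decision.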