An Efficient Sequential Frequent Pattern Analysis Using DBCA
K. Sasikala 1, P. Velusamy 2
This work addresses the practical problem of frequent-itemset discovery in data-stream environments that may suffer from data overload. The main issues are frequent-pattern mining and data-overload handling. We therefore propose a mining algorithm together with separate, dedicated overload-handling mechanisms. The algorithm, DBCA (Dynamic Base Combinatorial Algorithm), dynamically extracts base information from streaming data and keeps it in its data structure. More specifically, the size of the base information it keeps on a data stream depends on the average transaction length n. With the overload-handling mechanisms, it can manage data overload effectively. Our results may lead to a possible solution for sequential frequent-pattern mining over sliding windows in dynamic streams: the excess of incoming data is pruned and only the trimmed data is processed, rather than the full amount of incoming data. Depending on how overloading data is trimmed, various load-shedding policies are possible, and we describe three such policies. Although the proposed policies have different properties, experiments verified all of them to be effective.
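The idea of trimming an overloaded batch before mining a sliding window can be illustrated with a minimal sketch. This is not the DBCA algorithm, and it is not one of the three load-shedding policies of the paper. The names `shed_load` and `SlidingWindowCounter`, the "keep the first `capacity` transactions" rule, and the cap on itemset length are all assumptions made only for illustration.

```python
from collections import Counter, deque
from itertools import combinations


def shed_load(batch, capacity):
    """Hypothetical trimming rule (an assumption, not a policy from the paper):
    keep only the first `capacity` transactions of an incoming batch."""
    return batch[:capacity]


class SlidingWindowCounter:
    """Counts itemsets of up to `max_len` items over the last `window_size` transactions."""

    def __init__(self, window_size, max_len=2):
        self.window = deque()
        self.window_size = window_size
        self.max_len = max_len
        self.counts = Counter()

    def _itemsets(self, transaction):
        # Enumerate every itemset of size 1..max_len contained in the transaction.
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            yield from combinations(items, k)

    def add(self, transaction):
        # Add the new transaction, then expire the oldest one if the window is full.
        self.window.append(transaction)
        self.counts.update(self._itemsets(transaction))
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            self.counts.subtract(self._itemsets(expired))

    def frequent(self, min_support):
        # Return the itemsets whose count in the current window reaches min_support.
        return {s: c for s, c in self.counts.items() if c >= min_support}


# Usage: a batch of four transactions arrives, but only three can be processed.
batch = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b", "c"]]
counter = SlidingWindowCounter(window_size=2)
for transaction in shed_load(batch, capacity=3):
    counter.add(transaction)
print(counter.frequent(min_support=2))
```

In this sketch the miner only ever sees the trimmed batch, so the cost per batch is bounded by `capacity` regardless of the arrival rate. The price is that support counts become approximations of the counts over the full stream.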