alexa Efficient streaming text clustering.


International Journal of Advancements in Technology

Author(s): Zhong S

Abstract Share this page

Abstract Clustering data streams has been a new research topic, recently emerged from many real data mining applications, and has attracted a lot of research attention. However, there is little work on clustering high-dimensional streaming text data. This paper combines an efficient online spherical k-means (OSKM) algorithm with an existing scalable clustering strategy to achieve fast and adaptive clustering of text streams. The OSKM algorithm modifies the spherical k-means (SPKM) algorithm, using online update (for cluster centroids) based on the well-known Winner-Take-All competitive learning. It has been shown to be as efficient as SPKM, but much superior in clustering quality. The scalable clustering strategy was previously developed to deal with very large databases that cannot fit into a limited memory and that are too expensive to read/scan multiple times. Using the strategy, one keeps only sufficient statistics for history data to retain (part of) the contribution of history data and to accommodate the limited memory. To make the proposed clustering algorithm adaptive to data streams, we introduce a forgetting factor that applies exponential decay to the importance of history data. The older a set of text documents, the less weight they carry. Our experimental results demonstrate the efficiency of the proposed algorithm and reveal an intuitive and an interesting fact for clustering text streams-one needs to forget to be adaptive. This article was published in Neural Netw and referenced in International Journal of Advancements in Technology

Relevant Expert PPTs

Recommended Conferences

Peer Reviewed Journals
Make the best use of Scientific Research and information from our 700 + peer reviewed, Open Access Journals
International Conferences 2017-18
Meet Inspiring Speakers and Experts at our 3000+ Global Annual Meetings

Contact Us

Agri, Food, Aqua and Veterinary Science Journals

Dr. Krish

1-702-714-7001 Extn: 9040

Clinical and Biochemistry Journals

Datta A

1-702-714-7001Extn: 9037

Business & Management Journals


1-702-714-7001Extn: 9042

Chemical Engineering and Chemistry Journals

Gabriel Shaw

1-702-714-7001 Extn: 9040

Earth & Environmental Sciences

Katie Wilson

1-702-714-7001Extn: 9042

Engineering Journals

James Franklin

1-702-714-7001Extn: 9042

General Science and Health care Journals

Andrea Jason

1-702-714-7001Extn: 9043

Genetics and Molecular Biology Journals

Anna Melissa

1-702-714-7001 Extn: 9006

Immunology & Microbiology Journals

David Gorantl

1-702-714-7001Extn: 9014

Informatics Journals

Stephanie Skinner

1-702-714-7001Extn: 9039

Material Sciences Journals

Rachle Green

1-702-714-7001Extn: 9039

Mathematics and Physics Journals

Jim Willison

1-702-714-7001 Extn: 9042

Medical Journals

Nimmi Anna

1-702-714-7001 Extn: 9038

Neuroscience & Psychology Journals

Nathan T

1-702-714-7001Extn: 9041

Pharmaceutical Sciences Journals

John Behannon

1-702-714-7001Extn: 9007

Social & Political Science Journals

Steve Harry

1-702-714-7001 Extn: 9042

© 2008-2017 OMICS International - Open Access Publisher. Best viewed in Mozilla Firefox | Google Chrome | Above IE 7.0 version