A Novel Cluster Prediction Approach Based on Locality-Sensitive Hashing for Fuzzy Clustering of Categorical Data

IEEE Access (2022)

Abstract
This paper addresses the problem of fuzzy clustering for categorical data. Over the last two decades, many attempts have been made to extend the k-means algorithm to categorical data because of its simplicity and efficiency. However, k-means-like algorithms are local optimization methods, so their clustering results are highly sensitive to initialization. In this paper, we propose using Locality-Sensitive Hashing (LSH) to reduce the dimensionality of categorical data and to predict the initial fuzzy clusters in the resulting low-dimensional space. Unlike existing cluster initialization methods, which create only crisp initial clusters, the proposed method predicts 'high quality' fuzzy clusters at the initialization step before proceeding in a k-means-like fashion. Numerical results show that the proposed method yields relatively accurate results on 16 UCI datasets and outperforms all other related approaches in terms of both crisp and fuzzy clustering effectiveness.
Keywords
Cluster analysis, fuzzy clustering, categorical data, locality-sensitive hashing, initial cluster prediction
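
The abstract does not specify the hash family or how the initial memberships are computed, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes bit-sampling LSH for Hamming distance (each record is hashed by a random subset of its attributes), takes the modes of the k largest buckets as seed centers, and derives initial fuzzy memberships with the standard fuzzy k-modes membership formula. All function names and parameters (`lsh_buckets`, `initial_fuzzy_memberships`, `n_sampled`, `m`) are hypothetical.

```python
import random
from collections import Counter, defaultdict


def hamming(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def lsh_buckets(rows, n_sampled, seed=0):
    """Bit-sampling LSH (an assumption): hash each record by a random
    subset of attribute positions, so similar records tend to collide."""
    rng = random.Random(seed)
    dims = sorted(rng.sample(range(len(rows[0])), n_sampled))
    buckets = defaultdict(list)
    for i, row in enumerate(rows):
        buckets[tuple(row[d] for d in dims)].append(i)
    return buckets


def mode_of(rows, idxs):
    """Per-attribute most frequent value among the records in idxs."""
    n_attrs = len(rows[0])
    return tuple(
        Counter(rows[i][d] for i in idxs).most_common(1)[0][0]
        for d in range(n_attrs)
    )


def initial_fuzzy_memberships(rows, k, n_sampled=2, m=2.0, seed=0):
    """Seed centers from the k largest LSH buckets, then compute fuzzy
    memberships u_ij proportional to (1 / d_ij) ** (1 / (m - 1)), m > 1.
    Fewer than k centers are returned if fewer than k buckets exist."""
    buckets = lsh_buckets(rows, n_sampled, seed)
    largest = sorted(buckets.values(), key=len, reverse=True)[:k]
    centers = [mode_of(rows, idxs) for idxs in largest]
    eps = 1e-9  # avoids division by zero when a record equals a center
    memberships = []
    for row in rows:
        inv = [(1.0 / (hamming(row, c) + eps)) ** (1.0 / (m - 1.0))
               for c in centers]
        total = sum(inv)
        memberships.append([v / total for v in inv])
    return centers, memberships


rows = [
    ("a", "x", "p"), ("a", "x", "q"), ("a", "x", "p"),
    ("b", "y", "r"), ("b", "y", "r"), ("b", "z", "r"),
]
centers, memberships = initial_fuzzy_memberships(rows, k=2)
```

In a full pipeline, `centers` and `memberships` would serve as the starting point for a k-means-like fuzzy iteration in place of a random initialization.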