CDC: A Simple Framework for Complex Data Clustering
arXiv (2024)
Abstract
In today's data-driven digital era, both the volume and the complexity of
collected data (multi-view, non-Euclidean, and multi-relational, for example)
are growing exponentially or even faster. Clustering, which extracts valid
knowledge from data without supervision, is extremely useful in practice. However, existing
methods are independently developed to handle one particular challenge at the
expense of the others. In this work, we propose a simple but effective
framework for complex data clustering (CDC) that can efficiently process
different types of data with linear complexity. We first utilize graph
filtering to fuse geometry structure and attribute information. We then reduce
the complexity with high-quality anchors that are adaptively learned via a
novel similarity-preserving regularizer. We illustrate the cluster-ability of
our proposed method theoretically and experimentally. In particular, we deploy
CDC to graph data of size 111M.
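
The graph-filtering step mentioned above can be illustrated with a minimal sketch. The abstract does not state which filter CDC uses, so the code assumes a common low-pass choice, k applications of (I - L/2), where L is the symmetric normalized Laplacian. The function name `graph_filter` and its parameters are hypothetical and are not taken from the paper. Each pass touches every edge once, so the cost grows linearly with the number of edges.

```python
import math


def graph_filter(edges, features, k=2):
    """Smooth node features with k passes of the low-pass filter (I - L/2).

    Assumption (not from the paper): L = I - D^{-1/2} A D^{-1/2} for an
    undirected, unweighted graph, so one pass computes
    h_i <- 0.5 * (h_i + sum_j h_j / sqrt(d_i * d_j)) over neighbors j.
    """
    n = len(features)
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    degree = [len(nb) for nb in neighbors]

    h = [list(row) for row in features]
    for _ in range(k):
        new_h = []
        for i in range(n):
            row = list(h[i])  # identity term: the node's own features
            for j in neighbors[i]:
                # Normalized adjacency weight between nodes i and j.
                w = 1.0 / math.sqrt(degree[i] * degree[j])
                for c in range(len(row)):
                    row[c] += w * h[j][c]
            new_h.append([0.5 * x for x in row])
        h = new_h
    return h


if __name__ == "__main__":
    # Toy path graph 0 - 1 - 2 with one attribute per node.
    smoothed = graph_filter([(0, 1), (1, 2)], [[1.0], [0.0], [0.0]], k=1)
    print(smoothed)
```

After filtering, connected nodes have more similar features, which is the sense in which structure and attribute information are fused. The adaptive anchor learning is not sketched here because the abstract gives no details of the regularizer.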