k-sums Clustering: A Stochastic Optimization Approach

Conference on Information and Knowledge Management (2021)

Abstract
In this paper, we revisit the decades-old clustering method k-means. The chicken-and-egg loop in traditional k-means is replaced by a purely stochastic optimization procedure, in which the optimization is carried out from the perspective of each individual sample. Unlike existing incremental k-means, a sample is tentatively joined to a new cluster to evaluate its distance to the resulting new centroid, which accounts for the contribution of this sample. The sample is actually moved to the new cluster only if the reallocation brings it closer to the new centroid than it is to its current one. Compared with traditional k-means and other variants, this procedure allows the clustering to converge faster to a better local minimum. This fundamental modification of the k-means loop leads to the redefinition of a family of k-means variants, such as hierarchical k-means and sequential k-means. As an extension, a new objective function that minimizes the sum of pairwise distances within clusters is presented. Under the l2-norm, it can be solved with the same stochastic optimization procedure. The redefined traditional k-means, hierarchical k-means, and sequential k-means all show considerable performance improvements over their traditional counterparts under different settings and on various types of datasets.
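The per-sample move rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of squared Euclidean distance, the choice of the closest tentative centroid when several clusters qualify, the guard against emptying a cluster, and the stopping rule are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def stochastic_reassign(points, labels, k, max_epochs=20, seed=0):
    """Hypothetical sketch of the sample-wise reallocation rule.

    points: list of equal-length tuples; labels: initial cluster index per point.
    Returns the final list of labels.
    """
    rng = random.Random(seed)
    labels = list(labels)
    dim = len(points[0])

    # Keep per-cluster coordinate sums and counts so centroids are cheap to update.
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for j in range(dim):
            sums[c][j] += p[j]

    order = list(range(len(points)))
    for _ in range(max_epochs):
        rng.shuffle(order)  # visit samples in random order
        moved = 0
        for i in order:
            x, u = points[i], labels[i]
            if counts[u] <= 1:
                continue  # assumption: never empty a cluster
            current = [s / counts[u] for s in sums[u]]
            best_v, best_d = u, sq_dist(x, current)
            for v in range(k):
                if v == u:
                    continue
                # Tentative centroid of cluster v with x included.
                trial = [(sums[v][j] + x[j]) / (counts[v] + 1) for j in range(dim)]
                d = sq_dist(x, trial)
                if d < best_d:  # closer to the new centroid than to the current one
                    best_v, best_d = v, d
            if best_v != u:
                for j in range(dim):
                    sums[u][j] -= x[j]
                    sums[best_v][j] += x[j]
                counts[u] -= 1
                counts[best_v] += 1
                labels[i] = best_v
                moved += 1
        if moved == 0:  # assumption: stop once a full pass makes no move
            break
    return labels
```

For example, with the six points (0,0), (0,1), (1,0), (10,10), (10,11), (11,10) and initial labels [0, 0, 1, 1, 1, 1], the only sample that passes the move test is (1,0), so the sketch returns [0, 0, 0, 1, 1, 1].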