Bayesian Level-Set Clustering
arxiv(2024)
摘要
Broadly, the goal when clustering data is to separate observations into
meaningful subgroups. The rich variety of methods for clustering reflects the
fact that the relevant notion of meaningful clusters varies across
applications. The classical Bayesian approach clusters observations by their
association with components of a mixture model; the choice in class of
components allows flexibility to capture a range of meaningful cluster notions.
However, in practice the range is somewhat limited as difficulties with
computation and cluster identifiability arise as components are made more
flexible. Instead of mixture component attribution, we consider clusterings
that are functions of the data and the density f, which allows us to separate
flexible density estimation from clustering. Within this framework, we develop
a method to cluster data into connected components of a level set of f. Under
mild conditions, we establish that our Bayesian level-set (BALLET) clustering
methodology yields consistent estimates, and we highlight its performance in a
variety of toy and simulated data examples. Finally, through an application to
astronomical data we show the method performs favorably relative to the popular
level-set clustering algorithm DBSCAN in terms of accuracy, insensitivity to
tuning parameters, and quantification of uncertainty.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要