Bayesian Level-Set Clustering

arxiv(2024)

引用 0|浏览2
暂无评分
摘要
Broadly, the goal when clustering data is to separate observations into meaningful subgroups. The rich variety of methods for clustering reflects the fact that the relevant notion of meaningful clusters varies across applications. The classical Bayesian approach clusters observations by their association with components of a mixture model; the choice in class of components allows flexibility to capture a range of meaningful cluster notions. However, in practice the range is somewhat limited as difficulties with computation and cluster identifiability arise as components are made more flexible. Instead of mixture component attribution, we consider clusterings that are functions of the data and the density f, which allows us to separate flexible density estimation from clustering. Within this framework, we develop a method to cluster data into connected components of a level set of f. Under mild conditions, we establish that our Bayesian level-set (BALLET) clustering methodology yields consistent estimates, and we highlight its performance in a variety of toy and simulated data examples. Finally, through an application to astronomical data we show the method performs favorably relative to the popular level-set clustering algorithm DBSCAN in terms of accuracy, insensitivity to tuning parameters, and quantification of uncertainty.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要