PANDORA: A Parallel Dendrogram Construction Algorithm for Single Linkage Clustering on GPU
CoRR (2024)
Abstract
This paper presents PANDORA, a novel parallel algorithm for efficiently
constructing dendrograms for single-linkage hierarchical clustering, including
[…]. Traditional dendrogram construction methods from a minimum spanning
tree (MST), such as agglomerative or divisive techniques, often fail to
efficiently parallelize, especially with skewed dendrograms common in
real-world data.
PANDORA addresses these challenges through a unique recursive tree
contraction method, which simplifies the tree for initial dendrogram
construction and then progressively reconstructs the complete dendrogram. This
process makes PANDORA asymptotically work-optimal, independent of dendrogram
skewness. All steps in PANDORA are fully parallel and suitable for massively
threaded accelerators such as GPUs.
Our implementation is written in Kokkos, providing support for both CPUs and
multi-vendor GPUs (e.g., Nvidia, AMD). The multithreaded version of PANDORA is
2.2× faster than the current best multithreaded implementation, while
the GPU implementation achieved speedups of 6-20× on […] and
10-37× on […] over multithreaded PANDORA. These
advancements lead to up to a 6-fold speedup for […] on GPUs over the
current best, which only offloads MST construction to GPUs and performs
multithreaded dendrogram construction.
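
To make the agglomerative baseline mentioned in the abstract concrete, the following is a minimal sequential sketch in Python. It is not PANDORA and not the authors' Kokkos code: it only illustrates the classic bottom-up approach (sort MST edges by weight, merge clusters with union-find) and why a chain-like MST yields a highly skewed dendrogram. The names `dendrogram_from_mst` and `height`, the `(u, v, weight)` edge format, and the parent-array output are assumptions made for this illustration.

```python
def dendrogram_from_mst(n, edges):
    """Bottom-up single-linkage dendrogram from MST edges (u, v, weight).

    Hypothetical helper for illustration. Nodes 0..n-1 are leaves (points);
    node n+i is the internal node created by the i-th lightest edge.
    Returns a parent array; the root has parent -1.
    """
    order = sorted(edges, key=lambda e: e[2])  # lightest edge first
    uf = list(range(n))    # union-find parents over the points
    top = list(range(n))   # top[r]: dendrogram node on top of cluster r
    parent = [-1] * (2 * n - 1)

    def find(x):
        while uf[x] != x:
            uf[x] = uf[uf[x]]  # path halving
            x = uf[x]
        return x

    for i, (u, v, _w) in enumerate(order):
        ru, rv = find(u), find(v)
        node = n + i
        # The new internal node becomes the parent of both merged clusters.
        parent[top[ru]] = node
        parent[top[rv]] = node
        uf[ru] = rv
        top[rv] = node
    return parent


def height(parent):
    """Number of edges on the longest leaf-to-root path."""
    best = 0
    for x in range(len(parent)):
        depth = 0
        while parent[x] != -1:
            x = parent[x]
            depth += 1
        best = max(best, depth)
    return best


# A chain MST with increasing weights gives a fully skewed dendrogram.
chain = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 3.0)]
# The same four points with a different weight pattern give a balanced one.
balanced = [(0, 1, 1.0), (2, 3, 2.0), (1, 2, 3.0)]

print(height(dendrogram_from_mst(4, chain)))     # skewed: height n - 1
print(height(dendrogram_from_mst(4, balanced)))  # balanced: height log2(n)
```

In this sketch each merge depends on the result of the previous one, so the loop is inherently sequential. In the skewed case the dependency chain is as long as the number of edges, which is the situation the abstract describes as hard to parallelize.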