Fast and High-Quality Influence Maximization on Multiple GPUs

2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2022

Abstract
Influence Maximization (IM) is a popular problem that asks for a seed vertex set in a graph maximizing the expected number of vertices affected via diffusion under a given, usually probabilistic, model. For most diffusion models used in practice, finding an optimal seed set of a given size is NP-Hard. Hence, approximation algorithms and heuristics are often proposed and used. The Greedy approach is one of the most frequently applied approximation algorithms for IM. Indeed, this Monte-Carlo-based approach performs remarkably well in terms of seed set quality, i.e., the number of affected vertices. However, it is impractical for real-life networks containing tens of millions of vertices due to its expensive simulation costs. Recently, parallel IM kernels running on CPUs and GPUs have been proposed in the literature. In this work, we propose SUPERFUSER, a blazing-fast, sketch-based Influence Maximization algorithm developed for multiple GPUs. SUPERFUSER uses hash-based fused sampling to process multiple simulations at the same time with minimal overhead. In addition, we propose a Sampling-Aware Sample-Space Split approach to partition the edges across multiple GPUs efficiently by exploiting the unique characteristics of the sampling process. Based on our experiments, SUPERFUSER is up to 6.31× faster than its nearest competitor on a single GPU. Furthermore, we achieve a 6.8× speed-up on average using 8 GPUs over single-GPU performance, and thanks to our novel partitioning scheme, we can process extremely large-scale graphs in practice with only a minor loss in quality. As an example, SUPERFUSER can generate a high-quality seed set with 50 vertices for a graph having 1.8B edges in less than 15 seconds on 2 GPUs.
Keywords
Influence maximization,Data sketches,GPU,Fused sampling
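The core idea behind hash-based fused sampling can be illustrated with a small, hypothetical Python sketch (this is an illustration under the Independent Cascade model, not the paper's CUDA implementation; all function names are invented). Instead of running R Monte Carlo simulations one by one, each vertex carries an R-bit mask, and an edge's liveness in simulation i is decided by hashing (edge, i), so all R simulations advance in a single traversal with no stored per-simulation random state:

```python
import hashlib

R = 64  # number of fused Monte Carlo simulations, packed into one 64-bit mask


def edge_live_mask(u, v, p):
    """Bit i is set iff edge (u, v) is 'live' in simulation i under the
    Independent Cascade model. Liveness is derived by hashing (u, v, i),
    so it is deterministic and needs no per-simulation random state."""
    mask = 0
    for i in range(R):
        h = hashlib.blake2b(f"{u},{v},{i}".encode(), digest_size=8)
        r = int.from_bytes(h.digest(), "big") / 2**64  # uniform in [0, 1)
        if r < p:
            mask |= 1 << i
    return mask


def fused_influence(graph, seeds):
    """Estimate the expected spread of `seeds`, running all R simulations
    in one BFS-like pass. `graph` maps u -> list of (v, p) edges."""
    full = (1 << R) - 1
    reached = {s: full for s in seeds}  # seeds are active in every simulation
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # Activate v in exactly those simulations where u is
                # active, the edge is live, and v is not yet active.
                new = reached[u] & edge_live_mask(u, v, p)
                new &= ~reached.get(v, 0)
                if new:
                    reached[v] = reached.get(v, 0) | new
                    nxt.append(v)
        frontier = nxt
    # Average number of activated vertices over the R simulations.
    return sum(bin(m).count("1") for m in reached.values()) / R


toy = {0: [(1, 1.0), (2, 0.0)], 1: [(2, 1.0)]}
print(fused_influence(toy, [0]))  # → 3.0 (all three vertices reached)
```

On a GPU, the per-vertex masks map naturally to machine words processed by one thread each, which is what makes fusing many simulations nearly free compared with running them sequentially.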