Flowless: Extracting Densest Subgraphs Without Flow Computations

WWW '20: The Web Conference 2020 Taipei Taiwan April, 2020(2020)

引用 45|浏览141
暂无评分
摘要
The problem of finding dense components of a graph is a major primitive in graph mining and data analysis. The densest subgraph problem (DSP) that asks to find a subgraph with maximum average degree forms a basic primitive in dense subgraph discovery with applications ranging from community detection to unsupervised discovery of biological network modules [16]. The DSP is exactly solvable in polynomial time using maximum flows [14, 17, 22]. Due to the high computational cost of maximum flows, Charikar’s greedy approximation algorithm is usually preferred in practice due to its linear time and linear space complexity [3, 8]. It constitutes a key algorithmic idea in scalable solutions for large-scale dynamic graphs [5, 7]. However, its output density can be a factor 2 off the optimal solution. In this paper we design Greedy++, an iterative peeling algorithm that improves upon the performance of Charikar’s greedy algorithm significantly. Our iterative greedy algorithm is able to output near-optimal and optimal solutions fast by adding a few more passes to Charikar’s greedy algorithm. Furthermore Greedy++ is more robust against the structural heterogeneities (e.g., skewed degree distributions) in real-world datasets. An additional property of our algorithm is that it is able to assess quickly, without computing maximum flows, whether Charikar’s approximation quality on a given graph instance is closer to the worst case theoretical guarantee of or to optimality. We also demonstrate that our method has significant efficiency advantage over the maximum flow based exact optimal algorithm. For example, our algorithm achieves ∼ 145 × speedup on average across a variety of real-world graphs while finding subgraphs of density that are at least 90% as dense as the densest subgraph.
更多
查看译文
关键词
dense subgraph discovery, algorithm design, graph mining, applications
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要