Computing Complex Graph Properties With SQL Queries

2019 IEEE International Conference on Big Data (Big Data), 2019

Abstract
In big data analytics, graph problems are as important as machine learning. Many algorithms exist to analyze large graphs, but most of them are limited by main memory. On the other hand, much of the data stored in DBMSs needs to be analyzed as graphs. Even when data is not already in a DBMS, large data sets can be loaded into one quickly. More importantly, DBMSs can work in parallel and are not limited by RAM. For these reasons, we propose several algorithms that compute graph metrics and properties that help us understand graph structure, specifically diameter and betweenness centrality. This work is a big step beyond transitive closure and recursive queries. We propose SQL queries that work on a large graph stored in relational form as triples, and we prove that these queries can compute complex graph metrics in a more flexible and efficient manner. Moreover, we study how to optimize SQL queries that combine demanding joins and aggregations, which removes the main memory limitation and allows the queries to run in parallel. We present an experimental evaluation of the accuracy and performance of our SQL algorithms, comparing them with popular platforms including Python and Spark. In general, the experiments show that our SQL algorithms are accurate and efficient, without memory limitations.
Keywords
Graph Metrics, SQL Queries, Betweenness Centrality, Query Optimization, Columnar DBMS
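
The abstract does not reproduce the authors' queries, so the following is only a minimal sketch of the general idea it describes: a graph stored as triples, with a metric (here, diameter) computed by repeated joins and aggregations instead of a recursive query. The table name `E`, the columns `i`, `j`, `v`, the helper `graph_diameter`, and the use of SQLite are all assumptions made for illustration; the paper targets a columnar DBMS and may use a different formulation. Edge weights are assumed to be positive, and an undirected graph is assumed to be stored with both directions of each edge.

```python
import sqlite3


def graph_diameter(edges):
    """Return the longest shortest-path distance among reachable vertex pairs.

    `edges` is an iterable of (source, target, weight) triples.
    Hypothetical helper for illustration; weights are assumed positive.
    """
    con = sqlite3.connect(":memory:")
    # Graph in relational form as triples: E(i, j, v) = edge i -> j with weight v.
    con.execute("CREATE TABLE E (i INTEGER, j INTEGER, v REAL)")
    con.executemany("INSERT INTO E VALUES (?, ?, ?)", edges)

    # D(i, j, d) holds the best distance found so far for each ordered pair.
    con.execute(
        "CREATE TABLE D (i INTEGER, j INTEGER, d REAL, PRIMARY KEY (i, j))"
    )
    con.execute(
        "INSERT INTO D SELECT i, j, MIN(v) FROM E WHERE i <> j GROUP BY i, j"
    )

    while True:
        # Join: extend every known path by one edge.
        # Aggregation: keep the minimum length per (source, target) pair.
        # Only pairs that are new or strictly improved are stored in C.
        con.execute("DROP TABLE IF EXISTS C")
        con.execute(
            """
            CREATE TABLE C AS
            SELECT P.i AS i, P.j AS j, P.d AS d
            FROM (
                SELECT D.i AS i, E.j AS j, MIN(D.d + E.v) AS d
                FROM D JOIN E ON D.j = E.i
                WHERE D.i <> E.j
                GROUP BY D.i, E.j
            ) AS P
            LEFT JOIN D AS Old ON Old.i = P.i AND Old.j = P.j
            WHERE Old.d IS NULL OR P.d < Old.d
            """
        )
        (improved,) = con.execute("SELECT COUNT(*) FROM C").fetchone()
        if improved == 0:
            break  # fixed point: no distance changed in this round
        con.execute("INSERT OR REPLACE INTO D SELECT i, j, d FROM C")

    # Diameter: the maximum over all finite shortest-path distances.
    (diameter,) = con.execute("SELECT MAX(d) FROM D").fetchone()
    con.close()
    return diameter


if __name__ == "__main__":
    # Directed path 1 -> 2 -> 3 -> 4 with unit weights.
    print(graph_diameter([(1, 2, 1), (2, 3, 1), (3, 4, 1)]))
```

The loop stops once a round of joins produces no new or shorter distance. All intermediate state lives in tables, not in application memory, which is the property the abstract emphasizes. Betweenness centrality needs additional bookkeeping (counting shortest paths through each vertex) and is not covered by this sketch.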