Mining Keys for Graphs

Data & Knowledge Engineering (2024)

Abstract
Keys for graphs (GKeys) are a class of data quality rules that use topological and value constraints to uniquely identify entities in a data graph. They have been studied to support object identification, knowledge fusion, data deduplication, and social network reconciliation. Specifying and discovering graph keys manually is tedious and infeasible over large-scale graphs. To make GKeys useful in practice, we study the GKey discovery problem and present GKMiner, an algorithm that mines keys over graphs. Our algorithm discovers keys in a graph via frequent subgraph expansion and, notably, identifies recursive keys, i.e., keys where the unique identification of one entity type depends on the identification of another entity type. We introduce two properties of keys, minimality and support, which help to reduce the space of candidate keys. GKMiner uses a set of auxiliary structures to summarize an input graph and to identify likely candidate keys, for more efficient pruning and evaluation of the search space. Our evaluation shows that identifying and using recursive keys in entity linking leads to improved accuracy over keys found using existing graph key mining techniques.
Keywords
Graph keys, Key mining, Recursive keys
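
The notions of key, support, and minimality mentioned in the abstract can be illustrated with a minimal sketch. This is not GKMiner: it ignores graph topology and frequent subgraph expansion, and it flattens each entity into a dictionary of attribute values. The function names (`support`, `is_key`, `minimal_keys`), the `min_support` parameter, the definition of support as "number of entities that define every attribute", and the toy album data are all assumptions made for illustration and may differ from the paper's definitions.

```python
from itertools import combinations


def support(entities, attrs):
    """Count entities that define every attribute in attrs (assumed definition)."""
    return sum(1 for e in entities if all(a in e for a in attrs))


def is_key(entities, attrs):
    """True if no two entities that define all attrs agree on all their values."""
    seen = set()
    for e in entities:
        if not all(a in e for a in attrs):
            continue
        signature = tuple(e[a] for a in attrs)
        if signature in seen:
            return False
        seen.add(signature)
    return True


def minimal_keys(entities, attributes, min_support):
    """Enumerate attribute sets by increasing size, keeping only minimal keys."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(sorted(attributes), size):
            # Minimality pruning: skip any superset of a key already found.
            if any(set(k) <= set(attrs) for k in found):
                continue
            # Support pruning: skip candidates that cover too few entities.
            if support(entities, attrs) < min_support:
                continue
            if is_key(entities, attrs):
                found.append(attrs)
    return found


# Toy data (hypothetical entity set of type "album").
albums = [
    {"title": "Greatest Hits", "year": 1981, "artist": "Queen"},
    {"title": "Greatest Hits", "year": 1988, "artist": "Fleetwood Mac"},
    {"title": "Rumours", "year": 1977, "artist": "Fleetwood Mac"},
]

keys = minimal_keys(albums, {"title", "year", "artist"}, min_support=3)
print(keys)  # [('year',), ('artist', 'title')]
```

In this toy data neither `title` nor `artist` alone identifies an album, but the pair does. In a data graph, `artist` would be a separate entity with its own key, so the album key would depend on first identifying the artist; that dependency is what the abstract calls a recursive key, and it is collapsed into a plain string value here.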