REPLACE: A Logical Framework for Combining Collective Entity Resolution and Repairing

IJCAI 2023

Abstract
In this paper, we investigate the problem of querying dirty databases that may suffer both from the presence of erroneous facts and from multiple names being used to refer to the same entity. While each of these issues has been widely studied in isolation, our contribution is a holistic framework for jointly deduplicating and repairing data, thereby taking advantage of the interdependencies between these two operations. Our REPLACE framework follows a declarative approach, utilizing logical rules to specify under which conditions a pair of entity references can or must be merged and logical constraints to specify consistency requirements. The semantics of REPLACE gives rise to a space of possible solutions, each consisting of a set of merges to perform and a set of facts to delete, among which we single out three notions of optimal solutions, based upon maximizing merges and minimizing deletions. As there can be multiple optimal solutions, we consider the classical notions of possible and certain query answers, as well as novel notions of most informative possible and certain answers, which provide a more compact and useful representation of the answers. We perform a detailed analysis of the data complexity of the central reasoning tasks of recognizing optimal solutions and (most informative) possible and certain answers, for each of the three notions of optimal solution and for both general and restricted specifications.
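
As a rough illustration of the classical notions of possible and certain answers mentioned above, the following sketch assumes that a query has already been evaluated in each optimal solution and that the resulting answer sets are directly comparable. The function names and the toy data are hypothetical and are not taken from the paper; the sketch does not model merges, deletions, the three notions of optimality, or the most informative answers.

```python
from typing import Iterable, Set, Tuple

# An answer is a tuple of constants, e.g. ("alice",).
Answer = Tuple[str, ...]


def possible_answers(answer_sets: Iterable[Set[Answer]]) -> Set[Answer]:
    """Answers returned in at least one solution (union)."""
    result: Set[Answer] = set()
    for answers in answer_sets:
        result |= answers
    return result


def certain_answers(answer_sets: Iterable[Set[Answer]]) -> Set[Answer]:
    """Answers returned in every solution (intersection).

    Assumption of this sketch: at least one solution exists.
    """
    sets = list(answer_sets)
    if not sets:
        raise ValueError("expected at least one solution")
    result = set(sets[0])
    for answers in sets[1:]:
        result &= answers
    return result


# Hypothetical query answers in two optimal solutions of a toy database.
solution_answers = [
    {("alice",), ("bob",)},  # answers in the first solution
    {("alice",)},            # answers in the second solution
]

possible = possible_answers(solution_answers)
certain = certain_answers(solution_answers)
```

Here `("alice",)` is a certain answer because it holds in both solutions, whereas `("bob",)` is only a possible answer because it holds in just one of them.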