Parallel Progressive Approach to Entity Resolution Using MapReduce

2017 IEEE 33rd International Conference on Data Engineering (ICDE 2017)

Abstract
Entity resolution (ER) is the process of identifying which entities in a dataset represent the same real-world object. This paper proposes a progressive approach to ER using MapReduce. In contrast to traditional ER, progressive ER aims to resolve the dataset so that the rate at which data quality improves is maximized. Such a progressive approach is useful for many emerging analytical applications that require low-latency responses, and in situations where the underlying resources are constrained or costly to use. Experiments with real-world datasets demonstrate the ability of our approach to generate high-quality results at a limited resolution cost.
Keywords
parallel progressive approach, entity resolution, MapReduce, progressive ER, data quality improvement
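
The idea of progressive ER described in the abstract, resolving the most promising pairs first so that quality improves quickly under a limited cost, can be illustrated with a small single-machine sketch. This is not the paper's MapReduce algorithm. The function names (`shared_tokens`, `jaccard`, `progressive_resolve`), the token-overlap priority, the Jaccard matcher, and the `budget` and `threshold` parameters are all assumptions made for illustration.

```python
from itertools import combinations


def shared_tokens(a: str, b: str) -> int:
    """Cheap priority estimate (assumed): number of tokens two records share."""
    return len(set(a.lower().split()) & set(b.lower().split()))


def jaccard(a: str, b: str) -> float:
    """Stand-in for the costly resolution step: token-level Jaccard similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def progressive_resolve(records, budget, threshold=0.3):
    """Resolve the most promising pairs first, stopping after `budget` comparisons.

    Hypothetical sketch: sorting candidate pairs by a cheap estimate stands in
    for whatever prioritization the paper actually uses.
    """
    pairs = list(combinations(range(len(records)), 2))
    # Highest estimated similarity first; the sort is stable, so ties keep index order.
    pairs.sort(key=lambda p: -shared_tokens(records[p[0]], records[p[1]]))

    matches = []
    for i, j in pairs[:budget]:  # spend at most `budget` costly comparisons
        if jaccard(records[i], records[j]) >= threshold:
            matches.append((i, j))
    return matches


if __name__ == "__main__":
    records = [
        "John Smith NYC",
        "John Smith New York",
        "Jane Doe LA",
        "J. Doe LA",
        "Alice Wong SF",
    ]
    # A budget of 2 comparisons out of 10 possible pairs.
    print(progressive_resolve(records, budget=2))
```

Because pairs are ordered by estimated similarity, a small budget is spent on the pairs most likely to match. A traditional approach that compares pairs in arbitrary order would need many more comparisons to reach the same matches.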