Towards an automated record linkage process for datasource-independent company matching

Felix Kruse, Christoph Schröer, Jan-Philipp Awick, Jan Reinkensmeier, Jorge Marx Gómez

2022 3rd International Conference on Next Generation Computing Applications (NextComp), 2022

Abstract
Record linkage (RL) is becoming increasingly important for companies to integrate data silos and create a higher-quality information base for decision-making. Despite state-of-the-art research results in RL, companies do not use them because the manual effort is high and the necessary know-how is lacking. This research-in-progress paper aims to show how a generic RL process for company matching can be developed and how the manual effort can be reduced. For this purpose, our data-driven inductive research builds on insights from an extensive foundation of 18 company-relevant data sources. We implemented a first version of our generic RL process for company matching and applied it in three experiments consisting of different data sources. The results show a Precision range of 0.88 to 0.98 and a Recall range of 0.90 to 0.99. These results are promising and show that the development of a generic RL process for company matching is possible. Such a process would have a major impact on companies by making it more efficient to integrate new and previously unused data sources.
Keywords
Data Integration, Record Linkage, Entity Matching, Deduplication, Data Quality
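
The abstract reports results as Precision and Recall. As a reminder of what these measure in a record linkage setting, the following is a minimal sketch that computes both from a set of predicted match pairs and a set of known true match pairs. The function name `precision_recall` and the record IDs are hypothetical and chosen for illustration only; this is not the authors' evaluation code, and the abstract does not describe how their gold standard was built.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two collections of matched record pairs."""
    # Normalise pairs so that (a, b) and (b, a) count as the same link.
    predicted = {frozenset(pair) for pair in predicted}
    actual = {frozenset(pair) for pair in actual}

    # Links that were predicted and are also in the gold standard.
    true_positives = len(predicted & actual)

    # Precision: share of predicted links that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of true links that were found.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical record IDs from two data sources (illustration only).
gold = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")]
found = [("a1", "b1"), ("b2", "a2"), ("a3", "b9")]

precision, recall = precision_recall(found, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example two of the three predicted links are correct and two of the four true links are found, so high values on both measures, as reported in the abstract, mean that few wrong links are produced and few true links are missed.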