Linking records in dynamic world

SIGMOD/PODS PhD Symposium (2012)

Abstract
In the real world, entities change dynamically, and the changes are captured in two dimensions: time and space. For data sets that contain temporal records, where each record is associated with a time stamp and describes some aspects of a real-world entity at that particular time, we often wish to identify records that describe the same entity over time and thereby enable interesting longitudinal data analysis. For data sets that contain geographically referenced data describing real-world entities at different locations (i.e., location entities), we wish to link those entities that belong to the same organization or network. However, existing record linkage techniques ignore the additional evidence in temporal and spatial data and can fall short in these cases. This proposal studies linking temporal and spatial records. For temporal record linkage, we apply time decay to capture the effect of elapsed time on entity value evolution, and propose clustering methods that consider the time order of records. For linking location records, we distinguish between strong and weak evidence; for the former, we study core generation in the presence of erroneous data, and then leverage the discovered strong evidence to make the remaining decisions.
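As a rough illustration of the time-decay idea for temporal records, the sketch below discounts the penalty for a value mismatch as the time gap between two records grows. It is not the method of this proposal: the exponential form, the `half_life` parameter, the `year` field, and both function names are assumptions made purely for illustration.

```python
import math


def disagreement_weight(gap_years: float, half_life: float = 5.0) -> float:
    """Weight of a value mismatch; halves every `half_life` years (assumed form)."""
    return math.exp(-math.log(2) * gap_years / half_life)


def decayed_similarity(rec_a: dict, rec_b: dict, attrs: list,
                       half_life: float = 5.0) -> float:
    """Average per-attribute agreement, with mismatches penalized less
    when the two records are far apart in time."""
    gap = abs(rec_a["year"] - rec_b["year"])
    w = disagreement_weight(gap, half_life)
    score = 0.0
    for attr in attrs:
        if rec_a[attr] == rec_b[attr]:
            score += 1.0          # matching values count fully
        else:
            score += 1.0 - w      # mismatch penalty shrinks as the gap grows
    return score / len(attrs)


# Hypothetical records: same name, different affiliation.
r_2000 = {"name": "A. Smith", "affiliation": "Univ. X", "year": 2000}
r_2000b = {"name": "A. Smith", "affiliation": "Lab Y", "year": 2000}
r_2010 = {"name": "A. Smith", "affiliation": "Lab Y", "year": 2010}

same_year = decayed_similarity(r_2000, r_2000b, ["name", "affiliation"])
ten_years = decayed_similarity(r_2000, r_2010, ["name", "affiliation"])
```

Under these assumptions, a change of affiliation counts as strong evidence against a match when the records are from the same year, but only as weak evidence when they are ten years apart, since an entity's values may plausibly have evolved in the meantime.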
Keywords
interesting longitudinal data analysis, time stamp, time order, spatial data, geographically referenced data, particular time, real-world entity, dynamic world, time decay, erroneous data, record linkage, data integrity, two dimensions, data integration