Streaming cross document entity coreference resolution

COLING (Posters), 2010

Abstract
Previous research in cross-document entity coreference has generally been restricted to the offline scenario, where the set of documents is provided in advance. As a consequence, the dominant approach is based on greedy agglomerative clustering techniques that rely on pairwise vector comparisons and thus require O(n^2) space and time. In this paper we explore identifying coreferent entity mentions across documents in high-volume streaming text, including methods for utilizing orthographic and contextual information. We test our methods on several corpora to quantitatively measure both the efficacy and the scalability of our streaming approach. We show that our approach scales to data at least an order of magnitude larger than that handled by previously reported methods.
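The contrast the abstract draws, between all-pairs comparison and a single pass over a stream, can be illustrated with a minimal sketch. This is not the paper's algorithm: the class `StreamingClusterer`, the helpers `name_match` and `cosine`, the shared-token name test, the bag-of-words context vectors, and the 0.3 threshold are all assumptions chosen for illustration. Each incoming mention is compared only against existing clusters, never against every earlier mention.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def name_match(a: str, b: str) -> bool:
    """Crude orthographic test (assumption): the names share a token."""
    return bool(set(a.lower().split()) & set(b.lower().split()))


class StreamingClusterer:
    """Hypothetical one-pass clusterer; each mention is seen exactly once."""

    def __init__(self, threshold: float = 0.3):
        self.threshold = threshold  # illustrative value, not from the paper
        self.clusters = []  # each cluster: {"names": set, "context": Counter}

    def add(self, name: str, context: str) -> int:
        """Assign one mention to a cluster and return the cluster id."""
        vec = Counter(context.lower().split())
        best_id, best_sim = None, self.threshold
        for cid, cluster in enumerate(self.clusters):
            # Orthographic filter: skip clusters with no compatible name.
            if not any(name_match(name, n) for n in cluster["names"]):
                continue
            # Contextual score against the cluster's aggregated context.
            sim = cosine(vec, cluster["context"])
            if sim >= best_sim:
                best_id, best_sim = cid, sim
        if best_id is None:
            # No cluster is similar enough, so start a new entity.
            self.clusters.append({"names": {name}, "context": vec})
            return len(self.clusters) - 1
        self.clusters[best_id]["names"].add(name)
        self.clusters[best_id]["context"].update(vec)  # adds token counts
        return best_id


clusterer = StreamingClusterer()
first = clusterer.add("Michael Jordan", "basketball chicago bulls nba champion")
second = clusterer.add("Jordan", "nba bulls basketball star")
third = clusterer.add("Michael Jordan", "machine learning professor berkeley statistics")
```

In this sketch the second mention joins the first cluster, while the third starts a new one despite the identical name, because its context shares no words with the basketball cluster. Memory grows with the number of clusters rather than with the number of mention pairs. The linear scan over clusters is kept only for brevity; a real system would presumably index clusters by name to avoid it.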
Keywords
offline scenario, previous reported method, dominant approach, approach scale, magnitude larger data, cross-document entity coreference, greedy agglomerative, previous research, contextual information, coreferent entity