Learning The Distinctive Pattern Space Features For Relation Extraction

CHINESE COMPUTATIONAL LINGUISTICS AND NATURAL LANGUAGE PROCESSING BASED ON NATURALLY ANNOTATED BIG DATA, CCL 2014 (2014)

Abstract
Recently, Distant Supervision (DS) has been used to automatically generate training data for relation extraction. Because information on the web is highly redundant, multiple sentences corresponding to the same fact can be obtained. In this paper, we propose pattern space features to leverage this data redundancy. Each dimension of the pattern space feature vector corresponds to a basis pattern, and its value is the similarity of an entity pair's patterns to that basis pattern. To obtain distinctive basis patterns, a pattern selection procedure is adopted to filter out noisy patterns. In addition, since overly specific patterns increase the number of basis patterns, we propose a novel pattern extraction method that avoids extracting overly specific patterns while maintaining pattern distinctiveness. To demonstrate the effectiveness of the proposed features, we conduct experiments on a real-world data set with 6 different relation types. Experimental results show that pattern space features significantly outperform state-of-the-art methods.
Keywords
Feature Vector, Semantic Relation, Test Instance, Basis Pattern, Named Entity Recognition
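
The pattern space feature vector described in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used or how the several patterns of one entity pair are combined, so this example assumes token-level Jaccard similarity and takes the maximum over the pair's patterns. The function names (`jaccard`, `pattern_space_vector`) and the sample patterns are hypothetical and are not taken from the paper.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (an assumed stand-in, not the paper's measure)."""
    ta, tb = set(a.split()), set(b.split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def pattern_space_vector(pair_patterns: List[str],
                         basis_patterns: List[str]) -> List[float]:
    """Build one feature per basis pattern.

    Each value is the best similarity between that basis pattern and any
    pattern observed for the entity pair (max aggregation is an assumption).
    An entity pair with no observed patterns yields an all-zero vector.
    """
    return [
        max((jaccard(p, basis) for p in pair_patterns), default=0.0)
        for basis in basis_patterns
    ]


# Hypothetical basis patterns, e.g. those kept after pattern selection.
basis = ["X was born in Y", "X is the capital of Y", "X works for Y"]

# Hypothetical patterns from several sentences mentioning one entity pair.
pair = ["X was born in Y", "X , born in Y"]

vec = pattern_space_vector(pair, basis)
print(vec)
```

Because the vector has one dimension per basis pattern, its length grows with the number of basis patterns, which is why the abstract stresses filtering noisy patterns and avoiding overly specific ones.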