Learning The Distinctive Pattern Space Features For Relation Extraction
CHINESE COMPUTATIONAL LINGUISTICS AND NATURAL LANGUAGE PROCESSING BASED ON NATURALLY ANNOTATED BIG DATA, CCL 2014(2014)
摘要
Recently, Distant Supervision (DS) is used to automatically generate training data for relation extraction. As the vast redundancy of information on the web, multiple sentences corresponding to a fact may be achieved. In this paper, we propose pattern space features to leverage data redundancy. Each dimension of pattern space feature vector corresponds to a basis pattern and the vector value is the similarity of entity pairs' patterns to basis patterns. To achieve distinctive basis patterns, a pattern selection procedure is adopted to filter out noisy patterns. In addition, since too specific patterns will increase the number of basis patterns, we propose a novel pattern extraction method that can avoid extracting too specific patterns while maintaining pattern distinctiveness. To demonstrate the effectiveness of the proposed features, we conduct the experiments on a real world data set with 6 different relation types. Experimental results demonstrate that pattern space features significantly outperform State-of-the-art.
更多查看译文
关键词
Feature Vector, Semantic Relation, Test Instance, Basis Pattern, Name Entity Recognition
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要