A semi-supervised method to learn and construct taxonomies using the web

EMNLP (2010)

Citations: 216 | Views: 29
Abstract
Although many algorithms have been developed to harvest lexical resources, few organize the mined terms into taxonomies. We propose (1) a semi-supervised algorithm that uses a root concept, a basic-level concept, and recursive surface patterns to automatically learn, from the Web, hyponym-hypernym pairs subordinated to the root; (2) a Web-based concept positioning procedure to validate the is-a relations of the learned pairs; and (3) a graph algorithm that derives from scratch the integrated taxonomy structure of all the terms. We evaluate the taxonomization power of our method by reconstructing parts of the WordNet taxonomy. Comparing the results with WordNet, we find that the algorithm misses some concepts and links but also discovers many additional ones that WordNet lacks. Experiments show that, starting from scratch, the algorithm can reconstruct 62% of the WordNet taxonomy for the regions tested.
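
The abstract does not say which graph algorithm is used in step (3), so the following is only an illustrative sketch of one plausible approach, not the authors' method. It assumes the validated is-a pairs form an acyclic graph and attaches each term to the parent that lies on its longest path from the root, which discards redundant shortcut links such as a direct lion-animal edge. The function name `build_taxonomy` and the toy pairs are hypothetical.

```python
from collections import defaultdict


def build_taxonomy(pairs, root):
    """Map each term to the parent on its longest path from `root`.

    `pairs` is an iterable of (hyponym, hypernym) tuples.
    Assumption: the is-a graph reachable from `root` has no cycles.
    """
    # Directed edges point from hypernym (general) to hyponym (specific).
    children = defaultdict(set)
    for hyponym, hypernym in pairs:
        children[hypernym].add(hyponym)

    # Reverse DFS post-order gives a topological order of reachable terms.
    order, visited = [], set()

    def visit(node):
        if node in visited:
            return
        visited.add(node)
        for child in children[node]:
            visit(child)
        order.append(node)

    visit(root)
    order.reverse()

    # Relax edges in topological order, keeping the deepest placement.
    depth = {root: 0}
    parent = {}
    for node in order:
        for child in children[node]:
            if depth[node] + 1 > depth.get(child, -1):
                depth[child] = depth[node] + 1
                parent[child] = node
    return parent


# Toy input: some harvested pairs skip intermediate levels.
pairs = [
    ("lion", "animal"),
    ("lion", "mammal"),
    ("mammal", "animal"),
    ("lion", "feline"),
    ("feline", "mammal"),
]
taxonomy = build_taxonomy(pairs, "animal")
```

On this toy input the sketch places lion under feline, feline under mammal, and mammal under animal. Preferring the longest path is a design choice that favors the most specific available hypernym; a real system would also need to handle cycles and noisy pairs.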
Keywords
integrated taxonomy structure, is-a relation, harvest lexical resource, semi-supervised method, basic level concept, concept positioning procedure, semi-supervised algorithm, web hyponym-hypernym pair, wordnet taxonomy, root concept, graph algorithm