Automatic Keyphrase Extraction by Bridging Vocabulary Gap.

CoNLL '11: Proceedings of the Fifteenth Conference on Computational Natural Language Learning (2011)

Abstract
Keyphrase extraction aims to select a set of terms from a document as a short summary of that document. Most methods extract keyphrases according to their statistical properties in the given document. Appropriate keyphrases, however, are not always statistically significant, and some do not appear in the given document at all. This creates a large vocabulary gap between a document and its keyphrases. In this paper, we treat a document and its keyphrases as descriptions of the same object written in two different languages. By casting keyphrase extraction as translation from the language of documents to the language of keyphrases, we use word alignment models from statistical machine translation to learn translation probabilities between the words in documents and the words in keyphrases. Given a new document, we then suggest keyphrases according to the learned translation model. The suggested keyphrases are not necessarily statistically frequent in the document, which indicates that our method is more flexible and reliable. Experiments on news articles demonstrate that our method outperforms existing unsupervised methods in precision, recall, and F-measure.
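The translation view described in the abstract can be illustrated with a small sketch. The abstract does not say which word alignment model is used, so the code below assumes IBM Model 1 trained with EM, leaves out the NULL word, and ranks single candidate words rather than full keyphrases. The function names (`train_ibm_model1`, `suggest_keywords`), the scoring formula, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(keyword | doc_word) with EM (IBM Model 1, no NULL word).

    `pairs` is a list of (document_words, keyphrase_words) tuples.
    """
    target_vocab = {k for _, tgt in pairs for k in tgt}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)  # keyed by (keyword, doc_word)

    for _ in range(iterations):
        count = defaultdict(float)  # expected alignment counts
        total = defaultdict(float)  # per-doc-word normaliser
        for src, tgt in pairs:
            for k in tgt:
                # E-step: spread one unit of mass for k over the document words.
                norm = sum(t[(k, d)] for d in src)
                for d in src:
                    frac = t[(k, d)] / norm
                    count[(k, d)] += frac
                    total[d] += frac
        # M-step: renormalise so that sum over k of t(k | d) equals 1.
        t = defaultdict(float, {(k, d): c / total[d] for (k, d), c in count.items()})
    return t


def suggest_keywords(doc_words, t, top_n=3):
    """Rank keywords by sum over doc words of P(d | doc) * t(k | d)."""
    candidates = {k for k, _ in t}
    tf = Counter(doc_words)
    n = len(doc_words)
    scores = {
        k: sum(c / n * t.get((k, d), 0.0) for d, c in tf.items())
        for k in candidates
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Toy parallel data (invented): document words paired with keyphrase words.
pairs = [
    (["stocks", "fell", "as", "investors", "sold", "shares"], ["stock", "market"]),
    (["shares", "dropped", "on", "wall", "street"], ["stock", "market"]),
    (["the", "team", "won", "the", "final", "match"], ["football"]),
    (["players", "scored", "in", "the", "match"], ["football"]),
]
t = train_ibm_model1(pairs)
suggestions = suggest_keywords(["investors", "sold", "shares", "today"], t, top_n=2)
print(suggestions)
```

In this toy run the new document contains neither "stock" nor "market", yet both can be suggested because words such as "shares" translate to them. That is the vocabulary-gap effect the abstract describes, shown here only under the assumptions stated above.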
Keywords
keyphrase extraction, appropriate keyphrases, new document, suggested keyphrases, statistical machine translation, translation model, translation probability, different language, statistical property, unsupervised method, automatic keyphrase extraction, vocabulary gap