A Supervised Learning to Rank Approach for Dependency Based Concept Extraction and Repository Based Boosting for Domain Text Indexing.

U. K. Naadan,T. V. Geetha,U. Kanimozhi,D. Manjula, R. Viswapriya, C. Karthik

Lecture Notes in Computer Science(2018)

引用 1|浏览4
暂无评分
摘要
In conventional information retrieval systems, keywords extracted from documents are indexed and used for retrieval. Since same information can be represented by different keywords, there is hindrance in extracting relevant documents. Concept based indexing and retrieval which semantically identifies similar documents overcomes this problem by mapping the document phrases to a domain repository. In this paper, the problem of extracting and ranking concepts i.e. key phrases, from domain oriented text is explored. This paper ranks concepts (key phrases) of a document based not only on statistical and cue phrases but also based on the dependency relations in which the candidate concept occurs. For each candidate a vector is formed with the phrase weight and the dependency relations. The features used to score the phrases in the vectors, for re-ranking and as features to weigh the vector corresponding to the candidate are the cue features (presence in title, abstract), C-value in case of multi-words, frequency of occurrence and the type of dependency relation. The ranking process utilizes RankingSVM to rank the candidate concepts based on the feature vectors. In addition, to make the ranking domain sensitive and to determine the domain relevance of the candidate concepts they are fully or partially matched with the domain repository. Based on the depth of the concept and the presence of parent and siblings, the domain relevant concepts are boosted up the order. The results indicate that the use of dependency based context vector and domain repository provides substantial enhancement in the key phrase extraction task compared with other methods.
更多
查看译文
关键词
Dependency-based key phrase extraction,Repository based concept mapping,Learning to Rank,Domain text indexing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要