Deep context of citations using machine-learning models in scholarly full-text articles

Scientometrics(2018)

引用 52|浏览17
暂无评分
摘要
Information retrieval systems for scholarly literature rely heavily not only on text matching but on semantic- and context-based features. Readers nowadays are deeply interested in how important an article is, its purpose and how influential it is in follow-up research work. Numerous techniques to tap the power of machine learning and artificial intelligence have been developed to enhance retrieval of the most influential scientific literature. In this paper, we compare and improve on four existing state-of-the-art techniques designed to identify influential citations. We consider 450 citations from the Association for Computational Linguistics corpus, classified by experts as either important or unimportant, and further extract 64 features based on the methodology of four state-of-the-art techniques. We apply the Extra-Trees classifier to select 29 best features and apply the Random Forest and Support Vector Machine classifiers to all selected techniques. Using the Random Forest classifier, our supervised model improves on the state-of-the-art method by 11.25%, with 89% Precision-Recall area under the curve. Finally, we present our deep-learning model, the Long Short-Term Memory network, that uses all 64 features to distinguish important and unimportant citations with 92.57% accuracy.
更多
查看译文
关键词
Citation-context analysis,Deep learning,Influential citations,Machine learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要