Improving Academic Homepage Identification From The Web Using Neural Networks

COMPUTATIONAL SCIENCE - ICCS 2019, PT V(2019)

Abstract
Identifying academic homepages is a fundamental step in many tasks, such as expert finding, researcher profile extraction and homonymous researcher disambiguation. Many methods have been proposed to obtain researcher homepages using search engines. These methods extract only lexical-level features from each individual retrieval result, which is not enough to identify the homepage among retrieval results that are highly similar to one another. To address this problem, we first make improvements in three respects: (1) fine-grained features are designed to efficiently detect whether the researcher's name appears in the retrieval results; (2) correlations are established among multiple retrieval results for the same researcher; (3) semantic information contained in the URL, title and snippet of each retrieval result is obtained with recurrent neural networks. We then employ a joint neural network framework that makes comprehensive use of this information. Compared with previous work, our approach increases accuracy by a substantial 10%-11% on a real-world dataset provided by AMiner. Experimental results demonstrate the effectiveness of our method.
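As an illustration of point (1), the following is a minimal sketch of what fine-grained name-matching features over one field of a retrieval result (URL, title or snippet) might look like. It is not the authors' feature set: the function name `name_match_features`, the four feature names and the example inputs are all hypothetical, chosen only to show the idea of checking several name variants instead of a single exact match.

```python
import re


def name_match_features(name: str, text: str) -> dict:
    """Hypothetical fine-grained name features for one retrieval-result field.

    Assumption: the name is given as space-separated Latin tokens; the
    features below are illustrative and not taken from the paper.
    """
    # Lowercased alphabetic tokens of the researcher's name.
    tokens = [t for t in re.split(r"[^a-z]+", name.lower()) if t]
    if not tokens:
        return {"full_name": False, "first_name": False,
                "last_name": False, "initial_plus_last": False}

    lowered = text.lower()
    # Whole words in the field, for token-level matches.
    words = {t for t in re.split(r"[^a-z]+", lowered) if t}
    # Field with all non-letters removed, for matches such as "alovelace".
    compact = re.sub(r"[^a-z]", "", lowered)

    first, last = tokens[0], tokens[-1]
    return {
        "full_name": "".join(tokens) in compact,
        "first_name": first in words,
        "last_name": last in words,
        "initial_plus_last": (first[0] + last) in compact,
    }


# Example with made-up inputs: a title and a URL for the same researcher.
title_features = name_match_features("Ada Lovelace", "Ada Lovelace - Home Page")
url_features = name_match_features("Ada Lovelace", "https://example.edu/~alovelace/")
```

In this sketch the title matches on every feature, while the URL matches only on the initial-plus-last-name variant. Because the compact form ignores word boundaries, it can also produce false positives, which is one reason such features would be combined with the other signals described above rather than used alone.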
Keywords
Academic homepage identification, Retrieval results, Semantic information representation, Joint neural network