Accelerating annotation of articles via automated approaches: evaluation of the neXtA5 curation-support tool by neXtProt.

Database: The Journal of Biological Databases and Curation (2018)

Abstract
The development of efficient text-mining tools promises to boost the curation workflow by significantly reducing the time needed to process the literature into biological databases. We have developed a curation support tool, neXtA5, that provides a search engine coupled with an annotation system directly integrated into a biocuration workflow. neXtA5 assists curation with modules optimized for the various curation tasks: document triage, entity recognition and information extraction. Here, we describe the evaluation of neXtA5 by expert curators. We first assessed the annotations of two independent curators to provide a baseline for comparison. To evaluate the performance of neXtA5, we submitted requests and compared the neXtA5 results with the manual curation. The analysis focuses on the usability of neXtA5 to support the curation of two types of data: biological processes (BPs) and diseases (Ds). We evaluated the relevance of the papers proposed as well as the recall and precision of the suggested annotations. The evaluation of document triage precision showed that both curators agreed with neXtA5 for 67% (BP) and 63% (D) of abstracts, while the curators agreed with each other on accepting or rejecting an abstract approximately 80% of the time. Hence, the precision of the triage system is satisfactory. For concept extraction, curators approved 35% (BP) and 25% (D) of the neXtA5 annotations. Conversely, neXtA5 successfully annotated up to 36% (BP) and 68% (D) of the terms identified by curators. The user feedback obtained in these tests highlighted the need for improvement in the ranking function of neXtA5 annotations. Therefore, we transformed the information extraction component into an annotation ranking system. This improvement results in a top precision (precision at first rank) of 59% (D) and 63% (BP). These results suggest that, when considering only the first extracted entity, the current system achieves a precision comparable with that of expert biocurators.
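
The measures named in the abstract (precision and recall of suggested annotations, precision at first rank, and agreement between two curators on triage decisions) can be illustrated with a minimal Python sketch. This is not the neXtA5 evaluation code, which the abstract does not describe; the function names, the data layout and the identifiers such as `doc1` and `term_a` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple


def precision_recall(suggested: Set[str], curated: Set[str]) -> Tuple[float, float]:
    """Precision and recall of tool suggestions against curator annotations."""
    true_positives = len(suggested & curated)
    precision = true_positives / len(suggested) if suggested else 0.0
    recall = true_positives / len(curated) if curated else 0.0
    return precision, recall


def precision_at_first_rank(ranked: Dict[str, List[str]],
                            curated: Dict[str, Set[str]]) -> float:
    """Share of documents whose top-ranked suggestion matches a curator term.

    Documents with no suggestion at all are skipped (an assumption of this sketch).
    """
    docs = [doc for doc, terms in ranked.items() if terms]
    if not docs:
        return 0.0
    hits = sum(1 for doc in docs if ranked[doc][0] in curated.get(doc, set()))
    return hits / len(docs)


def triage_agreement(first: Dict[str, bool], second: Dict[str, bool]) -> float:
    """Share of commonly judged abstracts on which two accept/reject decisions match."""
    shared = first.keys() & second.keys()
    if not shared:
        return 0.0
    return sum(1 for doc in shared if first[doc] == second[doc]) / len(shared)


# Hypothetical toy data, not taken from the study.
toy_suggested = {"term_a", "term_b", "term_c", "term_d"}
toy_curated = {"term_a", "term_e"}
toy_ranked = {"doc1": ["term_a", "term_b"], "doc2": ["term_c"], "doc3": []}
toy_gold = {"doc1": {"term_a"}, "doc2": {"term_d"}}
curator_1 = {"doc1": True, "doc2": False, "doc3": True, "doc4": True, "doc5": False}
curator_2 = {"doc1": True, "doc2": False, "doc3": False, "doc4": True, "doc5": False}

if __name__ == "__main__":
    print(precision_recall(toy_suggested, toy_curated))
    print(precision_at_first_rank(toy_ranked, toy_gold))
    print(triage_agreement(curator_1, curator_2))
```

Precision at first rank considers only the top suggestion per document, which is why a ranking improvement can raise it without changing the overall set of extracted annotations.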