Leveraging Multi-source knowledge for Chinese clinical named entity recognition via relational graph convolutional network.

Journal of biomedical informatics(2022)

引用 13|浏览44
暂无评分
摘要
OBJECTIVE:External knowledge, such as lexicon of words in Chinese and domain knowledge graph (KG) of concepts, has been recently adopted to improve the performance of machine learning methods for named entity recognition (NER) as it can provide additional information beyond context. However, most existing studies only consider knowledge from one source (i.e., either lexicon or knowledge graph) in different ways and consider lexicon words or KG concepts independently with their boundaries. In this paper, we focus on leveraging multi-source knowledge in a unified manner where lexicon words or KG concepts are well combined with their boundaries for Chinese Clinical NER (CNER). MATERIAL AND METHODS:We propose a novel method based on relational graph convolutional network (RGCN), called MKRGCN, to utilize multi-source knowledge in a unified manner for CNER. For any sentence, a relational graph based on words or concepts in each knowledge source is constructed, where lexicon words or KG concepts appearing in the sentence are linked to the containing tokens with the boundary information of the lexicon words or KG concepts. RGCN is used to model all relational graphs constructed from multi-source knowledge, and the representations of tokens from multi-source knowledge are integrated into the context representations of tokens via an attention mechanism. Based on the knowledge-enhanced representations of tokens, we deploy a conditional random field (CRF) layer for named entity label prediction. In this study, a lexicon of words and a medical knowledge graph are used as knowledge sources for Chinese CNER. RESULTS:Our proposed method achieves the best performance on CCKS2017 and CCKS2018 in Chinese with F1-scores of 91.88% and 89.91%, respectively, significantly outperforming existing methods. The extended experiments on NCBI-Disease and BC2GM in English also prove the effectiveness of our method when only considering one knowledge source via RGCN. CONCLUSION:The MKRGCN model can integrate knowledge from the external lexicon and knowledge graph effectively for Chinese CNER and has the potential to be applied to English NER.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要