Multi-label topic classification for COVID-19 literature annotation: A BioBERT-based feature enhancement approach

CIBDA 2022; 3rd International Conference on Computer Information and Big Data Applications (2022)

Abstract
With the rapid, exponential growth of the biomedical literature, especially during the COVID-19 pandemic, there is an urgent need for effective technology that automatically manages and categorizes the massive volume of biomedical text. BERT has been widely applied and has shown strong performance in natural language processing. We therefore first take the improved pre-trained language models CovidBERT and BioBERT as the basis, and then build on the better-performing of the two by enhancing the semantic representation of the abstract with additional title information. Finally, we propose a novel feature enhancement method that exploits and integrates the distribution of label information effectively. Experimental results show that our model achieves an instance-based F1 score, precision, and recall of 93.94%, 93.5%, and 94.38%, respectively, on the multi-label topic classification task of Track 5 of BioCreative VII.
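To make the reported metric concrete, the following is a minimal sketch of how instance-based precision, recall, and F1 are commonly computed for multi-label classification: each score is computed per document from its gold and predicted label sets and then averaged over documents. This is the usual sample-averaged definition, not necessarily the exact evaluation script used for the track. The function name `instance_based_prf`, the zero-denominator handling, and the topic labels in the example are illustrative assumptions, not taken from the paper.

```python
from typing import List, Set, Tuple


def instance_based_prf(
    gold: List[Set[str]], pred: List[Set[str]]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1), each averaged over documents.

    Assumption: a score whose denominator is zero (for example, precision
    when no label is predicted) counts as 0.0 for that document.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equally long")

    p_sum = r_sum = f_sum = 0.0
    for g, p in zip(gold, pred):
        hits = len(g & p)  # labels that are both gold and predicted
        p_sum += hits / len(p) if p else 0.0
        r_sum += hits / len(g) if g else 0.0
        denom = len(g) + len(p)
        f_sum += 2 * hits / denom if denom else 0.0

    n = len(gold)
    return p_sum / n, r_sum / n, f_sum / n


# Hypothetical example: two documents with illustrative topic labels.
gold_labels = [{"Treatment", "Mechanism"}, {"Prevention"}]
pred_labels = [{"Treatment"}, {"Prevention", "Diagnosis"}]
precision, recall, f1 = instance_based_prf(gold_labels, pred_labels)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

Because F1 is averaged per document under this definition, it need not equal the harmonic mean of the averaged precision and recall.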