DeepPPPred: An Ensemble of BERT, CNN, and RNN for Classifying Co-mentions of Proteins and Phenotypes
biorxiv(2020)
摘要
The biomedical literature provides an extensive source of information in the form of unstructured text. One of the most important types of information hidden in biomedical literature is the relationships between human proteins and their phenotypes, which, due to the exponential growth of publications, can remain hidden. This provides a range of opportunities for the development of computational methods to extract the biomedical relationships from the unstructured text. In our previous work, we developed a supervised machine learning approach, called PPPred, for classifying the validity of a given sentence-level human protein-phenotype co-mention. In this work, we propose DeepPPPred, an ensemble classifier composed of PPPred and three deep neural network models: RNN, CNN, and BERT. Using an expanded gold-standard co-mention dataset, we demonstrate that the proposed ensemble method significantly outperforms its constituent components and provides a new state-of-the-art performance on classifying the co-mentions of human proteins and phenotype terms.
### Competing Interest Statement
The authors have declared no competing interest.
更多查看译文
关键词
bert,proteins,phenotypes,cnn,rnn,co-mentions
AI 理解论文
溯源树
样例
![](https://originalfileserver.aminer.cn/sys/aminer/pubs/mrt_preview.jpeg)
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要