When The How Outweighs The What: The Pivotal Importance Of Context

2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM) (2019)

Abstract
A growing body of knowledge about biological mechanisms and the interactions among biological components is contained in the peer-reviewed scientific literature. To leverage this knowledge in the development of predictive models, one must first extract these relationships from the text. However, the context in which an interaction was reported is critical to ensuring that it is used in a manner consistent with the model's intended application. Here we assess the applicability of two generic automated methods for leveraging a broader contextual structure in the more specific domain of a biological experiment, using only the paper's title and abstract. In an example use case, a Support Vector Machine (SVM) and two variants of the widely used Bidirectional Encoder Representations from Transformers (BERT) neural network model serve to distinguish mouse from human subject experiments in a corpus of over 12,000 papers documenting mechanistic interactions in a regulatory model of mucosal immune signaling. BERT and the domain-specific BioBERT yielded essentially equivalent classification accuracy, with both neural network models performing only marginally better than the SVM. Words occurring frequently in abstracts were largely non-specific, whereas words unique to each class were used in 4% or less of the abstracts. These high-specificity words were used in very similar contexts that separated mouse and human study abstracts on the basis of study design and experimental procedure rather than species or basic biological markers.
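The word-specificity observation above can be illustrated with a minimal sketch. It is not the authors' pipeline: the tokenizer, the function names (`tokenize`, `document_frequency`, `class_exclusive_words`), and the four toy abstracts are all assumptions made for illustration. The sketch finds words that occur in only one class and reports the fraction of that class's abstracts in which each appears.

```python
import re
from collections import Counter


def tokenize(text):
    """Return the set of lowercase alphabetic tokens in a text (hypothetical tokenizer)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def document_frequency(abstracts):
    """Map each word to the fraction of abstracts that contain it at least once."""
    counts = Counter()
    for text in abstracts:
        counts.update(tokenize(text))  # a set, so each abstract counts a word once
    total = len(abstracts)
    return {word: count / total for word, count in counts.items()}


def class_exclusive_words(mouse_abstracts, human_abstracts):
    """Return words seen in only one class, with their within-class document frequency."""
    mouse_df = document_frequency(mouse_abstracts)
    human_df = document_frequency(human_abstracts)
    mouse_only = {w: f for w, f in mouse_df.items() if w not in human_df}
    human_only = {w: f for w, f in human_df.items() if w not in mouse_df}
    return mouse_only, human_only


# Toy abstracts invented for illustration; they are not drawn from the paper's corpus.
mouse = [
    "Knockout mice were injected with ligand",
    "Cells were isolated from mice after infection",
]
human = [
    "Patients were enrolled in a cohort study",
    "Cells were isolated from patients after biopsy",
]

mouse_only, human_only = class_exclusive_words(mouse, human)
```

On a real corpus, filtering these dictionaries by a frequency threshold would show how rarely class-exclusive words occur, which is the kind of measurement behind the "4% or less" figure reported above.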
Keywords
natural language processing, document classification, contextual embedding, immune signaling, causal modelling