Antibody Watch: Text Mining Antibody Specificity From The Literature

PLOS Computational Biology (2021)

Abstract
Antibodies are widely used reagents to test for expression of proteins and other antigens. However, they do not always produce reliable results: an antibody that does not bind specifically to the target protein its provider designed it for can lead to unreliable research findings. While many proposals have been developed to deal with the problem of antibody specificity, it is still challenging to cover the millions of antibodies that are available to researchers. In this study, we investigate the feasibility of automatically generating alerts to users of problematic antibodies by extracting statements about antibody specificity reported in the literature. The extracted alerts can be used to construct an "Antibody Watch" knowledge base containing supporting statements of problematic antibodies. We developed a deep neural network system and tested its performance with a corpus of more than two thousand articles that reported uses of antibodies. We divided the problem into two tasks. Given an input article, the first task is to identify snippets about antibody specificity and classify whether each snippet reports that an antibody exhibits non-specificity and is thus problematic. The second task is to link each of these snippets to one or more antibodies mentioned in the snippet. The experimental evaluation shows that our system can accurately perform the classification task with a weighted F1-score of 0.925 and the linking task with an accuracy of 0.962, and achieves a weighted F1-score of 0.914 when the two are combined to complete the joint task. We leveraged Research Resource Identifiers (RRIDs) to precisely identify antibodies linked to the extracted specificity snippets. The result shows that it is feasible to construct a reliable knowledge base about problematic antibodies by text mining.
Author summary
Antibodies are widely used reagents to test for the expression of proteins. However, antibodies are also a known source of reproducibility problems in biomedicine, as specificity and other issues can complicate their use. Information about how antibodies perform for specific applications is scattered across the biomedical literature and multiple websites. To alert scientists to reported antibody issues, we develop text mining algorithms that can identify specificity issues reported in the literature. We developed a deep neural network algorithm and performed a feasibility study on 2,223 papers. We leveraged Research Resource Identifiers (RRIDs), unique identifiers for antibodies and other biomedical resources, to match extracted specificity issues with particular antibodies. The results show that our system, called "Antibody Watch," can accurately perform specificity issue identification and RRID association with a weighted F-score over 0.914. From our test corpus, we identified 37 antibodies with 68 nonspecific issue statements. With Antibody Watch, for example, if one were looking for an antibody targeting beta-Amyloid 1-16, from 74 antibodies at dkNET Resource Reports (on 10/2/20), one would be alerted that "some non-specific bands were detected at 55 kDa in both WT and APP/PS1 mice with the 6E10 antibody..."
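The results above are reported as weighted F1-scores. As a reading aid, here is a minimal sketch of how a support-weighted F1 is commonly computed: a per-class F1 is averaged with weights proportional to how often each class appears among the true labels. This is not the authors' evaluation code. The function name `weighted_f1` and the class labels "nonspecific" and "specific" are illustrative assumptions, and the toy data is invented for the example.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores.

    Hypothetical helper for illustration; classes are taken from y_true,
    and each class is weighted by its share of the true labels.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0

    for label, n_true in support.items():
        # True positives: predicted as `label` and actually `label`.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)

        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        # Weight the class F1 by its share of the true labels.
        score += f1 * n_true / total

    return score


if __name__ == "__main__":
    # Toy labels (assumed, not from the paper): one nonspecific snippet is missed.
    truth = ["nonspecific", "nonspecific", "specific", "specific", "specific", "specific"]
    preds = ["nonspecific", "specific", "specific", "specific", "specific", "specific"]
    print(round(weighted_f1(truth, preds), 4))
```

Because the weights follow class frequency, a rare class such as non-specificity reports contributes less to the total than the majority class, which is worth keeping in mind when reading a single weighted figure.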