A natural language processing system for the efficient updating of highly curated pathophysiology mechanism knowledge graphs

Negin Sadat Babaiha, Hassan Elsayed, Bide Zhang, Abish Kaladharan, Priya Sethumadhavan, Bruce Schultz, Jürgen Klein, Bruno Freudensprung, Vanessa Lage-Rupprecht, Alpha Tom Kodamullil, Marc Jacobs, Stefan Geissler, Sumit Madan, Martin Hofmann-Apitius

Artificial Intelligence in the Life Sciences (2023)

Abstract
Biomedical knowledge graphs (KGs) have become crucial for describing biological findings in a structured manner. To keep pace with the constant flow of new knowledge, the information they contain must be regularly updated with the latest findings. Natural language processing (NLP) has created new possibilities for automating this upkeep by facilitating information extraction from free text. However, because annotated and labeled biomedical data are scarce, the development of fully autonomous information extraction systems remains a substantial scientific and technological hurdle. This study explores the methodologies best suited to support the automatic extraction of causal relationships from biomedical literature, with the aim of regularly and rapidly updating disease-specific pathophysiology mechanism KGs. Our proposed approach first searches and retrieves PubMed abstracts using the desired terms and keywords. The resulting extension corpora are then passed through the NLP pipeline for automatic information extraction. We then identify triples representing cause-and-effect relationships and encode this content using the Biological Expression Language (BEL). Finally, domain experts analyze the completeness, relevance, accuracy, and novelty of the extracted triples. In our test scenario, which focuses on the KG describing the phosphorylation of the Tau protein, our pipeline contributed novel data that was then used to update the KG, leading to the identification of six additional upstream regulators of Tau phosphorylation. We thereby demonstrate that the NLP-based workflow we created is capable of rapidly updating pathophysiology mechanism graphs. As a result, production-scale, semi-automated updating of pre-existing, curated mechanism graphs becomes possible.
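The triple-encoding and novelty-check steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Triple` class, the `to_bel` and `novel_triples` helpers, and the example statements are hypothetical and chosen only for illustration. The example triples are not taken from the paper's results. The sketch assumes that entities have already been normalized to BEL terms by an upstream NLP step.

```python
from dataclasses import dataclass

# Illustrative subset of BEL causal relation names (assumption: the
# pipeline may use a different or larger set of relations).
CAUSAL_RELATIONS = {
    "increases",
    "decreases",
    "directlyIncreases",
    "directlyDecreases",
}


@dataclass(frozen=True)
class Triple:
    """Hypothetical container for one extracted cause-and-effect triple."""
    subject: str   # BEL term for the cause, e.g. "p(HGNC:GSK3B)"
    relation: str  # one of CAUSAL_RELATIONS
    obj: str       # BEL term for the effect


def to_bel(triple: Triple) -> str:
    """Render a triple as a single BEL statement string."""
    if triple.relation not in CAUSAL_RELATIONS:
        raise ValueError(f"not a causal relation: {triple.relation}")
    return f"{triple.subject} {triple.relation} {triple.obj}"


def novel_triples(extracted, existing):
    """Return extracted triples that are not already in the KG, in order."""
    seen = set(existing)
    return [t for t in extracted if t not in seen]


# Illustrative data only; not results reported in the paper.
kg = [Triple("p(HGNC:GSK3B)", "increases", "p(HGNC:MAPT, pmod(Ph))")]
extracted = [
    Triple("p(HGNC:GSK3B)", "increases", "p(HGNC:MAPT, pmod(Ph))"),
    Triple("p(HGNC:CDK5)", "increases", "p(HGNC:MAPT, pmod(Ph))"),
]

for triple in novel_triples(extracted, kg):
    print(to_bel(triple))
```

In this sketch, only the triple that is absent from the existing graph is printed as a candidate for expert review; in the workflow described above, the completeness, relevance, accuracy, and novelty of each candidate are still judged by domain experts.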
Key words
Knowledge graphs, Relation extraction, Natural language processing, Biomedical text mining, Biological expression language (BEL), Human brain pharmacome (HBP)