Prediction and Curation of Missing Biomedical Identifier Mappings with Biomappings

bioRxiv (Cold Spring Harbor Laboratory)(2022)

引用 0|浏览0
暂无评分
摘要
Abstract Motivation Biomedical identifier resources (ontologies, taxonomies, controlled vocabularies) commonly overlap in scope and contain equivalent entries under different identifiers. Maintaining mappings for these relationships is crucial for interoperability and the integration of data and knowledge. However, there are substantial gaps in available mappings motivating their semi-automated curation. Results Biomappings implements a curation cycle workflow for missing mappings which combines automated prediction with human-in-the-loop curation. It supports multiple prediction approaches and provides a web-based user interface for reviewing predicted mappings for correctness, combined with automated consistency checking. Predicted and curated mappings are made available in public, version-controlled resource files on GitHub. Biomappings currently makes available 8,560 curated mappings and 41,178 predicted ones, providing previously missing mappings between widely used resources covering small molecules, cell lines, diseases and other concepts. We demonstrate the value of Biomappings on case studies involving predicting and curating missing mappings among cancer cell lines as well as small molecules tested in clinical trials. We also present how previously missing mappings curated using Biomappings were contributed back to multiple widely used community ontologies. Availability The data and code are available under the CC0 and MIT licenses at https://github.com/biopragmatics/biomappings . Contact benjamin_gyori@hms.harvard.edu
更多
查看译文
关键词
biomedical identifier mappings
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要