CADA: phenotype-driven gene prioritization based on a case-enriched knowledge graph

NAR Genomics and Bioinformatics (2021)

Abstract
Many rare syndromes can be well described and delineated from other disorders by a combination of characteristic symptoms. These phenotypic features are best documented with terms of the Human Phenotype Ontology (HPO), which are also increasingly used in electronic health records (EHRs). Many algorithms that perform HPO-based gene prioritization have been developed; however, the performance of many such tools suffers from an over-representation of atypical cases in the medical literature. This is certainly the case when an algorithm cannot handle features that occur with reduced frequency in a disorder. With CADA, we built a knowledge graph based on both case annotations and disorder annotations. Using network representation learning, we achieve gene prioritization by link prediction. Our results suggest that CADA exhibits superior performance, particularly for patients who present with the pathognomonic findings of a disease. Additionally, information about the frequency of occurrence of a feature can readily be incorporated when available. Crucial to the design of our approach is the use of the growing amount of phenotype-genotype information that diagnostic labs deposit in databases such as ClinVar. This makes CADA an ideal reference tool for differential diagnostics in rare disorders that can also be updated regularly.
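The idea of ranking genes by predicted links in a graph merged from case and disorder annotations can be illustrated with a minimal sketch. It is not the CADA implementation: instead of network representation learning, it swaps in a plain edge-weight sum as the link score, and all term names, gene names, weights, and function names (`build_graph`, `prioritize`) are hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict


def build_graph(disorder_annotations, case_annotations):
    """Merge both annotation sources into weighted phenotype-gene edges.

    Each annotation is a (term, gene, weight) triple; the weight stands in
    for frequency-of-occurrence information (assumed, not from the paper).
    """
    graph = defaultdict(lambda: defaultdict(float))
    for term, gene, weight in disorder_annotations + case_annotations:
        graph[term][gene] += weight
    return graph


def prioritize(graph, patient_terms):
    """Rank genes by summed edge weight to the patient's phenotype terms.

    This simple heuristic replaces the learned-embedding link predictor.
    """
    scores = defaultdict(float)
    for term in patient_terms:
        for gene, weight in graph.get(term, {}).items():
            scores[gene] += weight
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data: placeholder terms and genes, not real annotations.
disorder_annotations = [
    ("TERM_1", "GENE_A", 1.0),
    ("TERM_2", "GENE_A", 0.5),  # feature with reduced frequency
    ("TERM_2", "GENE_B", 1.0),
]
case_annotations = [
    ("TERM_3", "GENE_B", 1.0),
    ("TERM_1", "GENE_A", 1.0),  # a submitted case reinforces this edge
]

graph = build_graph(disorder_annotations, case_annotations)
ranking = prioritize(graph, ["TERM_1", "TERM_2"])
print(ranking)
```

In this toy setup, the case annotation reinforces an existing disorder edge, which is one simple way that case-level data could shift a ranking; a learned embedding would additionally capture indirect paths through the graph that this heuristic ignores.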
Keywords
gene prioritization, knowledge graph, phenotype-driven, case-enriched