Machine learning-based detection of insertions and deletions in the human genome

bioRxiv (2019)

Abstract
Insertions and deletions (indels) make a critical but under-studied contribution to human genetic variation. While indel calling has improved significantly, it lags dramatically in performance relative to single-nucleotide variant calling, something of particular concern for clinical genomics, where larger-scale disruption of the open reading frame can commonly cause disease. Here, we present a machine learning-based approach to the detection of indel breakpoints. Our novel approach improves sensitivity to larger variants by leveraging sequencing metrics and signatures of poor read alignment. We use new benchmark datasets and Sanger sequencing to compare our approach to current gold-standard indel callers, achieving unprecedented levels of precision and recall. We demonstrate the impact of Scotch's calling improvements by applying this tool to a cohort of patients with undiagnosed disease, generating plausible candidates in 21 out of 26 cases. We highlight the diagnosis of one patient with a 498-bp deletion in HNRNPA1 missed by traditional indel-detection tools.