qc3C: reference-free quality control for Hi-C sequencing data

crossref(2021)

Abstract
Hi-C is a sample preparation method that enables high-throughput sequencing to capture genome-wide spatial interactions between DNA molecules. The technique has been successfully applied to challenging problems such as 3D structural analysis of chromatin, scaffolding of large genome assemblies and, more recently, the accurate resolution of metagenome-assembled genomes (MAGs). Despite continued refinements, however, Hi-C library preparation remains a complex laboratory protocol, and diligent quality management is recommended to avoid costly failure. Current wet-lab protocols for Hi-C library QC provide only a crude assay, while commonly used sequence-based QC methods demand a reference genome, the quality of which can skew results. We propose a new, reference-free approach for Hi-C library quality assessment that requires only a modest amount of sequencing data. The algorithm builds upon the observation that proximity ligation events are likely to create k-mers that would not naturally occur in the sample. Our software tool, qc3C, is to our knowledge the first to implement reference-free Hi-C QC, enabling Hi-C to be more easily applied to non-model organisms and environmental samples; it also provides reference-based QC. We characterise the accuracy of the new algorithm on simulated and real datasets and compare it to reference-based methods.
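The observation behind the reference-free approach can be illustrated with a toy sketch. This is not the qc3C algorithm, whose details the abstract does not give; it only shows why a proximity ligation event leaves a recognisable sequence in a read. It assumes a palindromic restriction site that leaves a 5' overhang, followed by fill-in and blunt-end ligation, so that the junction is a partial duplication of the site (for example, GATCGATC for DpnII). The function names and the example reads are hypothetical.

```python
from typing import Iterable


def ligation_junction(site: str, cut: int) -> str:
    """Junction left by digestion, fill-in and blunt-end ligation.

    Assumption: `site` is a palindromic recognition sequence and `cut` is
    the top-strand cut offset of an enzyme leaving a 5' overhang
    (e.g. DpnII: site "GATC", cut 0; HindIII: site "AAGCTT", cut 1).
    """
    # After fill-in, the left fragment ends with all but the last `cut`
    # bases of the site, and the right fragment starts at the cut offset.
    return site[: len(site) - cut] + site[cut:]


def junction_read_fraction(reads: Iterable[str], junction: str) -> float:
    """Fraction of reads containing the junction sequence at least once."""
    reads = list(reads)
    if not reads:
        return 0.0
    hits = sum(1 for read in reads if junction in read.upper())
    return hits / len(reads)


if __name__ == "__main__":
    # Hypothetical reads, for illustration only.
    demo_reads = [
        "ACGTGATCGATCTTGA",  # contains the DpnII junction
        "ACGTACGTACGTACGT",  # no restriction site
        "TTTTGATCAAAACCCC",  # an uncut or religated site, not a junction
    ]
    junction = ligation_junction("GATC", 0)
    print(junction, junction_read_fraction(demo_reads, junction))
```

A plain junction count like this overestimates ligation events wherever the junction sequence occurs naturally in the genome, which is the kind of ambiguity a k-mer-based statistical treatment would need to account for.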