Long-read whole genome analysis of human single cells

bioRxiv (2021)

Abstract
With long-read sequencing we have entered an era in which individual genomes are routinely assembled to near-completion and complex genetic variation can be resolved efficiently. Here we demonstrate that long reads can also be applied to study the genomic architecture of individual human cells. Clonally expanded CD8+ T-cells from a human donor were used as starting material for a droplet-based multiple displacement amplification (dMDA) method designed to ensure long molecule lengths and minimal amplification bias. Two single cells were sequenced on the PacBio Sequel II system, generating over 2.5 million reads and ~20 Gb of HiFi data (>QV20) per cell and achieving up to 40% genome coverage. These data allowed single nucleotide variant (SNV) detection, including in genomic regions inaccessible to short reads. Over 1000 high-confidence structural variants (SVs) per cell were discovered in the PacBio data, which is four times more than the number of SVs detected in Illumina dMDA data from clonally related cells. In addition, several putative clone-specific somatic SV events were identified. Single-cell de novo assembly resulted in assembly sizes of 454-598 Mb and contig N50 values of 35-42 kb. In the best single-cell assembly, 1762 (12.8%) of the expected gene models were found to be complete. The de novo constructed mitochondrial genomes were 100% identical between the two single cells subjected to PacBio sequencing, although mitochondrial heteroplasmy was also observed. In summary, the work presented here demonstrates the utility of long-read sequencing for understanding the extent and distribution of complex genetic variation at the single-cell level.

Competing Interest Statement: The authors have declared no competing interest.
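The abstract reports assembly contiguity as contig N50 (35-42 kb). For readers unfamiliar with the metric, here is a minimal sketch of how N50 is conventionally computed: the length of the contig at which the cumulative sum of contigs, sorted from longest to shortest, first reaches half of the total assembly size. The function name `contig_n50` and the toy contig lengths are illustrative assumptions, not taken from the paper, and this is not the authors' pipeline.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Hypothetical helper for illustration only; not the authors' code.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk contigs from longest to shortest, accumulating their lengths.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to at least half
        # of the total assembly size defines the N50.
        if running >= half_total:
            return length


# Toy contig lengths (made-up values, not data from the study).
toy_lengths = [80_000, 60_000, 40_000, 30_000, 20_000, 10_000]
print(contig_n50(toy_lengths))
```

With these toy values the total is 240,000 bp, and the two longest contigs together exceed half of it, so the sketch returns 60,000.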