Resolving the unsolved: Comprehensive assessment of tandem repeats at scale

bioRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract
Tandem repeat (TR) variation is associated with gene expression changes and over 50 rare monogenic diseases. Recent advances in sequencing have enabled accurate, long reads that can characterize the full-length sequence and methylation profile of TRs. However, despite these advances in sequencing technology, computational methods to fully profile tandem repeats across the genome do not exist. To address this gap, we introduce tools for tandem repeat genotyping (TRGT) and visualization, along with an accompanying TR database. TRGT accurately resolves the length and sequence composition of TR regions in the human genome. Assessing 937,122 TRs, TRGT showed a Mendelian concordance of 99.56%, allowing a single repeat unit difference. In six samples with known repeat expansions, TRGT detected all repeat expansions while also identifying methylation signals and mosaicism and providing finer resolution of repeat length. Additionally, we release a database with allele sequences and methylation levels for 937,122 TRs across 100 genomes.

Competing Interest Statement
Egor Dolzhenko, Guilherme De Sena Brandine, Tom Mokveld, William J. Rowell, Caitlin Karniski, Zev Kronenberg, Aaron Wenger, and Michael A Eberle are employees and shareholders of Pacific Biosciences. Fritz J. Sedlazeck received research support from Illumina, Pacific Biosciences, Nanopore, and Genentech.
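The Mendelian concordance figure quoted in the abstract can be illustrated with a small sketch. This is not TRGT's code or the authors' evaluation procedure. It assumes that each genotype is a pair of allele lengths expressed in repeat units, and that a locus counts as concordant when one child allele lies within the tolerance of some paternal allele and the other within the tolerance of some maternal allele. The function names `is_mendelian_consistent` and `concordance` are hypothetical.

```python
from itertools import permutations


def is_mendelian_consistent(child, father, mother, tolerance=1):
    """Return True if the child's two alleles can be explained by one
    allele from each parent, allowing each to differ by up to
    `tolerance` repeat units (assumed unit of measurement)."""

    def matches(allele, parent):
        # An allele "matches" a parent if it is close to either parental allele.
        return any(abs(allele - p) <= tolerance for p in parent)

    # Try both assignments: (paternal, maternal) and (maternal, paternal).
    return any(
        matches(a, father) and matches(b, mother)
        for a, b in permutations(child, 2)
    )


def concordance(trios, tolerance=1):
    """Fraction of loci whose trio genotypes are Mendelian consistent.

    `trios` is a list of (child, father, mother) tuples, where each
    genotype is a pair of allele lengths in repeat units.
    """
    if not trios:
        raise ValueError("at least one locus is required")
    consistent = sum(
        is_mendelian_consistent(c, f, m, tolerance) for c, f, m in trios
    )
    return consistent / len(trios)


# Toy data, invented for illustration only.
example_trios = [
    ((10, 12), (10, 11), (12, 15)),  # exact match to both parents
    ((10, 20), (10, 11), (12, 15)),  # allele 20 is unexplained
    ((11, 13), (10, 10), (12, 12)),  # consistent only with the tolerance
]
```

With `tolerance=1`, the third toy locus is counted as concordant even though neither child allele exactly equals a parental allele, which is the effect of "allowing a single repeat unit difference".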
Keywords
tandem repeats