Genome-wide profiling of highly similar paralogous genes using HiFi sequencing

Xiao Chen, Daniel Baker, Egor Dolzhenko, Joseph M Devaney, Jessica Noya, April S Berlyoung, Rhonda Brandon, Kathleen S Hruska, Lucas Lochovsky, Paul Kruszka, Scott Newman, Emily Farrow, Isabelle Thiffault, Tomi Pastinen, Dalia Kasperaviciute, Christian Gilissen, Lisenka Vissers, Alexander Hoischen, Seth Berger, Eric Vilain, Emmanuèle Délot, UCI Genomics Research to Elucidate the Genetics of Rare Diseases (UCI GREGoR) Consortium, Michael A Eberle

bioRxiv (2024)

Abstract
Variant calling in segmental duplications is hindered by sequence homology. We developed Paraphase, a HiFi-based informatics method that resolves highly similar genes by phasing all haplotypes of a gene family. We applied Paraphase to 160 long (>10 kb) segmental duplication regions across the human genome with high (>99%) sequence similarity, encoding 316 genes. Analysis across five ancestral populations revealed highly variable copy numbers of these regions. We identified 23 families with exceptionally low within-family diversity, where extensive gene conversion and unequal crossing over have resulted in highly similar gene copies. Furthermore, our analysis of 36 trios identified 7 de novo SNVs and 4 de novo gene conversion events, 2 of which are non-allelic. Finally, we summarized extensive genetic diversity in 9 medically relevant genes previously considered challenging to genotype. Paraphase provides a framework for resolving gene paralogs, enabling accurate testing in medically relevant genes and population-wide studies of previously inaccessible genes.

Competing Interest Statement

X.C., D.B., E.D. and M.A.E. are employees of PacBio. J.M.D., J.N., A.S.B., R.B., K.S.H., L.L., P.K. and S.N. are employees of GeneDx.