Virus-Derived Variation In Diverse Human Genomes

PLOS Genetics (2021)

Author summary

The sequences that make up each person's genome have diverse origin stories. For example, some people's genomes include fragments of chromosomes originating from Neanderthals. Variation in the genes in these chromosomal fragments may be associated with observable differences between people. Human genomes also include sequences originating from viruses, but the extent to which these sequences vary in different humans is unknown. We studied virus-derived variation in thousands of human genomes using new tools made for this purpose. In the process, we uncovered virus infection in cell lines used as reference materials for human genome variation. We also found rare germline variants originating from acquisition of human herpesvirus 6, as well as more extensive variation in human endogenous retroviruses than has previously been described. This work expands the toolbox for studying the intriguing class of virus-derived genetic variants in human genomes.

Abstract

Acquisition of genetic material from viruses by their hosts can generate inter-host structural genome variation. We developed computational tools enabling us to study virus-derived structural variants (SVs) in population-scale whole genome sequencing (WGS) datasets and applied them to 3,332 humans. Although SVs had already been cataloged in these subjects, we found previously overlooked virus-derived SVs. We detected non-germline SVs derived from squirrel monkey retrovirus (SMRV), human immunodeficiency virus 1 (HIV-1), and human T lymphotropic virus (HTLV-1); these variants are attributable to infection of the sequenced lymphoblastoid cell lines (LCLs) or their progenitor cells and may impact gene expression results and the biosafety of experiments using these cells. In addition, we detected new heritable SVs derived from human herpesvirus 6 (HHV-6) and human endogenous retrovirus-K (HERV-K). We report the first solo-direct repeat (DR) HHV-6 likely to reflect DR rearrangement of a known full-length endogenous HHV-6. We used linkage disequilibrium between single nucleotide variants (SNVs) and variants in reads that align to HERV-K, which often cannot be mapped uniquely using conventional short-read sequencing analysis methods, to locate previously unknown polymorphic HERV-K loci. Some of these loci are tightly linked to trait-associated SNVs, some are in complex genome regions inaccessible by prior methods, and some contain novel HERV-K haplotypes likely derived from gene conversion from an unknown source or introgression. These tools and results broaden our perspective on the coevolution between viruses and humans, including ongoing virus-to-human gene transfer contributing to genetic variation between humans.
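
The linkage-disequilibrium idea in the abstract can be illustrated with a small sketch. This is not the authors' tool: it assumes that each subject has an unphased dosage (0, 1, or 2 copies) for a HERV-K-derived variant and for a set of uniquely mapped SNVs, and it uses the squared Pearson correlation of dosages as a simple stand-in for r². The function names `genotype_r2` and `best_linked_snv`, the SNV identifiers, and the toy data are all hypothetical.

```python
from statistics import mean


def genotype_r2(x, y):
    """Squared Pearson correlation between two dosage vectors (values 0, 1, 2).

    For unphased genotypes this is a common approximation of the LD
    statistic r^2; it is an illustrative choice, not the paper's method.
    """
    if len(x) != len(y) or not x:
        raise ValueError("dosage vectors must be non-empty and of equal length")
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic variant carries no LD information
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def best_linked_snv(herv_dosage, snv_dosages):
    """Return (snv_id, r2) for the SNV most correlated with the HERV-K variant."""
    scored = {
        snv_id: genotype_r2(herv_dosage, dosage)
        for snv_id, dosage in snv_dosages.items()
    }
    best = max(scored, key=scored.get)
    return best, scored[best]


# Toy data (hypothetical): six subjects, one HERV-K variant, two mapped SNVs.
herv = [0, 1, 2, 0, 1, 0]
snvs = {
    "chr1:1000": [0, 1, 2, 0, 1, 0],  # tracks the HERV-K variant exactly
    "chr7:5000": [1, 1, 0, 2, 0, 1],  # unrelated pattern
}
best_id, best_r2 = best_linked_snv(herv, snvs)
```

In this sketch, an SNV with high r² to the HERV-K variant suggests that the polymorphic HERV-K locus lies near that SNV's position, even though the HERV-K reads themselves do not map uniquely. A real analysis would also need to account for population structure, allele frequency, and multiple testing.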