Searching across-cohort relatives via encrypted genotype regression

bioRxiv (2022)

Abstract
Identifying relatives across cohorts is one of the basic routines in genomic data analysis. Because the conventional practice often requires explicit sharing of genomic data, it is easily hampered by privacy or ethical constraints. In this study, using our proposed scheme for genomic encryption, we developed encG-reg, a regression approach that can detect relatives of various degrees from encrypted genomic data. The encryption properties of encG-reg are built on random matrix theory, which masks the original individual genotypes while still providing controllable precision relative to that of direct individual-level data. We established a connection between the dimension of the random matrix and the precision required by a study, a connection that rests on tractable eighth-order moments of the encrypted genotype matrix. With this connection, encG-reg achieved i) balanced false positive and false negative rates and ii) a balance between computational cost and the degree of relatives to be searched. We validated encG-reg in 485,158 multi-ethnic UK Biobank samples, where its resolution was comparable to that of conventional methods such as KING. In a more complex application, we launched a carefully designed multi-center collaboration among 6 research institutes across China, covering 11 cohorts of 64,091 GWAS samples. In both examples, encG-reg robustly identified and validated relatives across cohorts, even with varied ethnic backgrounds and different genotype qualities.

Competing Interest Statement: The authors have declared no competing interest.
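The idea of masking genotypes with a random matrix while keeping relatedness recoverable can be illustrated with a minimal sketch. This is not the authors' implementation: the simulated data, the matrix sizes `m` and `k`, the Gaussian choice for `S`, and all function names are assumptions made for illustration. Each cohort multiplies its standardized genotype matrix by the same random matrix `S`, and only the products are compared.

```python
import numpy as np

rng = np.random.default_rng(0)
m, k = 1000, 2000   # markers and random-matrix columns (assumed sizes)
n1, n2 = 5, 5       # individuals in cohort 1 and cohort 2 (toy values)

# Hypothetical allele frequencies shared by both cohorts.
freq = rng.uniform(0.1, 0.5, size=m)

def simulate(n):
    """Draw n unrelated genotype vectors coded 0/1/2."""
    return rng.binomial(2, freq, size=(n, m)).astype(float)

def standardize(G):
    """Center and scale each marker by its allele frequency."""
    return (G - 2 * freq) / np.sqrt(2 * freq * (1 - freq))

G1 = simulate(n1)
G2 = simulate(n2)
G2[0] = G1[0]       # plant one identical sample across the two cohorts

# Both cohorts must use the same random matrix S (assumption of this sketch).
S = rng.standard_normal((m, k))

# Each cohort shares only its masked matrix, never G1 or G2.
E1 = standardize(G1) @ S
E2 = standardize(G2) @ S

# Since E[S S^T] = k * I, this approximates X1 X2^T / m.
scores = E1 @ E2.T / (m * k)
print(np.round(scores, 2))
```

In this toy setup, the score for the planted pair should be close to 1 and the scores for unrelated pairs close to 0. Increasing `k` reduces the noise added by the masking, at higher computational cost, which mirrors the trade-off between precision and random-matrix dimension described in the abstract.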