Efficient Genome-Wide, Privacy-Preserving Similar Patient Query Based On Private Edit Distance
CCS(2015)
摘要
Edit distance has been proven to be an important and frequently used metric in many human genomic research, with Similar Patient Query (SPQ) being a particularly promising and attractive example. However, due to the widespread privacy concerns on revealing personal genomic data, the scope and scale of many novel use of genome edit distance are substantially limited. While the problem of private genomic edit distance has been studied by the research community for over a decade [5], the state-of-the-art solution [30] is far from even close to be applicable to real genome sequences.In this paper, we propose several private edit distance protocols that feature unprecedentedly high efficiency and precision. Our construction is a combination of a novel genomic edit distance approximation algorithm and new construction of private set difference size protocols. With the private edit distance based secure SPQ primitive, we propose GENSETS, a genome-wide, privacy preserving similar patient query system. It is able to support searching large-scale, distributed genome databases across the nation. We have implemented a prototype of GEN SETS. The experimental results show that, with 100 Mbps network connection, it would take GENSETS less than 200 minutes to search through 1 million breast cancer patients (distributed nation-wide in 250 hospitals, each having 4000 patients), based on edit distances between their genomes of lengths about 75 million nucleotides each.
更多查看译文
关键词
Secure Computation,Genomic Computation,Edit Distance
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络