Structured Low-Rank Matrix Factorization for Haplotype Assembly.

J. Sel. Topics Signal Processing (2016)

Abstract
In matrix factorization problems, one seeks to decompose a data matrix into a product of two matrices; frequently, one captures meaningful information contained in the data, and the other specifies how this information is combined to generate the data matrix. In this paper, matrix factorization that arises in haplotype assembly, an important NP-hard problem in genomics, is studied. Haplotypes are sequences of chromosomal variations in an individual’s genome, which are of critical importance for understanding the individual’s susceptibility to various diseases. A novel formulation of haplotype assembly as a partially observed low-rank matrix factorization problem is proposed and efficiently solved via a modified gradient descent method that exploits salient structural properties of sequencing data. In particular, the observed matrix in the problem at hand contains noisy samples of the product of an informative matrix with rows having entries from a finite alphabet and a matrix with rows that are standard unit basis vectors. Convergence of the proposed algorithm is analyzed, and its performance is tested on both synthetic and experimental data. The results demonstrate superior accuracy and speed of the proposed method as compared to state-of-the-art haplotype assembly techniques.
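The structure described in the abstract can be illustrated with a small sketch. It rests on several assumptions that are not taken from the paper: alleles are encoded as +1/-1, unobserved entries are stored as 0, there are two haplotypes, and reads are short contiguous windows. The solver is a simple alternating projection heuristic used as a stand-in; it is not the modified gradient descent method the paper proposes. The names `simulate_reads` and `assemble` and all parameters are hypothetical.

```python
import numpy as np

def simulate_reads(haps, n_reads, read_len, err, rng):
    """Sample noisy, partially observed reads; 0 marks an unobserved entry."""
    k, n = haps.shape
    origin = rng.integers(0, k, size=n_reads)   # source haplotype of each read
    R = np.zeros((n_reads, n))
    for i in range(n_reads):
        start = rng.integers(0, n - read_len + 1)
        cols = np.arange(start, start + read_len)
        flips = rng.random(read_len) < err       # sequencing errors flip the allele
        R[i, cols] = np.where(flips, -1.0, 1.0) * haps[origin[i], cols]
    return R, origin

def assemble(R, k, n_iter, rng):
    """Alternate between the two structured factors of R ~ U @ H.

    U (reads x k): every row is a standard unit basis vector (read membership).
    H (k x sites): entries from the finite alphabet {-1, +1} (haplotypes).
    """
    m, n = R.shape
    H = rng.choice([-1.0, 1.0], size=(k, n))     # random initial haplotypes
    for _ in range(n_iter):
        # Assign each read to the haplotype it agrees with most on observed sites.
        assign = (R @ H.T).argmax(axis=1)
        U = np.eye(k)[assign]
        # Re-estimate each haplotype by a per-site majority vote of its reads.
        H = np.where(U.T @ R >= 0, 1.0, -1.0)
    return U, H

rng = np.random.default_rng(0)
haps = rng.choice([-1.0, 1.0], size=(2, 40))     # hypothetical ground truth
R, origin = simulate_reads(haps, n_reads=60, read_len=8, err=0.05, rng=rng)
U, H = assemble(R, k=2, n_iter=20, rng=rng)

mask = R != 0                                     # observed entries only
residual = np.sum((mask * (R - U @ H)) ** 2)      # fit on the observed samples
```

Both update steps keep the factors inside their constraint sets, so the result always has the structure the abstract describes, although a heuristic like this can stall in a poor local solution and carries none of the convergence guarantees analyzed in the paper.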
Keywords
Haplotype assembly, gradient descent, low rank, matrix factorization