Mining massive SNP data for identifying associated SNPs and uncovering gene relationships.

BCB(2014)

Abstract
Studies on SNP correlations have focused on SNPs located on the same chromosome, since SNPs on different chromosomes are expected to segregate randomly. However, previous studies suggest that SNPs can be associated with each other over long distances and even across different chromosomes. To facilitate the study of SNP associations, our goal is to find SNPs that co-occur in a significant number of samples regardless of their genomic distance, and subsequently to study the relationships among these associated SNPs and their corresponding genes. Mining co-occurring SNP associations is computationally challenging, which motivates us to design an efficient data mining algorithm, FCIRC, to mine SNP associations from massive SNP data. By applying our method to the original SNP data and to random chromosome permutation data, we demonstrate that it is able to find non-random SNP associations across multiple chromosomes. Many of the large number of associated SNPs identified by our method involve multiple chromosomes. Some SNP associations also suggest novel relationships among the corresponding genes, and some may imply biological and disease mechanisms related to those genes.
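To make the notion of SNPs that coexist in a significant number of samples concrete, here is a minimal Python sketch of a naive pairwise co-occurrence count. It is not FCIRC, whose details the abstract does not give. The function name `frequent_snp_pairs`, the `min_support` threshold, and the SNP identifiers in the toy data are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_snp_pairs(samples, min_support):
    """Return SNP pairs that occur together in at least min_support samples.

    Naive baseline for illustration only (assumption: each sample is given
    as a collection of the SNP identifiers it carries). This is not FCIRC.
    """
    pair_counts = Counter()
    for snps in samples:
        # Sort so that every unordered pair has a single canonical key.
        for pair in combinations(sorted(set(snps)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Hypothetical toy data; the chromosome prefix only shows that a pair
# may span different chromosomes.
samples = [
    {"chr1:rsA", "chr7:rsB", "chr9:rsC"},
    {"chr1:rsA", "chr7:rsB"},
    {"chr1:rsA", "chr7:rsB", "chr2:rsD"},
    {"chr9:rsC", "chr2:rsD"},
]

pairs = frequent_snp_pairs(samples, min_support=3)
print(pairs)
```

With this toy input, only the cross-chromosome pair `chr1:rsA` and `chr7:rsB` reaches the threshold. The pairwise count grows quadratically with the number of SNPs per sample, which hints at why a dedicated algorithm is needed for massive data.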