A multi-functional analyzer uses parameter constraints to improve the efficiency of model-based gene-set analysis

Annals of Applied Statistics (2015)

Abstract
We develop a model-based methodology for integrating gene-set information with an experimentally derived gene list. The methodology uses a previously reported sampling model, but takes advantage of natural constraints in the high-dimensional discrete parameter space in order to work from a more structured prior distribution than is currently available. We show how these natural constraints can be expressed as linear inequality constraints on a set of binary latent variables. Further, the currently available prior gives low probability to these constraints in complex systems, such as Gene Ontology (GO), thus reducing the efficiency of statistical inference. We develop two computational advances to enable posterior inference within the constrained parameter space: one using integer linear programming for optimization and one using a penalized Markov chain sampler. Numerical experiments demonstrate the utility of the new methodology for a multivariate integration of genomic data with GO or related information systems. Compared to available methods, the proposed multi-functional analyzer covers more reported genes without mis-covering nonreported genes, as demonstrated on genome-wide data from association studies of type 2 diabetes and from RNA interference studies of influenza.
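As a rough illustration of how logical constraints among binary latent variables can be written as linear inequalities, the sketch below enumerates a toy system by brute force. It is not taken from the paper: the three gene sets, the rule that a gene is active when any set containing it is active, and the specific inequality (a set whose genes are all active must itself be active) are assumptions chosen only to make the idea concrete.

```python
from itertools import product

# Hypothetical toy system: three overlapping gene sets (illustrative names only).
GENE_SETS = {
    "s1": {"g1", "g2"},
    "s2": {"g2", "g3"},
    "s3": {"g1", "g2", "g3"},
}


def gene_activity(z):
    """Assumed rule: a gene is active (1) if any gene set containing it is active."""
    genes = set().union(*GENE_SETS.values())
    return {
        g: max(z[s] for s, members in GENE_SETS.items() if g in members)
        for g in genes
    }


def satisfies_constraints(z):
    """Check one linear inequality per gene set s:

        z[s] >= sum(a[g] for g in s) - |s| + 1

    The right-hand side equals 1 only when every gene in s is active, so the
    inequality encodes "all genes active implies the set is active".
    """
    a = gene_activity(z)
    return all(
        z[s] >= sum(a[g] for g in members) - len(members) + 1
        for s, members in GENE_SETS.items()
    )


def feasible_states():
    """Enumerate all binary set-activity vectors and keep the feasible ones."""
    names = sorted(GENE_SETS)
    states = [dict(zip(names, bits)) for bits in product((0, 1), repeat=len(names))]
    return [z for z in states if satisfies_constraints(z)]


if __name__ == "__main__":
    for state in feasible_states():
        print(state)
```

In this toy case only four of the eight binary configurations satisfy the inequalities, so a prior that treats the set indicators as independent would place much of its mass on infeasible states. Brute-force enumeration is practical only for tiny systems; at the scale of GO one would need an optimizer or sampler of the kind the abstract describes.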
Keywords
Gene-set enrichment, Bayesian analysis, integer linear programming