ARIC: accurate and robust inference of cell type proportions from bulk gene expression or DNA methylation data

Briefings in Bioinformatics (2022)

Abstract
Quantifying cell proportions, especially for rare cell types in some scenarios, is of great value in tracking signals associated with certain phenotypes or diseases. Although several methods have been proposed to infer cell proportions from multicomponent bulk data, they are substantially less effective for estimating the proportions of rare cell types, which are highly sensitive to feature outliers and collinearity. Here we propose a new deconvolution algorithm named ARIC to estimate cell type proportions from gene expression or DNA methylation data. ARIC employs a novel two-step marker selection strategy, consisting of collinear feature elimination based on the component-wise condition number and adaptive removal of outlier markers. This strategy systematically obtains effective markers for weighted υ-support vector regression to ensure robust and precise prediction of rare proportions. We show that ARIC can accurately estimate fractions in both DNA methylation and gene expression data from different experiments. We further apply ARIC to the survival prediction of ovarian cancer and the condition monitoring of chronic kidney disease, and the results demonstrate the high accuracy and robustness as well as the clinical potential of ARIC. Taken together, ARIC is a promising tool for solving the deconvolution problem of bulk data where rare components are of vital importance.
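The deconvolution problem the abstract refers to can be made concrete with a small sketch. It assumes the usual linear mixing model, in which a bulk profile is a proportion-weighted sum of per-cell-type reference profiles. The code is not ARIC: it omits the two-step marker selection and the weighted support vector regression, and substitutes plain non-negative least squares solved by projected gradient descent. The function names `mix` and `deconvolve` and all numbers are hypothetical, chosen only for illustration.

```python
# Illustrative baseline only, NOT the ARIC algorithm: no marker selection,
# no weighted support vector regression. All values are made-up toy data.

def mix(signature, proportions):
    """Bulk profile: proportion-weighted sum of reference profiles.

    signature[i][k] is the reference value of marker i in cell type k.
    """
    return [sum(row[k] * proportions[k] for k in range(len(proportions)))
            for row in signature]


def deconvolve(signature, bulk, steps=2000, lr=0.1):
    """Estimate proportions by projected gradient descent on squared error."""
    n_types = len(signature[0])
    p = [1.0 / n_types] * n_types  # start from equal proportions
    for _ in range(steps):
        residual = [m - b for m, b in zip(mix(signature, p), bulk)]
        grad = [sum(2.0 * residual[i] * signature[i][k]
                    for i in range(len(bulk)))
                for k in range(n_types)]
        # Gradient step, then clip at zero so proportions stay non-negative.
        p = [max(0.0, pk - lr * gk) for pk, gk in zip(p, grad)]
    total = sum(p)
    return [pk / total for pk in p]  # rescale so the estimates sum to one


# Toy reference matrix: 4 markers (rows) x 3 cell types (columns).
SIGNATURE = [
    [0.9, 0.1, 0.1],
    [0.1, 0.8, 0.2],
    [0.2, 0.1, 0.9],
    [0.5, 0.5, 0.1],
]
TRUE_PROPORTIONS = [0.70, 0.25, 0.05]  # the third cell type is rare

bulk_profile = mix(SIGNATURE, TRUE_PROPORTIONS)
estimate = deconvolve(SIGNATURE, bulk_profile)
print([round(x, 4) for x in estimate])
```

In this noise-free toy case the baseline recovers the mixing proportions, including the rare component. With noisy or nearly collinear reference columns, the estimate for a small component can shift substantially, which is the failure mode the abstract says ARIC's marker selection is meant to address.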
Keywords
deconvolution,gene expression,DNA methylation,bulk data