plasma: Partial LeAst Squares for Multiomics Analysis

bioRxiv (2023)

Abstract
Motivation: The rapid growth in the number and application of high-throughput “omics” technologies has created a need for better methods to integrate multiomics data sets. Much progress has been made in developing unsupervised methods, but supervised methods have lagged behind.

Results: We develop a novel algorithm, plasma, to train and validate models to predict time-to-event outcomes from multiomics data sets. The model uses two layers of the existing partial least squares algorithm to first select components that covary with the outcome and then construct a joint Cox proportional hazards model. We apply plasma to the lung squamous cell carcinoma (LUSC) data from The Cancer Genome Atlas (TCGA). Our model successfully separates an independent test data set into high-risk and low-risk patients (p = 0.0132). The performance of the joint multiomics model is superior to that of the individual omics data sets. It is also superior to the performance of an approach that uses an unsupervised method (Multi-Omics Factor Analysis; MOFA) to find factors that might work as predictors. Many of the factors that contribute strongly to the plasma model can be justified from the biological literature.

Availability and Implementation: The plasma R package can be obtained from The Comprehensive R Archive Network (CRAN) at . The latest version of the package can be obtained from R-Forge at . Source code and data for the analysis presented here can be obtained from GitLab, at .

Contact: Email: kcoombes{at}augusta.edu

Supplementary Information: Supplementary material is available from Bioinformatics online.

Competing Interest Statement: The authors have declared no competing interest.
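To make the two-layer idea in the Results concrete, here is a minimal toy sketch in Python. It is not the plasma algorithm or the plasma R package API. It assumes synthetic data, a plain numeric outcome with no censoring (so no Cox model is fitted), and only the first partial least squares component per layer. The function `first_pls_component` and all variable names are hypothetical, introduced only for illustration.

```python
import math
import random
import statistics


def first_pls_component(X, y):
    """First PLS component of X (rows = samples) for a single numeric response y.

    Returns (w, t): unit-norm weights proportional to Xc^T yc (centered data),
    and the per-sample scores t = Xc w.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight of each feature is its covariance (up to scale) with the outcome.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:
        raise ValueError("X has no covariance with y")
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    return w, t


# Synthetic stand-in data (assumption): two "omics" blocks sharing one latent factor.
rng = random.Random(0)
n_samples = 60
latent = [rng.gauss(0, 1) for _ in range(n_samples)]
outcome = [z + rng.gauss(0, 0.5) for z in latent]  # numeric outcome, no censoring
block_a = [[0.8 * z + rng.gauss(0, 1) for _ in range(5)] for z in latent]
block_b = [[0.5 * z + rng.gauss(0, 1) for _ in range(8)] for z in latent]

# Layer 1: one outcome-covarying component per block.
_, score_a = first_pls_component(block_a, outcome)
_, score_b = first_pls_component(block_b, outcome)

# Layer 2: treat the block-level scores as features and extract a joint component.
stacked = [[a, b] for a, b in zip(score_a, score_b)]
joint_weights, joint_score = first_pls_component(stacked, outcome)

# Split samples at the median joint score to form two groups.
cutoff = statistics.median(joint_score)
high = [i for i, s in enumerate(joint_score) if s > cutoff]
low = [i for i, s in enumerate(joint_score) if s <= cutoff]
```

In a real time-to-event setting the outcome is censored, so the components would feed a Cox proportional hazards model rather than a median split on a raw score; that step is omitted here to keep the sketch free of third-party dependencies.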
Keywords
multiomics analysis, partial least squares