Multi-scale variational autoencoder for imputation of missing values in untargeted metabolomics using whole-genome sequencing data

Chen Zhao, Kuan-Jui Su, Chong Wu, Xuewei Cao, Qiuying Sha, Wu Li, Zhe Luo, Tian Qing, Chuan Qiu, Lan Juan Zhao, Anqi Liu, Lindong Jiang, Xiao Zhang, Hui Shen, Weihua Zhou, Hong-Wen Deng

Computers in Biology and Medicine (2024)

Abstract
Background: Missing data are a common challenge in mass spectrometry-based metabolomics and can lead to biased and incomplete analyses. Integrating whole-genome sequencing (WGS) data with metabolomics data has emerged as a promising approach to improving the accuracy of imputation in metabolomics studies.
Method: In this study, we propose a novel method that leverages information from WGS data and reference metabolites to impute unknown metabolites. Our approach uses a multi-scale variational autoencoder to jointly model the burden score, polygenic risk score (PGS), and linkage disequilibrium (LD) pruned single nucleotide polymorphisms (SNPs) for feature extraction and imputation of missing metabolomics data. By learning latent representations of both omics data types, our method can impute missing metabolomics values based on genomic information.
Results: We evaluate the performance of our method on empirical metabolomics datasets with missing values and demonstrate its superiority over conventional imputation techniques. Using burden scores, PGS, and LD-pruned SNPs derived from 35 template metabolites, the proposed method achieved R² scores > 0.01 for 71.55% of metabolites.
Conclusion: Integrating WGS data into metabolomics imputation not only improves data completeness but also enhances downstream analyses, paving the way for more comprehensive and accurate investigations of metabolic pathways and disease associations. Our findings offer insight into the potential benefits of using WGS data for metabolomics imputation and underscore the importance of multi-modal data integration in precision medicine research.
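The headline result is stated as the share of metabolites whose R² exceeds a threshold. As a minimal sketch of how such a summary could be computed, assuming R² is the usual coefficient of determination evaluated per metabolite on held-out values (the abstract does not spell out the evaluation protocol), the following uses only NumPy. The function names `r2_per_metabolite` and `fraction_above` and the toy arrays are hypothetical and are not taken from the paper.

```python
import numpy as np

def r2_per_metabolite(y_true, y_pred):
    """Coefficient of determination for each column (one column per metabolite).

    Assumes both arrays have shape (n_samples, n_metabolites) and that no
    column of y_true is constant (otherwise the denominator is zero).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = ((y_true - y_pred) ** 2).sum(axis=0)               # residual sum of squares
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)  # total sum of squares
    return 1.0 - ss_res / ss_tot

def fraction_above(r2, threshold=0.01):
    """Share of metabolites whose R² is strictly greater than the threshold."""
    return float(np.mean(np.asarray(r2) > threshold))

# Toy data (illustrative only): 4 samples, 3 metabolites.
y_true = np.array([[1.0, 2.0, 3.0],
                   [2.0, 4.0, 1.0],
                   [3.0, 6.0, 2.0],
                   [4.0, 8.0, 6.0]])
y_pred = y_true.copy()
y_pred[:, 1] = y_true[:, 1].mean()  # predicting the column mean gives R² = 0
y_pred[:, 2] = y_true[:, 2].mean()

r2 = r2_per_metabolite(y_true, y_pred)
print(r2, fraction_above(r2))
```

In this toy case only the first metabolite is predicted better than its mean, so one of three columns clears the 0.01 threshold. The threshold is kept as a parameter because 0.01 is the only cut-off the abstract reports.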
Keywords
Metabolomics, Whole genome sequencing, Imputation, Multi-scale, Variational autoencoder