Sensitive detection of cancer using deep learning model of cfDNA genome-wide methylation data.

Journal of Clinical Oncology (2022)

Abstract
e18789 Background: Methylation analysis of cfDNA has been used to diagnose cancer in its early stages. Previous research has concentrated on local methylation signals, using cancer-type-specific methylation markers. We used not only methylation markers but also global methylation patterns for sensitive cancer detection.
Methods: We generated methylation data using cell-free methylated DNA immunoprecipitation and high-throughput sequencing (cfMeDIP-seq, N = 907; 717 cancer patients and 190 normal controls) and cell-free whole-genome enzymatic methyl sequencing (cfWGEM-seq, N = 162; 137 cancer patients and 25 normal controls). We analyzed Illumina 450K methylation microarray data (N = 3,479) from The Cancer Genome Atlas (TCGA) to find differentially methylated regions (DMRs) in 6 cancer types (breast, lung, liver, ovarian, esophageal, and pancreatic cancer). After identifying the DMRs that overlapped across the datasets, we retained the 1,661 regions that differed most between the cancer patient group and the normal group. The model based on the selected markers was cross-validated using cfMeDIP-seq samples split into training, validation, and test sets. Additionally, global methylation count values from the cfMeDIP-seq data were used to train a convolutional neural network. Finally, the global methylation pattern deep learning algorithm and the marker-based algorithm were combined to detect cancer.
Results: Deep learning models based on the selected markers and on global methylation patterns achieved test-set accuracy of 0.88-0.92 and 0.90-0.91, respectively, with AUC of 0.94-0.96 and 0.95-0.96. The ensemble of the two models achieved test-set accuracy of 0.91-0.92 and AUC of 0.96-0.97, and it detected early-stage cancers (detection rate by stage: stage 1, 88-100%; stage 2, 75-100%; stage 3, 90-97%; stage 4, 92-100%).
Conclusions: In this study, we selected the best markers using a tissue methylation dataset (TCGA) and cfDNA methylation datasets (cfMeDIP-seq, cfWGEM-seq). To train the cancer detection models, we used not only the DMR pattern but also the global methylation pattern. The ensemble model that combined these features outperformed either single model. Our models show potential for early cancer detection.
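
The abstract does not state how the marker-based and global-pattern models were combined. As a minimal sketch only, assuming each model outputs a per-sample cancer probability and that the ensemble is a weighted average of the two (an assumption, not the authors' confirmed method), accuracy and AUC could be evaluated as follows. The function names and the toy scores are hypothetical and are not data from the study.

```python
def ensemble_probability(p_marker, p_global, weight=0.5):
    """Weighted average of two per-sample cancer probabilities.

    Averaging is an assumed combination rule; the abstract does not specify one.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    if len(p_marker) != len(p_global):
        raise ValueError("both models must score the same samples")
    return [weight * a + (1.0 - weight) * b for a, b in zip(p_marker, p_global)]


def accuracy(probs, labels, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


def auc(probs, labels):
    """AUC as the fraction of (cancer, normal) pairs ranked correctly.

    Ties between a cancer score and a normal score count as half.
    """
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one cancer and one normal sample")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy scores invented for illustration (1 = cancer, 0 = normal).
labels = [1, 1, 0, 0]
p_marker = [0.9, 0.4, 0.2, 0.6]   # hypothetical marker-based model output
p_global = [0.7, 0.8, 0.1, 0.3]   # hypothetical global-pattern CNN output
p_ens = ensemble_probability(p_marker, p_global)

print("marker accuracy:", accuracy(p_marker, labels), "AUC:", auc(p_marker, labels))
print("ensemble accuracy:", accuracy(p_ens, labels), "AUC:", auc(p_ens, labels))
```

In this toy case each model misranks a different sample, so averaging corrects both errors. This illustrates why an ensemble can outperform its component models, but it says nothing about the magnitude of the gain reported above.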