Figure S3 from Novel Aberrations Uncovered in Barrett's Esophagus and Esophageal Adenocarcinoma Using Whole Transcriptome Sequencing

crossref(2023)

Abstract

Supplementary Figure 3. Workflow and filtering steps for gene selection following machine learning to discover EAC driver genes. (a) For cross-validation, we used all normalised reads from edgeR, expressed as counts per million (cpm). These were passed to the R package CMA, using four resampling methods to find driver genes distinguishing EAC from NDBE: Loodat (leave-one-out), bootstrap, Monte Carlo, and five-fold cross-validation. The four methods were compared, and genes present in ≥2 methods were kept (n=52). These 52 genes were then investigated for differential expression in EAC vs NDBE (n=39). Gene ontology enrichment was run on this filtered set. Furthermore, the genes selected by all four cross-validation methods were used as a gene panel and validated in two independent microarray data sets comparing EAC vs NDBE (GSE37203 and GSE26886). Random forest analysis was conducted on the combined microarray data sets to identify the most important genes of the 12-gene panel, and subsequently tested on two separate, public microarray data sets of benign and malignant gastric and colonic tissues. (b) RNA-seq expression of the selected 12-gene panel (MSMO1, ACAA1, COL17A1, CXCL2, MKNK2, AQP9, E2F3, DHCR24, CTSL, KLF4, PM20D2, CREB5) in EAC and NDBE. The y-axis is log2(cpm + 1). (c) AUROC curves for the 4-gene signature on the RNA-seq data (left) and on the two individual microarray data sets comparing EAC and NDBE that were used to identify the 4-gene signature (right, bottom left); the independent microarray data set used to validate the 4-gene signature is shown at bottom right. (d) Decision tree of the 12-gene signature on colonic (left) and gastric (right) normal and tumor tissues, identifying E2F3 and KLF4 as the main classification drivers.
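
The consensus filter in panel (a) can be illustrated with a minimal Python sketch. This is not the authors' code: their analysis used the R package CMA. The function name `consensus_genes`, the method labels, and the `GENE_*` identifiers are hypothetical placeholders chosen for illustration. The sketch only shows the counting step: keep genes selected by at least a given number of resampling methods, and separately keep genes selected by all of them.

```python
from collections import Counter


def consensus_genes(selections, min_methods=2):
    """Return the sorted genes chosen by at least `min_methods` methods.

    `selections` maps a method name to the genes that method selected.
    (Hypothetical helper, for illustration only.)
    """
    # Count each gene once per method, even if a method lists it twice.
    counts = Counter(
        gene for genes in selections.values() for gene in set(genes)
    )
    return sorted(gene for gene, n in counts.items() if n >= min_methods)


# Toy input with placeholder gene names (not data from the study).
toy_selections = {
    "leave_one_out": ["GENE_A", "GENE_B", "GENE_C"],
    "bootstrap": ["GENE_A", "GENE_B"],
    "monte_carlo": ["GENE_A", "GENE_D"],
    "five_fold": ["GENE_A", "GENE_B", "GENE_E"],
}

# Genes present in at least two methods (analogous to the n=52 step).
kept = consensus_genes(toy_selections, min_methods=2)

# Genes present in every method (analogous to the gene-panel step).
panel = consensus_genes(toy_selections, min_methods=len(toy_selections))

print(kept)
print(panel)
```

Counting set membership rather than raw occurrences keeps a gene from passing the threshold just because one method listed it more than once.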
