Meta-analysis of Multi-functional Biomarkers for Discovery and Predictive Modeling of Colorectal Adenoma and Carcinoma

Scott N. Peterson, Alexey Eroshkin, Piotr Koźbiał, Ermanno Florio, Farnaz Fouladi, Noah Strom, Yacgley Valdes, Gregory Kuehn, Giorgio Casaburi, Thomas Kuehn

Research Square (2023)

Abstract
Background: Although colonoscopy effectively reduces colorectal cancer (CRC) mortality, poor screening compliance contributes to CRC ranking as the second most deadly malignancy. There is a need to develop a preventative, non-invasive diagnostic test, such as a fecal microbiota test, for early detection of both pre-cancerous adenomas and carcinomas to effectively reduce mortality.
Results: We conducted a clinical meta-analysis of published deep metagenomic stool sequence datasets comprising 1,670 subjects from 9 countries: 703 healthy controls, 161 precancerous colorectal adenoma (CRA) cases, 48 advanced precancerous colorectal adenoma (CRAA) cases and 758 CRC cases diagnosed by colonoscopy. We analyzed these data through a novel automated machine learning (ML) workflow that uses a two-stage feature importance ranking and ensemble modeling method to identify and select highly predictive taxonomic and functional biomarkers. ML modeling of the selected features differentiated the metagenomic profiles of healthy controls from CRA, CRAA and CRC cases, with an average area under the curve (AUC) for external holdout testing of 0.84 (sensitivity=0.82, specificity=0.71, accuracy=0.77) for CRC; 0.97 (sensitivity=0.78, specificity=0.98, accuracy=0.97) for CRAA; and 0.90 (sensitivity=0.74, specificity=0.89, accuracy=0.86) for CRA. Relative to baseline ML performance, these outcomes represent AUC increases of 2%, 3% and 8%, respectively. The predictive features identified for each disease class were largely distinct and represented differing proportions of taxonomic and functional features.
Conclusions: The predictive taxonomic features identified for each disease class were largely distinct, whereas many functional gene features were shared across disease classes but displayed differing directions of change. Applying our ensemble approach to feature selection increased predictive power for each disease class and may generate discriminatory models with greater generalizability.
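As a reading aid for the reported metrics, the short sketch below checks how accuracy relates to sensitivity and specificity: accuracy is the class-size-weighted mean of the two. It assumes, purely for illustration, that the holdout sets had the same case-to-control ratio as the full cohort (703 controls versus 758 CRC, 48 CRAA and 161 CRA cases); the abstract does not state the holdout composition, and the function name `weighted_accuracy` is hypothetical.

```python
def weighted_accuracy(sensitivity, specificity, n_cases, n_controls):
    """Accuracy implied by sensitivity and specificity at a given class balance."""
    correct_cases = sensitivity * n_cases        # expected true positives
    correct_controls = specificity * n_controls  # expected true negatives
    return (correct_cases + correct_controls) / (n_cases + n_controls)


# Cohort sizes from the abstract; using them as the holdout class balance
# is an ASSUMPTION made only for this illustration.
N_CONTROLS = 703
reported = {
    # class: (sensitivity, specificity, number of cases)
    "CRC": (0.82, 0.71, 758),
    "CRAA": (0.78, 0.98, 48),
    "CRA": (0.74, 0.89, 161),
}

implied = {
    name: round(weighted_accuracy(sens, spec, n_cases, N_CONTROLS), 2)
    for name, (sens, spec, n_cases) in reported.items()
}

for name, acc in implied.items():
    print(f"{name}: implied accuracy = {acc}")
```

Under that assumption the implied values are 0.77, 0.97 and 0.86, matching the accuracies quoted above. The CRAA case also shows why accuracy sits close to specificity when cases are rare: with 48 cases against 703 controls, the control term dominates the weighted mean.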
Keywords
biomarkers, colorectal adenoma, carcinoma, meta-analysis, multi-functional