
Abstract 13172: Using Machine Learning to Identify Transcriptomic Biomarkers That Differentiate Early Vs Late Stage of Atherosclerosis

Circulation (2021)

Abstract
Introduction: We used a machine learning approach to explore the transcriptomic signaling component of atherosclerosis. This approach can be viewed as complementary to the classical differential-expression-based RNA-Seq approach, while defining some of its limitations and providing insight into the cellular basis of atherosclerosis.
Methods: Abdominal aorta specimens (n=242) from 128 coroner's autopsy cases were graded by pathologists and classified as normal (nl), fatty streak (fs), fibrous plaque (fp), or complex fibrous plaque (fc). The pathology samples were sorted into two groups for comparison: normal/early (nl/fs) vs late stage (fp/fc) atherosclerosis. The RNA-Seq data were analyzed using an ensemble of machine learning methods to identify a set of genes that differentiate the normal/early and late pathology stages. Three feature selection algorithms (recursive feature elimination, random forest optimization, and regularized linear regression) were employed to assign a total ranking to the importance of genes for pathology classification. Five different classifiers (Naive Bayes, Logistic Regression, Random Forest, Support Vector Machine, and Decision Tree) were trained, and an XGBoost model was used for ensemble learning. We used the resulting performance characterization relative to the clinically established ground truth to validate the characteristic genes and abundances selected by this model.
Results: The ensemble machine learning approach identified a set of gene features that best explain late versus normal/early stage atherosclerosis in a 5-fold cross-validation experimental design. Boosting shallow learners with XGBoost gave stable performance across sample splits and more consistently high F1 scores. We found that the performance of the resulting model was highly correlated with the degree of disease severity, indicating a relationship between concerted transcript abundance and the presentation of disease phenotype.
Conclusions: Our machine learning approach identifies ensemble biomarkers that differentiate early vs late stage atherosclerosis. Our results also indicated a potential transcriptomic basis for the severity of disease phenotype as embodied by histopathology grading not included in the training data.
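The Methods describe two steps that are easy to illustrate: grouping the four pathology grades into two comparison classes, and combining the rankings from three feature selection algorithms into one total ranking. The abstract does not say how the rankings were combined, so the sketch below assumes a simple mean-rank aggregation. The function names, the tie-breaking rule, and the gene names (`GENE_A` and so on) are hypothetical placeholders, not the authors' implementation or data.

```python
from statistics import mean

# Pathology grades from the abstract, mapped to the two comparison groups.
STAGE_GROUP = {
    "nl": "normal/early",  # normal
    "fs": "normal/early",  # fatty streak
    "fp": "late",          # fibrous plaque
    "fc": "late",          # complex fibrous plaque
}


def group_stage(grade):
    """Map a pathologist grade (nl, fs, fp, fc) to its comparison group."""
    return STAGE_GROUP[grade]


def aggregate_rankings(rankings):
    """Combine per-method gene rankings (best gene first) into one ranking.

    Assumption: genes are ordered by their mean rank position across methods.
    A gene absent from a method's list is given that list's worst rank + 1.
    Ties are broken alphabetically so the result is deterministic.
    """
    genes = {gene for ranked in rankings.values() for gene in ranked}

    def mean_rank(gene):
        return mean(
            ranked.index(gene) + 1 if gene in ranked else len(ranked) + 1
            for ranked in rankings.values()
        )

    return sorted(genes, key=lambda gene: (mean_rank(gene), gene))


# Hypothetical rankings from the three feature selection algorithms.
example_rankings = {
    "recursive_feature_elimination": ["GENE_A", "GENE_B", "GENE_C"],
    "random_forest": ["GENE_B", "GENE_A", "GENE_C"],
    "regularized_linear_regression": ["GENE_A", "GENE_C", "GENE_B"],
}

total_ranking = aggregate_rankings(example_rankings)
```

In this toy example `GENE_A` ranks first because it has the lowest mean position across the three methods. Other aggregation schemes (for example, weighting each method by its cross-validated performance) would fit the abstract's description equally well.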