A Computational Approach to Identification of Candidate Biomarkers in High-Dimensional Molecular Data

Justin Gerolami, Justin Jong Mun Wong, Ricky Zhang, Tong Chen, Tashifa Imtiaz, Miranda Smith, Tamara Jamaspishvili, Madhuri Koti, Janice Irene Glasgow, Parvin Mousavi, Neil Renwick, Kathrin Tyryshkin

Diagnostics (2022)

Abstract

Complex high-dimensional datasets that are challenging to analyze are frequently produced through '-omics' profiling. Typically, these datasets contain more genomic features than samples, limiting the use of multivariable statistical and machine learning-based approaches to analysis. Therefore, effective alternative approaches are urgently needed to identify features of interest in '-omics' data. In this study, we present the molecular feature selection tool, a novel, ensemble-based feature selection application for identifying candidate biomarkers in '-omics' data. As proof-of-principle, we applied the molecular feature selection tool to identify a small set of immune-related genes as potential biomarkers of three prostate adenocarcinoma subtypes. Furthermore, we tested the selected genes in a model to classify the three subtypes and compared the results to models built using all genes and all differentially expressed genes. The model built from genes identified with the molecular feature selection tool outperformed the other models in this study on all comparison metrics (accuracy, precision, recall, and F1-score) while using a significantly smaller set of genes. In addition, we developed a simple graphical user interface for the molecular feature selection tool, which is available for free download. This user-friendly interface is a valuable tool for the identification of potential biomarkers in gene expression datasets and is an asset for biomarker discovery studies.
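
For readers unfamiliar with the comparison metrics named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed for a three-class problem. It is not the authors' code: the function name `macro_metrics`, the subtype labels "A", "B", and "C", and the toy predictions are hypothetical, and macro-averaging across classes is an assumption, since the abstract does not say how per-class scores were combined.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging (unweighted mean over classes) is an assumption here;
    the abstract does not specify the averaging scheme.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical subtype labels and predictions, for illustration only.
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "C", "A"]
scores = macro_metrics(y_true, y_pred)
print(scores)
```

Running the same function on the predictions of each candidate model (for example, one built from all genes and one built from a selected subset) would give directly comparable scores on the four metrics.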

Keywords

biomarker, feature selection, big data analysis, RNA-Seq, prostate adenocarcinoma
