An Integrative Machine Learning Framework for Classifying SEER Breast Cancer

Research Square (Research Square)(2022)

引用 0|浏览0
暂无评分
摘要
Abstract BACKGROUND: Breast cancer is the commonest type of cancer in women worldwide and the leading cause of mortality for females. Despite the fact that many breast cancer patients have no family members who have also had the disease. Women who have it are more at risk than those who don't. OBJECTIVE: The aim of this research is to classify the death status of breast cancer patients using the Surveillance, Epidemiology, and End Results (SEER) dataset. Due to its capacity to handle enormous data sets systematically, machine learning has been widely employed in biomedical research to answer diverse classification difficulties. Pre-processing data enables its visualization and analysis for use in making important decisions. METHODOLOGY: This research presents a feasible machine learning-based approach for categorizing datasets related to breast cancer. Moreover, a two-step feature selection method based on Variance Threshold and Principal Component Analysis (PCA) was employed to select the features from the SEER breast cancer dataset. After selecting the features, the classification of the breast cancer dataset is carried out using Supervised and Ensemble learning techniques such as Ada Boosting (AB), XG Boosting (XGB), and Gradient Boosting (GB), as well as binary classification techniques such as Naive Bayes (NB) and Decision Tree (DT). RESULTS:In this study, it is observed that the Decision Tree algorithm showed better results than other algorithms used in this analysis (AB, XGB, GB & NB). The accuracy of DT for both train-test split and cross validation achieved as 98%. CONCLUSION: Utilizing the train-test split and k-fold cross-validation approaches, the performance of various machine learning algorithms is examined. The Decision Tree algorithm outperforms other supervised and ensemble learning approaches, according to the experimental data.
更多
查看译文
关键词
integrative machine learning framework,seer breast cancer,breast cancer,machine learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要