Prediction of secondary testosterone deficiency using machine learning: A comparative analysis of ensemble and base classifiers, probability calibration, and sampling strategies in a slightly imbalanced dataset

Informatics in Medicine Unlocked (2021)

Abstract
Testosterone is the most important male sex hormone, and its deficiency causes a range of physical and mental health problems. Identifying individuals with low testosterone efficiently is crucial before starting appropriate treatment. However, routine monitoring of testosterone levels can be costly in many regions, which leads to underreporting of cases, especially in developing countries. Moreover, few studies employ machine learning (ML) to predict testosterone deficiency. This research therefore aims to offer a coherent comparative analysis of ML methods that can predict testosterone deficiency without requiring patients to undergo costly medical tests. We also provide the urological community with a publicly available dataset (https://github.com/osmarluiz/Testosterone-Deficiency-Dataset) to encourage research in this still untapped field. For this analysis, we used ten base classifiers (optimized with grid search and stratified K-fold cross-validation), three ensemble methods, and eight sampling strategies to analyze a total of 3397 patients. The analysis was based on six features (age, abdominal circumference, triglycerides, high-density lipoprotein, diabetes, and hypertension), all of which were obtained from low-cost exams. We compared the sampling strategies and the classifiers' performance on an independent test set using ranking (PR-AUC), probabilistic (Brier score), and threshold metrics.
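As a rough illustration of the ranking and probabilistic metrics named above, the following minimal sketch computes a Brier score and an average-precision estimate of PR-AUC in plain Python. The toy labels and probabilities are hypothetical (chosen only to mimic a 4:1 class ratio), the function names are ours, and average precision is assumed here as the PR-AUC estimator; the abstract does not state which estimator the authors used.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Average precision, one common estimate of the area under the PR curve.

    Simplification: assumes no tied scores (ties would need threshold grouping).
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positive labels")
    # Rank samples from highest to lowest predicted probability.
    order = sorted(range(len(y_true)), key=lambda i: y_prob[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / n_pos


# Hypothetical toy data: 8 negatives and 2 positives (4:1 ratio).
y_true = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
y_prob = [0.10, 0.20, 0.90, 0.30, 0.05, 0.40, 0.35, 0.15, 0.25, 0.12]

brier = brier_score(y_true, y_prob)
ap = average_precision(y_true, y_prob)
print(f"Brier score: {brier:.4f}  average precision: {ap:.4f}")
```

A lower Brier score indicates better-calibrated probabilities, while a higher average precision indicates that positives are ranked ahead of negatives; the two can disagree, which is one reason to report both.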
We found that: (1) in terms of ranking metrics, sampling strategies did not improve results in this slightly imbalanced (4:1 ratio) dataset; (2) the ensemble classifier using a weighted average achieved the best performance; (3) the best base classifier was XGBoost; (4) calibration yielded significant improvement for the sampling strategies and slight improvement when no sampling was applied; (5) McNemar's test indicated statistically similar results among all classifiers; and (6) abdominal circumference (AC) had by far the highest feature importance, followed by triglycerides (TG). Age contributed very little to predicting testosterone deficiency.
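To make finding (5) more concrete, here is a minimal sketch of an exact (binomial) McNemar's test comparing two classifiers on the same test set. The predictions are hypothetical toy values, the function name is ours, and the exact two-sided variant is an assumption; the abstract does not say whether the authors used the exact or the chi-square form.

```python
from math import comb
from typing import Sequence, Tuple


def mcnemar_exact(
    y_true: Sequence[int], pred_a: Sequence[int], pred_b: Sequence[int]
) -> Tuple[int, int, float]:
    """Return (b, c, p_value) for an exact two-sided McNemar's test.

    b counts samples classifier A gets right and B gets wrong;
    c counts samples A gets wrong and B gets right.
    """
    b = sum(1 for y, a, p in zip(y_true, pred_a, pred_b) if a == y and p != y)
    c = sum(1 for y, a, p in zip(y_true, pred_a, pred_b) if a != y and p == y)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs: no evidence of a difference
    # Under the null hypothesis, discordant pairs split as Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical predictions from two classifiers on ten test samples.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
pred_a = [1, 0, 0, 1, 0, 1, 0, 0, 0, 0]
pred_b = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]

b, c, p_value = mcnemar_exact(y_true, pred_a, pred_b)
print(f"b={b}, c={c}, p-value={p_value:.4f}")
```

Only the discordant pairs enter the test, so two classifiers with different accuracies can still yield a large p-value when they disagree on few samples; a large p-value means the test found no significant difference, not that the classifiers are identical.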
Keywords
Machine learning, Imbalanced data, Testosterone deficiency, Ensemble classifier