A comparison of internal validation methods for validating predictive models for binary data with rare events

Journal of Statistical Research (2018)

Abstract
In clinical research, prediction models for binary data are frequently developed within the logistic regression framework to predict the risk of a patient's health outcome, such as death or illness. However, when the outcome is rare, standard logistic regression based on maximum likelihood (ML) has been reported to show poor predictive performance because it yields an overfitted model. To overcome this, logistic models based on penalized maximum likelihood (PML) are widely used in risk prediction, but their predictive performance in validation settings is not well documented. Several validation approaches, namely split-sample, cross-validation, bootstrap validation and its two variants 0.632 and 0.632+, have been widely used to validate the performance of a prediction model, yet it is unclear which of these approaches is best for accurately estimating the predictive performance of a rare-outcome model. This paper evaluates the predictive performance of the PML-based logistic model in these validation settings, compares it with the ML-based standard model, and identifies the most effective validation method. An extensive simulation study was performed, with several scenarios designed to reflect modeling in datasets with few events. The results show that the PML-based model performed better than the ML-based model, reducing overfitting to some extent and increasing discriminatory ability, irrespective of the validation method studied. Of the validation methods, the regular bootstrap and its variants 0.632 and 0.632+, particularly 0.632+, performed well, providing nearly accurate and stable estimates of the true predictive performance. We also illustrate the methods by applying them to a cardiac dataset with few events.
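The abstract names the 0.632 and 0.632+ bootstrap variants without stating how they combine their inputs. The following is a minimal sketch of one commonly cited form of these estimators for an error-type measure (lower is better, for example misclassification rate). It is not taken from the paper: the function names, the choice of misclassification error, and the capping of the out-of-bag error at the no-information rate are assumptions made for illustration.

```python
# Sketch of the 0.632 and 0.632+ bootstrap estimators (assumed form, not the
# paper's code). Inputs are error-type measures where lower is better:
#   err_apparent : error of the model on the data it was fitted to
#   err_oob      : average error on observations left out of each bootstrap sample
#   gamma        : no-information error rate (error if outcome and predictions
#                  were independent)

def no_information_rate(y_true, y_pred):
    """No-information misclassification rate for binary labels (0/1)."""
    n = len(y_true)
    p = sum(y_true) / n   # observed proportion of events
    q = sum(y_pred) / n   # predicted proportion of events
    return p * (1 - q) + (1 - p) * q

def estimate_632(err_apparent, err_oob):
    """0.632 estimator: fixed-weight blend of apparent and out-of-bag error."""
    return 0.368 * err_apparent + 0.632 * err_oob

def estimate_632_plus(err_apparent, err_oob, gamma):
    """0.632+ estimator: the weight on the out-of-bag error grows with the
    relative overfitting rate r (0 = no overfitting, 1 = maximal)."""
    err_oob_capped = min(err_oob, gamma)  # assumption: cap at gamma
    if err_oob_capped > err_apparent and gamma > err_apparent:
        r = (err_oob_capped - err_apparent) / (gamma - err_apparent)
    else:
        r = 0.0
    w = 0.632 / (1.0 - 0.368 * r)         # w ranges from 0.632 to 1
    return (1.0 - w) * err_apparent + w * err_oob_capped

if __name__ == "__main__":
    # Hypothetical values chosen only to show the call pattern.
    print(estimate_632(0.10, 0.20))
    print(estimate_632_plus(0.10, 0.20, 0.50))
```

With no overfitting (out-of-bag error equal to apparent error) the 0.632+ estimate reduces to the apparent error, and as the out-of-bag error approaches the no-information rate the estimate moves toward the out-of-bag error. Measures where higher is better, such as the c-statistic, would need the formulas adapted accordingly.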
Key words
internal validation methods, predictive models, binary data