Examining properness in the external validation of survival models with squared and logarithmic losses
CoRR (2022)
Abstract
Scoring rules promote rational and honest decision-making, which is becoming
increasingly important for automated procedures in 'auto-ML'. In this paper we
survey common squared and logarithmic scoring rules for survival analysis and
determine which losses are proper and which are improper. We prove that commonly
used squared and logarithmic scoring rules that are claimed to be proper, such
as the Integrated Survival Brier Score (ISBS), are in fact improper. We further
prove that, under a strict set of assumptions, a class of scoring rules is
strictly proper for what we term 'approximate' survival losses. Despite the
difference in properness, experiments on simulated and real-world datasets show
no major difference between the improper and proper versions of the widely used
ISBS, so we can reasonably trust previous experiments that used the original
score for evaluation purposes. We still advocate for the use of proper
scoring rules, as even minor differences between losses can have important
implications in automated processes such as model tuning. We hope our findings
encourage further research into the properties of survival measures so that
robust and honest evaluation of survival models can be achieved.
Key words
survival, rules, scoring