A Tunable Loss Function for Robust Classification: Calibration, Landscape, and Generalization
IEEE Transactions on Information Theory (2022)
Abstract

We introduce a tunable loss function called $\alpha$-loss, parameterized by $\alpha \in (0,\infty]$, which interpolates between the exponential loss ($\alpha = 1/2$), the log-loss ($\alpha = 1$), and the 0–1 loss ($\alpha = \infty$), for the machine learning setting of classification. Theoretically, we illustrate a fundamental connection between $\alpha$-loss and Arimoto conditional entropy, verify the classification-calibration of $\alpha$-loss in order to demonstrate asymptotic optimality via Rademacher complexity generalization techniques, and build upon a notion called strictly local quasi-convexity in order to quantitatively characterize the optimization landscape of $\alpha$-loss. Practically, we perform class imbalance, robustness, and classification experiments on benchmark image datasets using convolutional neural networks. Our main practical conclusion is that certain tasks may benefit from tuning $\alpha$-loss away from log-loss ($\alpha = 1$), and to this end we provide simple heuristics for the practitioner. In particular, navigating the $\alpha$ hyperparameter can readily provide superior model robustness to label flips ($\alpha > 1$) and sensitivity to imbalanced classes ($\alpha < 1$).
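As a concrete reference point, the $\alpha$-loss of the probability $P$ that the model assigns to the true label takes the closed form $\ell^{\alpha}(P) = \frac{\alpha}{\alpha - 1}\bigl(1 - P^{(\alpha-1)/\alpha}\bigr)$ for $\alpha \in (0,1)\cup(1,\infty)$, with the log-loss $-\log P$ as its continuous extension at $\alpha = 1$ and $1 - P$ as its limit at $\alpha = \infty$. The following is a minimal NumPy sketch of this expression; the function name and interface are our own illustration, not the authors' code.

```python
import numpy as np

def alpha_loss(p_true, alpha):
    """alpha-loss of the probability assigned to the true label.

    p_true: probability the model assigns to the correct class, in (0, 1].
    alpha:  tuning parameter in (0, inf); alpha = 1 recovers log-loss,
            and alpha -> inf approaches 1 - p_true (a relaxed 0-1 loss).
    """
    p_true = np.asarray(p_true, dtype=float)
    if np.isclose(alpha, 1.0):
        # Continuous extension at alpha = 1: the log-loss.
        return -np.log(p_true)
    return (alpha / (alpha - 1.0)) * (1.0 - p_true ** ((alpha - 1.0) / alpha))

# Loss incurred by a confidently mislabeled example (small p_true)
# for several settings of alpha.
for a in (0.5, 1.0, 2.0, 100.0):
    print(a, alpha_loss(0.05, a))
```

Evaluated at a small $p_{\text{true}}$, such as a label-flipped example, the printed values shrink as $\alpha$ grows: for $\alpha > 1$ the loss is bounded above by $\alpha/(\alpha-1)$, so mislabeled points contribute only a bounded penalty, while for $\alpha < 1$ the loss penalizes low-probability examples more heavily than log-loss does, consistent with the abstract's heuristics for label-flip robustness ($\alpha > 1$) and class imbalance ($\alpha < 1$).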
Keywords

α-loss, Arimoto conditional entropy, robustness, classification-calibration, strictly local quasi-convexity, generalization