Consistent algorithms for multi-label classification with macro-at-k metrics
CoRR (2024)
Abstract
We consider the optimization of complex performance metrics in multi-label
classification under the population utility framework. We mainly focus on
metrics linearly decomposable into a sum of binary classification utilities
applied separately to each label with an additional requirement of exactly k
labels predicted for each instance. These "macro-at-k" metrics have
desirable properties for extreme classification problems with long-tail labels.
Unfortunately, the at-k constraint couples the otherwise independent binary
classification tasks, leading to a much more challenging optimization problem
than standard macro-averages. We provide a statistical framework to study this
problem, prove the existence and the form of the optimal classifier, and
propose a statistically consistent and practical learning algorithm based on
the Frank-Wolfe method. Interestingly, our main results concern even more
general metrics that are non-linear functions of label-wise confusion matrices.
Empirical results provide evidence for the competitive performance of the
proposed approach.
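
To make the definition of a macro-at-k metric concrete, the following minimal sketch computes macro-averaged recall when exactly k labels are predicted per instance. It is only an illustration of the metric, not the Frank-Wolfe algorithm proposed in the paper: the function names `top_k_predictions` and `macro_recall_at_k`, the toy data, the plain top-k rule on scores, and the convention that a label with no positives contributes 0 are all assumptions made for this example.

```python
from typing import List, Sequence


def top_k_predictions(scores: Sequence[Sequence[float]], k: int) -> List[List[int]]:
    """Predict exactly k labels per instance by taking the k highest scores.

    Hypothetical helper: a plain top-k rule, not the paper's optimal classifier.
    """
    preds = []
    for row in scores:
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        chosen = set(ranked[:k])
        preds.append([1 if j in chosen else 0 for j in range(len(row))])
    return preds


def macro_recall_at_k(y_true: Sequence[Sequence[int]],
                      y_pred: Sequence[Sequence[int]]) -> float:
    """Average of per-label recalls, each computed as a binary utility.

    Assumption: a label with no positive instances contributes 0.
    """
    n_labels = len(y_true[0])
    total = 0.0
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        pos = sum(1 for t in y_true if t[j] == 1)
        total += tp / pos if pos else 0.0
    return total / n_labels


# Toy data: 3 instances, 3 labels; label 2 is a rare "tail" label.
y_true = [[1, 0, 0], [1, 0, 0], [0, 1, 1]]
scores = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.1, 0.6, 0.3]]

preds = top_k_predictions(scores, k=1)
metric = macro_recall_at_k(y_true, preds)
```

In this toy case the tail label is never selected under k = 1, so its recall of 0 pulls the macro average down even though every prediction is a true positive. Because each instance has a fixed budget of k labels, choosing one label means dropping another, which is the coupling between the binary tasks that the abstract describes.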
Keywords
multi-label classification, complex performance metrics, macro-at-k, extreme classification