Robust instance-dependent cost-sensitive classification

ADVANCES IN DATA ANALYSIS AND CLASSIFICATION(2023)

引用 0|浏览8
暂无评分
摘要
Instance-dependent cost-sensitive (IDCS) learning methods have proven useful for binary classification tasks where individual instances are associated with variable misclassification costs. However, we demonstrate in this paper by means of a series of experiments that IDCS methods are sensitive to noise and outliers in relation to instance-dependent misclassification costs and their performance strongly depends on the cost distribution of the data sample. Therefore, we propose a generic three-step framework to make IDCS methods more robust: (i) detect outliers automatically, (ii) correct outlying cost information in a data-driven way, and (iii) construct an IDCS learning method using the adjusted cost information. We apply this framework to cslogit, a logistic regression-based IDCS method, to obtain its robust version, which we name r-cslogit. The robustness of this approach is introduced in steps (i) and (ii), where we make use of robust estimators to detect and impute outlying costs of individual instances. The newly proposed r-cslogit method is tested on synthetic and semi-synthetic data and proven to be superior in terms of savings compared to its non-robust counterpart for variable levels of noise and outliers. All our code is made available online at https://github.com/SimonDeVos/Robust-IDCS .
更多
查看译文
关键词
Cost-sensitive learning,Instance-dependent costs,Classification,Outliers,Regression diagnostics,Logistic regression
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要