Towards Non-Adversarial Algorithmic Recourse
CoRR(2024)
Abstract
The streams of research on adversarial examples and counterfactual
explanations have largely been growing independently. This has led to several
recent works trying to elucidate their similarities and differences. Most
prominently, it has been argued that adversarial examples, as opposed to
counterfactual explanations, have a unique characteristic in that they lead to
a misclassification compared to the ground truth. However, the computational
goals and methodologies employed in existing counterfactual explanation and
adversarial example generation methods often lack alignment with this
requirement. Using formal definitions of adversarial examples and
counterfactual explanations, we introduce non-adversarial algorithmic recourse
and outline why in high-stakes situations, it is imperative to obtain
counterfactual explanations that do not exhibit adversarial characteristics. We
subsequently investigate how different components in the objective functions,
e.g., the machine learning model or cost function used to measure distance,
determine whether the outcome can be considered an adversarial example or not.
Our experiments on common datasets highlight that these design choices are
often more critical in deciding whether recourse is non-adversarial than
whether recourse or attack algorithms are used. Furthermore, we show that
choosing a robust and accurate machine learning model results in less
adversarial recourse desired in practice.
MoreTranslated text
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined