Rigorous Probabilistic Guarantees for Robust Counterfactual Explanations
arXiv (2024)
Abstract
We study the problem of assessing the robustness of counterfactual
explanations for deep learning models. We focus on plausible model
shifts altering model parameters and propose a novel framework to reason
about the robustness property in this setting. To motivate our solution, we
begin by showing for the first time that computing the robustness of
counterfactuals with respect to plausible model shifts is NP-complete. As this
(practically) rules out the existence of scalable algorithms for exactly
computing robustness, we propose a novel probabilistic approach that provides
tight estimates of robustness with strong guarantees while
preserving scalability. Notably, and unlike existing solutions
targeting plausible model shifts, our approach imposes no requirements on
the network being analyzed, thus enabling robustness analysis on a wider range
of architectures. Experiments on four binary classification datasets indicate
that our method improves the state of the art in generating robust
explanations, outperforming existing methods on a range of metrics.
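
To make the idea of a sampling-based robustness estimate concrete, here is a minimal sketch. It is not the method of the paper, whose details the abstract does not give. It assumes a toy linear classifier, assumes that a "plausible model shift" perturbs each parameter independently and uniformly by at most `delta`, and uses a standard Hoeffding bound to choose the sample size. The names `hoeffding_samples`, `predict`, and `estimate_robustness` are hypothetical.

```python
import math
import random


def hoeffding_samples(eps, alpha):
    """Samples needed so the empirical rate is within eps of the true rate
    with probability at least 1 - alpha (two-sided Hoeffding bound)."""
    return math.ceil(math.log(2.0 / alpha) / (2.0 * eps ** 2))


def predict(weights, bias, x):
    """Toy linear binary classifier (an assumption, standing in for a network)."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else 0


def estimate_robustness(weights, bias, x_cf, target, delta,
                        eps=0.05, alpha=0.01, seed=0):
    """Estimate the fraction of sampled model shifts under which the
    counterfactual x_cf is still classified as `target`.

    Assumption: a shift adds independent uniform noise in [-delta, delta]
    to every parameter. The guarantee is statistical and holds with respect
    to this sampling distribution, not for the worst-case shift.
    """
    rng = random.Random(seed)
    n = hoeffding_samples(eps, alpha)
    valid = 0
    for _ in range(n):
        shifted_w = [w + rng.uniform(-delta, delta) for w in weights]
        shifted_b = bias + rng.uniform(-delta, delta)
        if predict(shifted_w, shifted_b, x_cf) == target:
            valid += 1
    return valid / n, n


if __name__ == "__main__":
    rate, n = estimate_robustness([1.0, 1.0], 0.0, [5.0, 5.0],
                                  target=1, delta=0.1)
    print(f"estimated validity rate {rate:.3f} from {n} sampled shifts")
```

The model is only queried as a black box here, which illustrates why a sampling approach need not constrain the architecture. The trade-off is that the result is a probabilistic estimate rather than an exact certificate.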