On Smoothed Explanations: Quality and Robustness

Conference on Information and Knowledge Management (2022)

Abstract
Explanation methods highlight the importance of the input features in a predictive decision, and represent a way to increase the transparency and trustworthiness of machine learning models and deep neural networks (DNNs). However, explanation methods can be easily manipulated into generating misleading explanations, particularly under visually imperceptible adversarial perturbations. Recent work has identified the decision surface geometry of DNNs as the main cause of this phenomenon. To make explanation methods more robust against adversarially crafted perturbations, recent research has promoted several smoothing approaches. These approaches smooth either the explanation map or the decision surface. In this work, we initiate a thorough evaluation of the quality and robustness of the explanations offered by smoothing approaches, covering several different properties. We present settings in which the smoothed explanations are better, and settings in which they are worse, than the explanations derived by the commonly used (non-smoothed) Gradient explanation method. By making the connection with the literature on adversarial attacks, we demonstrate that such smoothed explanations are robust primarily against additive attacks. However, a combination of additive and non-additive attacks can still manipulate these explanations, revealing important shortcomings in their robustness properties.
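To make the distinction between the plain Gradient explanation and a smoothed explanation map concrete, the sketch below compares the two on a toy network. Everything in it is an illustrative assumption rather than the paper's setup: the one-hidden-layer ReLU model, the function names, and the choice of noise averaging (in the style of SmoothGrad) as the smoothing approach, since the abstract does not say which smoothing methods are evaluated.

```python
import numpy as np

def gradient_explanation(x, w1, w2):
    """Plain Gradient explanation for the toy score w2 . relu(w1 @ x).

    Returns d(score)/d(x), one importance value per input feature.
    """
    active = (w1 @ x > 0).astype(float)  # derivative of ReLU at each hidden unit
    return w1.T @ (w2 * active)

def smoothed_explanation(x, w1, w2, sigma=0.1, n_samples=50, seed=0):
    """Smooth the explanation map by averaging gradients over noisy inputs.

    Hypothetical helper: sigma and n_samples are illustrative defaults,
    not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples, x.size))
    grads = [gradient_explanation(x + eps, w1, w2) for eps in noise]
    return np.mean(grads, axis=0)

# Toy weights and input (random, for illustration only).
rng = np.random.default_rng(1)
w1, w2 = rng.normal(size=(8, 4)), rng.normal(size=8)
x = rng.normal(size=4)

print("gradient:", gradient_explanation(x, w1, w2))
print("smoothed:", smoothed_explanation(x, w1, w2))
```

With `sigma=0` the smoothed map reduces to the plain gradient. Larger values average over more of the piecewise-linear regions around `x`, which is the sense in which the explanation map is smoothed here.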
Keywords
Transparency, Explainable AI, Gradient-based explanations, Robust explanations, Neural Networks, Adversarial Robustness