Rectifying adversarial inputs using XAI techniques.

European Signal Processing Conference (EUSIPCO)(2022)

引用 0|浏览0
暂无评分
摘要
With deep neural networks (DNNs) involved in more and more decision making processes, critical security problems can occur when DNNs give wrong predictions. This can be enforced with so-called adversarial attacks. These attacks modify the input in such a way that they are able to fool a neural network into a false classification, while the changes remain imperceptible to a human observer. Even for very specialized AI systems, adversarial attacks are still hardly detectable. The current state-of-the-art adversarial defenses can be classified into two categories: pro-active defense and passive defense, both unsuitable for quick rectifications: Pro-active defense methods aim to correct the input data to classify the adversarial samples correctly, while reducing the accuracy of ordinary samples. Passive defense methods, on the other hand, aim to filter out and discard the adversarial samples. Neither of the defense mechanisms is suitable for the setup of autonomous driving: when an input has to be classified, we can neither discard the input nor have the time to go for computationally expensive corrections. This motivates our method based on explainable artificial intelligence (XAI) for the correction of adversarial samples. We used two XAI interpretation methods to correct adversarial samples. We experimentally compared this approach with baseline methods. Our analysis shows that our proposed method outperforms the state-of-the-art approaches.
更多
查看译文
关键词
explainable AI,neural networks,deep learning,adversarial defense
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要