Toward feature space adversarial attack in the frequency domain

International Journal of Intelligent Systems (2022)

Abstract
Recent research has shown that deep neural networks (DNNs) are vulnerable to adversarial examples, making them unsuitable for security-critical applications. The transferability of adversarial examples is crucial for attacking black-box models and enables adversarial attacks in more practical scenarios. We propose a novel adversarial attack with high transferability. Unlike existing attacks that directly modify the input pixels, our attack operates in the feature space. More specifically, we corrupt the abstract features by maximizing the feature distance between the adversarial example and the clean image, measured with a perceptual similarity network, thereby inducing misclassification. In addition, we apply a spectral transformation to the input, narrowing the search space in the frequency domain to enhance the transferability of adversarial examples. Disrupting crucial features within a specific frequency component yields greater transferability. Extensive evaluations show that our approach is easily compatible with many existing transfer-attack frameworks and can significantly improve the baseline performance of black-box attacks. Moreover, we obtain a higher fooling rate even against defended models, achieving a maximum black-box fooling rate of 61.70% on a defense model. Our work indicates that existing pixel-space defense techniques struggle to guarantee robustness in the feature space, and that viewing the feature space from a frequency perspective is promising for developing more robust models.
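The abstract describes the attack only at a high level; the following is a minimal, hypothetical sketch of that idea, not the authors' implementation. A frozen VGG-16 feature extractor stands in for the perceptual similarity network, an MSE feature distance for the feature metric, and a Fourier low-pass mask for the spectral transformation that restricts the perturbation's frequency content. All names, hyperparameters, and the choice of transform are assumptions.

```python
# Hypothetical sketch of a feature-space, frequency-restricted attack step.
import torch
import torch.nn.functional as F
import torchvision.models as models

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in "perceptual similarity network": mid-level features of a frozen VGG-16.
feat_net = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features[:16].to(device).eval()
for p in feat_net.parameters():
    p.requires_grad_(False)

def lowpass(delta, keep=0.25):
    """Assumed spectral restriction: keep only the lowest `keep` fraction of
    spatial frequencies of the perturbation via a centered Fourier mask."""
    f = torch.fft.fftshift(torch.fft.fft2(delta), dim=(-2, -1))
    h, w = delta.shape[-2:]
    mask = torch.zeros_like(delta)
    kh, kw = int(h * keep) // 2, int(w * keep) // 2
    mask[..., h // 2 - kh : h // 2 + kh, w // 2 - kw : w // 2 + kw] = 1.0
    return torch.fft.ifft2(torch.fft.ifftshift(f * mask, dim=(-2, -1))).real

def feature_space_attack(x_clean, eps=8 / 255, alpha=2 / 255, steps=20):
    """Iteratively maximize the feature distance to the clean image."""
    with torch.no_grad():
        feat_clean = feat_net(x_clean)
    delta = torch.zeros_like(x_clean, requires_grad=True)
    for _ in range(steps):
        x_adv = (x_clean + lowpass(delta)).clamp(0, 1)
        loss = F.mse_loss(feat_net(x_adv), feat_clean)  # feature distance to maximize
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()          # gradient ascent on the distance
            delta.clamp_(-eps, eps)                     # L-infinity budget on the perturbation
            delta.grad.zero_()
    return (x_clean + lowpass(delta)).clamp(0, 1).detach()
```

Under these assumptions, `feature_space_attack` takes a normalized NCHW image batch and returns an adversarial batch whose perturbation lives in the low-frequency band; the frequency mask (here a simple Fourier low-pass) is one concrete way to narrow the search space as the abstract describes.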
Keywords
adversarial examples, black-box attack, computer, deep neural networks, transfer attack