Strong Transferable Adversarial Attacks via Ensembled Asymptotically Normal Distribution Learning
arXiv (2022)
Abstract
Strong adversarial examples are crucial for evaluating and enhancing the robustness of deep neural networks. However, the performance of popular attacks is usually sensitive, for instance, to minor image transformations, because only limited information is available: typically a single input example, a handful of white-box source models, and unknown defense strategies. As a result, the crafted adversarial examples tend to overfit the source model, which hampers their transferability to unseen architectures. In this paper, we propose an approach named Multiple Asymptotically Normal Distribution Attacks (MultiANDA) that explicitly characterizes adversarial perturbations through a learned distribution. Specifically, we approximate the posterior distribution over the perturbations by exploiting the asymptotic normality of stochastic gradient ascent (SGA). We then employ the deep ensemble strategy as an effective proxy for Bayesian marginalization, estimating a mixture of Gaussians that enables a more thorough exploration of the potential optimization space. The approximated posterior essentially describes the stationary distribution of the SGA iterates, capturing the geometric information around the local optimum. MultiANDA therefore allows drawing an unlimited number of adversarial perturbations for each input while reliably maintaining transferability. Extensive experiments on seven normally trained and seven defense models show that the proposed method outperforms ten state-of-the-art black-box attacks on deep learning models with and without defenses.
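
The following Python sketch is only an illustration of the procedure outlined in the abstract, not the authors' implementation: it runs a stochastic-gradient-ascent attack on a perturbation, fits a Gaussian to the SGA iterates (the asymptotic-normality approximation), repeats this over several independent runs to form a mixture of Gaussians (the deep-ensemble proxy), and then samples arbitrarily many perturbations from the mixture. All names and hyperparameters (`loss_grad`, `eps`, step sizes, augmentation noise) are hypothetical placeholders.

```python
import numpy as np

def loss_grad(x_adv, x, y):
    """Placeholder for the gradient of the attack loss w.r.t. the input.
    In practice this would back-propagate through the source model."""
    return np.sign(x_adv - x) + 0.1 * np.random.randn(*x.shape)

def sga_gaussian(x, y, eps=8 / 255, lr=2 / 255, steps=20, aug_copies=4):
    """One SGA run: collect perturbation iterates and fit a Gaussian
    (mean and diagonal variance) to approximate their stationary distribution."""
    delta = np.zeros_like(x)
    iterates = []
    for _ in range(steps):
        # Stochasticity comes from averaging gradients over randomly augmented copies.
        grads = [loss_grad(np.clip(x + delta + 0.03 * np.random.randn(*x.shape), 0, 1), x, y)
                 for _ in range(aug_copies)]
        g = np.mean(grads, axis=0)
        delta = np.clip(delta + lr * np.sign(g), -eps, eps)
        iterates.append(delta.copy())
    iterates = np.stack(iterates)
    return iterates.mean(axis=0), iterates.var(axis=0)

def multianda(x, y, n_runs=5, n_samples=10, eps=8 / 255):
    """Ensemble several independent SGA runs into a mixture of Gaussians
    and draw adversarial examples by sampling from the mixture."""
    components = [sga_gaussian(x, y, eps=eps) for _ in range(n_runs)]
    samples = []
    for _ in range(n_samples):
        mu, var = components[np.random.randint(n_runs)]  # pick a mixture component
        delta = np.clip(mu + np.sqrt(var) * np.random.randn(*x.shape), -eps, eps)
        samples.append(np.clip(x + delta, 0, 1))
    return samples

x = np.random.rand(3, 32, 32)   # toy input image in [0, 1]
advs = multianda(x, y=0)
print(len(advs), advs[0].shape)
```

In this simplified view, each SGA run contributes one Gaussian component, so the number of perturbations that can be drawn is unbounded, which matches the abstract's claim of generating an unlimited number of adversarial perturbations per input.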