BruSLeAttack: A Query-Efficient Score-Based Black-Box Sparse Adversarial Attack
arXiv (2024)
Abstract
We study the less well-understood problem of generating sparse
adversarial examples solely by observing the scores returned for model
queries. Sparse attacks aim to find the minimum number of perturbed input
elements (an l0-bounded perturbation) needed to craft adversarial examples
and mislead model decisions. In contrast to query-based dense attacks
against black-box models, constructing sparse adversarial
perturbations is non-trivial, even when the model returns confidence scores
for each query in a score-based setting, because such an attack leads to i)
an NP-hard problem; and ii) a non-differentiable search space. We develop
BruSLeAttack, a new, faster (more query-efficient) Bayesian algorithm for the
problem. We conduct extensive attack evaluations including an attack
demonstration against a Machine Learning as a Service (MLaaS) offering
exemplified by Google Cloud Vision and robustness testing of adversarial
training regimes and a recent defense against black-box attacks. The proposed
attack scales to achieve state-of-the-art attack success rates and query
efficiency on standard computer vision benchmarks such as ImageNet across different
model architectures. Our artefacts and DIY attack samples are available on
GitHub. Importantly, our work facilitates faster evaluation of model
vulnerabilities and heightens vigilance about the safety, security, and
reliability of deployed systems.
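
To make the score-based, l0-bounded setting described above concrete, the following is a minimal sketch of a naive random-search baseline. It is not the BruSLeAttack algorithm, whose Bayesian procedure the abstract does not detail. The names toy_scores, random_sparse_attack, and the weight matrix W are hypothetical; the "model" is a toy linear softmax classifier standing in for a black box that returns only confidence scores.

```python
import numpy as np

def toy_scores(x, W):
    """Hypothetical black-box model: exposes softmax confidence scores only."""
    logits = W @ x
    e = np.exp(logits - logits.max())  # subtract max for numerical stability
    return e / e.sum()

def random_sparse_attack(x, y, query, k, max_queries=500, seed=0):
    """Naive random search (illustrative baseline, not the paper's method).

    Each trial alters exactly k randomly chosen entries of the clean input x,
    so every candidate differs from x in at most k positions (the l0 budget).
    Returns (best candidate, queries used in the loop, success flag).
    """
    rng = np.random.default_rng(seed)
    best = x.copy()
    best_score = query(x)[y]  # score of the true class on the clean input
    for n in range(1, max_queries + 1):
        cand = x.copy()
        idx = rng.choice(x.size, size=k, replace=False)
        cand[idx] = rng.integers(0, 2, size=k)  # overwrite with 0 or 1
        scores = query(cand)
        if int(np.argmax(scores)) != y:
            return cand, n, True  # prediction changed: untargeted success
        if scores[y] < best_score:
            best, best_score = cand, scores[y]
    return best, max_queries, False

# Toy setup: 3 classes, 64 input features with values in [0, 1).
rng = np.random.default_rng(1)
W = rng.normal(size=(3, 64))
x = rng.random(64)
y = int(np.argmax(toy_scores(x, W)))
adv, used, ok = random_sparse_attack(x, y, lambda v: toy_scores(v, W), k=8)
```

Because the attacker sees only the returned scores and must choose which k positions to alter, the search is combinatorial and gradient-free. That is the difficulty the abstract attributes to the sparse setting, and a blind search like this one typically wastes many queries.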