Towards Sample-specific Backdoor Attack with Clean Labels via Attribute Trigger
CoRR(2023)
Abstract
Currently, sample-specific backdoor attacks (SSBAs) are the most advanced and
malicious methods since they can easily circumvent most of the current backdoor
defenses. In this paper, we reveal that SSBAs are not sufficiently stealthy due
to their poisoned-label nature, where users can discover anomalies if they
check the image-label relationship. In particular, we demonstrate that it is
ineffective to directly generalize existing SSBAs to their clean-label variants
by poisoning samples solely from the target class. We reveal that it is
primarily due to two reasons, including \textbf{(1)} the `antagonistic effects'
of ground-truth features and \textbf{(2)} the learning difficulty of
sample-specific features. Accordingly, trigger-related features of existing
SSBAs cannot be effectively learned under the clean-label setting due to their
mild trigger intensity required for ensuring stealthiness. We argue that the
intensity constraint of existing SSBAs is mostly because their trigger patterns
are `content-irrelevant' and therefore act as `noises' for both humans and
DNNs. Motivated by this understanding, we propose to exploit content-relevant
features, $a.k.a.$ (human-relied) attributes, as the trigger patterns to design
clean-label SSBAs. This new attack paradigm is dubbed backdoor attack with
attribute trigger (BAAT). Extensive experiments are conducted on benchmark
datasets, which verify the effectiveness of our BAAT and its resistance to
existing defenses.
MoreTranslated text
AI Read Science
Must-Reading Tree
Example
![](https://originalfileserver.aminer.cn/sys/aminer/pubs/mrt_preview.jpeg)
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined