Improving the Robustness of Quantized Deep Neural Networks to White-Box Attacks using Stochastic Quantization and Information-Theoretic Ensemble Training
CoRR (2023)
Abstract
Most real-world applications that employ deep neural networks (DNNs) quantize
them to low precision to reduce the compute needs. We present a method to
improve the robustness of quantized DNNs to white-box adversarial attacks. We
first tackle the limitation of deterministic quantization to fixed "bins" by
introducing a differentiable Stochastic Quantizer (SQ). We explore the
hypothesis that different quantizations may collectively be more robust than
any individual quantized DNN. We formulate a training objective to encourage different
quantized DNNs to learn different representations of the input image. The
training objective captures diversity and accuracy via the mutual information (MI)
between ensemble members. Through experimentation, we demonstrate substantial
improvement in robustness against $L_\infty$ attacks even if the attacker is
allowed to backpropagate through SQ (e.g., > 50% accuracy under PGD(5/255) on
CIFAR10 without adversarial training), compared to vanilla DNNs as well as
existing ensembles of quantized DNNs. We extend the method to detect attacks
and generate robustness profiles in the adversarial information plane (AIP),
towards a unified analysis of different threat models by correlating the MI and
accuracy.
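
To illustrate the contrast between deterministic quantization to fixed bins and a stochastic alternative, here is a minimal sketch of generic stochastic rounding. It is not the paper's SQ: the abstract does not specify how SQ is built, and this version is not differentiable. The function name `stochastic_quantize` and the parameters `num_levels`, `lo`, `hi`, and `rng` are hypothetical, chosen only for this example.

```python
import random

def stochastic_quantize(x, num_levels=4, lo=0.0, hi=1.0, rng=random):
    """Map x to one of num_levels evenly spaced levels in [lo, hi].

    Illustrative assumption: plain stochastic rounding. The value is
    rounded up with probability equal to its fractional position between
    the two neighbouring levels, so the expected output equals the
    (clipped) input.
    """
    step = (hi - lo) / (num_levels - 1)
    x = min(max(x, lo), hi)          # clip to the representable range
    pos = (x - lo) / step            # position measured in units of one bin
    lower = int(pos)                 # index of the level just below x
    frac = pos - lower               # distance to that level, in [0, 1)
    idx = lower + (1 if rng.random() < frac else 0)
    idx = min(idx, num_levels - 1)   # guard against floating-point overshoot
    return lo + idx * step

if __name__ == "__main__":
    rng = random.Random(0)
    draws = [stochastic_quantize(0.4, num_levels=4, rng=rng) for _ in range(5)]
    print(draws)  # each draw is one of the two levels neighbouring 0.4
```

A deterministic quantizer would always send 0.4 to the same level, whereas here repeated calls land on either neighbouring level, and their average approaches 0.4 as the number of draws grows.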