How to Train your Antivirus: RL-based Hardening through the Problem-Space
CoRR (2024)
Abstract
ML-based malware detection on dynamic analysis reports is vulnerable to both
evasion and spurious correlations. In this work, we investigate a specific ML
architecture employed in the pipeline of a widely known commercial antivirus
company, with the goal of hardening it against adversarial malware.
Adversarial training, the sole defensive technique that can confer empirical
robustness, is not applicable out of the box in this domain, principally
because gradient-based perturbations rarely map back to feasible
problem-space programs. We introduce a novel Reinforcement Learning approach
for constructing adversarial examples, a constituent part of adversarially
training a model against evasion. Our approach comes with multiple
advantages. It performs modifications that are feasible in the problem-space,
and only those; thus it circumvents the inverse mapping problem. It also
makes it possible to provide theoretical guarantees on the robustness of the
model against a particular set of adversarial capabilities. Our empirical
exploration validates our theoretical insights: we consistently reach a 0%
Attack Success Rate after a few adversarial retraining iterations.
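The abstract does not spell out the agent's action set, reward, or detector
architecture, so the following is only an illustrative sketch of the overall
loop it describes: an epsilon-greedy agent picks additive problem-space
actions (e.g., injecting benign API calls, which adds behavior without
removing the malware's payload), is rewarded by the drop in a surrogate
detector's malware score, and the evasive variants it finds are fed back into
adversarial retraining. The feature encoding, the logistic-regression
detector, and every function name below are assumptions made for
illustration, not the paper's implementation.

```python
# Hypothetical sketch of RL-based problem-space hardening. Toy data and a
# logistic-regression surrogate stand in for the paper's (unspecified) setup.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
N_FEATURES = 32          # toy dynamic-analysis features (e.g., API-call counts)
N_ACTIONS = N_FEATURES   # one action per feature: "inject one benign API call"

def apply_action(features, action):
    """Problem-space-feasible modification: additive only, since an attacker
    can add benign behavior to a sample but cannot remove its payload."""
    modified = features.copy()
    modified[action] += 1.0
    return modified

# Toy data: benign vs. malicious behavior profiles (assumed, not the paper's).
X_benign = rng.poisson(2.0, size=(200, N_FEATURES)).astype(float)
X_malware = rng.poisson(2.0, size=(200, N_FEATURES)).astype(float)
X_malware[:, :4] += rng.poisson(5.0, size=(200, 4))  # malicious signal
X = np.vstack([X_benign, X_malware])
y = np.array([0] * 200 + [1] * 200)
clf = LogisticRegression(max_iter=1000).fit(X, y)

def evade(sample, clf, q, budget=8, epsilon=0.2):
    """One epsilon-greedy episode: apply up to `budget` problem-space actions,
    rewarding each action by the drop it causes in the malware score."""
    x = sample.copy()
    score = clf.predict_proba(x[None])[0, 1]
    for _ in range(budget):
        a = rng.integers(N_ACTIONS) if rng.random() < epsilon else int(np.argmax(q))
        x_new = apply_action(x, a)
        new_score = clf.predict_proba(x_new[None])[0, 1]
        q[a] += 0.5 * ((score - new_score) - q[a])  # incremental value update
        x, score = x_new, new_score
        if score < 0.5:              # detector now says benign: evasion succeeded
            return x, True
    return x, False

# Adversarial retraining loop: attack, fold evasive variants back in, refit.
q = np.zeros(N_ACTIONS)
for it in range(5):
    evasive = []
    for sample in X_malware:
        x_adv, ok = evade(sample, clf, q)
        if ok:
            evasive.append(x_adv)
    asr = len(evasive) / len(X_malware)
    print(f"iteration {it}: attack success rate = {asr:.1%}")
    if not evasive:
        break
    # Evasive variants are still malware, so they rejoin training with label 1.
    X = np.vstack([X, np.array(evasive)])
    y = np.concatenate([y, np.ones(len(evasive), dtype=int)])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
```

On this synthetic data the attack success rate typically collapses within a
few retraining rounds, mirroring the trend the abstract reports, though the
numbers here carry no experimental weight.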