Investigating White-Box Attacks for On-Device Models
CoRR (2024)
Abstract
Numerous mobile apps have leveraged deep learning capabilities. However,
on-device models are vulnerable to attacks as they can be easily extracted from
their corresponding mobile apps. Existing on-device attack approaches
generate only black-box attacks, which are far less effective and efficient than
white-box strategies. This is because mobile deep learning frameworks like
TFLite do not support gradient computing, which is necessary for white-box
attack algorithms. Thus, we argue that existing findings may underestimate
the harmfulness of on-device attacks. To this end, we conduct a study to answer
this research question: Can on-device models be directly attacked via white-box
strategies? We first systematically analyze the difficulties of transforming
an on-device model into its debuggable version, and propose a Reverse
Engineering framework for On-device Models (REOM), which automatically converts
a compiled on-device TFLite model back into a debuggable model. Specifically, REOM
first transforms compiled on-device models into the Open Neural Network Exchange
(ONNX) format, then removes the non-debuggable parts, and converts them to a
debuggable DL model format that attackers can exploit in a white-box
setting. Our experimental results show that our approach is effective in
achieving automated transformation across 244 TFLite models. Compared with
previous attacks using surrogate models, REOM enables attackers to achieve
higher attack success rates with attack perturbations a hundred times smaller.
In addition, because the ONNX platform offers many tools for model format
conversion, the proposed ONNX-based method can be adapted to
other model formats. Our findings emphasize the need for developers to
carefully consider their model deployment strategies, and use white-box methods
to evaluate the vulnerability of on-device models.
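
To illustrate why gradient access matters for white-box attacks, the following is a minimal sketch of a one-step gradient-sign attack (FGSM, a standard white-box technique that the abstract does not name) against a toy logistic-regression classifier. The model, its weights, the input, and the helper names `predict`, `input_gradient`, and `fgsm` are all hypothetical and chosen for illustration; this is not REOM and not the models evaluated in the paper. The point is only that the attack needs the gradient of the loss with respect to the input, which an inference-only runtime does not expose.

```python
import math

def predict(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def input_gradient(w, b, x, y):
    """Gradient of the cross-entropy loss with respect to the input x.

    For logistic regression this has the closed form (p - y) * w.
    A white-box attacker needs exactly this quantity.
    """
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """One-step attack: move each input feature by eps along the gradient sign."""
    g = input_gradient(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

# Hypothetical weights and input, assumed only for this sketch.
w = [2.0, -1.0, 0.5]
b = 0.1
x = [0.2, 0.1, 0.3]
y = 1.0  # true label

x_adv = fgsm(w, b, x, y, eps=0.3)
print("clean score:", predict(w, b, x))
print("adversarial score:", predict(w, b, x_adv))
```

With these assumed values the clean input is scored above 0.5 and the perturbed input below 0.5, while no feature moves by more than `eps`. A black-box attacker without `input_gradient` would have to estimate the same direction through repeated queries or a surrogate model.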