Better constraints of imperceptibility, better adversarial examples in the text

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS (2022)

Citations: 3 | Views: 9
Abstract
State-of-the-art adversarial attacks in the text domain have shown their power to induce machine learning models to produce abnormal outputs. The examples generated by these attacks have three important attributes: attack ability, transferability, and imperceptibility. However, compared with the other two attributes, the imperceptibility of adversarial examples has not been well investigated. Unlike pixel-level perturbations in images, adversarial perturbations in text are usually traceable, appearing as changes to characters, words, or sentences, so generating imperceptible examples is harder in text than in images. Constraining the adversarial perturbations added to text is therefore a crucial step toward constructing more natural adversarial texts. Unfortunately, recent studies merely select measurements to constrain the added perturbations; none of them explain when these measurements are suitable, which one is better, or how they perform across different kinds of adversarial attacks. In this paper, we fill this gap by comparing the performance of these metrics across various attacks. Furthermore, we propose a stricter constraint for word-level attacks that yields more imperceptible examples and also helps enhance existing word-level attacks for adversarial training.
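The abstract does not specify the paper's stricter constraint, but the general idea of jointly thresholding semantic and visual similarity for word-level attacks can be sketched as below. This is a minimal illustration, not the authors' method: the function names (bow_cosine, char_ratio, is_imperceptible) and the thresholds are hypothetical, and a practical implementation would replace the bag-of-words proxy with sentence embeddings.

```python
import difflib
from collections import Counter
from math import sqrt

def bow_cosine(a: str, b: str) -> float:
    """Crude semantic-similarity proxy: cosine over bag-of-words counts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = sqrt(sum(v * v for v in ca.values()))
    nb = sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def char_ratio(a: str, b: str) -> float:
    """Character-level (visual) similarity via difflib's matching ratio."""
    return difflib.SequenceMatcher(None, a, b).ratio()

def is_imperceptible(original: str, candidate: str,
                     sem_thresh: float = 0.85,
                     vis_thresh: float = 0.90) -> bool:
    """Accept a word-level adversarial candidate only if it stays close
    to the original both semantically and visually (a joint, stricter
    test than either measurement alone)."""
    return (bow_cosine(original, candidate) >= sem_thresh
            and char_ratio(original, candidate) >= vis_thresh)

if __name__ == "__main__":
    orig = "the movie was a delightful surprise"
    cand = "the movie was a delightfull surprise"  # one-character perturbation
    print(is_imperceptible(orig, cand))
```

Combining the two measurements reflects the abstract's point that different metrics suit different attacks: a semantic check alone can miss visually obvious character swaps, while a character-level check alone can miss meaning-changing synonym substitutions.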
Keywords
adversarial texts, imperceptibility, measurements, semantic and visual similarity, visual optimization