Shared Interest...Sometimes: Understanding the Alignment between Human Perception, Vision Architectures, and Saliency Map Techniques.

CVPR Workshops (2023)

Abstract
Empirical studies have shown that attention-based architectures outperform traditional convolutional neural networks (CNNs) in terms of accuracy and robustness. As a result, attention-based architectures are increasingly used in high-stakes domains such as radiology and wildlife conservation to aid decision-making. However, how attention-based architectures compare to CNNs in their alignment with human perception remains under-explored. Previous studies of how vision architectures align with human perception evaluate either a single architecture with multiple explainability techniques or multiple architectures with a single explainability technique. Through an empirical analysis, we investigate how two attention-based architectures and two CNNs, each paired with two saliency map techniques, align with ground-truth human perception on 100 images from an interpretability benchmark dataset. Using the Shared Interest metrics, we found that CNNs align more closely with human perception when using the XRAI saliency map technique, whereas we found the opposite for Grad-CAM. We discuss the implications of our analysis for human-centered explainable AI and outline directions for future work.
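To make the evaluation concrete, the sketch below shows one way Shared Interest-style alignment scores between a saliency map and a human-annotated ground-truth region can be computed. This is a minimal illustration, not the paper's actual implementation: the function name, the binarization threshold, and the stand-in data are assumptions, and the three coverage ratios follow the general Shared Interest formulation (overlap relative to the union, the human annotation, and the saliency region).

```python
import numpy as np

def shared_interest_scores(saliency, ground_truth, threshold=0.5):
    """Compute Shared Interest-style alignment scores.

    saliency: 2D float array in [0, 1] (e.g., a Grad-CAM or XRAI map)
    ground_truth: 2D boolean array marking human-annotated relevant pixels
    threshold: cutoff used to binarize the saliency map (an assumption here)
    """
    s = saliency >= threshold          # binarized saliency region
    g = ground_truth.astype(bool)      # human ground-truth region

    intersection = np.logical_and(s, g).sum()
    union = np.logical_or(s, g).sum()

    return {
        # Overlap of the two regions relative to their union
        "iou_coverage": intersection / union if union else 0.0,
        # Fraction of the human-annotated region covered by the saliency region
        "ground_truth_coverage": intersection / g.sum() if g.sum() else 0.0,
        # Fraction of the saliency region that falls inside the human annotation
        "saliency_coverage": intersection / s.sum() if s.sum() else 0.0,
    }

# Example usage with synthetic stand-in data (real inputs would be a Grad-CAM
# or XRAI map and a ground-truth mask from the interpretability benchmark).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    saliency = rng.random((224, 224))
    ground_truth = np.zeros((224, 224), dtype=bool)
    ground_truth[80:160, 80:160] = True
    print(shared_interest_scores(saliency, ground_truth))
```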
Keywords
attention-based architectures, human perception, multiple architectures, saliency map techniques, single architecture, vision architectures, XRAI saliency map technique