AVT: AU-Assisted Visual Transformer for Facial Expression Recognition
ICIP(2022)
Abstract
Facial expression recognition (FER) has made significant progress over the past few years. However, overcoming high inter-class similarity and large intra-class variation in FER remains challenging. To address this problem, we propose a novel FER framework called AU-assisted Visual Transformer (AVT), which incorporates facial action unit (AU) information into a Visual Transformer. AVT consists of three main modules: a Local Feature Extraction (LFE) module, a Global Relationship Modeling (GRM) module, and an AU Fusion Module (AFM). The LFE module extracts local facial expression features with a deep convolutional neural network. The GRM module is a multi-layer Transformer encoder that captures the relations between local facial regions and builds a global understanding of the face. The AFM introduces fine-grained AU features and fuses them with the expression features for final classification. Extensive experiments on the RAF-DB and FERPlus datasets show that AVT achieves competitive results compared with previous state-of-the-art methods, demonstrating the effectiveness of our approach.
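The data flow through the three modules can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and every concrete choice in it is an assumption made for illustration: a patch-wise linear projection stands in for the deep CNN of the LFE module, one single-head self-attention layer stands in for the multi-layer Transformer encoder of the GRM module, and the AFM is approximated by concatenating a given AU feature vector with the pooled expression feature. The abstract does not specify the fusion mechanism, the feature sizes, or the number of classes, so the function names, the 12-dimensional AU vector, and the 7 output classes are hypothetical. All weights are random, so the output is meaningful only in shape.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_feature_extraction(image, patch=8, dim=32):
    """LFE stand-in (assumption): cut the image into non-overlapping patches
    and project each one linearly. The paper uses a deep CNN here."""
    h, w = image.shape
    patches = (image.reshape(h // patch, patch, w // patch, patch)
                    .transpose(0, 2, 1, 3)
                    .reshape(-1, patch * patch))
    w_proj = rng.standard_normal((patch * patch, dim)) / np.sqrt(patch * patch)
    return patches @ w_proj  # shape: (num_patches, dim)

def global_relationship_modeling(tokens):
    """GRM stand-in (assumption): one single-head self-attention layer with a
    residual connection, relating every local region to every other one."""
    d = tokens.shape[1]
    wq, wk, wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(d))  # shape: (num_patches, num_patches)
    return tokens + attn @ v

def au_fusion(expr_tokens, au_feature, num_classes=7):
    """AFM stand-in (assumption): mean-pool the expression tokens, concatenate
    the AU feature, and classify with a linear layer."""
    expr_feature = expr_tokens.mean(axis=0)
    fused = np.concatenate([expr_feature, au_feature])
    w_cls = rng.standard_normal((fused.size, num_classes)) / np.sqrt(fused.size)
    return softmax(fused @ w_cls)  # class probabilities

# Hypothetical inputs: a 32x32 grayscale face crop and a 12-dim AU feature.
image = rng.standard_normal((32, 32))
au_feature = rng.standard_normal(12)

tokens = local_feature_extraction(image)       # local region features
tokens = global_relationship_modeling(tokens)  # globally contextualised features
probs = au_fusion(tokens, au_feature)          # fused prediction
```

Concatenation is only the simplest fusion one could try; the sketch is meant to show where the AU signal enters the pipeline, after global modeling and before classification, as the abstract describes.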
Keywords
Facial Expression Recognition, Transformer, AU