Facial Expression Recognition Through Cross-Modality Attention Fusion

IEEE Transactions on Cognitive and Developmental Systems (2023)

Abstract
Facial expressions are generally recognized using handcrafted or deep-learning-based features extracted from RGB facial images. However, such recognition methods are sensitive to illumination and pose variations. In particular, they fail to recognize expressions with weak emotion intensities. In this work, we propose a cross-modality attention-based convolutional neural network (CM-CNN) for facial expression recognition. We extract expression-related features from complementary facial images (gray-scale, local binary pattern, and depth images) to handle illumination and pose variations and to capture the appearance details that characterize expressions with weak emotion intensities. Rather than directly concatenating the complementary features, we propose a novel cross-modality attention fusion network that enhances the spatial correlations between any two types of facial images. Finally, the CM-CNN is optimized with an improved focal loss, which pays more attention to facial expressions with weak emotion intensities. The average classification accuracies on VT-KFER, BU-3DFE (P1), BU-3DFE (P2), and Bosphorus are 93.86%, 88.91%, 87.28%, and 85.16%, respectively. Evaluations on these databases demonstrate that our approach is competitive with state-of-the-art algorithms.
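The abstract only names the cross-modality attention fusion and the improved focal loss at a high level. The sketch below is a minimal PyTorch illustration of how attention between two modality streams (e.g., gray-scale and depth feature maps) and a focal-style loss are commonly implemented; it is not the authors' network, and the module names, channel sizes, and hyper-parameters are assumptions.

```python
# Minimal sketch, assuming 4-D feature maps of equal shape per modality;
# not the paper's released implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossModalityAttention(nn.Module):
    """Let one modality's features attend to another's spatial locations."""

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolutions produce query/key/value projections (sizes assumed).
        self.query_a = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key_b = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value_b = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        b, c, h, w = feat_a.shape
        q = self.query_a(feat_a).flatten(2).transpose(1, 2)  # (B, HW, C')
        k = self.key_b(feat_b).flatten(2)                    # (B, C', HW)
        v = self.value_b(feat_b).flatten(2)                  # (B, C, HW)
        # Spatial correlation map between the two modalities.
        attn = F.softmax(torch.bmm(q, k), dim=-1)            # (B, HW, HW)
        fused = torch.bmm(v, attn.transpose(1, 2)).view(b, c, h, w)
        return feat_a + self.gamma * fused                   # attended features added back


def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               gamma: float = 2.0, alpha: float = 0.25) -> torch.Tensor:
    """Standard multi-class focal loss; the paper's 'improved' variant differs."""
    log_p = F.log_softmax(logits, dim=-1)
    ce = F.nll_loss(log_p, targets, reduction="none")
    pt = torch.exp(-ce)  # probability assigned to the true class
    return (alpha * (1.0 - pt) ** gamma * ce).mean()
```

In this layout, each pair of modality streams (gray-scale/LBP, gray-scale/depth, LBP/depth) would get its own attention block before the fused features are passed to the classifier; the pairwise arrangement follows the abstract, while the residual form and the standard focal loss are common choices, not claims about the paper.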
Keywords
Convolutional neural network (CNN), cross-modality attention fusion, facial depth images, facial expression recognition (FER), focal loss