Transformer-Augmented Network With Online Label Correction for Facial Expression Recognition.

IEEE Trans. Affect. Comput. (2024)

Abstract
Facial expression recognition (FER) in the wild is extremely challenging due to occlusions, varying head poses, and incorrect annotations (e.g., label noise) under unconstrained conditions. In this paper, we aim to improve the performance of in-the-wild FER with Transformers and online label correction. Different from pure CNN-based methods, we propose a Transformer-augmented network (TAN) to dynamically capture the relationships within each facial patch and across facial patches. Specifically, the TAN translates a number of facial patch images into a set of visual feature sequences by a backbone convolutional neural network. The intra-patch Transformer is subsequently utilized to capture the most discriminative features within each visual feature sequence. A position-disentangled attention mechanism is proposed for the intra-patch Transformer to better incorporate positional information into the feature sequences. Furthermore, we propose the inter-patch Transformer to model the dependencies across these feature sequences. More importantly, we present the online label correction (OLC) framework to correct suspicious hard labels and accumulate soft labels based on the predictions of the model, which strengthens the robustness of our model against label noise. We validate our method on several widely used datasets (RAF-DB, FERPlus, AffectNet), realistic occlusion and pose variation datasets, and synthetic noisy datasets. Extensive experiments on these benchmarks demonstrate that the proposed method performs favorably against state-of-the-art methods. The source code will be made publicly available.
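To make the two-stage design in the abstract concrete, below is a minimal sketch (not the authors' released code) of the pipeline: a shared CNN backbone maps each cropped facial patch to a token sequence, an intra-patch Transformer attends within each sequence, and an inter-patch Transformer models dependencies across the patches before classification. All module names, dimensions, and the use of standard (rather than position-disentangled) attention are illustrative assumptions.

```python
# Sketch of the TAN idea from the abstract: CNN backbone -> intra-patch
# Transformer -> inter-patch Transformer -> expression logits.
# Hyperparameters and layer choices are assumptions, not the paper's settings.
import torch
import torch.nn as nn


class TANSketch(nn.Module):
    def __init__(self, num_patches=5, feat_dim=256, num_classes=7):
        super().__init__()
        # Tiny stand-in backbone shared across all facial patches.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),  # 4x4 spatial map -> 16-token sequence
        )
        intra_layer = nn.TransformerEncoderLayer(feat_dim, nhead=4, batch_first=True)
        inter_layer = nn.TransformerEncoderLayer(feat_dim, nhead=4, batch_first=True)
        self.intra = nn.TransformerEncoder(intra_layer, num_layers=2)
        self.inter = nn.TransformerEncoder(inter_layer, num_layers=2)
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, patches):
        # patches: (B, P, 3, H, W) -- P cropped facial patch images per face
        b, p = patches.shape[:2]
        x = self.backbone(patches.flatten(0, 1))   # (B*P, C, 4, 4)
        x = x.flatten(2).transpose(1, 2)           # (B*P, 16, C) token sequences
        x = self.intra(x).mean(dim=1)              # attend within each patch, then pool
        x = self.inter(x.view(b, p, -1))           # attend across the P patches
        return self.head(x.mean(dim=1))            # expression logits


if __name__ == "__main__":
    model = TANSketch()
    logits = model(torch.randn(2, 5, 3, 64, 64))   # 2 faces, 5 patches each
    print(logits.shape)                            # torch.Size([2, 7])
```

The online label correction described in the abstract would sit on top of such a model during training, replacing suspicious hard labels with soft labels accumulated from the model's own predictions; it is omitted here for brevity.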
Keywords
Facial expression recognition in the wild, Transformers, position-disentangled attention, online label correction