Appearance-Based Gaze Estimation Method Using Static Transformer Temporal Differential Network

Yujie Li, Longzhao Huang, Jiahui Chen, Xiwen Wang, Benying Tan

Mathematics (2023)

Cited by 3 | Viewed 14
Abstract
Gaze behavior is important, non-invasive human-computer interaction information that plays a significant role in many fields, including skills transfer, psychology, and human-computer interaction. Recently, improving the performance of appearance-based gaze estimation using deep learning techniques has attracted increasing attention; however, several key problems in these deep-learning-based gaze estimation methods remain. Firstly, the feature fusion stage is not fully considered: existing methods simply concatenate the different obtained features into one feature, without considering their internal relationship. Secondly, dynamic features can be difficult to learn, because of the unstable extraction process of ambiguously defined dynamic features. In this study, we propose a novel method to address the feature fusion and dynamic feature extraction problems. We propose the static transformer module (STM), which uses a multi-head self-attention mechanism to fuse fine-grained eye features and coarse-grained facial features. Additionally, we propose an innovative recurrent neural network (RNN) cell, the temporal differential module (TDM), which can be used to extract dynamic features. We integrated the STM and the TDM into the static transformer with temporal differential network (STTDN). We evaluated the STTDN performance on two publicly available datasets (MPIIFaceGaze and Eyediap), and demonstrated the effectiveness of the STM and the TDM. Our results show that the proposed STTDN outperformed state-of-the-art methods, including by 2.9% on Eyediap.
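The abstract describes the STM as fusing fine-grained eye features and coarse-grained face features with multi-head self-attention, rather than plain concatenation. The paper's exact architecture is not given here, so the following is only a minimal NumPy sketch of that general idea: each region's feature vector (left eye, right eye, face) is treated as a token, and self-attention lets every token attend to the others before fusion. All dimensions, weights, and the three-token layout are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(tokens, Wq, Wk, Wv, Wo, num_heads):
    """Fuse feature tokens with multi-head self-attention.

    tokens: (T, d) array; here T=3 hypothetical tokens
    (left-eye, right-eye, face features).
    """
    T, d = tokens.shape
    dh = d // num_heads  # per-head dimension
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    # Split into heads: (num_heads, T, dh)
    Q = Q.reshape(T, num_heads, dh).transpose(1, 0, 2)
    K = K.reshape(T, num_heads, dh).transpose(1, 0, 2)
    V = V.reshape(T, num_heads, dh).transpose(1, 0, 2)
    # Scaled dot-product attention per head: (num_heads, T, T)
    attn = softmax(Q @ K.transpose(0, 2, 1) / np.sqrt(dh), axis=-1)
    out = attn @ V                                   # (num_heads, T, dh)
    out = out.transpose(1, 0, 2).reshape(T, d)       # merge heads back
    return out @ Wo                                  # (T, d) fused tokens

rng = np.random.default_rng(0)
d, num_heads = 8, 2
# Hypothetical per-region features: left eye, right eye, face.
tokens = rng.standard_normal((3, d))
Wq, Wk, Wv, Wo = (0.1 * rng.standard_normal((d, d)) for _ in range(4))
fused = multi_head_self_attention(tokens, Wq, Wk, Wv, Wo, num_heads)
print(fused.shape)  # (3, 8): one attention-fused vector per region
```

The fused tokens can then be pooled (e.g., averaged or flattened) into a single feature for the gaze regression head; that downstream step, like the rest of this sketch, is an assumption about how such a fusion module is typically used.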
Keywords
gaze estimation,static transformer temporal differential network,static transformer module,temporal differential module,self-attention mechanism