Inflation-Deflation Networks for Recognizing Head-Movement Functions in Face-to-Face Conversations

Kazuki Takeda, Kazuhiro Otsuka

Multimodal Interfaces and Machine Learning for Multimodal Interaction (2021)

ABSTRACT Head movements serve various functions in face-to-face conversations. Recently, convolutional neural networks (CNNs) have been proposed to recognize the communicative functions of head movements from the time series of interlocutors’ head pose angles during multiparty conversations. However, their recognition performance leaves room for improvement. To boost the CNNs’ performance, this paper proposes a feature Inflation-Deflation module (I/DeF module), an additional module attached ahead of the CNNs’ input layer to facilitate feature learning of head-movement dynamics. The I/DeF module consists of repeated inflation and deflation processes. The inflation process upscales and extrapolates the windowed input time series with a transposed convolution. The deflation process compresses the inflated data and restores its original length. Targeting ten frequent head-movement functions, the experiments showed that CNNs with the I/DeF module (I/DeF-CNNs) outperformed the previous CNNs in all function categories, by up to 4.5 points in F-measure. We also integrated the I/DeF module into VGG and ResNet. Compared with these models, I/DeF-CNNs performed best for 8 out of 10 functions. These results confirm the effectiveness of the I/DeF module and its potential for advancing nonverbal behavior recognition.
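The length bookkeeping behind the inflation and deflation processes can be illustrated with a minimal single-channel sketch in plain Python. It is not the authors' implementation. The abstract only states that inflation uses a transposed convolution and that deflation restores the original length. The strided convolution used for deflation, the fixed (non-learned) kernels, the absence of nonlinearities, and the names `inflate`, `deflate`, and `idef_module` are all assumptions made for illustration. With input length L, stride s, and kernel size k, the transposed convolution yields (L - 1) * s + k samples, and a strided convolution with the same k and s maps that back to L.

```python
def inflate(x, kernel, stride):
    """Transposed 1-D convolution: each input sample adds a scaled copy of
    the kernel to the output, shifted by `stride` per sample.
    Output length is (len(x) - 1) * stride + len(kernel)."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        for j, weight in enumerate(kernel):
            out[i * stride + j] += value * weight
    return out


def deflate(y, kernel, stride):
    """Strided 1-D convolution (cross-correlation, no padding).
    Output length is (len(y) - len(kernel)) // stride + 1.
    Assumption: the abstract does not say how deflation is implemented."""
    n_out = (len(y) - len(kernel)) // stride + 1
    return [
        sum(y[i * stride + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n_out)
    ]


def idef_module(x, up_kernel, down_kernel, stride, repeats=2):
    """Hypothetical I/DeF-style stack: repeat inflate -> deflate.
    Equal kernel sizes guarantee that each round restores len(x)."""
    assert len(up_kernel) == len(down_kernel)
    for _ in range(repeats):
        x = deflate(inflate(x, up_kernel, stride), down_kernel, stride)
    return x


if __name__ == "__main__":
    # Toy window of head pose angles (made-up values, one channel).
    window = [0.0, 1.0, 2.0, 3.0]
    inflated = inflate(window, [1.0, 1.0, 1.0], stride=2)
    restored = idef_module(window, [1.0, 1.0, 1.0], [1 / 3, 1 / 3, 1 / 3], stride=2)
    print(len(window), len(inflated), len(restored))
```

In a trainable model the kernels would be learned parameters and the module's output would feed the CNN's input layer. The sketch only shows how the window length grows during inflation and is recovered by deflation.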