Video Emotion Recognition using Hand-Crafted and Deep Learning Features

2018 First Asian Conference on Affective Computing and Intelligent Interaction (ACII Asia)(2018)

Abstract
In this paper, we present our system for the video emotion recognition task of the Multimodal Emotion Challenge (MEC 2017). Histogram of Oriented Gradients (HOG), face shape (SHAPE), and geometric (GEO) features are extracted from the detected face images as hand-crafted video features. A pre-trained VGG-Face model is fine-tuned with the face images and emotion labels from the training set of CHEAVD 2.0, and the outputs of the penultimate fully-connected layer (FC6) and the last fully-connected layer (FC7) are adopted as Deep Convolutional Neural Network (DCNN) based features. For each video clip, each hand-crafted or DCNN-based feature type is fed into its own set of hidden Markov models (HMMs), one per emotion class, for an initial emotion recognition. The log-likelihoods output by the HMMs are then ranked, and the resulting ranks form an eight-dimensional feature vector that is input to a Naive Bayes classifier for decision fusion. Experimental results on the CHEAVD 2.0 database show that the combination of FC6, GEO, SHAPE, and HOG features obtains the highest macro average precision (MAP) on both the validation set (46.61%) and the test set (43.88%), which are 12.51% and 22.18% higher than the baseline results, respectively.
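The rank-based decision fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `rank_vector` and `RankNaiveBayes` are hypothetical, the Naive Bayes variant (categorical with Laplace smoothing) is an assumption because the abstract does not specify one, and ties between log-likelihoods are broken by class index for lack of a stated rule. The per-class log-likelihoods are assumed to come from HMMs that have already been trained.

```python
import math
from collections import Counter, defaultdict


def rank_vector(log_likelihoods):
    """Turn per-class HMM log-likelihoods into ranks (1 = most likely class)."""
    order = sorted(range(len(log_likelihoods)),
                   key=lambda i: log_likelihoods[i], reverse=True)
    ranks = [0] * len(log_likelihoods)
    for position, class_index in enumerate(order, start=1):
        ranks[class_index] = position
    return ranks


class RankNaiveBayes:
    """Categorical Naive Bayes over rank features (assumed variant)."""

    def __init__(self, n_ranks, alpha=1.0):
        self.n_ranks = n_ranks  # number of distinct rank values, e.g. 8 classes
        self.alpha = alpha      # Laplace smoothing constant

    def fit(self, X, y):
        self.class_counts = Counter(y)
        self.total = len(y)
        # counts[label][dimension][rank] = number of training clips
        self.counts = defaultdict(lambda: defaultdict(Counter))
        for features, label in zip(X, y):
            for dim, rank in enumerate(features):
                self.counts[label][dim][rank] += 1
        return self

    def predict(self, features):
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            score = math.log(n_label / self.total)  # log prior
            for dim, rank in enumerate(features):
                numerator = self.counts[label][dim][rank] + self.alpha
                denominator = n_label + self.alpha * self.n_ranks
                score += math.log(numerator / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

With eight emotion classes, `rank_vector` yields the eight-dimensional vector mentioned above for one feature type. How the rank vectors of several feature types (for example FC6, GEO, SHAPE, and HOG) are combined is not stated in the abstract; concatenating them before calling `fit` is one plausible option.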
Keywords
MEC 2017, emotion recognition, HOG, SHAPE, DCNN