SERC-GCN: Speech Emotion Recognition In Conversation Using Graph Convolutional Networks

Deeksha Chandola, Enas Altarawneh, Michael Jenkin, Manos Papagelis

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2024)

Abstract
Speech emotion recognition (SER) is the task of automatically recognizing emotions expressed in spoken language. Current approaches focus on analyzing isolated speech segments to identify a speaker’s emotional state. Meanwhile, recent text-based emotion recognition methods have effectively shifted towards emotion recognition in conversation (ERC) that considers conversational context. Motivated by this shift, here we propose SERC-GCN, a method for speech emotion recognition in conversation (SERC) that predicts a speaker’s emotional state by incorporating conversational context, speaker interactions, and temporal dependencies between utterances. SERC-GCN is a two-stage method. First, emotional features of utterance-level speech signals are extracted. Then, these features are used to form conversation graphs that are used to train a graph convolutional network to perform SERC. We empirically evaluate the effectiveness of SERC-GCN and show that it outperforms the current state-of-the-art methods on the IEMOCAP benchmark dataset.
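The second stage described above (utterance features, then a conversation graph, then a graph convolutional network) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the context window, the edge weights (1.0 for same-speaker pairs, 0.5 for cross-speaker pairs), the feature and layer sizes, and the function names `build_conversation_graph`, `normalize_adjacency`, and `gcn_layer` are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np


def build_conversation_graph(speakers, window=2):
    """Build an utterance-level adjacency matrix for one conversation.

    Each utterance is linked to utterances at most `window` turns away
    (temporal context). Edge weights are an assumption for illustration:
    1.0 for same-speaker pairs (including self-loops), 0.5 otherwise.
    """
    n = len(speakers)
    adj = np.zeros((n, n))
    for i in range(n):
        for j in range(max(0, i - window), min(n, i + window + 1)):
            adj[i, j] = 1.0 if speakers[i] == speakers[j] else 0.5
    return adj


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}, as in a standard GCN."""
    deg = adj.sum(axis=1)  # always > 0 because self-loops are present
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_hat, features, weights):
    """One graph convolution followed by ReLU: max(A_hat X W, 0)."""
    return np.maximum(a_hat @ features @ weights, 0.0)


# Toy conversation: five utterances from two alternating speakers.
rng = np.random.default_rng(0)
speakers = ["A", "B", "A", "B", "A"]
features = rng.normal(size=(5, 8))  # stand-in for stage-one emotional features

adj = build_conversation_graph(speakers, window=2)
a_hat = normalize_adjacency(adj)

hidden = gcn_layer(a_hat, features, rng.normal(size=(8, 4)))
logits = a_hat @ hidden @ rng.normal(size=(4, 4))  # 4 hypothetical emotion classes
predictions = logits.argmax(axis=1)  # one predicted class index per utterance
```

In this sketch each utterance's representation is a weighted average of its own features and those of nearby turns, which is one simple way to let conversational context and speaker identity influence the per-utterance prediction.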
Keywords
speech emotion recognition in conversation, human-computer interaction, graph convolutional network