Combining Relative and Absolute Learning Formulations to Predict Emotional Attributes From Speech.

2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2023

Abstract
Predicting absolute scores is the most common formulation in speech emotion recognition (SER) for emotional attributes (i.e., valence, arousal, and dominance). However, studies have shown that emotion has an ordinal nature, so it is more reliable to establish a preference between speech samples (e.g., one sample is more positive than the other). This paper pursues a novel direction that combines absolute and relative learning formulations for SER. The proposed multitask formulation simultaneously estimates the preference between speech samples and predicts their absolute scores, providing a flexible tool to analyze emotional content in speech. The two tasks complement each other, allowing the model to outperform SER systems that are trained exclusively to either predict absolute scores or estimate preferences. The multitask weights can be set according to the intended application, prioritizing one task while slightly compromising the performance of the other.
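To make the idea of a weighted multitask objective concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the abstract does not state which loss functions are used, so the mean squared error for the absolute task, the logistic pairwise loss for the preference task, and the single weight `alpha` are all assumptions made for illustration. All function names are hypothetical.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def absolute_loss(pred_a, pred_b, target_a, target_b):
    """Assumed regression loss: mean squared error over the two samples."""
    return 0.5 * ((pred_a - target_a) ** 2 + (pred_b - target_b) ** 2)


def preference_loss(pred_a, pred_b, target_a, target_b):
    """Assumed relative loss: logistic (binary cross-entropy) pairwise loss.

    The label is 1 when sample A is annotated as higher than sample B.
    Ties are treated as label 0 here; in practice tied pairs would
    probably be discarded before training.
    """
    label = 1.0 if target_a > target_b else 0.0
    margin = pred_a - pred_b
    # -label*log(sigmoid(m)) - (1-label)*log(1-sigmoid(m)), written stably.
    return label * softplus(-margin) + (1.0 - label) * softplus(margin)


def multitask_loss(pred_a, pred_b, target_a, target_b, alpha=0.5):
    """Weighted sum of the two task losses.

    alpha close to 1 prioritizes absolute-score prediction;
    alpha close to 0 prioritizes preference estimation.
    """
    abs_term = absolute_loss(pred_a, pred_b, target_a, target_b)
    rel_term = preference_loss(pred_a, pred_b, target_a, target_b)
    return alpha * abs_term + (1.0 - alpha) * rel_term
```

In this sketch, shifting `alpha` toward one end mirrors the trade-off described in the abstract: one task is prioritized at some cost to the other.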
Keywords
Speech emotion recognition, Multi-task learning, Preference learning