Real Time Distant Speech Emotion Recognition In Indoor Environments
PROCEEDINGS OF THE 14TH EAI INTERNATIONAL CONFERENCE ON MOBILE AND UBIQUITOUS SYSTEMS: COMPUTING, NETWORKING AND SERVICES (MOBIQUITOUS 2017)(2017)
摘要
We develop solutions to various challenges in different stages of the processing pipeline of a real time indoor distant speech emotion recognition system to reduce the discrepancy between training and test conditions for distant emotion recognition. We use a novel combination of distorted feature elimination, classifier optimization, several signal cleaning techniques and train classifiers with synthetic reverberation obtained from a room impulse response generator to improve performance in a variety of rooms with various source-to-microphone distances. Our comprehensive evaluation is based on a popular emotional corpus from the literature, two new customized datasets and a dataset made of YouTube videos. The two new datasets are the first ever distance aware emotional corpuses and we created them by 1) injecting room impulse responses collected in a variety of rooms with various source-to-microphone distances into a public emotional corpus; and by 2) re-recording the emotional corpus with microphones placed at different distances. The overall performance results show as much as 15.51% improvement in distant emotion detection over baselines, with a final emotion recognition accuracy ranging between 79.44%-95.89% for different rooms, acoustic configurations and source-to-microphone distances. We experimentally evaluate the CPU time of various system components and demonstrate the real time capability of our system.
更多查看译文
关键词
Emotion,speech,noise and reverberation
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络