I-vector Based Emotion Recognition in Assamese Speech

International Journal of Engineering and Future Technology (2016)

Abstract
Emotion is an integral part of speech and is strongly related to a speaker's characteristics, so it plays a vital role in any speech or speaker recognition system, and such a system is expected to deal with the emotional content of its samples. Assamese is a widely spoken language in the north-eastern part of India, known for its dialectal richness and ethnographic diversity. Here, we report the design of an emotion recognition system for Assamese speech that exploits the ability of the Recurrent Neural Network (RNN) to track temporal variations in a speech sample. An RNN is an Artificial Neural Network (ANN) with feed-forward and feedback sections, which enable it to capture time-dependent variations in speech samples. We use i-vector and Mel Frequency Cepstral Coefficient delta (MFCC-delta) features and report the comparative performance of RNN and Distributed Time Delay Neural Network (DTDNN) classifiers in recognizing various moods.
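To illustrate how the feed-forward and feedback sections of an RNN combine, here is a minimal sketch of one recurrent layer run over a sequence of feature frames. It assumes a simple Elman-style recurrence with a tanh activation; the abstract does not specify the network actually used, and the function name `rnn_forward` and the parameters `w_in`, `w_rec`, and `bias` are hypothetical names chosen for this example.

```python
import math


def rnn_forward(frames, w_in, w_rec, bias):
    """Run a simple Elman-style recurrent layer over a sequence of frames.

    frames: list of T feature vectors (each a list of D floats)
    w_in:   H x D feed-forward weights (hypothetical name)
    w_rec:  H x H feedback weights (hypothetical name)
    bias:   list of H biases
    Returns the list of T hidden-state vectors.
    """
    hidden = [0.0] * len(bias)  # initial state: no memory of earlier frames
    states = []
    for x in frames:
        new_hidden = []
        for j in range(len(bias)):
            # Feed-forward section: contribution of the current frame.
            feed_forward = sum(w * xi for w, xi in zip(w_in[j], x))
            # Feedback section: contribution of the previous hidden state.
            feedback = sum(w * hi for w, hi in zip(w_rec[j], hidden))
            new_hidden.append(math.tanh(feed_forward + feedback + bias[j]))
        hidden = new_hidden
        states.append(hidden)
    return states


# Two identical frames: without feedback the outputs match,
# with feedback the second output depends on the first.
frames = [[1.0], [1.0]]
no_feedback = rnn_forward(frames, [[0.5]], [[0.0]], [0.0])
with_feedback = rnn_forward(frames, [[0.5]], [[1.0]], [0.0])
```

With the feedback weights set to zero, the layer responds to each frame independently; with nonzero feedback weights, the response to a frame depends on the frames that preceded it, which is the property the abstract refers to as tracking temporal variations.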