Deep functional multiple index models with an application to SER
arXiv (2024)
Abstract
Speech Emotion Recognition (SER) plays a crucial role in advancing
human-computer interaction and speech processing capabilities. We introduce a
novel deep-learning architecture designed specifically for the functional data
model known as the multiple-index functional model. Our key innovation lies in
integrating adaptive basis layers and an automated data transformation search
within the deep learning framework. Simulations of this new model show good
performance. The architecture allows us to extract features tailored for chunk-level SER,
based on Mel Frequency Cepstral Coefficients (MFCCs). We demonstrate the
effectiveness of our approach on the benchmark IEMOCAP database, achieving good
performance compared to existing methods.
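
As a rough illustration of the model class named above, a functional multiple-index model is commonly written as Y = g(<X, beta_1>, ..., <X, beta_K>) + noise, where X is a curve, each beta_k is an index function, and <X, beta_k> is the integral of X(t) * beta_k(t). The sketch below computes such indices for one sampled curve. It is not the authors' architecture: the fixed Fourier basis stands in for the paper's adaptive basis layers, and the names `fourier_basis`, `index_function`, `multiple_index_predict`, and the toy link function are all hypothetical.

```python
import math

def trapezoid(values, grid):
    """Trapezoidal approximation of the integral of sampled values over grid."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    )

def fourier_basis(t, j):
    """Assumed fixed basis on [0, 1]: constant, then sine/cosine pairs."""
    if j == 0:
        return 1.0
    freq = (j + 1) // 2
    angle = 2.0 * math.pi * freq * t
    return math.sqrt(2.0) * (math.sin(angle) if j % 2 == 1 else math.cos(angle))

def index_function(coeffs, grid):
    """Evaluate beta_k(t) = sum_j coeffs[j] * basis_j(t) on the grid."""
    return [sum(c * fourier_basis(t, j) for j, c in enumerate(coeffs)) for t in grid]

def multiple_index_predict(x_values, grid, coeff_sets, link):
    """Return g(<X, beta_1>, ..., <X, beta_K>) and the K indices for one curve."""
    indices = []
    for coeffs in coeff_sets:
        beta = index_function(coeffs, grid)
        product = [x * b for x, b in zip(x_values, beta)]
        indices.append(trapezoid(product, grid))
    return link(indices), indices

# Toy usage: one curve sampled on 201 points, two index functions.
grid = [i / 200 for i in range(201)]
curve = [math.sqrt(2.0) * math.sin(2.0 * math.pi * t) for t in grid]
coeff_sets = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]  # hypothetical coefficients
toy_link = lambda u: math.tanh(u[0]) + u[1] ** 2  # hypothetical link g
prediction, indices = multiple_index_predict(curve, grid, coeff_sets, toy_link)
```

In a deep-learning version, the basis coefficients and the link function would be trainable, and each input curve would be something like an MFCC trajectory over a speech chunk; those choices are left open here.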