CyTex: Transforming speech to textured images for speech emotion recognition

Speech Communication (2022)

Abstract
Speech emotion recognition is an important aspect of emotional state recognition in human–machine interaction. Approaches using speech-to-image transforms have become popular in recent years because they can utilise deep neural network models that have proven to be successful in the image processing domain. In this paper, we propose a new speech-to-image transform, CyTex, that maps the raw speech signal directly to a textured image by using calculations based on the fundamental frequency of each speech frame. The textured RGB images resulting from the CyTex transform can then be classified using standard deep neural network models for the recognition of different classes of emotion. Using this approach, we can report an improvement of classification accuracies over the previous state-of-the-art results by 0.81% for the Emo-DB database, and also by 0.5% for the IEMOCAP database.
Key words
Speech emotion recognition, Speech to textured-image transform, Deep neural network, Human–machine interaction
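
The abstract describes the transform only at a high level: each speech frame is mapped to image rows using its fundamental frequency. The sketch below is one possible reading of that idea, not the paper's algorithm. It assumes an autocorrelation pitch estimate, cuts each frame into cycles one pitch period long, and resamples every cycle to a fixed row width so that successive cycles line up vertically and form a texture. The function names, the frame length, the row width, the 60 to 400 Hz search range, and the grayscale-copied-to-RGB output are all illustrative assumptions.

```python
import numpy as np


def estimate_period(frame, sr, fmin=60.0, fmax=400.0):
    """Estimate the pitch period in samples from the autocorrelation peak.

    Assumption: a plain autocorrelation search between fmax and fmin;
    the abstract does not say how the fundamental frequency is computed.
    """
    frame = frame - frame.mean()
    # Keep non-negative lags only.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo = int(sr / fmax)
    hi = min(int(sr / fmin), len(ac) - 1)
    return lo + int(np.argmax(ac[lo:hi + 1]))


def frame_to_rows(frame, period, width):
    """Cut a frame into period-long cycles and resample each to `width` samples."""
    n_cycles = len(frame) // period
    x_old = np.arange(period)
    x_new = np.linspace(0.0, period - 1, width)
    rows = [
        np.interp(x_new, x_old, frame[i * period:(i + 1) * period])
        for i in range(n_cycles)
    ]
    return np.array(rows).reshape(-1, width)


def speech_to_texture(signal, sr, frame_len=1024, width=64):
    """Stack pitch-aligned rows from every frame into one RGB image.

    Expects at least `frame_len` samples. The grayscale image is copied
    into three channels (an assumption; the abstract only states that the
    output is RGB).
    """
    blocks = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        period = estimate_period(frame, sr)
        blocks.append(frame_to_rows(frame, period, width))
    gray = np.vstack(blocks)
    span = gray.max() - gray.min()
    gray = (gray - gray.min()) / (span if span > 0 else 1.0)
    img = (255 * gray).astype(np.uint8)
    return np.stack([img, img, img], axis=-1)


if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 200.0 * t)  # synthetic stand-in for voiced speech
    image = speech_to_texture(tone, sr)
    print(image.shape, image.dtype)
```

An array of this shape can be resized and passed to a standard image classifier, which is the role the abstract assigns to the deep neural network models. Unvoiced or silent frames have no meaningful pitch period, and this sketch does not treat them specially.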