Evolving Fuzzy-Neural Method for Multimodal Speech Recognition.

EANN (2015)

Abstract
Improving automatic speech recognition systems is one of the hottest topics in speech-signal processing, especially when such systems must operate in noisy environments. This paper proposes a multimodal evolutionary neuro-fuzzy approach to developing an automatic speech-recognition system. The EFuNN paradigm was applied at the decision stage to make inferences about audiovisual information for speech-to-text conversion. Two independent feature extractors were developed, one for speech phonetics (speech listening) and the other for speech visemics (lip reading). The EFuNN network was trained to fuse the audio-based and video-based decisions. This soft-computing approach proved robust in harsh conditions and, at the same time, less complex than hard-computing pattern-matching methods. Preliminary experiments confirm the reliability of the proposed method for developing a robust automatic speech-recognition system.
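To illustrate the decision-fusion idea described in the abstract, the sketch below shows a simplified evolving rule-node layer that combines per-word scores from an audio recognizer and a video (lip-reading) recognizer. This is not the authors' EFuNN implementation; the class name, similarity measure, sensitivity threshold, and one-pass update rule are illustrative assumptions only.

```python
# Minimal sketch of decision-level audiovisual fusion with an evolving
# rule-node layer, loosely in the spirit of EFuNN. All names, thresholds,
# and update rules here are assumptions, not the paper's implementation.
import numpy as np


class EvolvingFusionLayer:
    """Each rule node pairs a fused decision vector (audio + video class
    scores) with a word label; new nodes are created on-line whenever no
    existing node is similar enough to the incoming example."""

    def __init__(self, sensitivity=0.9, learning_rate=0.1):
        self.sensitivity = sensitivity    # similarity needed to reuse a node
        self.learning_rate = learning_rate
        self.nodes = []                   # list of (prototype vector, label)

    @staticmethod
    def _similarity(a, b):
        # Fuzzy-style similarity: 1 minus mean absolute difference.
        return 1.0 - np.abs(a - b).mean()

    def train_one(self, audio_scores, video_scores, label):
        x = np.concatenate([audio_scores, video_scores])
        if self.nodes:
            sims = [self._similarity(x, p) for p, _ in self.nodes]
            best = int(np.argmax(sims))
            if sims[best] >= self.sensitivity and self.nodes[best][1] == label:
                # Pull the winning prototype toward the new example.
                proto, lab = self.nodes[best]
                self.nodes[best] = (proto + self.learning_rate * (x - proto), lab)
                return
        # Otherwise evolve: allocate a new rule node for this example.
        self.nodes.append((x.copy(), label))

    def predict(self, audio_scores, video_scores):
        x = np.concatenate([audio_scores, video_scores])
        sims = [self._similarity(x, p) for p, _ in self.nodes]
        return self.nodes[int(np.argmax(sims))][1]


# Usage with made-up per-word scores from the two independent recognizers.
fusion = EvolvingFusionLayer()
fusion.train_one(np.array([0.8, 0.1, 0.1]), np.array([0.7, 0.2, 0.1]), "yes")
fusion.train_one(np.array([0.1, 0.8, 0.1]), np.array([0.2, 0.6, 0.2]), "no")
print(fusion.predict(np.array([0.75, 0.15, 0.1]), np.array([0.6, 0.3, 0.1])))  # -> "yes"
```

Fusing at the decision level, as in this sketch, keeps the two modality-specific recognizers independent, so a degraded audio channel can be compensated by the visual channel without retraining the front-end feature extractors.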
Key words
Audiovisual Speech Recognition (AVSR), Evolutionary Fuzzy Neural Network (EFuNN), Speech-To-Text (STT), Decision fusion, Multimodal speech recognition