Cochleagram to Recognize Dysphonia: Auditory Perceptual Analysis for Health Informatics

IEEE Access (2024)

Abstract
Spectral images capture the dynamic characteristics of a voice signal in the time and frequency domains, yet extracting the predominant spectral features from voice samples remains challenging. This work generates cochleagram images to reveal the detailed spectral content of voice samples and recognize dysphonic voices. Both sustained vowel ('/a/') and sentence samples are considered, so that the phonation, respiration, and resonance of the vocal tone are all represented. Gender bias is eliminated by analyzing male and female voice samples separately, since the two groups have structurally different vocal tracts, pharynges, and oral cavities. Simulation results show that the cochleagram, coupled with a pre-trained convolutional neural network (CNN), achieves 95% accuracy in identifying dysphonic voices from sentence samples. The result is a robust, noninvasive, automated voice pathology detection system based on perceptual analysis of voice signals. The proposed system can objectively corroborate clinical findings and, alongside clinicians' subjective assessments, assist in monitoring the treatment progress of dysphonic voice.
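The paper does not publish an implementation, but the pipeline the abstract describes has two stages: converting each voice sample into a cochleagram via a gammatone filterbank, then classifying the resulting image with a pre-trained CNN. Below is a minimal sketch of the first stage in Python/NumPy, assuming 64 ERB-spaced gammatone channels, 25 ms analysis frames with a 10 ms hop, and log-compressed band energies; these parameters are illustrative choices, not values taken from the paper.

```python
import numpy as np
from scipy.signal import fftconvolve

def erb(fc):
    """Equivalent rectangular bandwidth (Glasberg & Moore, 1990) in Hz."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def erb_rate(f):
    """Map frequency in Hz to the ERB-rate scale."""
    return 21.4 * np.log10(4.37 * f / 1000.0 + 1.0)

def inv_erb_rate(e):
    """Inverse of erb_rate: ERB-rate value back to Hz."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def gammatone_ir(fc, fs, duration=0.05, order=4, b=1.019):
    """Impulse response of a 4th-order gammatone filter centered at fc."""
    t = np.arange(0.0, duration, 1.0 / fs)
    env = t ** (order - 1) * np.exp(-2.0 * np.pi * b * erb(fc) * t)
    ir = env * np.cos(2.0 * np.pi * fc * t)
    return ir / np.max(np.abs(ir))  # peak-normalize each channel

def cochleagram(x, fs, n_filters=64, fmin=50.0, fmax=8000.0,
                frame_len=0.025, hop=0.010):
    """Log band-energy image: n_filters rows by n_frames columns."""
    # Center frequencies equally spaced on the ERB-rate scale.
    cfs = inv_erb_rate(np.linspace(erb_rate(fmin), erb_rate(fmax), n_filters))
    frame = int(frame_len * fs)
    step = int(hop * fs)
    n_frames = 1 + (len(x) - frame) // step  # assumes len(x) >= frame
    C = np.zeros((n_filters, n_frames))
    for i, fc in enumerate(cfs):
        # Filter the signal through this cochlear channel.
        y = fftconvolve(x, gammatone_ir(fc, fs), mode="same")
        for j in range(n_frames):
            seg = y[j * step: j * step + frame]
            C[i, j] = np.sqrt(np.mean(seg ** 2))  # frame RMS energy
    return np.log(C + 1e-8)  # log-compress for image-like dynamic range

# Example on a synthetic tone standing in for a sustained '/a/' sample:
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 220.0 * t)
C = cochleagram(x, fs)  # shape: (64, n_frames)
```

For the second stage, a common transfer-learning recipe (again an assumption, not the authors' exact network) replaces the head of an ImageNet-pretrained ResNet-18 with a two-class layer and fine-tunes it on cochleagram images:

```python
import torch
import torch.nn as nn
from torchvision import models

# Hypothetical binary classifier: healthy (0) vs. dysphonic (1).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # replace the ImageNet head

# A cochleagram resized to 224x224 and repeated across 3 channels stands
# in for an RGB image; a random tensor marks its place here.
img = torch.rand(1, 3, 224, 224)
logits = model(img)  # fine-tune with cross-entropy on labeled samples
```

Resizing the (channels x frames) cochleagram to a fixed 224x224 three-channel input lets the ImageNet weights be reused unchanged, which is the usual motivation for treating spectral images as ordinary pictures.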
Keywords
Classifier, CNN, cochleagram, dysphonia, gammatone filters, voice pathology