Cochleagram to Recognize Dysphonia: Auditory Perceptual Analysis for Health Informatics

IEEE Access (2024)

Abstract
Spectral images capture the dynamic characteristics of a voice signal in the time and frequency domains, yet extracting the predominant spectral features from voice samples remains challenging. This work generates cochleagram images to reveal the detailed spectral content of voice samples and recognize dysphonic voices. Both sustained vowel ('/a/') and sentence samples are considered, so that the phonation, respiration, and resonance of the vocal tone are all represented. Gender bias is eliminated by analyzing male and female voice samples separately, since the two groups have structurally different vocal tracts, pharynges, and oral cavities. Simulation results show that the cochleagram, coupled with a pre-trained convolutional neural network (CNN), achieves 95% accuracy in identifying dysphonic voices from sentence samples. The result is a robust, noninvasive, automated voice pathology detection system based on perceptual analysis of voice signals. The proposed system can objectively corroborate clinical findings and, alongside clinicians' subjective assessments, assist in monitoring the treatment progress of dysphonic voice.
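The paper does not publish an implementation, but the pipeline the abstract describes has two stages: converting each voice sample into a cochleagram via a gammatone filterbank, then classifying the resulting image with a pre-trained CNN. Below is a minimal sketch of the first stage in Python/NumPy, assuming 64 ERB-spaced gammatone channels, 25 ms analysis frames with a 10 ms hop, and log-compressed band energies; these parameters are illustrative choices, not values taken from the paper.

```python
import numpy as np
from scipy.signal import fftconvolve

def erb(fc):
    """Equivalent rectangular bandwidth (Glasberg & Moore, 1990) in Hz."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def erb_rate(f):
    """Map frequency in Hz to the ERB-rate scale."""
    return 21.4 * np.log10(4.37 * f / 1000.0 + 1.0)

def inv_erb_rate(e):
    """Inverse of erb_rate: ERB-rate value back to Hz."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def gammatone_ir(fc, fs, duration=0.05, order=4, b=1.019):
    """Impulse response of a 4th-order gammatone filter centered at fc."""
    t = np.arange(0.0, duration, 1.0 / fs)
    env = t ** (order - 1) * np.exp(-2.0 * np.pi * b * erb(fc) * t)
    ir = env * np.cos(2.0 * np.pi * fc * t)
    return ir / np.max(np.abs(ir))  # peak-normalize each channel

def cochleagram(x, fs, n_filters=64, fmin=50.0, fmax=8000.0,
                frame_len=0.025, hop=0.010):
    """Log band-energy image: n_filters rows by n_frames columns."""
    # Center frequencies equally spaced on the ERB-rate scale.
    cfs = inv_erb_rate(np.linspace(erb_rate(fmin), erb_rate(fmax), n_filters))
    frame = int(frame_len * fs)
    step = int(hop * fs)
    n_frames = 1 + (len(x) - frame) // step  # assumes len(x) >= frame
    C = np.zeros((n_filters, n_frames))
    for i, fc in enumerate(cfs):
        # Filter the signal through this cochlear channel.
        y = fftconvolve(x, gammatone_ir(fc, fs), mode="same")
        for j in range(n_frames):
            seg = y[j * step: j * step + frame]
            C[i, j] = np.sqrt(np.mean(seg ** 2))  # frame RMS energy
    return np.log(C + 1e-8)  # log-compress for image-like dynamic range

# Example on a synthetic tone standing in for a sustained '/a/' sample:
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 220.0 * t)
C = cochleagram(x, fs)  # shape: (64, n_frames)
```

For the second stage, a common transfer-learning recipe (again an assumption, not the authors' exact network) replaces the head of an ImageNet-pretrained ResNet-18 with a two-class layer and fine-tunes it on cochleagram images:

```python
import torch
import torch.nn as nn
from torchvision import models

# Hypothetical binary classifier: healthy (0) vs. dysphonic (1).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # replace the ImageNet head

# A cochleagram resized to 224x224 and repeated across 3 channels stands
# in for an RGB image; a random tensor marks its place here.
img = torch.rand(1, 3, 224, 224)
logits = model(img)  # fine-tune with cross-entropy on labeled samples
```

Resizing the (channels x frames) cochleagram to a fixed 224x224 three-channel input lets the ImageNet weights be reused unchanged, which is the usual motivation for treating spectral images as ordinary pictures.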
Keywords
Classifier, CNN, cochleagram, dysphonia, gammatone filters, voice pathology