
An Audio-Visual Speech Enhancement System Based on 3D Image Features: An Application in Hearing Aids

Yu-Ching Chung, Ji-Yan Han, Bo-Sin Wang, Wei-Zhong Zheng, Kung-Yao Shen, Ying-Hui Lai

2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC(2023)

Abstract
Previous research has shown that auditory and visual inputs are not asynchronous in the human brain and that visual cues can enhance attention during hearing. This study therefore proposes an audio-visual speech enhancement (SE) system with 3D image features (AV-3D-SE) that imitates the human auditory process to improve listening quality. Specifically, AV-3D-SE uses the FlowNet3D model to predict temporal facial motion from the recorded 3D images, which is combined with features for SE applications. The evaluation results showed that, at a 3 dB signal-to-noise ratio, the average perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) scores increased to 3.229 and 0.914, respectively. The average hearing aid speech quality index (HASQI) of AV-3D-SE was significantly higher than that of the baseline SE systems (audio-only and audio-visual-2D) across seven typical types of hearing loss, with a high hearing aid speech perception index (HASPI). In conclusion, the proposed AV-3D-SE improves the effectiveness of the SE system and can increase the listening satisfaction of hearing aid users.
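The abstract reports results at a 3 dB signal-to-noise ratio but does not say how the noisy mixtures were built. As a rough illustration only, the sketch below shows one common way to mix a clean signal with noise at a target SNR by scaling the noise power. The function names `mean_power` and `mix_at_snr`, the sine-wave stand-in for speech, and the Gaussian noise are all assumptions made for this example, not the authors' pipeline.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(v * v for v in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`.

    Hypothetical helper for illustration. Returns the noisy mixture and
    the scaled noise that was added to the speech.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]
    # SNR_dB = 10 * log10(P_speech / P_noise)  =>  P_noise = P_speech / 10^(SNR_dB / 10)
    target_noise_power = mean_power(speech) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / mean_power(noise))
    scaled_noise = [scale * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled_noise)]
    return noisy, scaled_noise


# Stand-in signals (assumptions): a 220 Hz tone as "speech", Gaussian noise.
sample_rate = 16000
rng = random.Random(0)
speech = [math.sin(2.0 * math.pi * 220.0 * i / sample_rate) for i in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

noisy, scaled_noise = mix_at_snr(speech, noise, snr_db=3.0)

# Check the SNR actually achieved by the mixture's components.
achieved_snr_db = 10.0 * math.log10(mean_power(speech) / mean_power(scaled_noise))
print(f"achieved SNR: {achieved_snr_db:.3f} dB")
```

Scaling the noise rather than the speech keeps the clean reference unchanged, which is convenient when the enhanced output is later compared against that reference with intrusive metrics such as PESQ or STOI.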
Keywords
deep learning, point cloud, scene flow, hearing aid, speech enhancement