Auditory Cortex-Inspired Spectral Attention Modulation for Binaural Sound Localization in HRTF Mismatch

Waradon Phokhinanan, Nicolas Obin, Sylvain Argentieri

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2024)

Abstract
In applications such as noise cancellation and virtual reality, precise sound source localization is crucial. Existing data-driven binaural systems perform well in adverse conditions such as noise and reverberation, but they are limited in real-time operation and degrade when the head-related transfer function (HRTF) at test time does not match the one seen in training. Our work introduces a compact Vision Transformer tailored to address these issues, with a primary focus on horizontal speech localization. Inspired by the auditory cortex, our model incorporates spectral attention mechanisms using encoded speech representations. This architecture enhances generalization on the azimuth plane under mismatched HRTFs. Our empirical results show a marked improvement over conventional DNN, CNN-based, and Transformer-based models in both noisy and noise-free environments. Notably, the proposed model maintains high accuracy in localizing adjacent azimuths, which makes it well suited to real-world applications.
Keywords
sound source localization, binaural audition, HRTF mismatch, attention modulation