Time Delay Estimation for Speaker Localization Using CNN-Based Parametrized GCC-PHAT Features.

Interspeech (2021)

Abstract
We propose a time delay estimation (TDE) method for speaker localization based on parametrized generalized cross-correlation phase transform (PGCC-PHAT) functions and convolutional neural networks (CNNs). The PGCC-PHAT is used to build a feature matrix that captures TDE information from two microphone signals at different normalization levels of the cross-correlation function. The feature matrix is processed by a CNN composed of several convolutional and fully connected layers, with a regression output that directly estimates the time difference of arrival (TDOA). Simulations in noisy and reverberant conditions show that the proposed method improves TDOA estimation performance compared with GCC-PHAT.
Keywords
time delay estimation, parametrized generalized cross-correlation, convolutional neural network, speaker localization, microphone pair, time difference of arrival
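
The feature matrix described in the abstract can be illustrated with a short NumPy sketch. It assumes the commonly used parametrized form of GCC-PHAT, in which the cross-spectrum is divided by its magnitude raised to an exponent beta between 0 (plain cross-correlation) and 1 (full PHAT whitening). The abstract does not give this formula, so the function name `pgcc_phat_matrix`, the beta values, and the `max_lag` window are all illustrative assumptions, and the CNN regression stage is not reproduced here. A per-row peak pick stands in for the network only to show that each row carries delay information.

```python
import numpy as np


def pgcc_phat_matrix(x1, x2, betas, max_lag):
    """Stack cross-correlations of x1 and x2, one row per normalization exponent.

    Hypothetical helper: assumes G_beta(f) = X1(f) conj(X2(f)) / |X1(f) conj(X2(f))|**beta.
    Columns correspond to lags -max_lag .. +max_lag (in samples).
    """
    n = len(x1) + len(x2)              # zero-pad so circular correlation acts as linear
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X1 * np.conj(X2)           # cross-power spectrum
    mag = np.abs(cross) + 1e-12        # small constant avoids division by zero

    rows = []
    for beta in betas:
        cc = np.fft.irfft(cross / mag**beta, n=n)
        # Negative lags sit at the end of the inverse FFT output; move them to the front.
        rows.append(np.concatenate((cc[-max_lag:], cc[:max_lag + 1])))
    return np.array(rows)


# Synthetic check: x1 is a copy of x2 delayed by 5 samples (noise-free, no reverberation).
rng = np.random.default_rng(0)
source = rng.standard_normal(1024)
true_delay = 5
x2 = source
x1 = np.concatenate((np.zeros(true_delay), source[:-true_delay]))

max_lag = 20
betas = (0.0, 0.5, 1.0)                # illustrative normalization levels
features = pgcc_phat_matrix(x1, x2, betas, max_lag)

# Baseline peak picking per row; the paper feeds the matrix to a CNN instead.
tdoa_per_row = [int(np.argmax(row)) - max_lag for row in features]
print(features.shape, tdoa_per_row)
```

With this sign convention a positive lag means the first signal lags the second. In noisy or reverberant conditions the rows would generally peak at different lags, which is the situation the learned regressor is meant to handle.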