Multi-Channel Far-Field Speaker Verification with Large-Scale Ad-hoc Microphone Arrays

Semantic Scholar (2021)

Abstract
Speaker verification based on ad-hoc microphone arrays has the potential to reduce the error significantly in adverse acoustic environments. However, existing approaches extract utterance-level speaker embeddings from each channel of an ad-hoc microphone array, which does not fully exploit the spatial-temporal information across the devices. In this paper, we propose to aggregate the multichannel signals of the ad-hoc microphone array at the frame level by exploring the cross-channel information deeply with two attention mechanisms. The first is a self-attention method. It consists of a cross-frame self-attention layer followed by a cross-channel self-attention layer, both working at the frame level. The second learns the cross-frame and cross-channel information via two graph attention layers. Experimental results demonstrate that the proposed methods reach state-of-the-art performance. Moreover, the graph-attention method is better than the self-attention method in most cases.
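The frame-level aggregation described for the self-attention method can be pictured with a small NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the input layout (channels, frames, feature dimension), the single-head scaled dot-product attention, the mean pooling over channels and frames, and the names `self_attention` and `aggregate` are all hypothetical choices made here, and the projection matrices are random stand-ins for learned weights.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention.

    x has shape (..., N, D); attention runs over the N axis,
    independently for every leading index.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def aggregate(x, frame_params, channel_params):
    """Hypothetical frame-level aggregation of a (C, T, D) input.

    C = channels (devices), T = frames, D = feature dimension.
    """
    # Cross-frame attention: each channel attends over its own T frames.
    h = self_attention(x, *frame_params)        # (C, T, D)
    # Cross-channel attention: each frame attends over the C channels.
    h = np.swapaxes(h, 0, 1)                    # (T, C, D)
    h = self_attention(h, *channel_params)      # (T, C, D)
    # Assumed pooling: average channels, then frames, to get one vector.
    frame_level = h.mean(axis=1)                # (T, D)
    return frame_level.mean(axis=0)             # (D,)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    C, T, D = 4, 6, 8
    x = rng.standard_normal((C, T, D))
    frame_params = [rng.standard_normal((D, D)) * 0.1 for _ in range(3)]
    channel_params = [rng.standard_normal((D, D)) * 0.1 for _ in range(3)]
    print(aggregate(x, frame_params, channel_params).shape)
```

In this sketch the cross-channel layer uses no positional encoding and the channel pooling is a mean, so the output does not depend on the order of the channels. That property is convenient when the number and placement of devices are not fixed, although the abstract does not state how the proposed models handle channel ordering.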