Chrome Extension
WeChat Mini Program
Use on ChatGLM

Low-complexity Multi-Channel Speaker Extraction with Pure Speech Cues

2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC(2023)

Cited 0|Views4
No score
Abstract
Most multi-channel speaker extraction schemes use the target speaker's location information as a reference, which must be known in advance or derived from visual cues. In addition, memory and computation costs are enormous when the model deals with the fusion input. In this paper, we propose Speaker-extraction-and-filter Network (SeafNet), which is a low-complexity multi-channel speaker extraction network with only speech cues. Specifically, the SeafNet separates the mixture by utilizing the correlation between an estimation of target speaker on reference channel and the mixed input on rest channels. Experimental results show that compared with the baseline, the SeafNet model achieves 6.4% relative SISNRi improvement on the fixed geometry array and 8.9% average relative SISNRi improvement on the ad-hoc array. Meanwhile, the SeafNet achieves 60% relative reduction in the number of parameters and 42% relative reduction in the computational cost.
More
Translated text
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined