So-DAS: A Two-Step Soft-Direction-Aware Speech Separation Framework

IEEE Signal Processing Letters (2023)

Abstract
Most existing direction-aware speech separation systems suffer performance degradation when the angle difference between speakers is small, owing to low spatial discrimination. To address this issue, we propose a two-step soft-direction-aware speech separation (So-DAS) framework, which consists of a direction-of-arrival (DOA) estimation module and a speech separation module. First, the two modules are optimized individually, and directional features (DFs) derived from ground-truth DOAs are used as spatial information to facilitate the separation module. Next, the two modules are cascaded and optimized with only the separation loss, and the DFs are generated from the estimator outputs. In this way, the consistency between the two modules is strengthened, so the network itself can exploit spatial cues that are more beneficial to the separation task. Experimental results show that, compared to the baselines, the DFs extracted by the proposed method provide a clearer advantage, especially when the angle difference between speakers is small. In addition, our approach yields a state-of-the-art word error rate of 3.4% on the real-recorded utterance-wise LibriCSS dataset.
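The abstract describes a two-step training protocol: first train the DOA estimator and the separator separately (with DFs computed from ground-truth DOAs), then cascade the two modules and fine-tune with the separation loss only, so the DFs come from the estimator's own outputs. The following is a minimal PyTorch-style sketch of that scheme, not the authors' code: DOAEstimator, Separator, directional_feature, and the toy losses and feature dimensions are all hypothetical stand-ins for the paper's actual modules.

```python
import torch
import torch.nn as nn

N_SPK, N_FEATS, DF_DIM = 2, 257, 4  # speakers, spectral bins, (cos, sin) per speaker; assumed sizes

class DOAEstimator(nn.Module):
    """Stand-in for the DOA estimation module: mixture features -> per-speaker DOAs."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(N_FEATS, 128), nn.ReLU(), nn.Linear(128, N_SPK))
    def forward(self, x):                        # x: (B, T, N_FEATS)
        return self.net(x).mean(dim=1)           # (B, N_SPK) estimated DOAs in radians

class Separator(nn.Module):
    """Stand-in for the separation module: spectral features + DFs -> per-speaker outputs."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(N_FEATS + DF_DIM, 256), nn.ReLU(),
                                 nn.Linear(256, N_SPK * N_FEATS))
    def forward(self, x, df):                    # df: (B, DF_DIM)
        df = df.unsqueeze(1).expand(-1, x.size(1), -1)        # broadcast DF over time
        out = self.net(torch.cat([x, df], dim=-1))
        return out.view(x.size(0), x.size(1), N_SPK, N_FEATS)

def directional_feature(doa):
    """Toy DF: encode each DOA as (cos, sin); the paper derives richer spatial features."""
    return torch.cat([torch.cos(doa), torch.sin(doa)], dim=-1)  # (B, DF_DIM)

doa_net, sep_net = DOAEstimator(), Separator()
doa_loss, sep_loss = nn.MSELoss(), nn.MSELoss()  # placeholder losses

def step1(mix, gt_doa, gt_sources, opt_doa, opt_sep):
    """Step 1: optimize the modules individually; DFs come from ground-truth DOAs."""
    opt_doa.zero_grad()
    doa_loss(doa_net(mix), gt_doa).backward()
    opt_doa.step()
    opt_sep.zero_grad()
    sep_loss(sep_net(mix, directional_feature(gt_doa)), gt_sources).backward()
    opt_sep.step()

def step2(mix, gt_sources, opt_all):
    """Step 2: cascade estimator -> DF -> separator; optimize with separation loss only."""
    opt_all.zero_grad()
    est = sep_net(mix, directional_feature(doa_net(mix)))
    sep_loss(est, gt_sources).backward()
    opt_all.step()
```

In step 2 the gradient of the separation loss flows through the DF computation into the DOA estimator, which is how the cascade can learn spatial cues that better serve separation rather than DOA accuracy alone.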
Keywords
Multi-channel speech separation, directional feature, deep learning