Collaborative Speech Dereverberation: Regularized Tensor Factorization for Crowdsourced Multi-Channel Recordings

European Signal Processing Conference (2018)

Abstract
We propose a regularized nonnegative tensor factorization (NTF) model for multi-channel speech dereverberation that incorporates prior knowledge about clean speech. The approach models the problem as recovering a signal convolved with different room impulse responses, allowing the dereverberation problem to benefit from microphone arrays. The factorization learns both individual reverberation filters and channel-specific delays, which makes it possible to employ an ad-hoc microphone array with heterogeneous sensors (such as multi-channel recordings by a crowd) even if they are not synchronized. We integrate two prior-knowledge regularization schemes to increase the stability of dereverberation performance. First, a Nonnegative Matrix Factorization (NMF) inner routine is introduced to inform the original NTF problem of the pre-trained clean speech basis vectors, so that the optimization process can focus on estimating their activations rather than the whole clean speech spectra. Second, the NMF activation matrix is further regularized to take on characteristics of dry signals using sparsity and smoothness constraints. Empirical dereverberation results on different simulated reverberation setups show that the prior-knowledge regularization schemes improve both recovered sound quality and speech intelligibility compared to a baseline NTF approach.
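The forward model the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it assumes magnitude spectrograms, a per-frequency convolution along time frames, a channel delay expressed as leading zero taps in that channel's filter, and simple L1 and squared-difference penalties for the sparsity and smoothness constraints. The names `reverberate` and `activation_penalty`, the penalty weights, and all array sizes are hypothetical.

```python
import numpy as np

def reverberate(S, H):
    """Convolve each frequency row of S (F x T) along time with H (F x P)."""
    F, T = S.shape
    P = H.shape[1]
    X = np.zeros((F, T))
    for p in range(min(P, T)):
        # Tap p contributes the source spectrogram delayed by p frames.
        X[:, p:] += H[:, [p]] * S[:, :T - p]
    return X

def activation_penalty(A, sparsity=0.1, smoothness=0.1):
    """Assumed regularizer: L1 sparsity plus squared frame-to-frame change."""
    l1 = np.abs(A).sum()
    diff = np.sum((A[:, 1:] - A[:, :-1]) ** 2)
    return sparsity * l1 + smoothness * diff

rng = np.random.default_rng(0)
F, K, T, P, C = 6, 3, 20, 4, 2   # bins, bases, frames, filter taps, channels
W = rng.random((F, K))           # stand-in for pre-trained clean-speech bases
A = rng.random((K, T))           # activations, the quantity to be estimated
S = W @ A                        # clean magnitude spectrogram (NMF model)
H = rng.random((C, F, P))        # one nonnegative filter per channel
H[1, :, :2] = 0.0                # channel 1 lags by two frames (unsynchronized)

# Stack the per-channel reverberant spectrograms into a C x F x T tensor.
X = np.stack([reverberate(S, H[c]) for c in range(C)])
print(X.shape, activation_penalty(A))
```

Under these assumptions, a factorization would fit `H` and `A` to an observed tensor `X` while holding `W` fixed, which is the sense in which the optimization estimates activations rather than whole clean speech spectra.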
Keywords
multi-channel dereverberation, nonnegative matrix factorization, nonnegative tensor factorization, collaborative audio enhancement, speech enhancement