
Cross-Talk Reduction

IJCAI 2024 (2024)

Abstract
While far-field multi-talker mixtures are recorded, each speaker can wear a close-talk microphone so that close-talk mixtures are recorded at the same time. Although each close-talk mixture has a high signal-to-noise ratio (SNR) for its wearer's speech, it has a very limited range of applications, as it also contains significant cross-talk speech from other speakers and is not clean enough. In this context, we propose a novel task named cross-talk reduction (CTR), which aims at reducing cross-talk speech, and a novel solution named CTRnet, which is based on unsupervised or weakly-supervised neural speech separation. In unsupervised CTRnet, close-talk and far-field mixtures are stacked as input for a DNN to estimate the close-talk speech of each speaker. The DNN is trained in an unsupervised, discriminative way such that its estimate for each speaker can be linearly filtered to cancel out that speaker's cross-talk speech captured at other microphones. In weakly-supervised CTRnet, we assume the availability of each speaker's activity timestamps during training and leverage them to improve the training of unsupervised CTRnet. Evaluation results on a simulated two-speaker CTR task and on a real-recorded conversational speech separation and recognition task show the effectiveness and potential of CTRnet.
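The abstract's unsupervised criterion rests on one idea: a source estimate is good if some linear (multi-tap) filter applied to it can cancel that speaker's contribution at another microphone, leaving a small residual. A minimal NumPy sketch of that cancellation residual is below; the least-squares filter fitting, the function name, and the tap count are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def cancellation_residual(est, mic, taps=20):
    """Fit a `taps`-tap FIR filter mapping `est` to `mic` by least
    squares, and return the mean-squared residual that remains.

    If `mic` largely consists of a linearly filtered copy of `est`
    (e.g. that speaker's cross-talk at another microphone), the
    residual is small; if the two signals are unrelated, it stays
    close to the power of `mic`.
    """
    T = len(mic)
    # Columns are delayed copies of the estimate (delays 0..taps-1).
    X = np.stack([np.pad(est, (d, 0))[:T] for d in range(taps)], axis=1)
    # Least-squares filter that best reproduces the microphone signal.
    h, *_ = np.linalg.lstsq(X, mic, rcond=None)
    resid = mic - X @ h
    return float(np.mean(resid ** 2))
```

In the paper's setting this residual would serve as (part of) a training loss driving the DNN estimates, rather than a standalone evaluation, and the filtering is done per speaker and per microphone.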
Key words
Machine Learning -> ML: Unsupervised learning, Machine Learning -> ML: Weakly supervised learning, Machine Learning -> ML: Applications, Natural Language Processing -> NLP: Speech