Improving Music Source Separation Based On Deep Neural Networks Through Data Augmentation And Network Blending

2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) (2017)

Cited 249 | Views 124
Abstract
This paper deals with the separation of music into individual instrument tracks, which is known to be a challenging problem. We describe two different deep neural network architectures for this task, a feed-forward and a recurrent one, and show that each of them yields state-of-the-art results on the SiSEC DSD100 dataset. For the recurrent network, we use data augmentation during training and show that even simple separation networks are prone to overfitting if no data augmentation is used. Furthermore, we propose a blending of both neural network systems in which we linearly combine their raw outputs and then perform a multi-channel Wiener filter post-processing. This blending scheme yields the best results that have been reported to date on the SiSEC DSD100 dataset.
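The blending step described above can be sketched in a few lines: the raw magnitude estimates of the two networks are linearly combined, and the blended estimates then drive a Wiener-style soft mask applied to the mixture spectrogram. The sketch below is a simplified single-channel version (the paper uses a multi-channel Wiener filter); the blending weight `lam` and the function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def blend_estimates(mag_ffn, mag_lstm, lam=0.5):
    """Linearly combine the raw magnitude estimates of the two networks.

    lam is a hypothetical blending weight; in practice it would be tuned
    on validation data. mag_ffn and mag_lstm have identical shapes.
    """
    return lam * mag_ffn + (1.0 - lam) * mag_lstm

def wiener_masks(source_mags, eps=1e-10):
    """Wiener-style soft masks from per-source magnitude estimates.

    source_mags: array of shape (n_sources, freq, time). Each mask is the
    source's power divided by the total power, so the masks sum to 1
    across sources at every time-frequency bin.
    """
    power = source_mags ** 2
    return power / (power.sum(axis=0, keepdims=True) + eps)

def separate(mixture_stft, source_mags):
    """Apply the soft masks to the complex mixture STFT.

    Returns one masked STFT per source, shape (n_sources, freq, time).
    """
    return wiener_masks(source_mags) * mixture_stft
```

In this simplified form, the masks redistribute the mixture energy among the sources in proportion to the blended power estimates; the true multi-channel filter additionally exploits spatial covariance across channels.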
Keywords
Music source separation (MSS), Deep neural network (DNN), Long short-term memory (LSTM), Blending