StyleWaveGAN: Style-based synthesis of drum sounds using generative adversarial networks for higher audio quality

2022 30th European Signal Processing Conference (EUSIPCO), 2022

Abstract
In this paper we introduce StyleWaveGAN, a style-based drum sound generator that is a variation of StyleGAN, a state-of-the-art image generator. By conditioning StyleWaveGAN on the type of drum, we are able to synthesize waveforms of up to 1.5 s directly in CD quality, faster than real time on a GPU, while retaining some control over the generation. We also introduce an alternative to the progressive growing of GANs and experiment with the effect of dataset balancing on generative tasks. The experiments are carried out on an augmented subset of a publicly available dataset composed of different drums and cymbals. We evaluate against two recent drum generators, WaveGAN and NeuroDrum, and demonstrate significantly improved generation quality under two quality measures: the Fréchet Audio Distance and a perceptual test.
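The Fréchet Audio Distance named above compares the statistics of embeddings from reference and generated audio. As a rough illustration only, the sketch below computes the Fréchet distance between two Gaussians fitted to embedding sets. It assumes the embeddings have already been extracted by some audio model; the function name `frechet_distance` and the random stand-in arrays are hypothetical and are not taken from the paper.

```python
import numpy as np


def frechet_distance(emb_a, emb_b):
    """Frechet distance between Gaussians fitted to two embedding sets.

    emb_a, emb_b: arrays of shape (num_clips, embedding_dim).
    """
    mu_a, mu_b = emb_a.mean(axis=0), emb_b.mean(axis=0)
    cov_a = np.cov(emb_a, rowvar=False)
    cov_b = np.cov(emb_b, rowvar=False)

    # tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite inputs. Clipping guards against round-off.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Stand-in data: random vectors in place of real audio embeddings.
rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 4))
generated = rng.normal(loc=0.5, size=(200, 4))
print(frechet_distance(reference, generated))
```

A lower value means the two embedding distributions are closer. Two identical sets give a distance of approximately zero.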
Keywords
Percussive Sound Synthesis, Generative Models, Creative Interfaces