Hellinger Distance Based Conditional Variational Auto-Encoder And Its Application In Raw Audio Generation

2018 IEEE 18th International Conference on Communication Technology (ICCT), 2018

Abstract
Audio generation plays an important role in human-computer interaction applications. However, machine-generated audio is still far from natural sound, especially in expressiveness and complexity. The conditional variational auto-encoder (cVAE) has achieved excellent results in data generation, but the original cVAE cannot avoid the defects caused by the KL divergence, which it uses to measure the distance between stochastic distributions. This paper introduces the Hellinger distance into the cVAE model. First, experiments show that using the Hellinger distance effectively mitigates the weaknesses of the KL divergence. Second, the relationship between the latent-space parameters and the quality of the generated music is analyzed experimentally, and the best generative parameter is found to be the distribution centroid. Finally, the generated music is evaluated subjectively, and the results show that it is significantly better than the output of the original model.
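The abstract does not give the exact loss used in the paper, so the following is only a minimal sketch of the general idea, assuming a diagonal-Gaussian posterior compared against a standard-normal prior. The function names are hypothetical, and the squared Hellinger distance uses the normalized convention H^2 = 1 - BC, where BC is the Bhattacharyya coefficient. It contrasts the standard closed-form KL term with the closed-form Hellinger term for the same pair of distributions.

```python
import math


def kl_to_standard_normal(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over dimensions.

    Hypothetical helper for illustration; unbounded above.
    """
    return sum(
        0.5 * (s * s + m * m - 1.0 - math.log(s * s))
        for m, s in zip(mu, sigma)
    )


def hellinger_sq_to_standard_normal(mu, sigma):
    """Squared Hellinger distance between N(mu, diag(sigma^2)) and N(0, I).

    Hypothetical helper for illustration; always lies in [0, 1].
    """
    # The Bhattacharyya coefficient factorizes over independent dimensions.
    bc = 1.0
    for m, s in zip(mu, sigma):
        var_sum = s * s + 1.0  # sigma^2 + prior variance (assumed to be 1)
        bc *= math.sqrt(2.0 * s / var_sum) * math.exp(-(m * m) / (4.0 * var_sum))
    return 1.0 - bc


if __name__ == "__main__":
    # A posterior far from the prior: KL keeps growing, Hellinger saturates near 1.
    print(kl_to_standard_normal([10.0], [1.0]))
    print(hellinger_sq_to_standard_normal([10.0], [1.0]))
```

One property this makes visible is that the Hellinger term is symmetric and bounded, whereas the KL term is asymmetric and grows without limit as the posterior moves away from the prior. Whether this is the specific weakness the paper targets is not stated in the abstract.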
Keywords
cVAE, KL divergence, Hellinger distance