Multi-view MidiVAE: Fusing Track- and Bar-view Representations for Long Multi-track Symbolic Music Generation
CoRR (2024)
Abstract
Variational Autoencoders (VAEs) constitute a crucial component of neural
symbolic music generation, and some VAE-based works have yielded outstanding
results and attracted considerable attention. Nevertheless, previous VAEs still
suffer from overly long feature sequences and generate results that lack
contextual coherence, so the challenge of modeling long multi-track symbolic
music remains unaddressed. To this end, we propose Multi-view MidiVAE, one of
the first VAE methods to effectively model and generate long multi-track
symbolic music. Multi-view MidiVAE uses the two-dimensional (2-D)
representation OctupleMIDI to capture relationships among notes while
reducing the feature sequence length. Moreover, we capture instrumental
characteristics and harmony, as well as global and local information about the
musical composition, by employing a hybrid variational encoding-decoding
strategy that integrates both Track- and Bar-view MidiVAE features. Objective
and subjective experimental results on the CocoChorales dataset demonstrate
that, compared to the baseline, Multi-view MidiVAE exhibits significant
improvements in modeling long multi-track symbolic music.
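
The claim that an OctupleMIDI-style 2-D representation shortens the feature sequence can be illustrated with a minimal sketch. It assumes the eight-field layout commonly attributed to OctupleMIDI (time signature, tempo, bar, position, instrument, pitch, duration, velocity); the `Note` type, the helper names, and the flat event vocabulary used for comparison are hypothetical and are not taken from the paper or its baseline.

```python
from typing import List, NamedTuple, Tuple


class Note(NamedTuple):
    bar: int         # bar index
    position: int    # onset position within the bar, in grid steps
    instrument: int  # track / program id
    pitch: int       # MIDI pitch number
    duration: int    # length in grid steps
    velocity: int    # MIDI velocity


def sort_key(n: Note) -> Tuple[int, int, int, int]:
    """Order notes by time first, then by track and pitch."""
    return (n.bar, n.position, n.instrument, n.pitch)


def to_octuple_rows(notes: List[Note], time_sig: int = 0, tempo: int = 0) -> List[Tuple[int, ...]]:
    """2-D view: one row of eight fields per note (assumed field order)."""
    return [
        (time_sig, tempo, n.bar, n.position, n.instrument, n.pitch, n.duration, n.velocity)
        for n in sorted(notes, key=sort_key)
    ]


def to_flat_events(notes: List[Note]) -> List[str]:
    """1-D view (hypothetical vocabulary): one token per attribute.

    Bar and position tokens are emitted only when they change.
    """
    events: List[str] = []
    last_bar = last_pos = None
    for n in sorted(notes, key=sort_key):
        if n.bar != last_bar:
            events.append(f"Bar_{n.bar}")
            last_bar, last_pos = n.bar, None
        if n.position != last_pos:
            events.append(f"Pos_{n.position}")
            last_pos = n.position
        events += [
            f"Inst_{n.instrument}",
            f"Pitch_{n.pitch}",
            f"Dur_{n.duration}",
            f"Vel_{n.velocity}",
        ]
    return events


# A four-voice chord on the downbeat of bar 0, one note per track.
chord = [Note(0, 0, i, p, 4, 80) for i, p in enumerate([72, 67, 64, 48])]
rows = to_octuple_rows(chord)
flat = to_flat_events(chord)
print(len(rows), len(flat))  # prints "4 18"
```

In this toy case the model sees 4 sequence steps instead of 18, because each note occupies a single step whose eight fields are embedded together. The gap grows with the number of notes, which is why such a representation is attractive for long multi-track pieces.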
Keywords
symbolic music generation, long multi-track, Multi-view MidiVAE