A Bag of Systems Representation for Music Auto-Tagging

IEEE Transactions on Audio, Speech, and Language Processing (2013)

Abstract
We present a content-based automatic tagging system for music that relies on a high-level, concise “Bag of Systems” (BoS) representation of the characteristics of a musical piece. The BoS representation leverages a rich dictionary of musical codewords, where each codeword is a generative model that captures timbral and temporal characteristics of music. Songs are represented as a BoS histogram over codewords, which allows for the use of traditional algorithms for text document retrieval to perform auto-tagging. Compared to estimating a single generative model to directly capture the musical characteristics of songs associated with a tag, the BoS approach offers the flexibility to combine different generative models at various time resolutions through the selection of the BoS codewords. Additionally, decoupling the modeling of audio characteristics from the modeling of tag-specific patterns makes BoS a more robust and rich representation of music. Experiments show that this leads to superior auto-tagging performance.
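As a rough illustration of the histogram step described in the abstract, the sketch below builds a BoS-style histogram for one song. It is not the authors' implementation: the codewords here are simple diagonal Gaussians standing in for the richer generative models the paper uses, and the names `gaussian_loglik`, `bos_histogram`, `codewords`, and `frames` are hypothetical. Each feature frame is assigned to the codeword under which it is most likely, and the normalized counts form the histogram.

```python
import numpy as np

def gaussian_loglik(frames, mean, var):
    """Log-likelihood of each frame under one diagonal-Gaussian codeword.

    Assumption: a diagonal Gaussian is a simplified stand-in for a
    generative codeword; it ignores temporal structure.
    """
    diff = frames - mean
    return -0.5 * np.sum(diff ** 2 / var + np.log(2 * np.pi * var), axis=1)

def bos_histogram(frames, codewords):
    """Assign each frame to its most likely codeword; return normalized counts."""
    # loglik[i, k] = log-likelihood of frame i under codeword k
    loglik = np.stack(
        [gaussian_loglik(frames, mean, var) for mean, var in codewords], axis=1
    )
    winners = np.argmax(loglik, axis=1)
    counts = np.bincount(winners, minlength=len(codewords)).astype(float)
    return counts / counts.sum()

# Toy dictionary of two codewords, each a (mean, variance) pair (hypothetical values).
codewords = [
    (np.array([0.0, 0.0]), np.array([1.0, 1.0])),
    (np.array([5.0, 5.0]), np.array([1.0, 1.0])),
]

# Toy "song": four 2-D feature frames (hypothetical values).
frames = np.array([[0.1, -0.2], [4.9, 5.1], [5.2, 4.8], [0.0, 0.3]])

hist = bos_histogram(frames, codewords)
print(hist)
```

Because the result is a fixed-length vector of codeword frequencies, it can be passed to the same kind of classifier or retrieval method used for bag-of-words text documents, which is the decoupling the abstract describes.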
Keywords
content-based retrieval, music, BoS codewords, BoS histogram, BoS representation, audio characteristics, bag of systems representation, content-based automatic tagging system, generative model, music auto-tagging, musical characteristics, musical codewords, musical piece, tag-specific patterns, temporal characteristics, text document retrieval, timbral characteristics, time resolutions, audio annotation and retrieval, bag of systems, content-based music processing, dynamic texture model, music information retrieval, histograms, computational modeling, data models, hidden Markov models, feature extraction