Exploring audio semantic concepts for event-based video retrieval

ICASSP(2014)

Citations: 28 | Views: 36
Abstract
Audio semantic concepts (sound events) play an important role in audio-based content analysis. Capturing semantic information effectively from the complex occurrence patterns of sound events in YouTube-quality videos is a challenging problem. This paper presents a novel framework for semantic information extraction under these complex conditions in real-world videos and evaluates it on the NIST multimedia event detection (MED) task. We calculate the occurrence confidence matrix of sound events and explore multiple strategies for generating clip-level semantic features from the matrix. We evaluate performance on the TRECVID2011 MED dataset. The proposed method outperforms a previous HMM-based system. A late fusion experiment with low-level features and a text feature (ASR) shows that audio semantic concepts capture complementary information in the soundtrack.
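To make the step from an occurrence confidence matrix to clip-level features concrete, here is a minimal sketch. The abstract does not name the strategies the authors explored, so the three pooling rules below (max, average, and top-k average) are illustrative assumptions, as are the function name `clip_level_features`, the `top_k` parameter, and the example scores.

```python
from typing import Dict, List


def clip_level_features(conf: List[List[float]], top_k: int = 2) -> Dict[str, List[float]]:
    """Pool a segment-by-concept confidence matrix into clip-level vectors.

    conf[t][c] is the confidence that sound event c occurs in segment t.
    The pooling rules are assumptions for illustration, not the paper's method.
    """
    if not conf:
        raise ValueError("confidence matrix is empty")
    n_concepts = len(conf[0])
    # Transpose so each inner list holds one concept's scores over time.
    columns = [[row[c] for row in conf] for c in range(n_concepts)]
    k = min(top_k, len(conf))
    return {
        # Strongest single occurrence of each concept in the clip.
        "max": [max(col) for col in columns],
        # Mean confidence of each concept over all segments.
        "avg": [sum(col) / len(col) for col in columns],
        # Mean of the k most confident segments for each concept.
        "topk_avg": [sum(sorted(col, reverse=True)[:k]) / k for col in columns],
    }


# Hypothetical clip: 4 segments scored against 3 sound-event concepts.
example = [
    [0.9, 0.1, 0.0],
    [0.7, 0.2, 0.0],
    [0.1, 0.8, 0.0],
    [0.3, 0.1, 0.4],
]
features = clip_level_features(example)
```

Each returned list has one entry per sound event, so any of them could serve as a fixed-length clip descriptor regardless of how many segments the clip contains.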
Keywords
audio-based content analysis,event-based video retrieval,nist multimedia event detection task,semantic concept,audio semantic concepts,trecvid2011 med dataset,matrix algebra,sound events,hmm-based system,occurrence confidence matrix,complex occurrence pattern,clip-level semantic features,youtube quality videos,text feature,low-level features,audio signal processing,multimedia retrieval,semantic information,audio processing,video retrieval,speech,semantics,vectors,hidden markov models,feature extraction