Scalable Video Event Retrieval by Visual State Binary Embedding.

IEEE Trans. Multimedia (2016)

Citations: 21
Abstract
With the exponential increase of media data on the web, fast media retrieval has become a significant research topic in multimedia content analysis. Among the variety of techniques, learning binary embedding (hashing) functions is one of the most popular approaches to scalable retrieval in large databases, and it is mainly used for near-duplicate multimedia search. However, most existing hashing methods are designed for near-duplicate retrieval at the visual level rather than the semantic level. In this paper, we propose a visual state binary embedding (VSBE) model to encode video frames, preserving the essential semantic information in binary matrices to facilitate fast video event retrieval in unconstrained cases. Compared with other video binary embedding models, one advantage of the proposed VSBE model is that it needs only a limited number of key frames from the training videos for hash function training, so the computational complexity of the training phase is much lower. At the same time, we apply pairwise constraints generated from the visual states to capture the local properties of the events at the semantic level, so accuracy is also ensured. We conducted extensive experiments on the challenging TRECVID MED dataset, and the results demonstrate the superiority of the proposed VSBE model.
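The abstract describes hash functions trained on key frames under pairwise semantic constraints, but does not give the exact VSBE objective. The following is a minimal, hypothetical sketch of supervised hashing with pairwise constraints in that spirit: the linear projection, tanh relaxation, and Frobenius-norm loss are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

# Hypothetical sketch: learn a projection W so that sign(X @ W) yields binary
# codes whose inner products agree with pairwise semantic constraints.
# This is NOT the paper's VSBE objective, only an illustrative stand-in.

def train_pairwise_hash(X, S, n_bits=64, lr=1e-3, epochs=200):
    """X : (n, d) key-frame features.
    S : (n, n) pairwise constraints, +1 for similar pairs, -1 otherwise."""
    n, d = X.shape
    rng = np.random.default_rng(0)
    W = 0.01 * rng.standard_normal((d, n_bits))
    for _ in range(epochs):
        H = np.tanh(X @ W)                       # relaxed codes in (-1, 1)
        R = H @ H.T - n_bits * S                 # mismatch with scaled constraints
        dW = X.T @ ((4.0 * R @ H) * (1.0 - H ** 2))
        W -= lr * dW / n                         # gradient step on ||H H^T - n_bits*S||_F^2
    return W

def encode(X, W):
    """Binarize projected features into compact codes for Hamming-space retrieval."""
    return (X @ W > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query code."""
    return np.argsort(np.count_nonzero(db_codes != query_code, axis=1))
```

At query time only compact binary codes are compared in Hamming space, which is what makes this style of retrieval scalable to large video databases; the semantic supervision enters only through the pairwise constraint matrix used during training.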
Keywords
Visualization, Semantics, Streaming media, Training, Multimedia communication, Hidden Markov models, Event detection