Towards Efficient Learning of Optimal Spatial Bag-of-Words Representations

ICMR '14: Proceedings of International Conference on Multimedia Retrieval (2014)

Abstract
Spatial Pyramid Matching (SPM) assumes that the spatial Bag-of-Words (BoW) representation is independent of data. However, evidence has shown that this assumption usually leads to a suboptimal representation. In this paper, we propose a novel method called Jensen-Shannon (JS) Tiling that learns the BoW representation from data directly at the BoW level. JS Tiling is especially appropriate for large-scale datasets: it is orders of magnitude faster than existing methods while achieving comparable or even better classification precision. Experimental results on four benchmarks, including two TRECVID12 datasets, show that JS Tiling outperforms SPM and state-of-the-art methods. A runtime comparison shows that selecting BoW representations with JS Tiling is more than 1,000 times faster than running classifiers. In addition, JS Tiling was an important component of CMU Teams' final submission to TRECVID 2012 Multimedia Event Detection.
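The abstract does not spell out how JS Tiling scores candidate representations, so the following is only a minimal sketch of the standard Jensen-Shannon divergence between two BoW histograms, the quantity the method is named after. The function names (`normalize`, `kl_divergence`, `js_divergence`) and the toy histograms are assumptions made for illustration; this is not the paper's tiling procedure.

```python
import math


def normalize(hist):
    """Scale a non-negative histogram so its entries sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms with p_i == 0 contribute 0 by convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(hist_a, hist_b):
    """Jensen-Shannon divergence (base 2) between two BoW histograms.

    JS(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2.
    The result is symmetric and lies in [0, 1].
    """
    p = normalize(hist_a)
    q = normalize(hist_b)
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Hypothetical visual-word counts for two image regions.
    region_a = [12, 3, 0, 5]
    region_b = [2, 9, 4, 5]
    print(js_divergence(region_a, region_b))
```

Because the divergence is computed directly on histograms, comparing two candidate representations needs no classifier training, which is in keeping with the runtime claim in the abstract, though the exact selection criterion is described only in the full paper.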
Keywords
suboptimal representation,js tiling,spatial pyramid matching,multimedia event detection,bow representation,bow level,trecvid12 datasets validate,optimal spatial bag-of-words representations,cmu teams,proposed js tiling,towards efficient learning,large-scale datasets,bag of visual words,spm