WANet: Weight and Attention Network for Video Summarization

Discover Artificial Intelligence (2024)

Abstract
In this paper, we propose a deep learning-based model for video summarization, called the Weight and Attention Network (WANet). The network comprises a simple multi-head attention mechanism followed by a feed-forward network that produces frame importance scores. Summary keyshots are obtained from these scores using a combination of kernel temporal segmentation and the knapsack algorithm. Unlike past methods, which let the model learn all the features by itself, we first enrich the input frames with similarity information. A novel weight assignment mechanism assigns weights to the input frames based on their similarity before they are passed to the model. Experimental results on the SumMe and TVSum datasets indicate the effectiveness of the present method when compared to state-of-the-art methods applied to the same datasets.
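The keyshot selection step mentioned in the abstract can be illustrated with a minimal sketch of 0/1 knapsack selection. This is not the authors' implementation: the function name `select_keyshots`, the per-shot scores (assumed here to be already aggregated from frame importance scores), and the frame budget are all hypothetical choices made for illustration. The shot boundaries are assumed to come from a separate segmentation step such as kernel temporal segmentation, which is not shown.

```python
from typing import List, Sequence


def select_keyshots(
    shot_scores: Sequence[float],
    shot_lengths: Sequence[int],
    budget: int,
) -> List[int]:
    """Pick shot indices maximizing total score with total length <= budget.

    Hypothetical helper: a plain 0/1 knapsack solved by dynamic programming.
    """
    n = len(shot_scores)
    # best[i][c] = best total score using the first i shots within capacity c
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        length, score = shot_lengths[i - 1], shot_scores[i - 1]
        for c in range(budget + 1):
            best[i][c] = best[i - 1][c]  # option 1: skip shot i-1
            if length <= c:
                candidate = best[i - 1][c - length] + score  # option 2: take it
                if candidate > best[i][c]:
                    best[i][c] = candidate

    # Backtrack through the table to recover which shots were taken.
    chosen: List[int] = []
    c = budget
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= shot_lengths[i - 1]
    return sorted(chosen)


# Illustrative values only (not taken from the paper):
# four shots with made-up scores and lengths in frames, and a 60-frame budget.
selected = select_keyshots([0.9, 0.2, 0.6, 0.4], [40, 30, 30, 20], budget=60)
print(selected)
```

In practice the budget is often set as a fraction of the total video length; the exact value used for WANet is not stated in the abstract, so it is left as a parameter here.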
Keywords
Video Summarization, Multi-Head Attention, Weight Assignment Mechanism, Deep Learning, SumMe, TVSum