A Human-Machine Collaborative Video Summarization Framework Using Pupillary Response Signals

Haigang Ma, Wenyuan Yu, Yaoyao Wang, Minwei Zhou, Nai Ding, Jing Zheng

2023 6th International Conference on Information Communication and Signal Processing (ICICSP) (2023)

Abstract
This paper proposes a novel human-machine collaborative video summarization framework based on pupillary response. Because humans are the end users and evaluators of video content, it is natural to design a video summarization framework around the link between video features and viewers' real-time attentive responses. In this paper, pupil size, a pupillary response, is introduced as a real-time indicator of viewers' engagement and attention. First, we augmented the TVSum dataset by replacing its manual annotations with attention scores converted from pupil size signals, producing a perception-driven dataset. Second, we developed a video summarization framework that uses cues from viewers' pupil size signals to predict frame-level attention scores and then extracts key shots from videos. On the perception-driven dataset, the proposed summarization method achieves an average F-measure of 69.71%, with precision and recall of 69.67% and 69.77%, respectively, a significant improvement over random summarization. These preliminary experimental results demonstrate that our modeling approach can learn the dynamic attention mechanism of viewers and apply it to video summarization, and they validate the effectiveness of the proposed framework.
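The precision, recall, and F-measure reported in the abstract can be illustrated with a minimal sketch. It assumes that a summary is represented as a binary per-frame mask (1 = frame kept) and that scores are computed from the frame overlap between a predicted and a reference summary, which is a common convention for TVSum-style evaluation; the abstract does not state the exact protocol. The function name `summary_prf` and the toy masks are hypothetical.

```python
def summary_prf(pred_mask, ref_mask):
    """Precision, recall and F-measure between two binary frame masks.

    Assumption: each mask has one entry per video frame, truthy if the
    frame belongs to the summary. This is an illustrative convention,
    not necessarily the protocol used in the paper.
    """
    if len(pred_mask) != len(ref_mask):
        raise ValueError("masks must cover the same number of frames")

    # Frames selected by both the predicted and the reference summary.
    overlap = sum(1 for p, r in zip(pred_mask, ref_mask) if p and r)
    n_pred = sum(1 for p in pred_mask if p)
    n_ref = sum(1 for r in ref_mask if r)

    precision = overlap / n_pred if n_pred else 0.0
    recall = overlap / n_ref if n_ref else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0

    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy example: 8 frames, each summary keeps 3 frames, 2 of them shared.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
ref = [1, 0, 0, 0, 1, 1, 0, 0]
precision, recall, f_measure = summary_prf(pred, ref)
print(f"P={precision:.4f} R={recall:.4f} F={f_measure:.4f}")
```

Because the F-measure is a harmonic mean, it lies between precision and recall whenever both are positive, which agrees with the reported figures (69.71% between 69.67% and 69.77%).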
Keywords
Attention mechanism, Encoder-decoder framework, Gated Recurrent Unit (GRU), Pupil size, Video summarization