Characters Link Shots: Character Attention Network for Movie Scene Segmentation

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS(2024)

引用 0|浏览5
暂无评分
摘要
Movie scene segmentation aims to automatically segment a movie into multiple story units, i.e., scenes, each of which is a series of semantically coherent and time-continual shots. Previous methods have continued efforts on shot semantic association, but few take into account the impact of different semantics on foreground characters and background scenes in movie shots. In particular, the background scene in the shot can adversely affect scene boundary classification. Motivated by the fact that it is the characters who drive the plot development of a movie scene, we build a Character Attention Network (CANet) to detect movie scene boundaries in a character-centric fashion. To eliminate the background clutter, we extract multi-view character semantics for each shot in terms of human bodies and faces. Furthermore, we equip our CANet with two stages of character attention. The first is Masked Shot Attention (MSA) through selective self-attention over similar temporal contexts from multi-view character semantics to yield an enhanced omni-view shot representation, by which the CANet can better handle the variations of characters in pose and appearance. The second is Key Character Attention (KCA) through temporal-aware attention on character reappearances for Bidirectional Long Short-TermMemory (Bi-LSTM) feature association so that linking shots can be focused on those with recurring key characters. We encourage the proposed CANet in learning boundary-discriminative shot features. Specifically, we formulate a Boundary-Aware circle Loss (BAL) to push far apart CANet-features between adjacent scenes, which is also coupled with the cross-entropy loss to drive CANet-features sensitive to scene boundaries. Experimental results on the MovieNet-SSeg and OVSD datasets show that our method achieves superior performance in temporal scene segmentation compared with state-of-the-art methods.
更多
查看译文
关键词
Movie scene detection,video structuring,masked attention,video understanding
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要