Cluster2Former: Semisupervised Clustering Transformers for Video Instance Segmentation

Aron Fothi, Adrian Szlatincsan, Ellak Somfai

Sensors (2024)

Abstract
A novel approach for video instance segmentation is presented using semisupervised learning. Our Cluster2Former model leverages scribble-based annotations for training, significantly reducing the need for comprehensive pixel-level masks. We augment a video instance segmenter, for example, the Mask2Former architecture, with similarity-based constraint loss to handle partial annotations efficiently. We demonstrate that despite using lightweight annotations (using only 0.5% of the annotated pixels), Cluster2Former achieves competitive performance on standard benchmarks. The approach offers a cost-effective and computationally efficient solution for video instance segmentation, especially in scenarios with limited annotation resources.
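The abstract does not spell out the form of the similarity-based constraint loss, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each pixel carries a soft assignment over K mask queries, that the similarity of two pixels is the dot product of their assignments, and that scribbled pixel pairs are penalized with a binary cross-entropy term (same scribble: similarity should approach 1; different scribbles: approach 0). Pixels without a scribble label contribute nothing, which is how partial annotation enters. The function name `pairwise_constraint_loss` and its parameters are hypothetical.

```python
import math
from itertools import combinations


def pairwise_constraint_loss(probs, labels, eps=1e-7):
    """Hypothetical pairwise similarity loss over scribble-annotated pixels.

    probs  : list of per-pixel probability vectors over K mask queries
             (assumed form of the segmenter output).
    labels : per-pixel scribble instance id, or None when unannotated.
    Returns the mean binary cross-entropy over all annotated pixel pairs.
    """
    # Only pixels covered by a scribble take part in the loss.
    annotated = [i for i, lab in enumerate(labels) if lab is not None]
    total, count = 0.0, 0
    for i, j in combinations(annotated, 2):
        # Similarity = probability that both pixels pick the same query.
        sim = sum(a * b for a, b in zip(probs[i], probs[j]))
        sim = min(max(sim, eps), 1.0 - eps)  # clamp for numerical safety
        if labels[i] == labels[j]:
            total += -math.log(sim)          # same instance: pull together
        else:
            total += -math.log(1.0 - sim)    # different instances: push apart
        count += 1
    return total / count if count else 0.0


# Toy example: four pixels, two queries, the last pixel is unannotated.
labels = [0, 0, 1, None]
consistent = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
inconsistent = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
loss_consistent = pairwise_constraint_loss(consistent, labels)
loss_inconsistent = pairwise_constraint_loss(inconsistent, labels)
```

In this toy case the assignment that agrees with the scribbles yields a much lower loss than the one that contradicts them, and the unannotated pixel has no effect on either value.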
Keywords
transformers, video processing, instance segmentation, semisupervised learning