Efficient Temporal Action Segmentation via Boundary-aware Query Voting
arXiv (2024)
Abstract
Although the performance of Temporal Action Segmentation (TAS) has improved
in recent years, achieving promising results often comes with a high
computational cost due to dense inputs, complex model structures, and
resource-intensive post-processing requirements. To improve efficiency
while maintaining performance, we present a novel perspective centered on
per-segment classification. By harnessing the capabilities of Transformers, we
tokenize each video segment as an instance token, endowed with intrinsic
instance segmentation. To realize efficient action segmentation, we introduce
BaFormer, a boundary-aware Transformer network. It employs instance queries for
instance segmentation and a global query for class-agnostic boundary
prediction, yielding continuous segment proposals. During inference, BaFormer
employs a simple yet effective voting strategy to classify boundary-wise
segments based on instance segmentation. Remarkably, as a single-stage
approach, BaFormer significantly reduces the computational costs, utilizing
only 6
producing better or comparable accuracy over several popular benchmarks. The
code for this project is publicly available at
https://github.com/peiyao-w/BaFormer.
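
The inference step described above (class-agnostic boundaries define segments, and instance queries vote on each segment's class) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.5 threshold, the local-peak rule for picking boundaries, and the mask-weighted sum used as the vote are all assumptions made for illustration. The inputs are assumed to be a per-frame boundary probability, one per-frame mask per instance query, and one class distribution per instance query.

```python
from typing import List, Sequence, Tuple


def segments_from_boundaries(
    boundary_prob: Sequence[float], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Split frames [0, T) into contiguous segments at boundary peaks.

    Assumption: a frame is a boundary if its probability reaches the
    threshold and is a local peak. The paper may use a different rule.
    """
    num_frames = len(boundary_prob)
    cuts = [0]
    for t in range(1, num_frames):
        p = boundary_prob[t]
        left = boundary_prob[t - 1]
        right = boundary_prob[t + 1] if t + 1 < num_frames else 0.0
        if p >= threshold and p >= left and p > right:
            cuts.append(t)
    cuts.append(num_frames)
    return [(cuts[i], cuts[i + 1]) for i in range(len(cuts) - 1)]


def vote_segment_labels(
    boundary_prob: Sequence[float],
    masks: Sequence[Sequence[float]],        # one per-frame mask per query
    class_probs: Sequence[Sequence[float]],  # one class distribution per query
    threshold: float = 0.5,
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Assign one class per segment by mask-weighted query voting (hypothetical)."""
    segments = segments_from_boundaries(boundary_prob, threshold)
    num_classes = len(class_probs[0])
    labels = []
    for start, end in segments:
        scores = [0.0] * num_classes
        for mask, probs in zip(masks, class_probs):
            # A query's vote weight is its mean mask value inside the segment.
            weight = sum(mask[start:end]) / (end - start)
            for c in range(num_classes):
                scores[c] += weight * probs[c]
        labels.append(max(range(num_classes), key=scores.__getitem__))
    return segments, labels


# Toy example: 6 frames, 2 instance queries, 2 classes.
boundary_prob = [0.0, 0.1, 0.1, 0.9, 0.1, 0.0]
masks = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
class_probs = [[0.8, 0.2], [0.3, 0.7]]
segments, labels = vote_segment_labels(boundary_prob, masks, class_probs)
print(segments, labels)
```

Because every frame inside a segment receives the same label, a scheme of this kind needs no separate smoothing pass over per-frame predictions, which is consistent with the abstract's emphasis on avoiding resource-intensive post-processing.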