Instrument-tissue Interaction Detection Framework for Surgical Video Understanding
arXiv (2024)
Abstract
Instrument-tissue interaction detection, which helps in understanding surgical
activities, is vital for constructing computer-assisted surgery systems but
poses many challenges. First, most models represent instrument-tissue
interaction in a coarse-grained way that focuses only on classification and
lacks the ability to automatically detect instruments and tissues. Second,
existing works do not fully consider the intra- and inter-frame relations
between instruments and tissues. In this paper, we propose to represent
instrument-tissue interaction as a quintuple and present an
Instrument-Tissue Interaction Detection Network (ITIDNet) to detect the
quintuple for surgical video understanding. Specifically, we propose a Snippet
Consecutive Feature (SCF) Layer to enhance features by modeling relationships
of proposals in the current frame using global context information in the video
snippet. We also propose a Spatial Corresponding Attention (SCA) Layer to
incorporate features of proposals between adjacent frames through spatial
encoding. To reason about relationships between instruments and tissues, a Temporal
Graph (TG) Layer is proposed with intra-frame connections to exploit
relationships between instruments and tissues in the same frame and inter-frame
connections to model the temporal information for the same instance. For
evaluation, we build a cataract surgery video (PhacoQ) dataset and a
cholecystectomy surgery video (CholecQ) dataset. Experimental results
demonstrate the promising performance of our model, which outperforms other
state-of-the-art models on both datasets.
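
The connectivity described for the Temporal Graph (TG) Layer can be sketched as a small graph-building routine. This is only an illustration of the two edge types named in the abstract, not the authors' implementation: the `Node` class, the `track_id` field used to identify "the same instance", and the restriction of inter-frame edges to adjacent frames are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A detected proposal in one frame (hypothetical structure)."""
    frame: int      # frame index within the video snippet
    track_id: int   # assumed identity of the instance across frames
    kind: str       # "instrument" or "tissue"


def build_temporal_graph(nodes):
    """Return undirected edges as frozensets of two nodes.

    Intra-frame edges link every instrument to every tissue in the same frame.
    Inter-frame edges link the same instance in adjacent frames (assumption:
    only frames that differ by exactly one index are connected).
    """
    edges = set()
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            if a.frame == b.frame:
                # Intra-frame connection: instrument <-> tissue only.
                if a.kind != b.kind:
                    edges.add(frozenset((a, b)))
            elif (abs(a.frame - b.frame) == 1
                  and a.track_id == b.track_id
                  and a.kind == b.kind):
                # Inter-frame connection: same instance over time.
                edges.add(frozenset((a, b)))
    return edges


# Usage: one instrument and one tissue tracked over two frames.
snippet = [
    Node(frame=0, track_id=1, kind="instrument"),
    Node(frame=0, track_id=2, kind="tissue"),
    Node(frame=1, track_id=1, kind="instrument"),
    Node(frame=1, track_id=2, kind="tissue"),
]
graph_edges = build_temporal_graph(snippet)
```

In this toy snippet the graph has two intra-frame edges (instrument to tissue in each frame) and two inter-frame edges (each instance linked to itself in the next frame). A learned layer would then pass messages along these edges; that step is omitted here because the abstract does not specify it.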