Learning to Prompt Segment Anything Models
CoRR (2024)
Abstract
Segment Anything Models (SAMs) like SEEM and SAM have demonstrated great
potential in learning to segment anything. The core design of SAMs lies in
Promptable Segmentation, which takes a handcrafted prompt as input and returns
the expected segmentation mask. SAMs work with two types of prompts,
spatial prompts (e.g., points) and semantic prompts (e.g., texts), which work
together to prompt SAMs to segment anything on downstream datasets. Despite the
important role of prompts, how to acquire suitable prompts for SAMs is largely
under-explored. In this work, we examine the architecture of SAMs and identify
two challenges in learning effective prompts for SAMs. To address them, we
propose spatial-semantic prompt learning (SSPrompt), which learns effective
semantic and spatial prompts for better adapting SAMs. Specifically, SSPrompt introduces spatial
prompt learning and semantic prompt learning, which optimize spatial prompts
and semantic prompts directly over the embedding space and selectively leverage
the knowledge encoded in pre-trained prompt encoders. Extensive experiments
show that SSPrompt achieves superior image segmentation performance
consistently across multiple widely adopted datasets.
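The core idea of optimizing prompts directly over the embedding space, while weighting in knowledge from a frozen pre-trained prompt encoder, can be sketched roughly as follows. This is a minimal toy illustration, not the paper's actual method: the embedding dimension, the quadratic surrogate loss, the stand-in `target` embedding, and the fixed blending weight `gate` (which in SSPrompt's framing would itself be learnable) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # toy embedding dimension (assumption; real prompt embeddings are larger)

# Frozen output of a hypothetical pre-trained prompt encoder for one prompt.
encoder_emb = rng.normal(size=D)

# Learnable prompt embedding, optimized directly in the embedding space
# rather than via handcrafted points or texts.
learned_emb = np.zeros(D)

# Fixed selection weight blending frozen encoder knowledge with the learned
# prompt; making this weight learnable would mirror "selectively leveraging"
# the encoder's knowledge.
gate = 0.5

# Stand-in supervision signal: an embedding assumed to produce the desired mask.
target = rng.normal(size=D)

lr = 0.1
for _ in range(300):
    prompt = gate * encoder_emb + (1.0 - gate) * learned_emb
    err = prompt - target  # gradient direction of the toy loss ||prompt - target||^2
    learned_emb = learned_emb - lr * 2.0 * (1.0 - gate) * err

final_prompt = gate * encoder_emb + (1.0 - gate) * learned_emb
loss = float(((final_prompt - target) ** 2).sum())
print(loss)  # shrinks toward 0 as the learned prompt compensates in embedding space
```

The point of the sketch is only that the free parameters live in the embedding space itself, so gradient descent can reach prompts no handcrafted point or text would map to, while the frozen encoder term keeps pre-trained knowledge in the mix.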