Segment Anything Model is a Good Teacher for Local Feature Learning
arXiv (2023)
Abstract
Local feature detection and description, which aim to detect and describe
keypoints in "any scene" and for "any downstream task", play an important role
in many computer vision tasks. Data-driven local feature learning
methods rely on pixel-level correspondence for training, which is
challenging to acquire at scale, thus hindering further improvements in
performance. In this paper, we propose SAMFeat, which introduces SAM (Segment
Anything Model), a foundation model trained on 11 million images, as a teacher
to guide local feature learning and thus achieve higher performance on limited
datasets. To do so, first, we construct an auxiliary task of Attention-weighted
Semantic Relation Distillation (ASRD), which distills feature relations with
category-agnostic semantic information learned by the SAM encoder into a local
feature learning network, to improve local feature description using semantic
discrimination. Second, we develop a technique called Weakly Supervised
Contrastive Learning Based on Semantic Grouping (WSC), which utilizes semantic
groupings derived from SAM as weakly supervised signals, to optimize the metric
space of local descriptors. Third, we design an Edge Attention Guidance (EAG)
to further improve the accuracy of local feature detection and description by
prompting the network to pay more attention to the edge region guided by SAM.
SAMFeat's performance on various tasks, such as image matching on HPatches and
long-term visual localization on Aachen Day-Night, showcases its superiority
over previous local features. The released code is available at
https://github.com/vignywang/SAMFeat.
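
The relation-distillation idea behind ASRD can be illustrated with a small sketch. This is not the authors' implementation: the function names, the cosine-similarity relation matrix, the L1-style discrepancy, and the optional `weights` argument (standing in for the attention weighting) are all assumptions made for illustration. The point it shows is that the student is trained to reproduce the teacher's pairwise feature relations rather than the teacher's features themselves, so the two feature spaces may have different dimensions.

```python
import math


def cosine_relation_matrix(features):
    """Return the matrix of pairwise cosine similarities between feature vectors."""
    normed = []
    for f in features:
        # Guard against zero vectors by falling back to a norm of 1.0.
        norm = math.sqrt(sum(x * x for x in f)) or 1.0
        normed.append([x / norm for x in f])
    return [[sum(a * b for a, b in zip(u, v)) for v in normed] for u in normed]


def relation_distillation_loss(teacher_feats, student_feats, weights=None):
    """Weighted mean absolute difference between teacher and student relations.

    `weights` is a hypothetical per-pair weighting (an assumed stand-in for
    attention weights); uniform weights are used when it is omitted.
    """
    t = cosine_relation_matrix(teacher_feats)
    s = cosine_relation_matrix(student_feats)
    n = len(t)
    if weights is None:
        weights = [[1.0] * n for _ in range(n)]
    total = sum(
        weights[i][j] * abs(t[i][j] - s[i][j])
        for i in range(n)
        for j in range(n)
    )
    return total / sum(sum(row) for row in weights)


# Toy example: three sampled locations, 2-D teacher and 3-D student features.
teacher = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
student_aligned = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [1.0, 1.0, 0.0]]
student_collapsed = [[1.0, 0.0, 0.0]] * 3

loss_aligned = relation_distillation_loss(teacher, student_aligned)
loss_collapsed = relation_distillation_loss(teacher, student_collapsed)
```

In this toy case the aligned student has the same angular structure as the teacher, so its loss is essentially zero, while the collapsed student, which maps every location to the same descriptor, is penalized.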