BLO-SAM: Bi-level Optimization Based Overfitting-Preventing Finetuning of SAM
CoRR (2024)
Abstract
The Segment Anything Model (SAM), a foundation model pretrained on millions
of images and segmentation masks, has significantly advanced semantic
segmentation, a fundamental task in computer vision. Despite its strengths, SAM
encounters two major challenges. Firstly, it struggles with segmenting specific
objects autonomously, as it relies on users to manually input prompts like
points or bounding boxes to identify targeted objects. Secondly, SAM struggles
to excel at specific downstream tasks, such as medical imaging, due
to a disparity between the distribution of its pretraining data, which
predominantly consists of general-domain images, and the data used in
downstream tasks. Current solutions to these problems, which involve finetuning
SAM, often lead to overfitting, a notable issue when data is very limited,
as in medical imaging. To overcome these limitations, we introduce
BLO-SAM, which finetunes SAM based on bi-level optimization (BLO). Our approach
allows for automatic image segmentation without the need for manual prompts, by
optimizing a learnable prompt embedding. Furthermore, it significantly reduces
the risk of overfitting by training the model's weight parameters and the
prompt embedding on two separate subsets of the training dataset, each at a
different level of optimization. We apply BLO-SAM to diverse semantic
segmentation tasks in general and medical domains. The results demonstrate
BLO-SAM's superior performance over various state-of-the-art image semantic
segmentation methods.
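
To make the two-subset idea concrete, the following is a minimal sketch, not the BLO-SAM implementation. It assumes a toy linear model `y = w * x + p`, where the scalar `w` stands in for the model's weight parameters and the scalar `p` stands in for the learnable prompt embedding. It also assumes that the weights are updated on the first subset and the prompt on the second, and it replaces the full bi-level solve with simple alternating first-order gradient steps. All names (`mse_grads`, `alternating_bilevel`) are hypothetical.

```python
def mse_grads(w, p, data):
    """Gradients of the mean squared error of y_hat = w * x + p w.r.t. w and p."""
    n = len(data)
    grad_w = sum(2.0 * (w * x + p - y) * x for x, y in data) / n
    grad_p = sum(2.0 * (w * x + p - y) for x, y in data) / n
    return grad_w, grad_p


def alternating_bilevel(data, steps=500, lr=0.05):
    """Alternate updates of w and p on two disjoint halves of the data.

    Assumption: w (toy "weights") is trained on the first half with p fixed,
    and p (toy "prompt embedding") is trained on the second half with w fixed.
    This is a first-order alternating approximation, not an exact bi-level solver.
    """
    half = len(data) // 2
    subset_1, subset_2 = data[:half], data[half:]
    w, p = 0.0, 0.0
    for _ in range(steps):
        # Step 1: update the weights on subset 1, prompt held fixed.
        grad_w, _ = mse_grads(w, p, subset_1)
        w -= lr * grad_w
        # Step 2: update the prompt on subset 2, weights held fixed.
        _, grad_p = mse_grads(w, p, subset_2)
        p -= lr * grad_p
    return w, p


# Toy data generated from y = 2x + 1 (illustrative only).
xs = [-1.0, 0.0, 1.0, 2.0, -2.0, 0.5, 1.5, -0.5]
data = [(x, 2.0 * x + 1.0) for x in xs]
w, p = alternating_bilevel(data)
print(f"w = {w:.4f}, p = {p:.4f}")
```

Because each parameter group only ever sees its own subset, neither one can fit the whole training set by itself, which is the intuition behind the overfitting claim in the abstract. In a real setting `w` and `p` would be high-dimensional tensors and the losses would be segmentation losses.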