TASFormer: Task-Aware Image Segmentation Transformer

Neural Information Processing, ICONIP 2023, Part V (2024)

Abstract
In image segmentation tasks for real-world applications, the number of semantic categories can be very large, and the number of objects in them can vary greatly. In this case, a multi-channel representation of the output mask is inefficient for the segmentation model. In this paper, we explore approaches to overcome this problem by using a single-channel output mask together with additional input information about the desired class for segmentation. We call this information a task embedding and learn it during training of the neural network model. In our case, the number of tasks equals the number of segmentation categories. This approach allows us to build universal models that can be conveniently extended to an arbitrary number of categories without changing the architecture of the neural network. To investigate this idea, we developed a transformer segmentation model named TASFormer. We demonstrate that the highest-quality results for task-aware segmentation are obtained by using adapter technology as part of the model. To evaluate segmentation quality, we introduce a binary intersection over union (bIoU) metric, an adaptation of the standard mIoU for models with single-channel output. We analyze its distinguishing properties and use it to compare modern neural network methods. The experiments were carried out on the universal ADE20K dataset, on which the proposed TASFormer-based approach demonstrates state-of-the-art segmentation quality. The software implementation of the TASFormer method and the bIoU metric is publicly available at www.github.com/subake/TASFormer.
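The abstract describes bIoU only at a high level. The Python sketch below illustrates one plausible reading: the model produces a single-channel binary mask for each requested task (category), and the score averages the per-task binary IoU between predicted and ground-truth masks. The function names (`binary_iou`, `biou_over_tasks`), the smoothing term, and the averaging scheme are assumptions for illustration, not the paper's reference implementation.

```python
# Hypothetical sketch of a binary IoU (bIoU) score for single-channel outputs.
# The exact averaging scheme used in TASFormer's evaluation may differ.
import numpy as np


def binary_iou(pred_mask: np.ndarray, gt_mask: np.ndarray, eps: float = 1e-6) -> float:
    """IoU between two binary masks of identical shape."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float((intersection + eps) / (union + eps))


def biou_over_tasks(pred_masks, gt_masks) -> float:
    """Average binary IoU over per-task (per-category) mask pairs."""
    scores = [binary_iou(p, g) for p, g in zip(pred_masks, gt_masks)]
    return sum(scores) / len(scores)


# Usage: one single-channel binary prediction per requested task embedding.
rng = np.random.default_rng(0)
preds = rng.random((4, 64, 64)) > 0.5   # 4 tasks, 64x64 binary masks
gts = rng.random((4, 64, 64)) > 0.5
print(f"bIoU: {biou_over_tasks(preds, gts):.3f}")
```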
Keywords
Image segmentation, Task embedding, Segmentation quality metric, Transformer