L-MAE: Masked Autoencoders are Semantic Segmentation Datasets Augmenter

Jiaru Jia, Mingzhe Liu, Jiake Xie, Xin Chen, Aiqing Yang, Xin Jiang, Hong Zhang, Yong Tang

arXiv (Cornell University) (2022)

Abstract
Building semantic segmentation datasets is laborious and time-consuming, particularly for large models and for specialized domains (e.g., medical imaging or remote sensing). Large models require a substantial volume of data, while datasets in professional domains frequently require the involvement of domain experts. Both scenarios are susceptible to inaccurate data labeling, which can significantly degrade the final performance of the trained model. This paper proposes a simple and effective pixel-level label completion method, Label Mask AutoEncoder (L-MAE), which fully uses the existing information in the label to generate the complete label. The proposed model is the first to apply the Mask Auto-Encoder to downstream tasks. In detail, L-MAE adopts a fusion strategy that stacks the label and the corresponding image into a single input, called the fuse map. Moreover, since some of the image information is lost when masking the fuse map, direct reconstruction may lead to poor performance. We propose the Image Patch Supplement algorithm to supplement the missing information during the mask-reconstruct process, and empirically find that it improves mIoU by 4.1% on average. We conducted an experiment to evaluate the efficacy of L-MAE in completing a dataset. For the initial set of experiments, we trained an identical conventional semantic segmentation model on a degraded Pascal VOC dataset and on the same degraded dataset enhanced by L-MAE. The results demonstrate a performance improvement of 13.5% for the model trained on the L-MAE-enhanced dataset compared with the unenhanced dataset.
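The fuse map, masking, and Image Patch Supplement steps described in the abstract can be illustrated with a minimal NumPy sketch. This is an interpretation under stated assumptions, not the authors' implementation: the function names, the patch size, the zero fill value for masked patches, and the choice to restore image channels directly are all hypothetical, since the abstract does not specify them.

```python
import numpy as np


def make_fuse_map(image, label):
    """Stack an H x W x 3 image and an H x W label map into an H x W x 4 array."""
    return np.concatenate([image, label[..., None].astype(image.dtype)], axis=-1)


def random_patch_mask(height, width, patch, ratio, rng):
    """Return a boolean (H // patch, W // patch) grid; True marks a masked patch."""
    rows, cols = height // patch, width // patch
    n_masked = int(round(rows * cols * ratio))
    flat = np.zeros(rows * cols, dtype=bool)
    flat[rng.choice(rows * cols, size=n_masked, replace=False)] = True
    return flat.reshape(rows, cols)


def to_pixel_mask(patch_mask, patch):
    """Expand a patch-level mask to a pixel-level H x W mask."""
    return np.repeat(np.repeat(patch_mask, patch, axis=0), patch, axis=1)


def apply_mask(fuse_map, patch_mask, patch):
    """Zero every channel of the masked patches (zero fill is an assumption)."""
    masked = fuse_map.copy()
    masked[to_pixel_mask(patch_mask, patch)] = 0
    return masked


def supplement_image_patches(masked_fuse_map, image, patch_mask, patch):
    """Assumed form of Image Patch Supplement: put the known image pixels back
    into the masked patches, so only the label channel remains missing."""
    out = masked_fuse_map.copy()
    pixel_mask = to_pixel_mask(patch_mask, patch)
    out[pixel_mask, :3] = image[pixel_mask]
    return out


rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))              # toy image
label = rng.integers(0, 21, size=(8, 8))   # toy label map with 21 class ids
fuse_map = make_fuse_map(image, label)
patch_mask = random_patch_mask(8, 8, patch=4, ratio=0.5, rng=rng)
masked = apply_mask(fuse_map, patch_mask, patch=4)
supplemented = supplement_image_patches(masked, image, patch_mask, patch=4)
```

After the supplement step, the image channels are complete again and only the label channel of the masked patches is left for a reconstruction model to predict. A real model would likely use a learned mask token rather than zeros, because zero can collide with a valid class id.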
Keywords
masked autoencoders,segmentation,l-mae