TLDR: Text Based Last-layer Retraining for Debiasing Image Classifiers
CoRR (2023)
Abstract
A classifier may depend on incidental features stemming from a strong correlation between the feature and the classification target in the training dataset. Recently, Last Layer Retraining (LLR) with group-balanced datasets has been shown to be effective in mitigating spurious correlations in classifiers. However, acquiring group-balanced datasets is costly, which limits the applicability of LLR. In this work, we propose to perform LLR for a general image classifier using text datasets built with large language models. We demonstrate that text can serve as a proxy for its corresponding image beyond image-text joint embedding spaces such as CLIP. Based on this, we use generated texts to train the final layer in the embedding space of an arbitrary image classifier. In addition, we propose a method for filtering the generated words to remove noisy, imprecise ones, which reduces the effort of inspecting each word. We dub these procedures TLDR (Text-based Last layer retraining for Debiasing image classifieRs) and show that our method achieves performance comparable to that of LLR methods that retrain on group-balanced image datasets. Furthermore, TLDR outperforms other baselines that train the last linear layer without a group-annotated dataset.
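The abstract only outlines the mechanism, so the following is a minimal sketch of the core idea, assuming an OpenAI CLIP backbone as the joint image-text embedding space. The prompt list, class labels, and training hyperparameters are illustrative assumptions rather than the authors' released implementation, and the word-filtering step is omitted.

```python
# Sketch of text-based last-layer retraining: train a linear head on text
# embeddings, then apply it to image embeddings at test time.
# Assumes the OpenAI "clip" package (pip install git+https://github.com/openai/CLIP.git).
import torch
import torch.nn as nn
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
model.eval()

# Hypothetical group-balanced prompts produced by an LLM (class x spurious
# attribute), paired with class labels; the actual prompts are generated and
# filtered as described in the paper.
generated_texts = [
    ("a photo of a landbird in a forest", 0),
    ("a photo of a landbird over the ocean", 0),
    ("a photo of a waterbird in a forest", 1),
    ("a photo of a waterbird over the ocean", 1),
]

# Embed the texts with the frozen text encoder.
with torch.no_grad():
    tokens = clip.tokenize([t for t, _ in generated_texts]).to(device)
    text_feats = model.encode_text(tokens).float()
    text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)
labels = torch.tensor([y for _, y in generated_texts], device=device)

# Retrain only the last linear layer on the text embeddings.
head = nn.Linear(text_feats.shape[1], 2).to(device)
opt = torch.optim.Adam(head.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(head(text_feats), labels)
    loss.backward()
    opt.step()

# At test time the same head is applied to normalized image embeddings,
# relying on the approximate alignment of the two embedding spaces:
# image_feats = model.encode_image(preprocess(img).unsqueeze(0).to(device)).float()
# image_feats = image_feats / image_feats.norm(dim=-1, keepdim=True)
# pred = head(image_feats).argmax(dim=-1)
```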