The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better
arXiv (2024)
Abstract
Generative text-to-image models enable us to synthesize unlimited amounts of
images in a controllable manner, spurring many recent efforts to train vision
models with synthetic data. However, every synthetic image ultimately
originates from the upstream data used to train the generator. What additional
value does the intermediate generator provide over directly training on
relevant parts of the upstream data? Grounding this question in the setting of
image classification, we compare finetuning on task-relevant, targeted
synthetic data generated by Stable Diffusion – a generative model trained on
the LAION-2B dataset – against finetuning on targeted real images retrieved
directly from LAION-2B. We show that while synthetic data can benefit some
downstream tasks, it is universally matched or outperformed by real data from
our simple retrieval baseline. Our analysis suggests that this underperformance
is partially due to generator artifacts and inaccurate task-relevant visual
details in the synthetic images. Overall, we argue that retrieval is a critical
baseline to consider when training with synthetic data – a baseline that
current methods do not yet surpass. We release code, data, and models at
https://github.com/scottgeng00/unmet-promise.
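The retrieval baseline described above can be sketched in a few lines: embed a task description and the upstream image pool, then keep the top-k most similar images as targeted real training data. This is a minimal illustration, not the paper's implementation; the embeddings below are random stand-ins for CLIP features, and all names (`retrieve_topk`, the pool sizes) are hypothetical.

```python
# Hypothetical sketch of a text-to-image retrieval baseline: rank a pool of
# upstream image embeddings by cosine similarity to a query embedding and
# keep the top-k as task-targeted real data. Random vectors stand in for
# CLIP features; this is illustrative, not the paper's released code.
import numpy as np

def retrieve_topk(query_emb: np.ndarray, pool_embs: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k pool embeddings most similar to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    p = pool_embs / np.linalg.norm(pool_embs, axis=1, keepdims=True)
    sims = p @ q                       # cosine similarity of each pool item to the query
    return np.argsort(-sims)[:k]       # indices sorted by descending similarity

rng = np.random.default_rng(0)
pool = rng.normal(size=(1000, 64))              # stand-in image embeddings
query = pool[42] + 0.01 * rng.normal(size=64)   # query vector near pool item 42
top = retrieve_topk(query, pool, k=5)
print(top[0])                                    # pool item 42 should rank first
```

In the paper's actual setting the pool is LAION-2B and the query comes from the downstream classification task; the point of the sketch is only that the baseline is a single nearest-neighbor lookup, with no generator in the loop.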