A Prompt Log Analysis of Text-to-Image Generation Systems

Yan‐zhao Xie,Zhaoying Pan,Jinge Ma, Jianxun Luo,Qiaozhu Mei

arXiv (Cornell University)(2023)

引用 0|浏览2
暂无评分
摘要
Recent developments in diffusion models have unleashed the astonishing capabilities of text-to-image generation systems to synthesize high-quality images that are faithful to a given reference text, known as a "prompt." These systems, once released to the public, have immediately received tons of attention from researchers, creators, and common users. Despite the plenty of efforts to improve the underneath generative models, there is limited work on understanding the information needs of the real users of these systems, e.g., by investigating the prompts the users input at scale. In this paper, we take the initiative to conduct a comprehensive analysis of large-scale prompt logs collected from multiple text-to-image generation systems. Our work is analogous to analyzing the query log of Web search engines, a line of work that has made critical contributions to the glory of the Web search industry and research. We analyze over two million user-input prompts submitted to three popular text-to-image systems at scale. Compared to Web search queries, text-to-image prompts are significantly longer, often organized into unique structures, and present different categories of information needs. Users tend to make more edits within creation sessions, showing remarkable exploratory patterns. Our findings provide concrete implications on how to improve text-to-image generation systems for creation purposes.
更多
查看译文
关键词
prompt log analysis,generation,text-to-image
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要