Chrome Extension
WeChat Mini Program
Use on ChatGLM

A Survey on Quality Metrics for Text-to-Image Models

arxiv(2024)

Cited 0|Views10
No score
Abstract
Recent AI-based text-to-image models not only excel at generating realistic images, they also give designers more and more fine-grained control over the image content. Consequently, these approaches have gathered increased attention within the computer graphics research community, which has been historically devoted towards traditional rendering techniques that offer precise control over scene parameters such as objects, materials, and lighting, when generating realistic images. While the quality of rendered images is traditionally assessed through well-established image quality metrics, such as SSIM or PSNR, the unique challenges presented by text-to-image models, which in contrast to rendering interweave the control of scene and rendering parameters, necessitate the development of novel image quality metrics. Therefore, within this survey, we provide a comprehensive overview of existing text-to-image quality metrics addressing their nuances and the need for alignment with human preferences. Based on our findings, we propose a new taxonomy for categorizing these metrics, which is grounded in the assumption that there are two main quality criteria, namely compositionality and generality, which ideally map to human preferences. Ultimately, we derive guidelines for practitioners conducting text-to-image evaluation, discuss open challenges of evaluation mechanisms, and surface limitations of current metrics.
More
Translated text
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined