MC^2: Multi-concept Guidance for Customized Multi-concept Generation
arXiv (2024)
Abstract
Customized text-to-image generation aims to synthesize instantiations of
user-specified concepts and has achieved unprecedented progress in handling
individual concepts. However, when extended to multiple customized concepts,
existing methods exhibit limitations in flexibility and fidelity: they
accommodate only combinations of a limited range of model types and may
produce a mix of characteristics from different concepts. In this paper,
we introduce Multi-concept guidance for Multi-concept customization, termed
MC^2, for improved flexibility and fidelity. MC^2 decouples the
requirements on model architecture via inference-time optimization, allowing
the integration of heterogeneous single-concept customized models. It
adaptively refines the attention weights between visual and textual tokens,
directing image regions to focus on their associated words while diminishing
the impact of irrelevant ones. Extensive experiments demonstrate that MC^2
even surpasses previous methods that require additional training in terms of
consistency with the input prompt and reference images. Moreover, MC^2 can be
extended to elevate the compositional capabilities of text-to-image generation,
yielding appealing results. Code will be publicly available at
https://github.com/JIANGJiaXiu/MC-2.
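The abstract's core mechanism, refining cross-attention weights so that each image region attends to its associated concept's words and less to other concepts' words, can be sketched as follows. This is an illustrative toy version under assumed inputs (the function name, the additive boost/suppress rule, and the region-mask representation are hypothetical, not the paper's exact formulation):

```python
import numpy as np

def refine_attention(attn_logits, region_masks, concept_token_ids,
                     boost=1.0, suppress=-1.0):
    """Toy sketch of multi-concept attention refinement (hypothetical rule,
    not the paper's exact method).

    attn_logits:       (n_image_tokens, n_text_tokens) raw cross-attention scores
    region_masks:      per-concept boolean arrays of shape (n_image_tokens,)
                       marking which image tokens belong to that concept's region
    concept_token_ids: per-concept lists of text-token indices for that
                       concept's words in the prompt
    """
    logits = attn_logits.copy()
    for mask, own_ids in zip(region_masks, concept_token_ids):
        # Text tokens belonging to the *other* concepts are irrelevant here.
        other_ids = [i for ids in concept_token_ids for i in ids
                     if i not in own_ids]
        logits[np.ix_(mask, own_ids)] += boost       # strengthen associated words
        if other_ids:
            logits[np.ix_(mask, other_ids)] += suppress  # diminish irrelevant ones
    # Renormalize over text tokens with a softmax.
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)
```

In an actual diffusion pipeline this adjustment would be applied inside the cross-attention layers during inference-time optimization; here it simply shows the intended effect: within each concept's region, attention mass shifts toward that concept's words.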