SVGCraft: Beyond Single Object Text-to-SVG Synthesis with Comprehensive Canvas Layout
arXiv (2024)
Abstract
Generating vector art from text prompts is a challenging vision task,
requiring diverse yet realistic depictions of seen as well as unseen
entities. However, existing research has been mostly limited to the generation
of single objects, rather than comprehensive scenes comprising multiple
elements. In response, this work introduces SVGCraft, a novel end-to-end
framework for the creation of vector graphics depicting entire scenes from
textual descriptions. Utilizing a pre-trained LLM for layout generation from
text prompts, this framework introduces a technique for producing masked
latents in specified bounding boxes for accurate object placement. It
introduces a fusion mechanism for integrating attention maps and employs a
diffusion U-Net for coherent composition, speeding up the drawing process. The
resulting SVG is optimized using a pre-trained encoder and LPIPS loss with
opacity modulation to maximize similarity. Additionally, this work explores the
potential of primitive shapes in facilitating canvas completion in constrained
environments. Through both qualitative and quantitative assessments, SVGCraft
is demonstrated to surpass prior works in abstraction, recognizability, and
detail, as evidenced by its performance metrics (CLIP-T: 0.4563, Cosine
Similarity: 0.6342, Confusion: 0.66, Aesthetic: 6.7832). The code will be
available at https://github.com/ayanban011/SVGCraft.
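
The idea of restricting a latent to a specified bounding box can be illustrated with a minimal sketch. This is not the SVGCraft implementation: the names `box_mask` and `mask_latent`, the normalized `(x0, y0, x1, y1)` box format, and the plain nested-list latent are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1), normalized to [0, 1].
Box = Tuple[float, float, float, float]


def box_mask(height: int, width: int, box: Box) -> List[List[int]]:
    """Return a height x width binary mask that is 1 inside the box."""
    x0, y0, x1, y1 = box
    # Convert normalized coordinates to latent-grid cell indices.
    c0, c1 = int(x0 * width), int(x1 * width)
    r0, r1 = int(y0 * height), int(y1 * height)
    return [
        [1 if (r0 <= r < r1 and c0 <= c < c1) else 0 for c in range(width)]
        for r in range(height)
    ]


def mask_latent(latent: List[List[float]], mask: List[List[int]]) -> List[List[float]]:
    """Zero out every latent value that falls outside the mask."""
    return [
        [value * m for value, m in zip(latent_row, mask_row)]
        for latent_row, mask_row in zip(latent, mask)
    ]


# Example: keep only the top-left quadrant of a 4 x 4 latent.
mask = box_mask(4, 4, (0.0, 0.0, 0.5, 0.5))
latent = [[2.0] * 4 for _ in range(4)]
masked = mask_latent(latent, mask)
```

In a layout-driven pipeline, one such mask per object box (as proposed by the layout stage) would confine each object's latent to its assigned region before composition; how SVGCraft actually fuses these regions is described in the paper, not in this sketch.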