Two-Stage Dual Augmentation with CLIP for Improved Text-to-Sketch Synthesis

Zhenfei Zhang, Ming-Ching Chang

2023 IEEE 6th International Conference on Multimedia Information Processing and Retrieval (MIPR), 2023

Abstract
We introduce an improved text-to-sketch synthesis method that uses two-stage dual augmentation built on the large-scale pre-trained CLIP and CLIPDraw models. In the first stage, the input text is fed to CLIPDraw to adaptively produce a text augmentation. In the second stage, attention mechanisms and structural images with fewer strokes are used to enhance image augmentation. The parameters of the Bézier drawing curves are optimized using global and local loss terms. Our method produces visually plausible drawings with better stroke layouts and improved drawing details, and it requires no model re-training or parameter tuning. We further use CLIPScore, a reference-free metric, to evaluate how well the generated image matches the input text description. Experimental results show that the proposed method produces improved sketches.
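
To make the evaluation metric concrete, the sketch below computes a CLIPScore-style value from a pair of embedding vectors. It assumes the commonly cited definition of CLIPScore, w * max(cos(image, text), 0) with w = 2.5; the abstract does not state which variant or weight the authors use. The function names and the toy vectors are hypothetical placeholders, and in practice the embeddings would come from CLIP's image and text encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def clip_score(image_emb, text_emb, w=2.5):
    """CLIPScore-style value: rescaled cosine similarity, clipped at zero.

    Assumption: w = 2.5 follows the commonly cited definition of the
    metric; it is not taken from the abstract.
    """
    return w * max(cosine_similarity(image_emb, text_emb), 0.0)


if __name__ == "__main__":
    # Toy embeddings for illustration only; these are NOT real CLIP outputs.
    sketch_embedding = [0.2, 0.7, 0.1]
    prompt_embedding = [0.25, 0.6, 0.05]
    print(f"CLIPScore-style value: {clip_score(sketch_embedding, prompt_embedding):.3f}")
```

Because the score depends only on the generated image and the input text, no reference sketch is needed, which is what makes the metric reference-free.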