Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer
arxiv(2024)
摘要
Recent progress in generative compression technology has significantly
improved the perceptual quality of compressed data. However, these advancements
primarily focus on producing high-frequency details, often overlooking the
ability of generative models to capture the prior distribution of image
content, thus impeding further bitrate reduction in extreme compression
scenarios (<0.05 bpp). Motivated by the capabilities of predictive language
models for lossless compression, this paper introduces a novel Unified Image
Generation-Compression (UIGC) paradigm, merging the processes of generation and
compression. A key feature of the UIGC framework is the adoption of
vector-quantized (VQ) image models for tokenization, alongside a multi-stage
transformer designed to exploit spatial contextual information for modeling the
prior distribution. As such, the dual-purpose framework effectively utilizes
the learned prior for entropy estimation and assists in the regeneration of
lost tokens. Extensive experiments demonstrate the superiority of the proposed
UIGC framework over existing codecs in perceptual quality and human perception,
particularly in ultra-low bitrate scenarios (<=0.03 bpp), pioneering a new
direction in generative compression.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要