Sandwiched Compression: Repurposing Standard Codecs with Neural Network Wrappers
CoRR (2024)
Abstract
We propose sandwiching standard image and video codecs between pre- and
post-processing neural networks. The networks are jointly trained through a
differentiable codec proxy to minimize a given rate-distortion loss. This
sandwich architecture not only improves the standard codec's performance on its
intended content, but can also effectively adapt the codec to other types of
image/video content and to other distortion measures. Essentially, the sandwich
learns to transmit “neural code images” that optimize overall rate-distortion
performance even when the overall problem is well outside the scope of the
codec's design. Through a variety of examples, we apply the sandwich
architecture to sources with different numbers of channels, higher resolution,
higher dynamic range, and perceptual distortion measures. The results
demonstrate substantial improvements (up to 9 dB gains or up to 30% bitrate
reductions) compared to alternative adaptations. We derive VQ equivalents for
the sandwich, establish optimality properties, and design differentiable codec
proxies approximating current standard codecs. We further analyze model
complexity, visual quality under perceptual metrics, and sandwich
configurations that show promise for image/video compression and
streaming.
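
The rate-distortion objective described above can be sketched as a forward pass. This is only an illustration under stated assumptions, not the authors' implementation: the pre- and post-processors are reduced to hypothetical per-pixel linear maps (`w_pre`, `w_post`), the codec proxy is assumed to be additive uniform noise of one quantization step, and the rate estimate `log2(1 + |z| / step)` is an assumed smooth surrogate. All function names and parameters are invented for this example. The sketch only evaluates the loss; actual joint training would need an automatic-differentiation framework.

```python
import numpy as np

def pre_process(x, w_pre):
    # Hypothetical stand-in for the pre-processing network:
    # maps each source pixel (row of x) to a "neural code" pixel.
    return x @ w_pre

def codec_proxy(z, step, rng):
    # Assumed differentiable codec stand-in: quantization is modeled
    # as additive uniform noise spanning one quantization step.
    return z + rng.uniform(-0.5, 0.5, size=z.shape) * step

def rate_proxy(z, step):
    # Assumed smooth rate surrogate (in bits), summed over all code samples.
    return float(np.sum(np.log2(1.0 + np.abs(z) / step)))

def post_process(z_hat, w_post):
    # Hypothetical stand-in for the post-processing network.
    return z_hat @ w_post

def sandwich_loss(x, w_pre, w_post, step, lam, rng):
    # Forward pass: pre-processor -> codec proxy -> post-processor.
    z = pre_process(x, w_pre)
    z_hat = codec_proxy(z, step, rng)
    x_hat = post_process(z_hat, w_post)
    distortion = float(np.mean((x - x_hat) ** 2))  # MSE distortion
    rate = rate_proxy(z, step) / x.shape[0]        # bits per pixel
    return distortion + lam * rate, distortion, rate

# Example: a 6-channel source squeezed through a 3-channel code image.
rng = np.random.default_rng(0)
x = rng.normal(size=(1024, 6))
w_pre = rng.normal(size=(6, 3)) * 0.1
w_post = rng.normal(size=(3, 6)) * 0.1
loss, distortion, rate = sandwich_loss(x, w_pre, w_post, step=0.05, lam=0.01, rng=rng)
```

In this sketch `lam` trades rate against distortion. Swapping the MSE term for a perceptual measure, or changing the channel counts of `w_pre` and `w_post`, corresponds to the kinds of adaptation the abstract mentions.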