Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach

MM '23: Proceedings of the 31st ACM International Conference on Multimedia (2023)

Abstract
Traditional image codecs prioritize signal fidelity and human perception, often neglecting machine vision tasks. Deep learning approaches have shown promising coding performance by leveraging rich semantic embeddings that can be optimized for both human and machine vision. However, these compact embeddings struggle to represent low-level details such as contours and textures, leading to imperfect reconstructions. In addition, existing learning-based coding tools lack scalability. To address these challenges, this paper presents a content-adaptive diffusion model for scalable image compression. The method encodes accurate texture through a diffusion process, enhancing human perception while preserving the features that matter for machine vision tasks. It combines a Markov palette diffusion model with commonly used feature extractors and image generators, enabling efficient data compression. Collaborative texture-semantic feature extraction and pseudo-label generation allow the approach to learn texture information accurately. A content-adaptive Markov palette diffusion model is then applied to capture both low-level texture and high-level semantic knowledge in a scalable manner. The framework controls the compression ratio by flexibly selecting intermediate diffusion states, so the deep learning model does not need to be retrained for each operating point. Extensive experiments demonstrate the effectiveness of the proposed framework in image reconstruction and in downstream machine vision tasks such as object detection, segmentation, and facial landmark detection. It also achieves superior perceptual quality scores compared with state-of-the-art methods.
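
The idea of controlling the compression ratio by choosing an intermediate state of a Markov chain can be illustrated with a toy sketch. This is not the paper's model: the abstract gives no implementation details, so everything below is an assumption made for illustration. The hypothetical `coarsen` step (merging adjacent palette entries) stands in for one diffusion step, empirical entropy stands in for the bit rate, and the names `diffusion_states` and `select_state` are invented here.

```python
import math
import random
from collections import Counter


def entropy_bits(symbols):
    """Empirical Shannon entropy in bits per symbol (a crude proxy for bit rate)."""
    counts = Counter(symbols)
    n = len(symbols)
    return sum(c / n * math.log2(n / c) for c in counts.values())


def coarsen(indices):
    """One toy Markov step (assumed, not from the paper): merge adjacent palette entries."""
    return [i // 2 for i in indices]


def diffusion_states(indices, steps):
    """Build the chain x_0, x_1, ..., x_steps; each state depends only on the previous one."""
    states = [list(indices)]
    for _ in range(steps):
        states.append(coarsen(states[-1]))
    return states


def select_state(states, budget_bits_per_symbol):
    """Return the index of the most detailed state whose entropy fits the budget."""
    for t, state in enumerate(states):
        if entropy_bits(state) <= budget_bits_per_symbol:
            return t
    return len(states) - 1


random.seed(0)
pixels = [random.randrange(16) for _ in range(4096)]  # toy 16-colour palette indices
states = diffusion_states(pixels, steps=4)

for t, state in enumerate(states):
    print(f"t={t}: {entropy_bits(state):.3f} bits/symbol")

print("selected state:", select_state(states, budget_bits_per_symbol=2.5))
```

In this sketch every operating point is read off the same precomputed chain, so changing the bit budget only changes which state is transmitted and nothing is retrained. That is the property the abstract attributes to selecting intermediate diffusion states.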