A Modular System for Enhanced Robustness of Multimedia Understanding Networks via Deep Parametric Estimation
CoRR(2024)
Abstract
In multimedia understanding tasks, corrupted samples pose a critical
challenge, because when fed to machine learning models they lead to performance
degradation. In the past, three groups of approaches have been proposed to
handle noisy data: i) enhancer and denoiser modules to improve the quality of
the noisy data, ii) data augmentation approaches, and iii) domain adaptation
strategies. All the aforementioned approaches come with drawbacks that limit
their applicability; the first has high computational costs and requires pairs
of clean-corrupted data for training, while the others only allow deployment of
the same task/network they were trained on (i.e., when upstream and downstream
task/network are the same). In this paper, we propose SyMPIE to solve these
shortcomings. To this end, we design a small, modular, and efficient (just
2GFLOPs to process a Full HD image) system to enhance input data for robust
downstream multimedia understanding with minimal computational cost. Our SyMPIE
is pre-trained on an upstream task/network that need not match the downstream
ones and does not need paired clean-corrupted samples. Our key insight is that
most input corruptions found in real-world tasks can be modeled through global
operations on color channels of images or spatial filters with small kernels.
We validate our approach on multiple datasets and tasks, such as image
classification (on ImageNetC, ImageNetC-Bar, VizWiz, and a newly proposed mixed
corruption benchmark named ImageNetC-mixed) and semantic segmentation (on
Cityscapes, ACDC, and DarkZurich), with consistent relative accuracy gains of
about 5% across the board. The code of our approach and the new
ImageNetC-mixed benchmark will be made available upon publication.
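
The key insight stated in the abstract, that many corruptions can be modeled as a global operation on color channels or as a spatial filter with a small kernel, can be illustrated with a minimal NumPy sketch. This is not the SyMPIE implementation: the function names, the contrast factor, and the 3x3 box blur are hypothetical choices made for illustration, and the parameters here are fixed by hand, whereas the title suggests the actual system estimates them with a network.

```python
import numpy as np

def apply_color_transform(image, matrix, bias):
    """Global affine map on color channels: out = M @ rgb + b at every pixel.

    image: (H, W, 3) float array in [0, 1]; matrix: (3, 3); bias: (3,).
    """
    return np.clip(image @ matrix.T + bias, 0.0, 1.0)

def apply_small_kernel(image, kernel):
    """Filter every channel with one small odd-sized 2-D kernel.

    Uses edge padding so the output keeps the input's spatial size.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw), (0, 0)), mode="edge")
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    # Accumulate one shifted, weighted copy of the image per kernel tap.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + h, j:j + w, :]
    return out

# Hypothetical example: a low-contrast, blurred version of a random image.
rng = np.random.default_rng(0)
clean = rng.random((8, 8, 3))

contrast = 0.5                            # assumed corruption strength
M = contrast * np.eye(3)                  # scale every channel equally
b = np.full(3, 0.5 * (1.0 - contrast))    # re-center around mid-gray
blur = np.full((3, 3), 1.0 / 9.0)         # 3x3 box blur

corrupted = apply_small_kernel(apply_color_transform(clean, M, b), blur)

# The color part is an invertible affine map, so it can be undone exactly
# when M and b are known.
M_inv = np.linalg.inv(M)
restored = apply_color_transform(
    apply_color_transform(clean, M, b), M_inv, -M_inv @ b
)
```

In this sketch the whole corruption is described by 12 color parameters plus 9 kernel weights, independent of image resolution, which is one way to see why a parametric model of this kind can stay computationally cheap.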