
SwinWave-SR: Multi-scale lightweight underwater image super-resolution

INFORMATION FUSION (2024)

Abstract
The resource-limited nature of underwater vision equipment leads to poor-quality, low-resolution imagery that degrades downstream underwater robotics and ocean engineering tasks. Underwater Image Enhancement (UIE) methods, particularly Super-Resolution (SR), have emerged to tackle this challenge by restoring a low-resolution image to a high-quality counterpart. Vision Transformers (ViTs) have recently been employed for SR tasks thanks to their superior performance over mainstream convolutional neural networks. The success of ViTs is largely due to their self-attention mechanism; however, they may struggle with the severe and unpredictable degradation found in underwater imaging. In contrast, Multi-scale ViT (MViT) variants such as the Swin transformer have overcome that challenge by preserving long-range dependencies over multi-scale feature hierarchies through evolving channel capacity. MViTs tend to achieve spatial efficiency through classical down-sampling, such as average pooling over keys/values, which results in an inevitable loss of high-frequency components. To address this shortcoming, we propose a novel algorithm, SwinWave-SR, for efficient and accurate multi-scale SR of underwater images. The proposed algorithm is based on the Swin transformer and incorporates a wavelet block that limits information loss by down-sampling in an invertible fashion. Consequently, the key components are preserved to assist self-attention learning while its computational cost is reduced at the same time. To complement this, we explore a prominent compression regime, namely the Lottery Ticket Hypothesis (LTH), to discover a lightweight sub-network whose performance is competitive with that of the original model while reducing computational costs by up to 70.44%.
Overall, SwinWave-SR improves peak signal-to-noise ratio (PSNR) by 0.95 dB to 2.23 dB compared to the state-of-the-art SwinIR while reducing the number of parameters by 29.56% and the computational cost by 18.734%. Experimental results show that the proposed SwinWave-SR method outperforms state-of-the-art SR methods on four benchmark underwater datasets and significantly improves PSNR and structural similarity index (SSIM).
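The contrast the abstract draws between average pooling and invertible wavelet down-sampling can be illustrated with a small NumPy sketch. This is not the paper's wavelet block: the abstract does not name the wavelet family, so a one-level 2D Haar transform is assumed here, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical. The sketch only shows that the four half-resolution sub-bands allow exact reconstruction, whereas average pooling keeps just a scaled copy of the low-frequency band.

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2D Haar transform of an (H, W) array with even H and W.

    Returns four half-resolution sub-bands (ll, lh, hl, hh).
    Assumption: Haar is used purely for illustration.
    """
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency band
    lh = (a - b + c - d) / 2.0  # differences across columns
    hl = (a + b - c - d) / 2.0  # differences across rows
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution array."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

rng = np.random.default_rng(0)
image = rng.random((8, 8))

ll, lh, hl, hh = haar_dwt2(image)
reconstructed = haar_idwt2(ll, lh, hl, hh)

# 2x2 average pooling equals the low-frequency band up to a factor of 2,
# so pooling alone discards the lh, hl and hh detail bands.
avg_pooled = image.reshape(4, 2, 4, 2).mean(axis=(1, 3))
```

In this sketch the spatial size fed to a later stage is halved in both cases, but only the wavelet route retains the high-frequency detail (in `lh`, `hl`, `hh`) that pooling drops.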
Keywords
Vision transformer (ViT), Wavelet transform, Super-resolution, Underwater Image Enhancement (UIE), Lightweight model