PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution
CVPR 2024
Abstract
Recently, some large kernel ConvNets have struck back with appealing performance
and efficiency. However, given the quadratic complexity of convolution, scaling up
kernels can bring about an enormous number of parameters, and the proliferated
parameters can induce severe optimization problems. Due to these issues, current
CNNs compromise by scaling up to 51x51 in the form of stripe convolution (i.e.,
51x5 + 5x51) and start to saturate as the kernel size continues to grow. In
this paper, we delve into addressing these vital issues and explore whether we
can continue scaling up kernels for more performance gains. Inspired by human
vision, we propose a human-like peripheral convolution that efficiently reduces
over 90% of the parameters of dense grid convolution
and manages to scale up the kernel size to an extremely large range. Our peripheral
convolution behaves highly similarly to human vision, reducing the parameter
complexity of convolution from O(K^2) to O(logK) without degrading performance. Built on
this, we propose Parameter-efficient Large Kernel Network (PeLK). Our PeLK
outperforms modern vision Transformers and ConvNet architectures like Swin,
ConvNeXt, RepLKNet and SLaK on various vision tasks including ImageNet
classification, semantic segmentation on ADE20K and object detection on MS
COCO. For the first time, we successfully scale up the kernel size of CNNs to
an unprecedented 101x101 and demonstrate consistent improvements.
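The parameter-sharing idea behind peripheral convolution can be illustrated with a small counting sketch: keep individual weights in a small central region of the kernel, and let peripheral cells share weights in blocks whose width grows exponentially with distance from the center. The bucket rule and `center` size below are hypothetical choices for illustration, not the paper's exact partition scheme; the sketch only shows why the number of distinct parameters per axis grows roughly logarithmically in K instead of linearly.

```python
import math

def axis_buckets(K, center=5):
    """Map each of the K positions along one kernel axis to a shared-weight
    bucket id. Positions inside the central region keep individual weights;
    peripheral positions are grouped into blocks whose width grows
    exponentially with distance from the center. (Hypothetical partition,
    for illustration only.)"""
    half = K // 2
    r = center // 2
    ids = []
    for p in range(-half, half + 1):
        d = abs(p)
        if d <= r:
            # fine-grained center: one parameter per cell
            ids.append(('center', p))
        else:
            # periphery: cells at distance d share a block chosen by log2
            ids.append(('periph', p > 0, int(math.log2(d - r + 1))))
    return ids

def param_count(K, center=5):
    """Distinct parameters of a K x K kernel under this sharing scheme:
    the 2-D sharing grid is the product of two 1-D partitions, so the
    count is (distinct axis buckets) squared."""
    n = len(set(axis_buckets(K, center)))
    return n * n

# For K = 101 with a 5-cell fine-grained center, this sketch gives
# 15 buckets per axis, i.e. 225 shared parameters versus 10201 for a
# dense 101x101 kernel (a reduction of over 97%).
print(param_count(101), 101 * 101)
```

Under these assumptions the distinct-parameter count shrinks by well over 90%, consistent in spirit with the reduction the abstract reports for dense grid convolution.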