
SkaNet: Split Kernel Attention Network

Lipeng Chen, Daixi Jia, Hang Gao, Fengge Wu, Junsuo Zhao

Lecture Notes in Computer Science (2023)

Abstract
Recently, convolutional neural networks (CNNs) and vision transformers (ViTs) have shown impressive results in the area of light-weight models for edge devices. However, the dominant CNN and ViT architectures rely heavily on a structured grid or sequence representation of images, which can result in inflexible handling of complex or irregular objects within them. In this paper, we propose SkaNet, an innovative, high-performance hybrid architecture that synergistically integrates the benefits of both CNNs and ViTs, and further enhances these advantages with graph representation learning. Specifically, in SkaNet we introduce a novel linear attention named split kernel attention (SKA) that exploits graph convolution to capture global semantic information and enable flexible recognition of irregular objects, adaptively splits input tensors into multiple channel groups, and fuses these modules into linear attention to efficiently aggregate contextual information. Extensive experiments demonstrate that SkaNet outperforms popular light-weight CNN- and ViT-based models on common vision tasks and datasets. For classification on ImageNet-1k, SkaNet-S, with 5.5M parameters, achieves an impressive top-1 accuracy of 79.5%, surpassing MobileViT-S by an absolute gain of 1.1%. Furthermore, SkaNet-S exhibits superior performance in semantic segmentation on PASCAL VOC 2012 and object detection on COCO 2017. Our source code is available on GitHub at: https://github.com/charryglomc/skanet.
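The abstract describes SKA as splitting input channels into groups and fusing graph convolution with linear attention, but gives no implementation details. The NumPy sketch below is therefore purely illustrative: the group routing (alternating graph-conv and attention groups), the kernel feature map phi(z) = elu(z) + 1, the shared Q/K projection, and the ring-graph adjacency are all assumptions, not the paper's actual design.

```python
import numpy as np

def linear_attention(x):
    # x: (N, d) tokens. Kernelized linear attention with phi(z) = elu(z) + 1,
    # computed in O(N * d^2) rather than the O(N^2 * d) of softmax attention.
    phi = lambda z: np.where(z > 0, z + 1.0, np.exp(z))  # elu(z) + 1
    q, k, v = phi(x), phi(x), x                          # shared projection (assumption)
    kv = k.T @ v                                         # (d, d) global context
    z = q @ k.sum(axis=0)                                # (N,) normalizer
    return (q @ kv) / z[:, None]

def graph_conv(x, adj):
    # One step of degree-normalized graph aggregation:
    # each token is averaged with its graph neighbors.
    deg = adj.sum(axis=1, keepdims=True)
    return (adj @ x) / np.maximum(deg, 1.0)

def ska_block(x, adj, groups=4):
    # Split channels into groups; route alternate groups through graph
    # convolution (irregular/global context) and linear attention, then
    # concatenate. The alternating routing is a hypothetical choice.
    chunks = np.split(x, groups, axis=1)
    out = [graph_conv(c, adj) if i % 2 == 0 else linear_attention(c)
           for i, c in enumerate(chunks)]
    return np.concatenate(out, axis=1)

# Toy usage: 6 tokens with 8 channels on a ring graph.
rng = np.random.default_rng(0)
x = rng.standard_normal((6, 8))
adj = np.eye(6) + np.roll(np.eye(6), 1, axis=1) + np.roll(np.eye(6), -1, axis=1)
y = ska_block(x, adj, groups=4)
print(y.shape)  # (6, 8): channel count is preserved
```

Because the attention term is linear in the number of tokens, this kind of block keeps the cost profile suitable for the light-weight, edge-device setting the paper targets.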
Key words
Convolutional Neural Network, Attention, Graph Representation Learning, Light-weight