MACA: Memory-aware convolution accelerating for CNN inference on edge devices.

Chaoxiong Yi, Songlei Jian, Yusong Tan, Yusen Zhang

International Conference on Computer Supported Cooperative Work in Design (2024)

Abstract
Deep learning inference tasks are moving toward the edge due to latency requirements and privacy concerns. However, edge devices are constrained by power consumption and size, and generally have limited resources. Convolutional neural networks (CNNs), widely used in image processing tasks, contain a large number of convolutional layers, which account for more than 95% of the computation time in most commonly used CNN models. This paper proposes a general convolutional-layer optimization method called MACA: we implement and optimize a variety of convolution operators and design a memory-aware strategy that automatically selects the appropriate operator, without modifying user code. Finally, we integrate MACA into PyTorch and conduct extensive experiments. The results show that when memory resources are sufficient, MACA increases inference speed by 36.10% on average, and when resources are tight, it reduces memory usage by an average of 29.14% while still completing inference. This paper provides an effective solution for deploying deep learning models on resource-constrained edge devices.
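The abstract does not include the paper's implementation; the following is a minimal Python/PyTorch sketch of the memory-aware operator-selection idea it describes. All names here (maca_conv2d, estimate_im2col_bytes, the free_bytes budget parameter) are hypothetical, and F.conv2d merely stands in for a memory-light direct operator: the selector uses an im2col+GEMM operator when its unfold workspace fits the memory budget, and falls back otherwise.

import torch
import torch.nn.functional as F

def estimate_im2col_bytes(x, weight, stride=1, padding=0):
    # im2col materializes an (N, C_in*kH*kW, H_out*W_out) buffer,
    # trading memory for a GEMM-friendly layout.
    n, c_in, h, w = x.shape
    _, _, kh, kw = weight.shape
    h_out = (h + 2 * padding - kh) // stride + 1
    w_out = (w + 2 * padding - kw) // stride + 1
    return n * c_in * kh * kw * h_out * w_out * x.element_size()

def conv_im2col(x, weight, stride=1, padding=0):
    # Fast but memory-hungry: unfold (im2col), then one matrix multiply.
    n, _, h, w = x.shape
    c_out, _, kh, kw = weight.shape
    h_out = (h + 2 * padding - kh) // stride + 1
    w_out = (w + 2 * padding - kw) // stride + 1
    cols = F.unfold(x, (kh, kw), stride=stride, padding=padding)
    out = weight.view(c_out, -1) @ cols  # -> (N, C_out, H_out*W_out)
    return out.view(n, c_out, h_out, w_out)

def maca_conv2d(x, weight, free_bytes, stride=1, padding=0):
    # Memory-aware selection: pick the im2col operator only when its
    # workspace fits the remaining budget; otherwise fall back to a
    # memory-light operator (F.conv2d stands in for one here).
    if estimate_im2col_bytes(x, weight, stride, padding) < free_bytes:
        return conv_im2col(x, weight, stride, padding)
    return F.conv2d(x, weight, stride=stride, padding=padding)

x = torch.randn(1, 3, 32, 32)
w = torch.randn(16, 3, 3, 3)
y = maca_conv2d(x, w, free_bytes=64 * 1024 * 1024, padding=1)
print(y.shape)  # torch.Size([1, 16, 32, 32])

A real selector would profile each operator's speed and workspace size per layer ahead of time, which is consistent with the abstract's claim of choosing operators automatically without modifying user code.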
Key words
Deep learning, Edge intelligence, Inference acceleration, Convolution, Memory-aware