
PELSI: Power-Efficient Layer-Switched Inference

IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (2023)

Abstract
Convolutional Neural Networks (CNNs) are now quintessential kernels within embedded computer vision applications deployed on edge devices. Heterogeneous Multi-Processor System-on-Chips (HMPSoCs) with Dynamic Voltage and Frequency Scaling (DVFS)-capable components (CPUs and GPUs) allow for low-latency, low-power CNN inference on resource-constrained edge devices when employed efficiently. CNNs comprise several heterogeneous layer types that execute with different degrees of power efficiency on different HMPSoC components at different frequencies. We propose the first framework, PELSI, that exploits this layer-wise power-efficiency heterogeneity for power-efficient CPU-GPU layer-switched CNN inference on HMPSoCs. PELSI executes each layer of a CNN on an HMPSoC component (CPU or GPU) clocked at just the right frequency for that layer, such that the CNN meets its inference latency target with minimal power consumption while still accounting for the power-performance overhead of repeatedly switching between CPU and GPU mid-inference. PELSI incorporates a Genetic Algorithm (GA) to identify, from within the exponentially large design space, a near-optimal CPU-GPU layer-switched CNN inference configuration that meets the given latency requirement most power-efficiently. We evaluate PELSI on the Rock-Pi embedded platform, which contains an RK3399Pro HMPSoC with DVFS-capable CPU clusters and a GPU. Empirical evaluations with five different CNNs show a 44.48% improvement in power efficiency for CNN inference under PELSI over the state-of-the-art.
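The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frequency tables, the synthetic latency/energy model, the switch overheads, the penalty weight, and every function name (`make_profile`, `evaluate`, `cost`, `search`) are assumptions made for illustration. Each chromosome assigns one (device, frequency) pair to every layer, and the cost is average power plus a penalty for missing the latency target. With six options per layer and eight layers, this toy space already holds 6^8 = 1,679,616 configurations, which is why a heuristic such as a GA is attractive.

```python
import random

# Hypothetical search space; frequencies (MHz) are illustrative, not the RK3399Pro's tables.
FREQS = {"cpu": (600, 1200, 1800), "gpu": (200, 400, 800)}
GENES = [(dev, f) for dev, fs in FREQS.items() for f in fs]
SWITCH_MS, SWITCH_MJ = 0.5, 0.2  # assumed latency/energy cost of one CPU<->GPU switch


def make_profile(num_layers, rng):
    """Build a synthetic per-layer table: (device, freq) -> (latency_ms, energy_mJ)."""
    profile = []
    for _ in range(num_layers):
        work = rng.uniform(1.0, 5.0)  # abstract amount of computation in this layer
        speedup = {"cpu": 1.0, "gpu": rng.uniform(0.5, 3.0)}  # layer-specific GPU benefit
        table = {}
        for dev, f in GENES:
            ghz = f / 1000.0
            latency = work / (ghz * speedup[dev])
            power = 0.3 + ghz ** 3  # toy static + dynamic power in W
            table[(dev, f)] = (latency, power * latency)
        profile.append(table)
    return profile


def evaluate(chrom, profile):
    """Return (total latency in ms, average power in W), including switch overheads."""
    latency = energy = 0.0
    for i, gene in enumerate(chrom):
        lat, en = profile[i][gene]
        latency += lat
        energy += en
        if i > 0 and chrom[i - 1][0] != gene[0]:  # device changed between layers
            latency += SWITCH_MS
            energy += SWITCH_MJ
    return latency, energy / latency


def cost(chrom, profile, target_ms):
    """Average power plus a large penalty for exceeding the latency target."""
    latency, power = evaluate(chrom, profile)
    return power + 1000.0 * max(0.0, latency - target_ms)


def search(profile, target_ms, rng, pop_size=40, generations=60, mutation_rate=0.1):
    """Plain GA: tournament selection, one-point crossover, mutation, elitism.

    Assumes at least two layers so that a crossover point exists.
    """
    n = len(profile)
    baseline = [("gpu", max(FREQS["gpu"]))] * n  # all layers on the GPU at top frequency
    population = [baseline] + [
        [rng.choice(GENES) for _ in range(n)] for _ in range(pop_size - 1)
    ]

    def key(chrom):
        return cost(chrom, profile, target_ms)

    for _ in range(generations):
        population.sort(key=key)
        next_gen = [population[0]]  # elitism: the best configuration always survives
        while len(next_gen) < pop_size:
            a = min(rng.sample(population, 3), key=key)  # tournament of size 3
            b = min(rng.sample(population, 3), key=key)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            child = [rng.choice(GENES) if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    return min(population, key=key)


rng = random.Random(0)
profile = make_profile(8, rng)
best = search(profile, target_ms=30.0, rng=rng)
print(best, evaluate(best, profile))
```

Because the all-GPU baseline is seeded into the initial population and the best chromosome is carried over unchanged each generation, the returned configuration never costs more than that baseline under this toy model. A real deployment would replace `make_profile` with measured per-layer latency and power at each DVFS operating point.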
Key words
Low-Power Design, Edge Computing, Embedded Machine Learning (ML), On-Chip Artificial Intelligence (AI)