EdgeNN: Efficient Neural Network Inference for CPU-GPU Integrated Edge Devices

ICDE (2023)

Abstract
With the development of edge architectures and the growth of AIoT application requirements, data processing at the edge has become popular. Neural network inference is widely employed for data analytics on edge devices. This paper extensively explores neural network inference on integrated edge devices and proposes EdgeNN, the first neural network inference solution for CPU-GPU integrated edge devices. EdgeNN has three novel characteristics. First, EdgeNN adaptively utilizes the unified physical memory and applies zero-copy optimization. Second, EdgeNN introduces a novel inference-targeted inter- and intra-kernel CPU-GPU hybrid execution approach, which co-runs the CPU with the GPU to fully utilize the edge device's computing resources. Third, EdgeNN adopts a fine-grained adaptive inference tuning approach, which divides the complicated inference structure into sub-tasks mapped to the CPU and the GPU. Experiments on six popular neural network inference tasks show that EdgeNN achieves average speedups of 3.97×, 3.12×, and 8.80× over inference on the CPU of the integrated device, on a mobile phone CPU, and on an edge CPU device, respectively. Additionally, it reduces execution time by 22.02% compared with direct execution of the original programs: 9.93% comes from better utilization of unified memory, and 10.76% from CPU-GPU hybrid execution. Moreover, EdgeNN delivers 29.14× and 5.70× higher energy efficiency than the edge CPU and the discrete GPU, respectively. We have made EdgeNN available at https://github.com/ChenyangZhang-cs/EdgeNN.
Keywords
complicated inference structure, CPU-GPU integrated edge devices, edge CPU, edge device, EdgeNN, efficient neural network inference, fine-grained adaptive inference tuning approach, integrated device, inter- and intra-kernel CPU-GPU hybrid execution approach, mobile phone CPU, neural network inference solution, popular neural network inference tasks