Evaluation of architecture-aware optimization techniques for Convolutional Neural Networks

2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), 2023

Abstract
The growing need to perform neural network inference with low latency has given rise to a broad spectrum of heterogeneous devices with deep learning capabilities. Consequently, obtaining the best performance from each device and choosing the most suitable platform for a given problem has become challenging. This paper evaluates multiple inference platforms using architecture-aware optimizations for convolutional neural networks. Specifically, we apply hardware optimizations with the TensorRT and OpenVINO frameworks on top of the platform-aware NetAdapt algorithm. The experimental evaluation shows that on MobileNet and AlexNet, combining NetAdapt with TensorRT or OpenVINO improves latency by up to 10x and 5.3x, respectively. Moreover, a throughput test using different batch sizes showed variable performance improvement across the devices. The discussion of the experimental results can guide the selection of devices and optimizations for different AI solutions.
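The batch-size throughput test mentioned in the abstract can be sketched as follows. This is a minimal, hypothetical harness: `dummy_infer` and its latency model are stand-ins for a real compiled model (e.g. a TensorRT engine or an OpenVINO compiled model), not the paper's actual experimental setup.

```python
import time

def measure_throughput(infer, batch_sizes, runs=5):
    """Return samples/sec for each batch size, given an inference callable
    that runs one forward pass on a batch of the requested size."""
    results = {}
    for bs in batch_sizes:
        start = time.perf_counter()
        for _ in range(runs):
            infer(bs)
        elapsed = time.perf_counter() - start
        results[bs] = (bs * runs) / elapsed
    return results

# Hypothetical stand-in for a real compiled model: fixed per-call overhead
# plus a per-sample cost, so larger batches amortize the overhead.
def dummy_infer(batch_size):
    time.sleep(0.001 + 0.0005 * batch_size)

print(measure_throughput(dummy_infer, [1, 8, 32]))
```

With this latency model, throughput rises with batch size because the fixed per-call overhead is amortized over more samples; on real devices the curve can flatten or regress once memory or compute limits are hit, which is the variability the paper reports.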
Keywords
Deep learning, Edge computing, Efficient computing, Neural network optimizations, Heterogeneous computing, NetAdapt