PORTEND: A Joint Performance Model for Partitioned Early-Exiting DNNs.

International Conference on Parallel and Distributed Systems (2023)

Abstract
The computation and storage requirements of Deep Neural Networks (DNNs) make them challenging to deploy on edge devices, which often have limited resources. Conversely, offloading DNNs to cloud servers incurs high communication overheads. Partitioning and early exiting are attractive solutions for reducing computational costs and improving inference speed. However, current work often addresses these approaches separately and/or ignores common communication intricacies on edge networks, such as (de)serialization and data transmission overheads. We present PORTEND, a novel performance model that jointly optimizes partitioning, early exiting, and placement across multi-tier networks. PORTEND's joint approach outperforms state-of-the-art solutions in edge computing setups, reducing DNN inference latency by 29%.
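To illustrate the kind of joint optimization the abstract describes, the sketch below enumerates candidate partition points and tier placements and picks the configuration with the lowest estimated end-to-end latency, where the estimate accounts for per-tier compute, data transmission, (de)serialization, and the workload reduction from early exits. This is a minimal assumption-laden sketch, not PORTEND's actual model or code; all names, cost terms, and numbers are hypothetical.

```python
# Hypothetical brute-force search over (partition point, placement) configurations.
# Cost terms and values below are illustrative assumptions, not PORTEND's model.
from itertools import product

COMPUTE_MS = {                       # assumed per-layer compute time (ms) on each tier
    "edge":  [4.0, 6.0, 9.0, 12.0],
    "cloud": [0.5, 0.8, 1.2, 1.6],
}
OUTPUT_KB = [64, 32, 16, 8]          # assumed intermediate tensor size after each layer
BANDWIDTH_KBPS = {("edge", "cloud"): 2_000}   # assumed uplink bandwidth (KB/s)
SERDE_MS_PER_KB = 0.05               # assumed (de)serialization cost per KB
EXIT_PROB = [0.0, 0.4, 0.7, 1.0]     # assumed cumulative probability of exiting by layer i

def expected_latency(split, placements):
    """Estimate expected latency for a split point and (front, back) tier placement."""
    latency, done = 0.0, 0.0
    for layer in range(len(OUTPUT_KB)):
        tier = placements[0] if layer < split else placements[1]
        remaining = 1.0 - done                    # fraction of inputs still in flight
        latency += remaining * COMPUTE_MS[tier][layer]
        if layer == split - 1 and placements[0] != placements[1]:
            # crossing the network boundary: transfer plus (de)serialization on both ends
            kb = OUTPUT_KB[layer]
            latency += remaining * (kb / BANDWIDTH_KBPS[("edge", "cloud")] * 1000
                                    + 2 * SERDE_MS_PER_KB * kb)
        done = EXIT_PROB[layer]                   # early exits drain the remaining workload
    return latency

candidates = product(range(1, len(OUTPUT_KB) + 1),
                     [("edge", "cloud"), ("edge", "edge")])
best = min(candidates, key=lambda cfg: expected_latency(*cfg))
print("best (split, placement):", best, "->", round(expected_latency(*best), 2), "ms")
```

The sketch only searches a two-tier edge/cloud space; the joint formulation in the paper additionally covers multi-tier placement, which would enlarge the candidate set but leave the structure of the search unchanged.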
Keywords
DNN partitioning, early exit, performance model