Context-Aware Deep Model Compression for Edge Cloud Computing

2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS), 2020

Abstract
While deep neural networks (DNNs) have led to a paradigm shift, their exorbitant computational requirements have always been a roadblock to deployment on edge devices such as wearables and smartphones. Hybrid edge-cloud computing frameworks have therefore been proposed to offload part of the computation to the cloud by naively partitioning the DNN operations under the assumption of a constant network condition. However, the real-world network state varies greatly depending on the context, and DNN partitioning alone offers only a limited strategy space. In this paper, we exploit the structural flexibility of DNNs to fit the edge model to varying network contexts and different deployment platforms. Specifically, we design a reinforcement learning-based decision engine that searches for model transformation strategies against a combined objective of model accuracy and computation latency. The engine generates a context-aware model tree, so that at runtime the DNN can decide which model branch to switch to. Emulation and field experiments show that our approach achieves a 30%-50% latency reduction while retaining model accuracy.
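The runtime branch-switching idea described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the branch attributes, the linear latency model (edge compute time plus transfer time of the intermediate tensor), and the trade-off weight `LAMBDA` are all assumptions made for the example.

```python
# Hypothetical sketch: pick a branch of a context-aware model tree at
# runtime from the measured network bandwidth. All numbers below are
# illustrative assumptions, not values from the paper.

LAMBDA = 0.1  # assumed accuracy/latency trade-off weight


def estimate_latency_ms(branch, bandwidth_mbps):
    """Edge compute time plus transfer time for the intermediate tensor."""
    transfer_ms = branch["transfer_kb"] * 8 / bandwidth_mbps
    return branch["edge_ms"] + transfer_ms


def select_branch(model_tree, bandwidth_mbps):
    """Return the branch maximizing accuracy minus a weighted latency penalty."""
    def reward(branch):
        latency = estimate_latency_ms(branch, bandwidth_mbps)
        return branch["accuracy"] - LAMBDA * latency / 100
    return max(model_tree, key=reward)


# A toy model tree: one full model and two compressed variants.
model_tree = [
    {"name": "full",   "accuracy": 0.92, "edge_ms": 80, "transfer_kb": 300},
    {"name": "pruned", "accuracy": 0.89, "edge_ms": 35, "transfer_kb": 120},
    {"name": "tiny",   "accuracy": 0.84, "edge_ms": 12, "transfer_kb": 40},
]

# Under a fast network a heavier, more accurate branch wins;
# under a slow network the engine switches to a lighter branch.
print(select_branch(model_tree, bandwidth_mbps=50.0)["name"])  # pruned
print(select_branch(model_tree, bandwidth_mbps=1.0)["name"])   # tiny
```

In the paper this decision is produced by a learned policy rather than a hand-written reward over a fixed latency model; the sketch only shows why a single static partition cannot serve both network contexts.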
Key words
Edge Cloud Computing, Neural Architecture Search, Reinforcement Learning