A Practical Framework for Designing and Deploying Tiny Deep Neural Networks on Microcontrollers.

Brenda Zhuang, Danilo Pau

IEEE International Conference on Consumer Electronics (2024)

Abstract
For many applications, Deep Neural Networks (DNNs) trained on powerful CPUs and GPUs are expected to perform inference efficiently on tiny devices. However, productively deploying unconstrained, complex models to microcontrollers (MCUs) remains a time-consuming task. This paper presents a comprehensive methodology that combines advanced optimization techniques for hyperparameter search, model compression, and deployability evaluation using benchmark data. MCUs typically have low-power processors and limited embedded RAM and flash storage, offering orders of magnitude fewer computational resources than cloud assets. Designing DNNs for such platforms requires effective strategies to balance high accuracy with low memory usage and inference latency. To address this challenge, Bayesian optimization, a powerful complexity-bounded technique, is applied to hyperparameter tuning in order to select tiny model architecture candidates. Several pruning and quantization methods are then used to compress the models, and their numerical performance after compression is evaluated. Additionally, cloud-based deployment tools are used to iteratively validate on-device memory and latency on off-the-shelf MCUs. By evaluating the benchmarks against the stringent requirements of tiny edge devices, practical insights into these models are obtained. Multiple image classification applications are deployed on a variety of STM32 MCUs. The practical framework: a) maintains top-1 classification accuracy within a tolerance of the floating-point network after compression; b) reduces memory footprint by at least 4 times; c) reduces inference runtime significantly by avoiding external RAM usage; and d) is adaptable to many different applications.
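The abstract describes a pipeline of complexity-bounded Bayesian hyperparameter search followed by compression. As a purely illustrative companion (not the authors' toolchain), the sketch below shows one common way to realize these two stages in Python, assuming KerasTuner for the Bayesian search and the TensorFlow Lite converter for post-training int8 quantization; the dataset variables x_train/y_train, the hyperparameter ranges, and the trial budget are placeholders.

```python
import tensorflow as tf
import keras_tuner as kt

# x_train, y_train: placeholder training data (e.g. 32x32 RGB images, 10 classes).

# Search space for a tiny CNN: depth and per-block filter counts are kept
# small so that candidate models stay within MCU-class flash/RAM budgets.
def build_model(hp):
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(32, 32, 3)))
    for i in range(hp.Int("conv_blocks", 2, 4)):
        model.add(tf.keras.layers.Conv2D(
            hp.Int(f"filters_{i}", 8, 48, step=8), 3,
            padding="same", activation="relu"))
        model.add(tf.keras.layers.MaxPooling2D())
    model.add(tf.keras.layers.GlobalAveragePooling2D())
    model.add(tf.keras.layers.Dense(10, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Bayesian optimization over the search space (placeholder trial budget).
tuner = kt.BayesianOptimization(build_model,
                                objective="val_accuracy",
                                max_trials=20,
                                directory="tuning",
                                project_name="tiny_cnn")
tuner.search(x_train, y_train, validation_split=0.2, epochs=10)
best_model = tuner.get_best_models(num_models=1)[0]

# Post-training full-integer quantization: int8 weights and activations
# typically cut the flash footprint of a float32 model by roughly 4x.
def representative_data():
    for sample in x_train[:100]:
        yield [sample[None, ...].astype("float32")]

converter = tf.lite.TFLiteConverter.from_keras_model(best_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()

with open("tiny_cnn_int8.tflite", "wb") as f:
    f.write(tflite_model)
print(f"Quantized model size: {len(tflite_model) / 1024:.1f} KiB")
```

In the workflow the abstract describes, the compressed network would then be validated iteratively for on-device memory footprint and latency on STM32 MCUs using cloud-based deployment tooling, rather than the generic export shown here.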
Keywords
Deep learning, deployment, hyper-parameter optimization, Bayesian optimization, model compression, microcontrollers, resource constraints