Benchmarking TensorFlow Lite Quantization Algorithms for Deep Neural Networks

2022 IEEE 16th International Symposium on Applied Computational Intelligence and Informatics (SACI), 2022

Abstract
Deploying deep neural network models on resource-constrained devices, e.g., low-cost microcontrollers, is challenging because such devices are limited in memory footprint and computation capability. Quantization is one of the most widely used techniques for reducing model size: it represents parameters with, for example, 8-bit integers or less instead of 32-bit floating point. The TensorFlow Lite deep learning framework currently provides four methods for post-training quantization. The aim of this paper is to benchmark these quantization methods using deep neural models of various sizes. The main outcomes of the paper are: (1) the compression ratio obtained by each quantization method for small, medium, and large deep neural models, (2) a comparison of the resulting accuracy relative to the original accuracy, and (3) guidance for choosing a quantization method depending on model size.
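To make the size/accuracy trade-off concrete, the following is a minimal sketch of affine (asymmetric) integer quantization, the basic idea behind replacing 32-bit floats with 8-bit integers. It is an illustration of the concept only, not TensorFlow Lite's actual post-training quantization algorithms; the function names and example values are invented for this sketch.

```python
# Sketch of affine 8-bit quantization: map float32 values to integers in
# [0, 255] via a scale and zero point, then dequantize to approximate the
# originals. Storing 8-bit integers instead of 32-bit floats gives the
# roughly 4x compression that motivates post-training quantization.

def quantize(values, num_bits=8):
    """Quantize a list of floats to integers in [0, 2**num_bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale for constants
    zero_point = round(qmin - lo / scale)     # integer that represents 0.0
    q = [min(qmax, max(qmin, round(v / scale + zero_point))) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(qi - zero_point) * scale for qi in q]

# Hypothetical weight values for illustration.
weights = [-1.5, -0.2, 0.0, 0.7, 2.1]
q, scale, zp = quantize(weights)
approx = dequantize(q, scale, zp)
max_err = max(abs(a - b) for a, b in zip(weights, approx))
print(q, max_err)
```

The quantization error (here bounded by half the scale step) is the source of the accuracy loss that the paper benchmarks across model sizes.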
Keywords
Deep Neural Networks,TensorFlow Lite,Post-Training Quantization