Benchmarking TensorFlow Lite Quantization Algorithms for Deep Neural Networks
2022 IEEE 16th International Symposium on Applied Computational Intelligence and Informatics (SACI)(2022)
Abstract
Deploying deep neural network models on resource-constrained devices, e.g., low-cost microcontrollers, is challenging because such devices are limited in memory footprint and computation capability. Quantization is one of the most widely used techniques to reduce model size: parameters are represented with, for example, 8-bit integers or less instead of 32-bit floating-point values. The TensorFlow Lite deep learning framework currently provides four methods for post-training quantization. The aim of this paper is to benchmark these quantization methods using deep neural models of various sizes. The main outcomes of the paper are: (1) the compression ratio obtained with each quantization method for small, medium, and large models, (2) a comparison of the resulting accuracy relative to the original accuracy, and (3) guidance for choosing a quantization method depending on model size.
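The 8-bit representation mentioned above can be illustrated with the standard affine (scale and zero-point) quantization mapping that TensorFlow Lite's 8-bit schemes are based on. This is a minimal NumPy sketch of the general idea, not code from the paper; the function names are illustrative:

```python
import numpy as np

def quantize_int8(x):
    # Affine (asymmetric) quantization: map float32 values to int8
    # via a scale and a zero-point. The representable range must
    # include 0.0 so that zero is exactly representable.
    qmin, qmax = -128, 127
    x_min, x_max = min(float(x.min()), 0.0), max(float(x.max()), 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # Recover an approximation of the original float values.
    return scale * (q.astype(np.float32) - zero_point)

weights = np.array([-1.5, -0.2, 0.0, 0.7, 1.5], dtype=np.float32)
q, scale, zp = quantize_int8(weights)
approx = dequantize(q, scale, zp)
# Each weight now occupies 1 byte instead of 4 (roughly 4x compression),
# at the cost of a small reconstruction error bounded by the scale.
```

The compression ratios benchmarked in the paper come from exactly this kind of trade-off: fewer bits per parameter versus a bounded loss of precision.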
Keywords
Deep Neural Networks, TensorFlow Lite, Post-Training Quantization