Channel pruning based on convolutional neural network sensitivity

Chenbin Yang, Huiyi Liu

Neurocomputing (2022)

Abstract
Pruning is a useful technique for decreasing the memory consumption and floating-point operations (FLOPs) of deep convolutional neural network (CNN) models. Nevertheless, even at modest pruning rates, current structured pruning approaches often lead to considerable declines in accuracy. Furthermore, existing approaches often treat pruning rates as hyperparameters, neglecting the differing sensitivities of the convolution layers. In this study, we propose a novel sensitivity-based method for channel pruning that uses second-order sensitivity as its criterion. The essential idea is to prune insensitive filters while retaining sensitive ones. We quantify the sensitivity of a filter as the sum of the sensitivities of all weights in the filter, rather than using the magnitude-based metrics common in the literature. Furthermore, a layer-sensitivity measure based on the Hessian eigenvalues of each layer is used to automatically choose the most appropriate pruning rate for each layer. Experiments on a variety of modern CNN architectures demonstrate that the pruning rate can be increased considerably while sacrificing only a small amount of accuracy, reducing FLOPs by more than 60% on CIFAR-10. Notably, on ImageNet, pruning ResNet50 reduced FLOPs by 56.3% while losing only 0.92% accuracy.
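The abstract does not give the exact sensitivity formula, so the sketch below is only a minimal illustration of the stated idea: score each output filter (channel) by summing per-weight second-order sensitivities over that filter. It approximates the Hessian diagonal with the empirical Fisher (accumulated squared gradients) and uses the second-order Taylor term 0.5 * H_ii * w_i^2 as the per-weight score. The function name conv_filter_sensitivities, the PyTorch framework, and the Fisher approximation are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_filter_sensitivities(model, data_loader, device="cpu"):
    """Return {conv_layer_name: per-filter sensitivity scores} for every Conv2d layer."""
    model.to(device).train()
    # Accumulate squared gradients as a diagonal-Hessian (empirical Fisher) surrogate.
    sq_grads = {name: torch.zeros_like(p) for name, p in model.named_parameters()}
    for inputs, targets in data_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        model.zero_grad()
        loss = F.cross_entropy(model(inputs), targets)
        loss.backward()
        for name, p in model.named_parameters():
            if p.grad is not None:
                sq_grads[name] += p.grad.detach() ** 2

    scores = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            w = module.weight.detach()                    # shape (out_ch, in_ch, kH, kW)
            h_diag = sq_grads[name + ".weight"]           # diagonal curvature estimate
            per_weight = 0.5 * h_diag * w ** 2            # second-order Taylor term per weight
            scores[name] = per_weight.sum(dim=(1, 2, 3))  # sum over each filter's weights
    return scores
```

In such a sketch, the filters with the lowest scores in each layer would be treated as insensitive and removed, with the number removed per layer set by that layer's pruning rate (in the paper, chosen automatically from the layer's Hessian eigenvalues), followed by fine-tuning.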
Keywords
Convolutional neural network, Channel pruning, Sensitivity, Pruning rate