Principal Component Networks: Utilizing Low-Rank Activation Structure to Reduce Parameters Early in Training

ACM / IMS Journal of Data Science (2024)

Abstract
Recent works show that overparameterized neural networks contain small subnetworks that exhibit comparable accuracy to the full model when trained in isolation. These results highlight the potential to reduce the computational costs of deep neural network training without sacrificing generalization performance. Initial approaches for finding these small networks relied on expensive multi-round train-and-prune procedures, limiting their practical potential, but more recent work identifies subnetworks using structured pruning techniques early in training. In this article, we study network activations, rather than network weights, and find that hidden layer activations in overparameterized networks exist primarily in subspaces smaller than the actual model width. We further notice that these subspaces can be identified early in training. Based on these observations, we show how to efficiently find small networks that exhibit similar accuracy to their overparameterized counterparts after only a few training epochs. We term these network architectures Principal Component Networks (PCNs). PCNs compress individual layers by retaining only the high variance linear combinations of channels—defined by the principal components of the layer inputs—a key difference from structured pruning techniques which focus on individual channel pruning using localized channel measurements. We evaluate PCNs on CIFAR-10 and ImageNet for VGG and ResNet style architectures and compare against existing methods for subnetwork identification during early training. We find that PCNs consistently reduce parameter counts with little accuracy loss, thus providing the potential to reduce the computational costs of deep neural network training. Beyond model compression, we also connect our observation regarding hidden layer activations to the feature representations learned by neural networks and discuss areas for future work.
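The abstract describes compressing individual layers by keeping only the high-variance linear combinations of channels, defined by the principal components of the layer inputs. As a loose illustration of that idea only (not the authors' implementation, which targets convolutional architectures and operates during training), the sketch below compresses a single fully connected layer: it estimates the top-k principal directions of the observed activations and folds the projection into a smaller weight matrix. The helper names `pca_basis` and `compress_linear`, and the specific fold-into-weights construction, are assumptions made for this example.

```python
# Hypothetical sketch: project a linear layer's inputs onto the top-k principal
# components of observed activations, then fold the projection into the weights.
import torch

def pca_basis(activations: torch.Tensor, k: int):
    """Return the mean and top-k principal directions of (n_samples, d) activations."""
    mean = activations.mean(dim=0)
    centered = activations - mean
    # Rows of vh are right singular vectors = principal directions.
    _, _, vh = torch.linalg.svd(centered, full_matrices=False)
    return mean, vh[:k].T                            # shapes: (d,), (d, k)

def compress_linear(layer: torch.nn.Linear, activations: torch.Tensor, k: int):
    """Replace `layer` (d -> out) with a projection onto k principal directions
    followed by a smaller linear layer (k -> out), assuming the layer's inputs
    lie near a k-dimensional subspace."""
    mean, basis = pca_basis(activations, k)          # basis: (d, k)
    small = torch.nn.Linear(k, layer.out_features)
    with torch.no_grad():
        small.weight.copy_(layer.weight @ basis)     # (out, k)
        small.bias.copy_(layer.bias + layer.weight @ mean)
    def forward(x):
        # Project inputs onto the retained subspace, then apply the small layer.
        return small((x - mean) @ basis)
    return small, forward

if __name__ == "__main__":
    # Toy usage: inputs that genuinely lie in a low-dimensional subspace.
    torch.manual_seed(0)
    d, k, out, n = 64, 8, 10, 1000
    inputs = torch.randn(n, k) @ torch.randn(k, d)   # rank-k activations
    layer = torch.nn.Linear(d, out)
    _, forward = compress_linear(layer, inputs, k)
    err = (layer(inputs) - forward(inputs)).abs().max()
    print(f"max reconstruction error: {err:.2e}")    # tiny, since inputs are rank-k
```

When the activations do span a larger subspace, the retained rank k trades parameter count against reconstruction error, which is the regime the paper studies on CIFAR-10 and ImageNet.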