Information Bottleneck Analysis of Deep Neural Networks via Lossy Compression
arXiv (2023)
Abstract
The Information Bottleneck (IB) principle offers an information-theoretic
framework for analyzing the training process of deep neural networks (DNNs).
Its essence lies in tracking the dynamics of two mutual information (MI)
values: one between the hidden layer output and the DNN input, the other
between that output and the target. According to
the hypothesis put forth by Shwartz-Ziv and Tishby (2017), the training process
consists of two distinct phases: fitting and compression. The latter phase is
believed to account for the good generalization performance exhibited by DNNs.
Due to the challenging nature of estimating MI between high-dimensional random
vectors, this hypothesis was only partially verified for NNs of tiny sizes or
specific types, such as quantized NNs. In this paper, we introduce a framework
for conducting IB analysis of general NNs. Our approach leverages the
stochastic NN method proposed by Goldfeld et al. (2019) and incorporates a
compression step to overcome the obstacles associated with high dimensionality.
In other words, we estimate the MI between the compressed representations of
high-dimensional random vectors. The proposed method is supported by both
theoretical and practical justifications. Notably, we demonstrate the accuracy
of our estimator through synthetic experiments featuring predefined MI values
and comparison with MINE (Belghazi et al., 2018). Finally, we perform IB
analysis on a close-to-real-scale convolutional DNN, which reveals new features
of the MI dynamics.
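
The compress-then-estimate idea can be illustrated with a small synthetic check in the spirit of the predefined-MI experiments mentioned above. This is only a sketch under loudly stated assumptions, not the estimator from the paper: PCA stands in for the compression step, a Gaussian plug-in formula stands in for the MI estimator, and the names `pca_compress` and `gaussian_mi`, as well as the sample size, correlation, and dimensionality, are hypothetical choices made for illustration. Two correlated scalar Gaussians with a known MI are embedded into 64 dimensions by injective linear maps (which leave MI unchanged), compressed back to one dimension, and the MI of the compressed pair is compared with the ground truth.

```python
import numpy as np


def pca_compress(data, k):
    """Project the rows of `data` onto their top-k principal components.

    Assumption: PCA is a stand-in for the compression step; it is not
    the compressor used in the paper.
    """
    centered = data - data.mean(axis=0)
    # Rows of vt are principal directions, sorted by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T


def gaussian_mi(a, b):
    """Plug-in MI estimate in nats, assuming (a, b) is jointly Gaussian.

    Assumption: this closed-form estimator is used only because the
    synthetic data below is Gaussian by construction.
    """
    da = a.shape[1]
    cov = np.cov(np.hstack([a, b]), rowvar=False)
    _, logdet_a = np.linalg.slogdet(cov[:da, :da])
    _, logdet_b = np.linalg.slogdet(cov[da:, da:])
    _, logdet_ab = np.linalg.slogdet(cov)
    return 0.5 * (logdet_a + logdet_b - logdet_ab)


rng = np.random.default_rng(0)
n, rho, dim = 5000, 0.8, 64  # hypothetical experiment settings

# Ground truth for two unit-variance Gaussians with correlation rho.
true_mi = -0.5 * np.log(1.0 - rho**2)

# Correlated scalar latents.
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)

# Embed each latent into `dim` dimensions with a random linear map.
# The maps are injective, so MI between the embeddings equals true_mi.
x_high = z[:, :1] @ rng.normal(size=(1, dim))
y_high = z[:, 1:] @ rng.normal(size=(1, dim))

# Compress each high-dimensional vector, then estimate MI in low dimension.
x_low = pca_compress(x_high, 1)
y_low = pca_compress(y_high, 1)
estimated_mi = gaussian_mi(x_low, y_low)

print(f"true MI = {true_mi:.3f} nats, estimated MI = {estimated_mi:.3f} nats")
```

In this toy setting the compression is lossless with respect to the latent variable, so the low-dimensional estimate should land close to the true value. With a lossy compressor, the estimate can only be expected to bound the true MI from below, by the data processing inequality.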