Fundamental limits of weak learnability in high-dimensional multi-index models

CoRR(2024)

Abstract
Multi-index models – functions which only depend on the covariates through a non-linear transformation of their projection on a subspace – are a useful benchmark for investigating feature learning with neural networks. This paper examines the theoretical boundaries of learnability in this hypothesis class, focusing particularly on the minimum sample complexity required for weakly recovering their low-dimensional structure with first-order iterative algorithms, in the high-dimensional regime where the number of samples n = αd is proportional to the covariate dimension d. Our findings unfold in three parts: (i) first, we identify under which conditions a trivial subspace can be learned with a single step of a first-order algorithm for any α>0; (ii) second, in the case where the trivial subspace is empty, we provide necessary and sufficient conditions for the existence of an easy subspace consisting of directions that can be learned only above a certain sample complexity α>α_c. The critical threshold α_c marks the presence of a computational phase transition, in the sense that no efficient iterative algorithm can succeed for α<α_c. In a limited but interesting set of really hard directions – akin to the parity problem – α_c is found to diverge. Finally, (iii) we demonstrate that interactions between different directions can result in an intricate hierarchical learning phenomenon, where some directions can be learned sequentially when coupled to easier ones. Our analytical approach is built on the optimality of approximate message-passing algorithms among first-order iterative methods, delineating the fundamental learnability limit across a broad spectrum of algorithms, including neural networks trained with gradient descent.
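To make the setting concrete, here is a minimal numerical sketch of weak recovery in the simplest case, a single planted direction. It is not the paper's approximate message-passing analysis: the estimator is an assumed stand-in, namely one gradient step from zero on the squared loss of a linear model, which reduces to the first moment (1/n) Σ y_i x_i. The function name `one_step_overlap`, the two link functions, and the values d = 500 and α = 4 are all illustrative choices, not taken from the source.

```python
import numpy as np

def one_step_overlap(link, d=500, alpha=4.0, seed=0):
    """Cosine similarity between a planted direction and a one-step estimate.

    Hypothetical helper for illustration only; not the paper's algorithm.
    """
    rng = np.random.default_rng(seed)
    n = int(alpha * d)                 # proportional regime: n = alpha * d
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)             # planted unit direction
    X = rng.standard_normal((n, d))    # i.i.d. Gaussian covariates
    y = link(X @ w)                    # labels depend on X only through X @ w
    # One gradient step from v = 0 on (1/2n) * sum (y_i - v.x_i)^2
    # moves along (1/n) * X^T y, the empirical first moment.
    v = X.T @ y / n
    return float(v @ w / np.linalg.norm(v))

# Link with a non-zero linear component: E[y z] != 0.
overlap_linear = one_step_overlap(lambda z: z)
# Even link: E[y z] = E[z^3] = 0, so the first moment carries no signal.
overlap_even = one_step_overlap(lambda z: z**2)

print(f"linear link overlap: {overlap_linear:.3f}")
print(f"even link overlap:   {overlap_even:.3f}")
```

Under these assumptions the linear link should give an overlap well away from zero, while the even link should give an overlap that is small and shrinks as d grows, since only finite-size fluctuations remain. This is in the spirit of the abstract's distinction between directions learnable in a single step and directions that require more, but the sketch says nothing about the threshold α_c itself.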