Enric Boix is a fifth-year PhD student at MIT, working on the theory of deep learning and optimal transport. He has been named a Siebel Scholar, an NSF Graduate Research Fellow, and an Apple AI/ML Scholar.
Talk: The Merged-Staircase Property
Abstract: Which functions f : {+1,-1}^d \to \R can neural networks learn when trained with SGD? In this talk, we will consider functions that depend only on a small number of coordinates. We will study the dynamics of two-layer neural networks in the mean-field parametrization, trained with O(d) samples of SGD. Our main result will be to characterize a hierarchical property, the "merged-staircase property" (MSP), that is both necessary and nearly sufficient for learning in this setting. We will further show that non-linear training is necessary: for this class of functions, linear methods on any feature map (e.g., the NTK) cannot learn efficiently. The key tools are a new "dimension-free" dynamics-approximation result that applies to functions defined on a low-dimensional latent space, and a proof of global convergence based on polynomial identity testing.
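For intuition, here is a brief sketch of the property (the formal statement is the speaker's; the example functions below are illustrative choices, not taken from the abstract). Writing f as a sparse polynomial with monomial supports S_1, \dots, S_k, the merged-staircase property asks that the supports can be ordered so that each S_i introduces at most one new coordinate, i.e. |S_i \setminus (S_1 \cup \dots \cup S_{i-1})| \le 1. For example:

    f(x) = x_1 + x_1 x_2 + x_1 x_2 x_3   % MSP holds: each monomial adds at most one new coordinate
    f(x) = x_1 x_2 x_3                   % MSP fails: the lone monomial introduces three new coordinates at once

Roughly, the lower-degree monomials give SGD a staircase of signals along which it can pick up the relevant coordinates one at a time.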