Abstract: We characterize the sample complexity of learning target functions of low latent dimension on “regular” neural networks with SGD. The characterization brings to light a new complexity measure, the leap, which measures how “hierarchical” target functions are. For depth 2, we show that SGD learns such target functions in d^Leap time, where d is the input dimension, with a saddle-to-saddle dynamic that learns the functions along degree-climbing features. We then consider an out-of-distribution setting defined as generalization on the unseen (GOTU), and show how various neural networks such as Transformers tend to learn functions of minimal degree on the unseen. This leads to a new “degree curriculum learning” algorithm.
Joint works with E. Boix (MIT) and T. Misiakiewicz (Stanford), and with S. Bengio (Apple) and A. Lotfi (EPFL).
Bio: Emmanuel Abbe received his Ph.D. degree from the EECS Department at the Massachusetts Institute of Technology (MIT) in 2008, and his M.S. degree from the Department of Mathematics at the Ecole Polytechnique Fédérale de Lausanne (EPFL) in 2003. He was at Princeton University as an assistant professor from 2012 to 2016 and a tenured associate professor from 2016, jointly in the Program for Applied and Computational Mathematics and the Department of Electrical Engineering, as well as an associate faculty member in the Department of Mathematics at Princeton University since 2016. He joined EPFL in 2018 as a Full Professor, jointly in the Mathematics Institute and the School of Computer and Communication Sciences, where he holds the Chair of Mathematical Data Science. He is the recipient of the Foundation Latsis International Prize, the Bell Labs Prize, the NSF CAREER Award, the Google Faculty Research Award, the Walter Curtis Johnson Prize from Princeton University, the von Neumann Fellowship from the Institute for Advanced Study, and the IEEE Information Theory Society Paper Award, and a co-recipient of the Simons-NSF Mathematics of Deep Learning Collaborative Research Award.
Prof. E. Abbe is also a Global Expert at the Geneva Science and Diplomacy Anticipator (GESDA), a member of the Steering Committee of the Center for Intelligent Systems (CIS) at EPFL, a member of the Deepfoundations collaboration on the theoretical foundations of deep learning, a consultant at Apple Artificial Intelligence and Machine Learning Research, and the director of the Bernoulli Center for Fundamental Studies at EPFL.