Mathematical problems in Machine Learning

Andrea Montanari (Stanford University)

2023-06-19 10:00, Videoconference and in person in Amphi Bloch, IPhT
2023-06-20 10:00, Videoconference and in person in Amphi Bloch, IPhT
2023-06-21 10:00, Videoconference and in person in Amphi Bloch, IPhT
2023-06-23 10:00, Videoconference and in person in Amphi Bloch, IPhT

Livestream on youtube.com/IPhT-TV: no subscription required

Videoconference: subscribe to the course newsletter to receive links

Abstract: 

Despite their empirical success, the principles underlying modern deep learning models remain mysterious. These models are trained by optimizing highly non-convex objectives, using a variety of gradient-based algorithms that, at best, are only guaranteed to converge to local optima. The number of model parameters is often comparable to, or larger than, the sample size; hence many choices of the parameters fit the training data equally well, but not all of them generalize to unseen data. Over the last few years, an informal scenario has emerged that captures these phenomena. I will describe its elements and explain a few examples in which this scenario can be made more precise.
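
As a purely illustrative aside (not part of the course material), the short numpy sketch below shows the phenomenon the abstract alludes to: with more parameters than training samples, many weight vectors fit the training data exactly, yet they can generalize very differently. All names and values here (n, p, w_true, the step size, etc.) are invented for this sketch.

import numpy as np

rng = np.random.default_rng(0)
n, p, n_test = 20, 200, 1000            # n samples, p >> n parameters
w_true = rng.normal(size=p) / np.sqrt(p)

X = rng.normal(size=(n, p))
y = X @ w_true
X_test = rng.normal(size=(n_test, p))
y_test = X_test @ w_true

# Interpolator 1: gradient descent on the squared loss, started from zero,
# converges to the minimum-norm solution that fits the training data exactly.
w = np.zeros(p)
lr = 1e-2
for _ in range(20000):
    w -= lr * X.T @ (X @ w - y) / n

# Interpolator 2: add a vector from the null space of X; the training error
# stays (numerically) zero, but the predictor is very different.
q, _ = np.linalg.qr(X.T)                # columns span the row space of X
z = rng.normal(size=p)
z_null = z - q @ (q.T @ z)              # component of z in the null space of X
w_other = w + 5.0 * z_null

for name, wt in [("min-norm (GD)", w), ("other interpolator", w_other)]:
    print(name,
          "train MSE:", float(np.mean((X @ wt - y) ** 2)),
          "test MSE:", float(np.mean((X_test @ wt - y_test) ** 2)))

Both predictors interpolate the training data, but their test errors differ by orders of magnitude, which is the gap between fitting and generalizing that the lectures address.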

A rough outline of the lectures:

  1. Why does overparametrization help optimization?
  2. Why doesn't overparametrization hurt generalization?
  3. The puzzle of architecture.
  4. Generative modeling.

Series: IPhT Courses

Short course title: Machine Learning