New statistical and computational phenomena from deep learning

Mon, Mar 6, 2023, 4:00–5:00 p.m.

Speaker

Theodor Misiakiewicz, Stanford University

Deep learning methodology has presented major challenges for statistical learning theory. Indeed, deep neural networks often operate in regimes outside the realm of classical statistics and optimization wisdom. In this talk, we will consider two illustrative examples that clarify some of these new challenges. The first example considers an instance where kernel ridge regression with a simple RBF kernel achieves optimal test error while perfectly fitting the noisy training data. Why can we interpolate noisy data and still generalize well? Why can overfitting be benign in kernel ridge regression? The second example, computational in nature, considers fitting two different smooth ridge functions with deep neural networks (DNNs). Both can be estimated at the same near-parametric rate by DNNs trained with unbounded computational resources. However, empirically, learning becomes much harder for one of these functions when restricted to DNNs trained using SGD. Why does SGD succeed on some functions and fail on others? The goal of this talk will be to understand these two simulations. In particular, we will present quantitative theories that precisely capture both phenomena.
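As a rough companion to the first example, the sketch below (our own illustration, not the speaker's experiment) runs kernel ridge regression with an RBF kernel at near-zero regularization using scikit-learn, so the fit nearly interpolates noisy labels; the target function, dimensions, and noise level are placeholder assumptions, and whether the resulting test error is actually near-optimal depends on the regime, which is precisely the question the talk addresses.

```python
# Minimal sketch (assumed setup, not the speaker's experiment): kernel ridge
# regression with an RBF kernel and near-zero regularization, so the fitted
# function (nearly) interpolates noisy training labels; test error is then
# measured against the noiseless target.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
d, n_train, n_test, noise = 5, 2000, 1000, 0.5  # placeholder dimensions and noise level

def target(X):
    # Arbitrary smooth target, chosen only for illustration.
    return np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2

X_train = rng.standard_normal((n_train, d))
y_train = target(X_train) + noise * rng.standard_normal(n_train)  # noisy labels
X_test = rng.standard_normal((n_test, d))

# alpha close to 0: the fitted function essentially interpolates the noisy data.
model = KernelRidge(alpha=1e-6, kernel="rbf", gamma=1.0 / d)
model.fit(X_train, y_train)

train_mse = np.mean((model.predict(X_train) - y_train) ** 2)       # near zero (interpolation)
test_mse = np.mean((model.predict(X_test) - target(X_test)) ** 2)  # may still be small in favorable regimes
print(f"train MSE (noisy labels): {train_mse:.3g}  test MSE (clean target): {test_mse:.3g}")
```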

In-person seminars will be held at Mason Lab 211, 9 Hillhouse Avenue, with the option of virtual participation.
3:30 p.m. – Pre-talk meet-and-greet teatime, Dana House, 24 Hillhouse Avenue
Theodor Misiakiewicz’s website