Speaker
Matus Jan Telgarsky, University of Illinois Urbana-Champaign
Abstract: What makes deep learning special — why is it effective in so many settings where other models fail? This talk will present recent progress from three perspectives. The first result is approximation-theoretic: deep networks can easily represent phenomena that require exponentially-sized shallow networks, decision trees, and other classical models. The second is statistical: networks' generalization ability — namely, their ability to perform well on unseen testing data — is correlated with their prediction margins, a classical notion of confidence. Finally, comprising the majority of the talk, I will discuss the interaction of the preceding two perspectives with optimization: specifically, how standard descent methods are implicitly biased towards models with good generalization. Here I will present two approaches: the strong implicit bias, which studies convergence to specific well-structured objects, and the weak implicit bias, which merely ensures that certain good properties eventually hold, but offers a more flexible proof technique.
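For context, here is a standard formalization of the margin notion mentioned above (an illustrative addition, not part of the original announcement): in binary classification with labels $y \in \{-1, +1\}$, the margin of a predictor $f$ on an example $(x, y)$ is $y f(x)$, with large positive values indicating a confident, correct prediction. A canonical instance of the strong implicit bias from the literature (not necessarily the version covered in this talk) is that, for linearly separable data, gradient descent on the logistic loss drives the normalized iterates toward the maximum-margin separator:

$$ \frac{w_t}{\|w_t\|} \;\to\; \frac{\bar w}{\|\bar w\|}, \qquad \text{where } \bar w = \arg\min_{w} \|w\|^2 \ \text{ subject to } \ y_i \langle w, x_i \rangle \ge 1 \text{ for all } i. $$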
Bio: Matus Telgarsky is an assistant professor at the University of Illinois Urbana-Champaign, specializing in deep learning theory. He was fortunate to receive his PhD from UCSD under Sanjoy Dasgupta. Other highlights include: co-founding, in 2017, the Midwest ML Symposium (MMLS) with Po-Ling Loh; receiving a 2018 NSF CAREER award; and organizing two Simons Institute programs, one on deep learning theory (summer 2019) and one on generalization (fall 2024).
In-person seminars will be held in Mason Lab 211, 9 Hillhouse Avenue, with the option of virtual participation.
3:30pm - Pre-talk meet-and-greet teatime - Dana House, 24 Hillhouse Avenue
Matus Jan Telgarsky’s website