Lénaïc Chizat, EPFL
In this talk, we propose an analysis of gradient descent on wide two-layer ReLU neural networks that leads to sharp characterizations of the learned predictor. The main idea is to study the training dynamics in the limit where the width of the hidden layer goes to infinity, in which they become a Wasserstein gradient flow. Although these dynamics evolve on a non-convex landscape, we show that for appropriate initializations their limit, when it exists, is a global minimizer. We also study the implicit regularization of this algorithm when the objective is the unregularized logistic loss, which leads to a max-margin classifier in a certain functional space. We conclude by discussing what these results tell us about generalization performance, and in particular how these models compare to kernel methods.
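The setting of the abstract can be illustrated with a minimal sketch (not the talk's analysis): full-batch gradient descent on a wide two-layer ReLU network under the unregularized logistic loss, with the mean-field scaling in which the output averages over hidden units. The toy data, width, learning rate, and step count below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data in R^2 with labels in {-1, +1}
# (an illustrative setup, not the setting analyzed in the talk).
n = 100
X = rng.normal(size=(n, 2))
y = np.where(X[:, 0] + X[:, 1] >= 0, 1.0, -1.0)

m = 200  # width of the hidden layer (assumed "wide enough" for this toy example)

W = rng.normal(size=(m, 2))  # hidden-layer weights
a = rng.normal(size=m)       # output weights

def predict(X, W, a):
    H = np.maximum(X @ W.T, 0.0)  # ReLU features, shape (n, m)
    return H @ a / m              # mean over units (mean-field scaling)

def logistic_loss(f, y):
    # numerically stable log(1 + exp(-y * f))
    return np.logaddexp(0.0, -y * f).mean()

lr, steps = 5.0, 3000
loss_init = logistic_loss(predict(X, W, a), y)
for _ in range(steps):
    H = np.maximum(X @ W.T, 0.0)
    f = H @ a / m
    # derivative of the loss w.r.t. each prediction f_i (clipped for stability)
    g = -y / (1.0 + np.exp(np.clip(y * f, -30.0, 30.0))) / n
    grad_a = H.T @ g / m
    grad_W = (a[:, None] / m) * (((H > 0) * g[:, None]).T @ X)
    a -= lr * grad_a
    W -= lr * grad_W

loss_final = logistic_loss(predict(X, W, a), y)
acc = np.mean(np.sign(predict(X, W, a)) == y)
```

On this separable toy problem the unregularized logistic loss keeps decreasing as training margins grow, consistent with the implicit max-margin bias discussed in the abstract; the precise functional space in which the margin is maximized is the subject of the talk, not of this sketch.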
You are invited to a scheduled Zoom meeting. Zoom is Yale’s audio and visual conferencing platform.
Topic: Yale S&DS Department Seminar
Time: 4:00pm - 5:00pm
Join from PC, Mac, Linux, iOS or Android: https://yale.zoom.us/j/99169700816?pwd=SWEvWHI5d3dPNVdHMkZMZURMWWJPUT09
Or Telephone: 203-432-9666 (2-ZOOM if on-campus) or 646-568-7788
Meeting ID: 991 6970 0816
International numbers available: https://yale.zoom.us/u/acBOaD1ic6
For H.323 and SIP information for video conferencing units please click here: https://yale.service-now.com/it?id=support_article&sys_id=434b72d3db9e8fc83514b1c0ef961924