Tim G. J. Rudner, New York University
Conventional regularization techniques for neural networks, such as L2 or L1 regularization, explicitly penalize divergence of the model parameters from specific parameter values. However, in most neural network models, specific parameter configurations bear little to no physical meaning, and it is difficult to incorporate domain knowledge or other relevant information into neural network training using conventional regularization techniques.
In this talk, I will show that we can address this shortcoming by using Bayesian principles to effectively incorporate domain knowledge or beliefs about desirable model properties into neural network training. To do so, I will approach regularization in neural networks from a probabilistic perspective and define a family of data-driven prior distributions that allows us to encode useful auxiliary information into the model. I will then show how to perform approximate inference in neural networks with such priors and derive a simple variational optimization objective with a regularizer that reflects the constraints implicitly encoded in the prior. This regularizer is mathematically simple, easy to implement, and can be used as a drop-in replacement for existing regularizers when performing supervised learning in neural networks of any size.
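To make the contrast concrete, here is a minimal illustrative sketch (not the speaker's actual method) of the difference between a conventional parameter-space penalty and a function-space regularizer derived from a data-driven prior. The `function_space_regularizer` below is a hypothetical stand-in: it penalizes divergence of the model's predictions on a set of context inputs from a prior predictive mean, rather than penalizing the parameter values themselves. All names and the Gaussian form of the penalty are assumptions for illustration.

```python
import numpy as np

def l2_regularizer(params, lam=1e-2):
    # Conventional weight decay: penalizes the magnitudes of the
    # parameters directly, regardless of what the network computes.
    return lam * sum(np.sum(p ** 2) for p in params)

def function_space_regularizer(f_pred, f_prior_mean, prior_var=1.0, lam=1e-2):
    # Hypothetical data-driven alternative: penalize divergence of the
    # model's *predictions* at context inputs from a prior predictive
    # mean, here via a simple Gaussian (squared-error / variance) penalty.
    return lam * 0.5 * np.sum((f_pred - f_prior_mean) ** 2) / prior_var

# Toy usage: either term can be added to a data-fit loss as a
# drop-in regularizer; only the quantity being penalized changes.
rng = np.random.default_rng(0)
params = [rng.normal(size=(3, 3)), rng.normal(size=3)]  # toy "network" weights
f_pred = rng.normal(size=5)       # model predictions at 5 context inputs
f_prior = np.zeros(5)             # prior predictive mean at those inputs

loss_l2 = l2_regularizer(params)
loss_fn = function_space_regularizer(f_pred, f_prior)
```

The point of the sketch is that the second penalty attaches meaning to what the model predicts rather than to particular parameter values, which is where auxiliary information or domain knowledge can be encoded.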
I will conclude the talk with an overview of applications of data-driven priors, including distribution shift detection, drug discovery, and medical diagnosis.
This is joint work with Sanyam Kapoor, Shikai Qiu, Xiang Pan, Lily Yucen Li, Ya Shi Zhang, Ravid Shwartz-Ziv, Julia Kempe, and Andrew Gordon Wilson.
Bio: Tim G. J. Rudner is an Assistant Professor and Faculty Fellow at New York University’s Center for Data Science and an AI Fellow at Georgetown University’s Center for Security and Emerging Technology. He conducted PhD research on probabilistic machine learning in the Department of Computer Science at the University of Oxford, where he was advised by Yee Whye Teh and Yarin Gal. The goal of his research is to develop methods and theoretical insights that enable the safe deployment of machine learning systems in safety-critical settings. Tim holds a master’s degree in statistics from the University of Oxford and an undergraduate degree in applied mathematics and economics from Yale University. He is also a Rhodes Scholar and a Qualcomm Innovation Fellow.
3:30pm - Pre-talk meet-and-greet teatime - 219 Prospect Street, 13th floor. Light snacks and beverages will be available in the kitchen area.