Hongyang Zhang, Toyota Technological Institute at Chicago
Deep learning models are often vulnerable to adversarial examples. In this talk, we will focus on robustness and security of machine learning against adversarial examples. There are two types of defenses against such attacks: 1) empirical and 2) certified adversarial robustness.
In the first part of the talk, we will see the foundation of our winning system, TRADES, in the NeurIPS’18 Adversarial Vision Challenge in which we won 1st place out of 400 teams and 3,000 submissions. Our study is motivated by an intrinsic trade-off between robustness and accuracy: we provide a differentiable and tight surrogate loss for the trade-off using the theory of classification-calibrated loss. TRADES has record-breaking performance in various standard benchmarks and challenges, including the adversarial benchmark RobustBench, the NLP benchmark GLUE, the Unrestricted Adversarial Examples Challenge hosted by Google, and has motivated many new attacking methods powered by our TRADES benchmark.
In the second part of the talk, to equip empirical robustness with certification, we study certified adversarial robustness by random smoothing in the L_infty threat model. On one hand, we show that random smoothing on the TRADES-trained classifier achieves SOTA certified robustness when the L_infty perturbation radius is small. On the other hand, when the perturbation is large, i.e., independent of inverse of input dimension, we show that random smoothing is provably unable to certify L_infty robustness for arbitrary random noise distribution. The intuition behind our theory reveals an intrinsic difficulty of achieving certified robustness by “random noise based methods”, and inspires new directions as potential future work.
Bio: Hongyang Zhang is a Postdoc fellow at TTIC, hosted by Avrim Blum and Greg Shakhnarovich. He obtained his Ph.D. from CMU Machine Learning Department in 2019, advised by Maria-Florina Balcan and David P. Woodruff. His research interests lie in the intersection between theory and practice of machine learning, robustness and AI security. His methods won the championship award or ranked top in various competitions such as the NeurIPS’18 Adversarial Vision Challenge (all three tracks), the Unrestricted Adversarial Examples Challenge hosted by Google, and the NeurIPS’20 Challenge on Predicting Generalization of Deep Learning.