Ziv Goldfeld is an assistant professor in the School of Electrical and Computer Engineering, and a graduate field member in Computer Science, Data Science, and the Center of Applied Mathematics, at Cornell University. Before joining Cornell, he was a postdoctoral research fellow in LIDS at MIT, hosted by Yury Polyanskiy. Ziv graduated with a B.Sc., M.Sc., and Ph.D. (all summa cum laude) in Electrical and Computer Engineering from Ben Gurion University, Israel. His graduate advisor was Haim Permuter.
Ziv’s research interests include optimal transport theory, statistical learning theory, information theory, and applied probability. He seeks to understand the theoretical foundations of modern inference and information processing systems by formulating and solving mathematical models. Honors include the NSF CAREER Award, the IBM University Award, and the Rothschild Postdoctoral Fellowship.
Talk: A Scalable Statistical Theory for Smooth Wasserstein Distances
Watch this talk (Cornell NetID required)
Abstract: Wasserstein distances have recently seen a surge of applications in statistics and machine learning. This stems from the many advantageous properties they possess, such as the metric and topological structure of Wasserstein spaces, robustness to support mismatch, compatibility with gradient-based optimization, and rich geometric properties. In practice, we rarely have access to the actual distribution and only observe samples from it, which necessitates estimating the distance from data. A central issue is that such estimators suffer from the curse of dimensionality: their empirical convergence rate scales as n^{-1/d} for d-dimensional distributions. This rate deteriorates exponentially fast with dimension, making it impossible to obtain meaningful accuracy guarantees, especially given the high dimensionality of real-world data.
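As a small illustration (not from the talk), the plug-in estimator replaces the true distributions with their empirical measures. In one dimension, the 1-Wasserstein distance between two empirical measures with equal sample sizes reduces to the mean gap between sorted samples, so a minimal numpy sketch suffices:

```python
import numpy as np

# Hypothetical illustration: plug-in estimation of the 1-Wasserstein
# distance W1(P, Q) from i.i.d. samples. In 1D, for equal sample sizes,
# empirical W1 equals the average absolute difference between the
# order statistics of the two samples.
rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(loc=0.0, scale=1.0, size=n)  # samples from P = N(0, 1)
y = rng.normal(loc=1.0, scale=1.0, size=n)  # samples from Q = N(1, 1)

# For two Gaussians with equal variance, the true W1 is the mean shift, here 1.
est = np.mean(np.abs(np.sort(x) - np.sort(y)))
print(f"empirical W1 estimate: {est:.3f}")
```

In 1D the n^{-1/2} rate already holds for the classic distance; it is in higher dimensions that the n^{-1/d} curse described above sets in.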
This talk will present the novel framework of smooth Wasserstein distances, which inherit the properties of their classic counterparts while alleviating the empirical curse of dimensionality. Specifically, we will show that the empirical approximation error of the smooth distance decays as n^{-1/2} in all dimensions. To enable principled inference, we will also derive high-dimensional limit distributions for the smooth empirical distances. These results highlight the favorable statistical behavior of the smooth framework, as comparable claims for the classic distance are currently far beyond the horizon. The derivations rely on tools from empirical process theory, the extended functional delta method, dual Sobolev spaces, and the Benamou-Brenier dynamical formulation as key ingredients. Applications to implicit generative modeling will be briefly discussed and serve to motivate the statistical exploration.
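A rough sketch of the idea, under the assumption (consistent with this line of work, but not spelled out in the abstract) that the smooth distance is defined by convolving both measures with an isotropic Gaussian, W1^{(σ)}(P, Q) = W1(P∗N(0, σ²I), Q∗N(0, σ²I)). A crude Monte Carlo approximation of the convolution simply adds independent Gaussian noise to each sample; the helper `smooth_w1_1d` below is hypothetical:

```python
import numpy as np

def smooth_w1_1d(x, y, sigma, rng):
    """Monte Carlo sketch of the smooth 1-Wasserstein distance in 1D.

    Adding independent N(0, sigma^2) noise to each sample draws from the
    Gaussian-convolved measures P * N(0, sigma^2) and Q * N(0, sigma^2);
    empirical W1 for equal sample sizes is then the mean gap between
    sorted samples.
    """
    xs = x + rng.normal(scale=sigma, size=x.shape)
    ys = y + rng.normal(scale=sigma, size=y.shape)
    return np.mean(np.abs(np.sort(xs) - np.sort(ys)))

rng = np.random.default_rng(1)
n = 10_000
x = rng.normal(0.0, 1.0, n)  # P = N(0, 1)
y = rng.normal(1.0, 1.0, n)  # Q = N(1, 1)

# Convolving both measures with the same Gaussian preserves the mean
# shift, so the smooth distance here is still close to 1.
d = smooth_w1_1d(x, y, sigma=1.0, rng=rng)
print(f"smooth W1 estimate (sigma=1): {d:.3f}")
```

The point of the framework is that, unlike the classic empirical distance, this smoothed estimate converges at the dimension-free n^{-1/2} rate.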