The Statistics Seminar speaker for Monday, April 22, 2019, will be Jeremias Knoblauch, a PhD student in the Oxford-Warwick Statistics Programme, a visiting scholar at both the Alan Turing Institute and Duke University, and a Facebook Fellow. His research interests lie at the intersection of statistical machine learning and spatio-temporal inference. Specifically, he focuses on scalable inference methods for spatio-temporal data streams that run in real time. Inference for such complex dynamical systems is typically complicated by non-stationarity, sudden changes, model uncertainty, misspecification, and outliers. While the analysis of real-world data streams almost always needs to address these complications, tackling them jointly causes standard likelihood-based learning rules to break down. Jeremias works on alternative learning rules derived from generalized Bayes theorems that solve this collection of problems jointly, efficiently, and effortlessly. More information, papers, videos, slides, and open source code can be found at https://jeremiasknoblauch.github.io/.
Abstract: This paper introduces a generalized representation of Bayesian inference. It is derived axiomatically, recovering existing Bayesian methods as special cases. We use it to prove that variational inference (VI) with the variational family Q produces the uniquely optimal Q-constrained approximation to the exact Bayesian inference problem. Surprisingly, this implies that VI dominates any other Q-constrained approximation to the exact Bayesian inference problem. This means that alternative Q-constrained approximations like Expectation Propagation (Minka, 2001; Opper & Winther, 2000) can produce better posteriors than VI only by implicitly targeting more appropriate Bayesian inference problems. Inspired by this, we introduce Generalized Variational Inference (GVI), a modular approach for instead solving such alternative inference problems explicitly. We explore some applications of GVI, including robust inference and better approximate posterior variances. Lastly, we derive a black box inference scheme and demonstrate it on Bayesian Neural Networks and Deep Gaussian Processes, where GVI substantially outperforms competing methods.
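To give a sense of what "solving alternative inference problems explicitly" means, here is a brief sketch of the generalized objective the paper builds on, in which a posterior is specified by three ingredients: a loss ℓ, a divergence D, and a variational family Q. (Notation is simplified here; see the paper via the link above for the precise formulation.)

q^{*}(\theta) \;=\; \underset{q \in \mathcal{Q}}{\arg\min} \;\Big\{\, \mathbb{E}_{q(\theta)}\Big[ \textstyle\sum_{i=1}^{n} \ell(\theta, x_i) \Big] \;+\; D\big(q \,\|\, \pi\big) \,\Big\}

Here π denotes the prior. Choosing ℓ(θ, x_i) = −log p(x_i | θ), D the Kullback-Leibler divergence, and Q unconstrained recovers the exact Bayesian posterior; keeping those choices of ℓ and D but restricting Q yields standard VI. GVI instead varies ℓ and D as well, for example to gain robustness against outliers and misspecification or to obtain better approximate posterior variances.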