Veronika Rockova is an assistant professor in econometrics and statistics at the University of Chicago Booth School of Business. Her work brings together statistical methodology, theory and computation to develop high-performance tools for analyzing large datasets. Her research interests lie at the intersection of Bayesian and frequentist statistics, and focus on data mining, variable selection, machine learning, non-parametric methods, factor models, dynamic models, high-dimensional decision theory and inference. She has authored a variety of published works in top statistics journals, including the Journal of the American Statistical Association and the Annals of Statistics. In her applied work, she contributed to the improvement of risk stratification and prediction models for public reporting in healthcare analytics.
Prior to joining Booth, Rockova held a Postdoctoral Research Associate position at the Department of Statistics of the Wharton School at the University of Pennsylvania. Rockova holds a PhD in biostatistics from Erasmus University (The Netherlands), an MSc in biostatistics from Universiteit Hasselt (Belgium) and both an MSc in mathematical statistics and a BSc in general mathematics from Charles University (Czech Republic).
Talk: Theory for BART
Abstract: The remarkable empirical success of Bayesian additive regression trees (BART) has raised considerable interest in understanding why and when this method produces good results. Since its inception nearly 20 years ago, BART has become widely used in practice, and yet theoretical justifications have been unavailable. To narrow this yawning gap, we study estimation properties of Bayesian trees and tree ensembles in nonparametric regression (such as the speed of posterior concentration, reluctance to overfit, variable selection, and adaptation in high-dimensional settings). Our approach rests upon a careful analysis of recursive partitioning schemes and associated sieves of approximating step functions. We develop several useful tools for analyzing additive regression trees, showing their optimal performance in both additive and non-additive regression. Our results constitute a missing piece of the broader theoretical puzzle as to why Bayesian machine learning methods like BART have been so successful in practice.
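To make the abstract's central object concrete: a regression tree's prediction is a step function, and growing trees with more leaves yields a "sieve" of increasingly fine piecewise-constant approximations to the regression function. The sketch below is not BART (which places a prior over sums of trees and samples from the posterior); it is a minimal non-Bayesian stand-in using scikit-learn's `DecisionTreeRegressor`, assumed here purely to visualize the step-function approximation idea.

```python
# Minimal sketch (NOT BART itself): a single regression tree is a step
# function, and trees with more leaves form a finer piecewise-constant
# approximation of the target -- the "sieve" the abstract refers to.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 200)).reshape(-1, 1)
y = np.sin(2 * np.pi * x.ravel()) + 0.1 * rng.normal(size=200)

for leaves in (2, 8, 32):
    tree = DecisionTreeRegressor(max_leaf_nodes=leaves).fit(x, y)
    pred = tree.predict(x)
    # The fit is piecewise constant with at most `leaves` distinct levels;
    # in-sample error shrinks as the partition is refined.
    print(leaves, len(np.unique(pred)), round(float(np.mean((pred - y) ** 2)), 4))
```

BART replaces this single greedy tree with a posterior over sums of many shallow trees, which is what makes questions like posterior concentration and reluctance to overfit meaningful.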