The Statistics Seminar speaker for Wednesday, October 31, 2018, is Tim Hesterberg, a Senior Statistician at Google. He previously worked at Insightful (S-PLUS), Franklin & Marshall College, and Pacific Gas & Electric Co. He received his Ph.D. in Statistics from Stanford University under Brad Efron and is a Fellow of the American Statistical Association. He is the author of the "Resample" package for R, co-author of Chihara and Hesterberg, "Mathematical Statistics with Resampling and R" (2018), and author of "What Teachers Should Know about the Bootstrap: Resampling in the Undergraduate Statistics Curriculum", The American Statistician, 2015.
Talk: Bootstrap Surprises
Abstract: Resampling methods are easier to use and more accurate than classical formula-based statistical methods, but computationally expensive. Whoa - wait a minute. You can go wrong if you don't understand the idea behind the bootstrap. You might think of the bootstrap for small samples where you don't trust the central limit theorem, but the most common bootstrap methods are less accurate in small samples than classical methods. There are simple variations that are dramatically more accurate, and these show that the old n >= 30 rule is just wrong - try n >= 5000 instead (theory confirms this). Finally, we bootstrap for big data at Google because formulas are computationally infeasible. I hope to change the way you think not only about the bootstrap, but also about statistical practice.