Statistics Seminar Speaker: Kean Ming Tan, 11/13/2019

Wednesday Nov 13 2019

4:15pm @ G01 Biotechnology

The Statistics Seminar speaker for Wednesday, November 13, 2019, will be Kean Ming Tan, an assistant professor at the University of Michigan. He is an applied statistician working on statistical machine learning methods for analyzing complex biomedical data sets. He develops multivariate statistical methods such as probabilistic graphical models, cluster analysis, discriminant analysis, and dimension reduction to uncover patterns in massive data sets. Recently, he has also worked on topics related to robust statistics, non-convex optimization, and data integration from multiple sources.

Talk: Sparse Generalized Eigenvalue Problem and Its Application to Neuroscience

Abstract: The sparse generalized eigenvalue problem (GEP) plays a pivotal role in a large family of high-dimensional learning tasks, including sparse Fisher’s discriminant analysis, canonical correlation analysis, and sufficient dimension reduction. Most existing methods and theory, developed for specific statistical models that are special cases of the sparse GEP, require restrictive structural assumptions on the input matrices. This talk will focus on a two-stage computational framework for solving the non-convex optimization problem resulting from the sparse GEP. In the first stage, we solve a convex relaxation of the sparse GEP. Taking the solution as an initial value, we then exploit a non-convex optimization perspective and propose the truncated Rayleigh flow method (Rifle) to estimate the leading generalized eigenvector, and show that it converges to a solution with the optimal statistical rate of convergence. Theoretically, our method significantly improves upon the existing literature by eliminating the structural assumptions on the input matrices. Numerical studies in the context of several statistical models are provided to validate the theoretical results. We then apply the proposed method to electrocorticography data to understand how human brains recall and mentally rehearse word sequences.
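
For readers unfamiliar with the second stage, the following is a minimal sketch of a truncated Rayleigh-flow style iteration, not the speaker's implementation. It assumes A is symmetric, B is symmetric positive definite, and the generalized Rayleigh quotient stays positive. The function names, the step size eta, the iteration count, and the sparsity level k are illustrative assumptions, and the convex-relaxation first stage is omitted (the starting vector v0 is simply supplied by the caller).

```python
import numpy as np

def truncate(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-k:]
    out[keep] = v[keep]
    return out

def rifle_sketch(A, B, v0, k, eta=0.01, n_iter=200):
    """Hypothetical sketch: estimate a k-sparse leading generalized eigenvector.

    Assumes A symmetric, B symmetric positive definite, and v0 a reasonable
    initial value (in the talk's framework it would come from the convex
    relaxation stage, which is not implemented here).
    """
    v = np.asarray(v0, dtype=float)
    v = v / np.linalg.norm(v)
    for _ in range(n_iter):
        # Generalized Rayleigh quotient at the current iterate.
        rho = (v @ A @ v) / (v @ B @ v)
        # Ascent step on the Rayleigh quotient, then renormalize.
        step = v + (eta / rho) * (A @ v - rho * (B @ v))
        step = step / np.linalg.norm(step)
        # Enforce sparsity by hard truncation, then renormalize again.
        v = truncate(step, k)
        v = v / np.linalg.norm(v)
    return v
```

Hard truncation after each ascent step is what keeps every iterate k-sparse; the quality of the result depends on how close v0 is to the true leading generalized eigenvector.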