The Statistics Seminar speaker for Wednesday, February 15, 2017 is Alex Franks, who is currently a Moore/Sloan Data Science and WRF Innovation in Data Science Postdoctoral Fellow at the eScience Institute at the University of Washington. His interests include methods for high-dimensional covariance estimation, models for high-throughput 'omics data, and missing data methodology. He is currently collaborating with Dr. Daniel Promislow (University of Washington, Dept. of Pathology) on statistical methods for analyzing metabolomic data from patients with Alzheimer's and Parkinson's disease. He received his PhD from the Department of Statistics at Harvard University, where he was a member of Edoardo Airoldi's lab.
Title: Bayesian Covariance Estimation with Applications in High-throughput Biology
Abstract: Understanding the function of biological molecules requires statistical methods for assessing covariability across multiple dimensions as well as accounting for complex measurement error and missing data. In this talk, I will discuss two models for covariance estimation which have applications in molecular biology. In the first part of the talk, I will describe a model-based method for evaluating heterogeneity among several p × p covariance matrices in the large p, small n setting and will illustrate the utility of the method for exploratory analyses of high-dimensional multivariate gene expression data. In the second part of the talk, I will describe the role of covariance estimation in quantifying how cells regulate protein levels. Specifically, estimates of the correlation between steady-state levels of mRNA and protein are used to assess the degree to which protein levels are determined by post-transcriptional processes. Differences in cell preparation, measurement technology, and protocol, as well as the pervasiveness of missing data, complicate the accurate estimation of this correlation. To address these issues, I fit a Bayesian hierarchical model to a compendium of 58 data sets from multiple labs to infer a structured covariance matrix of measurements. I contextualize these results and contrast them with conclusions drawn in previous studies.