Irina Gaynanova is an Associate Professor in the Department of Biostatistics at the University of Michigan. Before that, she was an Associate Professor in the Department of Statistics at Texas A&M University. She received her PhD in Statistics from Cornell University in 2015. Dr. Gaynanova’s teaching emphasizes reproducible research practices, statistical computing, and communication skills, with the goal of preparing students for STEM-oriented careers. Her research focuses on the development of statistical methods for the analysis of modern high-dimensional biomedical data. Her methodological interests are in data integration, machine learning, and high-dimensional statistics, motivated by challenges arising in the analysis of multi-omics data and data from wearable devices. Her research has been funded by the National Science Foundation and recognized with a David P. Byar Young Investigator Award and an NSF CAREER Award.
Talk: Fast variable selection for distributional regression with application to CGM data
Abstract: Continuous glucose monitors (CGMs) are increasingly used to measure blood glucose levels and to inform the treatment and management of diabetes. However, there is a large gap between the rich temporal information in the data and the crude summaries commonly used in practice (e.g., mean, standard deviation). At the same time, the free-living conditions in which CGM data are typically collected make functional data analysis approaches difficult to apply, because meal intakes and observation periods are misaligned in time across patients. Distributional learning improves on traditional summaries by using the whole distribution function of glucose measurements as the response while avoiding time alignment issues. However, existing algorithms for distributional regression are computationally demanding, which makes them infeasible on large CGM datasets, especially when coupled with variable selection strategies. In this work, we develop a new optimization algorithm for distributional regression with variable selection based on closed-form geodesic descent. Our numerical studies demonstrate superior accuracy and computational efficiency compared to existing methods, with speed improvements of 100-fold or more. We combine our algorithm with subsampling-based stability selection to study the effect of over 30 covariates on the CGM profiles of 200 patients with type II diabetes, obtaining new clinical insights.
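
To illustrate why a distributional response avoids time alignment issues, here is a minimal sketch in Python. It is not the algorithm from the talk: it only summarizes one patient's readings by their empirical quantile function and checks that the summary does not depend on the order of the readings. The function name `quantile_representation`, the grid size, and the synthetic glucose values are assumptions made for illustration.

```python
import numpy as np

def quantile_representation(glucose, n_grid=99):
    """Summarize CGM readings by their empirical quantile function.

    Hypothetical helper for illustration; returns the probability grid
    and the glucose quantiles evaluated on that grid.
    """
    probs = np.linspace(0.01, 0.99, n_grid)
    values = np.asarray(glucose, dtype=float)
    return probs, np.quantile(values, probs)

rng = np.random.default_rng(0)

# Synthetic readings in mg/dL (assumed: 288 readings per day for 14 days).
readings = rng.normal(loc=150.0, scale=35.0, size=288 * 14)

probs, q = quantile_representation(readings)

# Reordering the readings in time leaves the quantile function unchanged,
# so patients need not share meal times or observation windows.
shuffled = rng.permutation(readings)
_, q_shuffled = quantile_representation(shuffled)
same = np.allclose(q, q_shuffled)
```

Each patient is then represented by one monotone curve on a common probability grid, which a distributional regression model can take as the response in place of a mean or standard deviation.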