This week's Statistics Seminar Speaker will be Lei Sun from the Dalla Lana School of Public Health at the University of Toronto.
Talk Title: Multiple Hypothesis Testing and Other Statistical Issues in Large-Scale Genetic Studies
Abstract: A central issue in high-dimensional genetic studies is how to assess statistical significance while accounting for the inherent large-scale multiple hypothesis testing. To improve power, a number of studies have investigated the benefits of utilizing available genomic and biological information; however, the relative merits of different methods remain unclear. We focus on the stratified FDR control (Sun et al., 2006, Genetic Epidemiology 30:519-530) and the weighted p-value method (Genovese et al., 2006, Biometrika 93:509-524). The two approaches model the prior information differently. The weighted p-value approach converts the available prior information into a test-specific weighting factor and adjusts the p-values accordingly. In contrast, stratified FDR divides the tests into several disjoint strata based on the prior information and applies FDR control separately in each stratum. We formulate the two approaches in one framework and show the trade-off between power and robustness through theoretical, simulation, and application studies. Robustness is consequential in applications, safeguarding against potentially uninformative or even misleading prior information. To demonstrate the practical relevance of these methods, I discuss two recent genome-wide association studies of cystic fibrosis (CF) modifier genes, in which over 500,000 genetic markers are investigated for association with lung function in individuals with CF, where the available prior is quantitative in nature (Wright et al. 2011, Nature Genetics 43:539-548), and for association with meconium ileus, where the prior is categorical in nature (Sun et al. 2012, Nature Genetics 44:562-569). If time allows, I will briefly discuss additional interesting analytical challenges in these studies and in studies of other complex human traits. For example, how do we correct for the inherent overestimation of genetic effect for genetic variants discovered in whole-genome studies, and how do we simultaneously analyze multiple outcomes, such as lung function and meconium ileus in CF, to identify the so-called pleiotropic variants that are associated with both traits?
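For readers unfamiliar with the two approaches contrasted in the abstract, the following is a minimal sketch, not the speaker's implementation. It assumes the Benjamini-Hochberg step-up procedure as the underlying FDR method (the abstract does not name one) and assumes positive weights that are rescaled to average one. The function names bh_reject, weighted_bh, and stratified_bh are hypothetical and chosen only for illustration.

```python
def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up rule; returns True for each rejected test."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value falls under its BH threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


def weighted_bh(pvals, weights, alpha=0.05):
    """Weighted p-values: divide each p-value by its weight, then apply BH.

    Assumption: weights are positive; they are rescaled here to have mean 1.
    """
    mean_w = sum(weights) / len(weights)
    adjusted = [min(1.0, p * mean_w / w) for p, w in zip(pvals, weights)]
    return bh_reject(adjusted, alpha)


def stratified_bh(pvals, strata, alpha=0.05):
    """Stratified FDR: apply BH separately within each disjoint stratum."""
    rejected = [False] * len(pvals)
    for label in set(strata):
        idx = [i for i, s in enumerate(strata) if s == label]
        sub = bh_reject([pvals[i] for i in idx], alpha)
        for i, r in zip(idx, sub):
            rejected[i] = r
    return rejected
```

In this sketch, a weight above one makes a test easier to reject at the expense of the others, whereas stratification changes only the set of tests against which each p-value is ranked.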
Refreshments will be served after the seminar in 1181 Comstock Hall.