The Statistics Seminar speaker for Wednesday, Oct. 5, 2016, will be Myung Hee Lee, assistant professor of Clinical Epidemiology in Medicine at Weill Cornell Medical College.
Title: Outlier detection for high dimensional, low sample size data
Abstract: Despite the popularity of high-dimension, low-sample-size data analysis, little attention has been paid to the outlier detection problem. We propose a two-stage procedure to detect outliers in high-dimensional data. The first stage screens out a predetermined number of the most outlying points one by one, based on the distance between each data vector and the affine space generated by the remaining data. In the second stage, we test whether each of the screened observations is significantly outlying. The reference values for the significance test are based on random rotations of the data in the dual space. We show that the rotation procedure generates null data sets with the same volume as the original data, but without any outliers. High-dimensional asymptotics are used to justify the proposed remoteness measure. The proposed method shows superior performance in various simulation settings compared to alternative approaches. If time permits, I will present highlights of projects I am currently involved in at the Center for Global Health.
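For readers who want a concrete picture of the screening distance mentioned in the abstract, the following is a minimal sketch, not the speaker's implementation. It assumes the distance is the ordinary Euclidean distance from one observation to the affine hull of the others, computed by least squares; the function names distance_to_affine_hull and leave_one_out_distances are hypothetical, and the second-stage rotation test is not shown.

```python
import numpy as np

def distance_to_affine_hull(point, others):
    """Euclidean distance from `point` to the affine hull of the rows of `others`.

    Illustrative assumption: this is one plausible reading of the
    "distance to the affine space generated by the remaining data".
    """
    point = np.asarray(point, dtype=float)
    others = np.asarray(others, dtype=float)
    base = others[0]
    # Columns span the direction space of the affine hull.
    directions = (others[1:] - base).T
    target = point - base
    if directions.shape[1] == 0:
        # A single remaining point: the hull is that point itself.
        return float(np.linalg.norm(target))
    # Least squares gives the orthogonal projection onto the direction space.
    coeffs, *_ = np.linalg.lstsq(directions, target, rcond=None)
    return float(np.linalg.norm(target - directions @ coeffs))

def leave_one_out_distances(data):
    """Distance of each row of `data` to the affine hull of all other rows."""
    data = np.asarray(data, dtype=float)
    return np.array([
        distance_to_affine_hull(data[i], np.delete(data, i, axis=0))
        for i in range(len(data))
    ])
```

Under these assumptions, the observation with the largest leave-one-out distance would be the first candidate screened; removing it and repeating gives the one-by-one screening described above.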
Following the seminar, refreshments will be served in 1181 Comstock Hall.