Cornell University
Cornell Statistics and Data Science
Statistics Seminar Speaker: Abhishek Chakrabortty, 12/14/2018

Friday, Dec 14, 2018

10:30am @ G01 Biotechnology
In Statistics Seminars

The Statistics Seminar speaker for Friday, December 14, 2018, is Abhishek Chakrabortty, a postdoctoral researcher at the Department of Statistics and the DBEI, University of Pennsylvania, where he is mentored by Prof. Hongzhe Li and Prof. T. Tony Cai. Dr. Chakrabortty received his Ph.D. in Biostatistics from Harvard University, where he was advised by Prof. Tianxi Cai, and his Bachelor's and Master's degrees in Statistics from the Indian Statistical Institute, Kolkata. His research interests lie broadly at the interface of semi-parametric inference, high-dimensional statistics, and statistical learning in semi-supervised or weakly supervised settings, with applications to the analysis of large and complex observational datasets arising in modern biomedical studies.

Talk: Semi-Supervised Inference with Large and High Dimensional Data: A Semi-Parametric Perspective

Abstract: The abundance of large and complex datasets in the current big data era has created a host of novel statistical challenges for properly harnessing such rich (but often incomplete) information. One such challenge is statistical inference in semi-supervised (SS) settings, where, apart from a moderate-sized supervised dataset (L), one also has a much larger unsupervised dataset (U) available. Such datasets arise naturally when the response, unlike the covariates, is difficult and/or expensive to obtain, a frequent scenario in modern studies involving large databases, including biomedical data such as electronic health records (EHR). It is natural to investigate whether and how the information from U can be exploited to improve efficiency over a given supervised approach.

In this talk, I will consider SS inference for a class of standard Z-estimation problems. I will first discuss the subtleties and associated challenges that necessitate a semi-parametric perspective. I will then demonstrate a family of SS Z-estimators that are robust and adaptive, thus ensuring that they are always at least as efficient as the supervised estimator, and more efficient (optimal in some cases) when the information from U actually relates to the parameter of interest. These properties are crucial for advocating ‘safe’ use of unlabeled data and are often unaddressed. Our framework provides a much-needed unified understanding of these problems. Multiple EHR data applications are also presented to exhibit the practical benefits of our estimator. In the later part of the talk, I will consider SS inference in high-dimensional settings and demonstrate the remarkable benefits that unlabeled data provide in seamlessly obtaining a family of SS estimators with asymptotic linear expansions, without directly requiring the sparsity conditions or debiasing needed in supervised settings. This, in particular, facilitates high-dimensional inference under minimal assumptions.

© Cornell University Department of Statistics and Data Science

1198 Comstock Hall, 129 Garden Ave., Ithaca, NY 14853