Cornell Department of Statistics and Data Science

Statistics Seminar Speaker: Pratik Patil

Friday, February 21, 2025
4:15pm @ G01 Biotech
In Statistics Seminars

Pratik Patil is a postdoctoral researcher in Statistics at the University of California, Berkeley. He obtained his PhD in Statistics and Machine Learning from Carnegie Mellon University. His research spans a range of topics at the intersection of statistical machine learning, optimization, and information theory. Much of his recent work focuses on the statistical analysis of machine learning methods in the overparameterized regime, such as bagging, sketching, cross-validation, and model tuning, drawing upon tools from statistical physics and random matrix theory. More details can be found at: https://pratikpatil.io/.

Talk: Facets of regularization in overparameterized machine learning

Abstract: Modern machine learning often operates in an overparameterized regime in which the number of parameters far exceeds the number of observations. In this regime, models can exhibit surprising generalization behaviors: (1) Models can overfit with zero training error yet still generalize well (benign overfitting); furthermore, in some cases, even when explicit regularization is added and tuned, the optimal choice turns out to be no regularization at all (obligatory overfitting). (2) The generalization error can vary non-monotonically with the model or sample size (double/multiple descent). These behaviors challenge classical notions of overfitting and the role of explicit regularization.

In this talk, I will present theoretical and methodological results related to these behaviors, primarily focusing on the concrete case of ridge regularization. First, I will identify conditions under which the optimal ridge penalty is zero (or even negative) and show that standard techniques such as leave-one-out and generalized cross-validation, when analytically continued, remain uniformly consistent for the generalization error and thus yield the optimal penalty, whether positive, negative, or zero. Second, I will introduce a general framework to mitigate double/multiple descent in the sample size based on subsampling and ensembling and show its intriguing connection to ridge regularization. As an implication of this connection, I will show that the generalization error of optimally tuned ridge regression is monotonic in the sample size (under mild data assumptions) and mitigates double/multiple descent. Key to both parts is the role of implicit regularization, either self-induced by the overparameterized data or externally induced by subsampling and ensembling. Finally, I will briefly mention some extensions and variants beyond ridge regularization.
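As a concrete companion to the cross-validation discussion above, the following minimal sketch (my own illustration, not code from the talk) computes the standard generalized cross-validation (GCV) criterion for ridge regression over a grid of positive penalties in an overparameterized setting, using GCV(λ) = (1/n)‖y − S_λ y‖² / (1 − tr(S_λ)/n)² with smoother S_λ = X(XᵀX + λI)⁻¹Xᵀ.

```python
import numpy as np

rng = np.random.default_rng(1)

# Overparameterized ridge regression: p > n.
n, p = 80, 200
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta + 0.3 * rng.standard_normal(n)

def gcv(lam):
    # Use the n x n Gram matrix: by the push-through identity,
    # S_lam = X (X'X + lam I)^{-1} X' = G (G + lam I)^{-1} with G = XX',
    # which keeps the computation cheap when p > n.
    G = X @ X.T
    S = G @ np.linalg.inv(G + lam * np.eye(n))
    resid = y - S @ y
    df = np.trace(S)  # effective degrees of freedom
    return (resid @ resid / n) / (1 - df / n) ** 2

grid = np.logspace(-2, 2, 25)
lam_hat = min(grid, key=gcv)  # GCV-selected ridge penalty on this grid
```

The talk's results go beyond this standard recipe: analytically continuing such criteria allows the penalty search to extend to zero and even negative values, where this naive grid search does not apply.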

The talk will feature joint work with the following collaborators (in surname-alphabetical order): Pierre Bellec, Jin-Hong Du, Takuya Koriyama, Arun Kumar Kuchibhotla, Alessandro Rinaldo, Kai Tan, Ryan Tibshirani, Yuting Wei. The corresponding papers (in talk-chronological order) are: optimal ridge landscape (https://pratikpatil.io/papers/ridge-ood.pdf), ridge cross-validation (https://pratikpatil.io/papers/functionals-combined.pdf), risk monotonization (https://pratikpatil.io/papers/risk-monotonization.pdf), ridge equivalences (https://pratikpatil.io/papers/generalized-equivalences.pdf), and extensions and variants (https://pratikpatil.io/papers/cgcv.pdf, https://pratikpatil.io/papers/subagging-asymptotics.pdf).

Event Categories

  • Statistics Seminars
  • Special Events
