A Novel Test for Additivity in Supervised Ensemble Learners
Lucas Mentch, Giles Hooker
(Submitted on 7 Jun 2014 (v1), last revised 11 Nov 2014 (this version, v2))
Additive models remain popular statistical tools because they are easy to interpret, and hypothesis tests for additivity have consequently been developed to assess whether such models are appropriate. As data grow in size and complexity, however, learning algorithms continue to gain popularity due to their exceptional predictive performance. Because these learning methods are black boxes, the increase in predictive power is generally assumed to come at the cost of interpretability and inference. Recent work nonetheless suggests that many popular learning techniques, such as bagged trees and random forests, have desirable asymptotic properties that allow for formal statistical inference when base learners are built with proper subsamples. This work extends previously developed hypothesis tests and demonstrates that, by enforcing a grid structure on an appropriate test set, we may perform formal hypothesis tests for additivity among features. We develop notions of total and partial additivity and demonstrate that both tests can be carried out at no additional computational cost. We also suggest a new testing procedure based on random projections that allows for testing on larger grids, even when the grid size is larger than that of the training set. Simulations and demonstrations on real data are provided.
Subjects: Machine Learning (stat.ML); Applications (stat.AP)
Cite as: arXiv:1406.1845 [stat.ML] (or arXiv:1406.1845v2 [stat.ML] for this version)
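The role of the grid structure mentioned in the abstract can be illustrated with a small sketch. It is not the test statistic from the paper; it only shows why predictions arranged on a grid expose non-additivity. If a function of two features is additive, F(x1, x2) = f1(x1) + f2(x2), then double-centering its values on a grid (subtracting row and column means and adding back the grand mean) leaves all zeros, while an interaction leaves nonzero residuals. The helper name `interaction_residuals`, the grid sizes, and the two toy functions are assumptions made for illustration, and the sketch ignores the sampling variability that a formal test would have to account for.

```python
import numpy as np


def interaction_residuals(grid_values):
    """Double-center an (n1 x n2) array of function values on a grid.

    Hypothetical helper for illustration: if the values are additive,
    F(x1_i, x2_j) = mu + a_i + b_j, every residual is zero.
    """
    g = np.asarray(grid_values, dtype=float)
    row_means = g.mean(axis=1, keepdims=True)  # average over x2 for each x1
    col_means = g.mean(axis=0, keepdims=True)  # average over x1 for each x2
    grand_mean = g.mean()
    return g - row_means - col_means + grand_mean


# Assumed toy grid: 4 values of x1 crossed with 5 values of x2.
x1 = np.linspace(0.0, 1.0, 4)
x2 = np.linspace(0.0, 1.0, 5)
X1, X2 = np.meshgrid(x1, x2, indexing="ij")

# Stand-ins for ensemble predictions on the grid (noise-free here).
additive = np.sin(X1) + X2 ** 2   # no interaction between x1 and x2
interacting = X1 * X2             # pure interaction term

additive_resid = interaction_residuals(additive)
interacting_resid = interaction_residuals(interacting)

print("max |residual|, additive:   ", np.abs(additive_resid).max())
print("max |residual|, interacting:", np.abs(interacting_resid).max())
```

In practice the grid values would be noisy predictions from a fitted ensemble, so the residuals would not be exactly zero even under additivity; deciding whether they are significantly different from zero is the job of a formal hypothesis test such as the one the abstract describes.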