Review for NeurIPS paper: PAC-Bayes Learning Bounds for Sample-Dependent Priors

NeurIPS 2020

Meta Review

This paper derives new PAC-Bayesian risk bounds for sample-dependent priors, that is, priors that depend on the training data, which the classical PAC-Bayes setting does not allow. There has been a recent surge of papers on informed priors, and this paper takes an interesting step in this direction. Notably, it improves on recent work by Foster et al. (2019) on "hypothesis set stability," which incorporates ideas from transductive Rademacher complexity. The resulting risk bound makes no assumptions about the prior; the prior is characterized by covering numbers. The paper applies the bound to a family of sample-dependent priors that obeys a sensitivity condition (similar to algorithmic stability but designed for distributions).

PAC-Bayes with informed priors has had a resurgence because it has been shown to yield non-vacuous risk bounds for neural networks (see, e.g., work by Dziugaite & Roy). Thus, the paper addresses a relevant, timely problem. The reviewers agree that the paper is solid; the results are new, interesting, and presented well.

The main criticism is that the paper lacks an empirical investigation. The reason for using PAC-Bayes and data-dependent priors is to obtain bounds that are meaningful in practice. Papers on generalization bounds (especially for deep learning) have started to include experiments to demonstrate that the bounds are (a) non-vacuous and (b) tighter than existing bounds. The absence of an empirical study is a significant gap here; including one would have taken the paper to another level.

I strongly encourage the authors to incorporate the reviewers' feedback into the paper, especially the feedback from R7.