NeurIPS 2020

Wisdom of the Ensemble: Improving Consistency of Deep Learning Models

Meta Review

The authors define notions of consistency and correct-consistency for ensembles and show that these can be used for dynamic ensemble pruning. The ensembles considered are built from neural networks using snapshots taken during training. I agree with the authors that the main concern of Reviewer 1, the only strong voice for rejection, is somewhat too harsh, since the same criticism applies to many, if not all, other ensemble properties investigated in the past. Ensembles do not perform better than their single best member in all cases (that would likely violate the no-free-lunch theorem), but they often do in practice.

The paper also introduces a dedicated ensemble learning method for neural networks, based on "extended bagging" ideas and a cyclical learning rate schedule. The novel contribution is a pruning criterion based on a formula with a single parameter, beta, for which the theory suggests a specific value. At least in the reported CIFAR-100 experiment, this suggested value of beta is shown to perform well.

In summary, taking all of the author feedback into account, I recommend accepting this submission.