Reviews: Online EXP3 Learning in Adversarial Bandits with Delayed Feedback

This paper should be accepted. The initial reviews were underwhelming; in particular, there was confusion about the tuning of the learning rate and the inner workings of the doubling trick. The author rebuttal cleared this up. The reviewers are now convinced of the paper's technical merit and favor acceptance. Given the interest in delays and the recent surge in game-theoretic equilibrium arguments and constructions used to tackle a variety of problems, I think this paper will interest NeurIPS participants.

Paper ID:	6063
Title:	Online EXP3 Learning in Adversarial Bandits with Delayed Feedback