Optimal Algorithms for Stochastic Multi-Armed Bandits with Heavy Tailed Rewards

Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020)

Authors

Kyungjae Lee, Hongjun Yang, Sungbin Lim, Songhwai Oh

Abstract

In this paper, we consider stochastic multi-armed bandits (MABs) with heavy-tailed rewards, whose p-th moment is bounded by a constant nu_p for 1