Part of Advances in Neural Information Processing Systems 36 (NeurIPS 2023) Main Conference Track
David Woodruff, Fred Zhang, Samson Zhou
In the online learning with experts problem, an algorithm makes predictions about an outcome on each of $T$ days, given a set of $n$ experts who make predictions on each day. The algorithm is given feedback on the outcomes of each day, including the cost of its prediction and the cost of the expert predictions, and the goal is to make a prediction with the minimum cost, compared to the best expert in hindsight. However, often the predictions made by experts or algorithms at some time influence future outcomes, so that the input is adaptively generated. In this paper, we study robust algorithms for the experts problem under memory constraints. We first give a randomized algorithm that is robust to adaptive inputs that uses $\widetilde{O}\left(\frac{n}{R\sqrt{T}}\right)$ space for $M=O\left(\frac{R^2 T}{\log^2 n}\right)$, thereby showing a smooth space-regret trade-off. We then show a space lower bound of $\widetilde{\Omega}\left(\frac{nM}{RT}\right)$ for any randomized algorithm that achieves regret $R$ with probability $1-2^{-\Omega(T)}$, when the best expert makes $M$ mistakes. Our result implies that the natural deterministic algorithm, which iterates through pools of experts until each expert in the pool has erred, is optimal up to polylogarithmic factors. Finally, we empirically demonstrate the benefit of using robust procedures against a white-box adversary that has access to the internal state of the algorithm.