Mixability made efficient: Fast online multiclass logistic regression

Part of Advances in Neural Information Processing Systems 34 (NeurIPS 2021)

Bibtex Paper Reviews And Public Comment » Supplemental

Authors

Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi

Abstract

Mixability has been shown to be a powerful tool to obtain algorithms with optimal regret. However, the resulting methods often suffer from high computational complexity which has reduced their practical applicability. For example, in the case of multiclass logistic regression, the aggregating forecaster (see Foster et al. 2018) achieves a regret of $O(\log(Bn))$ whereas Online Newton Step achieves $O(e^B\log(n))$ obtaining a double exponential gain in $B$ (a bound on the norm of comparative functions). However, this high statistical performance is at the price of a prohibitive computational complexity $O(n^{37})$.In this paper, we use quadratic surrogates to make aggregating forecasters more efficient. We show that the resulting algorithm has still high statistical performance for a large class of losses. In particular, we derive an algorithm for multiclass regression with a regret bounded by $O(B\log(n))$ and computational complexity of only $O(n^4)$.