Part of Advances in Neural Information Processing Systems 14 (NIPS 2001)

*Gunnar Rätsch, Sebastian Mika, Manfred K. K. Warmuth*

We give a unified convergence analysis of ensemble learning methods including, e.g., AdaBoost, Logistic Regression and the Least-Square-Boost algorithm for regression. These methods have in common that they iteratively call a base learning algorithm which returns hypotheses that are then linearly combined. We show that these methods are related to the Gauss-Southwell method known from numerical optimization and state non-asymptotic convergence results for all these methods. Our analysis includes $\ell_1$-norm regularized cost functions, leading to a clean and general way to regularize ensemble learning.

1 Introduction

We show convergence rates of ensemble learning methods such as AdaBoost [10], Logistic Regression (LR) [11, 5] and the Least-Square (LS) regression algorithm called LS-Boost [12]. These algorithms have in common that they iteratively call a base learning algorithm $L$ (also called the weak learner) on a weighted training sample. The base learner is expected to return in each iteration $t$ a hypothesis $\hat h_t$ from some set $H$ of weak hypotheses that has small weighted training error: the weighted number of false predictions in classification, and the weighted estimation error in regression. These hypotheses are then linearly combined to form the final hypothesis $\hat f(x) = \sum_t \alpha_t \hat h_t(x)$, where the $\alpha_t$ are the hypothesis coefficients.
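The iterative template just described can be sketched in code. The following is a minimal illustrative implementation of AdaBoost with decision stumps as the base learner (the function names, the stump base learner, and all details beyond the generic reweight-and-combine loop are assumptions for illustration, not the paper's algorithm):

```python
import numpy as np

def best_stump(X, y, w):
    # Base learner L: exhaustive search for the decision stump
    # (feature, threshold, sign) with smallest weighted training error.
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for s in (+1, -1):
                pred = s * np.where(X[:, j] <= thr, 1, -1)
                err = np.sum(w[pred != y])
                if err < best_err:
                    best, best_err = (j, thr, s), err
    return best

def stump_predict(stump, X):
    j, thr, s = stump
    return s * np.where(X[:, j] <= thr, 1, -1)

def adaboost(X, y, T=10):
    """Run T boosting iterations; y must take values in {-1, +1}.
    Returns the ensemble as a list of (alpha_t, h_t) pairs."""
    n = len(y)
    w = np.full(n, 1.0 / n)              # weights on the training sample
    ensemble = []
    for t in range(T):
        h = best_stump(X, y, w)          # call base learner on weighted sample
        pred = stump_predict(h, X)
        err = np.clip(np.sum(w[pred != y]), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)  # hypothesis coefficient
        ensemble.append((alpha, h))
        w *= np.exp(-alpha * y * pred)   # reweight: emphasize mistakes
        w /= w.sum()
    return ensemble

def predict(ensemble, X):
    # Final hypothesis: sign of the linear combination sum_t alpha_t h_t(x).
    F = sum(a * stump_predict(h, X) for a, h in ensemble)
    return np.sign(F)
```

Logistic Regression and LS-Boost follow the same loop; only the cost function driving the sample weights and the coefficients $\alpha_t$ changes.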
