Part of Advances in Neural Information Processing Systems 7 (NIPS 1994)
Anders Krogh, Jesper Vedelsby
Learning of continuous valued functions using neural network ensembles (committees) can give improved accuracy, reliable estimation of the generalization error, and active learning. The ambiguity is defined as the variation of the output of ensemble members averaged over unlabeled data, so it quantifies the disagreement among the networks. It is discussed how to use the ambiguity in combination with cross-validation to give a reliable estimate of the ensemble generalization error, and how this type of ensemble cross-validation can sometimes improve performance. It is shown how to estimate the optimal weights of the ensemble members using unlabeled data. By a generalization of query by committee, it is finally shown how the ambiguity can be used to select new training data to be labeled in an active learning scheme.
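To make the notion of ambiguity concrete, the following is a minimal sketch in plain Python. The abstract does not state a formula, so the sketch assumes that "variation" means the weighted variance of the member outputs around the weighted ensemble output, with weights that sum to one. The function names, the toy predictions, and the uniform weights are all hypothetical and chosen only for illustration.

```python
def ensemble_mean(outputs, weights):
    """Weighted ensemble output for one input (weights assumed to sum to 1)."""
    return sum(w * v for w, v in zip(weights, outputs))


def ambiguity(outputs, weights):
    """Weighted variance of member outputs around the ensemble output.

    This is an assumed reading of "variation of the output of ensemble
    members"; it needs no target value, only the members' outputs.
    """
    mean = ensemble_mean(outputs, weights)
    return sum(w * (v - mean) ** 2 for w, v in zip(weights, outputs))


def mean_ambiguity(predictions, weights):
    """Average ambiguity over a set of unlabeled inputs.

    predictions[i][k] is the output of member k on unlabeled input i.
    """
    return sum(ambiguity(row, weights) for row in predictions) / len(predictions)


# Toy data: three hypothetical members evaluated on four unlabeled inputs.
predictions = [
    [1.0, 1.2, 0.8],
    [0.0, 0.5, -0.5],
    [2.0, 2.0, 2.0],   # full agreement, so this input contributes zero
    [1.5, 0.5, 1.0],
]
weights = [1 / 3, 1 / 3, 1 / 3]

print(mean_ambiguity(predictions, weights))
```

Under this assumed definition, the squared error of the ensemble at any input equals the weighted average of the members' squared errors minus the ambiguity at that input. This is an algebraic identity, and it suggests why a quantity computed from unlabeled data alone can contribute to an estimate of the ensemble generalization error.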