Hans Thodberg
MacKay's Bayesian framework for backpropagation is conceptually appealing as well as practical. It automatically adjusts the weight decay parameters during training and computes the evidence for each trained network. The evidence is proportional to our belief in the model. The networks with the highest evidence turn out to generalise well. In this paper, the framework is extended to pruned nets, leading to an Ockham Factor for "tuning the architecture to the data". A committee of networks, selected for their high evidence, is a natural Bayesian construction. The evidence of a committee is computed. The framework is illustrated on real-world data from a near-infrared spectrometer used to determine the fat content in minced meat. Error bars are computed, including the contribution from the dissent of the committee members.
1 THE OCKHAM FACTOR
William of Ockham's (1285-1349) principle of economy in explanations can be formulated as follows:
If several theories account for a phenomenon, we should prefer the simplest one that describes the data sufficiently well.
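As a brief sketch of how this principle becomes quantitative in MacKay's framework (the notation below is illustrative rather than the paper's own), the evidence for a model H factorises approximately into a best-fit likelihood times an Ockham factor, the ratio of the posterior-accessible to prior-accessible volume in weight space:

\[
P(D \mid \mathcal{H}) \;\approx\;
\underbrace{P(D \mid \mathbf{w}_{\mathrm{MP}}, \mathcal{H})}_{\text{best-fit likelihood}}
\;\times\;
\underbrace{\frac{\sigma_{\mathbf{w} \mid D}}{\sigma_{\mathbf{w}}}}_{\text{Ockham factor}}
\]

Here \( \mathbf{w}_{\mathrm{MP}} \) denotes the most probable weights, \( \sigma_{\mathbf{w} \mid D} \) the width of the posterior over the weights, and \( \sigma_{\mathbf{w}} \) the width of the prior. A model that must fine-tune its parameters to fit the data has a small Ockham factor and therefore lower evidence, so complex models are penalised automatically.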