Part of Advances in Neural Information Processing Systems 8 (NIPS 1995)
Steve Waterhouse, Anthony Robinson
We present two additions to the hierarchical mixture of experts (HME) architecture. First, by applying a likelihood splitting criterion to each expert in the HME we "grow" the tree adaptively during training. Second, by considering only the most probable path through the tree we may "prune" branches away, either temporarily, or permanently if they become redundant. We demonstrate results for the growing and path-pruning algorithms which show significant speed-ups and more efficient use of parameters over the standard fixed structure in discriminating between two interlocking spirals and classifying 8-bit parity patterns.
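To make the path-pruning idea concrete, the following is a minimal sketch (not the authors' code) of a two-level binary HME in which each expert is a linear model and each gate is a logistic function of the input. In the standard HME the output mixes all experts weighted by the gate probabilities; under path pruning, evaluation descends only the most probable branch at each gate. All class and function names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class Expert:
    """Leaf of the HME tree: a simple linear model (illustrative choice)."""
    def __init__(self, dim):
        self.w = rng.standard_normal(dim)

    def predict(self, x):
        return self.w @ x

class Gate:
    """Internal node: a logistic gate over two subtrees."""
    def __init__(self, dim, left, right):
        self.v = rng.standard_normal(dim)
        self.left, self.right = left, right

    def predict_full(self, x):
        # Standard HME evaluation: mixture over both branches.
        g = sigmoid(self.v @ x)
        return g * _predict(self.left, x, full=True) + \
               (1 - g) * _predict(self.right, x, full=True)

    def predict_pruned(self, x):
        # Path pruning: follow only the most probable branch,
        # skipping the evaluation of the other subtree entirely.
        g = sigmoid(self.v @ x)
        child = self.left if g >= 0.5 else self.right
        return _predict(child, x, full=False)

def _predict(node, x, full):
    """Dispatch helper: experts predict directly; gates recurse."""
    if isinstance(node, Expert):
        return node.predict(x)
    return node.predict_full(x) if full else node.predict_pruned(x)
```

Because the full HME output is a convex combination of the leaf predictions, the pruned output (one leaf along the argmax path) approximates it at a fraction of the evaluation cost; the saving grows with tree depth.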