Part of Advances in Neural Information Processing Systems 9 (NIPS 1996)
Graham Brightwell, Claire Kenyon, Hélène Paugam-Moisy
We study the number of hidden layers required by a multilayer neu(cid:173) ral network with threshold units to compute a function f from n d to {O, I}. In dimension d = 2, Gibson characterized the functions computable with just one hidden layer, under the assumption that there is no "multiple intersection point" and that f is only defined on a compact set. We consider the restriction of f to the neighbor(cid:173) hood of a multiple intersection point or of infinity, and give neces(cid:173) sary and sufficient conditions for it to be locally computable with one hidden layer. We show that adding these conditions to Gib(cid:173) son's assumptions is not sufficient to ensure global computability with one hidden layer, by exhibiting a new non-local configuration, the "critical cycle", which implies that f is not computable with one hidden layer.