The Ni1000: High Speed Parallel VLSI for Implementing Multilayer Perceptrons

Part of Advances in Neural Information Processing Systems 7 (NIPS 1994)


Authors

Michael Perrone, Leon Cooper

Abstract

In this paper we present a new version of the standard multilayer perceptron (MLP) algorithm for the state-of-the-art in neural network VLSI implementations: the Intel Ni1000. This new version of the MLP uses a fundamental property of high dimensional spaces which allows the l2-norm to be accurately approximated by the l1-norm. This approach enables the standard MLP to utilize the parallel architecture of the Ni1000 to achieve on the order of 40,000 256-dimensional classifications per second.

1 The Intel Ni1000 VLSI Chip

The Nestor/Intel radial basis function neural chip (Ni1000) contains the equivalent of 1024 256-dimensional artificial digital neurons and can perform at least 40,000 classifications per second [Sullivan, 1993]. To attain this great speed, the Ni1000 was designed to calculate "city block" distances (i.e. the l1-norm) and thus to avoid the large number of multiplication units that would be required to calculate Euclidean dot products in parallel. Each neuron calculates the city block distance between its stored weights and the current input:

$$\text{neuron activity} = \sum_{i} |w_i - x_i| \qquad (1)$$

where $w_i$ is the neuron's stored weight for the $i$th input and $x_i$ is the $i$th input. Thus the Ni1000 is ideally suited to perform both the RCE [Reilly et al., 1982] and PRCE [Scofield et al., 1987] algorithms or any of the other commonly used radial basis function (RBF) algorithms.
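As an illustrative sketch (not the Ni1000's actual microcode), the per-neuron computation of Equation (1) can be written in NumPy; the array shapes below mirror the chip's 1024 neurons and 256-dimensional inputs:

```python
import numpy as np

# Sketch of Equation (1): city block (l1) distances between each
# neuron's stored weights and the current input pattern.
# Shapes mirror the Ni1000: 1024 neurons, 256-dimensional inputs.
rng = np.random.default_rng(0)
W = rng.random((1024, 256))   # stored weights, one row per neuron
x = rng.random(256)           # current input pattern

# Each neuron's activity: sum_i |w_i - x_i|, computed here for all
# neurons at once (the chip performs this in parallel in hardware).
activities = np.abs(W - x).sum(axis=1)
print(activities.shape)       # (1024,)
```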



However, dot products are central to the calculations performed by most neural network algorithms (e.g. MLP, Cascade Correlation, etc.). Furthermore, for high dimensional data, the dot product becomes the computational bottleneck (i.e. most of the network's time is spent calculating dot products). If the dot product cannot be performed in parallel, there is little advantage to using the Ni1000 for such algorithms. In this paper, we address this problem by showing that we can extend the Ni1000 to many of the standard neural network algorithms by representing the Euclidean dot product as a function of Euclidean norms and by then using a city block norm approximation to the Euclidean norm. Section 2 introduces the approximate dot product; Section 3 describes the City Block MLP, which uses the approximate dot product; and Section 4 presents experiments which demonstrate that the City Block MLP performs well on the NIST OCR data and on human face recognition data.
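The representation alluded to here is presumably the standard polarization identity, which expresses the Euclidean dot product entirely in terms of Euclidean norms:

$$\vec{w} \cdot \vec{x} = \frac{1}{2}\left( \|\vec{w}\|_2^2 + \|\vec{x}\|_2^2 - \|\vec{w} - \vec{x}\|_2^2 \right)$$

Substituting a city block estimate for each Euclidean norm then yields a dot product built entirely from the l1 distances that the chip computes natively.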

2 Approximate Dot Product

Consider the following approximation [Perrone, 1993]:

$$\|\vec{x}\|_2 \approx \sqrt{\frac{\pi}{2n}} \sum_{i=1}^{n} |x_i| \qquad (2)$$

for high dimensional vectors $\vec{x} \in \mathbb{R}^n$ with approximately independent, identically distributed components.
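A quick numerical check (an illustrative sketch, not from the paper) shows how tight this l1-based estimate of the Euclidean norm becomes at the Ni1000's working dimensionality of n = 256, assuming roughly i.i.d. components:

```python
import numpy as np

# Numerical check of the l1 -> l2 approximation in Equation (2)
# at n = 256, the Ni1000's input dimensionality. Components are
# assumed roughly i.i.d. (here: standard Gaussian draws).
rng = np.random.default_rng(1)
n, trials = 256, 10000
X = rng.standard_normal((trials, n))

l2 = np.linalg.norm(X, axis=1)                             # true Euclidean norm
l1_est = np.sqrt(np.pi / (2 * n)) * np.abs(X).sum(axis=1)  # Equation (2)

rel_err = np.abs(l1_est - l2) / l2
print(f"mean relative error: {rel_err.mean():.4f}")
print(f"max relative error:  {rel_err.max():.4f}")
```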