{"title": "The Hopfield Model with Multi-Level Neurons", "book": "Neural Information Processing Systems", "page_first": 278, "page_last": 289, "abstract": null, "full_text": "278 \n\nTHE HOPFIELD MODEL WITH MUL TI-LEVEL NEURONS \n\nMichael Fleisher \n\nDepartment of Electrical Engineering \n\nTechnion - Israel Institute of Technology \n\nHaifa 32000, Israel \n\nABSTRACT \n\nThe Hopfield neural network. model for associative memory is generalized. The generalization \n\nreplaces two state neurons by neurons taking a richer set of values. Two classes of neuron input output \n\nrelations are developed guaranteeing convergence to stable states. The first is a class of \"continuous\" rela-\n\ntions and the second is a class of allowed quantization rules for the neurons. The information capacity for \n\nnetworks from the second class is fOWld to be of order N 3 bits for a network with N neurons. \n\nA generalization of the sum of outer products learning rule is developed and investigated as well. \n\n\u00a9 American Institute of Physics 1988 \n\n\f279 \n\nI. INTRODUCTION \n\nThe ability to perfonn collective computation in a distributed system of flexible structure without \n\nglobal synchronization is an important engineering objective. Hopfield's neural network [1] is such a \n\nmodel of associative content addressable memory. \n\nAn important property of the Hopfield neural network is its guaranteed convergence to stable states \n\n(interpreted as the stored memories). In this work we introduce a generalization of the Hopfield model by \n\nallowing the outputs of the neurons to take a richer set of values than Hopfield's original binary neurons. \n\nSufficient conditions for preserving the convergence property are developed for the neuron input output \n\nrelations. Two classes of relations are obtained. The first introduces neurons which simulate multi thres-\n\nhold functions, networks with such neurons will be called quantized neural networks (Q.N.N.). 
The second class introduces continuous neuron input-output relations; networks with such neurons will be called continuous neural networks (C.N.N.).

In Section II we introduce Hopfield's neural network and show its convergence property. C.N.N. are introduced in Section III, where a sufficient condition on the continuous neuron input-output relations is developed that preserves convergence. In Section IV, Q.N.N. are introduced and their input-output relations are analyzed in the same manner as in Section III. In Section V we look further at Q.N.N., using the definition of information capacity for neural networks of [2] to obtain a tight asymptotic estimate of the capacity of a Q.N.N. with $N$ neurons. Section VI presents a generalized sum-of-outer-products learning rule for the Q.N.N., and Section VII is the discussion.

II. THE HOPFIELD NEURAL NETWORK

A neural network consists of $N$ pairwise connected neurons. The $i$'th neuron can be in one of two states: $X_i = -1$ or $X_i = +1$. The connections are fixed real numbers denoted by $W_{ij}$ (the connection from neuron $i$ to neuron $j$). Define the state vector $X$ to be a binary vector whose $i$'th component corresponds to the state of the $i$'th neuron. Randomly and asynchronously, each neuron examines its input and decides its next output in the following manner. Let $t_i$ be the threshold voltage of the $i$'th neuron. If the weighted sum of the present other $N-1$ neuron outputs (which compose the $i$'th neuron input) is greater than or equal to $t_i$, the next $X_i$ (denoted $X_i^+$) is $+1$; if not, $X_i^+$ is $-1$. This action is given in (1):

$$X_i^+ = \operatorname{sgn}\Big[ \sum_{j=1}^{N} W_{ij} X_j - t_i \Big] \qquad (1)$$

We give the following theorem.

Theorem 1 (of [1])

The network described, with symmetric ($W_{ij} = W_{ji}$), zero-diagonal ($W_{ii} = 0$) connection matrix $W$, has the convergence property. 
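As a numerical illustration (not part of the original derivation), the sketch below runs the asynchronous update (1) on a random symmetric, zero-diagonal $W$ and checks that the energy (2), defined next, never increases; the network size, seed, and iteration count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20

# hypothetical random symmetric, zero-diagonal connection matrix and thresholds
A = rng.standard_normal((N, N))
W = (A + A.T) / 2
np.fill_diagonal(W, 0.0)
t = rng.standard_normal(N)

def energy(X):
    # E(X) = -1/2 sum_{i,j} W_ij X_i X_j + sum_i t_i X_i   (Eq. (2))
    return -0.5 * X @ W @ X + t @ X

X = rng.choice([-1.0, 1.0], size=N)
E0 = E_prev = energy(X)
for _ in range(5000):
    k = rng.integers(N)                        # random asynchronous neuron choice
    X[k] = 1.0 if W[k] @ X >= t[k] else -1.0   # update rule (1); W_kk = 0
    E_now = energy(X)
    assert E_now <= E_prev + 1e-9              # energy never increases
    E_prev = E_now
print("energy decreased monotonically")
```

After enough updates the state typically sits in a local minimum of the energy, i.e. a fixed point of (1).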
Define the quantity

$$E(X) = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} W_{ij} X_i X_j + \sum_{i=1}^{N} t_i X_i \qquad (2)$$

We show that $E(X)$ can only decrease as a result of the action of the network. Suppose that $X_k$ changed to $X_k^+ = X_k + \Delta X_k$; the resulting change in $E$ is given by

$$\Delta E = -\Delta X_k \Big( \sum_{j=1}^{N} W_{kj} X_j - t_k \Big) \qquad (3)$$

(Eq. (3) is correct because of the restrictions on $W$.) The term in brackets is exactly the argument of the sgn function in (1), and therefore the signs of $\Delta X_k$ and the term in brackets are the same (or $\Delta X_k = 0$), and we get $\Delta E \le 0$. Combining this with the fact that $E(X)$ is bounded shows that eventually the network will remain in a local minimum of $E(X)$. This completes the proof.

The technique used in the proof of Theorem 1 is an important tool in analyzing neural networks. A network with a particular underlying $E(X)$ function can be used to solve optimization problems with $E(X)$ as the object of optimization. Thus we see another use of neural networks.

III. THE C.N.N.

We ask ourselves the following question: how can we change the sgn function in (1) without affecting the convergence property? The new action rule for the $i$'th neuron is

$$X_i^+ = f_i \Big[ \sum_{j=1}^{N} W_{ij} X_j \Big] \qquad (4)$$

Our attention is focused on possible choices for $f_i(\cdot)$. The following theorem gives a part of the answer.

Theorem 2

The network described by (4) (with symmetric, zero-diagonal $W$) has the convergence property if the $f_i(\cdot)$ are strictly increasing and bounded.

Define the energy $E(X)$ as in (5) below. We show, as before, that $E(X)$ can only decrease, and since $E$ is bounded (because of the boundedness of the $f_i$'s), the theorem is proved. 
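Theorem 2 can be illustrated with a concrete choice (ours, not the paper's): take every $f_i$ to be tanh, which is strictly increasing and bounded; then $g$ below is the closed-form antiderivative of $f_i^{-1} = \operatorname{arctanh}$ appearing in the energy (5), and the energy is checked to be non-increasing under the asynchronous rule (4). Sizes and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 15
A = rng.standard_normal((N, N))
W = (A + A.T) / 2          # symmetric
np.fill_diagonal(W, 0.0)   # zero diagonal

f = np.tanh                # strictly increasing and bounded, as Theorem 2 requires

def g(x):
    # g(x) = integral_0^x arctanh(u) du  (antiderivative of f^{-1})
    return x * np.arctanh(x) + 0.5 * np.log1p(-x * x)

def energy(X):
    # E(X) = -1/2 sum_{i,j} W_ij X_i X_j + sum_i g(X_i)   (Eq. (5))
    return -0.5 * X @ W @ X + np.sum(g(X))

X = f(rng.standard_normal(N))   # start inside the open interval (-1, 1)
E0 = E_prev = energy(X)
for _ in range(5000):
    i = rng.integers(N)
    X[i] = f(W[i] @ X)          # action rule (4), applied asynchronously
    E_now = energy(X)
    assert E_now <= E_prev + 1e-9
    E_prev = E_now
print("energy decreased monotonically")
```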
Explicitly,

$$E(X) = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} W_{ij} X_i X_j + \sum_{i=1}^{N} g_i(X_i) \qquad (5)$$

Using $g_i(X_i) = \int_0^{X_i} f_i^{-1}(u)\,du$, the change in $E$ when $X_k$ changes to $X_k + \Delta X_k$ is $\Delta E = g_k(X_k + \Delta X_k) - g_k(X_k) - \Delta X_k \sum_{j=1}^{N} W_{kj} X_j$. Using the intermediate value theorem we get

$$\Delta E = -\Delta X_k \Big[ \sum_{j=1}^{N} W_{kj} X_j - f_k^{-1}(C) \Big] \qquad (6)$$

where $C$ is a point between $X_k$ and $X_k + \Delta X_k$. Now, if $\Delta X_k > 0$ we have $C \le X_k + \Delta X_k$, so $f_k^{-1}(C) \le f_k^{-1}(X_k + \Delta X_k) = \sum_j W_{kj} X_j$, and the term in brackets is greater than or equal to zero, giving $\Delta E \le 0$. A similar argument holds for $\Delta X_k < 0$ (and of course $\Delta X_k = 0 \Rightarrow \Delta E = 0$). This completes the proof.

Some remarks:

(a) Strictly increasing bounded neuron relations are not the whole class of relations conserving the convergence property. This is seen immediately from the fact that Hopfield's original model (1) is not in this class.

(b) The $E(X)$ of the C.N.N. coincides with that of Hopfield's continuous neural network [3]. The difference between the two networks lies in the updating scheme. In our C.N.N. the neurons update their outputs at the moments they examine their inputs, while in [3] the updating is in the form of a set of differential equations describing the time evolution of the network outputs.

(c) The boundedness requirement on the neuron relations stems from the need to keep $E(X)$ bounded. It is possible to impose further restrictions on $W$ that allow unbounded neuron relations while keeping $E(X)$ bounded (from below). This was done in [4], where the neurons exhibit linear relations.

IV. THE Q.N.N.

We develop the class of quantization rules for the neurons that keeps the convergence property. Denote the set of possible neuron outputs by $Y_0 < Y_1 < \cdots < Y_n$ and the set of threshold values by $t_1 < t_2 < \cdots < t_n$. The action of the neurons is given by

$$X_i^+ = Y_l \quad \text{if} \quad t_l < \sum_{j=1}^{N} W_{ij} X_j \le t_{l+1}, \qquad l = 0, \ldots, n \qquad (8)$$

(with the conventions $t_0 = -\infty$ and $t_{n+1} = +\infty$). The following theorem gives a class of quantization rules with the convergence property. 
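The action (8) is an increasing staircase. A minimal sketch with hypothetical levels and thresholds (four levels, $n = 3$), where the level index is the number of thresholds strictly below the weighted-sum input:

```python
import numpy as np

# hypothetical 4-level quantizer implementing rule (8):
# X_i^+ = Y_l  if  t_l < sum_j W_ij X_j <= t_{l+1}   (t_0 = -inf, t_4 = +inf)
Y = np.array([-3.0, -1.0, 1.0, 3.0])   # outputs  Y_0 < Y_1 < Y_2 < Y_3
t = np.array([-2.0, 0.0, 2.0])         # thresholds t_1 < t_2 < t_3

def quantize(u):
    # number of thresholds strictly below u gives the level index l,
    # so that t_l < u <= t_{l+1}
    l = np.searchsorted(t, u, side="left")
    return Y[l]

print(quantize(-5.0), quantize(-1.5), quantize(0.5), quantize(10.0))
```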
Theorem 3

Any quantization rule for the neurons which is an increasing step function, that is, a rule of the form (8) with $Y_0 < Y_1 < \cdots < Y_n$ and $t_1 < t_2 < \cdots < t_n$, has the convergence property.

The proof uses the same energy technique. A change of more than one level, from $Y_i$ to $Y_j$, yields the same result, since it can be viewed as a sequence of $|i-j|$ single-level changes from $Y_i$ toward $Y_j$, each resulting in $\Delta E \le 0$. The proof is completed by noting that $\Delta X_k = 0 \Rightarrow \Delta E = 0$ and that $E(X)$ is bounded.

Corollary

Hopfield's original model is a special case of (9).

V. INFORMATION CAPACITY OF THE Q.N.N.

We use the definition of [2] for the information capacity of the Q.N.N.

Definition 1

The information capacity of the Q.N.N. (in bits) is the logarithm (base 2) of the number of distinguishable networks of $N$ neurons. Two networks are distinguishable if observing the state transitions of the neurons yields different observations.

For Hopfield's original model it was shown in [2] that the capacity $C$ of a network of $N$ neurons is bounded by $C \le \log_2 \big( 2^{(N-1)^2} \big)^N = O(N^3)$ bits. It was also shown that $C \ge \Omega(N^3)$ bits, and thus $C$ is exactly of order $N^3$ bits. It is obvious that in our case (which contains the original model) we must have $C \ge \Omega(N^3)$ bits as well (since the lower bound cannot decrease in this richer case). It is shown in the Appendix that the number of multi-threshold functions of $N-1$ variables with $n+1$ output levels is at most $(n+1)^{N^2+N+1}$; since we have $N$ neurons, there are at most $\big( (n+1)^{N^2+N+1} \big)^N$ distinguishable networks and thus

$$C \le N (N^2+N+1) \log_2 (n+1) = O(N^3) \text{ bits} \qquad (14)$$

So, as before, $C$ is exactly of order $N^3$ bits. In fact, the rise in $C$ is probably a factor of $O(\log_2 n)$, as can be seen from the upper bound.

VI. \"OUTER PRODUCT\" LEARNING RULE

For Hopfield's original network with two-state neurons (taking the values $\pm 1$), a natural and extensively investigated learning rule [1] is the so-called sum-of-outer-products construction

$$W_{ij} = \frac{1}{N} \sum_{l=1}^{K} X_i^l X_j^l \qquad (15)$$

where $X^1, \ldots, X^K$ are the desired stable states of the network. 
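Rule (15) can be exercised directly. In the sketch below (sizes and seed are arbitrary choices, with $K$ far below capacity), $K$ random $\pm 1$ patterns are stored and each is verified to be a fixed point of update rule (1) with zero thresholds:

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 200, 5                              # hypothetical sizes; K << N

X = rng.choice([-1.0, 1.0], size=(K, N))   # desired stable states X^1 ... X^K

# sum-of-outer-products rule (15): W_ij = (1/N) sum_l X_i^l X_j^l
W = (X.T @ X) / N
np.fill_diagonal(W, 0.0)                   # zero diagonal, as the theorems require

# each stored pattern should be a fixed point of rule (1) with t_i = 0
stable = all(np.array_equal(np.sign(W @ x), x) for x in X)
print(stable)
```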
A well-known result for (15) is that the asymptotic capacity $K$ of the network is

$$K = \frac{N-1}{4 \log N} + 1 \qquad (16)$$

In this section we introduce a natural generalization of (15) and prove a similar result for the asymptotic capacity. We first limit the possible quantization rules to those of the form (8) with

$$t_j = \frac{1}{2} (Y_j + Y_{j-1}), \qquad j = 1, \ldots, n \qquad (17)$$

and $Y_0 < \cdots < Y_n$ satisfying

(a) $n+1$ is even;
(b) $Y_i \ne 0$ for all $i$;
(c) $Y_i = -Y_{n-i}$, $i = 0, \ldots, n$.

Next we state that the desired stable vectors $X^1, \ldots, X^K$ are such that each component is picked independently at random from $\{ Y_0, \ldots, Y_n \}$ with equal probability. Thus the $K \cdot N$ components of the $X$'s are zero-mean i.i.d. random variables. Our modified learning rule is

$$W_{ij} = \frac{1}{N} \sum_{l=1}^{K} X_i^l \left[ \frac{1}{X_j^l} \right] \qquad (18)$$

Note that for $X_i \in \{+1, -1\}$, (18) is identical to (15).

Define

$$\Delta Y = \min_{i \ne j} |Y_i - Y_j|, \qquad A = \max_{i,j} \frac{|Y_i|^2}{|Y_j|}$$

We state the following.

PROPOSITION: The asymptotic capacity of the above network is given by

$$K = \frac{(\Delta Y)^2 N}{16 A^2 \log N} \qquad (19)$$

PROOF: Define

$$P(K, N) = \Pr \{ K \text{ vectors chosen randomly as described are stable states with the } W \text{ of (18)} \} \qquad (20)$$

Then $P(K,N) \ge 1 - \sum_{i,j} \Pr(A_{ij})$, where $A_{ij}$ is the event that the $i$'th component of the $j$'th vector is in error. We concentrate on the event $A_{11}$, w.l.o.g.

The input $u_1$ to the first neuron when $X^1$ is presented is given by

$$u_1 = X_1^1 + \frac{K-1}{N} X_1^1 + \frac{1}{N} \sum_{l=2}^{K} \sum_{j=2}^{N} X_1^l \, \frac{X_j^1}{X_j^l} \qquad (21)$$

The first term is mapped by (17) into itself and corresponds to the desired signal. The last term is a sum of $(K-1)(N-1)$ i.i.d. zero-mean random variables and corresponds to noise. The middle term $\frac{K-1}{N} X_1^1$ is disposed of by assuming $\frac{K-1}{N} \to 0$ as $N \to \infty$. (With a zero-diagonal choice of $W$ (using (18) only for $i \ne j$), this term does not appear.) 
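The signal/noise decomposition in (21) can be seen numerically. The sketch below uses a hypothetical four-level rule satisfying (17) (levels $\pm 1, \pm 3$) and the generalized rule (18) with zero diagonal, and checks that random multi-level patterns come back as fixed points when $K$ is small relative to $N$:

```python
import numpy as np

rng = np.random.default_rng(3)

# hypothetical quantization rule satisfying (17): n+1 = 4 levels, Y_i != 0,
# Y_i = -Y_{n-i}, and thresholds t_j = (Y_j + Y_{j-1}) / 2
Y = np.array([-3.0, -1.0, 1.0, 3.0])
t = (Y[1:] + Y[:-1]) / 2

def quantize(u):
    return Y[np.searchsorted(t, u, side="left")]

N, K = 2000, 3
X = rng.choice(Y, size=(K, N))     # components i.i.d. uniform on the levels

# generalized learning rule (18): W_ij = (1/N) sum_l X_i^l * (1 / X_j^l)
W = (X.T @ (1.0 / X)) / N
np.fill_diagonal(W, 0.0)           # zero diagonal kills the middle term of (21)

stable = all(np.array_equal(quantize(W @ x), x) for x in X)
print(stable)
```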
Now, $\Pr(A_{11}) = \Pr \{ \text{noise gets us out of range} \}$. Denoting the noise by $I$, we have

$$\Pr(A_{11}) \le \Pr \Big\{ |I| \ge \frac{\Delta Y}{2} \Big\} \le 2 \exp \left( - \frac{(\Delta Y)^2 N^2}{8 (K-1)(N-1) A^2} \right) \qquad (22)$$

where the first inequality is from the definition of $\Delta Y$ and the second uses the lemma of [6], p. 58. We thus get

$$P(K, N) \ge 1 - K \cdot N \cdot 2 \exp \left( - \frac{(\Delta Y)^2 N^2}{8 (K-1)(N-1) A^2} \right) \qquad (23)$$

Substituting (19) and taking $N \to \infty$, we get $P(K, N) \to 1$, and this completes the proof.

VII. DISCUSSION

Two classes of generalizations of the Hopfield neural network model were presented. We give some remarks:

(a) Any combination of neurons from the two classes will have the convergence property as well.

(b) Our definition of the information capacity is useless for the C.N.N., since a full observation of the possible state transitions of the network is impossible.

APPENDIX

We prove the following theorem.

Theorem

An upper bound $C_N^M$ on the number of multi-threshold functions with $N$ inputs and $M$ points in the domain (out of $(n+1)^N$ possible points) is given by the solution of the recurrence relation

$$C_N^M = C_N^{M-1} + n \cdot C_{N-1}^{M-1} \qquad (A.1)$$

Let us look at the $N$-dimensional weight space $W$. Each input point $X$ divides the weight space into $n+1$ regions by the $n$ parallel hyperplanes $\sum_{i=1}^{N} W_i X_i = t_k$, $k = 1, \ldots, n$. We keep adding points in such a way that the $n$ new hyperplanes corresponding to each added point partition the $W$ space into as many regions as possible. Assume $M-1$ points have made $C_N^{M-1}$ regions and we add the $M$'th point. Each hyperplane (out of $n$) is divided into at most $C_{N-1}^{M-1}$ regions (being itself an $N-1$ dimensional space divided by $(M-1)n$ hyperplanes). We thus have, after adding the $n$ hyperplanes, $C_N^M = C_N^{M-1} + n \cdot C_{N-1}^{M-1}$. The solution of this recurrence is

$$C_N^M = (n+1) \sum_{i=0}^{N-1} \binom{M-1}{i} n^i$$

and the theorem is proved. 
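The recurrence (A.1) and the closed-form solution above can be cross-checked mechanically; the base cases $C_N^1 = n+1$ (one point's $n$ hyperplanes make $n+1$ regions) and $C_1^M = n+1$ are read off from the closed form, and the small test ranges are arbitrary:

```python
from functools import lru_cache
from math import comb

def closed_form(N, M, n):
    # C_N^M = (n+1) * sum_{i=0}^{N-1} binom(M-1, i) * n**i
    return (n + 1) * sum(comb(M - 1, i) * n**i for i in range(N))

@lru_cache(maxsize=None)
def recurrence(N, M, n):
    # (A.1): C_N^M = C_N^{M-1} + n * C_{N-1}^{M-1}
    if M == 1 or N == 1:
        return n + 1
    return recurrence(N, M - 1, n) + n * recurrence(N - 1, M - 1, n)

# the closed form satisfies (A.1) over a grid of small cases
ok = all(closed_form(N, M, n) == recurrence(N, M, n)
         for N in range(1, 6) for M in range(1, 8) for n in range(1, 4))
print(ok)
```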
For $M = (n+1)^N$ (all possible points), the solution of the recurrence gives a bound on the number of multi-threshold functions of $N$ variables of $(n+1)^{N^2+N+1}$, and the result used in Section V is established.

LIST OF REFERENCES

[1] Hopfield, J. J., \"Neural networks and physical systems with emergent collective computational abilities\", Proc. Nat. Acad. Sci. USA, Vol. 79 (1982), pp. 2554-2558.

[2] Abu-Mostafa, Y. S. and St. Jacques, J., \"Information capacity of the Hopfield model\", IEEE Trans. on Info. Theory, Vol. IT-31 (1985), pp. 461-464.

[3] Hopfield, J. J., \"Neurons with graded response have collective computational properties like those of two-state neurons\", Proc. Nat. Acad. Sci. USA, Vol. 81 (1984).

[4] Fleisher, M., \"Fast processing of autoregressive signals by a neural network\", to be presented at IEEE Conference, Israel, 1987.

[5] Levin, E., private communication.

[6] Petrov, V. V., Sums of Independent Random Variables.

", "award": [], "sourceid": 60, "authors": [{"given_name": "Michael", "family_name": "Fleisher", "institution": null}]}