{"title": "Analysis of Linsker's Simulations of Hebbian Rules", "book": "Advances in Neural Information Processing Systems", "page_first": 694, "page_last": 701, "abstract": null, "full_text": "694 MacKay and Miller \n\nAnalysis of Linsker's Simulations of Hebbian rules \n\nDavid J. C. MacKay \nComputation and Neural Systems \nCaltech 164-30 CNS \nPasadena, CA 91125 \nmackay@aurel.cns.caltech.edu \n\nKenneth D. Miller \nDepartment of Physiology \nUniversity of California \nSan Francisco, CA 94143-0444 \nken@phyb.ucsf.edu \n\nABSTRACT \n\nLinsker has reported the development of centre-surround receptive fields and oriented receptive fields in simulations of a Hebb-type equation in a linear network. The dynamics of the learning rule are analysed in terms of the eigenvectors of the covariance matrix of cell activities. Analytic and computational results for Linsker's covariance matrices, and some general theorems, lead to an explanation of the emergence of centre-surround and certain oriented structures. \n\nLinsker [Linsker, 1986, Linsker, 1988] has studied by simulation the evolution of weight vectors under a Hebb-type teacherless learning rule in a feed-forward linear network. The equation for the evolution of the weight vector w of a single neuron, derived by ensemble averaging the Hebbian rule over the statistics of the input patterns, is:^1 \n\n$$\\frac{\\partial}{\\partial t} w_i = k_1 + \\sum_j (Q_{ij} + k_2) w_j \\quad \\text{subject to} \\quad -w_{\\max} \\le w_i \\le w_{\\max} \\qquad (1)$$ \n\n^1 Our definition of equation (1) differs from Linsker's by the omission of a factor of 1/N before the sum term, where N is the number of synapses. \n\nwhere Q is the covariance matrix of activities of the inputs to the neuron. 
The covariance matrix depends on the covariance function, which describes the dependence of the covariance of two input cells' activities on their separation in the input field, and on the location of the synapses, which is determined by a synaptic density function. Linsker used a gaussian synaptic density function. \n\nDepending on the covariance function and the two parameters $k_1$ and $k_2$, different weight structures emerge. Using a gaussian covariance function (his layer B $\\to$ C), Linsker reported the emergence of non-trivial weight structures, ranging from saturated structures through centre-surround structures to bi-lobed oriented structures. \n\nThe analysis in this paper examines the properties of equation (1). We concentrate on the gaussian covariances in Linsker's layer B $\\to$ C, and give an explanation of the structures reported by Linsker. Several of the results are more general, applying to any covariance matrix Q. Space constrains us to postpone general discussion, criteria for the emergence of centre-surround weight structures, technical details, and discussion of other model networks to future publications [MacKay, Miller, 1990]. \n\n1 ANALYSIS IN TERMS OF EIGENVECTORS \n\nWe write equation (1) as a first order differential equation for the weight vector w: \n\n$$\\dot{w} = (Q + k_2 J) w + k_1 n \\qquad (2)$$ \n\nwhere J is the matrix $J_{ij} = 1 \\; \\forall i, j$, and n is the DC vector $n_i = 1 \\; \\forall i$. This equation is linear, up to the hard limits on $w_i$. These hard limits define a hypercube in weight space within which the dynamics are confined. We make the following assumption: \n\nAssumption 1 The principal features of the dynamics are established before the hard limits are reached. When the hypercube is reached, it captures and preserves the existing weight structure with little subsequent change. \n\nThe matrix $Q + k_2 J$ is symmetric, so it has a complete orthonormal set of eigenvectors^2 $e^{(a)}$ with real eigenvalues $\\lambda_a$. 
The linear dynamics within the hypercube can be characterised in terms of these eigenvectors, each of which represents an independently evolving weight configuration. First, equation (2) has a fixed point at \n\n$$w^{FP} = -k_1 (Q + k_2 J)^{-1} n \\qquad (3)$$ \n\nSecond, relative to the fixed point, the component of w in the direction of an eigenvector grows or decays exponentially at a rate proportional to the corresponding eigenvalue. Writing $w(t) = \\sum_a w_a(t) e^{(a)}$, equation (2) yields \n\n$$w_a(t) - w_a^{FP} = (w_a(0) - w_a^{FP}) e^{\\lambda_a t} \\qquad (4)$$ \n\n^2 The indices a and b will be used to denote the eigenvector basis for w, while the indices i and j will be used for the synaptic basis. \n\nThus, the principal emergent features of the dynamics are determined by the following three factors: \n\n1. The principal eigenvectors of $Q + k_2 J$, that is, the eigenvectors with largest positive eigenvalues. These are the fastest growing weight configurations. \n2. Eigenvectors of $Q + k_2 J$ with negative eigenvalue. Each is associated with an attracting constraint surface, the hyperplane defined by $w_a = w_a^{FP}$. \n3. The location of the fixed point of equation (1). This is important for two reasons: a) it determines the location of the constraint surfaces; b) the fixed point gives a \"head start\" to the growth rate of eigenvectors $e^{(a)}$ for which $|w_a^{FP}|$ is large compared to $|w_a(0)|$. \n\n2 EIGENVECTORS OF Q \n\nWe first examine the eigenvectors and eigenvalues of Q. The principal eigenvector of Q dominates the dynamics of equation (2) for $k_1 = 0$, $k_2 = 0$. The subsequent eigenvectors of Q become important as $k_1$ and $k_2$ are varied. \n\n2.1 PROPERTIES OF CIRCULARLY SYMMETRIC SYSTEMS \n\nIf an operator commutes with the rotation operator, its eigenfunctions can be written as eigenfunctions of the rotation operator. For Linsker's system, in the continuum limit, the operator $Q + k_2 J$ is unchanged under rotation of the system. 
So the eigenfunctions of $Q + k_2 J$ can be written as the product of a radial function and one of the angular functions $\\cos l\\theta$, $\\sin l\\theta$, $l = 0, 1, 2 \\ldots$ To describe these eigenfunctions we borrow from quantum mechanics the notation n = 1, 2, 3 ... and l = s, p, d ... to denote the total number of nodes in the function = 0, 1, 2 ... and the number of angular nodes = 0, 1, 2 ... respectively. For example, \"2s\" denotes a centre-surround function with one radial node and no angular nodes (see figure 1). \n\nFor monotonic and non-negative covariance functions, we conjecture that the eigenfunctions of Q are ordered in eigenvalue by their numbers of nodes such that the eigenfunction [nl] has larger eigenvalue than either [(n+1)l] or [n(l+1)]. This conjecture is obeyed in all analytical and numerical results we have obtained. \n\n2.2 ANALYTIC CALCULATIONS FOR $k_2 = 0$ \n\nWe have solved analytically for the first three eigenfunctions and eigenvalues of the covariance matrix for layer B $\\to$ C of Linsker's network, in the continuum limit (Table 1). 1s, the function with no changes of sign, is the principal eigenfunction of Q; 2p, the bilobed oriented function, is the second eigenfunction; and 2s, the centre-surround eigenfunction, is third.^3 \n\nFigure 1(a) shows the first six eigenfunctions for layer B $\\to$ C of [Linsker, 1986]. \n\n^3 2s is degenerate with 3d at $k_2 = 0$. \n\nTable 1: The first three eigenfunctions of the operator Q(r, r') \n\n$Q(\\mathbf{r}, \\mathbf{r}') = e^{-(\\mathbf{r} - \\mathbf{r}')^2 / 2C} e^{-r'^2 / 2A}$, where C and A denote the characteristic sizes of the covariance function and synaptic density function. $\\mathbf{r}$ denotes two-dimensional spatial position relative to the centre of the synaptic arbor, and $r = |\\mathbf{r}|$. The eigenvalues $\\lambda$ are all normalised by the effective number of synapses. 
\n\nName | Eigenfunction | $\\lambda/N$ \n1s | $e^{-r^2/2R}$ | $l C/A$ \n2p | $r \\cos\\theta \\, e^{-r^2/2R}$ | $l^2 C/A$ \n2s | $(1 - r^2/r_0^2) e^{-r^2/2R}$ | $l^3 C/A$ \n\nwhere $R = \\frac{C}{2}(1 + \\sqrt{1 + 4A/C})$, $l = (R - C)/R$ $(0 < l < 1)$, and $r_0^2 = 2A / \\sqrt{1 + 4A/C}$. \n\nFigure 1: Eigenfunctions of the operator $Q + k_2 J$. Largest eigenvalue is in the top row. Eigenvalues (in arbitrary units): (a) $k_2 = 0$: 1s, 2.26; 2p, 1.0; 2s & 3d (only one 3d is shown), 0.41. (b) $k_2 = -3$: 2p, 1.0; 2s, 0.66; 1s, -17.8. The greyscale indicates the range from maximum negative to maximum positive synaptic weight within each eigenfunction. Eigenfunctions of the operator $(e^{-(\\mathbf{r} - \\mathbf{r}')^2 / 2C} + k_2) e^{-r'^2 / 2A}$ were computed for C/A = 2/3 (as used by Linsker for most layer B $\\to$ C simulations) on a circle of radius 12.5 grid intervals, with $\\sqrt{A} = 6.15$ grid intervals. \n\n3 THE EFFECTS OF THE PARAMETERS $k_1$ AND $k_2$ \n\nVarying $k_2$ changes the eigenvectors and eigenvalues of the matrix $Q + k_2 J$. Varying $k_1$ moves the fixed point of the dynamics with respect to the origin. We now analyse these two changes, and their effects on the dynamics. \n\nDefinition: Let $\\hat{n}$ be the unit vector in the direction of the DC vector n. We refer to $(w \\cdot \\hat{n})$ as the DC component of w. The DC component is proportional to the sum of the synaptic strengths in a weight vector. For example, 2p and all the other eigenfunctions with angular nodes have zero DC component. Only the s-modes have a non-zero DC component. \n\n3.1 GENERAL THEOREM: THE EFFECT OF $k_2$ \n\nWe now characterise the effect of adding $k_2 J$ to any covariance matrix Q. \n\nTheorem 1 For any covariance matrix Q, the spectrum of eigenvectors and eigenvalues of $Q + k_2 J$ obeys the following: \n1. Eigenvectors of Q with no DC component, and their eigenvalues, are unaffected by $k_2$. \n2. 
The other eigenvectors, with non-zero DC component, vary with $k_2$. Their eigenvalues increase continuously and monotonically with $k_2$ between asymptotic limits such that the upper limit of one eigenvalue is the lower limit of the eigenvalue above. \n3. There is at most one negative eigenvalue. \n4. All but one of the eigenvalues remain finite. In the limits $k_2 \\to \\pm\\infty$ there is a DC eigenvector $\\hat{n}$ with eigenvalue $\\to k_2 N$, where N is the dimensionality of Q, i.e. the number of synapses. \n\nThe properties stated in this theorem, whose proof is in [MacKay, Miller, 1990], are summarised pictorially by the spectral structure shown in figure 2. \n\n3.2 IMPLICATIONS FOR LINSKER'S SYSTEM \n\nFor Linsker's circularly symmetric systems, all the eigenfunctions with angular nodes have zero DC component and are thus independent of $k_2$. The eigenvalues that vary with $k_2$ are those of the s-modes. The leading s-modes at $k_2 = 0$ are 1s, 2s; as $k_2$ is decreased to $-\\infty$, these modes transform continuously into 2s, 3s respectively (figure 2).^4 1s becomes an eigenvector with negative eigenvalue, and it approaches the DC vector $\\hat{n}$. This eigenvector enforces a constraint $w \\cdot \\hat{n} = w^{FP} \\cdot \\hat{n}$, and thus determines that the final average synaptic strength is equal to $w^{FP} \\cdot n / N$. Linsker used $k_2 = -3$ in [Linsker, 1986]. This value of $k_2$ is sufficiently large that the properties of the $k_2 \\to -\\infty$ limit hold [MacKay, Miller, 1990], and in the following we concentrate interchangeably on $k_2 = -3$ and $k_2 \\to -\\infty$. The computed eigenfunctions for Linsker's system at layer B $\\to$ C are shown in figure 1(b) for $k_2 = -3$. The principal eigenfunction is 2p. The centre-surround eigenfunction 2s is the principal symmetric eigenfunction, but it still has smaller eigenvalue than 2p. \n\n^4 The 2s eigenfunctions at $k_2 = 0$ and $k_2 = -\\infty$ both have one radial node, but are not identical functions. \n\nFigure 2: General spectrum of eigenvalues of $Q + k_2 J$ as a function of $k_2$. A: Eigenvectors with DC component. B: Eigenvectors with zero DC component. C: Adjacent DC eigenvalues share a common asymptote. D: There is only one negative eigenvalue. The annotations in brackets refer to the eigenvectors of Linsker's system. \n\n3.3 EFFECT OF $k_1$ \n\nVarying $k_1$ changes the location of the fixed point of equation (2). From equation (3), the fixed point is displaced from the origin only in the direction of eigenvectors that have non-zero DC component, that is, only in the direction of the s-modes. This has two important effects, as discussed in section 1: a) The s-modes are given a head start in growth rate that increases as $k_1$ is increased. In particular, the principal s-mode, the centre-surround eigenvector 2s, may outgrow the principal eigenvector 2p. b) The constraint surface is moved when $k_1$ is changed. For large negative $k_2$, the constraint surface fixes the average synaptic strength in the final weight vector. To leading order in $1/k_2$, Linsker showed that the constraint is: $\\sum_j w_j = k_1 / |k_2|$.^5 \n\n^5 To second order, this expression becomes $\\sum_i w_i = k_1 / |k_2 + q|$, where $q = \\langle Q_{ij} \\rangle$, the average covariance (averaged over i and j). The additional term largely resolves the discrepancy between Linsker's g and $k_1 / |k_2|$ in [Linsker, 1986]. \n\n3.4 SUMMARY OF THE EFFECTS OF $k_1$ AND $k_2$ \n\nWe can now anticipate the explanation for the emergence of centre-surround cells: For $k_1 = 0$, $k_2 = 0$, the dynamics are dominated by 1s. The centre-surround eigenfunction 2s is third in line behind 2p, the bi-lobed function. Making $k_2$ large and negative removes 1s from the lead. 
2p becomes the principal eigenfunction and dominates the dynamics for $k_1 \\approx 0$, so that the circular symmetry is broken. Finally, increasing $k_1 / |k_2|$ gives a head start to the centre-surround function 2s. Increasing $k_1 / |k_2|$ also increases the final average synaptic strength, so large $k_1 / |k_2|$ also produces a large DC bias. The centre-surround regime therefore lies sandwiched between a 2p-dominated regime and an all-excitatory regime. $k_1 / |k_2|$ has to be large enough that 2s dominates over 2p, and small enough that the DC bias does not obscure the centre-surround structure. We estimate this parameter regime in [MacKay, Miller, 1990], and show that the boundary between the 2s- and 2p-dominated regimes found by simulated annealing on the energy function may be different from the boundary found by simulating the time-development of equation (1), which depends on the initial conditions. \n\n4 CONCLUSIONS AND DISCUSSION \n\nFor Linsker's B $\\to$ C connections, we predict four main parameter regimes for varying $k_1$ and $k_2$.^6 These regimes, shown in figure 3, are dominated by the following weight structures: \n\n$k_2 = 0$, $k_1 = 0$: The principal eigenvector of Q, 1s. \n$k_2$ = large positive and/or $k_1$ = large: The flat DC weight vector, which leads to the same saturated structures as 1s. \n$k_2$ = large negative, $k_1 \\approx 0$: The principal eigenvector of $Q + k_2 J$ for $k_2 \\to -\\infty$, 2p. \n$k_2$ = large negative, $k_1$ = intermediate: The principal circularly symmetric function which is given a head start, 2s. \n\n^6 Not counting the symmetric regimes $(k_1, k_2) \\leftrightarrow (-k_1, k_2)$ in which all the weight structures are inverted in sign. \n\nFigure 3: Parameter regimes for Linsker's system. The DC bias is approximately constant along the radial lines, so each of the regimes with large negative $k_2$ is wedge-shaped. \n\nHigher layers of Linsker's network can be analysed in terms of the same four regimes; the principal eigenvectors are altered, so that different structures can emerge. The development of the interesting cells in Linsker's system depends on the use of negative synapses and on the use of the terms $k_1$ and $k_2$ to enforce a constraint on the final percentages of positive and negative synapses. Both of these may be biologically problematic [Miller, 1990]. Linsker suggested that the emergence of centre-surround structures may depend on the peaked synaptic density function that he used [Linsker, 1986, page 7512]. However, with a flat density function, the eigenfunctions are qualitatively unchanged, and centre-surround structures can emerge by the same mechanism. \n\nAcknowledgements \n\nD.J.C.M. is supported by a Caltech Fellowship and a Studentship from SERC, UK. \n\nK.D.M. thanks M. P. Stryker for encouragement and financial support while this work was undertaken. K.D.M. was supported by an N.E.I. Fellowship and the International Joint Research Project Bioscience Grant to M. P. Stryker (T. Tsumoto, Coordinator) from the N.E.D.O., Japan. \n\nThis collaboration would have been impossible without the internet/NSF net, long may their daemons flourish. \n\nReferences \n\n[Linsker, 1986] R. Linsker. From Basic Network Principles to Neural Architecture (series), PNAS USA, 83, Oct.-Nov. 1986, pp. 7508-7512, 8390-8394, 8779-8783. \n\n[Linsker, 1988] R. Linsker. Self-Organization in a Perceptual Network, Computer, March 1988. \n\n[Miller, 1990] K.D. Miller. \"Correlation-based mechanisms of neural development,\" in Neuroscience and Connectionist Theory, M.A. Gluck and D.E. Rumelhart, Eds. (Lawrence Erlbaum Associates, Hillsdale NJ) (in press). \n\n[MacKay, Miller, 1990] D.J.C. 
MacKay and K.D. Miller. \"Analysis of Linsker's Simulations of Hebbian rules\" (submitted to Neural Computation); and \"Analysis of Linsker's application of Hebbian rules to linear networks\" (submitted to Network). \n\n", "award": [], "sourceid": 193, "authors": [{"given_name": "David", "family_name": "MacKay", "institution": null}, {"given_name": "Kenneth", "family_name": "Miller", "institution": null}]}