{"title": "Spike-Timing-Dependent Learning for Oscillatory Networks", "book": "Advances in Neural Information Processing Systems", "page_first": 152, "page_last": 158, "abstract": null, "full_text": "Spike-Timing-Dependent Learning for \n\nOscillatory Networks \n\nSilvia Scarp etta \n\nDept. of Physics \"E.R. Caianiello\" \nSalerno University 84081 (SA) Italy \nand INFM, Sezione di Salerno Italy \n\nZhaoping Li \n\nGatsby Compo Neurosci. Unit \n\nUniversity College, London, WCIN 3AR \n\nUnited Kingdom \n\nscarpetta@na. infn. it \n\nzhaoping@gatsby.ucl.ac.uk \n\nJohn Hertz \n\nNordita \n\n2100 Copenhagen 0, Denmark \n\nheriz@nordita.dk \n\nAbstract \n\nWe apply to oscillatory networks a class of learning rules in which \nsynaptic weights change proportional to pre- and post-synaptic ac(cid:173)\ntivity, with a kernel A(r) measuring the effect for a postsynaptic \nspike a time r after the presynaptic one. The resulting synaptic ma(cid:173)\ntrices have an outer-product form in which the oscillating patterns \nare represented as complex vectors. In a simple model, the even \npart of A(r) enhances the resonant response to learned stimulus by \nreducing the effective damping, while the odd part determines the \nfrequency of oscillation. We relate our model to the olfactory cortex \nand hippocampus and their presumed roles in forming associative \nmemories and input representations. \n\n1 \n\nIntroduction \n\nRecent studies of synapses between pyramidal neocortical and hippocampal neu(cid:173)\nrons [1, 2, 3, 4] have revealed that changes in synaptic efficacy can depend on the \nrelative timing of pre- and postsynaptic spikes. Typically, a presynaptic spike fol(cid:173)\nlowed by a postsynaptic one leads to an increase in efficacy (LTP), while the reverse \ntemporal order leads to a decrease (LTD). The dependence of the change in synap(cid:173)\ntic efficacy on the difference r between the two spike times may be characterized \nby a kernel which we denote A(r) [4]. 
For hippocampal pyramidal neurons, the half-width of this kernel is around 20 ms. \n\nMany important neural structures, notably hippocampus and olfactory cortex, exhibit oscillatory activity in the 20-50 Hz range. Here the temporal variation of the neuronal firing can clearly affect the synaptic dynamics, and vice versa. In this paper we study a simple model for learning oscillatory patterns, based on the structure of the kernel A(τ) and other known physiology of these areas. We will assume that these synaptic changes in long-range lateral connections are driven by oscillatory, patterned input to a network that initially has only local synaptic connections. The result is an imprinting of the oscillatory patterns in the synapses, such that subsequent input of a similar pattern will evoke a strong resonant response. It can be viewed as a generalization, to oscillatory networks with spike-timing-dependent learning, of the standard scenario whereby stationary patterns are stored in Hopfield networks using the conventional Hebb rule. \n\n2 Model \n\nThe computational neurons of the model represent local populations of biological neurons that share common input. They follow the equations of motion [5] \n\ndu_i/dt = -α u_i - β_i^0 g_v(v_i) + Σ_j J_ij^0 g_u(u_j) + I_i,   (1) \n\ndv_i/dt = -α v_i + γ_i^0 g_u(u_i) + Σ_{j≠i} W_ij^0 g_u(u_j).   (2) \n\nHere u_i and v_i are membrane potentials for excitatory and inhibitory (formal) neuron i, α^{-1} is their membrane time constant, and the sigmoidal functions g_u() and g_v() model the dependence of their outputs (interpreted as instantaneous firing rates) on their membrane potentials. The couplings β_i^0 and γ_i^0 are the inhibitory-to-excitatory (resp. excitatory-to-inhibitory) connection strengths within local excitatory-inhibitory pairs, and for simplicity we take the external drive I_i(t) to act only on the excitatory units. 
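A minimal numerical sketch of a single local pair from Eqns. (1)-(2), linearized about the fixed point (so g_u and g_v act as the identity) and with the long-range couplings and external drive removed; all parameter values are illustrative assumptions chosen to place the ringing near 40 Hz:

```python
import numpy as np

# One local excitatory-inhibitory pair from Eqns. (1)-(2), linearized
# (g_u, g_v = identity), long-range couplings J, W and drive set to zero.
# Parameter values are illustrative assumptions, not taken from the paper.
alpha = 50.0                 # inverse membrane time constant (1/s)
beta, gamma = 250.0, 250.0   # local I->E and E->I couplings
dt, T = 1e-5, 0.5            # Euler step and simulated time (s)
n = int(T / dt)

u, v = 1.0, 0.0
trace = np.empty(n)
for k in range(n):
    du = -alpha * u - beta * v
    dv = -alpha * v + gamma * u
    u += dt * du
    v += dt * dv
    trace[k] = u

# The damped ringing frequency is sqrt(beta*gamma)/(2*pi); the quantity
# sqrt(alpha^2 + beta*gamma) quoted in the text is the corresponding
# undamped natural frequency.  Measure the ringing from upward zero
# crossings of u(t).
crossings = np.where((trace[:-1] < 0) & (trace[1:] >= 0))[0] * dt
f_measured = 1.0 / np.mean(np.diff(crossings))
f_theory = np.sqrt(beta * gamma) / (2 * np.pi)
```

With these numbers the pair rings at about 40 Hz while decaying with the intrinsic damping 2α, which is the regime the rest of the analysis assumes.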
We include nonlocal excitatory couplings J_ij^0 between excitatory units and W_ij^0 from excitatory units to inhibitory ones. In this minimal model, we ignore long-range inhibitory couplings, appealing to the fact that real anatomical inhibitory connections are predominantly short-ranged. (In what follows, we will sometimes use bold and sans serif notation (e.g., u, J) for vectors and matrices, respectively.) The structure of the couplings is shown in Fig. 1A. \n\nThe model is nonlinear, but here we will limit our treatment to an analysis of small oscillations around a stable fixed point {ū, v̄} determined by the DC part of the input. Performing the linearization and eliminating the inhibitory units [6, 5], we obtain \n\nd²u/dt² + [2α - J] du/dt + [α² + β(γ + W) - αJ] u = (∂_t + α) δI.   (3) \n\nHere u is now measured from the fixed point ū, δI is the time-varying part of the input, and the elements of J and W are related to those of J^0 and W^0 by W_ij = g_u'(ū_j) W_ij^0 and J_ij = g_u'(ū_j) J_ij^0. For simplicity, we have assumed that the effective local couplings β_i = g_v'(v̄_i) β_i^0 and γ_i = g_u'(ū_i) γ_i^0 are independent of i: β_i = β, γ_i = γ. With oscillatory inputs δI = ξ e^{-iωt} + c.c., the oscillatory pattern elements ξ_i = |ξ_i| e^{-iφ_i} are complex, reflecting possible phase differences across the units. We likewise separate the response u = u^+ + u^- (after the initial transients) into positive- and negative-frequency components u^± (with u^- = u^+* and u^± ∝ e^{∓iωt}). Since du^±/dt = ∓iω u^±, Eqn. (3) can be written \n\n[2α ± (i/ω)(α² + βγ - ω²)] u^± = M^± u^± + (1 ± iα/ω) δI^±,   (4) \n\na form that shows how the matrix \n\nM^±(ω) ≡ J ∓ (i/ω)(βW - αJ)   (5) \n\ndescribes the effective coupling between local oscillators. Here 2α is the intrinsic damping and √(α² + βγ) the frequency of the individual oscillators. \n\nFigure 1: A. The model: In addition to the local excitatory-inhibitory connections (vertical solid lines), there are nonlocal long-range connections (dashed lines) between excitatory units (J_ij) and from excitatory to inhibitory units (W_ij). External inputs are fed to the excitatory units. B: Activation functions used in simulations for excitatory units (B.1) and inhibitory units (B.2). Crosses mark the equilibrium point (ū, v̄) of the system. \n\n2.1 Learning phase \n\nWe employ a generalized Hebb rule of the form \n\nδC_ij(t) = η ∫_0^T dt ∫_{-∞}^{∞} dτ y_i(t+τ) A(τ) x_j(t)   (6) \n\nfor synaptic weight C_ij, where x_j and y_i are the pre- and postsynaptic activities, measured relative to stationary levels at which no changes in synaptic strength occur. We consider a general kernel A(τ), although experimentally A(τ) > 0 (< 0) for τ > 0 (< 0). We will apply the rule to both J and W in our linearized network, where the firing rates g_u(u_i) and g_v(v_i) vary linearly with u_i and v_i, so we will use Eqn. (6) with x_j = u_j and y_i = u_i or v_i (measured from the fixed point v̄_i), respectively. \n\nWe assume oscillatory input δI = ξ^0 e^{-iω_0 t} + c.c. during learning. In the brain structures we are modeling, cholinergic modulation makes the long-range connections ineffective during learning [7]. Thus we set J = W = 0 in Eqn. (3) and find \n\nu_i^+ = (ω_0 + iα) ξ_i^0 e^{-iω_0 t} / [2αω_0 + i(α² + βγ - ω_0²)] ≡ U_0 ξ_i^0 e^{-iω_0 t}   (7) \n\nand, from (∂_t + α) v_i = γ u_i, \n\nv_i^+ = γ u_i^+ / (α - iω_0).   (8) \n\nUsing these in the learning rule (6) leads to outer-product weight matrices of the form \n\nJ_ij = (1/2) J_0 [Ã(ω_0) ξ_i^0 ξ_j^0* + c.c.],   W_ij = (1/2) W_0 [(α + iω_0) Ã(ω_0) ξ_i^0 ξ_j^0* + c.c.],   (9) \n\nwhere Ã(ω) = ∫_{-∞}^{∞} dτ A(τ) e^{-iωτ} is the Fourier transform of A(τ), J_0 = 2πη_J |U_0|²/ω_0 (W_0 is the analogous constant, proportional to η_W), and η_J, η_W are the respective learning rates. When the rates are tuned such that η_J = η_W γβ/(α² + ω_0²) and when ω = ω_0, we have M_ij^+ = J_0 Ã(ω_0) ξ_i^0 ξ_j^0*, a generalization of the outer-product learning rule to complex patterns from the Hopfield-Hebb form for real-valued patterns. For learning multiple patterns ξ^μ, μ = 1, 2, ..., the learned weights are simply sums of contributions from individual patterns like Eqns. (9) with ξ^0 replaced by ξ^μ. \n\n2.2 Recall phase \n\nWe return to the single-pattern problem and study the simple case when η_J = η_W γβ/(α² + ω_0²). Consider first an input pattern δI = ξ e^{-iωt} + c.c. that matches the stored pattern exactly (ξ = ξ^0), but possibly oscillating at a different frequency. We then find, using Eqns. (9) in Eqn. (3), the (positive-frequency) response \n\nu^+ = (ω + iα) ξ^0 e^{-iωt} / {2αω - (1/2) J_0 (ω + ω_0) Ã'(ω_0) + i[α² + βγ - (1/2) J_0 (ω + ω_0) Ã''(ω_0) - ω²]},   (10) \n\nwhere Ã'(ω_0) ≡ Re Ã(ω_0) and Ã''(ω_0) ≡ Im Ã(ω_0). For strong response at ω = ω_0, we require \n\nω_0² = α² + βγ - J_0 ω_0 Ã''(ω_0),   2α - J_0 Ã'(ω_0) small.   (11) \n\nThis means (1) the resonance frequency ω_0 is determined by Ã'', (2) the effective damping 2α - J_0 Ã' should be small, and (3) deviation of ω from ω_0 reduces the response. \n\nIt is instructive to consider the case where the width of the time window for synaptic change is small compared with the oscillation period. Then we can expand Ã(ω_0) in ω_0: \n\nÃ(ω_0) ≈ a_0 - i a_1 ω_0,   a_0 = ∫ dτ A(τ),   a_1 = ∫ dτ τ A(τ).   (12) \n\nIn particular, A(τ) = δ(τ) gives a_0 = 1 and a_1 = 0 and the conventional Hebbian learning [5]. Experimentally, a_1 > 0, implying a resonant frequency greater than the intrinsic local frequency √(α² + βγ) obtained in the absence of long-range coupling. \n\nIf the drive ξ does not match the stored pattern (in phase and amplitude), the response will consist of two terms. The first has the form of Eqn. (10) but reduced in amplitude by an overlap factor ξ^0*·ξ. (For convenience we use normalized pattern vectors.) 
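A short numerical check of the small-width expansion Ã(ω_0) ≈ a_0 - i a_1 ω_0 and of the resulting upward shift of the resonance frequency. The kernel shape and the values of α, βγ and J_0 below are illustrative assumptions, not values used in the paper:

```python
import numpy as np

# Check: with a1 > 0 (LTP-before-LTD kernel) and A~''(w0) = -a1*w0, the
# resonance condition w0^2 = alpha^2 + beta*gamma - J0*w0*A~''(w0) is met
# above the intrinsic frequency sqrt(alpha^2 + beta*gamma).
# All parameter values are illustrative assumptions.
alpha, beta_gamma, J0 = 50.0, 62500.0, 30.0

tau = np.linspace(-0.2, 0.2, 400001)
dtau = tau[1] - tau[0]
# Assumed kernel: positive lobe for tau > 0, negative lobe for tau < 0.
A = np.where(tau >= 0, np.exp(-tau / 0.02), -0.6 * np.exp(tau / 0.02))
a1 = np.sum(tau * A) * dtau        # first moment; > 0 for this kernel

w_intrinsic = np.sqrt(alpha**2 + beta_gamma)
# Resonance condition w0^2 (1 - J0*a1) = alpha^2 + beta*gamma:
w_res = np.sqrt((alpha**2 + beta_gamma) / (1.0 - J0 * a1))
```

Setting a_1 = 0 (the delta-kernel, conventional-Hebb limit) collapses w_res back onto w_intrinsic, as the text states.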
The second term is proportional to the part of ξ orthogonal to the stored pattern. The J and W matrices do not act in this subspace, so the frequency dependence of this term is just that of uncoupled oscillators, i.e., Eqn. (10) with J_0 set equal to zero. This response is always highly damped and therefore small. \n\nIt is straightforward to extend this analysis to multiple imprinted patterns. The response consists of a sum of terms, one for each stored pattern. The term for each stored pattern is just like that described in the single-stored-pattern case: it has one part for the input component parallel to the stored pattern and another part for the component orthogonal to the stored pattern. \n\nWe note that, in this linear analysis, an input which overlaps several stored patterns will (if the imprinting and input frequencies match) evoke a resonant response which is a linear combination of the stored patterns. Thus, a network tuned to operate in a nearly linear regime is able to interpolate in forming its representation of the input. For categorical associative memory, on the other hand, a network has to work in the extreme nonlinear limit, responding with only the strongest stored pattern in an input mixture. As our network operates near the threshold for spontaneous oscillations, we expect that it should exhibit properties intermediate between these limits. We find that this is indeed the case in the simulations reported in the next section. From our analysis it turns out that the network behaves like a Hopfield memory (separate basins, without interpolation capability) for patterns with different imprinting frequencies, but at the same time it is able to interpolate among patterns which share a common frequency. \n\nFigure 2: Circles show the non-linear simulation results, stars the linear simulation results, and the dotted line the analytical prediction for the linearized model. A. Importance of frequency match: amplitude of response of output units as a function of the frequency of the current input. The frequency of the imprinted pattern is 41 Hz. B. Importance of amplitude and phase mismatch: amplitude of response as a function of the overlap between current input and imprinted pattern (i.e., |ξ^0*·ξ|), for different presented input patterns ξ. C: Input-output relationship when two orthogonal patterns ξ^1 and ξ^2 have been imprinted at the same frequency ω = 41 Hz. The angle of the output pattern with respect to ξ^1 is shown as a function of the angle of the input pattern with respect to ξ^1, for many different input patterns. \n\n3 Simulations \n\nTo check the validity of the linear approximation in our analysis, we performed numerical simulations of both the non-linear equations (1, 2) and the linearized ones (3). We simulated the recall phase of a network consisting of 10 excitatory and 10 inhibitory cells. The connections J_ij and W_ij were calculated from Eqns. (9), where we used the approximation (12) for the kernel shape A(τ). Parameters were set in such a way that the selective resonance was in the 40-Hz range. In non-linear simulations we used different piecewise-linear activation functions for g_u() and g_v(), as shown in Fig. 1B. We chose the parameters of the functions g_u() and g_v() so that the network equilibrium points ū_i, v̄_i were close to, but below, the high-gain region, i.e. at the points marked with crosses in Fig. 1B. 
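The frequency selectivity of Fig. 2A can be reproduced qualitatively from the linear response formula (10) alone. The sketch below scans the driving frequency and locates the response peak; all numbers (J_0, Ã'(ω_0), Ã''(ω_0), α, βγ) are our own illustrative assumptions, tuned, as Eqn. (11) requires, so that the effective damping 2α - J_0 Ã'(ω_0) is small:

```python
import numpy as np

# Linear response amplitude from Eqn. (10) versus driving frequency, for
# an input matching the imprinted pattern.  All parameter values below
# are illustrative assumptions, not the ones used in the paper.
alpha = 50.0
beta_gamma = 62500.0           # beta*gamma
J0, Ap, App = 90.0, 1.0, -0.2  # J0, Re A~(w0), Im A~(w0) (assumed)

# Imprinting frequency from the resonance condition
# w0^2 = alpha^2 + beta*gamma - J0*w0*A~''(w0):
w0 = 0.5 * (-J0 * App + np.sqrt((J0 * App)**2 + 4 * (alpha**2 + beta_gamma)))

w = np.linspace(0.5 * w0, 1.5 * w0, 20001)
den = (2 * alpha * w - 0.5 * J0 * (w + w0) * Ap
       + 1j * (alpha**2 + beta_gamma - 0.5 * J0 * (w + w0) * App - w**2))
amp = np.abs(w + 1j * alpha) / np.abs(den)

w_peak = w[np.argmax(amp)]     # sharp peak very near w0
```

Because the imaginary part of the denominator vanishes at ω = ω_0 while the real part stays small, the amplitude curve is sharply peaked at the imprinting frequency and falls off quickly away from it, which is the tuning behavior the simulations exhibit.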
\n\nThe results confirm that when the input pattern matches the imprinted one in \nfrequency, amplitude and phase, the network responds with strong resonant oscil(cid:173)\nlations. However, it does not resonate if the frequencies do not match, as shown \nin the frequency tuning curve in Fig. 2A. The behavior when the two frequencies \nare close to each other differs in the linear and nonlinear cases. However, in both \ncases a sharp selectivity in frequency is observed. The dependence on the overlap \nbetween the input and the stored pattern is shown in Fig. 2B. The non-linear case, \nindicated by circles, should be compared with the linear case, where the amplitude \nis always linear in the overlap. In the nonlinear case, the linearity holds roughly \nonly for overlaps lower than about 0.4; for larger overlaps the amplification is as \nhigh as for the perfect match case. This means that input patterns with an overlap \nwith the imprinted one greater than 0.4 lie within the basis of attraction of the \n\n\fW =WI \n\ne = el \n\n50 ,----------, \n\n200 \n\n200 \n\n400 \n\n400 \n\n,NS:C] \n_soC] \n,\" 1--1 \n-soC] -50 \n,.':~ ,. ~O ,.':8 ,. ~8 \n\n,\" 5:0 ,\" S:c=J \n\n_soc=J \n\n-50 \n\n0 \n\n200 \n\n400 \n\n200 \n\n400 \n\n200 \n\n400 \n\n0 \n\n200 \n\no \n\no \n\n0 \n\n- 50 \n\no \n\n200 \n\n400 \n\n-SO \n\n0 \n\n200 \n\n400 \n\n- 50 \n0 \n\n200 \n\n400 \n\n- 50 \n0 \n\n200 \n\n400 \n\nFigure 3: Frequency selectivity: Response evoked on 3 of the 10 neurons. Oscillatory \npatterns ele-iwlt + c.c. and e2e-iw2tc.c. have been imprinted, with e l .1 e and \nWI = 41 Hz, W2 = 63 Hz. During the learning phases the parameter al of kernel was \ntuned appropriately, i.e. al = 0.1 when imprinting e l and al = 1.1 when imprinting \ne\u00b7 \n\nimprinted pattern. \n\nThe response elicited when two orthogonal patterns have been imprinted with the \nsame frequency is shown in Fig. 2C. Let ele-iwot + c.c. and ee-iwot + c.c. denote \nthe imprinted patterns, and ee-iwot + c.c. 
be the input to the trained network. In \nboth linear and non-linear simulations the network responds vigorously(with high(cid:173)\namplitude oscillations) to the drive if e is in the subspace spanned by the imprinted \npatterns, and fails to respond appreciably if e is orthogonal to that plane. When \nthe input pattern e is in the plane spanned by the stored patterns, the resonant \nresponse u also lies in this plane. However, while in the linear case the output is \nproportional to the input, in agreement with the analytical analysis, in the non(cid:173)\nlinear case there are preferred directions, in the stored pattern plane. The figure \n\nshows that, in case simulated here, there are three stable attractors: e, e, and \nthe symmetric linear combination (el + e 2)/V2). \nFinally we performed linear simulations storing two orthogonal patterns ele-iwlt + \nc.c. and e2e-iw2t + c.c. with two different imprinting frequencies. Fig. 3 shows a \ngood performance of the network in separating the basins of attraction in this case. \nThe response to a linear combination of the two patterns, (ae + be)e-iw2t + c.c. \nis proportional to the part of the input whose imprinting frequency matches the \ncurrent driving frequency. Linear combinations of the two imprinted patterns are \nnot attractors if the two patterns do not share the same imprinting frequency. \n\n4 Summary and Discussion \n\nWe have presented a model of learning for memory or input representations in neural \nnetworks with input-driven oscillatory activity. The model structure is an abstrac-\n\n\ftion of the hippocampus or the olfactory cortex. We propose a simple generalized \nHebbian rule, using temporal-activity-dependent LTP and LTD, to encode both \nmagnitudes and phases of oscillatory patterns into the synapses in the network. 
After learning, the model responds resonantly to inputs which have been learned (or, for networks which operate essentially linearly, to linear combinations of learned inputs), but negligibly to other input patterns. Encoding both amplitude and phase enhances computational capacity, for which the price is having to learn both the excitatory-to-excitatory and the excitatory-to-inhibitory connections. Our model puts constraints on the form of the learning kernel A(τ) that should be experimentally observed; e.g., for small oscillation frequencies it requires that the overall LTP dominate the overall LTD, but this requirement should be modified if the stored oscillations are of high frequency. Plasticity in the excitatory-to-inhibitory connections (for which experimental evidence and investigation are still scarce) is required by our model for storing phase-locked but unsynchronized oscillation patterns. \n\nAs for the Hopfield model, we distinguish two functional phases: (1) the learning phase, in which the system is clamped dynamically to the external inputs, and (2) the recall phase, in which the system dynamics is determined by both the external inputs and the internal interactions. \n\nA special property of our model in the linear regime is the following interpolation capability: at a given oscillation frequency, once the system has learned a set of representation states, all other states in the subspace spanned by the learned states can also evoke vigorous responses. Hippocampal place cells could employ such a representation. Each cell has a localised \"place field\", and the superposition of activity of several cells with nearby place fields can represent continuously-varying position. 
The locality of the place fields also means that this representation is conservative (and thus robust), in the sense that interpolation does not extend beyond the spatial range of the experienced locations or to locations in between two learned but distant and disjoint spatial regions. \n\nOf course, this interpolation property is not always desirable. For instance, in categorical memory, one does not want inputs which are linear combinations of stored patterns to elicit responses which are also similar linear combinations. Suitable nonlinearity can (as we saw in the last section) enable the system to perform categorization: one way involves storing different patterns (or, by implication, different classes of patterns) at different frequencies. For instance, in a multimodal area, \"place fields\" might be stored at one oscillation frequency and (say) odor memories at another. It seems likely to us that the brain may employ different kinds and degrees of nonlinearity in different areas or at different times to enhance the versatility of its computations. \n\nReferences \n\n[1] H Markram, J Lubke, M Frotscher, and B Sakmann, Science 275, 213 (1997). \n[2] J C Magee and D Johnston, Science 275, 209 (1997). \n[3] D Debanne, B H Gahwiler, and S M Thompson, J Physiol 507, 237 (1998). \n[4] G Q Bi and M M Poo, J Neurosci 18, 10464 (1998). \n[5] Z Li and J Hertz, Network: Computation in Neural Systems 11, 83-102 (2000). \n[6] Z Li and J J Hopfield, Biol Cybern 61, 379-392 (1989). \n[7] M E Hasselmo, Neural Comp 5, 32-44 (1993).", "award": [], "sourceid": 1828, "authors": [{"given_name": "Silvia", "family_name": "Scarpetta", "institution": null}, {"given_name": "Zhaoping", "family_name": "Li", "institution": null}, {"given_name": "John", "family_name": "Hertz", "institution": null}]}