{"title": "Self-regulation Mechanism of Temporally Asymmetric Hebbian Plasticity", "book": "Advances in Neural Information Processing Systems", "page_first": 245, "page_last": 252, "abstract": null, "full_text": "Self-regulation Mechanism of Temporally\n\nAsymmetric Hebbian Plasticity\n\nNarihisa Matsumoto\n\nMasato Okada\n\nGraduate School of Science and Engineering\n\nRIKEN Brain Science Institute\n\nSaitama University:\n\nRIKEN Brain Science Institute\n\nSaitama 351-0198, Japan\nxmatumo@brain.riken.go.jp\n\nSaitama 351-0198, Japan\nokada@brain.riken.go.jp\n\nAbstract\n\nRecent biological experimental (cid:12)ndings have shown that the synap-\ntic plasticity depends on the relative timing of the pre- and post-\nsynaptic spikes which determines whether Long Term Potentiation\n(LTP) occurs or Long Term Depression (LTD) does. The synaptic\nplasticity has been called \\Temporally Asymmetric Hebbian plas-\nticity (TAH)\". Many authors have numerically shown that spatio-\ntemporal patterns can be stored in neural networks. However, the\nmathematical mechanism for storage of the spatio-temporal pat-\nterns is still unknown, especially the e(cid:11)ects of LTD. In this paper,\nwe employ a simple neural network model and show that inter-\nference of LTP and LTD disappears in a sparse coding scheme.\nOn the other hand, it is known that the covariance learning is in-\ndispensable for storing sparse patterns. We also show that TAH\nqualitatively has the same e(cid:11)ect as the covariance learning when\nspatio-temporal patterns are embedded in the network.\n\n1\n\nIntroduction\n\nRecent biological experimental (cid:12)ndings have indicated that the synaptic plasticity\ndepends on the relative timing of the pre- and post- synaptic spikes which deter-\nmines whether Long Term Potentiation (LTP) occurs or Long Term Depression\n(LTD) does [1, 2, 3]. 
LTP occurs when a presynaptic firing precedes a postsynaptic one by no more than about 20 ms. In contrast, LTD occurs when a presynaptic firing follows a postsynaptic one. A rapid transition between LTP and LTD occurs within a time difference of a few ms. Such a learning rule is called "Temporally Asymmetric Hebbian learning (TAH)" [4, 5] or "Spike Timing Dependent synaptic Plasticity (STDP)" [6]. Many authors have numerically shown that spatio-temporal patterns can be stored in neural networks [6, 7, 8, 9, 10, 11]. Song et al. discussed the variability of spike generation in a network of spiking neurons using TAH [6]. They found that the condition that the area of LTD be slightly larger than that of LTP was indispensable for stability. Namely, the balance of LTP and LTD is crucial. Yoshioka also discussed an associative memory network of spiking neurons using TAH [11]. He found that the area of LTP needed to be equal to that of LTD for stable retrieval. Munro and Hernandez numerically showed that a network can retrieve spatio-temporal patterns even in a noisy environment owing to LTD [9]. However, these previous works did not discuss why TAH is effective for the storage and retrieval of spatio-temporal patterns. Since TAH has not only the effect of LTP but also that of LTD, the interference of LTP and LTD may prevent retrieval of the patterns. To investigate this unknown mathematical mechanism for retrieval, we employ an associative memory network consisting of binary neurons. Simplifying the dynamics of the internal potential enables us to analyze the details of the retrieval process. We use a learning rule formulated similarly to those of the previous works.
We show the mechanism by which spatio-temporal patterns can be retrieved in this network.

There are many works on associative memory networks that store spatio-temporal patterns by covariance learning [12, 13]. Many biological findings imply that sparse coding schemes may be used in the brain [14]. It is well known that covariance learning is indispensable when sparse patterns are embedded in a network as attractors [15, 16]. The information on the firing rate of the stored patterns is not indispensable for TAH, although it is indispensable for covariance learning. We theoretically show that TAH qualitatively has the same effect as covariance learning when spatio-temporal patterns are embedded in the network. This means that the difference in spike times induces LTP or LTD, and the effect of the firing rate information can be canceled out by this spike time difference. We conclude that this is the reason why TAH doesn't require the information on the firing rate of the stored patterns.

2 Model

We investigate a network consisting of N binary neurons that are mutually connected. In this paper, we consider the case N → ∞. We use a neuronal model with binary state, {0, 1}, discrete time steps, and the following synchronous updating rule,

u_i(t) = Σ_{j=1}^{N} J_ij x_j(t),  (1)

x_i(t + 1) = Θ(u_i(t) − θ),  (2)

Θ(u) = 1 if u ≥ 0; 0 if u < 0,  (3)

where x_i(t) is the state of the i-th neuron at time t, u_i(t) its internal potential, and θ a uniform threshold. If the i-th neuron fires at time t, its state is x_i(t) = 1; otherwise, x_i(t) = 0. The specific value of the threshold is discussed later. J_ij is the synaptic weight from the j-th neuron to the i-th neuron.
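As a minimal illustration of the dynamics of equations (1)-(3), the synchronous update can be written in a few lines of NumPy. This is our own sketch, not the authors' code; the function name `update_state` and the toy weights are ours:

```python
import numpy as np

def update_state(J, x, theta):
    """One synchronous step of equations (1)-(3).

    J: (N, N) synaptic weight matrix, x: binary state vector in {0, 1},
    theta: uniform threshold. Returns x(t + 1).
    """
    u = J @ x                         # internal potentials u_i(t), equation (1)
    return (u >= theta).astype(int)   # Heaviside step Theta, equations (2)-(3)

# Tiny example: two mutually exciting neurons with threshold 0.5
J = np.array([[0.0, 1.0],
              [1.0, 0.0]])
x = np.array([1, 0])
print(update_state(J, x, 0.5))   # the active neuron drives its neighbour: [0 1]
```

Because the update is synchronous, the whole state vector is recomputed from the potentials of the previous step, matching the discrete-time formulation above.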
Each element ξ^μ_i of the μ-th memory pattern ξ^μ = (ξ^μ_1, ξ^μ_2, ..., ξ^μ_N) is generated independently by,

Prob[ξ^μ_i = 1] = 1 − Prob[ξ^μ_i = 0] = f.  (4)

The expectation of ξ^μ_i is E[ξ^μ_i] = f, and thus f can be considered as the mean firing rate of the memory pattern. The memory pattern is "sparse" when f → 0, and this coding scheme is called "sparse coding". The synaptic weight J_ij follows the synaptic plasticity that depends on the difference in spike times between the i-th (post-) and j-th (pre-) neurons. This difference determines whether LTP or LTD occurs. Such a learning rule is called "Temporally Asymmetric Hebbian learning (TAH)" or "Spike Timing Dependent synaptic Plasticity (STDP)". The biological experimental findings indicate that LTP or LTD is induced when the difference in the pre- and postsynaptic spike times falls within about 20 ms [3] (Figure 1(a)). We define that one time step in equations (1)-(3) corresponds to 20 ms in Figure 1(a), and a time duration within 20 ms is ignored (Figure 1(b)). Figure 1(b) shows that LTP occurs when the j-th neuron fires one time step before the i-th neuron does, ξ^{μ+1}_i = ξ^μ_j = 1, and that LTD occurs when the j-th neuron fires one time step after the i-th neuron does, ξ^{μ−1}_i = ξ^μ_j = 1. The previous work indicates that the balance of LTP and LTD is significant [6].
Therefore, we define that the area of LTP is the same as that of LTD, and that the amplitude of LTP is also the same as that of LTD.

Figure 1: Temporally Asymmetric Hebbian plasticity. (a): The result of a biological finding [3] (change in EPSP amplitude (%) as a function of t_pre − t_post (ms)) and (b): the learning rule in our model (J_ij as a function of t_j − t_i). LTP occurs when the j-th neuron fires one time step before the i-th one. On the contrary, LTD occurs when the j-th neuron fires one time step after the i-th one. The synaptic weight J_ij follows this rule.

On the basis of these definitions, we employ the following learning rule,

J_ij = (1 / (N f (1 − f))) Σ_{μ=1}^{p} (ξ^{μ+1}_i ξ^μ_j − ξ^{μ−1}_i ξ^μ_j).  (5)

The number of memory patterns is p = αN, where α is defined as the "loading rate". There is a critical value α_C of the loading rate: if the loading rate exceeds α_C, the pattern sequence becomes unstable. α_C is called the "storage capacity". Previous works have shown that the learning rule of equation (5) can store spatio-temporal patterns, that is, pattern sequences [9, 10]. We show that the p memory patterns are retrieved periodically: ξ^1 → ξ^2 → ··· → ξ^p → ξ^1 → ···.
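Since the sequence is retrieved periodically, we read the pattern indices in equation (5) cyclically (ξ^{p+1} = ξ^1, ξ^0 = ξ^p). A sketch of the weight construction under that assumption; the function name `tah_weights` and the toy patterns are ours:

```python
import numpy as np

def tah_weights(xi, f):
    """TAH weight matrix of equation (5) for a cyclic pattern sequence.

    xi: (p, N) array of binary memory patterns, row mu is xi^mu;
    f: mean firing rate. Pattern indices are taken modulo p.
    """
    p, N = xi.shape
    ltp = np.roll(xi, -1, axis=0).T @ xi   # sum_mu xi^{mu+1}_i xi^mu_j  (LTP term)
    ltd = np.roll(xi, +1, axis=0).T @ xi   # sum_mu xi^{mu-1}_i xi^mu_j  (LTD term)
    return (ltp - ltd) / (N * f * (1 - f))

# Three toy patterns over two neurons, f = 0.5
xi = np.array([[1, 0],
               [0, 1],
               [1, 1]])
J = tah_weights(xi, 0.5)
```

The `np.roll` calls implement the index shifts μ → μ ± 1 without an explicit loop; the sign of the shift decides whether a pair contributes to potentiation or depression.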
In other words, ξ^1 is retrieved at t = 1, ξ^2 at t = 2, and ξ^1 again at t = p + 1.

Here, we discuss the value of the threshold θ. It is well known that the threshold value should be controlled time-dependently according to the progress of the retrieval process [15, 16]. One candidate algorithm for controlling the threshold value is to maintain the mean firing rate of the network at that of the memory pattern, f, as follows,

f = (1/N) Σ_{i=1}^{N} x_i(t) = (1/N) Σ_{i=1}^{N} Θ(u_i(t) − θ(t)).  (6)

It is known that the obtained threshold value is nearly optimal, since it approximately gives a maximal storage capacity value [16].

3 Theory

Many neural network models that store and retrieve sequential patterns by TAH have been discussed by many authors [7, 8, 9, 10]. They have numerically shown that TAH is effective for storing pattern sequences. For example, Munro and Hernandez showed that their model could retrieve a stored pattern sequence even in a noisy environment [9]. However, these previous works have not explained why TAH is effective. Exploring such a mechanism is the main purpose of our paper.

Here, we discuss the mechanism by which the network trained by TAH can store and retrieve sequential patterns. Before providing details of the retrieval process, we discuss a simple situation where the number of memory patterns is very small relative to the number of neurons, i.e., p ∼ O(1). Let the state at time t be the same as the t-th memory pattern: x(t) = ξ^t. Then, the internal potential u_i(t) of equation (1) is given by,

u_i(t) = ξ^{t+1}_i − ξ^{t−1}_i.  (7)

u_i(t) depends on two independent random variables, ξ^{t+1}_i and ξ^{t−1}_i, according to equation (4).
The first term ξ^{t+1}_i of equation (7) is a signal term for the recall of the pattern ξ^{t+1}, which is designed to be retrieved at time t + 1, and the second term ξ^{t−1}_i can interfere with the retrieval of ξ^{t+1}. According to equation (7), u_i(t) takes a value of 0, −1 or +1. ξ^{t−1}_i = 1 means that the interference of LTD exists. If the threshold θ(t) is set between 0 and +1, ξ^{t+1}_i = 0 isn't influenced by the interference of ξ^{t−1}_i = 1. When ξ^{t+1}_i = 1 and ξ^{t−1}_i = 1, the interference does influence the retrieval of ξ^{t+1}. We consider the probability distribution of the internal potential u_i(t) to examine how the interference of LTD influences the retrieval of ξ^{t+1}. The probability of ξ^{t+1}_i = 1 and ξ^{t−1}_i = 1 is f², that of ξ^{t+1}_i = 1 and ξ^{t−1}_i = 0 is f − f², that of ξ^{t+1}_i = 0 and ξ^{t−1}_i = 1 is f − f², and that of ξ^{t+1}_i = 0 and ξ^{t−1}_i = 0 is (1 − f)². Then the probability distribution of u_i(t) is given by,

Prob(u_i(t)) = (f − f²)δ(u_i(t) − 1) + (1 − 2f + 2f²)δ(u_i(t)) + (f − f²)δ(u_i(t) + 1).  (8)

Since the threshold θ(t) is set between 0 and +1, the state x_i(t + 1) is 1 with probability f − f² and 0 with probability 1 − f + f². The overlap between the state x(t + 1) and the memory pattern ξ^{t+1} is given by,

m^{t+1}(t + 1) = (1 / (N f (1 − f))) Σ_{i=1}^{N} (ξ^{t+1}_i − f) x_i(t + 1) = 1 − f.  (9)

In a sparse limit, f → 0, the probability of ξ^{t+1}_i = 1 and ξ^{t−1}_i = 1 approaches 0. This means that the interference of LTD disappears in a sparse limit, and the model can retrieve the next pattern ξ^{t+1}.
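This finite-loading argument is easy to check numerically. The sketch below is our own construction (all variable names are ours): it draws ξ^{t+1} and ξ^{t−1} from equation (4), applies equation (7) with θ = 0.5, and estimates the overlap of equation (9), which should be close to 1 − f:

```python
import numpy as np

rng = np.random.default_rng(0)
N, f, theta = 200_000, 0.1, 0.5

# Draw the target xi^{t+1} and the LTD-interference pattern xi^{t-1}, equation (4)
xi_next = (rng.random(N) < f).astype(int)
xi_prev = (rng.random(N) < f).astype(int)

u = xi_next - xi_prev                # internal potential, equation (7)
x = (u >= theta).astype(int)         # fires only when xi^{t+1}_i = 1 and xi^{t-1}_i = 0

# Overlap of equation (9); analytically equal to 1 - f
m = np.sum((xi_next - f) * x) / (N * f * (1 - f))
print(m)   # close to 0.9 for f = 0.1
```

Repeating the experiment with smaller f pushes the estimated overlap toward 1, which is the sparse-limit statement above.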
Then the overlap m^{t+1}(t + 1) approaches 1.

Next, we discuss whether the information on the firing rate is indispensable for TAH or not. To investigate this, we consider the case where the number of memory patterns is extensively large, i.e., p ∼ O(N). Using the overlap defined in equation (9), the internal potential u_i(t) of the i-th neuron at time t is represented as,

u_i(t) = (ξ^{t+1}_i − ξ^{t−1}_i) m^t(t) + z_i(t),  (10)

z_i(t) = Σ_{μ≠t} (ξ^{μ+1}_i − ξ^{μ−1}_i) m^μ(t).  (11)

z_i(t) is called the "cross-talk noise", which represents contributions from non-target patterns excluding ξ^{t−1} and prevents the target pattern ξ^{t+1} from being retrieved. This term vanishes in the finite loading case, p ∼ O(1).

It is well known that covariance learning is indispensable when sparse patterns are embedded in a network as attractors [15, 16]. Under sparse coding schemes, unless covariance learning is employed, the cross-talk noise diverges in the large N limit; consequently, the patterns cannot be stored. The information on the firing rate of the stored patterns is not indispensable for TAH, although it is indispensable for covariance learning. We use the method of "statistical neurodynamics" [17, 18] to examine whether the variance of the cross-talk noise diverges or not. If a pattern sequence can be stored, the cross-talk noise obeys a Gaussian distribution with mean 0 and time-dependent variance σ²(t); otherwise, σ²(t) diverges. Since σ²(t) changes over time, it is necessary to control the threshold at an appropriate value at each time step [15, 16].
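In a finite-N simulation, one simple way to enforce the constraint of equation (6) at each step (an implementation choice of ours, not necessarily the authors' algorithm) is to rank the internal potentials and place θ(t) just below the k-th largest one, with k ≈ fN:

```python
import numpy as np

def rate_matching_threshold(u, f):
    """Pick theta(t) so that (approximately) a fraction f of neurons fire.

    u: vector of internal potentials u_i(t); f: target mean firing rate.
    With discrete potentials the rate is matched only approximately; ties at
    the cut-off can make slightly more than f*N neurons fire.
    """
    k = max(1, int(round(f * len(u))))     # number of neurons allowed to fire
    kth_largest = np.sort(u)[::-1][k - 1]
    return kth_largest - 1e-12             # theta just below the k-th largest u_i

u = np.array([0.9, 0.1, 0.5, 0.7, 0.2])
theta = rate_matching_threshold(u, 0.4)    # k = 2, so theta sits just below 0.7
print(np.mean(u >= theta))                 # mean firing rate: 0.4
```

Sorting makes the step O(N log N); for the analysis below, what matters is only that the constraint of equation (6) holds at every time step.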
According to the statistical neurodynamics, we obtain recursive equations for the overlap m^t(t) between the network state x(t) and the target pattern ξ^t and for the variance σ²(t). The details of the derivation will be shown elsewhere. Here, we show the recursive equations for m^t(t) and σ²(t),

m^t(t) = ((1 − 2f)/2) erf(φ_0) − ((1 − f)/2) erf(φ_1) + (f/2) erf(φ_2),  (12)

σ²(t) = Σ_{a=0}^{t−1} 2(a+1)C(a+1) α q(t − a) Π_{b=1}^{a} U²(t − b + 1),  (13)

U(t) = (1 / (√(2π) σ(t − 1))) {(1 − 2f + 2f²) e^{−φ_0²} + f(1 − f)(e^{−φ_1²} + e^{−φ_2²})},  (14)

q(t) = (1/2) (1 − (1 − 2f + 2f²) erf(φ_0) − f(1 − f)(erf(φ_1) + erf(φ_2))),  (15)

erf(y) = (2/√π) ∫_0^y exp(−u²) du,  bCa = b! / (a!(b − a)!),  a! = a × (a − 1) × ··· × 1,

φ_0 = θ(t − 1) / (√2 σ(t − 1)),  φ_1 = (−m^{t−1}(t − 1) + θ(t − 1)) / (√2 σ(t − 1)),  φ_2 = (m^{t−1}(t − 1) + θ(t − 1)) / (√2 σ(t − 1)).

These equations reveal that the variance σ²(t) of the cross-talk noise does not diverge as long as the pattern sequence can be retrieved. This result means that TAH qualitatively has the same effect as covariance learning.

Next, we discuss the mechanism by which the variance of the cross-talk noise does not diverge. Let us consider equation (5).
The synaptic weight J_ij from the j-th neuron to the i-th neuron can also be rewritten as follows,

J_ij = (1 / (N f (1 − f))) Σ_{μ=1}^{p} (ξ^{μ+1}_i ξ^μ_j − ξ^{μ−1}_i ξ^μ_j) = (1 / (N f (1 − f))) Σ_{μ=1}^{p} (ξ^μ_i ξ^{μ−1}_j − ξ^μ_i ξ^{μ+1}_j)
= (1 / (N f (1 − f))) Σ_{μ=1}^{p} ξ^μ_i {(ξ^{μ−1}_j − f) − (ξ^{μ+1}_j − f)}.  (16)

This equation implies that TAH effectively carries the information on the firing rate of the memory patterns when spatio-temporal patterns are embedded in a network. Therefore, the variance of the cross-talk noise doesn't diverge, and this is another factor enabling the network trained by TAH to store and retrieve a pattern sequence. We conclude that the difference in spike times induces LTP or LTD, and the effect of the firing rate information can be canceled out by this spike time difference.

4 Results

We investigate the properties of our model under the following two conditions: a fixed threshold and a time-dependent threshold, using the statistical neurodynamics and computer simulations.

Figure 2 shows how the overlap m^t(t) and the mean firing rate of the network, x̄(t) = (1/N) Σ_i x_i(t), depend on the loading rate α when the mean firing rate of the memory pattern is f = 0.1 and the threshold is θ = 0.52, where the storage capacity is maximal with respect to the threshold θ. The stored pattern sequence can be retrieved when the initial overlap m^1(1) is greater than a critical value m_C. The lower line indicates how the critical initial overlap m_C depends on the loading rate α. In other words, the lower line represents the basin of attraction for the retrieved sequence.
The upper line denotes the steady value of the overlap m^t(t) when the pattern sequence is retrieved. m^t(t) is obtained by setting the initial state to the first memory pattern: x(1) = ξ^1. In this case, the storage capacity is α_C = 0.27. The dashed line shows the steady value of the normalized mean firing rate of the network, x̄(t)/f, for the pattern sequence. The data points and error bars indicate the results of computer simulations with 5000 neurons: N = 5000. The former indicate mean values and the latter variances over 10 trials.

Figure 2: The critical overlap (the lower line) and the overlap at the stationary state (the upper line) as functions of the loading rate (vertical axis: overlap (solid) and activity/f (dashed)). The dashed line shows the mean firing rate of the network divided by the firing rate, which is 0.1. The threshold is 0.52 and the number of neurons is 5000. The data points and error bars show the means and variances, respectively, over 10 trials of computer simulations. The storage capacity is 0.27.

Since the results of the computer simulations coincide with those of the statistical neurodynamics, hereafter we show only the results of the statistical neurodynamics.

Next, we examine the threshold control scheme of equation (6), where the threshold is controlled to maintain the mean firing rate of the network at f. q(t) in equation (15) is equal to the mean firing rate, because q(t) = (1/N) Σ_{i=1}^{N} (x_i(t))² = (1/N) Σ_{i=1}^{N} x_i(t) under the condition x_i(t) ∈ {0, 1}.
Thus, the threshold is adjusted to satisfy the following equation,

f = q(t) = (1/2) (1 − (1 − 2f + 2f²) erf(φ_0) − f(1 − f)(erf(φ_1) + erf(φ_2))).  (17)

Figure 3 shows the overlap m^t(t) as a function of the loading rate α with f = 0.1. The storage capacity is α_C = 0.234. The basin of attraction becomes larger than that of the fixed threshold condition, θ = 0.52 (Figure 2). Thus, the network becomes robust against noise. This means that even if the initial state x(1) is different from the first memory pattern ξ^1, that is, even if the state includes a lot of noise, the pattern sequence can be retrieved.

Figure 3: The critical overlap (the lower line) and the overlap at the stationary state (the upper line) as functions of the loading rate, when the threshold changes over time to maintain the mean firing rate of the network at f. The dashed line shows the mean firing rate of the network divided by the firing rate, which is 0.1. The basin of attraction becomes larger than that of the fixed threshold condition of Figure 2.

Finally, we discuss how the storage capacity depends on the firing rate f of the memory pattern. It is known that the storage capacity diverges as 1/(f |log f|) in a sparse limit, f → 0 [19, 20]. Therefore, we investigate the asymptotic property of the storage capacity in a sparse limit. Figure 4 shows how the storage capacity depends on the firing rate when the threshold is controlled to maintain the network activity at f (symbol ○).
The storage capacity diverges as 1/(f |log f|) in the sparse limit.

Figure 4: The storage capacity as a function of f in the case of maintaining the activity at f (symbol ○), plotted as α_C f against 1/|log f|. The storage capacity diverges as 1/(f |log f|) in a sparse limit.

5 Discussion

Using a simple neural network model, we have discussed the mechanism by which TAH enables the network to store and retrieve a pattern sequence. First, we showed that the interference of LTP and LTD disappears in a sparse coding scheme. This is one factor enabling the network to store and retrieve a pattern sequence. Next, we showed that TAH qualitatively has the same effect as covariance learning, by analyzing the stability of the stored pattern sequence and the retrieval process by means of the statistical neurodynamics. Consequently, the variance of the cross-talk noise doesn't diverge, and this is another factor enabling the network trained by TAH to store and retrieve a pattern sequence. We conclude that the difference in spike times induces LTP or LTD, and the effect of the firing rate information can be canceled out by this spike time difference. We investigated the properties of our model. To improve the retrieval property of the basin of attraction, we introduced a threshold control algorithm where the threshold value is adjusted to maintain the mean firing rate of the network at that of a memory pattern. As a result, we found that this scheme enlarges the basin of attraction and that the network becomes robust against noise. We also found that the storage capacity diverges as 1/(f |log f|) in a sparse limit, f → 0.

Here, we compare the storage capacity of our model with that of the model using covariance learning (Figure 5).
The dynamical equations of the model using covariance learning were derived by Kitano and Aoyagi [13]. We calculate the storage capacity α_C^COV from their dynamical equations and compare it with that of our model, α_C^TAH, via the ratio α_C^TAH / α_C^COV. The threshold control method is the same as in this paper. As f decreases, the ratio of the storage capacities approaches 0.5. The contribution of LTD reduces the storage capacity of our model to half. Therefore, in terms of storage capacity, covariance learning is better than TAH. But, as we discussed previously, the information on the firing rate is not indispensable for TAH, whereas it is for covariance learning, and in biological systems obtaining the information on the firing rate is difficult.

Figure 5: Comparison of the storage capacity of our model with that of the model using covariance learning, plotted as α_C^TAH / α_C^COV against log10 f. As f decreases, the ratio of the storage capacities approaches 0.5.

References

[1] G. Q. Bi and M. M. Poo. Synaptic modifications in cultured hippocampal neurons: Dependence on spike timing, synaptic strength, and postsynaptic cell type. The Journal of Neuroscience, 18:10464-10472, 1998.

[2] H. Markram, J. Lübke, M. Frotscher, and B. Sakmann. Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs. Science, 275:213-215, 1997.

[3] L. I. Zhang, H. W. Tao, C. E. Holt, W. A. Harris, and M. M. Poo. A critical window for cooperation and competition among developing retinotectal synapses. Nature, 395:37-44, 1998.

[4] L. F. Abbott and S. Song. Temporally asymmetric hebbian learning, spike timing and neuronal response variability. In Advances in Neural Information Processing Systems 11, pages 69-75. MIT Press, 1999.

[5] J. Rubin, D. D.
Lee, and H. Sompolinsky. Equilibrium properties of temporally asymmetric hebbian plasticity. Physical Review Letters, 86:364-367, 2001.

[6] S. Song, K. D. Miller, and L. F. Abbott. Competitive hebbian learning through spike-timing-dependent synaptic plasticity. Nature Neuroscience, 3:919-926, 2000.

[7] W. Gerstner, R. Kempter, J. L. van Hemmen, and H. Wagner. A neuronal learning rule for sub-millisecond temporal coding. Nature, 383:76-78, 1996.

[8] R. Kempter, W. Gerstner, and J. L. van Hemmen. Hebbian learning and spiking neurons. Physical Review E, 59:4498-4514, 1999.

[9] P. Munro and G. Hernandez. LTD facilitates learning in a noisy environment. In Advances in Neural Information Processing Systems 12, pages 150-156. MIT Press, 2000.

[10] R. P. N. Rao and T. J. Sejnowski. Predictive sequence learning in recurrent neocortical circuits. In Advances in Neural Information Processing Systems 12, pages 164-170. MIT Press, 2000.

[11] M. Yoshioka. To be published in Physical Review E, 2001.

[12] G. Chechik, I. Meilijson, and E. Ruppin. Effective learning requires neuronal remodeling of hebbian synapses. In Advances in Neural Information Processing Systems 11, pages 96-102. MIT Press, 1999.

[13] K. Kitano and T. Aoyagi. Retrieval dynamics of neural networks for sparsely coded sequential patterns. Journal of Physics A: Mathematical and General, 31:L613-L620, 1998.

[14] M. Miyashita. Neuronal correlate of visual associative long-term memory in the primate temporal cortex. Nature, 335:817-820, 1988.

[15] S. Amari. Characteristics of sparsely encoded associative memory. Neural Networks, 2:1007-1018, 1989.

[16] M. Okada. Notions of associative memory and sparse coding. Neural Networks, 9:1429-1458, 1996.

[17] S. Amari and K. Maginu. Statistical neurodynamics of various versions of correlation associative memory. Neural Networks, 1:63-73, 1988.

[18] M. Okada.
A hierarchy of macrodynamical equations for associative memory. Neural Networks, 8:833-838, 1995.

[19] M. V. Tsodyks and M. V. Feigel'man. The enhanced storage capacity in neural networks with low activity level. Europhysics Letters, 6:101-105, 1988.

[20] C. J. Perez-Vicente and D. J. Amit. Optimized network for sparsely coded patterns. Journal of Physics A: Mathematical and General, 22:559-569, 1989.
", "award": [], "sourceid": 1972, "authors": [{"given_name": "N.", "family_name": "Matsumoto", "institution": null}, {"given_name": "M.", "family_name": "Okada", "institution": null}]}