{"title": "Information through a Spiking Neuron", "book": "Advances in Neural Information Processing Systems", "page_first": 75, "page_last": 81, "abstract": null, "full_text": "Information through a Spiking Neuron \n\nCharles F. Stevens and Anthony Zador \n\nSalk Institute MNL/S \n\nLa Jolla, CA 92037 \n\nzador@salk.edu \n\nAbstract \n\nWhile it is generally agreed that neurons transmit information about their synaptic inputs through spike trains, the code by which this information is transmitted is not well understood. An upper bound on the information encoded is obtained by hypothesizing that the precise timing of each spike conveys information. Here we develop a general approach to quantifying the information carried by spike trains under this hypothesis, and apply it to the leaky integrate-and-fire (IF) model of neuronal dynamics. We formulate the problem in terms of the probability distribution p(T) of interspike intervals (ISIs), assuming that spikes are detected with arbitrary but finite temporal resolution. In the absence of added noise, all the variability in the ISIs could encode information, and the information rate is simply the entropy of the ISI distribution, H(T) = <-p(T) log2 p(T)>, times the spike rate. H(T) thus provides an exact expression for the information rate. The methods developed here can be used to determine experimentally the information carried by spike trains, even when the lower bound of the information rate provided by the stimulus reconstruction method is not tight. In a preliminary series of experiments, we have used these methods to estimate information rates of hippocampal neurons in slice in response to somatic current injection. These pilot experiments suggest information rates as high as 6.3 bits/spike. \n\n1 Information rate of spike trains \n\nCortical neurons use spike trains to communicate with other neurons. 
The output of each neuron is a stochastic function of its input from the other neurons. It is of interest to know how much each neuron is telling other neurons about its inputs. \n\nHow much information does the spike train provide about a signal? Consider noise n(t) added to a signal s(t) to produce some total input y(t) = s(t) + n(t). This is then passed through a (possibly stochastic) functional F to produce the output spike train F[y(t)] -> z(t). We assume that all the information contained in the spike train can be represented by the list of spike times; that is, there is no extra information contained in properties such as spike height or width. Note, however, that many characteristics of the spike train such as the mean or instantaneous rate can be derived from this representation; if such a derivative property turns out to be the relevant one, then this formulation can be specialized appropriately. \n\nWe will be interested, then, in the mutual information I(S(t); Z(t)) between the input signal ensemble S(t) and the output spike train ensemble Z(t). This is defined in terms of the entropy H(S) of the signal, the entropy H(Z) of the spike train, and their joint entropy H(S, Z), \n\nI(S; Z) = H(S) + H(Z) - H(S, Z).   (1) \n\nNote that the mutual information is symmetric, I(S; Z) = I(Z; S), since the joint entropy H(S, Z) = H(Z, S). Note also that if the signal S(t) and the spike train Z(t) are completely independent, then the mutual information is 0, since the joint entropy is just the sum of the individual entropies H(S, Z) = H(S) + H(Z). This is completely in line with our intuition, since in this case the spike train can provide no information about the signal. 
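Eq. (1) can be checked on a small discrete example. The following sketch (function names and probability tables are our own, pure Python) computes I(S; Z) from a joint distribution given as a 2-D list:

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution (zero entries skipped)."""
    return -sum(p * log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(S;Z) = H(S) + H(Z) - H(S,Z), eq. (1), for a joint distribution
    joint[s][z] over discrete variables S and Z."""
    p_s = [sum(row) for row in joint]           # marginal over Z
    p_z = [sum(col) for col in zip(*joint)]     # marginal over S
    p_sz = [p for row in joint for p in row]    # flattened joint
    return entropy(p_s) + entropy(p_z) - entropy(p_sz)

# Independent S and Z: the joint factorizes, H(S,Z) = H(S) + H(Z), so I = 0.
independent = [[0.5 * 0.25, 0.5 * 0.75],
               [0.5 * 0.25, 0.5 * 0.75]]
print(mutual_information(independent))

# Perfectly correlated S and Z: I(S;Z) = H(S) = 1 bit.
correlated = [[0.5, 0.0],
              [0.0, 0.5]]
print(mutual_information(correlated))
```

The independent case illustrates the remark above: when H(S, Z) = H(S) + H(Z), the mutual information vanishes.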
\n\n1.1 Information estimation through stimulus reconstruction \n\nBialek and colleagues (Bialek et al., 1991) have used the reconstruction method to obtain a strict lower bound on the mutual information in an experimental setting. This method is based on an expression mathematically equivalent to eq. (1) involving the conditional entropy H(S|Z) of the signal given the spike train, \n\nI(S; Z) = H(S) - H(S|Z) >= H(S) - H_est(S|Z),   (2) \n\nwhere H_est(S|Z) is an upper bound on the conditional entropy obtained from a reconstruction s_est(t) of the signal. The entropy is estimated from the second order statistics of the reconstruction error e(t) = s(t) - s_est(t); from the maximum entropy property of the Gaussian this is an upper bound. Intuitively, the first equation says that the information gained about the signal by observing the spike train is just the initial uncertainty of the signal (in the absence of knowledge of the spike train) minus the uncertainty that remains about the signal once the spike train is known, and the second equation says that this second uncertainty must be greater for any particular estimate than for the optimal estimate. \n\n1.2 Information estimation through spike train reliability \n\nWe have adopted a different approach based on an equivalent expression for the mutual information: \n\nI(S; Z) = H(Z) - H(Z|S).   (3) \n\nThe first term H(Z) is the entropy of the spike train, while the second H(Z|S) is the conditional entropy of the spike train given the signal; intuitively this is like the inverse repeatability of the spike train given repeated applications of the same signal. Eq. (3) has the advantage that, if the spike train is a deterministic function of the input, it permits exact calculation of the mutual information. This follows from an important difference between the conditional entropy term here and in eq. 
(2): whereas H(S|Z) has both a deterministic and a stochastic component, H(Z|S) has only a stochastic component. Thus in the absence of added noise, the discrete entropy H(Z|S) = 0, and eq. (3) reduces to I(S; Z) = H(Z). \n\nIf ISIs are independent, then H(Z) can be simply expressed in terms of the entropy of the (discrete) ISI distribution p(T), \n\nH(T) = - Σ_{i=0}^{∞} p(T_i) log2 p(T_i)   (4) \n\nas H(Z) = nH(T), where n is the number of spikes in Z. Here p(T_i) is the probability that the spike occurred in the interval (i)Δt to (i+1)Δt. The assumption of finite timing precision Δt keeps the potential information finite. The advantage of considering the ISI distribution p(T) rather than the full spike train distribution p(Z) is that the former is univariate while the latter is multivariate; estimating the former requires much less data. \n\nUnder what conditions are ISIs independent? Correlations between ISIs can arise either through the stimulus or the spike generation mechanism itself. Below we shall guarantee that correlations do not arise from the spike generator by considering the forgetful integrate-and-fire (IF) model, in which all information about the previous spike is eliminated by the next spike. If we further limit ourselves to temporally uncorrelated stimuli (i.e. stimuli drawn from a white noise ensemble), then we can be sure that ISIs are independent, and eq. (4) can be applied. \n\nIn the presence of noise, H(T|S) must also be evaluated, to give \n\nI(S; T) = H(T) - H(T|S).   (5) \n\nH(T|S) is the conditional entropy of the ISI given the signal, \n\nH(T|S) = - < Σ_{j=1}^{∞} p(T_j|s_i(t)) log2 p(T_j|s_i(t)) >_{s_i(t)}   (6) \n\nwhere p(T_j|s_i(t)) is the probability of obtaining an ISI of T_j in response to a particular stimulus s_i(t) in the presence of noise n(t). 
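Eq. (4) is straightforward to apply to a measured list of ISIs. A minimal sketch (function name and sample values are our own), binning each interval at resolution Δt:

```python
from collections import Counter
from math import log2

def isi_entropy_bits(isis, dt):
    """Entropy per spike, eq. (4): bin each ISI at resolution dt, then
    compute -sum_i p(T_i) log2 p(T_i) over the empirical bin probabilities."""
    counts = Counter(int(t / dt) for t in isis)
    n = len(isis)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Four ISIs landing in four distinct 1-msec bins: entropy is log2(4) = 2 bits/spike.
print(isi_entropy_bits([0.0015, 0.0025, 0.0035, 0.0045], dt=0.001))  # -> 2.0
```

With H(T) in hand, H(Z) = nH(T) follows directly when the ISIs are independent.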
The conditional entropy can be thought of as a quantification of the reliability of the spike generating mechanism: it is the average trial-to-trial variability of the spike train generated in response to repeated applications of the same stimulus. \n\n1.3 Maximum spike train entropy \n\nIn what follows, it will be useful to compare the information rate for the IF neuron with the limiting case of an exponential ISI distribution, which has the maximum entropy for any point process of the given rate (Papoulis, 1984). This provides an upper bound on the information rate possible for any spike train, given the spike rate and the temporal precision. Let f(T) = r e^{-rT} be an exponential distribution with a mean spike rate r. Assuming a temporal precision of Δt, the entropy/spike is H(T) = log2 (e/(rΔt)), and the entropy/time for a rate r is rH(T) = r log2 (e/(rΔt)). For example, if r = 1 Hz and Δt = 0.001 sec, this gives (11.4 bits/spike)(1 spike/second) = 11.4 bits/second. That is, if we discretize a 1 Hz spike train into 1 msec bins, it is not possible for it to transmit more than 11.4 bits/second. If we halve the bin size, the entropy increases by log2 2 = 1 bit/spike to 12.4 bits/spike, while if we double it we lose one bit/spike to get 10.4 bits/spike. Note that at a different firing rate, e.g. r = 2 Hz, halving the bin size still increases the entropy/spike by 1 bit/spike, but because the spike rate is twice as high, this becomes a 2 bits/second increase in the information rate. \n\n1.4 The IF model \n\nNow we consider the functional F describing the forgetful leaky IF model of spike generation. Suppose we add some noise n(t) to a signal s(t), y(t) = n(t) + s(t), and threshold the sum to produce a spike train z(t) = F[s(t) + n(t)]. 
Specifically, suppose the voltage v(t) of the neuron obeys dv(t)/dt = -v(t)/τ + y(t), where τ is the membrane time constant, both s(t) and n(t) have white Gaussian distributions, and y(t) has mean μ and variance σ². If the voltage reaches the threshold θ_0 at some time t, the neuron emits a spike at that time and resets to the initial condition v_0. \n\nIn the language of neurobiology, this model can be thought of (Tuckwell, 1988) as the limiting case of a neuron with a leaky IF spike generating mechanism receiving many excitatory and inhibitory synaptic inputs. Note that since the input y(t) is white, there are no correlations in the spike train induced by the signal, and since the neuron resets after each spike there are no correlations induced by the spike-generating mechanism. Thus ISIs are independent, and eq. (4) can be applied. \n\nWe will estimate the mutual information I(S, Z) between the ensemble of input signals S and the ensemble of outputs Z. Since in this model ISIs are independent by construction, we need only evaluate H(T) and H(T|S); for this we must determine p(T), the distribution of ISIs, and p(T|s_i), the conditional distribution of ISIs for an ensemble of signals s_i(t). Note that p(T) corresponds to the first passage time distribution of the Ornstein-Uhlenbeck process (Tuckwell, 1988). \n\nThe neuron model we are considering has two regimes determined by the relation of the asymptotic membrane potential (in the absence of threshold) μτ and the threshold θ. In the suprathreshold regime, μτ > θ, threshold crossings occur even if the signal variance is zero (σ² = 0). In the subthreshold regime, μτ <= θ, threshold crossings occur only if σ² > 0. However, in the limit that E{T} >> τ, i.e. 
the mean firing rate is low compared with the integration time constant (this can only occur in the subthreshold regime), the ISI distribution is exponential, and its coefficient of variation (CV) is unity (cf. (Softky and Koch, 1993)). In this low-rate regime the firing is deterministically Poisson; by this we mean to distinguish it from the more usual usage of Poisson neuron, the stochastic situation in which the instantaneous firing rate parameter (the probability of firing over some interval) depends on the stimulus (i.e. f ∝ s(t)). In the present case the exponential ISI distribution arises from a deterministic mechanism. \n\nAt the border between these regimes, when the threshold is just equal to the asymptotic potential, θ_0 = μτ, we have an explicit and exact solution for the entire ISI distribution (Sugiyama et al., 1970) \n\np(T) = [μτ / ((2π)^{1/2} σ)] (τ/2)^{-3/2} [e^{2T/τ} - 1]^{-3/2} exp(2T/τ - (μτ)² / (σ²τ(e^{2T/τ} - 1))).   (7) \n\nThis is the special case where, in the absence of fluctuations (σ² = 0), the membrane potential hovers just subthreshold. Its neurophysiological interpretation is that the excitatory inputs just balance the inhibitory inputs, so that the neuron hovers just on the verge of firing. \n\n1.5 Information rates for noisy and noiseless signals \n\nHere we compare the information rate for an IF neuron at the \"balance point\" μτ = θ with the maximum entropy spike train. For simplicity and brevity we consider only the zero-noise case, i.e. n(t) = 0. Fig. 1A shows the information per spike as a function of the firing rate calculated from eq. (7), which was varied by changing the signal variance σ². We assume that spikes can be resolved with a temporal resolution of 1 msec, i.e. that the ISI distribution has bins 1 msec wide. 
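As a sanity check, eq. (7) can be evaluated and binned in exactly this way. The following sketch (parameter values are illustrative choices of ours, not those used for the figures) confirms numerically that the binned probabilities sum to approximately 1 and yields an entropy per spike at 1-msec resolution:

```python
from math import exp, sqrt, pi, log2

def p_isi(T, mu, sigma, tau):
    """Balance-point ISI density of eq. (7), where the threshold is theta = mu*tau."""
    g = exp(2.0 * T / tau) - 1.0
    coef = (mu * tau) / (sqrt(2.0 * pi) * sigma) * (tau / 2.0) ** -1.5
    return coef * g ** -1.5 * exp(2.0 * T / tau - (mu * tau) ** 2 / (sigma ** 2 * tau * g))

tau, mu, sigma = 0.05, 200.0, 40.0   # tau = 50 msec; mu*tau = theta = 10 (illustrative)
dt = 0.001                           # 1 msec bins
probs = [p_isi((i + 0.5) * dt, mu, sigma, tau) * dt for i in range(5000)]
total = sum(probs)                                 # should be close to 1
H = -sum(p * log2(p) for p in probs if p > 0)      # entropy per spike, in bits
print(total, H)
```

That the probabilities sum to 1 checks that eq. (7) is a properly normalized density; the entropy H is the quantity plotted (as a function of rate) in Fig. 1A.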
The dashed line shows the theoretical upper bound given by the exponential distribution; this limit can be approached by a neuron operating far below threshold, in the Poisson limit. For both the IF model and the upper bound, the information per spike is a monotonically decreasing function of the spike rate; the model almost achieves the upper bound when the mean ISI is just equal to the membrane time constant. In the model the information saturates at very low firing rates, but for the exponential distribution the information increases without bound. At high firing rates the information goes to zero when the firing rate is too fast for individual ISIs to be resolved at the temporal resolution. Fig. 1B shows that the information rate (information per second) when the neuron is at the balance point goes through a maximum as the firing rate increases. The maximum occurs at a lower firing rate than for the exponential distribution (dashed line). \n\n1.6 Bounding information rates by stimulus reconstruction \n\nBy construction, eq. (3) gives an exact expression for the information rate in this model. We can therefore compare it with the lower bound provided by the stimulus reconstruction method, eq. (2) (Bialek et al., 1991). That is, we can assess how tight a lower bound it provides. Fig. 2 shows the lower bound provided by the reconstruction (solid line) and the reliability (dashed line) methods as a function of the firing rate. The firing rate was increased by increasing the mean μ of the input stimulus y(t), and noise was set to 0. At low firing rates the two estimates are nearly identical, but at high firing rates the reconstruction method substantially underestimates the information rate. The amount of the underestimate depends on the model parameters, and decreases as noise is added to the stimulus. 
The tightness of the bound is therefore an empirical question. While Bialek and colleagues (Rieke et al., 1996) show that under the conditions of their experiments the underestimate is less than a factor of two, it is clear that the potential for underestimate under different conditions or in different systems is greater. \n\n2 Discussion \n\nWhile it is generally agreed that spike trains encode information about a neuron's inputs, it is not clear how that information is encoded. One idea is that it is the mean firing rate alone that encodes the signal, and that variability about this mean is effectively noise. An alternative view is that it is the variability itself that encodes the signal, i.e. that the information is encoded in the precise times at which spikes occur. In this view the information can be expressed in terms of the interspike interval (ISI) distribution of the spike train. This encoding scheme yields much higher information rates than one in which only the mean rate (over some interval longer than the typical ISI) is considered. Here we have quantified the information content of spike trains under the latter hypothesis for a simple neuronal model. \n\nWe consider a model in which by construction the ISIs are independent, so that the information rate (in bits/sec) can be computed directly from the information per spike (in bits/spike) and the spike rate (in spikes/sec). The information per spike in turn depends on the temporal precision with which spikes can be resolved (if precision were infinite, then the information content would be infinite as well, since any message could for example be encoded in the decimal expansion of the precise arrival time of a single spike), the reliability of the spike transduction mechanism, and the entropy of the ISI distribution itself. 
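The dependence on temporal precision is easy to verify for the maximum-entropy exponential distribution of section 1.3. A small check (function name is ours):

```python
from math import e, log2

def exp_entropy_per_spike(rate_hz, dt_sec):
    """Entropy/spike (bits) of an exponential ISI distribution with mean rate
    rate_hz, discretized at resolution dt_sec: H = log2(e / (rate * dt))."""
    return log2(e / (rate_hz * dt_sec))

h = exp_entropy_per_spike(1.0, 0.001)
print(round(h, 1))                              # -> 11.4 (bits/spike at 1 Hz, 1 msec bins)
print(exp_entropy_per_spike(1.0, 0.0005) - h)   # halving the bin adds 1 bit/spike
print(h - exp_entropy_per_spike(1.0, 0.002))    # doubling the bin removes 1 bit/spike
```

This reproduces the arithmetic of section 1.3: finite timing precision caps the information per spike, and each halving of the bin size buys exactly one more bit.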
For low firing rates, when the neuron is in the subthreshold limit, the ISI distribution is close to the theoretically maximal exponential distribution. \n\nMuch of the recent interest in information theoretic analyses of the neural code can be attributed to the seminal work of Bialek and colleagues (Bialek et al., 1991; Rieke et al., 1996), who measured the information rate for sensory neurons in a number of systems. The present results are in broad agreement with those of DeWeese (1996), who considered the information rate of a linear-filtered threshold crossing (LFTC) model [1]. DeWeese developed a functional expansion, in which the first term describes the limit in which spike times (not ISIs) are independent, and the second term is a correction for correlations. The LFTC model differs from the present IF model mainly in that it does not \"reset\" after each spike. Consequently the \"natural\" representation of the spike train in the LFTC model is as a sequence t_0 ... t_n of firing times, while in the IF model the \"natural\" representation is as a sequence T_1 ... T_n of ISIs. The choice is one of convenience, since the two representations are equivalent. \n\n[1] In the LFTC model, Gaussian signal and noise are convolved with a linear filter; the times at which the resulting waveform crosses some threshold are called \"spikes\". \n\nThe two models are complementary. In the LFTC model, results can be obtained for colored signals and noise, while such conditions are awkward in the IF model. In the IF model by contrast, a class of highly correlated spike trains can be conveniently considered that are awkward in the LFTC model. That is, the independent-ISI condition required in the IF model is less restrictive than the independent-spike condition of the LFTC model: spikes are independent iff ISIs are independent and the ISI distribution p(T) is exponential. 
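The equivalence of the two representations amounts to a first difference and a cumulative sum. In sketch form (helper names are ours; times in integer msec for an exact round trip):

```python
from itertools import accumulate

def times_to_isis(spike_times):
    """First difference: firing times t_0 ... t_n -> ISIs T_1 ... T_n."""
    return [b - a for a, b in zip(spike_times, spike_times[1:])]

def isis_to_times(t0, isis):
    """Cumulative sum: first spike time t_0 plus ISIs -> firing times."""
    return list(accumulate(isis, initial=t0))

times = [10, 35, 47, 102]            # spike times in msec
isis = times_to_isis(times)          # -> [25, 12, 55]
assert isis_to_times(times[0], isis) == times  # round trip recovers the train
```

Only the first spike time is lost in going to ISIs, which is why the choice between the two representations is purely one of convenience.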
In particular, at high firing rates the ISI distribution can be far from exponential (and therefore the spikes far from independent) even when the ISIs themselves are independent. \n\nBecause we have assumed that the input s(t) is white, its entropy is infinite, and the mutual information can grow without bound as the temporal precision with which spikes are resolved improves. Nevertheless, the spike train is transmitting only a minute fraction of the total available information. The signal thereby saturates the capacity of the spike train. While it is not at all clear whether this is how real neurons actually behave, it is not implausible: a typical cortical neuron receives as many as 10^4 synaptic inputs, and if the information rate of each input is the same as the target's, then the information rate impinging upon the target is 10^4-fold greater (neglecting synaptic unreliability, which could decrease this substantially) than its capacity. \n\nIn a preliminary series of experiments, we have used the reliability method to estimate the information rate of hippocampal neuronal spike trains in slice in response to somatic current injection (Stevens and Zador, unpublished). Under these conditions ISIs appear to be independent, so the method developed here can be applied. In these pilot experiments, information rates as high as 6.3 bits/spike were observed. \n\nReferences \n\nBialek, W., Rieke, F., de Ruyter van Steveninck, R., and Warland, D. (1991). Reading a neural code. Science, 252:1854-1857. \n\nDeWeese, M. (1996). Optimization principles for the neural code. In Hasselmo, M., editor, Advances in Neural Information Processing Systems, vol. 8. MIT Press, Cambridge, MA. \n\nPapoulis, A. (1984). Probability, Random Variables and Stochastic Processes, 2nd edition. McGraw-Hill. \n\nRieke, F., Warland, D., de Ruyter van Steveninck, R., and Bialek, W. (1996). Neural Coding. MIT Press. 
\n\nSoftky, W. and Koch, C. (1993). The highly irregular firing of cortical cells is inconsistent with temporal integration of random EPSPs. J. Neuroscience, 13:334-350. \n\nSugiyama, H., Moore, G., and Perkel, D. (1970). Solutions for a stochastic model of neuronal spike production. Mathematical Biosciences, 8:323-341. \n\nTuckwell, H. (1988). Introduction to Theoretical Neurobiology (2 vols.). Cambridge. \n\nFigure 1: Information rate at balance point. (A; top) The information per spike decreases monotonically with the spike rate (solid line). It is bounded above by the entropy of the exponential limit (dashed line), which is the highest entropy ISI distribution for a given mean rate; this limit is approached for the IF neuron in the subthreshold regime. The information rate goes to 0 when the firing rate is of the same order as the temporal resolution Δt. The information per spike at the balance point is nearly optimal when E{T} ≈ τ. (τ = 50 msec; Δt = 1 msec); (B; bottom) Information per second for the above conditions. The information rate for both the balance point (solid curve) and the exponential distribution (dashed curve) passes through a maximum, but the maximum is greater and occurs at a higher rate for the latter. For firing rates much smaller than 1/τ, the rates are almost indistinguishable. (τ = 50 msec; Δt = 1 msec) \n\nFigure 2: Estimating information by stimulus reconstruction. 
The information rate estimated by the reconstruction method (solid line) and the exact information rate (dashed line) are shown as a function of the firing rate. The reconstruction method significantly underestimates the actual information, particularly at high firing rates. The firing rate was varied through the mean input μ. The parameters were: membrane time constant τ = 20 msec; spike bin size Δt = 1 msec; signal variance σ² = 0.8; threshold θ = 10.", "award": [], "sourceid": 1135, "authors": [{"given_name": "Charles", "family_name": "Stevens", "institution": null}, {"given_name": "Anthony", "family_name": "Zador", "institution": null}]}