{"title": "What Can a Single Neuron Compute?", "book": "Advances in Neural Information Processing Systems", "page_first": 75, "page_last": 81, "abstract": null, "full_text": "What can a single neuron compute?\n\nBlaise Ag\u00fcera y Arcas,$^1$ Adrienne L. Fairhall,$^2$ and William Bialek$^2$\n\n$^1$Rare Books Library, Princeton University, Princeton, New Jersey 08544\n\n$^2$NEC Research Institute, 4 Independence Way, Princeton, New Jersey 08540\n\nblaisea@princeton.edu {adrienne, bialek}@research.nj.nec.com\n\nAbstract\n\nIn this paper we formulate a description of the computation performed by a neuron as a combination of dimensional reduction and nonlinearity. We implement this description for the Hodgkin-Huxley model, identify the most relevant dimensions and find the nonlinearity. A two dimensional description already captures a significant fraction of the information that spikes carry about dynamic inputs. This description also shows that computation in the Hodgkin-Huxley model is more complex than a simple integrate-and-fire or perceptron model.\n\n1 Introduction\n\nClassical neural network models approximate neurons as devices that sum their inputs and generate a nonzero output if the sum exceeds a threshold. From our current state of knowledge in neurobiology it is easy to criticize these models as oversimplified: where is the complex geometry of neurons, or the many different kinds of ion channel, each with its own intricate multistate kinetics? Indeed, progress at this more microscopic level of description has led us to the point where we can write (almost) exact models for the electrical dynamics of neurons, at least on short time scales. 
These nearly exact models are complicated by any measure, including tens if not hundreds of differential equations to describe the states of different channels in different spatial compartments of the cell. Faced with this detailed microscopic description, we need to answer a question which goes well beyond the biological context: given a continuous dynamical system, what does it compute?\nOur goal in this paper is to make this question about what a neuron computes somewhat more precise, and then to explore what we take to be the simplest example, namely the Hodgkin-Huxley model [1], [2] (and references therein).\n\n2 What do we mean by the question?\n\nReal neurons take as inputs signals at their synapses and give as outputs sequences of discrete, identical pulses: action potentials or 'spikes'. The inputs themselves are spikes from other neurons, so the neuron is a device which takes $N \\sim 10^3$ pulse trains as inputs and generates one pulse train as output. If the system operates at 2 msec resolution and the window of relevant inputs is 20 msec, then we can think of a single neuron as having an input described by a $\\sim 10^4$ bit word (the presence or absence of a spike in each 2 msec bin for each presynaptic cell) which is then mapped to a one (spike) or zero (no spike). More realistically, if the average spike rates are $\\sim 10$ sec$^{-1}$, the input words can be compressed by a factor of ten. Thus we might be able to think about neurons as evaluating a Boolean function of roughly 1000 Boolean variables, and then characterizing the computational function of the cell amounts to specifying this Boolean function. 
\nThe above estimate, though crude, makes clear that there will be no direct empirical attack on the question of what a neuron computes: there are too many possibilities to learn the function by brute force from any reasonable set of experiments. Progress requires the hypothesis that the function computed by a neuron is not arbitrary, but belongs to a simple class. Our suggestion is that this simple class involves functions that vary only over a low dimensional subspace of the inputs, and in fact we will start by searching for linear subspaces.\nSpecifically, we begin by simplifying away the spatial structure of neurons and take inputs to be just injected currents into a point-like neuron. While this misses some of the richness in real cells, it allows us to focus on developing our computational methods. Further, it turns out that even this simple problem is not at all trivial. If the input is an injected current, then the neuron maps the history of this current, $I(t < t_0)$, into the presence or absence of a spike at time $t_0$. More generally we might imagine that the cell (or our description) is noisy, so that there is a probability of spiking $P[\\mathrm{spike}@t_0 \\mid I(t < t_0)]$ which depends on the current history. We emphasize that the dependence on the history of the current means that there still are many dimensions to the input signal even though we have collapsed any spatial variations. If we work at time resolution $\\Delta t$ and assume that currents in a window of size $T$ are relevant to the decision to spike, then the inputs live in a space of $D = T/\\Delta t$ dimensions, of order 100 in many interesting cases.\nIf the neuron is sensitive only to a low dimensional linear subspace, we can define a set of signals $s_1, s_2, \\cdots, s_K$ by filtering the current,\n\n$$s_\\mu = \\int_0^\\infty dt\\, f_\\mu(t)\\, I(t_0 - t), \\qquad (1)$$\n\nso that the probability of spiking depends only on this finite set of signals,\n\n$$P[\\mathrm{spike}@t_0 \\mid I(t < t_0)] = P[\\mathrm{spike}@t_0]\\, g(s_1, s_2, \\cdots, s_K), \\qquad (2)$$\n\nwhere we include the average probability of spiking so that $g$ is dimensionless. If we think of the current $I(t < t_0)$ as a vector, with one dimension for each time sample, then these filtered signals are linear projections of this vector.\nIn this formulation, characterizing the computation done by a neuron means estimating the number of relevant stimulus dimensions ($K$, hopefully much less than $D$), identifying the filters which project into this relevant subspace,$^1$ and then characterizing the nonlinear function $g(\\vec{s})$. The classical perceptron-like cell of neural network theory has only one relevant dimension and a simple form for $g$.\n\n3 Identifying low-dimensional structure\n\nThe idea that neurons might be sensitive only to low-dimensional projections of their inputs was developed explicitly in work on a motion sensitive neuron of the fly visual system [3]. Rather than looking at the distribution $P[\\mathrm{spike}@t_0 \\mid s(t < t_0)]$, with $s(t)$ the input signal (velocity of motion across the visual field in [3]), that work considered the distribution of signals conditional on the response, $P[s(t < t_0) \\mid \\mathrm{spike}@t_0]$; these are related by Bayes' rule,\n\n$$\\frac{P[\\mathrm{spike}@t_0 \\mid s(t < t_0)]}{P[\\mathrm{spike}@t_0]} = \\frac{P[s(t < t_0) \\mid \\mathrm{spike}@t_0]}{P[s(t < t_0)]}. \\qquad (3)$$\n\n$^1$Note that the individual filters don't really have any meaning; what is meaningful is the projection operator that is formed by the whole set of these filters. Put another way, the individual filters specify both a $K$-dimensional subspace and a coordinate system on this subspace, but there is no reason to prefer one coordinate system over another. 
\n\nWithin the response conditional ensemble $P[s(t < t_0) \\mid \\mathrm{spike}@t_0]$ we can compute various moments. Thus the spike triggered average stimulus, or reverse correlation function [4], is the first moment\n\n$$\\mathrm{STA}(\\tau) = \\int [ds]\\, P[s(t < t_0) \\mid \\mathrm{spike}@t_0]\\, s(t_0 - \\tau). \\qquad (4)$$\n\nWe can also compute the covariance matrix of fluctuations around this average,\n\n$$C_{\\mathrm{spike}}(\\tau, \\tau') = \\int [ds]\\, P[s(t < t_0) \\mid \\mathrm{spike}@t_0]\\, s(t_0 - \\tau)\\, s(t_0 - \\tau') - \\mathrm{STA}(\\tau)\\, \\mathrm{STA}(\\tau'). \\qquad (5)$$\n\nIn the same way that we compare the spike triggered average to some constant average level of the signal (which we can define to be zero) in the whole experiment, we want to compare the covariance matrix $C_{\\mathrm{spike}}$ with the covariance of the signal averaged over the whole experiment,\n\n$$C_{\\mathrm{prior}}(\\tau, \\tau') = \\int [ds]\\, P[s(t < t_0)]\\, s(t_0 - \\tau)\\, s(t_0 - \\tau'). \\qquad (6)$$\n\nNotice that all of these covariance matrices are $D \\times D$ in size. The surprising finding of [3] was that the change in the covariance matrix, $\\Delta C = C_{\\mathrm{spike}} - C_{\\mathrm{prior}}$, had only a very small number of nonzero eigenvalues. In fact it can be shown that if the probability of spiking depends on $K$ linear projections of the stimulus as in Eq. (2), and if the inputs $s(t)$ are chosen from a Gaussian distribution, then the rank of the matrix $\\Delta C$ is exactly $K$. Further, the eigenvectors associated with nonzero eigenvalues span the relevant subspace (up to a rotation associated with the autocorrelations in the inputs). Thus eigenvalue analysis of the spike triggered covariance matrix gives us a direct way to search for a low dimensional linear subspace that captures the relevant stimulus features.\n\n4 The Hodgkin-Huxley model\n\nWe recall the details of the Hodgkin-Huxley model and note some special features that guide our analysis. 
Hodgkin and Huxley [1] modeled the dynamics of the current through a patch of membrane by flow through ion-specific conductances:\n\n$$I(t) = C \\frac{dV}{dt} + g_K n^4 (V - V_K) + g_{Na} m^3 h (V - V_{Na}) + g_l (V - V_l), \\qquad (7)$$\n\nwhere K and Na subscripts denote potassium- and sodium-related variables, respectively, and $l$ (for 'leakage') terms are a catch-all for other ion species with slower dynamics. $C$ is the membrane capacitance. The subscripted voltages $V_l$ and $V_{Na}$ are ion-specific reversal potentials. $g_l$, $g_K$ and $g_{Na}$ are empirically determined maximal conductances for the different ions,$^2$ and the gating variables $n$, $m$ and $h$ (on the interval [0,1]) have their own voltage dependent dynamics:\n\n$$\\begin{aligned} dn/dt &= (0.01V + 0.1)(1 - n) \\exp(-0.1V) - 0.125\\, n \\exp(V/80) \\\\ dm/dt &= (0.1V + 2.5)(1 - m) \\exp(-0.1V - 1.5) - 4m \\exp(V/18) \\\\ dh/dt &= 0.07(1 - h) \\exp(0.05V) - h \\exp(-0.1V - 4), \\end{aligned} \\qquad (8)$$\n\nwith $V$ in mV and $t$ in msec.\nHere we are interested in dynamic inputs $I(t)$, but it is important to remember that for constant inputs the Hodgkin-Huxley model undergoes a Hopf bifurcation to spike at a constant frequency; further, this frequency is rather insensitive to the precise value of the input above onset. This 'rigidity' of the system is felt also in many regimes of dynamic stimulation, and can be thought of as a strong interaction among successive spikes. These interactions lead to long memory times, reflecting the infinite phase memory of the periodic orbit which exists for constant input.\n\n$^2$We have used the original parameters, with a sign change for voltages: $C = 1\\ \\mu$F/cm$^2$, $g_K = 36$ mS/cm$^2$, $g_{Na} = 120$ mS/cm$^2$, $g_l = 0.3$ mS/cm$^2$, $V_K = -12$ mV, $V_{Na} = +115$ mV, $V_l = +10.613$ mV. We have taken our system to be a $\\pi \\times 30^2\\ \\mu$m$^2$ patch of membrane.\n
\nWhile spike interactions are interesting, we want to focus on the way that input current modulates the probability of spiking. To separate these effects we consider only 'isolated' spikes. These are defined by accumulating the interspike interval distribution and noticing that for some intervals $t > t_c$ the distribution decays exponentially, which means that the system has lost memory of the previous spike; thus spikes which are more than $t_c$ after the previous spike are isolated.\nIn what follows we consider the response of the Hodgkin-Huxley model to currents $I(t)$ with zero mean, 0.275 nA standard deviation, and 0.5 msec correlation time.\n\n5 How many dimensions?\n\nFig. 1 shows the change in covariance matrix $\\Delta C(\\tau, \\tau')$ for isolated spikes in our HH simulation, and fig. 2(a) shows the resulting spectrum of eigenvalues as a function of sample size. The result strongly suggests that there are many fewer than $D$ relevant dimensions. In particular, there seem to be two outstanding modes; the STA itself lies largely in the subspace of these modes, as shown in fig. 2(b).\n\nFigure 1: The isolated spike triggered covariance matrix $\\Delta C(\\tau, \\tau')$.\n\nThe filters themselves, shown in fig. 3, have simple forms; in particular the second mode is almost exactly the derivative of the first. If the neuron filtered its inputs and generated a spike when the output of the filter crosses threshold, we would find that there are two significant dimensions, corresponding to the filter and its derivative. It is tempting to suggest, then, that this is a good approximation to the HH model, but we will see that this is not correct. Notice also that both filters have significant differentiating components: the cell is not simply integrating its inputs.\nAlthough fig. 
2(a) suggests that two modes dominate, it also demonstrates that the smaller nonzero eigenvalues of the other modes are not just noise. The width of any spectral band of eigenvalues near zero due to finite sampling should decline with increasing sample size. However, the smaller eigenvalues seen in fig. 2(a) are stable. Thus while the system is primarily sensitive to two dimensions, there is something missing in this picture. To quantify this, we must first characterize the nonlinear function $g(s_1, s_2)$.\n\nFigure 2: (a) Convergence of the largest 32 eigenvalues of the isolated spike triggered covariance with increasing sample size (horizontal axis: number of spikes accumulated). (b) Projections of the isolated STA onto the covariance modes.\n\nFigure 3: Most significant two modes of the spike-triggered covariance (legend: eigenmodes 1 and 2; normalized derivative of mode 1).\n\n6 Nonlinearity and information\n\nAt each instant of time we can find the relevant projections of the stimulus $s_1$ and $s_2$. By construction, the distribution of these signals over the whole experiment, $P(s_1, s_2)$, is Gaussian. On the other hand, each time we see a spike we get a sample from the distribution $P(s_1, s_2 \\mid \\mathrm{spike}@t_0)$, leading to the picture in fig. 4. The prior and spike conditional distributions clearly are better separated in two dimensions than in one, which means that our two dimensional description captures more than the spike triggered average. Further, we see that the spike conditional distribution is curved, unlike what we would expect for a simple thresholding device.\nCombining Eqs. 
(2) and (3), we have\n\n$$g(s_1, s_2) = \\frac{P(s_1, s_2 \\mid \\mathrm{spike}@t_0)}{P(s_1, s_2)}, \\qquad (9)$$\n\nso that these two distributions determine the input/output relation of the neuron in this 2D space. We emphasize that although the subspace is linear, $g$ can have arbitrary nonlinearity. Fig. 4 shows that this input/output relation has sharp edges, but also some fuzziness. The HH model is deterministic, so in principle the input/output relation should be a $\\delta$ function: spikes occur only when certain exact conditions are met. Of course we have blurred things a bit by working at finite time resolution. Given that we work at finite $\\Delta t$, spikes carry only a finite amount of information, and the quality of our 2D approximation can be judged by asking how much of this information is captured by this description.\n\nFigure 4: $10^4$ spike-conditional stimuli projected along the first 2 covariance modes (axes: $s_1$ and $s_2$, in standard deviations). The circles represent the cumulative radial integral of the prior distribution from $\\infty$; the ring marked $10^{-4}$, for example, encloses $1 - 10^{-4}$ of the prior.\n\nAs explained in [5], the arrival time of a single spike provides an information\n\n$$I_{\\mathrm{one\\ spike}} = \\left\\langle \\frac{r(t)}{\\bar{r}} \\log_2 \\left[ \\frac{r(t)}{\\bar{r}} \\right] \\right\\rangle, \\qquad (10)$$\n\nwhere $r(t)$ is the time dependent spike rate, $\\bar{r}$ is the average spike rate, and $\\langle \\cdots \\rangle$ denotes an average over time. With a deterministic model like HH, the rate $r(t)$ either is zero or corresponds to one spike occurring in one bin of size $\\Delta t$, that is $r = 1/\\Delta t$. The result is that $I_{\\mathrm{one\\ spike}} = -\\log_2(\\bar{r} \\Delta t)$. 
\nOn the other hand, if the probability of spiking really depends only on the stimulus dimensions $s_1$ and $s_2$, we can substitute\n\n$$\\frac{r(t)}{\\bar{r}} \\rightarrow \\frac{P(s_1, s_2 \\mid \\mathrm{spike}@t)}{P(s_1, s_2)}, \\qquad (11)$$\n\nand use the ergodicity of the stimulus to replace time averages in Eq. (10). Then we find [3, 5]\n\n$$I^{s_1, s_2}_{\\mathrm{one\\ spike}} = \\int ds_1\\, ds_2\\, P(s_1, s_2 \\mid \\mathrm{spike}@t) \\log_2 \\left[ \\frac{P(s_1, s_2 \\mid \\mathrm{spike}@t)}{P(s_1, s_2)} \\right]. \\qquad (12)$$\n\nIf our two dimensional approximation were exact we would find $I^{s_1, s_2}_{\\mathrm{one\\ spike}} = I_{\\mathrm{one\\ spike}}$; more generally we will find $I^{s_1, s_2}_{\\mathrm{one\\ spike}} \\leq I_{\\mathrm{one\\ spike}}$, and the fraction of the information we capture measures the quality of the approximation. This fraction is plotted in fig. 5 as a function of time resolution. For comparison, we also show the information captured by considering only the stimulus projection along the STA.\n\nFigure 5: Fraction of spike timing information captured by STA (lower curve) and projection onto covariance modes 1 and 2 (upper curve), as a function of time discretization (msec).\n\n7 Discussion\n\nThe simple, low-dimensional model described captures a substantial amount of information about spike timing for an HH neuron. The fraction is maximal near $\\Delta t = 5.5$ msec, reaching nearly 70%. However, the absolute information captured saturates for both the 1D and 2D cases, at $\\approx 3.5$ and 5 bits respectively, for smaller $\\Delta t$. Hence the information fraction captured plummets; recovering precise spike timing requires a more complex, higher dimensional representation of the stimulus.\nIs this effect important, or is timing at this resolution too noisy for this extra complexity to matter in a real neuron? Stochastic HH simulations have suggested that, when realistic noise sources are taken into account, the timing of spikes in response to dynamic stimuli is reproducible to within 1-2 msec [6]. 
This suggests that such timing details may indeed be important.\nEven in 2D, one can observe that the spike conditional distribution is curved (fig. 4); it is likely to curve along other dimensions as well. It may be possible to improve our approximation by considering the computation to take place on a low-dimensional but curved manifold, instead of a linear subspace. The curvature in fig. 4 also implies that the computation in the HH model is not well approximated by an integrate-and-fire model, or a perceptron model limited to linear separations.\nCharacterizing the complexity of the computation is an important step toward understanding neural systems. How to quantify this complexity theoretically is an area for future work; here, we have made progress toward this goal by describing such computations in a compact way and then evaluating the completeness of the description using information. The techniques presented are applicable to more complex models, and of course to real neurons. How does the addition of more channels increase the complexity of the computation? Will this add more relevant dimensions or does the non-linearity change?\n\nReferences\n\n[1] A. Hodgkin and A. Huxley. J. Physiol., 117, 1952.\n[2] C. Koch. Biophysics of computation. New York: Oxford University Press, 1999.\n[3] W. Bialek and R. de Ruyter van Steveninck. Proc. R. Soc. Lond. B, 234, 1988.\n[4] F. Rieke, D. Warland, R. de Ruyter van Steveninck and W. Bialek. Spikes: exploring the neural code. Cambridge, MA: MIT Press, 1997.\n[5] N. Brenner, S. Strong, R. Koberle, W. Bialek and R. de Ruyter van Steveninck. Neural Comp., 12, 2000.\n[6] E. Schneidman, R. Freedman and I. Segev. Neural Comp., 10, 1998. 
", "award": [], "sourceid": 1867, "authors": [{"given_name": "Blaise", "family_name": "Ag\u00fcera y Arcas", "institution": null}, {"given_name": "Adrienne", "family_name": "Fairhall", "institution": null}, {"given_name": "William", "family_name": "Bialek", "institution": null}]}