{"title": "Rapid Visual Processing using Spike Asynchrony", "book": "Advances in Neural Information Processing Systems", "page_first": 901, "page_last": 907, "abstract": null, "full_text": "Rapid Visual Processing using Spike Asynchrony\n\nSimon J. Thorpe & Jacques Gautrais\n\nCentre de Recherche Cerveau & Cognition\n\nF-31062 Toulouse\n\nFrance\n\nemail thorpe@cerco.ups-tlse.fr\n\nAbstract\n\nWe have investigated the possibility that rapid processing in the visual system could be achieved by using the order of firing in different neurones as a code, rather than more conventional firing rate schemes. Using SPIKENET, a neural net simulator based on integrate-and-fire neurones and in which neurones in the input layer function as analog-to-delay converters, we have modeled the initial stages of visual processing. Initial results are extremely promising. Even with activity in retinal output cells limited to one spike per neuron per image (effectively ruling out any form of rate coding), sophisticated processing based on asynchronous activation was nonetheless possible.\n\n1. INTRODUCTION\nWe recently demonstrated that the human visual system can process previously unseen natural images in under 150 ms (Thorpe et al., 1996). Such data, together with previous studies on processing speeds in the primate visual system (see Thorpe & Imbert, 1989), put severe constraints on models of visual processing. For example, temporal lobe neurones respond selectively to faces only 80-100 ms after stimulus onset (Oram & Perrett, 1992; Rolls & Tovee, 1994). To reach the temporal lobe in this time, information from the retina has to pass through roughly ten processing stages (see Fig. 1). 
If one takes into account the surprisingly slow conduction velocities of intracortical axons (< 1 m.s-1, see Nowak & Bullier, 1997), it appears that the computation time within any cortical stage will be as little as 5-10 ms. Given that most cortical neurones will be firing below 100 spikes.s-1, it is difficult to escape the conclusion that processing can be achieved with only one spike per neuron.\n\n[Figure 1 graphic. Stages shown: Retina, LGN, V1, V2, V4, PIT, AIT. Latency labels: 30-50 ms, 40-60 ms, 50-70 ms, 60-80 ms, 70-90 ms, 80-100 ms.]\n\nFigure 1: Approximate latencies for neurones in different stages of the primate visual system (see Thorpe & Imbert, 1989; Nowak & Bullier, 1997).\n\nSuch constraints pose major problems for conventional firing rate codes since at least two spikes are needed to estimate a neuron's instantaneous firing rate. While it is possible to use the number of spikes generated by a population of cells to code analog values, this turns out to be expensive, since to code n analog values, one needs n-1 neurones. Furthermore, the roughly Poisson nature of spike generation would also seriously limit the amount of information that can be transmitted. Even at 100 spikes.s-1, there is a roughly 35% chance that the neuron will generate no spike at all within a particular 10 ms window, again forcing the system to use large numbers of redundant cells. 
\n\nAn alternative is to use information encoded in the temporal pattern of firing produced in response to transient stimuli (Mainen & Sejnowski, 1995). In particular, one can treat neurones not as analog-to-frequency converters (as is normally the case) but rather as analog-to-delay converters (Thorpe, 1990, 1994). The idea is very simple and uses the fact that the time taken for an integrate-and-fire neuron to reach threshold depends on input strength. Thus, in response to an intensity profile, the 6 neurones in Figure 2 will tend to fire in a particular order - the most strongly activated cells firing first. Since each neuron fires one and only one spike, the firing rates of the cells contain no information, but there is information in the order in which the cells fire (see also Hopfield, 1995).\n\n[Figure 2 graphic. Neurones A to F; axis label: INTENSITY.]\n\nFigure 2: An example of spike order coding. Because of the intrinsic properties of neurones, the most strongly activated neurones will fire first. The sequence B>A>F>C>E>D is one of the 720 (i.e. 6!) possible orders in which the 6 neurones can fire, each of which reflects a different intensity profile. Note that such a code can be used to send information very quickly.\n\nTo test the plausibility of using spike order rather than firing rate as a code, we have developed a neural network simulator \"SPIKENET\" and used it to model the initial stages of visual processing. Initial results are very encouraging and demonstrate that sophisticated visual processing can indeed be achieved in a visual system in which only one spike per neuron is available.\n\n2. 
SPIKENET SIMULATIONS\nSPIKENET has been developed in order to simulate the activity of large numbers of integrate-and-fire neurones. The basic neuronal elements are simple, and involve only a limited number of parameters, namely, an activation level, a threshold and a membrane time constant. The basic propagation mechanism involves processing the list of neurones that fired during the previous time step. For each spiking neuron, we add a synaptic weight value to each of its targets, and, if the target neuron's activation level exceeds its threshold, we add it to the list of spiking neurones for the next time step and reset its activation level by subtracting the threshold value. When a target neuron is affected for the first time on any particular time step, its activation level is recalculated to simulate an exponential decay over time. One of the great advantages of this kind of \"event-driven\" simulator is its computational efficiency - even very large networks of neurones can be simulated because no processor time is wasted on inactive neurones.\n\n2.1 ARCHITECTURE\nAs an initial test of the possibility of single spike processing, we simulated the propagation of activity in a visual system architecture with three levels (see Figure 3). Starting from a gray-scale image (180 x 214 pixels) we calculate the levels of activation in two retinal maps, one corresponding to ON-center retinal ganglion cells, the other to OFF-center cells. This essentially corresponds to convolving the image with two Mexican-hat type operators. However, unlike more classic neural net models, these activation levels are not used to determine a continuous output value for each neuron, nor to calculate a firing rate. 
Instead, we treat the cells as analog-to-delay converters and calculate at which time step each retinal unit will fire. Because of their receptive field organization, cells which fire at the shortest latencies will correspond to regions in the image where the local contrast is high. Note, however, that each retinal ganglion cell will fire once and once only. While this is clearly not physiologically realistic (normally, cells firing at short latencies go on to fire further spikes at short intervals), our aim was to see what sort of processing can be achieved in the absence of rate coding.\n\nThe ON- and OFF-center cells each make excitatory connections to a large number of cortical maps in the second level of the network. Each map contains neurones with a different pattern of afferent connections, which results in orientation and spatial frequency selectivity similar to that described for simple-type neurones in striate cortex. In these simulations we used 32 different filters corresponding to 8 different orientations (each separated by 45\u00b0) and four different scales or spatial frequencies. This is functionally equivalent to having one single cortical map (equivalent to area V1) in which each point in visual space corresponds to a hypercolumn containing a complete set of orientation and spatial frequency tuned filters.\n\nUnits in the third layer receive weighted inputs from all the simple units corresponding to a particular region of space with the same orientation preference and thus roughly correspond to complex cells in area V1.\n\n[Figure 3 graphic. Recoverable labels: Image, 180 x 214 pixels; Layer 1, ON- and OFF-center cells (approx. 77 000 units); Layer 3, orientation and spatial frequency tuned complex cells (approx. 205 000 units).]\n\nFigure 3: Architecture used in the present simulations\n\nOne unusual feature of the propagation process used in SPIKENET is that the post-synaptic effect of activating a synapse is not fixed, but depends on how many inputs have already been activated. Thus, the earliest firing cells produce a maximal post-synaptic effect (100%), but those which fire later produce less and less response. Specifically, the sensitivity of the post-synaptic neuron decreases by a fixed percentage each time one of its inputs fires. The phenomenon is somewhat similar to the sorts of activity-dependent synaptic depression described recently by Markram & Tsodyks (1996) and others, but differs in that the depression affects all the inputs to a particular neuron. The net result is to make the post-synaptic cell sensitive to the order in which its inputs are activated.\n\n2.2 SIMULATION RESULTS\nWhen a new image is presented to the network, spikes are generated asynchronously in the ON- and OFF-center cells of the retina in such a way that information about regions of the image with high local contrast (i.e. where there are contours present) is sent to the cortex first. Progressively, neurons in the second layer become more and more excited, and, after a variable number of time steps, the first cells in the second layer will start to reach threshold and fire. Note that, as in the first layer, the earliest firing units will be those for which the pattern of input activation best matches their receptive field structure. 
\n\n[Figure 4 graphic. Columns: 40 ms, 45 ms, 50 ms, 80 ms. Rows: Layer 1, \"ON-center\" cells; Layer 2, simple cells, orientation 45\u00b0; Layer 3, complex cells, orientation 45\u00b0.]\n\nFigure 4: Development of activity in 3 of the maps\n\nFigure 4 illustrates this process for just three maps. The top row shows the location of units in the ON-center retinal map that have fired after various delays. After 40 ms, the main outlines of the figure can be seen, but progressively more details are seen after 45 and then 50 ms. Note that the representation used here uses pixel intensity to code the order in which the cells have fired - bright white spots correspond to places in the image where the cells fired earliest. In the final frame (taken at 80 ms) the vast majority of ON-center cells have already fired and the resulting image is quite similar to a high spatial frequency filtered version of the original image.\n\nThe middle row of images shows activity in one of the second level maps - the one corresponding to medium spatial frequency components oriented at 45\u00b0. Note that in the first time slice (40 ms) very few cells have fired, but that the proportion increases progressively over the next 10 or so milliseconds. However, even at the end of the propagation process, the proportion of cells that have actually fired remains low. Finally, the lowest row shows activity in the corresponding third layer map - again corresponding to contours oriented at 45\u00b0, but this time with less precise position specificity as a result of the grouping process.\n\nFigure 5 plots the total number of spikes occurring per millisecond in each of the three layers during the first 100 ms following the onset of processing. It is perhaps not surprising that the onset of firing occurs later for layers 2 and 3. However, there is a huge amount of overlap in the onset latencies of cells in the three layers, and indeed, it is doubtful whether there would be any systematic differences in mean onset latency between the three layers.\n\n[Figure 5 graphic. Vertical axis: spikes per ms, 0 to 1250; horizontal axis: Time (ms), 0 to 100; curves: On centre cells, Simple cells, Complex cells.]\n\nFigure 5: Amount of activity measured in spikes/ms for the three layers of neurones as a function of time\n\nBut perhaps one of the most striking features of these simulations is the way in which the onset latency of cells can be seen to vary with the stimulus. The small number of cells in each layer which fire early are in fact very special because only the most optimally activated cells will fire at such short latencies. The implications of this effect for visual processing are far reaching because it means that the earliest information arriving at later processing stages will be particularly informative because the cells involved are very unambiguous. Interestingly, such changes in onset latency have been observed experimentally in neurones in area V1 of the awake primate in response to changes in orientation. In these experiments it was shown that shifting the orientation of a grating away from a neuron's preferred orientation could result not only in changes in the firing rate of the cell, but also in increases in onset latency of as much as 20-30 ms (Celebrini, Thorpe, Trotter & Imbert, 1993).\n\n3. CONCLUSIONS\nA number of points can be made on the basis of these results. 
Perhaps the most important is that visual processing can indeed be performed under conditions in which spike frequency coding is effectively ruled out. Clearly, under normal conditions, neurones in the visual system that respond to a visual input will almost invariably generate more than one spike. However, as we have argued previously, processing in the visual system is so fast that most cells will not have time to generate more than one spike before processing in later stages has to be initiated. The present results indicate that the use of temporal order coding may provide a key to understanding this remarkable efficiency.\n\nThe simulations presented here are clearly very limited, but we are currently looking at spike propagation in more complex architectures that include extensive horizontal connections between neurones in a particular layer as well as additional layers of processing. As an example, we have recently developed an application capable of digit recognition. SPIKENET is well suited for such large scale simulations because of the event-driven nature of the propagation process. For instance, the propagation presented here, which involved roughly 700 000 neurones and over 35 million connections, took roughly 15 seconds on a 150 MHz PowerMac, and even faster simulations are possible using parallel processing. With this in view we have developed a version of SPIKENET that uses PVM (Parallel Virtual Machine) to run on a cluster of workstations.\n\nReferences\n\nCelebrini S., Thorpe S., Trotter Y. & Imbert M. (1993). Dynamics of orientation coding in area V1 of the awake primate. Visual Neuroscience, 10, 811-25.\n\nHopfield J. J. (1995). 
Pattern recognition computation using action potential timing for stimulus representation. Nature, 376, 33-36.\n\nMainen Z. F. & Sejnowski T. J. (1995). Reliability of spike timing in neocortical neurons. Science, 268, 1503-6.\n\nMarkram H. & Tsodyks M. (1996). Redistribution of synaptic efficacy between neocortical pyramidal neurons. Nature, 382, 807-810.\n\nNowak L. G. & Bullier J. (1997). The timing of information transfer in the visual system. In Kaas J., Rocklund K. & Peters A. (Eds.), Extrastriate Cortex in Primates (in press). Plenum Press.\n\nOram M. W. & Perrett D. I. (1992). Time course of neural responses discriminating different views of the face and head. Journal of Neurophysiology, 68, 70-84.\n\nRolls E. T. & Tovee M. J. (1994). Processing speed in the cerebral cortex and the neurophysiology of visual masking. Proc R Soc Lond B Biol Sci, 257, 9-15.\n\nThorpe S., Fize D. & Marlot C. (1996). Speed of processing in the human visual system. Nature, 381, 520-522.\n\nThorpe S. J. (1990). Spike arrival times: A highly efficient coding scheme for neural networks. In R. Eckmiller, G. Hartman & G. Hauske (Eds.), Parallel processing in neural systems (pp. 91-94). North-Holland: Elsevier. Reprinted in H. Gutfreund & G. Toulouse (1994), Biology and computation: A physicist's choice. Singapore: World Scientific.\n\nThorpe S. J. & Imbert M. (1989). Biological constraints on connectionist models. In R. Pfeifer, Z. Schreter, F. Fogelman-Soulie & L. Steels (Eds.), Connectionism in Perspective (pp. 63-92). Amsterdam: Elsevier.\n", "award": [], "sourceid": 1305, "authors": [{"given_name": "Simon", "family_name": "Thorpe", "institution": null}, {"given_name": "Jacques", "family_name": "Gautrais", "institution": null}]}