{"title": "Acoustic-Imaging Computations by Echolocating Bats: Unification of Diversely-Represented Stimulus Features into Whole Images", "book": "Advances in Neural Information Processing Systems", "page_first": 2, "page_last": 9, "abstract": null, "full_text": "2 \n\nSimmons \n\nAcoustic-Imaging Computations by Echolocating Bats: Unification of Diversely-Represented Stimulus Features into Whole Images.\n\nJames A. Simmons\n\nDepartment of Psychology and Section of Neurobiology, Division of Biology and Medicine\n\nBrown University, Providence, RI 02912.\n\nABSTRACT\n\nThe echolocating bat, Eptesicus fuscus, perceives the distance to sonar targets from the delay of echoes and the shape of targets from the spectrum of echoes. However, shape is perceived in terms of the target's range profile. The time separation of echo components from parts of the target located at different distances is reconstructed from the echo spectrum and added to the estimate of absolute delay already derived from the arrival-time of echoes. The bat thus perceives the distance to targets and depth within targets along the same psychological range dimension, which is computed. The image corresponds to the crosscorrelation function of echoes. Fusion of physiologically distinct time- and frequency-domain representations into a final, common time-domain image illustrates the binding of within-modality features into a unified, whole image. To support the structure of images along the dimension of range, bats can perceive echo delay with a hyperacuity of 10 nanoseconds.\n\nTHE SONAR OF BATS\n\nBats are flying mammals, whose lives are largely nocturnal. 
They have evolved the capacity to orient in darkness using a biological sonar called echolocation, which they use to avoid obstacles to flight and to detect, identify, and track flying insects for interception (Griffin, 1958). Echolocating bats emit brief, mostly ultrasonic sonar sounds and perceive objects from echoes that return to their ears. The bat's auditory system acts as the sonar receiver, processing echoes to reconstruct images of the objects themselves. Many bats emit frequency-modulated (FM) signals; the big brown bat, Eptesicus fuscus, transmits sounds with durations of several milliseconds containing frequencies from about 20 to 100 kHz arranged in two or three harmonic sweeps (Fig. 1). The images that Eptesicus ultimately perceives retain crucial features of the original sonar waveforms, thus revealing how echoes are processed to reconstruct a display of the object itself. Several important general aspects of perception are embodied in specific echo-processing operations in the bat's sonar. By recognizing constraints imposed when echoes are encoded in terms of neural activity in the bat's auditory system, recent experiments have identified a novel use of time- and frequency-domain techniques as the basis for acoustic imaging in FM echolocation. The intrinsically reciprocal properties of time- and frequency-domain representations are exploited in the neural algorithms which the bat uses to unify disparate features into whole images.\n\nFigure 1: Spectrogram of a sonar sound emitted by the big brown bat, Eptesicus fuscus (Simmons, 1989). [Axes: frequency (kHz), 0 to 100; time scale bar, 1 msec.] 
\n\nIMAGES OF SINGLE-GLINT TARGETS\n\nA simple sonar target consists of a single reflecting point, or glint, located at a discrete range and reflecting a single replica of the incident sonar signal. A complex target consists of several glints at slightly different ranges. It thus reflects compound echoes composed of individual replicas of the incident sound arriving at slightly different delays. To determine the distance to a target, or target range, echolocating bats estimate the delay of echoes (Simmons, 1989). The bat's image of a single-glint target is constructed around its estimate of echo delay, and the shape of the image can be measured behaviorally. The performance of bats trained to discriminate between echoes that jitter in delay and echoes that are stationary in delay yields a graph of the image itself (Altes, 1989), together with an indication of the accuracy of the delay estimate that underlies it (Simmons, 1979; Simmons, Ferragamo, Moss, Stevenson, & Altes, in press). Fig. 2 shows the image of a single-glint target perceived by Eptesicus, expressed in terms of echo delay (58 \u00b5sec/cm of range). From the bat's jitter discrimination performance, the target is perceived at its true range. Also, the image has a fine structure consisting of a central peak corresponding to the location of the target and two prominent side-peaks as ghost images located about 35 \u00b5sec or 0.6 cm nearer and farther than the main peak. This image fine structure reflects the composition of the waveform of the echoes themselves; it approximates the crosscorrelation function of echoes (Fig. 2).\n\nFigure 2: Graphs showing the bat's image of a single-glint target from jitter discrimination experiments (left) for comparison with the crosscorrelation function of echoes (right). The zero point on each time axis corresponds to the objective arrival-time of the echoes (about 3 msec in this experiment; Simmons, Ferragamo, et al., in press). [Panels: Jitter Performance; Crosscorrelation Function. Horizontal axes: time (microseconds), -50 to 50.]\n\nThe discovery that the bat perceives an image corresponding to the crosscorrelation function of echoes provides a view of the hidden machinery of the bat's sonar receiver. The bat's estimate of echo delay evidently is based upon a capacity of the auditory system to represent virtually all of the information available in echo waveforms that is relevant to determining delay, including the phase of echoes relative to emissions (Simmons, Ferragamo, et al., in press). The bat's initial auditory representation of these FM signals resembles spectrograms that consist of neural impulses marking the time-of-occurrence of successive frequencies in the FM sweeps of the sounds (Fig. 3). Each nerve impulse travels in a \"channel\" that is tuned to a particular excitatory frequency (Bodenhamer & Pollak, 1981) as a consequence of the frequency analyzing properties of the cochlea. The cochlear filters are followed by rectification and low-pass filtering, so in a conventional sense the phase of the filtered signals is destroyed in the course of forming the spectrograms. However, Fig. 2 shows that the bat is able to reconstruct the crosscorrelation function of echoes from its spectrogram-like auditory representation. The individual neural \"points\" in the spectrogram signify instantaneous frequency, and the recovery of the fine structure in the image may exploit properties of instantaneous frequency when the images are assembled by integrating numerous separate delay measurements across different frequencies. The fact that the crosscorrelation function emerges from these neural computations is provocative from theoretical and technological viewpoints--the bat appears to employ novel real-time algorithms that can transform echoes into spectrograms and then into the sonar ambiguity function itself.\n\nFigure 3: Neural spectrograms representing a sonar emission (left) and an echo from a target located about 1 m away (right). The individual dots are neural impulses conveying the instantaneous frequency of the FM sweeps (see Fig. 1). The 6-msec time separation of the two spectrograms indicates target range in the bat's sonar receiver (Simmons & Kick, 1984). [Axes: frequency (kHz), 15 to 150; time (msec), 0 to 10.]\n\nThe range-axis image of a single-glint target has a fine structure surrounding a central peak that constitutes the bat's estimate of echo delay (Fig. 2). 
The width of this peak corresponds to the limiting accuracy of the bat's delay estimate, allowing for the ambiguity represented by the side-peaks located about 35 \u00b5sec away. In Fig. 2, the data-points are spaced 5 \u00b5sec apart along the time axis (approximately the Nyquist sampling interval for the bat's signals), and the true width of the central peak is poorly shown. Fig. 4 shows the performance of three Eptesicus in an experiment to measure this width with smaller delay steps. The bats can detect a shift of as little as 10 nsec as a hyperacuity (Altes, 1989) for echo delay in the jitter task. In estimating echo delay, the bat must integrate spectrogram delay estimates across separate frequencies in the FM sweeps of emissions and echoes (see Fig. 3), and it arrives at a very accurate composite estimate indeed. Timing accuracy in the nanosecond range is a previously unsuspected capability of the nervous system, and it is likely that more complex algorithms than just integration of information across frequencies lie behind this fine acuity (see below on amplitude-latency trading and perceived delay).\n\nFigure 4: A graph of the performance of Eptesicus discriminating echo-delay jitters that change in small steps. The bats' limiting acuity is about 10 nsec for 75% correct responses (Simmons, Ferragamo, et al., in press). [Legend: delay line (Bats #1, #3, #5); cable (Bats #3, #5). Horizontal axis: time (nanoseconds), 0 to 60; vertical axis ticks, 40 to 100.] 
\n\nIMAGES OF TWO-GLINT TARGETS\n\nComplex targets such as airborne insects reflect echoes composed of several replicas of the incident sound separated by short intervals of time (Simmons & Chen, 1989). For insect-sized targets, with dimensions of a few centimeters, this time separation of echo components is unlikely to exceed 100 to 150 \u00b5sec. Because the bat's signals are several milliseconds long, the echoes from complex targets thus will contain echo components that largely overlap. The auditory system of Eptesicus has an integration-time of about 350 \u00b5sec for reception of sonar echoes (Simmons, Freedman, et al., 1989). Two echo components that arrive together within this integration-time will merge together into a single compound echo having an arrival-time as a whole that indicates the delay of the first echo component, and having a series of notches in its spectrum that indicates the time separation of the first and second components. In the bat's auditory representation, echo delay corresponds to the time separation of the emission and echo spectrograms (see Fig. 3), while the notches in the compound echo spectrum appear as \"holes\" in the spectrogram--that is, as frequencies that fail to appear in echoes. The location and spacing of these notches or holes in frequency is related to the separation of the two echo components in time. The crucial point is that the constraint imposed by the 350-\u00b5sec integration-time for echo reception disperses the information required to reconstruct the detailed range structure of the complex target into both the time and the frequency dimensions of the neural spectrograms. 
\n\nEptesicus extracts an estimate of the overall delay of the waveform of compound echoes from two-glint targets. This time estimate leads to a range-axis image of the closer of the two glints in the target (the target's leading edge). This part of the image exhibits the same properties as the image of a single-glint target--it is encoded by the time-of-occurrence of neural discharges in the spectrograms and it resembles the crosscorrelation function for the first echo component (Simmons, Moss, & Ferragamo, 1990; Simmons, Ferragamo, et al., in press; see Simmons, 1989). The bat also perceives a range-axis image of the farther of the two glints (the target's trailing edge). This image is located at a perceived distance that corresponds to the bat's estimate of the time separation of the two echo components that make up the compound echo. Fig. 5 shows the performance of Eptesicus in a jitter discrimination experiment in which one of the jittering stimulus echoes contained two replicas of the bat's emitted sound separated by 10 \u00b5sec. The bat perceives two distinct reflecting points along the range axis. Both glints appear as events along the range axis in a time-domain image even though the existence of the second glint could only be inferred from the frequency domain because the delay separation of 10 \u00b5sec is much shorter than the receiver's integration time. The image of the second glint resembles the crosscorrelation function of the later of the two echo components. The bat adds it to the crosscorrelation function for the earlier component when the whole image is formed.\n\nFigure 5: A graph comparing the crosscorrelation function of echoes from a two-glint target with a delay separation of 10 \u00b5sec (top) with the bat's jitter discrimination performance using this compound echo as a stimulus (bottom). The two glints are indicated as a1 and a1' (Simmons, 1989). [Horizontal axis: time (\u00b5sec), 0 to 40.]\n\nACOUSTIC-IMAGE PROCESSING BY FM BATS\n\nSomehow Eptesicus recovers sufficient information from the timing of neural discharges across the frequencies in the FM sweeps of emissions and echoes to reconstruct the crosscorrelation function of echoes from the first glint in the complex target and to estimate delay with nanosecond accuracy. This fundamentally time-domain image is derived from the processing of information initially also represented in the time domain, as demonstrated by the occurrence of changes in apparent delay as echo amplitude increases or decreases: the location of the perceived crosscorrelation function for the first glint can be shifted by predictable amounts along the time axis according to the separately-measured amplitude-latency trading relation for Eptesicus (about -17 \u00b5sec/dB; Simmons, Moss, & Ferragamo, 1990; Simmons, Ferragamo, et al., in press), indicating that neural response latency--that is, neural discharge timing--conveys the crucial information about delay in the bat's auditory system.\n\nThe second glint in the complex target manifests itself as a crosscorrelation-like image component, too. 
However, the bat must transform spectral information into the time domain to arrive at such a time- or range-axis representation for the second glint. This transformed time-domain image is added to the time-domain image for the first glint in such a way that the absolute range of the second glint is referred to that of the first glint. Shifts in the apparent range of the first glint caused by neural discharges undergoing amplitude-latency trading will carry the image of the second glint along with it to a new range value (Simmons, Moss, & Ferragamo, 1990). Evidently, the psychological dimension of absolute range supports the image of the target as a whole. This helps to explain the bat's extraordinary 10-nsec accuracy for perceiving delay. For the psychological range or delay axis to accept fine-grain range information about the separation of glints in complex targets, its intrinsic accuracy must be adequate to receive the information that is transformed from the frequency domain. The bat achieves fusion of image components by transforming one component into the numerical format for the other and then adding them together. The experimental dissociation of the images of the first and second glints from different effects of latency shifts demonstrates the independence of their initial physiological representations. Furthermore, the expected latency shift does not occur for frequencies whose amplitudes are low because they coincide with spectral notches; the bat's fine nanosecond acuity thus seems to involve removal of discharges at \"untrustworthy\" frequencies prior to integration of discharge timing across frequencies. 
The delay-tuning of neurons is usually thought to represent the conversion of a temporal code (timing of neural discharges) into a \"place\" code (the location of activity on the neural map). The bat's unusual acuity of 10 nsec suggests that this conversion of a temporal to a \"place\" code is only partial. Not only does the site of activity on the neural map convey information about delay, but the timing of discharges in map neurons may also play a critical role in the map-reading operation. The bat's fine acuity may emerge in the behavioral data because initial neural encoding of the stimulus conditions in the jitter task involves the same parameter of neural responses--timing--that later is intimately associated with map-reading in the brain. Echolocation may thus fortuitously be a good system in which to explore this basic perceptual process.\n\nAcknowledgments\n\nResearch supported by grants from ONR, NIH, NIMH, ORF, and SOF.\n\nReferences\n\nR. A. Altes (1989) Ubiquity of hyperacuity, J. Acoust. Soc. Am. 85: 943-952.\n\nR. D. Bodenhamer & G. D. Pollak (1981) Time and frequency domain processing in the inferior colliculus of echolocating bats, Hearing Res. 5: 317-355.\n\nD. R. Griffin (1958) Listening in the Dark, Yale Univ. Press.\n\nJ. A. Simmons (1979) Perception of echo phase information in bat sonar, Science 207: 1336-1338.\n\nJ. A. Simmons (1989) A view of the world through the bat's ear: the formation of acoustic images in echolocation, Cognition 33: 155-199.\n\nJ. A. Simmons & L. Chen (1989) The acoustic basis for target discrimination by FM echolocating bats, J. Acoust. Soc. Am. 86: 1333-1350.\n\nJ. A. Simmons, M. Ferragamo, C. F. Moss, S. B. Stevenson, & R. A. 
Altes (in press) Discrimination of jittered sonar echoes by the echolocating bat, Eptesicus fuscus: the shape of target images in echolocation, J. Comp. Physiol. A.\n\nJ. A. Simmons, E. G. Freedman, S. B. Stevenson, L. Chen, & T. J. Wohlgenant (1989) Clutter interference and the integration time of echoes in the echolocating bat, Eptesicus fuscus, J. Acoust. Soc. Am. 86: 1318-1332.\n\nJ. A. Simmons & S. A. Kick (1984) Physiological mechanisms for spatial filtering and image enhancement in the sonar of bats, Ann. Rev. Physiol. 46: 599-614.\n\nJ. A. Simmons, C. F. Moss, & M. Ferragamo (1990) Convergence of temporal and spectral information into acoustic images perceived by the echolocating bat, Eptesicus fuscus, J. Comp. Physiol. A 166:", "award": [], "sourceid": 224, "authors": [{"given_name": "James", "family_name": "Simmons", "institution": null}]}