{"title": "Vertex Identification in High Energy Physics Experiments", "book": "Advances in Neural Information Processing Systems", "page_first": 868, "page_last": 874, "abstract": null, "full_text": "Vertex Identification in High Energy Physics Experiments \n\nGideon Dror* \n\nDepartment of Computer Science \n\nThe Academic College of Tel-Aviv-Yaffo, Tel Aviv 64044, Israel \n\nHalina Abramowicz\u2020 David Horn\u2021 \n\nSchool of Physics and Astronomy \n\nRaymond and Beverly Sackler Faculty of Exact Sciences \n\nTel-Aviv University, Tel Aviv 69978, Israel \n\nAbstract \n\nIn High Energy Physics experiments one has to sort through a high \nflux of events, at a rate of tens of MHz, and select the few that are \nof interest. One of the key factors in making this decision is the \nlocation of the vertex where the interaction that led to the event \ntook place. Here we present a novel solution to the problem of \nfinding the location of the vertex, based on two feedforward neural \nnetworks with fixed architectures, whose parameters are chosen \nso as to obtain a high accuracy. The system is tested on simulated \ndata sets, and is shown to perform better than conventional \nalgorithms. \n\n1 Introduction \n\nAn event in High Energy Physics (HEP) is the experimental result of an interaction \nduring the collision of particles in an accelerator. The result of this interaction is \nthe production of tens of particles, each of which is ejected in a different direction \nand with a different energy. Due to the quantum mechanical effects involved, the events differ from \none another in the number of particles produced, the types of particles, and their \nenergies. The trajectories of the produced particles are detected by a very large and \nsophisticated detector. 
\n\n*gideon@server.mta.ac.il \n\u2020halina@post.tau.ac.il \n\u2021horn@neuron.tau.ac.il \n\nEvents are typically produced at a rate of 10 MHz, in conjunction with a data \nvolume of up to 500 kBytes per event. The signal is very small, and is selected from \nthe background by multilevel triggers that perform filtering either through hardware \nor software. In the present paper we confront one problem that is of interest in these \nexperiments and is part of the triggering consideration: the location of the \nvertex of the interaction. To be specific we will use a simulation of data collected by \nthe central tracking detector (CTD) [1] of the ZEUS experiment [2] at the HEP laboratory \nDESY in Hamburg, Germany. This detector, placed in a magnetic field, surrounds \nthe interaction point and is sensitive to the path of charged particles. It has a \ncylindrical shape around the axis, z, where the interaction between the incoming \nparticles takes place. The challenge is to find an efficient and fast method to extract \nthe exact location of the vertex along this axis. \n\n2 The Input Data \n\nAn example of an event, projected onto the z = 0 plane, is shown in Figure 1. Only \nthe information relevant to triggering is used and displayed. The relevant points, \nwhich denote hits by the outgoing particles on wires in the detector, form five rings \ndue to the concentric structure of the detector. Several slightly curved particle \ntracks emanating from the origin, which is marked with a + sign, and crossing \nall five rings, can easily be seen. Each track is made of 30-40 data points. All \ntracks appear in this projection as arcs, and indeed, when viewed in 3 dimensions, \nevery particle follows a helical trajectory due to the solenoidal magnetic field in the \ndetector. 
\n\nFigure 1: A typical event projected onto the z = 0 plane. The dots, or hits, have \na two-fold ambiguity in the determination of the xy coordinates through which the \nparticle has moved. The correct solutions lie on curved tracks that emanate from \nthe origin. \n\nEach physical hit is represented twice in Figure 1 due to an inherent two-fold ambiguity \nin the determination of its xy coordinates. The correct solutions form curved tracks \nemanating from the origin. Some of those can be readily seen in the data. Due to \nthe limited time available for decision making at the trigger level, the z coordinate \nis obtained from the difference in arrival times of a pulse at both ends of the CTD \nand is available for only a fraction of these points. The hit resolution in xy is \n$\\sim 230\\,\\mu$m, while that of z-by-timing is $\\approx 4$ cm. The quality of the z coordinate \ninformation is exemplified in Figure 2. Figure 2(a) shows points forming a track of \na single particle on the z = 0 projection. Since the corresponding track forms a \nhelix with small curvature, one expects a linear dependence of the z coordinate of \nthe hits on their radial position, $r = \\sqrt{x^2 + y^2}$. Figure 2(b) compares the values of \nr with the measured z values for these points. The scatter of the data around the \nlinear regression fit is considerable. 
\n\nFigure 2: A typical example of uncertainties in the measured z values: (a) a single \ntrack taken from the event shown in Figure 1, (b) the z coordinate vs. $r = \\sqrt{x^2 + y^2}$, \nthe distance from the z axis, for the data points shown in (a). The full line is a \nlinear regression fit. \n\n3 The Network \n\nOur network is based on step-wise changes in the representation of the data, moving \nfrom the input points, to local line segments and to global arcs. The nature of \nthe data and the problem suggest it is best to separate the treatment of the xy \ncoordinates from that of the z coordinate. Two parallel networks, which perform \nentirely different computations, form our final system. The first network, which \nhandles the xy information, is responsible for constructing arcs that correctly identify \nsome of the particle tracks in the event. The second network uses this information \nto evaluate the z location of the point where all tracks meet. \n\n3.1 Arc Identification Network \n\nThe arc identification network processes information in a fashion akin to the way \nvisual information is processed by the primary visual system [3]. \n\nThe input layer for this network is made of a large number of neurons (several tens \nof thousands) and corresponds to the function of the retina. Each input neuron \nhas its distinct receptive field. The sum of all fields covers completely the relevant \ndomain in the xy plane. This domain has 5 concentric rings, which show up in \nFigure 1. The total area of the rings is about 5000 cm$^2$, and covering it with 100000 \ninput neurons (about 0.05 cm$^2$ per receptive field) leads to satisfactory resolution. 
A neuron in the input layer fires \nwhen a hit is present in its receptive field. We shall label each input neuron by the \n(xy) coordinates of the center of its receptive field. \n\nNeurons of the second layer are line segment detectors. Each second layer neuron \nis labeled by $(XY\\alpha)$, where (X, Y) are the coordinates of the center of the segment \nand $\\alpha$ denotes its orientation. The activation of second layer neurons is given by \n\n$$V_{XY\\alpha} = g\\Big(\\sum_{xy} J_{XY\\alpha,xy} V_{xy} - \\theta_2\\Big), \\qquad (1)$$ \n\nwhere \n\n$$J_{XY\\alpha,xy} = \\begin{cases} 1 & \\text{if } r_\\perp < 0.5\\,\\text{cm} \\wedge r_\\parallel < 2\\,\\text{cm} \\\\ -1 & \\text{if } 0.5\\,\\text{cm} < r_\\perp < 1\\,\\text{cm} \\wedge r_\\parallel < 2\\,\\text{cm} \\\\ 0 & \\text{otherwise} \\end{cases} \\qquad (2)$$ \n\nand g(x) is the standard Heaviside step function. $r_\\parallel$ and $r_\\perp$ are the parallel and \nperpendicular distances between (X, Y) and (x, y) with respect to the axis of the \nline segment, defined by $\\alpha$. It is important to note that at this level, values of the \nthreshold $\\theta_2$ which are slightly lower than optimum are preferable, taking the risk \nof obtaining superfluous line segments in order to reduce the probability of missing \none. Superfluous line segments are filtered out very efficiently in higher layers. \n\nFigure 3 represents the output of the second layer neurons for the input illustrated \nby the event of Figure 1. An active second layer neuron $(XY\\alpha)$ is represented in \nthis figure by a line segment centered at the point (X, Y) making an angle $\\alpha$ with \nthe x axis. The length of the line segments is immaterial and was chosen only for \nthe purpose of visual clarity. 
\n\nFigure 3: Representation of the activity of second layer neurons $XY\\alpha$ for the input \nof Figure 1, obtained by plotting the appropriate line segments in the xy plane. At some \nXY locations several line segments with different directions occur due to the rather \nlow threshold parameter used, $\\theta_2 = 4$. \n\nNeurons of the third layer transform the representation of local line segments into \nlocal arc segments. An arc which passes through the origin is uniquely defined by \nits radius of curvature R and its slope at the origin, $\\theta$. Thus, each third layer neuron \nis labeled by $\\kappa\\theta i$, where $|\\kappa| = 1/R$ is the curvature and the sign of $\\kappa$ determines \nthe orientation of the arc. $1 \\le i \\le 5$ is an index which relates each arc segment to \nthe ring it belongs to. \n\nThe mapping between second and third layers is based on a winner-take-all mechanism. \nNamely, for a given local arc segment, we take the line segment which is \nclosest to being tangent to it. \nDenoting the average radius of ring i (i = 1, 2, ..., 5) by $r_i$ and using $\\beta_i = \\sin^{-1}(\\kappa r_i / 2)$, \nthe final expression for the activation of the third layer neurons is \n\n$$V_{\\kappa\\theta i} = \\max_{\\delta < 3} e^{-\\delta^2} \\cos^2(\\theta - 2\\beta_i - \\alpha), \\qquad (3)$$ \n\nwhere $\\delta = \\delta(X, Y, \\kappa, \\theta, i) = \\sqrt{(X - r_i \\cos(\\theta - \\beta_i))^2 + (Y - r_i \\sin(\\theta - \\beta_i))^2}$ is simply \nthe distance of the center of the receptive field of the $(XY\\alpha)$ neuron to the $(\\kappa\\theta)$ \narc. \n\nThe fourth layer is the last one in the arc identification network. Neurons belonging \nto this layer are global arc detectors. In other words, they detect projected tracks \non the z = 0 plane. 
A fourth layer neuron is denoted by $\\kappa\\theta$, where $\\kappa$ and $\\theta$ have the \nprevious meaning, now describing global arcs. Fourth layer neurons are connected \nto third layer neurons in a simple fashion, \n\n$$V_{\\kappa\\theta} = g\\Big(\\sum_{\\kappa'\\theta' i} \\delta_{\\kappa,\\kappa'} \\delta_{\\theta,\\theta'} V_{\\kappa'\\theta' i} - \\theta_4\\Big). \\qquad (4)$$ \n\nFigure 4 represents the activity of fourth layer neurons. Each active neuron $\\kappa\\theta$ is \nequivalent in the xy plane to one arc appearing in the figure. \n\nFigure 4: Representation of the activity of fourth layer neurons $\\kappa\\theta$ for the input of \nFigure 1, obtained by plotting the appropriate arcs in the xy plane. The arcs are not \nprecisely congruent to the activity of the input layer, which is also shown, due to \nthe finite widths which were used, $\\Delta\\kappa = 0.004$ and $\\Delta\\theta = \\pi/20$. This figure was \nproduced with $\\theta_4 = 3$. \n\n3.2 z Location Network \n\nThe architecture of the second network is identical to that of the first \none, although its computational task is different. We will use an identical labeling \nsystem for its neurons, but denote their activities by $v_{xy}$. The latter will assume \ncontinuous values in this network. \n\nA first layer neuron of the z-location network receives its input from the same \nreceptive field as its corresponding neuron in the first network. Its value, $v_{xy}$, is the \nmean value of the z values of the points within its receptive field. If no z values are \navailable for these points, a null value is assigned to it. 
\n\nThe second layer neurons compute the mean value $v_{XY\\alpha} = \\langle v_{xy} \\rangle$ of the z coordinate \nof the first layer neurons in their receptive field, averaging over all neurons within \nthe section \n\n$$\\{xy : |(x - X)\\sin\\alpha - (y - Y)\\cos\\alpha| < 0.5\\,\\text{cm} \\wedge (x - X)^2 + (y - Y)^2 < 4\\,\\text{cm}^2\\},$$ \n\nwhich corresponds to the excitatory part of the synaptic connections of equation \n(2). If null values appear within that section they are disregarded by the averaging \nprocedure. If all values are null, $v_{XY\\alpha}$ is assigned a null value too. This z averaging \nprocedure is similarly propagated to the third layer neurons. \n\nThe fourth layer neurons evaluate the z value of the origin of each arc identified \nby the first network. This is performed by a simple linear extrapolation. The final \nz estimate of the vertex, $z_{\\mathrm{net}}$, which should be the common origin of all arcs, is \ncalculated by averaging the outputs of all active fourth layer neurons. \n\n4 Results \n\nIn order to test the network, we ran it over a set of 1000 events generated by a \nMonte-Carlo simulator as well as over a sample of physical events taken from the \nZEUS experiment at the HEP laboratory DESY in Hamburg. For the former set \nwe compared the estimate of the net, $z_{\\mathrm{net}}$, with the nominal location of the vertex z, \nwhereas for the real events in the latter set, we compared it with an estimate $z_{\\mathrm{rec}}$ \nobtained by the full reconstruction algorithm, which runs off-line and uses all available \ndata. Results of the two tests can be compared since it is well established that the \nresult of the full reconstruction algorithm is within 1 mm of the exact location \nof the vertex. 
\n\n[Figure 5 panels: (a) Network, $\\langle\\Delta z\\rangle = -2.7 \\pm 0.2$, $\\sigma = 6.1 \\pm 0.2$; (b) Histogram, $\\langle\\Delta z\\rangle = 1.9 \\pm 0.3$, $\\sigma = 8.4 \\pm 0.3$; horizontal axis $\\Delta z$ [cm].] \n\nFigure 5: Distribution of $\\Delta z = z_{\\mathrm{estimate}} - z_{\\mathrm{exact}}$ values for two types of estimates, \n(a) the one proposed in this paper and (b) the one based on a commonly used \nhistogram method. \n\nWe also compared our results with those of an algorithmic method used for triggering \nat ZEUS [4]. We shall refer to this method as the 'histogram method'. The \nperformance of the two methods was compared on a sample of 1000 Monte-Carlo \nevents. The network was unable to get an estimate for 16 events from the set, \nas compared with 15 for the histogram method (15 of those events were common \nfailures). In Figure 5 we compare the distributions of $\\Delta z = z_{\\mathrm{net}} - z_{\\mathrm{exact}}$ and \n$\\Delta z = z_{\\mathrm{hist}} - z_{\\mathrm{exact}}$ for the sample of Monte-Carlo events, where $z_{\\mathrm{exact}}$ is the generated \nlocation of the vertex. Both methods lead to small biases, -2.7 cm for $z_{\\mathrm{net}}$ \nand 1.9 cm for $z_{\\mathrm{hist}}$. The resolution, as obtained from a Gaussian fit, was found \nto be better for the network approach ($\\sigma = 6.1$ cm) as compared to the histogram \nmethod ($\\sigma = 8.4$ cm). In addition, it should be noted that the histogram method \nyields discrete results, with a step of 10 cm, whereas the current method gives continuous \nvalues. This can be of great advantage for further processing. Note that \noff-line, after using the whole CTD information, the resolution is better than 1 mm. 
\n\n5 Discussion \n\nWe have described a feedforward double neural network that performs a task of \npattern identification by thresholding and selecting subsets of data on which a simple \ncomputation can lead to the final answer. The network uses a fixed architecture, \nwhich allows for its implementation in hardware, crucial for fast triggering purposes. \n\nThe basic idea of using a fixed architecture that is inspired by the way our brain \nprocesses visual information is similar to the raison d'\u00eatre of the orientation \nselective neural network employed by [5]. The latter was based on orientation selective \ncells only, which were sufficient to select linear tracks that are of interest in \nHEP experiments. Here we develop an arc identification method, following similar \nsteps. Both methods can also be viewed as generalizations of the Hough transform \n[6] that was originally proposed for straight line identification and may be \nregarded as a basic element of pattern recognition problems [7]. Neither [5] nor \nour present proposal was considered by previous neural network analyses of HEP \ndata [8]. The results that we have obtained are very promising. We hope that they \nopen the possibility for a new type of neural network implementation in triggering \ndevices of HEP experiments. \n\nAcknowledgments \n\nWe are indebted to the ZEUS Collaboration whose data were used for this study. \nThis research was partially supported by the Israel National Science Foundation. \n\nReferences \n\n[1] B. Foster et al., Nuclear Instrum. and Methods in Phys. Res. A338 (1994) 254. \n[2] ZEUS Collab., The ZEUS Detector, Status Report 1993, DESY 1993; M. Derrick et al., Phys. Lett. B 293 (1992) 465. \n[3] D. H. Hubel and T. N. Wiesel, J. Physiol. 195 (1968) 215. \n[4] A. 
Quadt, MSc thesis, University of Oxford (1997). \n[5] H. Abramowicz, D. Horn, U. Naftaly and C. Sahar-Pikielny, Nuclear Instrum. and Methods in Phys. Res. A378 (1996) 305; Advances in Neural Information Processing Systems 9, eds. M. C. Mozer, M. I. Jordan and T. Petsche, MIT Press 1997, pp. 925-931. \n[6] P. V. Hough, \"Methods and means to recognize complex patterns\", U.S. patent 3.069.654. \n[7] R. O. Duda and P. E. Hart, \"Pattern classification and scene analysis\", Wiley, New York, 1973. \n[8] B. Denby, Neural Computation, 5 (1993) 505. \n", "award": [], "sourceid": 1546, "authors": [{"given_name": "Gideon", "family_name": "Dror", "institution": null}, {"given_name": "Halina", "family_name": "Abramowicz", "institution": null}, {"given_name": "David", "family_name": "Horn", "institution": null}]}