{"title": "An Efficient, Exact Algorithm for Solving Tree-Structured Graphical Games", "book": "Advances in Neural Information Processing Systems", "page_first": 817, "page_last": 823, "abstract": null, "full_text": "An Efficient, Exact Algorithm for Solving \n\nTree-Structured Graphical Games \n\nMichael L. Littman \nAT&T Labs - Research \nFlorham Park, NJ 07932-0971 \nmlittman@research.att.com \n\nMichael Kearns \nDepartment of Computer & Information Science \nUniversity of Pennsylvania \nPhiladelphia, PA 19104-6389 \nmkearns@cis.upenn.edu \n\nSatinder Singh \nSyntek Capital \nNew York, NY 10019-4460 \nbaveja@cs.colorado.edu \n\nAbstract \n\nWe describe a new algorithm for computing a Nash equilibrium in graphical games, a compact representation for multi-agent systems that we introduced in previous work. The algorithm is the first to compute equilibria both efficiently and exactly for a non-trivial class of graphical games. \n\n1 Introduction \n\nSeeking to replicate the representational and computational benefits that graphical models have provided to probabilistic inference, several recent works have introduced graph-theoretic frameworks for the study of multi-agent systems (La Mura 2000; Koller and Milch 2001; Kearns et al. 2001). In the simplest of these formalisms, each vertex represents a single agent, and the edges represent pairwise interaction between agents. As with many familiar network models, the macroscopic behavior of a large system is thus implicitly described by its local interactions, and the computational challenge is to extract the global states of interest. 
\nClassical game theory is typically used to model multi-agent interactions, and the global states of interest are thus the so-called Nash equilibria, in which no agent has a unilateral incentive to deviate. \n\nIn a recent paper (Kearns et al. 2001), we introduced such a graphical formalism for multi-agent game theory, and provided two algorithms for computing Nash equilibria when the underlying graph is a tree (or is sufficiently sparse). The first algorithm computes approximations to all Nash equilibria, in time polynomial in the size of the representation and the quality of the desired approximation. A second and related algorithm computes all Nash equilibria exactly, but in time exponential in the number of agents. We thus left open the problem of efficiently computing exact equilibria in sparse graphs. \n\nIn this paper, we describe a new algorithm that solves this problem. Given as input a graphical game that is a tree, the algorithm computes in polynomial time an exact Nash equilibrium for the global multi-agent system. The main advances involve the definition of a new data structure for representing \"upstream\" or partial Nash equilibria, and a proof that this data structure can always be extended to a global equilibrium. The new algorithm can also be extended to efficiently accommodate parametric representations of the local game matrices, which are analogous to parametric conditional probability tables (such as noisy-OR and sigmoids) in Bayesian networks. \n\nThe analogy between graphical models for multi-agent systems and probabilistic inference is tempting and useful to an extent. The problem of computing Nash equilibria in a graphical game, however, appears to be considerably more difficult than computing conditional probabilities in Bayesian networks. 
Nevertheless, the analogy and the work presented here suggest a number of interesting avenues for further work in the intersection of game theory, network models, probabilistic inference, statistical physics, and other fields. \n\nThe paper is organized as follows. Section 2 introduces graphical games and other necessary notation and definitions. Section 3 presents our algorithm and its analysis, and Section 4 gives a brief conclusion. \n\n2 Preliminaries \n\nAn n-player, two-action^1 game is defined by a set of n matrices M_i (1 <= i <= n), each with n indices. The entry M_i(x_1, ..., x_n) = M_i(x) specifies the payoff to player i when the joint action of the n players is x in {0,1}^n. Thus, each M_i has 2^n entries. If a game is given by simply listing the 2^n entries of each of the n matrices, we will say that it is represented in tabular form. \nThe actions 0 and 1 are the pure strategies of each player, while a mixed strategy for player i is given by the probability p_i in [0,1] that the player will play 1. For any joint mixed strategy, given by a product distribution p, we define the expected payoff to player i as M_i(p) = E_{x ~ p}[M_i(x)], where x ~ p indicates that each x_j is 1 with probability p_j and 0 with probability 1 - p_j. \n\nWe use p[i : p_i'] to denote the vector that is the same as p except in the ith component, where the value has been changed to p_i'. A Nash equilibrium for the game is a mixed strategy p such that for any player i, and for any value p_i' in [0,1], M_i(p) >= M_i(p[i : p_i']). (We say that p_i is a best response to p.) In other words, no player can improve its expected payoff by deviating unilaterally from a Nash equilibrium. 
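The definitions above translate directly into code. The following sketch (ours, not from the paper) computes the expected payoff M_i(p) under a product distribution and checks the best-response condition; the payoff matrix is assumed, for illustration, to be a Python dict keyed by joint-action tuples. Since each payoff is linear in p_i for two-action games, checking the pure deviations 0 and 1 would in principle suffice, but a grid check mirrors the definition as stated.

```python
from itertools import product

def expected_payoff(M, p, i, pi=None):
    """Expected payoff M_i(p) under product distribution p.
    If pi is given, evaluates the deviation p[i : pi] instead."""
    q = list(p)
    if pi is not None:
        q[i] = pi
    total = 0.0
    for x in product((0, 1), repeat=len(q)):
        prob = 1.0
        for xj, pj in zip(x, q):
            prob *= pj if xj == 1 else 1.0 - pj
        total += prob * M[x]
    return total

def is_best_response(M, p, i, grid=101, tol=1e-9):
    """Check that no unilateral deviation by player i on a grid of
    [0,1] improves on p[i]; for two actions the payoff is linear in
    p_i, so the endpoints 0 and 1 alone would actually suffice."""
    base = expected_payoff(M, p, i)
    return all(expected_payoff(M, p, i, k / (grid - 1)) <= base + tol
               for k in range(grid))
```

For a coordination game where player 0 is paid 1 exactly when both players match, the uniform strategy profile leaves player 0 indifferent (hence best-responding), while a mismatched pure profile does not.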
The classic theorem of Nash (1951) states that for any game, there exists a Nash equilibrium in the space of joint mixed strategies (product distributions). \n\nAn n-player graphical game is a pair (G, M), where G is an undirected graph^2 on n vertices and M is a set of n matrices M_i (1 <= i <= n), called the local game matrices. \n\n^1 At present, no polynomial-time algorithm is known for finding Nash equilibria even in 2-player games with more than two actions, so we leave the extension of our work to the multi-action setting for future work. \n\n^2 The directed tree-structured case is trivial and is not addressed in this paper. \n\nPlayer i is represented by a vertex labeled i in G. We use N_G(i), a subset of {1, ..., n}, to denote the set of neighbors of player i in G, that is, those vertices j such that the undirected edge (i, j) appears in G. By convention, N_G(i) always includes i itself. The interpretation is that each player is in a game with only his neighbors in G. Thus, if |N_G(i)| = k, the matrix M_i has k indices, one for each player in N_G(i), and if x in {0,1}^k, M_i(x) denotes the payoff to i when his k neighbors (which include himself) play x. The expected payoff under a mixed strategy p in [0,1]^k is defined analogously. Note that in the two-action case, M_i has 2^k entries, which may be considerably smaller than 2^n. \nSince we identify players with vertices in G, it will be easier to treat vertices symbolically (such as U, V and W) rather than by integer indices. We thus use M_V to denote the local game matrix for the player identified with vertex V. \n\nNote that our definitions are entirely representational, and alter nothing about the underlying game theory. Thus, every graphical game has a Nash equilibrium. Furthermore, every game can be trivially represented as a graphical game by choosing G to be the complete graph and letting the local game matrices be the original tabular form matrices. Indeed, in some cases, this may be the most compact graphical representation of the tabular game. However, exactly as for Bayesian networks and other graphical models for probabilistic inference, in any game in which the local neighborhoods in G can be bounded by k << n, exponential space savings accrue. The algorithm presented here demonstrates that for trees, exponential computational benefits may also be realized. \n\n3 The Algorithm \n\nIf (G, M) is a graphical game in which G is a tree, then we can always designate some vertex Z as the root. For any vertex V, the single neighbor of V on the path from V to Z shall be called the child of V, and the (possibly many) neighbors of V on paths towards the leaves shall be called the parents of V. Our algorithm consists of two passes: a downstream pass in which local data structures are passed from the leaves towards the root, and an upstream pass progressing from the root towards the leaves. \n\nThroughout the ensuing discussion, we consider a fixed vertex V with parents U_1, ..., U_k and child W. On the downstream pass of our algorithm, vertex V will compute and pass to its child W a breakpoint policy, which we now define. \n\nDefinition 1 A breakpoint policy for V consists of an ordered set of W-breakpoints w_0 = 0 < w_1 < w_2 < ... < w_{t-1} < w_t = 1 and an associated set of V-values v_1, ..., v_t. The interpretation is that for any w in [0,1], if w_{i-1} < w < w_i for some index i and W plays w, then V shall play v_i; and if w = w_i for some index i, then V shall play any value between v_i and v_{i+1}. 
We say such a breakpoint policy has t - 1 breakpoints. \n\nA breakpoint policy for V can thus be seen as assigning a value (or range of values) to the mixed strategy played by V in response to the play of its child W. In a slight abuse of notation, we will denote this breakpoint policy as a function F_V(w), with the understanding that the assignment V = F_V(w) means that V plays either the fixed value determined by the breakpoint policy (in the case that w falls between breakpoints), or plays any value in the interval determined by the breakpoint policy (in the case that w equals some breakpoint). \n\nLet G^V denote the subtree of G with root V, and let M^V_{W=w} denote the subset of the set of local game matrices M corresponding to the vertices in G^V, except that the matrix M_V is collapsed one index by setting W = w, thus marginalizing W out. On its downstream pass, our algorithm shall maintain the invariant that if we set the child W = w, then there is a Nash equilibrium for the graphical game (G^V, M^V_{W=w}) (an upstream Nash) in which V = F_V(w). If this property is satisfied by F_V(w), we shall say that F_V(w) is a Nash breakpoint policy for V. Note that since (G^V, M^V_{W=w}) is just another graphical game, it of course has (perhaps many) Nash equilibria, and V is assigned some value in each. The trick is to commit to one of these values (as specified by F_V(w)) that can be extended to a Nash equilibrium for the entire tree G, before we have even processed the tree below V. Accomplishing this efficiently and exactly is one of the main advances in this work over our previous algorithm (Kearns et al. 2001). \nThe algorithm and analysis are inductive: V computes a Nash breakpoint policy F_V(w) from Nash breakpoint policies F_{U_1}(v), ...
, FUk (v)  passed down from its par(cid:173)\nents  (and  from  the  local  game  matrix  Mv).  The  complexity  analysis  bounds  the \nnumber  of breakpoints  for  any  vertex in  the  tree.  We  now  describe  the  inductive \nstep and its analysis. \n\n3.1  Downstream Pass \nFor any setting it E [0, l]k  for  -0  and w  E [0,1]  for  W, let  us  define \n\n~v(i1,w) ==  Mv(l,it,w) - Mv(O,it,w). \n\nThe sign of ~v(it, w)  tells us V's best response to the setting of the local neighbor(cid:173)\nhood -0  =  it, W  =  w;  positive sign means V  =  1 is  the best response,  negative that \nV  =  0  is  the  best  response,  and  0  that  V  is  indifferent  and  may  play  any  mixed \nstrategy.  Note also  that we  can express  ~v(it,w) as  a  linear function  of w: \n\n~v(it,w) =  ~v(it, O) + w(~v(it, 1)  - ~v(it, 0)). \n\nFor the base case,  suppose V  is  a  leaf with child  W;  we  want to describe the Nash \nbreakpoint  policy  for  V.  If for  all  w  E  [0,1],  the  function  ~v(w) is  non-negative \n(non-positive,  respectively),  V  can  choose  1  (0,  respectively)  as  a  best  response \n(which  in  this  base  case  is  an  upstream  Nash)  to  all  values  W  = w.  Otherwise, \n~ v (w)  crosses  the  w-axis,  separating  the  values  of w  for  which  V  should  choose \n1,  0,  or  be  indifferent  (at  the  crossing  point).  Thus,  this  crossing  point  becomes \nthe single breakpoint in  Fv(w).  Note that if V  is  indifferent for  all  values of w,  we \nassume without loss  of generality that V  plays  l. \n\nThe following  theorem is  the centerpiece of the analysis. \n\nTheorem 2  Let vertex V  have parents UI , ... ,Uk  and child W,  and assume V  has \nreceived Nash  breakpoint policies FUi (v)  from  each parent Ui .  Then V  can  efficiently \ncompute  a  Nash  breakpoint  policy  Fv (w).  The  number  of  breakpoints  is  no  more \nthan  two  plus  the  total  number of breakpoints  in the  FUi (v)  policies. 
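Before turning to the proof, the leaf base case can be made concrete. Since Δ_V(w) is linear, its sign pattern on [0,1] is determined by the two endpoint values, and the single crossing point (when it exists) is found in closed form. The sketch below (function and variable names are ours, not the paper's) returns the breakpoints and per-interval values of a leaf's Nash breakpoint policy.

```python
def leaf_breakpoint_policy(delta0, delta1):
    """Base-case breakpoint policy for a leaf V with child W.

    delta0 = Delta_V(0) and delta1 = Delta_V(1); by linearity,
    Delta_V(w) = delta0 + w * (delta1 - delta0).

    Returns (breakpoints, values): breakpoints w_0 = 0 < ... < w_t = 1
    and the value V plays on each open interval between them.
    """
    if delta0 >= 0 and delta1 >= 0:      # Delta_V nonnegative on [0,1]: play 1
        return [0.0, 1.0], [1.0]        # (covers the all-indifferent case too)
    if delta0 <= 0 and delta1 <= 0:      # Delta_V nonpositive on [0,1]: play 0
        return [0.0, 1.0], [0.0]
    w_star = delta0 / (delta0 - delta1)  # unique crossing of the w-axis
    left = 1.0 if delta0 > 0 else 0.0    # best response left of the crossing
    return [0.0, w_star, 1.0], [left, 1.0 - left]
```

For example, delta0 = 1, delta1 = -1 yields the single breakpoint w = 1/2, with V playing 1 below it and 0 above it.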
\n\nProof:  Recall that for  any fixed  value  of v,  the breakpoint policy  FUi (v)  specifies \neither a specific value for Ui  (if v falls between two breakpoints of FUi (v)) , or a range \nof allowed values for  Ui  (if v is equal to a breakpoint).  Let us assume without loss of \ngenerality that no two FUi (v)  share a breakpoint, and let Vo  =  0 < VI  < ... < Vs  =  1 \nbe the ordered union of the breakpoints of the FUi (v).  Thus for  any breakpoint Vi, \nthere is  at most one distinguished parent Uj  (that we  shall call the free  parent)  for \nwhich Fu; (Vi)  specifies an allowed interval of play for  Uj .  All  other Ui  are assigned \n\n\ffixed  values  by Fu; (ve).  For each breakpoint Ve,  we  now  define  the set of values for \nthe child W  that, as  we  let the free  parent range across its allowed interval, permit \nV  to play any mixed strategy as  a  best response. \n\nDefinition 3  Let Vo  =  0  < VI  < ... < Vs  =  1  be  the  ordered  union  of the  break(cid:173)\npoints  of the parent policies Fu; (v).  Fix any  breakpoint Ve,  and assume without loss \nof generality  that  UI  is  the  free  parent  of V  for  V  =  Ve.  Let  [a,  b]  be  the  allowed \ninterval ofUI  specified by FUI (ve),  and letui = Fu;(ve)  for  all 2  :::;  i:::;  k.  We  define \n\nWe = {w  E  [0,1]:  (:lUI  E  [a,b])6.v(UI,U2, ... ,Uk,W)  = O}. \n\nIn  words,  We  is  the  set  of values  that W  can  play  that  allow  V  to  play  any  mixed \nstrategy,  preserving  the  existence  of an  upstream  Nash  from  V  given W  =  w. \n\nThe next lemma,  which  we  state without proof and is  a  special case of Lemma 6 in \nKearns et al.  (2001),  limits  the complexity of the sets We.  It also follows  from  the \nearlier work that We  can be computed in  time proportional to the size of V's  local \ngame matrix - O(2k)  for  a  vertex with  k  parents. \nWe  say that an interval  [a, b]  ~ [0, 1]  is  floating  if both a -I- 0 and b -I- 1. 
\n\nLemma 4  For  any  breakpoint Ve,  the  set We  is  either  empty,  a  single  interval,  or \nthe  union  of two  intervals  that  are  not floating. \n\nWe  wish  to create the  (inductive)  Nash breakpoint policy Fv(w)  from  the sets W e \nand  the  Fu;  policies.  The  idea  is  that  if  w  E  We  for  some  breakpoint  index  e, \nthen  by  definition  of We,  if W  plays  wand the  Uis  play  according  to  the  setting \ndetermined by the Fu;  policies  (including  a  fixed  setting for  the free  parent of V), \nany play by V  is a best response-so in particular, V  may play the breakpoint value \nVe,  and thus extend the  Nash solution constructed,  as  the  UiS  can also all  be  best \nresponses.  For  b E {O, I}, we  define  W b  as the set of values  w  such  that if W  =  w \nand  the  Uis  are  set  according  to  their  breakpoint  policies  for  V  =  b,  V  =  b  is  a \nbest  response.  To  create Fv (w)  as  a  total function,  we  must first  show  that every \nw  E [0, 1]  is  contained in  some We  or WO or WI. \n\nLemma 5  Let Vo  =  0 < VI  < ... < Vs  =  1  be  the  ordered  union  of the  breakpoints \nof the  Fu; (v)  policies.  Then  for  any  value  w  E  [0, 1],  either  w  E  w b  for  some \nbE {O, I} ,  or there  exists  an  index e such  that wE W e. \n\nProof:  Consider  any  fixed  value  of w,  and  for  each  open  interval  (vi> vj+d  de(cid:173)\ntermined  by  adjacent  breakpoints,  label  this  interval  by  V 's  best  response  (0  or \n1)  to  W  =  wand 0  set  according  to  the  Fu;  policies  for  this  interval.  If either \nthe  leftmost  interval  [O ,vd  is  labeled  with  0  or  the  rightmost  interval  [vs-I , I]  is \nlabeled  with  1,  then  w  is  included  in  W O  or  WI ,  respectively  (V  playing  0  or  1 \nis  a  best  response  to  what  the  Uis  will  play  in  response  to  a  0  or  1).  
Otherwise, since the labeling starts at 1 on the left and ends at 0 on the right, there must be a breakpoint v_ℓ such that V's best response changes over this breakpoint. Let U_i be the free parent for this breakpoint. By continuity, there must be a value of U_i in its allowed interval for which V is indifferent between playing 0 and 1, so w in W_ℓ. This completes the proof of Lemma 5. \n\n[Figure 1.] Example of the inductive construction of F_V(w). The dashed horizontal lines show the v_ℓ-breakpoints determined by the parent policies F_{U_i}(v). The solid intervals along these breakpoints are the sets W_ℓ. As shown in Lemma 4, each of these sets consists of either a single (possibly floating) interval, or two non-floating intervals. As shown in Lemma 5, each value of w is covered by some W_ℓ. The construction of F_V(w) (represented by a thick line) begins on the left, and always next \"jumps\" to the interval allowing greatest progress to the right. \n\nArmed with Lemmas 4 and 5, we can now describe the construction of F_V(w). Since every w is contained in some W_ℓ (Lemma 5), and since every W_ℓ is the union of at most two intervals (Lemma 4), we can uniquely identify the set W_{ℓ_1} that covers the largest (leftmost) interval containing w = 0; let [0, a] be this interval. Continuing in the same manner to the right, we can identify the unique set W_{ℓ_2} that contains 
\n\nw  =  a  and extends  farthest  to  the  right  of a.  Any  overlap  between  We 1  and  We2 \ncan be  arbitrarily  assigned  coverage by We 1 ,  and We2  \"trimmed\"  accordingly;  see \nFigure 1.  This  process results in a  Nash breakpoint policy  Fv(w). \nFinally, we  bound the number of breakpoints in the Fv (w)  policy.  By construction, \neach of its breakpoints must be the rightmost portion of some interval in WO,  WI, or \nsome We.  After the first breakpoint, each of these sets contributes at most one new \nbreakpoint  (Lemma 4).  The final  breakpoint is  at w  =  1 and  does  not  contribute \nto  the  count  (Definition  1).  There  is  at most  one  We  for  each  breakpoint  in  each \nFu; (v)  policy,  plus  WO  and  WI,  plus  the  initial  leftmost  interval  and  minus  the \nfinal  breakpoint,  so  the total breakpoints  in  Fv(w)  can be no  more than two  plus \nthe total number of breakpoints in the Fu; (v)  policies.  Therefore, the root of a size \nn  tree will  have a  Nash breakpoint policy with  no  more than 2n breakpoints. \n\nThis completes the proof of Theorem 2. \n\n3.2  Upstream Pass \n\nThe  downstream  pass  completes  when  each  vertex  in  the  tree  has  had  its  Nash \nbreakpoint policy computed.  For simplicity of description, imagine that the root of \nt he tree includes a dummy child with constant payoffs and no influence on t he root, \nso the root's breakpoint policy has t he same form  as the others in the tree. \n\nTo  produce  a  Nash  equilibrium,  our  algorithm  performs  an  upstream  pass  over \nthe  tree,  starting  from  the  root.  Each  vertex  is  told  by  its  child  what  value  to \nplay,  as  well  as  the value  the  child  itself will  play.  The algorithm ensures  that all \ndownstream  vertices  are  Nash  (playing  best  response  to  their  neighbors).  
Given this information, each vertex computes a value for each of its parents so that its own assigned action is a best response. This process can be initiated by the dummy vertex picking an arbitrary value for itself, and selecting the root's value according to its Nash breakpoint policy. \nInductively, we have a vertex V connected to parents U_1, ..., U_k (or no parents if V is a leaf) and child W. The child of V has informed V to choose V = v and that it will play W = w. To decide on values for V's parents that enforce V playing a best response, we can look at the Nash breakpoint policies F_{U_i}(v), which provide a value (or range of values) for each U_i as a function of v that guarantees an upstream Nash. The value v can be a breakpoint for at most one U_i. For each U_i, if v is not a breakpoint in F_{U_i}(v), then U_i should be told to select u_i = F_{U_i}(v). If v is a breakpoint in F_{U_i}(v), then U_i's value can be computed by solving Δ_V(u_1, ..., u_i, ..., u_k, w) = 0; this is the value of u_i that makes V indifferent. The equation is linear in u_i and has a solution by the construction of the Nash breakpoint policies on the downstream pass. Parents are passed their assigned values as well as the fact that V = v. \nWhen the upstream pass completes, each vertex has a concrete choice of action such that jointly they have formed a Nash equilibrium. \n\nThe total running time of the algorithm can be bounded as follows. Each vertex is involved in a computation in the downstream pass and in the upstream pass. Let t be the total number of breakpoints in the breakpoint policy for a vertex V with k parents. Sorting the breakpoints, computing the W_ℓ sets, and computing the new breakpoint policy can be completed in O(t log t + t 2^k). 
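Because the indifference equation above is linear in u_i, the free parent's value follows from two endpoint evaluations. A minimal sketch of this upstream-pass step (ours; delta(u) denotes Δ_V as a function of the free parent's strategy with all other neighbors fixed):

```python
def solve_free_parent(delta, lo, hi, tol=1e-12):
    """Find u in [lo, hi] with delta(u) = 0, where delta is linear in u.

    lo and hi are the endpoints of the free parent's allowed interval;
    the root of the linear function is recovered from delta(lo) and
    delta(hi) alone. A root is guaranteed to exist in [lo, hi] by the
    construction of the Nash breakpoint policies.
    """
    d_lo, d_hi = delta(lo), delta(hi)
    if abs(d_hi - d_lo) < tol:          # delta constant (hence 0): any u works
        return lo
    u = lo + (hi - lo) * d_lo / (d_lo - d_hi)
    return min(max(u, lo), hi)          # clamp against floating-point round-off
```

For instance, delta(u) = 2u - 1 on [0, 1] gives u = 1/2, the value at which V is indifferent.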
In the upstream pass, only one breakpoint is considered, so O(log t + 2^k) is sufficient for passing values to the parents. By Theorem 2, t <= 2n, so the entire algorithm executes in time O(n^2 log n + n^2 2^k), where k is the largest number of neighbors of any vertex in the network. \n\nThe algorithm can be implemented to take advantage of local game matrices provided in a parameterized form. For example, if each vertex's payoff is solely a function of the number of 1s played by the vertex's neighbors, the algorithm takes O(n^2 log n + n^2 k), eliminating the exponential dependence on k. \n\n4 Conclusion \n\nThe algorithm presented in this paper finds a single Nash equilibrium for a game represented by a tree-structured network. By building representations of all equilibria, our earlier algorithm (Kearns et al. 2001) was able to select equilibria efficiently according to criteria like maximizing the total expected payoff for all players. The polynomial-time algorithm described in this paper throws out potential equilibria at many stages, most significantly during the construction of the Nash breakpoint policies. An interesting area for future work is to manipulate this process to produce equilibria with particular properties. \n\nReferences \n\nMichael Kearns, Michael L. Littman, and Satinder Singh. Graphical models for game theory. In Proceedings of the 17th Conference on Uncertainty in Artificial Intelligence (UAI), pages 253-260, 2001. \n\nDaphne Koller and Brian Milch. Multi-agent influence diagrams for representing and solving games. Submitted, 2001. \n\nPierfrancesco La Mura. Game networks. In Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence (UAI), pages 335-342, 2000. \n\nJ. F. Nash. Non-cooperative games. 
Annals of Mathematics, 54:286-295, 1951. \n", "award": [], "sourceid": 2101, "authors": [{"given_name": "Michael", "family_name": "Littman", "institution": null}, {"given_name": "Michael", "family_name": "Kearns", "institution": null}, {"given_name": "Satinder", "family_name": "Singh", "institution": null}]}