{"title": "A message-passing algorithm for multi-agent trajectory planning", "book": "Advances in Neural Information Processing Systems", "page_first": 521, "page_last": 529, "abstract": "We describe a novel approach for computing collision-free \\emph{global} trajectories for $p$ agents with specified initial and final configurations, based on an improved version of the alternating direction method of multipliers (ADMM) algorithm. Compared with existing methods, our approach is naturally parallelizable and allows for incorporating different cost functionals with only minor adjustments. We apply our method to classical challenging instances and observe that its computational requirements scale well with $p$ for several cost functionals. We also show that a specialization of our algorithm can be used for {\\em local} motion planning by solving the problem of joint optimization in velocity space.", "full_text": "A message-passing algorithm\n\nfor multi-agent trajectory planning\n\nJos\u00b4e Bento \u21e4\n\njbento@disneyresearch.com\n\nNate Derbinsky\n\nnate.derbinsky@disneyresearch.com\n\nJavier Alonso-Mora\n\njalonso@disneyresearch.com\n\nJonathan Yedidia\n\nyedidia@disneyresearch.com\n\nAbstract\n\nWe describe a novel approach for computing collision-free global trajectories for\np agents with speci\ufb01ed initial and \ufb01nal con\ufb01gurations, based on an improved ver-\nsion of the alternating direction method of multipliers (ADMM). Compared with\nexisting methods, our approach is naturally parallelizable and allows for incor-\nporating different cost functionals with only minor adjustments. We apply our\nmethod to classical challenging instances and observe that its computational re-\nquirements scale well with p for several cost functionals. We also show that a\nspecialization of our algorithm can be used for local motion planning by solving\nthe problem of joint optimization in velocity space.\n\n1\n\nIntroduction\n\nRobot navigation relies on at least three sub-tasks: localization, mapping, and motion planning. The\nlatter can be described as an optimization problem: compute the lowest-cost path, or trajectory,\nbetween an initial and \ufb01nal con\ufb01guration. This paper focuses on trajectory planning for multiple\nagents, an important problem in robotics [1, 2], computer animation, and crowd simulation [3].\nCentralized planning for multiple agents is PSPACE hard [4, 5]. To contend with this complexity,\ntraditional multi-agent planning prioritizes agents and computes their trajectories sequentially [6],\nleading to suboptimal solutions. By contrast, our method plans for all agents simultaneously. Tra-\njectory planning is also simpli\ufb01ed if agents are non-distinct and can be dynamically assigned to a set\nof goal positions [1]. We consider the harder problem where robots have a unique identity and their\ngoal positions are statically pre-speci\ufb01ed. Both mixed-integer quadratic programming (MIQP) [7]\nand [more ef\ufb01cient, although local] sequential convex programming [8] approaches have been ap-\nplied to the problem of computing collision-free trajectories for multiple agents with pre-speci\ufb01ed\ngoal positions; however, due to the non-convexity of the problem, these approaches, especially the\nformer, do not scale well with the number of agents. Alternatively, trajectories may be found by\nsampling in their joint con\ufb01guration space [9]. This approach is probabilistic and, alone, only gives\nasymptotic guarantees. See Appendix A for further comments on discrete search methods.\nDue to the complexity of planning collision-free trajectories, real-time robot navigation is com-\nmonly decoupled into a global planner and a fast local planner that performs collision-avoidance.\nMany single-agent reactive collision-avoidance algorithms are based either on potential \ufb01elds [10],\nwhich typically ignore the velocity of other agents, or \u201cvelocity obstacles\u201d [11], which provide\nimproved performance in dynamic environments by formulating the optimization in velocity space\ninstead of Cartesian space. Building on an extension of the velocity-obstacles approach, recent work\non centralized collision avoidance [12] computes collision-free local motions for all agents whilst\nmaximizing a joint utility using either a computationally expensive MIQP or an ef\ufb01cient, though\nlocal, QP. While not the main focus of this paper, we show that a specialization of our approach\n\n\u21e4This author would like to thank Emily Hupf and Noa Ghersin for their support while writing this paper.\n\n1\n\n\fto global-trajectory optimization also applies for local-trajectory optimization, and our numerical\nresults demonstrate improvements in both ef\ufb01ciency and scaling performance.\nIn this paper we formalize the global trajectory planning task as follows. Given p agents of different\nradii {ri}p\ni=1 with given desired initial and \ufb01nal positions, {xi(0)}p\ni=1 and {xi(T )}p\ni=1, along with\na cost functional over trajectories, compute collision-free trajectories for all agents that minimize\nthe cost functional. That is, \ufb01nd a set of intermediate points {xi(t)}p\ni=1, t 2 (0, T ), that satis\ufb01es the\n\u201chard\u201d collision-free constraints that kxi(t)  xj(t)k > ri + rj, for all i, j and t, and that insofar as\npossible, minimizes the cost functional.\nThe method we propose searches for a solution within the space of piece-wise linear trajectories,\nwherein the trajectory of an agent is completely speci\ufb01ed by a set of positions at a \ufb01xed set of time\ninstants {ts}\u2318\ns=0. We call these time instants break-points and they are the same for all agents, which\ngreatly simpli\ufb01es the mathematics of our method. All other intermediate points of the trajectories\nare computed by assuming that each agent moves with constant velocity in between break-points: if\nt1 and t2 > t1 are consecutive break-points, then xi(t) = 1\n((t2  t)xi(t1) + (t t1)xi(t2)) for\nt 2 [t1, t2]. Along with the set of initial and \ufb01nal con\ufb01gurations, the number of interior break-points\n(\u2318  1) is an input to our method, with a corresponding tradeoff: increasing \u2318 yields trajectories that\nare more \ufb02exible and smooth, with possibly higher quality; but increasing \u2318 enlarges the problem,\nleading to potentially increased computation.\nThe main contributions of this paper are as follows:\n\nt2t1\n\ni) We formulate the global trajectory planning task as a decomposable optimization problem.\nWe show how to solve the resulting sub-problems exactly and ef\ufb01ciently, despite their non-\nconvexity, and how to coordinate their solutions using message-passing. Our method, based on\nthe \u201cthree-weight\u201d version of ADMM [13], is easily parallelized, does not require parameter\ntuning, and we present empirical evidence of good scalability with p.\n\nii) Within our decomposable framework, we describe different sub-problems, called minimizers,\neach ensuring the trajectories satisfy a separate criterion. Our method is \ufb02exible and can con-\nsider different combinations of minimizers. A particularly crucial minimizer ensures there are\nno inter-agent collisions, but we also derive other minimizers that allow for \ufb01nding trajectories\nwith minimal total energy, avoiding static obstacles, or imposing dynamic constraints, such as\nmaximum/minimum agent velocity.\n\niii) We show that our method can specialize to perform local planning by solving the problem of\n\njoint optimization in velocity space [12].\n\nOur work is among the few examples where the success of applying ADMM to \ufb01nd approximate\nsolutions to a large non-convex problems can be judged with the naked eye, by the gracefulness\nof the trajectories found. This paper also reinforces the claim in [13] that small, yet important,\nmodi\ufb01cations to ADMM can bring an order of magnitude increase in speed. We emphasize the\nimportance of these modi\ufb01cations in our numerical experiments, where we compare the performance\nof our method using the three-weight algorithm (TWA) versus that of standard ADMM.\nThe rest of the paper is organized as follows. Section 2 provides background on ADMM and the\nTWA. Section 3 formulates the global-trajectory-planning task as an optimization problem and de-\nscribes the separate blocks necessary to solve it (the mathematical details of solving these sub-\nproblems are left to appendices). Section 4 evaluates the performance of our solution: its scalability\nwith p, sensitivity to initial conditions, and the effect of different cost functionals. Section 5 explains\nhow to implement a velocity-obstacle method using our method and compares its performance with\nprior work. Finally, Section 6 draws conclusions and suggests directions for future work.\n\n2 Minimizers in the TWA\n\nIn this section we provide a short description of the TWA [13], and, in particular, the role of the\nminimizer building blocks that it needs to solve a particular optimization problem. Section B of the\nsupplementary material includes a full description of the TWA.\nAs a small illustrative example of how the TWA is used to solve optimization problems, suppose we\nwant to solve minx2R3 f (x) = min{x1,x2,x3} f1(x1, x3) + f2(x1, x2, x3) + f3(x3), where fi(.) 2\n\n2\n\n\fR[{+1}. The functions can represent soft costs, for example f3(x3) = (x31)2, or hard equality\nor inequality constraints, such as f1(x1, x3) = J(x1 \uf8ff x3), where we are using the notation J(.) = 0\nif (.) is true or +1 if (.) is false.\nThe TWA solves this optimization problem iteratively by passing messages on a bipartite graph, in\nthe form of a Forney factor graph [14]: one minimizer-node per function fb, one equality-node per\nvariable xj and an edge (b, j), connecting b and j, if fb depends on xj (see Figure 1-left).\n\ng\n\ng\n\ng\n\n1 \n\n2 \n\n3 \n\n= \n\n1 \n\n= \n\n2 \n\n= \n\n3 \n\nn1,1, \uf073\n\u03c11,1\nn1,3, \uf073\n\u03c11,3\nn2,1, \uf073\n\u03c12,1\nn2,2, \uf073\n\u03c12,2\nn2,3, \uf073\n\u03c12,3\n\nn3,3, \uf073\n\u03c13,3\n\ng1\n\ng2\n\ng3\n\nx1,1, \uf072\n\u03c11,1\nx1,3, \uf072\n\u03c11,3\nx2,1, \uf072\n\u03c12,1\nx2,2, \uf072\n\u03c12,2\nx2,3, \uf072\n\u03c12,3\n\nx3,3, \uf072\n\u03c13,3\n\nFigure 1: Left: bipartite graph, with one minimizer-node on the left for each function making up\nthe overall objective function, and one equality-node on the right per variable in the problem. Right:\nThe input and output variables for each minimizer block.\n\nApart from the \ufb01rst-iteration message values, and two internal parameters1 that we specify in Section\n4, the algorithm is fully speci\ufb01ed by the behavior of the minimizers and the topology of the graph.\nWhat does a minimizer do? The minimizer-node g1, for example, solves a small optimization prob-\nlem over its local variables x1 and x3. Without going into the full detail presented in [13] and the\nsupplementary material, the estimates x1,1 and x1,3 are then combined with running sums of the\ndifferences between the minimizer estimates and the equality-node consensus estimates to obtain\nmessages m1,1 and m1,3 on each neighboring edge that are sent to the neighboring equality-nodes\nalong with corresponding certainty weights, !\u21e2 1,2 and !\u21e2 1,3. All other minimizers act similarly.\nThe equality-nodes receive these local messages and weights and produce consensus estimates for\nall variables by computing an average of the incoming messages, weighted by the incoming certainty\nweights !\u21e2 . From these consensus estimates, correcting messages are computed and communicated\nback to the minimizers to help them reach consensus. A certainty weight for the correcting messages,\n \u21e2 , is also communicated back to the minimizers. For example, the minimizer g1 receives correcting\nmessages n1,1 and n1,3 with corresponding certainty weights \u21e2 1,1 and \u21e2 1,3 (see Figure 1-right).\nWhen producing new local estimates, the bth minimizer node computes its local estimates {xj} by\nchoosing a point that minimizes the sum of the local function fb and weighted squared distance from\nthe incoming messages (ties are broken randomly):\n\n1\n\n(1)\n\n{xb,j}j = gb{nb,j}j,{ \u21e2 k\n\nb,j}j \u2318 arg min\n\n{xj}j24fb({xj}j) +\n\n \u21e2 b,j(xj  nb,j)235 ,\nwhere {}j andPj run over all equality-nodes connected to b. In the TWA, the certainty weights\n{!\u21e2 b,j} that this minimizer outputs must be 0 (uncertain); 1 (certain); or \u21e20, set to some \ufb01xed\nvalue. The logic for setting weights from minimizer-nodes depends on the problem; as we shall\nsee, in trajectory planning problems, we only use 0 or \u21e20 weights. If we choose that all minimizers\nalways output weights equal to \u21e20, the TWA reduces to standard ADMM; however, 0-weights allows\nequality-nodes to ignore inactive constraints, traversing the search space much faster.\nFinally, notice that all minimizers can operate simultaneously, and the same is true for the consensus\ncalculation performed by each equality-node. The algorithm is thus easy to parallelize.\n\n2Xj\n\n3 Global trajectory planning\n\nWe now turn to describing our decomposition of the global trajectory planning optimization problem\nin detail. We begin by de\ufb01ning the variables to be optimized in our optimization problem.\nIn\n\n1These are the step-size and \u21e20 constants. See Section B in the supplementary material for more detail.\n\n3\n\n\four formulation, we are not tracking the points of the trajectories by a continuous-time variable\ntaking values in [0, T ]. Rather, our variables are the positions {xi(s)}i2[p], where the trajectories\nare indexed by i and break-points are indexed by a discrete variable s taking values between 1 and\n\u2318  1. Note that {xi(0)}i2[p] and {xi(\u2318)}i2[p] are the initial and \ufb01nal con\ufb01guration, sets of \ufb01xed\nvalues, not variables to optimize.\n\n3.1 Formulation as unconstrained optimization without static obstacles\nIn terms of these variables, the non-collision constraints2 are\n\n(2)\n\nk(\u21b5xi(s + 1) + (1  \u21b5)xi(s))  (\u21b5xj(s + 1) + (1  \u21b5)xj(s))k  ri + rj,\nfor all i, j 2 [p], s 2{ 0, ...,\u2318  1} and \u21b5 2 [0, 1].\n\nThe parameter \u21b5 is used to trace out the constant-velocity trajectories of agents i and j between\nbreak-points s + 1 and s. The parameter \u21b5 has no units, it is a normalized time rather than an\nabsolute time. If t1 is the absolute time of the break-point with integer index s and t2 is the absolute\ntime of the break-point with integer index s + 1 and t parametrizes the trajectories in absolute time\nthen \u21b5 = (t  t1)/(t2  t1). Note that in the above formulation, absolute time does not appear, and\nany solution is simply a set of paths that, when travelled by each agent at constant velocity between\nbreak-points, leads to no collisions. When converting this solution into trajectories parameterized\nby absolute time, the break-points do not need to be chosen uniformly spaced in absolute time.\nThe constraints represented in (2) can be formally incorporated into an unconstrained optimization\nproblem as follows. We search for a solution to the problem:\n\nmin\n\n{xi(s)}i,s\n\nf cost({xi(s)}i,s) +\n\nn1Xs=0Xi>j\n\nf coll\nri,rj (xi(s), xi(s + 1), xj(s), xj(s + 1)),\n\n(3)\n\nwhere {xi(0)}p and {xi(\u2318)}p are constants rather than optimization variables, and where the func-\ntion f cost is a function that represents some cost to be minimized (e.g. the integrated kinetic energy\nor the maximum velocity over all the agents) and the function f coll\n\nr,r0 is de\ufb01ned as,\n\nf coll\nr,r0(x, x, x0, x0) = J(k\u21b5(x  x0) + (1  \u21b5)(x  x0)k  r + r0 8\u21b5 2 [0, 1]).\n\n(4)\nIn this section, x and x represent the position of an arbitrary agent of radius r at two consecutive\nbreak-points and x0 and x0 the position of a second arbitrary agent of radius r0 at the same break-\npoints. In the expression above J(.) takes the value 0 whenever its argument, a clause, is true and\ntakes the value +1 otherwise.\nr,r0 whenever there is a\ncollision, and we pay zero otherwise.\nIn (3) we can set f cost(.), to enforce a preference for trajectories satisfying speci\ufb01c properties. For\nexample, we might prefer trajectories for which the total kinetic energy spent by the set of agents is\nsmall. In this case, de\ufb01ning f cost\n\nIntuitively, we pay an in\ufb01nite cost in f coll\n\nC (x, x) = Ckx  xk2, we have,\n\nf cost({xi(s)}i,s) =\n\n1\npn\n\npXi=1\n\nn1Xs=0\n\nf cost\nCi,s(xi(s), xi(s + 1)).\n\n(5)\n\nwhere the coef\ufb01cients {Ci,s} can account for agents with different masses, different absolute-time\nintervals between-break points or different preferences regarding which agents we want to be less\nactive and which agents are allowed to move faster.\nMore simply, we might want to exclude trajectories in which agents move faster than a certain\namount, but without distinguishing among all remaining trajectories. For this case we can write,\n\n(6)\nIn this case, associating each break-point to a time instant, the coef\ufb01cients {Ci,s} in expression (5)\nwould represent different limits on the velocity of different agents between different sections of the\ntrajectory. If we want to force all agents to have a minimum velocity we can simply reverse the\ninequality in (6).\n\nf cost\nC (x, x) = J(kx  xk \uf8ff C).\n\n2We replaced the strict inequality in the condition for non-collision by a simple inequality \u201c\u201d to avoid\ntechnicalities in formulating the optimization problem. Since the agents are round, this allows for a single point\nof contact between two agents and does not reduce practical relevance.\n\n4\n\n\f3.2 Formulation as unconstrained optimization with static obstacles\n\nIn many scenarios agents should also avoid collisions with static obstacles. Given two points in\nspace, xL and xR, we can forbid all agents from crossing the line segment from xL to xR by adding\nxL,xR,ri(xi(s), xi(s + 1)). We recall that ri is\n\ns=0 f wall\n\nthe following term to the function (3):Pp\n\nthe radius of agent i and\n\ni=1Pn1\n\nf wall\nxL,xR,r(x, x) = J(k(\u21b5x + (1  \u21b5)x)  (xR + (1  )xL)k  r for all \u21b5,  2 [0, 1]).\n\n(7)\n\nNotice that f coll can be expressed using f wall. In particular,\n\nf coll\nr,r0(x, x, x0, x0) = f wall\n\n0,0,r+r0(x0  x, x0  x).\n\n(8)\nWe use this fact later to express the minimizer associated with agent-agent collisions using the\nminimizer associated with agent-obstacle collisions.\nWhen agents move in the plane, i.e. xi(s) 2 R2 for all i 2 [p] and s+1 2 [\u2318 +1], being able to avoid\ncollisions with a general static line segment allows to automatically avoid collisions with multiple\nstatic obstacles of arbitrary polygonal shape. Our numerical experiments only consider agents in the\nplane and so, in this paper, we only describe the minimizer block for wall collision for a 2D world.\nIn higher dimensions, different obstacle primitives need to be considered.\n\n3.3 Message-passing formulation\n\nTo solve (3) using the TWA, we need to specify the topology of the bipartite graph associated with\nthe !\u21e2 -\nthe unconstrained formulation (3) and the operation performed by every minimizer, i.e.\nweight update logic and x-variable update equations. We postpone describing the choice of initial\nvalues and internal parameters until Section 4.\nWe \ufb01rst describe the bipartite graph. To be concrete, let us assume that the cost functional has the\nform of (5). The unconstrained formulation (3) then tells us that the global objective function is\nthe sum of \u2318p(p + 1)/2 terms: \u2318p(p  1)/2 functions f coll and \u2318p functions f cost\nC . These functions\ninvolve a total of (\u2318 + 1)p variables out of which only (\u2318  1)p are free (since the initial and \ufb01nal\ncon\ufb01gurations are \ufb01xed). Correspondingly, the bipartite graph along which messages are passed has\n\u2318p(p+1)/2 minimizer-nodes that connect to the (\u2318 +1)p equality-nodes. In particular, the equality-\nnode associated with the break-point variable xi(s), \u2318> s > 0, is connected to 2(p  1) different\ngcoll minimizer-nodes and two different gcost\nC minimizer-nodes. If s = 0 or s = \u2318 the equality-node\nonly connects to half as many gcoll nodes and gcost\nWe now describe the different minimizers. Every minimizer basically is a special case of (1).\n\nC nodes.\n\n3.3.1 Agent-agent collision minimizer\nWe start with the minimizer associated with the functions f coll, that we denoted by gcoll. This mini-\nmizer receives as parameters the radius, r and r0, of the two agents whose collision it is avoiding. The\nminimizer takes as input a set of incoming n-messages, {n, n, n0, n0}, and associated \u21e2 -weights,\n{ \u21e2 , \u21e2, \u21e2 0, \u21e2 0}, and outputs a set of updated x-variables according to expression (9). Messages n\nand n come from the two equality-nodes associated with the positions of one of the agents at two\nconsecutive break-points and n0 and n0 from the corresponding equality-nodes for the other agent.\n\nf coll\nr,r0(x, x, x0, x0)\n\ngcoll(n, n, n0, n0, \u21e2 , \u21e2, \u21e2 0, \u21e2 0, r, r0) = arg min\n+ \u21e2\n\n \u21e2\n2 kx  nk2 + \u21e2 0\n\n{x,x,x0,x0}\n \u21e2 0\n2 kx0  n0k2.\n\n2 kx  nk2 +\n\n2 kx0  n0k2 +\n\n(9)\nThe update logic for the weights !\u21e2 for this minimizer is simple. If the trajectory from n to n for\nan agent of radius r does not collide with the trajectory from n0 to n0 for an agent of radius r0 then\nset all the outgoing weights !\u21e2 to zero. Otherwise set them all to \u21e20. The outgoing zero weights\nindicate to the receiving equality-nodes in the bipartite graph that the collision constraint for this\npair of agents is inactive and that the values it receives from this minimizer-node should be ignored\nwhen computing the consensus values of the receiving equality-nodes.\nThe solution to (9) is found using the agent-obstacle collision minimizer that we describe next.\n\n5\n\n\f3.3.2 Agent-obstacle collision minimizer\nThe minimizer for f wall is denoted by gwall. It is parameterized by the obstacle position {xL, xR}\nIt receives two n-messages,\nas well as the radius of the agent that needs to avoid the obstacle.\n{n, n}, and corresponding weights { \u21e2 , \u21e2 }, from the equality-nodes associated with two consecu-\ntive positions of an agent that needs to avoid the obstacle. Its output, the x-variables, are de\ufb01ned as\n\ngwall(n, n, r, xL, xR, \u21e2 , \u21e2 ) = arg min\n{x,x}\n\nxL,xR,r(x, x) + \u21e2\nf wall\n\n2 kx  nk2 +\n\n \u21e2\n2 kx  nk2.\n\n(10)\n\nWhen agents move in the plane (2D), this minimizer can be solved by reformulating the optimiza-\ntion in (10) as a mechanical problem involving a system of springs that we can solve exactly and\nef\ufb01ciently. This reduction is explained in the supplementary material in Section D and the solution\nto the mechanical problem is explained in Section I.\nThe update logic for the !\u21e2 -weights is similar to that of the gcoll minimizer. If an agent of radius\nr going from n and n does not collide with the line segment from xL to xR then set all outgoing\nweights to zero because the constraint is inactive; otherwise set all the outgoing weights to \u21e20.\nNotice that, from (8), it follows that the agent-agent minimizer gcoll can be expressed using gwall.\nMore concretely, as proved in the supplementary material, Section C,\n\ngcoll(n, n, n0, n0, \u21e2 , \u21e2, \u21e2 0, \u21e2 0, r, r0) = M2gwall\u21e3M1.{n, n, n0, n0, \u21e2 , \u21e2, \u21e2 0, \u21e2 0, r, r0}\u2318 ,\nfor a constant rectangular matrix M1 and a matrix M2 that depend on {n, n, n0, n0, \u21e2 , \u21e2, \u21e2 0, \u21e2 0}.\n3.3.3 Minimum energy and maximum (minimum) velocity minimizer\nWhen f cost can be decomposed as in (5), the minimizer associated with the functions f cost is denoted\nby gcost and receives as input two n-messages, {n, n}, and corresponding weights, { \u21e2 , \u21e2 }. The\nmessages come from two equality-nodes associated with two consecutive positions of an agent. The\nminimizer is also parameterized by a cost factor c. It outputs a set of updated x-messages de\ufb01ned as\n\ngcost(n, n, \u21e2 , \u21e2, c ) = arg min\n{x,x}\n\nf cost\nc\n\n(x, x) + \u21e2\n\n2 kx  nk2 +\n\n \u21e2\n2 kx  nk2.\n\n(11)\n\nThe update logic for the !\u21e2 -weights of the minimum energy minimizer is very simply: always set all\noutgoing weights !\u21e2 to \u21e20. The update logic for the !\u21e2 -weights of the maximum velocity minimizer\nis the following. If kn  nk \uf8ff c set all outgoing weights to zero. Otherwise, set them to \u21e20. The\nupdate logic for the minimum velocity minimizer is similar. If kn  nk  c, set all the !\u21e2 -weights\nto zero. Otherwise set them to \u21e20.\nThe solution to the minimum energy, maximum velocity and minimum velocity minimizer is written\nin the supplementary material in Sections E, F, and G respectively.\n\n4 Numerical results\n\nWe now report on the performance of our algorithm (see Appendix J for an important comment on\nthe anytime properties of our algorithm). Note that the lack of open-source scalable algorithms for\nglobal trajectory planning in the literature makes it dif\ufb01cult to benchmark our performance against\nother methods. Also, in a paper it is dif\ufb01cult to appreciate the gracefulness of the discovered trajec-\ntory optimizations, so we include a video in the supplementary material that shows \ufb01nal optimized\ntrajectories as well as intermediate results as the algorithm progresses for a variety of additional\nscenarios, including those with obstacles. All the tests described here are for agents in a two-\ndimensional plane. All tests but the last were performed using six cores of a 3.4GHz i7 CPU.\nThe different tests did not require any special tuning of parameters. In particular, the step-size in\n[13] (their \u21b5 variable) is always 0.1. In order to quickly equilibrate the system to a reasonable set of\nvariables and to wash out the importance of initial conditions, the default weight \u21e20 was set equal to\na small value (\u2318p \u21e5 105) for the \ufb01rst 20 iterations and then set to 1 for all further iterations.\n\n6\n\n\f)\nc\ne\ns\n(\n\ne\nm\n\ni\nt\n\ne\nc\nn\ne\ng\nr\ne\nv\nn\no\nC\n\n2000\n\n1500\n\n1000\n\n500\n\n0\n\n\u2021\n\u00ca\n\u2021\n\u00ca\n\u00ca\n\u20210\n\n\u2021\n\u2021\n\u2021\n\u00ca\n\u00ca\n\u00ca\n\n20\n\n\u2318 = 8\n\n\u2021\n\n\u2021\n\n\u2021\n\u00ca\n\n\u2021\n\n\u2021\n\n\u00ca\n\u2021\n\u00ca\n\u00ca\n\n\u00ca\n\u00ca\n\n\u2318 = 6\n\n\u2021\n\n\u2021\n\u2318 = 4\n\u00ca\n\n\u2318 = 8\n\n\u2318 = 6\n\u00ca\n\n\u2318 = 4\n\u00ca\n\n100\n\n\u2021\n\u00ca\n\n\u00ca\n\n\u00ca\n\ns\ne\nc\nn\ne\nr\nr\nu\nc\nc\no\n\nf\no\n\nr\ne\nb\nm\nu\nN\n\n25\n\n20\n\n15\n\n10\n\n5\n\n0\n100\n\n200\n\n6\n\n7\n\n2500\n\n2000\n\n1500\n\n1000\n\n500\n\n)\nc\ne\ns\n(\n\ne\nm\n\ni\nt\n\ne\nc\nn\ne\ng\nr\ne\nv\nn\no\nC\n\n\u00d9\n\n\u00da\n\np = 100\n\n\u00d9\n\n\u00d9\n\np = 80\n\u00da\n\n\u00cf\np = 60\n\u00da\n\u00cf\np \uf8ff 40\n\u2021\n\u00ca\n\n\u00cf\n\u2021\u2021\n\u00ca\u00ca\n\n5\n\n)\n2\n1\n\uf8ff\n\n(\n\ns\ne\nr\no\nc\n\nl\na\nc\ni\ns\ny\nh\nP\n\n\u00d9\n\n\u00da\n\u00cf\n\u2021\n\u00ca\n\n\u00d9\n\n\u00da\n\u00cf\n\u2021\n\u00ca\n\n\u00d9\n\n\u00da\n\u00cf\n\u2021\n\u00ca\n10\n\n\u00d9\n\n\u00da\n\u00cf\n\u2021\n\u00ca\n\n\u00d9\n\n\u00da\n\u00cf\n\u2021\n\u00ca\n\n\u00d9\n\n\u00da\n\u00cf\n\u2021\n\u00ca\n\n15\n\n\u00d9\n\n\u00da\n\u00cf\n\u2021\n\u00ca\n\n\u00d9\n\n\u00da\n\u00cf\n\u2021\n\u00ca\n20\n\nNumber of cores\n\nThe \ufb01rst test considers scenario CONF1: p (even) agents of radius r, equally spaced around on a\ncircle of radius R, are each required to exchange position with the corresponding antipodal agent,\nr = (5/4)R sin(\u21e1/2(p  4)). This is a classical dif\ufb01cult test scenario because the straight line\nmotion of all agents to their goal would result in them all colliding in the center of the circle. We\ncompare the convergence time of the TWA with a similar version using standard ADMM to perform\nthe optimizations. In this test, the algorithm\u2019s initial value for each variable in the problem was set\nto the corresponding initial position of each agent. The objective is to minimize the total kinetic\nenergy (C in the energy minimizer is set to 1). Figure 2-left shows that the TWA scales better with\np than classic ADMM and typically gives an order of magnitude speed-up. Please see Appendix K\nfor a further comment on the scaling of the convergence time of ADMM and TWA with p.\n\n\u00ca\n\n\u00ca\n\n301\n\n2\n\nConvergence time (sec)\n5\n\n3\n\n4\n\n60\n\n40\n80\nNumber of agents, p\n\n300\n\n600\nObjective value of trajectories\n\n400\n\n500\n\n700\n\n0\n\n0\n\nFigure 2: Left: Convergence time using standard ADMM (dashed lines) and using TWA (solid\nlines). Middle: Distribution of total energy and time for convergence with random initial conditions\n(p = 20 and \u2318 = 5). Right: Convergence time using a different number of cores (\u2318 = 5).\n\nThe second test for CONF1 analyzes the sensitivity of the convergence time and objective value\nwhen the variables\u2019 value at the \ufb01rst iteration are chosen uniformly at random in the smallest space-\ntime box that includes the initial and \ufb01nal con\ufb01guration of the robots. Figure 2-middle shows that,\nalthough there is some spread on the convergence time, our algorithm seems to reliably converge to\nrelatively similar-cost local minima (other experiments show that the objective value of these minima\nis around 5 times smaller than that found when the algorithm is run using only the collision avoidance\nminimizers without a kinetic energy cost term). As would be expected, the precise trajectories found\nvary widely between different random runs.\nStill for CONF1, and \ufb01xed initial conditions, we parallelize our method using several cores of\na 2.66GHz i7 processor and a very primitive scheduling/synchronization scheme. Although this\nscheme does not fully exploit parallelization, Figure 2-right does show a speed-up as the number\nof cores increases and the larger p is, the greater the speed-up. We stall when we reach the twelve\nphysical cores available and start using virtual cores.\nFinally, Figure 3-left compares the convergence time to optimize the total energy with the time to\nsimply \ufb01nd a feasible (i.e. collision-free) solution. The agents initial and \ufb01nal con\ufb01guration is\nrandomly chosen in the plane (CONF2). Error bars indicate \u00b1 one standard deviation. Minimizing\nthe kinetic energy is orders of magnitude computationally more expensive than \ufb01nding a feasible\nsolution, as is clear from the different magnitude of the left and right scale of Figure 3-left.\n\n)\nc\ne\ns\n(\n\ne\nl\nb\ni\ns\na\ne\nF\n\ne\nm\n\ni\nt\n\ne\nc\nn\ne\ng\nr\ne\nv\nn\no\nC\n\n12\n\n10\n\n8\n\n6\n\n4\n\n2\n\n0\n\n\u2318 = 8\n\n\u2318 = 8\n\n\u2318 = 6\n\n\u2318 = 4\n\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\n\u00ca\n\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n60\n\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n40\n\n0\n\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n20\n\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\u00ca\n\nNumber of agents, p\n\n\u00ca\n\n\u00ca\n\n\u00ca\n\n\u00ca\n\u00ca\n\n\u2318 = 6\n\n\u00ca\n\n\u00ca\n\n\u00ca\n\u00ca\n\u00ca\n\n\u00ca\n\n\u00ca\n\n\u2318 = 4\n80\n\n\u00ca\n\n\u00ca\n\n\u00ca\n\n\u00ca\n\u00ca\n\n1800\n\n1500\n\n1200\n\n900\n\n600\n\n300\n\n\u00ca\n\n0\n\n100\n\nC\no\nn\nv\ne\nr\ng\ne\nn\nc\ne\n\nt\ni\n\nm\ne\n\n(\ns\ne\nc\n)\n\nM\ni\nn\ni\nm\nu\nm\ne\nn\ne\nr\ng\ny\n\n)\nc\ne\ns\n(\n\nh\nc\no\np\ne\n\ne\nn\no\n\n,\ne\nm\n\ni\nt\n\ne\nc\nn\ne\ng\nr\ne\nv\nn\no\nC\n\n3.0\n\n2.5\n\n2.0\n\n1.5\n\n1.0\n\n0.5\n\n0.0\n\n8\n\n10* 12\n\n14* 16\n\nPink: MIQP\nLight blue: TWA\n\n24* 24\n\n18* 20\nNumber of agents, p\n\n30* 32\n\n40* 40\n\n50* 52\n\nFigure 3: Left: Convergence time when minimizing energy (blue scale/dashed lines) and to simply\n\ufb01nd a feasible solution (red scale/solid lines). Right: (For Section 5). Convergence-time distribution\nfor each epoch using our method (blue bars) and using the MIQP of [12] (red bars and star-values).\n\n7\n\n\f5 Local trajectory planning based on velocity obstacles\n\nIn this section we show how the joint optimization presented in [12], which is based on the concept\nof velocity obstacles [11] (VO), can be also solved via the message-passing TWA. In VO, given\nthe current position {xi(0)}i2[p] and radius {ri} of all agents, a new velocity command is computed\njointly for all agents minimizing the distance to their preferred velocity {vref\ni }i2[p]. This new velocity\ncommand must guarantee that the trajectories of all agents remain collision-free for at least a time\nhorizon \u2327. New collision-free velocities are computed every \u21b5\u2327 seconds, \u21b5< 1, until all agents\nreach their \ufb01nal con\ufb01guration. Following [12], and assuming an obstacle-free environment and \ufb01rst\norder dynamics, the collision-free velocities are given by,\nminimize\n\n{vi}i2[p] Xi2[p]\ni k2 s.t. k(xi(0) + vit)  (xj(0) + vjt)k  ri + rj 8 i 2 [p], t 2 [0,\u2327 ].\nSince the velocities {vi}i2[p] are related linearly to the \ufb01nal position of each object after \u2327 seconds,\n{xi(\u2327 )}i2[p], a simple change of variables allows us to reformulate the above problem as,\n\nCikvi  vref\n\nC0ikxi  xref\ni k2\n\n{xi}i2[p] Xi2[p]\ns.t. k(1  \u21b5)(xi(0)  xj(0)) + \u21b5(xi  xj)k  ri + rj 8 j > i 2 [p],\u21b5 2 [0, 1]\n\nf wall\nxRk,xLk,ri(xi(0), xi).\n\nri,rj (xi(0), xi, xj(0), xj) +Xi2[p]Xk\n\n(12)\nwhere C0i = Ci/\u2327 2, xref\ni \u2327 and we have dropped the \u2327 in xi(\u2327 ). The above problem,\nextended to account for collisions with the static line segments {xRk, xLk}k, can be formulated in\nan unconstrained form using the functions f cost, f coll and f wall. Namely,\ni ) +Xi>j\n{xi}iXi2[p]\n(13)\nmin\nNote that {xi(0)}i and {xref\ni }i are constants, not variables being optimized. Given this formulation,\nthe TWA can be used to solve the optimization. All corresponding minimizers are special cases\nof minimizers derived in the previous section for global trajectory planning (see Section H in the\nsupplementary material for details). Figure 3-right shows the distribution of the time to solve (12)\nfor CONF1. We compare the mixed integer quadratic programming (MIQP) approach from [12]\nwith ours. Our method \ufb01nds a local minima of exactly (13), while [12] \ufb01nds a global minima of\nan approximation to (13). Speci\ufb01cally, [12] requires approximating the search domain by hyper-\nplanes and an additional branch-and-bound algorithm while ours does not. Both approaches use a\nmechanism for breaking the symmetry from CONF1 and avoid deadlocks: theirs uses a preferential\nrotation direction for agents, while we use agents with slightly different C coef\ufb01cients in their en-\nergy minimizers (Cith agent = 1 + 0.001i). Both simulations were done on a single 2.66GHz core.\nThe results show the order of magnitude is similar, but, because our implementation is done in Java\nwhile [12] uses Matlab-mex interface of CPLEX 11, the results are not exactly comparable.\n\ni = xi(0) + vref\n\nminimize\n\nf cost\nC0i\n\n(xi, xref\n\nf coll\n\n6 Conclusion and future work\n\nWe have presented a novel algorithm for global and local planning of the trajectory of multiple\ndistinct agents, a problem known to be hard. The solution is based on solving a non-convex opti-\nmization problem using TWA, a modi\ufb01ed ADMM. Its similarity to ADMM brings scalability and\neasy parallelization. However, using TWA improves performance considerably. Our implementa-\ntion of the algorithm in Java on a regular desktop computer, using a basic scheduler/synchronization\nover its few cores, already scales to hundreds of agents and achieves real-time performance for local\nplanning.\nThe algorithm can \ufb02exibly account for obstacles and different cost functionals. For agents in the\nplane, we derived explicit expressions that account for static obstacles, moving obstacles, and dy-\nnamic constraints on the velocity and energy. Future work should consider other restrictions on the\nsmoothness of the trajectory (e.g. acceleration constraints) and provide fast solvers to our minimiz-\ners for agents in 3D.\nThe message-passing nature of our algorithm hints that it might be possible to adapt our algorithm\nto do planning in a decentralized fashion. For example, minimizers like gcoll could be solved by\nmessage exchange between pairs of agents within a maximum communication radius. It is an open\nproblem to build a practical communication-synchronization scheme for such an approach.\n\n8\n\n\fReferences\n[1] Javier Alonso-Mora, Andreas Breitenmoser, Martin Ru\ufb02i, Roland Siegwart, and Paul Beardsley. Image\n\nand animation display with multiple mobile robots. 31(6):753\u2013773, 2012.\n\n[2] Peter R. Wurman, Raffaello D\u2019Andrea, and Mick Mountz. Coordinating hundreds of cooperative, au-\n\ntonomous vehicles in warehouses. AI Magazine, 29(1):9\u201319, 2008.\n\n[3] Stephen J. Guy, Jatin Chhugani, Changkyu Kim, Nadathur Satish, Ming Lin, Dinesh Manocha, and\nPradeep Dubey. Clearpath: highly parallel collision avoidance for multi-agent simulation. In Proceedings\nof the 2009 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 177\u2013187, 2009.\n[4] John H. Reif. Complexity of the mover\u2019s problem and generalizations. In IEEE Annual Symposium on\n\nFoundations of Computer Science, pages 421\u2013427, 1979.\n\n[5] John E. Hopcroft, Jacob T. Schwartz, and Micha Sharir. On the complexity of motion planning for\nmultiple independent objects; pspace-hardness of the \u201dwarehouseman\u2019s problem\u201d. The International\nJournal of Robotics Research, 3(4):76\u201388, 1984.\n\n[6] Maren Bennewitz, Wolfram Burgard, and Sebastian Thrun. Finding and optimizing solvable priority\nschemes for decoupled path planning techniques for teams of mobile robots. Robotics and Autonomous\nSystems, 41(2\u20133):89\u201399, 2002.\n\n[7] Daniel Mellinger, Alex Kushleyev, and Vijay Kumar. Mixed-integer quadratic program trajectory genera-\ntion for heterogeneous quadrotor teams. In IEEE International Conference on Robotics and Automation,\npages 477\u2013483, 2012.\n\n[8] Federico Augugliaro, Angela P. Schoellig, and Raffaello D\u2019Andrea. Generation of collision-free trajec-\ntories for a quadrocopter \ufb02eet: A sequential convex programming approach. In IEEE/RSJ International\nConference on Intelligent Robots and Systems, pages 1917\u20131922, 2012.\n\n[9] Steven M. LaValle and James J. Kuffner. Randomized kinodynamic planning. The International Journal\n\nof Robotics Research, 20(5):378\u2013400, 2001.\n\n[10] Oussama Khatib. Real-time obstacle avoidance for manipulators and mobile robots. The International\n\nJournal of Robotics Research, 5(1):90\u201398, 1986.\n\n[11] Paolo Fiorini and Zvi Shiller. Motion planning in dynamic environments using velocity obstacles. The\n\nInternational Journal of Robotics Research, 17(7):760\u2013772, 1998.\n\n[12] Javier Alonso-Mora, Martin Ru\ufb02i, Roland Siegwart, and Paul Beardsley. Collision avoidance for multiple\nagents with joint utility maximization. In IEEE International Conference on Robotics and Automation,\n2013.\n\n[13] Nate Derbinsky, Jos\u00b4e Bento, Veit Elser, and Jonathan S. Yedidia. An improved three-weight message-\n\npassing algorithm. arXiv:1305.1961 [cs.AI], 2013.\n\n[14] G. David Forney Jr. Codes on graphs: Normal realizations. IEEE Transactions on Information Theory,\n\n47(2):520\u2013548, 2001.\n\n[15] Sertac Karaman and Emilio Frazzoli. Incremental sampling-based algorithms for optimal motion plan-\n\nning. arXiv preprint arXiv:1005.0416, 2010.\n\n[16] R. Glowinski and A. Marrocco. Sur l\u2019approximation, par \u00b4el\u00b4ements \ufb01nis d\u2019ordre un, et la r\u00b4esolution, par\np\u00b4enalisization-dualit\u00b4e, d\u2019une class de probl`ems de Dirichlet non lin\u00b4eare. Revue Franc\u00b8aise d\u2019Automatique,\nInformatique, et Recherche Op\u00b4erationelle, 9(2):41\u201376, 1975.\n\n[17] Daniel Gabay and Bertrand Mercier. A dual algorithm for the solution of nonlinear variational problems\n\nvia \ufb01nite element approximation. Computers & Mathematics with Applications, 2(1):17\u201340, 1976.\n\n[18] Hugh Everett III. Generalized lagrange multiplier method for solving problems of optimum allocation of\n\nresources. Operations Research, 11(3):399\u2013417, 1963.\n\n[19] Magnus R. Hestenes. Multiplier and gradient methods. Journal of Optimization Theory and Applications,\n\n4(5):303\u2013320, 1969.\n\n[20] Magnus R. Hestenes. Multiplier and gradient methods. In L.A. Zadeh et al., editor, Computing Methods\n\nin Optimization Problems 2. Academic Press, New York, 1969.\n\n[21] M.J.D. Powell. A method for nonlinear constraints in minimization problems.\n\nOptimization. Academic Press, London, 1969.\n\nIn R. Fletcher, editor,\n\n[22] Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein. Distributed optimization\nand statistical learning via the alternating direction method of multipliers. Foundations and Trends in\nMachine Learning, 3(1):1\u2013122, 2011.\n\n9\n\n\f", "award": [], "sourceid": 341, "authors": [{"given_name": "Jos\u00e9", "family_name": "Bento", "institution": "Disney Research"}, {"given_name": "Nate", "family_name": "Derbinsky", "institution": "Disney Research"}, {"given_name": "Javier", "family_name": "Alonso-Mora", "institution": "Disney Research"}, {"given_name": "Jonathan", "family_name": "Yedidia", "institution": "Disney Research"}]}