{"title": "The Information-Form Data Association Filter", "book": "Advances in Neural Information Processing Systems", "page_first": 1193, "page_last": 1200, "abstract": null, "full_text": "The Information-Form Data Association Filter\n\nBrad Schumitsch, Sebastian Thrun, Gary Bradski, and Kunle Olukotun\n\nStanford AI Lab\n\nStanford University, Stanford, CA 94305\n\nAbstract\n\nThis paper presents a new \ufb01lter for online data association problems in\nhigh-dimensional spaces. The key innovation is a representation of the\ndata association posterior in information form, in which the \u201cproxim-\nity\u201d of objects and tracks are expressed by numerical links. Updating\nthese links requires linear time, compared to exponential time required\nfor computing the exact posterior probabilities. The paper derives the\nalgorithm formally and provides comparative results using data obtained\nby a real-world camera array and by a large-scale sensor network simu-\nlation.\n\n1 Introduction\nThis paper addresses the problem of data association in online object tracking [6]. The data\nassociation problem arises in a large number of application domains, including computer\nvision, robotics, and sensor networks.\n\nOur setup assumes an online tracking system that receives two types of data: sensor\ndata, conveying information about the identity or type of objects that are being tracked; and\ntransition data, characterizing the uncertainty introduced through the tracker\u2019s inability to\nreliably track individual objects over time. The setup is motivated by a camera network\nwhich we recently deployed in our lab. Here sensor data relates to the color of clothing of\nindividual people, which enables us to identify them. Tracks are lost when people walk too\nclosely together, or when they occlude each other.\n\nWe show that the standard probabilistic solution to the discrete data association prob-\nlem requires exponential update time and exponential memory. 
This is because each data association hypothesis is expressed by a permutation matrix that assigns computer-internal tracks to objects in the physical world. An optimal filter would therefore need to maintain a probability distribution over the space of all permutation matrices, which grows exponentially with N, the number of objects in the world. The common remedy involves the selection of a small number K of likely hypotheses. This is the core of numerous widely-used multi-hypothesis tracking algorithms [9, 1]. More recent solutions involve particle filters [3], which maintain stochastic samples of hypotheses. Both of these techniques are very effective for small N, but the number of hypotheses they require grows exponentially with N.

This paper provides a filter algorithm that scales to much larger problems. This filter maintains an information matrix Ω of size N × N, which relates tracks to physical objects in the world. The rows of Ω correspond to object identities, the columns to the tracks of the tracker. Ω is a matrix in information form, that is, it can be thought of as a non-normalized log-probability.

Fig. 1a shows an example. The highlighted first column corresponds to track 1 in the tracker.
The numerical values in this column suggest that this track is most strongly associated with object 3, since the value 10 dominates all other values in this column.

(a) Example: Information matrix

    Ω = (  2  12   4   4
           1   2  11   0
          10   4   4  15
           5   2   1   2 )

(b) Most likely data association

    Â = argmax_A tr AᵀΩ = ( 0 1 0 0
                            0 0 1 0
                            0 0 0 1
                            1 0 0 0 )

(c) Update: Associating track 2 with object 4 (the entry linking object 4 and track 2 increases from 2 to 3)

(d) Update: Tracks 2 and 3 merge (columns 2 and 3 both become the exponentiated average (11.31, 10.31, 4, 2.43)ᵀ)

(e) Graphical network interpretation of the information form

Figure 1: Illustration of the information form filter for data association in object tracking

Thus, looking at column 1 of Ω in isolation would have us conclude that the most likely association of track 1 is object 3. However, the most likely permutation matrix is shown in Fig. 1b; from all possible data association assignments, this matrix receives the highest score. Its score is tr ÂᵀΩ = 5 + 12 + 11 + 15 = 43 (here “tr” denotes the trace of a matrix). This permutation matrix associates object 3 with track 4, while associating track 1 with object 4.

The key question now pertains to the construction of Ω. As we shall see, the update operations for Ω are simple and parallelizable.
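To make the argmax concrete, here is a minimal brute-force sketch (our own illustration, not the linear-program solver the paper uses); the matrix values are read off Fig. 1a and the helper name is ours:

```python
from itertools import permutations

# Information matrix reconstructed from Fig. 1a (rows = objects, columns = tracks).
Omega = [[2, 12, 4, 4],
         [1, 2, 11, 0],
         [10, 4, 4, 15],
         [5, 2, 1, 2]]

def best_assignment(Omega):
    # Brute-force argmax_A tr(A^T Omega): sigma maps track j -> object sigma[j].
    # Feasible only for small N; the paper solves a linear program instead.
    n = len(Omega)
    return max(permutations(range(n)),
               key=lambda sigma: sum(Omega[sigma[j]][j] for j in range(n)))

sigma = best_assignment(Omega)
score = sum(Omega[sigma[j]][j] for j in range(len(Omega)))
print(sigma, score)  # (3, 0, 1, 2) 43: track 1 -> object 4, track 4 -> object 3
```

Note that the greedy per-column choice (track 1 to object 3) is not part of the optimum, exactly as discussed above.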
Suppose we receive a measurement that associates track 2 with object 4 (e.g., track 2’s hair color appears to be the same as person 4’s hair color in our camera array). As a result, our approach adds a value to the element in Ω that links object 4 and track 2, as illustrated in Fig. 1c (the exact magnitude of this value will be discussed below). Similarly, suppose our tracker is unable to distinguish between objects 2 and 3, perhaps because these objects are so close together in a camera image that they cannot be tracked individually. Such a situation leads to a new information matrix, in which both columns assume the same values, as illustrated in Fig. 1d. The exact values in this new information matrix are the result of an exponentiated averaging explained below. All of these updates are easily parallelized, and hence are applicable to a decentralized network of cameras. The exact update and inference rules are based on a probabilistic model that is also discussed below.

Given the importance of data association, it comes as no surprise that our algorithm is related to a rich body of prior work. The data association problem has been studied as an offline problem, in which all data is memorized and inference takes place after data collection. There exists a wealth of powerful methods, such as RANSAC [4] and MCMC [6, 2], but those are inherently offline and their memory requirements increase over time. The dominant online, or filter, paradigm involves the selection of K representative samples of the data association matrix, but such algorithms tend to work only for small N [11]. Relatively little work has focused on the development of compact sufficient statistics for data association. One alternative O(N^2) technique to the one proposed here was explored in [8]. This technique uses doubly stochastic matrices, which are computationally hard to maintain.
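The exponentiated averaging behind the Fig. 1d merge can be sketched in a few lines; the function name is ours, and the column values are read off the figure:

```python
import math

# Exponentiated averaging of two information columns, as in the Fig. 1d merge:
# each entry becomes log(0.5 * (exp(a) + exp(b))).
def merge_columns(col_a, col_b):
    return [math.log(0.5 * (math.exp(a) + math.exp(b))) for a, b in zip(col_a, col_b)]

# Columns 2 and 3 of the example matrix (values read off Fig. 1, after the
# Fig. 1c update has raised the object-4/track-2 entry to 3).
col2, col3 = [12, 2, 4, 3], [4, 11, 4, 1]
merged = merge_columns(col2, col3)
print([round(v, 2) for v in merged])  # [11.31, 10.31, 4.0, 2.43]
```

Both merged columns receive this same vector, reflecting that the tracker can no longer tell the two tracks apart.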
The first mention of information filters is in [8], but the update rules there were computationally less efficient (in O(N^4)) and required central optimization.

The work in this paper does not address the continuous-valued aspects of object tracking. Those are very well understood, and information representations have been successfully applied [5, 10].

Information representations are popular in the field of graphical networks. Our approach can be viewed as a learning algorithm for a Markov network [7] of a special topology, where any track and any object are connected by an edge. Such a network is shown in Fig. 1e. The filter update equations manipulate the strength of the edges based on data.

2 Problem Setup and Bayes Filter Solution

We begin with a formal definition of the data association problem and derive the obvious but inefficient Bayes filter solution. Throughout this paper, we make the closed world assumption, that is, there are always the same N known objects in the world.

2.1 Data Association
We assume that we are given a tracking algorithm that maintains N internal tracks of the moving objects. Due to insufficient information, this assumed tracking algorithm does not always know the exact mapping of identities to internal tracks. Hence, the same internal track may correspond to different identities at different times.

The data association problem is the problem of assigning these N tracks to N objects. Each data association hypothesis is characterized by a permutation matrix of the type shown in Fig. 1b. The columns of this matrix correspond to the internal tracks, and the rows to the objects. We will denote the data association matrix by A (not to be confused with the information matrix Ω). In our closed world, A is always a permutation matrix; hence all elements are 0 or 1.
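The closed-world constraint on A is easy to state as code; a small validity check, under our own list-of-lists conventions:

```python
# Closed-world hypothesis check: a data association matrix A must be a square
# 0/1 matrix with exactly one 1 in every row and every column.
def is_permutation_matrix(A):
    n = len(A)
    if any(len(row) != n for row in A):
        return False
    if any(v not in (0, 1) for row in A for v in row):
        return False
    rows_ok = all(sum(row) == 1 for row in A)
    cols_ok = all(sum(A[i][j] for i in range(n)) == 1 for j in range(n))
    return rows_ok and cols_ok

A_hat = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]  # Fig. 1b
print(is_permutation_matrix(A_hat))  # True
```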
There are exponentially many permutation matrices, which is a reason why data association is considered a hard problem.

2.2 Identity Measurement
The correct data association matrix A is unobservable. Instead, the sensors produce local information about the relation of individual tracks to individual objects. We will denote sensor measurements by z_j, where j is the index of the corresponding track. Each z_j = {z_ij} specifies a local probability distribution in the corresponding object space:

    p(x_i = y_j | z_j) = z_ij   with   Σ_i z_ij = 1                          (1)

Here x_i is the i-th object in the world, and y_j is the j-th track.

The measurement in our introductory example (see Fig. 1c) was of a special form, in that it elevated one specific correspondence over the others. This occurs when z_ij = α for some α ≈ 1, and z_kj = (1 − α)/(N − 1) for all k ≠ i. Such a measurement arises when the tracker receives evidence that a specific track y_j corresponds with high likelihood to a specific object x_i. Specifically, the measurement likelihood of this correspondence is α, and the error probability is 1 − α.

2.3 State Transitions
As time passes by, our tracker may confuse tracks, which is a loss of information with respect to the data association. The tracker confusing two objects amounts to a random flip of two columns in the data association matrix A.

The model adopted in this paper generalizes this example to arbitrary distributions over permutations of the columns in A. Let {B_1, ..., B_M} be a set of permutation matrices, and {β_1, ..., β_M} with Σ_m β_m = 1 be a set of associated probabilities. The “true” permutation matrix undergoes a random transition from A to A B_m with probability β_m:

    A  --(prob. β_m)-->  A B_m                                               (2)

The sets {B_1, ..., B_M} and {β_1, ..., β_M} are given to us by the tracker. For the example in Fig.
1d, in which tracks 2 and 3 merge, the following two permutation matrices will implement such a merge:

    B_1 = ( 1 0 0 0 ; 0 1 0 0 ; 0 0 1 0 ; 0 0 0 1 ),  β_1 = 0.5
    B_2 = ( 1 0 0 0 ; 0 0 1 0 ; 0 1 0 0 ; 0 0 0 1 ),  β_2 = 0.5             (3)

The first such matrix leaves the association unchanged, whereas the second swaps columns 2 and 3. Since β_1 = β_2 = 0.5, such a swap happens exactly with probability 0.5.

2.4 Inefficient Bayesian Solution
For small N, the data association problem now has an obvious Bayes filter solution. Specifically, let A be the space of all permutation matrices. The Bayesian filter solves the identity tracking problem by maintaining a probabilistic belief over the space of all permutation matrices A ∈ A. For each A, it maintains a posterior probability denoted p(A). This probability is updated in two different ways, reminiscent of the measurement and state transition updates in DBNs and EKFs.

The measurement step updates the belief in response to a measurement z_j. This update is an application of Bayes rule:

    p(A) ← (1/L) p(A) Σ_i a_ij z_ij                                          (4)

    with L = Σ_Ā p(Ā) Σ_i ā_ij z_ij                                          (5)

Here a_ij denotes the ij-th element of the matrix A. Because A is a permutation matrix, only one element in the sum over i is non-zero (hence there is not really a summation here).

The state transition updates the belief in accordance with the permutation matrices B_m and associated probabilities β_m (see Eq. 2):

    p(A) ← Σ_m β_m p(A B_mᵀ)                                                 (6)

We use here that the inverse of a permutation matrix is its transpose.

This Bayesian filter is an exact solution to our identity tracking problem. Its problem is complexity: there are N! permutation matrices A, and we have to compute probabilities for all of them.
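For tiny N, the exact filter of Eqs. 4-6 can be implemented directly; a sketch under our own conventions (a hypothesis is a tuple sigma mapping track j to object sigma[j], and the column shuffles are encoded as tuples rather than matrices):

```python
from itertools import permutations
import math

# Exact Bayes filter over all N! hypotheses (tractable only for tiny N).
# A hypothesis sigma maps track j -> object sigma[j]; we start uniform.
N = 3
belief = {s: 1.0 / math.factorial(N) for s in permutations(range(N))}

def measurement_update(belief, j, z):
    # Eq. 4: reweight each hypothesis by z[i] for the object i it assigns to track j.
    new = {s: p * z[s[j]] for s, p in belief.items()}
    total = sum(new.values())  # the normalizer L of Eq. 5
    return {s: p / total for s, p in new.items()}

def transition_update(belief, moves):
    # Eq. 6: moves is a list of (beta, tau) pairs; tau tells each track slot j
    # which old slot its column comes from (one way to encode the B_m).
    new = {s: 0.0 for s in belief}
    for beta, tau in moves:
        for s, p in belief.items():
            new[tuple(s[tau[j]] for j in range(N))] += beta * p
    return new

belief = measurement_update(belief, j=0, z=[0.8, 0.1, 0.1])
belief = transition_update(belief, [(0.5, (0, 1, 2)), (0.5, (0, 2, 1))])
print(round(sum(belief.values()), 6))  # 1.0: the belief stays normalized
```

Both the dictionary and the update loops touch all N! hypotheses, which is exactly the exponential cost the next section removes.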
Thus, the exact filter is only applicable to problems with small N. Even if we want to keep track of K ≪ N likely permutations, as attempted by filters like the multi-hypothesis EKF or the particle filter, the required number of tracks K will generally have to scale exponentially with N (albeit at a slower rate). This exponential scaling renders the Bayesian filter ultimately inapplicable to the identity tracking problem with large N.

3 The Information-Form Solution

Our data association filter represents the posterior in condensed form, using an N × N information matrix. As a result, it requires linear update time and quadratic memory, instead of the exponential time and memory requirements of the Bayes filter.

However, we give two caveats regarding our method: it is approximate, and it does not maintain probabilities. The approximation is the result of a Jensen approximation, which we will show is empirically accurate. The calculation of probabilities from an information matrix requires inference, and we will provide several options for performing this inference.

3.1 The Information Matrix
The information matrix, denoted Ω, is a matrix of size N × N whose elements are non-negative. Ω induces a probability distribution over the space of all data association matrices A, through the following definition:

    p(A) = (1/Z) exp tr A Ω   with   Z = Σ_A exp tr A Ω                      (7)

Here tr is the trace of a matrix, and Z is the partition function.

Computing the posterior probability p(A) from Ω is hard, due to the difficulty of computing the partition function Z.
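To make the normalization of Eq. 7 concrete, Z and a posterior probability can be computed by brute-force enumeration at small N (exponential cost, illustration only; matrix values from the Fig. 1 example, helper name ours):

```python
from itertools import permutations
import math

# Brute-force partition function of Eq. 7: Z = sum_A exp tr(A^T Omega).
# Exponential in N; shown only to make the normalization concrete.
def partition_function(Omega):
    n = len(Omega)
    return sum(math.exp(sum(Omega[s[j]][j] for j in range(n)))
               for s in permutations(range(n)))

Omega = [[2, 12, 4, 4], [1, 2, 11, 0], [10, 4, 4, 15], [5, 2, 1, 2]]
Z = partition_function(Omega)
best = (3, 0, 1, 2)  # the most likely assignment from Fig. 1b
p_best = math.exp(sum(Omega[best[j]][j] for j in range(4))) / Z
print(p_best)  # close to 1: this example's posterior is sharply peaked
```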
However, as we shall see, maintaining Ω is surprisingly easy, and it is also computationally efficient.

3.2 Measurement Update in Information Form
In information form, the measurement update is a local addition of the form:

    Ω ← Ω + ( 0 ... 0  log z_1j  0 ... 0
              ...        ...     ...
              0 ... 0  log z_Nj  0 ... 0 )                                   (8)

that is, the vector (log z_1j, ..., log z_Nj)ᵀ is added to the j-th column of Ω, and all other entries are unchanged. This follows directly from Eq. 4. The complexity of this update is O(N).

Of particular interest is the case where one specific association was affirmed with probability z_ij = α, while all others were true with the error probability z_kj = (1 − α)/(N − 1). Then the update is of the form

    Ω ← Ω + ( 0 ... 0    c    0 ... 0
              ...       ...   ...
              0 ... 0  log α  0 ... 0
              ...       ...   ...
              0 ... 0    c    0 ... 0 )   with   c = log (1 − α)/(N − 1)     (9)

where log α is added at position (i, j) and c is added to the remaining entries of column j. However, since Ω is a non-normalized matrix (it is normalized via the partition function Z in Eq. 7), we can modify Ω as long as exp tr A Ω is changed by the same factor for any A. In particular, we can subtract c from an entire column in Ω; this will affect the result of exp tr A Ω by a factor of exp c, which is independent of A and hence will be subsumed by the normalizer Z.
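Concretely, this column-subtraction argument reduces the special-form measurement update to a single-entry addition; a minimal sketch (list-of-lists representation; the helper name is ours):

```python
import math

# Single-entry measurement update: after subtracting c = log((1-alpha)/(N-1))
# from column j, only the (i, j) entry changes, by log(alpha) - c.
def info_measurement_update(Omega, i, j, alpha):
    n = len(Omega)
    c = math.log((1.0 - alpha) / (n - 1))
    Omega[i][j] += math.log(alpha) - c  # O(1), purely local

Omega = [[0.0] * 4 for _ in range(4)]
info_measurement_update(Omega, i=3, j=1, alpha=0.9)  # track 2 resembles object 4
print(round(Omega[3][1], 3))  # log(0.9 / (0.1 / 3)) = log 27 = 3.296
```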
This allows us to perform a more efficient update

    ω_ij ← ω_ij + log α − log (1 − α)/(N − 1)                                (10)

where ω_ij is the ij-th element of Ω. This update is indeed of the form shown in Fig. 1c. It requires O(1) time, is entirely local, and is an exact realization of Bayes rule in information form.

3.3 State Transition Update in Information Form
The state transition update is also simple, but it is approximate. We show that using a Jensen bound, we obtain the following update for the information matrix:

    Ω ← log Σ_m β_m B_mᵀ exp Ω                                               (11)

Here the expression “exp Ω” denotes a component-wise exponentiation of the matrix Ω; the result is also a matrix. This update implements a “dual” of a geometric mean; here the exponentiation is applied to the individual elements of this mean, and the logarithm is applied to the result. It is important to notice that this update only affects elements in Ω that might be affected by a permutation B_m; all others remain the same.

A numerical example of this update was given in Fig. 1d, assuming the permutation matrices in Eq. 3. The values there are the result of applying this update formula. For example, for the first row we get log ½ (exp 12 + exp 4) = 11.3072.

The derivation of this update formula is straightforward. We begin with Eq. 6, written in logarithmic form. The transformations rely heavily on the fact that A and B_m are permutation matrices. We use the symbol “tr*” for a multiplicative version of the matrix trace, in which all elements on the diagonal are multiplied.

    log p(A) ← log Σ_m β_m p(A B_mᵀ)
             = const. + log Σ_m β_m exp tr A B_mᵀ Ω
             = const. + log Σ_m β_m tr* exp A B_mᵀ Ω
             = const. + log Σ_m β_m tr* A B_mᵀ exp Ω
             ≤ const. + log tr* A Σ_m β_m B_mᵀ exp Ω
             = const.
             + tr A [ log Σ_m β_m B_mᵀ exp Ω ]                               (12)

The result is of the form of (the logarithm of) Eq. 7. The expression in brackets is equivalent to the right-hand side of the update Eq. 11. A benefit of this update rule is that it only affects columns in Ω that are affected by a permutation B_m; all other columns are unchanged.

We note that the approximation in this derivation is the result of applying a Jensen bound. As a result, we gain a compact closed-form solution to the update problem, but the state transition step may sacrifice information in doing so (as indicated by the “≤” sign). In our experimental results section, however, we find that this approximation is extremely accurate in practice.

4 Computing the Data Association
The previous section formally derived our update rules, which are simple and local. We now address the problem of recovering actual data association hypotheses from the information matrix, along with the associated probabilities.

We consider three cases: the computation of the most likely data association matrix as illustrated in Fig. 1b; the computation of a relative probability of the form p(A)/p(A′); and the computation of an absolute probability or expectation.

To recover argmax_A p(A), we need only solve a linear program.

Relative probabilities are also easy to recover. Consider, for example, the quotient p(A)/p(A′) for two data association matrices A and A′. When calculating this quotient from Eq. 7, the normalizer Z cancels out:

    p(A)/p(A′) = exp tr (A − A′) Ω                                           (13)

Absolute probabilities and expectations are generally the most difficult to compute. This is because of the partition function Z in Eq. 7, whose exact calculation requires considering N!
permutation matrices.

Our approximate method for recovering probabilities/expectations is based on the Metropolis algorithm. Specifically, consider the expectation of a function f:

    E[f(A)] = Σ_A f(A) p(A)                                                  (14)

Our method approximates this expression through a finite sample of matrices A[1], A[2], ..., using Metropolis and the proposal distribution defined in Eq. 13. This proposal generates excellent results for simple functions f (e.g., the marginal of a single identity). For more complex functions f, we refer the reader to improved proposal distributions that have been found to be highly efficient in related problems [6, 2].

Figure 2: The camera array, part of the common area in the Stanford AI Lab. Panels: (a) camera; (b) array of 16 ceiling-mounted cameras; (c) camera images; (d) 2 of the tracks. Panel (d) compares our estimate with ground truth for two of the tracks. The data association is essentially correct at all times.

Figure 3: Results comparing our information-form filter with the common multi-hypothesis approach: (a) comparison of the K-hypothesis tracker vs. our information-form tracker on synthetic data; (b) comparison using a DARPA challenge data set produced by Northrop Grumman. The comparison in (b) involves additional algorithms, including one published in [8].

5 Experimental Results
To evaluate this algorithm, we deployed a network of ceiling-mounted cameras in our lab, shown in Fig. 2. We used 16 cameras to track individuals walking through the lab. The tracker uses background subtraction to find blobs and uses a color histogram to classify these blobs. Only when two or more people come very close to each other might the tracker lose track of individual people.
We find that for N = 5 our method tracks people nearly perfectly, but so does the full-blown Bayesian solution, as well as the K-best multi-hypothesis method that is popular in the tracking literature.

To investigate scaling to larger N, we compared our approach on two data sets: a synthetic one with up to N = 1,600 objects, and a sensor network simulation provided to us by Northrop Grumman through an ongoing DARPA program. The latter set is thought to be realistic. It was chosen because it involves a large number (N = 200) of moving objects, whose motion patterns come from a behavioral model. In all cases, we measured the number of objects mislabeled in the maximum likelihood hypothesis (as found by solving the LP). All results are averaged over 50 runs.

The comparison in Fig. 3a shows that our approach outperforms the traditional K-best hypothesis approach (with K = N) by a large margin. Furthermore, our approach seems to be unaffected by N, the number of entities in the environment, whereas the traditional approach deteriorates. This comes as no surprise, since the traditional approach requires increasing numbers of samples to cover the space of all data associations. The results in Fig. 3b compare (from left to right) the most likely hypothesis, the most recent sensor measurement, the K-best approach with K = 200, an approach proposed in [8], and our approach. Notice that this plot is in log-form.

No comparisons were attempted with offline techniques, such as the ones in [4, 6], because the data sets used here are quite large and our interest is online filtering.

6 Conclusion
We have provided an information form algorithm for the data association problem in object tracking. The key idea of this approach is to maintain a cumulative matrix of information associating computer-internal tracks with physical objects.
Updating this matrix is easy; furthermore, efficient methods were proposed for extracting concrete data association hypotheses from this representation. Empirical work using physical networks of camera arrays illustrated that our approach outperforms alternative paradigms that are commonly used for online data association.

Despite these advances, the work possesses a number of limitations. Specifically, our closed world assumption is problematic, although we believe the extension to open worlds is relatively straightforward. Also missing is a tight integration of our discrete formulation into continuous-valued traditional tracking algorithms such as EKFs. Such extensions warrant further research.

We believe the key innovation here is best understood from a graphical model perspective. Sampling K good data associations cannot exploit conditional independence in the data association posterior, hence will always require that K is an exponential function of N. The information form and the equivalent graphical network in Fig. 1e exploit conditional independences. This subtle difference makes it possible to get away with O(N^2) memory and O(N) computation without a loss of accuracy when N increases, as shown in Fig. 3a. The information form discussed here, and the associated graphical networks, promise to overcome a key brittleness associated with the current state-of-the-art in online data association.

Acknowledgements
We gratefully thank Jaewon Shin and Leo Guibas for helpful discussions.

This research was sponsored by the Defense Advanced Research Projects Agency (DARPA) under the ACIP program and grant number NBCH104009.

References
[1] Y. Bar-Shalom and X.-R. Li. Estimation and Tracking: Principles, Techniques, and Software. YBS, Danvers, MA, 1998.

[2] F. Dellaert, S.M. Seitz, C. Thorpe, and S. Thrun. EM, MCMC, and chain flipping for structure from motion with unknown correspondence.
Machine Learning, 50(1-2):45–71, 2003.

[3] A. Doucet, J.F.G. de Freitas, and N.J. Gordon, editors. Sequential Monte Carlo Methods in Practice. Springer, 2001.

[4] M. A. Fischler and R. C. Bolles. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24:381–395, 1981.

[5] P. Maybeck. Stochastic Models, Estimation, and Control, Volume 1. Academic Press, 1979.

[6] H. Pasula, S. Russell, M. Ostland, and Y. Ritov. Tracking many objects with many sensors. In Proc. IJCAI-99.

[7] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.

[8] J. Shin, N. Lee, S. Thrun, and L. Guibas. Lazy inference on object identities in wireless sensor networks. In Proc. IPSN-05.

[9] D.B. Reid. An algorithm for tracking multiple targets. IEEE Transactions on Automatic Control, AC-24:843–854, 1979.

[10] S. Thrun, Y. Liu, D. Koller, A.Y. Ng, Z. Ghahramani, and H. Durrant-Whyte. Simultaneous localization and mapping with sparse extended information filters. IJRR, 23(7/8), 2004.

[11] D. Fox, J. Hightower, L. Liao, D. Schulz, and G. Borriello. Bayesian filtering for location estimation. IEEE Pervasive Computing, 2003.
", "award": [], "sourceid": 2764, "authors": [{"given_name": "Brad", "family_name": "Schumitsch", "institution": null}, {"given_name": "Sebastian", "family_name": "Thrun", "institution": null}, {"given_name": "Gary", "family_name": "Bradski", "institution": null}, {"given_name": "Kunle", "family_name": "Olukotun", "institution": null}]}