{"title": "Ideal Observers for Detecting Motion: Correspondence Noise", "book": "Advances in Neural Information Processing Systems", "page_first": 827, "page_last": 834, "abstract": null, "full_text": "Ideal Observers for Detecting Motion: Correspondence Noise\n\nHongjing Lu\n\nDepartment of Psychology, UCLA\n\nLos Angeles, CA 90095\n\nAlan Yuille\n\nDepartment of Statistics, UCLA\n\nLos Angeles, CA 90095\n\nhongjing@psych.ucla.edu\n\nyuille@stat.ucla.edu\n\nAbstract\n\nWe derive a Bayesian Ideal Observer (BIO) for detecting motion and\nsolving the correspondence problem. We obtain Barlow and Tripathy\u2019s\nclassic model as an approximation. Our psychophysical experiments\nshow that the trends of human performance are similar to the Bayesian\nIdeal, but overall human performance is far worse. We investigate ways\nto degrade the Bayesian Ideal but show that even extreme degradations\ndo not approach human performance. Instead we propose that humans\nperform motion tasks using generic, general-purpose models of motion.\nWe perform more psychophysical experiments which are consistent with\nhumans using a Slow-and-Smooth model and which rule out an alternative\nmodel using Slowness.\n\n1 Introduction\n\nIdeal Observers give fundamental limits for performing visual tasks (somewhat similar to\nShannon\u2019s limits on information transfer). They give benchmarks against which to evaluate\nhuman performance. This enables us to determine objectively what visual tasks humans\nare good at, and may help point the way to underlying neuronal mechanisms. For a recent\nreview, see [1].\n\nIn an influential paper, Barlow and Tripathy [2] tested the ability of human subjects to detect\ndots moving coherently in a background of random dots. They derived an \u201cideal observer\u201d\nmodel using techniques from Signal Detection theory [3]. 
They showed that their model\npredicted the trends of human performance as properties of the stimuli changed, but that\nhumans performed far worse than their model. They argued that degrading their model,\nby lowering the spatial resolution, would give predictions closer to human performance.\nBarlow and Tripathy\u2019s model has generated considerable interest, see [4,5,6,7].\n\nWe formulate this motion problem in terms of Bayesian Decision Theory and derive a\nBayesian Ideal Observer (BIO) model. We describe why Barlow and Tripathy\u2019s (BT) model\nis not fully ideal, show that it can be obtained as an approximation to the BIO, and determine\nconditions under which it is a good approximation. We perform psychophysical experiments\nunder a range of conditions and show that the trends of human subjects are more\nsimilar to those of the BIO than to those of the BT model. We investigate whether degrading the Bayesian Ideal enables\nus to reach human performance, and conclude that it does not (without implausibly large\ndeformations). We comment that Barlow and Tripathy\u2019s degradation model is implausible\ndue to the nature of the approximations used.\n\nInstead we show that a generic motion detection model which uses a slow-and-smooth\nassumption about the motion field [8,9] gives similar performance to human subjects under\na range of experimental conditions. A simpler approach using a slowness assumption alone\ndoes not match new experimental data that we present. We conclude that human observers\nare not ideal, in the sense that they do not perform inference using the model that the\nexperimenter has chosen to generate the data, but may instead use a general-purpose model,\nperhaps adapted to the motion statistics of natural images.\n\n2 Bayes Decision Theory and Ideal Observers\n\nWe now give the basic elements of Bayes Decision Theory. 
The input data is D and\nwe seek to estimate a binary state W (e.g.\ncoherent or incoherent motion, horizontal\nmotion to right or to left). We assume models P(D|W) and P(W). We define\na decision rule $\\alpha(D)$ and a loss function $L(\\alpha(D), W) = 1 - \\delta_{\\alpha(D),W}$. The risk is\n$R(\\alpha) = \\sum_{D,W} L(\\alpha(D), W)\\, P(D|W)\\, P(W)$.\nOptimal performance is given by the Bayes rule: $\\alpha^* = \\arg\\min_{\\alpha} R(\\alpha)$. The fundamental\nlimits are given by the Bayes risk: $R^* = R(\\alpha^*)$. The Bayes risk is the best performance that can\nbe achieved. It corresponds to ideal performance.\n\nBarlow and Tripathy\u2019s (BT) model does not achieve the Bayes risk. This is because they used\nsimplifications to derive it using concepts from Signal Detection theory (SDT). SDT is essentially\nthe application of Bayes Decision Theory to the task of signal detection but, for\nhistorical reasons, SDT restricts itself to a limited class of probability models and is unable\nto capture the complexity of the motion problem.\n\n3 Experimental Setup and Correspondence Noise\n\nWe now give the details of Barlow and Tripathy\u2019s stimuli, their model, and their experiments.\nThe stimuli consist of two image frames with N dots in each frame. The dots in the\nfirst frame are at random positions. For coherent stimuli, see figure (1), a proportion C\nof the dots (CN dots in total) moves coherently left or right horizontally with a fixed translation of displacement\nT. The remaining N(1 \u2212 C) dots in the second frame are generated at random.\nFor incoherent stimuli, the dots in both frames are generated at random.\n\nEstimating motion for these stimuli requires solving the correspondence problem to match\ndots between frames. For coherent motion, the noise dots act as correspondence noise and\nmake the matching harder, see the rightmost panel in figure (1).\n\nBarlow and Tripathy perform two types of binary forced choice experiments. 
In detection\nexperiments, the task is to determine whether the stimulus shows coherent or incoherent motion.\nFor discrimination experiments, the goal is to determine if the motion is to the right or the\nleft.\n\nThe experiments are performed by adjusting the fraction C of coherently moving dots until\nthe human subject\u2019s performance is at threshold (i.e. 75 percent correct). Barlow and\nTripathy\u2019s (BT) model gives the proportion of dots at threshold to be $C_\\theta = 1/\\sqrt{Q - N}$,\nwhere Q is the size of the image lattice. This is approximately $1/\\sqrt{Q}$ (because $N \\ll Q$)\nand so is independent of the density of dots. Barlow and Tripathy compare the thresholds of\nthe human subjects with those of their model for a range of experimental conditions which\nwe will discuss in later sections.\n\nFigure 1: The left three panels show coherent stimuli with N = 20, C = 0.1; N = 20, C =\n0.5; and N = 20, C = 1.0 respectively. The closed and open circles denote dots in the first\nand second frame respectively. The arrows show the motion of those dots which are moving\ncoherently. Correspondence noise is illustrated by the far right panel showing that a dot in\nthe first frame has many candidate matches in the second frame.\n\n4 The Bayesian Ideal Model\n\nWe now compute the Bayes rule and Bayes risk by taking into account exactly how the data\nis generated. We denote the dot positions in the first and second frame by $D = \\{x_i : i = 1, \\ldots, N\\}, \\{y_a : a = 1, \\ldots, N\\}$. We define correspondence variables $V_{ia}$: $V_{ia} = 1$ if $x_i \\to y_a$, $V_{ia} = 0$ otherwise.\nThe generative model for the data is given by:\n\n$$P(D|\\mathrm{Coh}, T) = \\sum_{V_{ia}} P(\\{y_a\\}|\\{x_i\\},\\{V_{ia}\\}, T)\\, P(\\{V_{ia}\\})\\, P(\\{x_i\\}) \\quad \\text{(coherent)},$$\n\n$$P(D|\\mathrm{Incoh}) = P(\\{y_a\\})\\, P(\\{x_i\\}) \\quad \\text{(incoherent)}. \\qquad (1)$$\n\nThe prior distributions for the dot positions $P(\\{x_i\\})$, $P(\\{y_a\\})$ allow all configurations of\nthe dots to be equally likely. 
They are therefore of form $P(\\{x_i\\}) = P(\\{y_a\\}) = (Q-N)!/Q!$,\nwhere Q is the number of lattice points. The model for coherent motion is\n\n$$P(\\{y_a\\}|\\{x_i\\},\\{V_{ia}\\}, T) = \\frac{(Q-N)!}{(Q-CN)!} \\prod_{ia} (\\delta_{y_a, x_i+T})^{V_{ia}}.$$\n\nWe set the prior $P(\\{V_{ia}\\})$ to be the uniform distribution. There is a constraint $\\sum_{ia} V_{ia} = CN$ (since only\nCN dots move coherently).\nThis gives:\n\n$$P(D|\\mathrm{Incoh}) = \\frac{(Q-N)!}{Q!}\\, \\frac{(Q-N)!}{Q!},$$\n\n$$P(D|\\mathrm{Coh}, T) = \\frac{(Q-N)!}{Q!}\\, \\frac{(Q-N)!}{(Q-CN)!} \\left\\{ \\frac{(N-CN)!}{N!} \\right\\}^2 (CN)! \\sum_{V_{ia}} \\prod_{ia} (\\delta_{y_a, x_i+T})^{V_{ia}}.$$\n\nThese can be simplified further by observing that\n\n$$\\sum_{V_{ia}} \\prod_{ia} (\\delta_{y_a, x_i+T})^{V_{ia}} = \\frac{\\Psi!}{(\\Psi-M)!\\, M!},$$\n\nwhere \u03a8 is the total number of matches, i.e. the number of dots in the first frame that have\na corresponding dot at displacement T in the second frame (this includes \u201cfake\u201d matches\ndue to chance alignment of noise dots in the two frames), and M is the number of coherently moving dots (M = CN, given the constraint above).\n\nThe Bayes rules for performing the tasks are given by testing the log-likelihood ratios: (i)\n$\\log [P(D|\\mathrm{Coh}, T) / P(D|\\mathrm{Incoh})]$ for detection (i.e. coherent versus incoherent), and (ii)\n$\\log [P(D|\\mathrm{Coh}, -T) / P(D|\\mathrm{Coh}, T)]$ for\ndiscrimination (i.e. motion to right or to left). For detection, the log-likelihood ratio is a\nfunction of \u03a8. For discrimination, the log-likelihood ratio is a function of the number of\nmatches to the right \u03a8r and to the left \u03a8l. It is straightforward to calculate the Bayes risk\nand determine coherence thresholds.\n\nWe can rederive Barlow and Tripathy\u2019s model as an approximation to the Bayesian Ideal.\nThey make two approximations: (i) they model the distribution of \u03a8 as Binomial, (ii) they\nuse d\u2032. Both approximations are very good near threshold, except for small N. 
The use of\nd\u2032 can be justified if P(\u03a8|Coh, T) and P(\u03a8|Incoh) are Gaussians with similar variance.\nThis holds for large N (N = 1000) over a range of C but is a poorer approximation for small N (N = 100), see\nfigure (2).\n\n[Figure 2 panels: probability plotted against \u03a8, with curves P(\u03a8|N) and P(\u03a8|C), for N=1000, C=0.5%; N=1000, C=5%; N=100, C=1%; N=200, C=2.5%.]\n\nFigure 2: We plot P(\u03a8|Coh, T) and P(\u03a8|Incoh), shown as P(\u03a8|C) and P(\u03a8|N) respectively,\nfor a range of N and C. One of Barlow and Tripathy\u2019s two approximations is\njustified if the distributions are Gaussian with the same variance. This is true for large N\n(left two panels) but fails for small N (right two panels). Note that human thresholds are\nroughly 30 times higher than for BIO (the scales on graphs differ).\n\nWe computed the coherence threshold for the BIO and the BT models for N = 100 to N =\n1000, see the second and fourth panels in figure (3). As described earlier, the BT threshold\nis approximately independent of the number N of dots. Our computations showed that the\nBIO threshold is also roughly constant except for small N (this is not surprising in light of\nfigure (2)). 
This motivated psychophysics experiments to determine how humans performed\nfor small N (this range of dots was not explored in Barlow and Tripathy\u2019s experiments).\nAll our data points are from 300 trials using QUEST, so the error bars are so small that we do\nnot include them.\n\nWe performed the detection and discrimination tasks with translation motion T = 16 (as\nin Barlow and Tripathy). For detection and discrimination, the human subjects\u2019 thresholds\nshowed similar trends to the thresholds for BIO and BT. But human performance at small\nN is more consistent with BIO, see figure (3).\n\n[Figure 3 panels: coherence threshold versus dot number N, for human subjects (HL, RK) and for the Bayesian and Barlow & Tripathy models.]\n\nFigure 3: The left two panels show detection thresholds: human subjects (far left) and BIO\nand BT thresholds (left). The right two panels show discrimination thresholds: human\nsubjects (right) and BIO and BT (far right).\n\nBut probably the most striking aspect of figure (3) is how poorly humans perform compared\nto the models. The thresholds for BIO are always higher than those for BT, but these\ndifferences are almost negligible compared to the differences with the human subjects. 
The\nexperiments also show that the human subject trends differ from the models at large N.\nBut these are extreme conditions where there are dots on most points on the image lattice.\n\n5 Degrading the Ideal Observer Models\n\nWe now degrade the Bayes Ideal model to see if we can obtain human performance. We\nconsider two mechanisms: (A) Humans do not know the precise value of the motion translation\nT. (B) Humans have poor spatial resolution. We will also combine both mechanisms.\nFor (A), we model lack of knowledge of the velocity T by summing over different motions.\nWe generate the stimuli as before from P(D|Incoh) or P(D|Coh, T), but we make the\ndecision by thresholding:\n\n$$\\log \\frac{\\sum_T P(D|\\mathrm{Coh}, T)\\, P(T)}{P(D|\\mathrm{Incoh})}.$$\n\nFor (B), we model lack of spatial resolution by replacing\n\n$$P(\\{y_a\\}|\\{x_i\\},\\{V_{ia}\\}, T) = \\frac{(Q-N)!}{(Q-CN)!} \\prod_{ia} (\\delta_{y_a, x_i+T})^{V_{ia}}$$\n\nby\n\n$$P(\\{y_a\\}|\\{x_i\\},\\{V_{ia}\\}, T) = \\frac{(Q-N)!}{(Q-CN)!} \\prod_{ia} (f_W(y_a, x_i+T))^{V_{ia}}.$$\n\nHere W is the width of a spatial window, so that $f_W(a, b) = 1/W^2$ if $|a - b| < W$, and $f_W(a, b) = 0$ otherwise.\nOur calculations, see figure (4), show that neither (A) nor (B) nor their combination is\nsufficient to account for the poor performance of human subjects. Lack of knowledge\nof the correct motion (and consequently summing over several models) does little to degrade\nperformance. Decreasing spatial resolution does degrade performance but even huge\ndegradations are insufficient to reach human levels. Barlow and Tripathy [2] argue that\nthey can degrade their model to reach human performance but the degradations are huge\nand they occur in conditions (e.g. 
N = 50 or N = 100) where their model is not a good\napproximation to the true Bayesian Ideal Observer.\n\n[Figure 4: coherence threshold versus spatial uncertainty range (pixels); curves: unknown velocity, spatial uncertainty, lattice separation, human performance.]\n\nFigure 4: Comparing the degraded models to human performance. We use a log-log plot\nbecause the differences between human and model thresholds are very large.\n\n6 Slowness and Slow-and-Smooth\n\nWe now consider an alternative explanation for why human performance differs so greatly\nfrom the Bayesian Ideal Observer. Perhaps human subjects do not use the ideal model\n(which is only known to the designer of the experiments) and instead use a general-purpose\nmotion model. We now consider two possible models: (i) a slowness model, and (ii) a\nslow-and-smooth model.\n\n[Figure 5 panels: coherence threshold versus dot number N; curves for speeds 2, 4, 8 and 16, for the 2D and 1D nearest neighbour models, and for the human average.]\n\nFigure 5: The coherence threshold as a function of N for different translation motions T.\nFrom left to right, human subject (HL), human subject (RK), 2DNN (shown for T = 16\nonly), and 1DNN. 
In the two right panels we have drawn the average human performance\nfor comparison.\n\nThe slowness model is partly motivated by Ullman\u2019s minimal mapping theory [10] and\npartly by the design of practical computer vision tracking systems. This model solves\nthe correspondence problem by simply matching a dot in the first frame to the closest\ndot in the second frame. We consider a 2D nearest neighbour model (2DNN) and a 1D\nnearest neighbour model (1DNN), for which the matching is constrained to be in horizontal\ndirections only. After the motion has been calculated we perform a log-likelihood test\nto solve the discrimination and detection tasks. This enables us to calculate coherence\nthresholds, see figure (5). Both 1DNN and 2DNN predict that correspondence will be easy\nfor small translation motions even when the number of dots is very large. This motivates a\nnew class of experiments where we vary the translation motion.\n\nOur experiments show that 1DNN and 2DNN are poor fits to human performance. Human\nperformance thresholds are relatively insensitive to the number N of dots and the translation\nmotion T, see the two left panels in figure (5). By contrast, the 1DNN and 2DNN\nthresholds are either far lower than humans for small N or far higher at large N with a\ntransition that depends on T. We conclude that the 1DNN and 2DNN models do not match\nhuman performance.\n\nFigure 6: The motion flows from Slow-and-Smooth for N = 100 as functions of C and\nT. From left to right, C = 0.1, C = 0.2, C = 0.3, C = 0.5. From top to bottom,\nT = 4, T = 8, T = 16. The closed and open circles denote dots in the first and second\nframe respectively. 
The arrows indicate the motion flow specified by the Slow-and-Smooth\nmodel.\n\nWe now consider the Slow-and-Smooth model [8,9] which has been shown to account for\na range of motion phenomena. We use a formulation [8] that was specifically designed for\ndealing with the correspondence problem.\n\nThis gives a model of form $P(V, v|\\{x_i\\},\\{y_a\\}) = (1/Z)\\, e^{-E[V,v]/T_m}$, where\n\n$$E[V, v] = \\sum_{i=1}^{N} \\sum_{a=1}^{N} V_{ia} (y_a - x_i - v(x_i))^2 + \\lambda \\|Lv\\|^2 + \\zeta \\sum_{i=1}^{N} V_{i0}, \\qquad (2)$$\n\nL is an operator that penalizes motion that is not slow and smooth and depends on a parameter \u03c3, see\nYuille and Grzywacz for details [8]. We impose the constraint that $\\sum_{a=0}^{N} V_{ia} = 1, \\forall i$,\nwhich enforces that each point i in the first frame is either unmatched, if $V_{i0} = 1$, or is\nmatched to a point a in the second frame.\nWe implemented this model using an EM algorithm to estimate the motion field v(x) that\nmaximizes $P(v|\\{x_i\\},\\{y_a\\}) = \\sum_V P(V, v|\\{x_i\\},\\{y_a\\})$. The parameter settings are $T_m$ =\n0.001, \u03bb = 0.5, \u03b6 = 0.01, \u03c3 = 0.2236. (Units of length are normalized by\nthe size of the image.) The size of \u03c3 determines the spatial scale of the interaction between\ndots [8]. These parameter settings yield the correct motion directions when all\ndots move coherently, C = 1.0.\nThe following results, see figure (6), show that for 100 dots (N = 100) the results of the\nSlow-and-Smooth model are similar to those of the human subjects for a range of different\ntranslation motions. Slow-and-Smooth gives coherence thresholds between C = 0.2\nand C = 0.3, consistent with human performance. Lower thresholds occurred for slower\ncoherent translations, in agreement with human performance.\n\nSlow-and-Smooth also gives thresholds similar to human performance when we alter the\nnumber N of dots, see figure (7). 
Once again, Slow-and-Smooth starts giving the correct\nhorizontal motion between C = 0.2 and C = 0.3.\n\nFigure 7: The motion fields of Slow-and-Smooth for T = 16 as a function of C and N.\nFrom left to right, C = 0.1, C = 0.2, C = 0.3, C = 0.5. From top to bottom, N =\n50, N = 100, N = 1000. Same conventions as for previous figure.\n\n7 Summary\n\nWe defined a Bayes Ideal Observer (BIO) for correspondence noise and showed that Barlow\nand Tripathy\u2019s (BT) model [2] can be obtained as an approximation. We performed\npsychophysical experiments which showed that the trends of human performance were\nmore similar to those of BIO (when it differed from BT). We attempted to account for\nhumans\u2019 poor performance (compared to BIO) by allowing for degradations of the model\nsuch as poor spatial resolution and uncertainty about the precise translation velocity. We\nconcluded that these degradations had to be implausibly large to account for the poor\nhuman performance. We noted that Barlow and Tripathy\u2019s degradation model [2] takes\nthem into a regime where their model is a bad approximation to the BIO. Instead, we investigated\nthe possibility that human observers perform these motion tasks using generic\nprobability models for motion, possibly adapted to the statistics of motion in the natural\nworld. Further psychophysical experiments showed that human performance was inconsistent\nwith a model that prefers slow motion. But human performance was consistent with\nthe Slow-and-Smooth model [8,9].\n\nWe conclude with two metapoints. Firstly, it is possible to design ideal observer models for\ncomplex stimuli using techniques from Bayes decision theory. 
There is no need to restrict\noneself to the traditional models described in classic signal detection books such as Green\nand Swets [3]. Secondly, human performance at visual tasks may be based on generic\nmodels, such as Slow-and-Smooth, rather than the ideal models for the experimental tasks\n(known only to the experimenter).\n\nAcknowledgements\n\nWe thank Zili Liu for helpful discussions. We gratefully acknowledge funding support from the\nAmerican Association of University Women (HL), NSF0413214 and W.M. Keck Foundation (ALY).\n\nReferences\n\n[1] Geisler, W.S. (2002) \u201cIdeal Observer Analysis\u201d. In L. Chalupa and J. Werner (Eds.), The Visual\nNeurosciences. Boston: MIT Press, 825-837.\n\n[2] Barlow, H., and Tripathy, S.P. (1997) Correspondence noise and signal pooling in the detection of\ncoherent visual motion. Journal of Neuroscience, 17(20), 7954-7966.\n\n[3] Green, D.M., and Swets, J.A. (1966) Signal detection theory and psychophysics. New York:\nWiley.\n\n[4] Morrone, M.C., Burr, D.C., and Vaina, L.M. (1995) Two stages of visual processing for radial\nand circular motion. Nature, 376(6540), 507-509.\n\n[5] Neri, P., Morrone, M.C., and Burr, D.C. (1998) Seeing biological motion. Nature, 395(6705),\n894-896.\n\n[6] Song, Y., and Perona, P. (2000) A computational model for motion detection and direction\ndiscrimination in humans. IEEE computer society workshop on Human Motion, Austin, Texas.\n\n[7] Wallace, J.M., and Mamassian, P. (2004) The efficiency of depth discrimination for non-transparent\nand transparent stereoscopic surfaces. Vision Research, 44, 2253-2267.\n\n[8] Yuille, A.L., and Grzywacz, N.M. (1988) A computational theory for the perception of coherent\nvisual motion. Nature, 333, 71-74.\n\n[9] Weiss, Y., and Adelson, E.H. (1998) Slow and smooth: A Bayesian theory for the combination of\nlocal motion signals in human vision. Technical Report 1624, Massachusetts Institute of Technology.\n\n[10] Ullman, S. 
(1979) The Interpretation of Visual Motion. MIT Press, Cambridge, MA.\n", "award": [], "sourceid": 2807, "authors": [{"given_name": "Hongjing", "family_name": "Lu", "institution": null}, {"given_name": "Alan", "family_name": "Yuille", "institution": null}]}