{"title": "Functional form of motion priors in human motion perception", "book": "Advances in Neural Information Processing Systems", "page_first": 1495, "page_last": 1503, "abstract": "It has been speculated that the human motion system combines noisy measurements with prior expectations in an optimal, or rational, manner. The basic goal of our work is to discover experimentally which prior distribution is used. More specifically, we seek to infer the functional form of the motion prior from the performance of human subjects on motion estimation tasks. We restricted ourselves to priors which combine three terms for motion slowness, first-order smoothness, and second-order smoothness. We focused on two functional forms for prior distributions: L2-norm and L1-norm regularization corresponding to the Gaussian and Laplace distributions respectively. In our first experimental session we estimate the weights of the three terms for each functional form to maximize the fit to human performance. We then measured human performance for motion tasks and found that we obtained better fit for the L1-norm (Laplace) than for the L2-norm (Gaussian). We note that the L1-norm is also a better fit to the statistics of motion in natural environments. In addition, we found large weights for the second-order smoothness term, indicating the importance of high-order smoothness compared to slowness and lower-order smoothness. To validate our results further, we used the best fit models using the L1-norm to predict human performance in a second session with different experimental setups. 
Our results showed excellent agreement between human performance and model prediction -- ranging from 3\\% to 8\\% for five human subjects over ten experimental conditions -- and give further support that the human visual system uses an L1-norm (Laplace) prior.", "full_text": "Functional form of motion priors in human motion\n\nperception\n\nHongjing Lu 1,2\n\nhongjing@ucla.edu\n\nTungyou Lin 3\n\ntungyoul@math.ucla.edu\n\nAlan L. F. Lee 1\n\nalanlee@ucla.edu\n\nLuminita Vese 3\n\nlvese@math.ucla.edu\n\nDepartment of Psychology1, Statistics2, Mathematics3 and Computer Science4, UCLA\n\nAlan Yuille 1,2,4\n\nyuille@stat.ucla.edu\n\nAbstract\n\nIt has been speculated that the human motion system combines noisy measure-\nments with prior expectations in an optimal, or rational, manner. The basic goal\nof our work is to discover experimentally which prior distribution is used. More\nspeci\ufb01cally, we seek to infer the functional form of the motion prior from the per-\nformance of human subjects on motion estimation tasks. We restricted ourselves\nto priors which combine three terms for motion slowness, \ufb01rst-order smoothness,\nand second-order smoothness. We focused on two functional forms for prior dis-\ntributions: L2-norm and L1-norm regularization corresponding to the Gaussian\nand Laplace distributions respectively. In our \ufb01rst experimental session we esti-\nmate the weights of the three terms for each functional form to maximize the \ufb01t to\nhuman performance. We then measured human performance for motion tasks and\nfound that we obtained better \ufb01t for the L1-norm (Laplace) than for the L2-norm\n(Gaussian). We note that the L1-norm is also a better \ufb01t to the statistics of motion\nin natural environments. In addition, we found large weights for the second-order\nsmoothness term, indicating the importance of high-order smoothness compared\nto slowness and lower-order smoothness. 
To validate our results further, we used\nthe best \ufb01t models using the L1-norm to predict human performance in a second\nsession with different experimental setups. Our results showed excellent agree-\nment between human performance and model prediction \u2013 ranging from 3% to\n8% for \ufb01ve human subjects over ten experimental conditions \u2013 and give further\nsupport that the human visual system uses an L1-norm (Laplace) prior.\n\n1 Introduction\n\nImagine that you are traveling in a moving car and observe a walker through a fence full of punch\nholes. Your visual system can readily perceive the walking person against the apparently moving\nbackground using only the motion signals visible through these holes. But this task is far from trivial\ndue to the inherent local ambiguity of motion stimuli, often referred to as the aperture problem. More\nprecisely, if you view a line segment through an aperture then you can easily estimate the motion\ncomponent normal to the line but it is impossible to estimate the tangential component. So there are\nan in\ufb01nite number of possible interpretations of the local motion signal.\n\nOne way to overcome this local ambiguity is to integrate local motion measurements across space\nto infer the \u201dtrue\u201d motion \ufb01eld. Physiological studies have shown that direction-selective neurons\n\n1\n\n\fin primary visual cortex perform local measurements of motion. Then the visual system integrates\nthese local motion measurements to form global motion perception [4, 5]. Psychophysicists have\nidenti\ufb01ed a variety of phenomena, such as motion capture and motion cooperativity, which appear\nto be consequences of motion spatial integration [1, 2, 3]. From the computational perspective,\na number of Bayesian models have been proposed to explain these effects by hypothesizing prior\nassumptions about the motion \ufb01elds that occur in natural environments. 
In particular, it has been shown that a prior which is biased to slow-and-smooth motion can account for a range of experimental results [6, 7, 8, 9, 10].\n\nBut although evidence from physiology and psychophysics supports the existence of an integration stage, it remains unclear exactly what motion priors are used to resolve the measurement ambiguities. In the walking example described above (see figure 1), the visual system needs to integrate the local measurements in the two regions within the red boxes in order to perceive a coherently moving background. This integration must be performed over large distances, because the regions are widely separated, but this integration cannot be extended to include the walker region highlighted in the blue box, because this would interfere with accurate estimation of the walker’s movements. Hence the motion priors used by the human visual system must have a functional form which enables flexible and robust integration.\n\nWe aim to determine the functional form of the motion priors which underlie human perception, and to validate how well these priors can influence human perception in various motion tasks. Our approach is to combine parametric modeling of the motion priors with psychophysical experiments to estimate the model parameters that provide the best fit to human performance across a range of stimulus conditions. To provide further validation, we then use the estimated model to predict human performance in several different experimental setups. In this paper, we first introduce the two functional forms which we consider and review related literature in Section 2. Then in Section 3 we present our computational theory and implementation details. In Section 4 we test the theory by comparing its predictions with human performance in a range of psychophysical experiments.\n\nFigure 1: Observing a walker with a moving camera. Left panel, two example frames. 
The visual system needs to integrate motion measurements from the two regions in the red boxes in order to perceive the motion of the background. But this integration should not be extended to the walker region highlighted in the blue box. Right panel, the integration task is made harder by observing the scene through a set of punch holes. The experimental stimuli in our psychophysical experiments are designed to mimic these observation conditions.\n\n2 Functional form of motion priors\n\nMany models have proposed that the human visual system uses prior knowledge of probable motions, but the functional form for this prior remains unclear. For example, several well-established computational models employ Gaussian priors to encode the bias towards slow and spatially smooth motion fields. But the choice of Gaussian distributions has largely been based on computational convenience [6, 8], because they enable us to derive analytic solutions.\n\nHowever, some evidence suggests that different distribution forms may be used by the human visual system. Researchers have used motion sequences in real scenes to measure the spatial and temporal statistics of motion fields [11, 12]. These natural statistics show that the magnitude of the motion (speed) falls off in a manner similar to a Laplacian distribution (L1-norm regularization), which has heavier tails than Gaussian distributions (see the left plot in figure 2). These heavy tails indicate that while slow motions are very common, fast motions still occur fairly frequently in natural environments. A similar distribution pattern was also found for spatial derivatives of the motion flow, showing that non-smooth motion fields can also happen in natural environments. 
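The heavier tails of the Laplace distribution can be checked numerically. The following sketch is not from the paper; matching the two distributions by variance is our own illustrative choice. It compares the two-sided tail probability P(|X| > x) of a Gaussian and a variance-matched Laplace distribution.

```python
import math

sigma = 1.0                  # Gaussian standard deviation
b = sigma / math.sqrt(2.0)   # Laplace scale with the same variance (var = 2*b^2)

def gauss_tail(x):
    """P(|X| > x) for a zero-mean Gaussian with std sigma."""
    return math.erfc(x / (sigma * math.sqrt(2.0)))

def laplace_tail(x):
    """P(|X| > x) for a zero-mean Laplace with scale b."""
    return math.exp(-x / b)

# Beyond roughly two standard deviations, the Laplace assigns far more
# probability to large (fast) motions than the Gaussian does.
for x in (2.0, 3.0, 4.0):
    print(x, gauss_tail(x), laplace_tail(x))
```

At x = 3 the Gaussian tail is about 0.003 while the variance-matched Laplace tail is about 0.014, and the gap widens rapidly with x.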
This statistical\n\ufb01nding is not surprising since motion discontinuities can arise in the natural environment due to the\nrelative motion of objects, foreground/background segmentation, and occlusion.\n\nStocker and Simoncelli [10] conducted a pioneering study to infer the functional form of the slow-\nness motion prior. More speci\ufb01cally, they used human subject responses in a speed discrimination\ntask to infer the shape of the slowness prior distribution. Their inferred slowness prior showed sig-\nni\ufb01cantly heavier tails than a Gaussian distribution. They showed that a motion model using this\ninferred prior provided an adequate \ufb01t to human data for a wide range of stimuli.\n\nFinally, the robustness of the L1-norm has also been demonstrated in many statistical applications\n(e.g., regression and feature selection). In the simplest case of linear regression, suppose we want\nto \ufb01nd the intercept with the constraint of zero slope. The regression with L1-norm regularization\nestimates the intercept based on the sample median, whereas the L2-norm regression estimates the\nintercept based on the sample mean. A single outlier has very little effect on the median but can\nalter the mean signi\ufb01cantly. Accordingly, the L1-norm regularization is less sensitive to outliers\nthan is the L2-norm. We illustrate this for motion estimation by the example in the right panel\nof \ufb01gure 2. If there is a motion boundary in the true motion \ufb01eld, then a model using L2-norm\nregularization (Gaussian priors) tends to impose strong smoothing over the two distinct motion\n\ufb01elds which blurs the motion across discontinuity. But the model with an L1-norm (Laplace prior)\npreserves the motion discontinuity and gives smooth motion \ufb02ow on both sides of it.\n\nFigure 2: Left plot, the Gaussian distribution (L2-norm regularization) and the Laplace distribution\n(L1-norm regularization). 
Right plot, an illustration of over-smoothing caused by using Gaussian priors.\n\n3 Mathematical Model\n\nThe input data is specified by local motion measurements ~uq = (u1q, u2q) at a discrete set of positions ~rq, q = 1, ..., N in the image plane. The goal is to find a smooth motion field ~v defined at all positions ~r in the image domain, estimated from the local motion measurements. The motion field ~v can be thought of as an interpolation of the data which obeys a slowness and smoothness prior and which agrees approximately with the local motion measurements. Recall that, because of the aperture problem, the visual system can only observe the local motion in the directions ~nq = ~uq/|~uq| (sometimes called component motion). Hence approximate agreement with local measurements reduces to the constraints:\n\n~v(~rq) · ~nq − ~uq · ~nq ≈ 0.\n\nAs illustrated in figure 3, we consider three motion prior terms which quantify the preference for slowness, first-order smoothness and second-order smoothness respectively. Let Ω denote the image domain – i.e. the set of points ~r = (r1, r2) ∈ Ω. 
We define the prior to be a Gibbs distribution with energy function of the form:\n\nE(~v) = ∫_Ω ( (λ/α)|~v|^α + (µ/β)|∇~v|^β + (η/γ)|Δ~v|^γ ) d~r,\n\nwhere λ, µ, η, α, β, γ are positive parameters and\n\n|~v| = √((v1)^2 + (v2)^2),\n\n|∇~v| = √((∂v1/∂r1)^2 + (∂v1/∂r2)^2 + (∂v2/∂r1)^2 + (∂v2/∂r2)^2),\n\n|Δ~v| = √((∂^2v1/∂r1^2)^2 + (∂^2v1/∂r2^2)^2 + (∂^2v2/∂r1^2)^2 + (∂^2v2/∂r2^2)^2).\n\nFigure 3: An illustration of three prior terms: (i) slowness, (ii) first-order smoothness, and (iii) second-order smoothness.\n\nThe (negative log) likelihood function for grating stimuli imposes the measurement constraints and is of the form:\n\nE(~u|~v) = Σ_{q=1}^{N} |~v(~rq) · ~nq − ~uq · ~nq|^p = Σ_{q=1}^{N} |~v(~rq) · ~nq − |~uq||^p.\n\nThe combined energy function to be minimized is:\n\ninf_~v { F(~v) = (c/p) E(~u|~v) + E(~v) }.\n\nThis energy is a convex function provided the exponents satisfy α, β, γ, p ≥ 1. Therefore the energy minimum can be found by imposing the first-order optimality conditions, ∂F(~v)/∂~v = 0 (the Euler-Lagrange equations). Below we compute these Euler-Lagrange partial differential equations in ~v = (v1, v2). We fix the likelihood term by setting p = 2 (the exponent of the likelihood term). If α, β, γ ≠ 2, the Euler-Lagrange equations are non-linear partial differential equations (PDEs) and explicit solutions cannot be found (if α, β, γ = 2 the Euler-Lagrange equations will be linear and so can be solved by Fourier transforms or Green’s functions, as previously done in [6]). 
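As a concrete illustration, the combined energy F(~v) can be evaluated on a discrete grid as sketched below. This is only a sketch under our own assumptions (forward differences via np.gradient, a 5-point Laplacian with periodic boundaries via np.roll, and illustrative default parameter values); it is not the authors' implementation, and the function and argument names are ours.

```python
import numpy as np

def energy(v, measurements, lam=0.01, mu=1.0, eta=100.0,
           alpha=1.0, beta=1.0, gamma=1.0, c=1.0, p=2, eps=1e-6):
    """Discrete analogue of F(v) = (c/p) E(u|v) + E(v).

    v: flow field of shape (H, W, 2); measurements: iterable of
    (i, j, n, u_dot_n) giving the component-motion constraint
    v(r_q) . n_q ~ u_q . n_q at sparse aperture locations.
    eps keeps the magnitudes away from zero (as in the paper).
    """
    v1, v2 = v[..., 0], v[..., 1]

    # slowness: (lam/alpha) * |v|^alpha
    speed = np.sqrt(v1 ** 2 + v2 ** 2 + eps)
    e = (lam / alpha) * np.sum(speed ** alpha)

    # first-order smoothness: (mu/beta) * |grad v|^beta
    d1, d2 = np.gradient(v1), np.gradient(v2)
    grad_mag = np.sqrt(d1[0] ** 2 + d1[1] ** 2 + d2[0] ** 2 + d2[1] ** 2 + eps)
    e += (mu / beta) * np.sum(grad_mag ** beta)

    # second-order smoothness: (eta/gamma) * |laplacian v|^gamma (5-point stencil)
    def lap(f):
        return (np.roll(f, 1, 0) + np.roll(f, -1, 0)
                + np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4.0 * f)

    lap_mag = np.sqrt(lap(v1) ** 2 + lap(v2) ** 2 + eps)
    e += (eta / gamma) * np.sum(lap_mag ** gamma)

    # likelihood: (c/p) * |v(r_q).n_q - u_q.n_q|^p at the measured apertures
    for i, j, n, u_dot_n in measurements:
        e += (c / p) * abs(v[i, j, 0] * n[0] + v[i, j, 1] * n[1] - u_dot_n) ** p
    return e
```

For example, with a zero flow field, a single aperture constraint with u·n = 1 adds (c/p)·|0 − 1|² = 0.5 to the energy.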
To solve these non-linear PDEs we discretize them by finite differences and use iterative gradient descent (i.e. we apply the dynamics ∂~v(~r, t)/∂t = −∂F(~v(~r, t))/∂~v(~r, t) until we reach a fixed state). More precisely, we initialize ~v(~r, 0) at random, and solve the update equation for t > 0:\n\n∂vk/∂t (~r, t) = −λ|~v|^{α−2} vk + µ div(|∇~v|^{β−2} ∇vk) − η Δ(|Δ~v|^{γ−2} Δvk) − c (~v(~rq) · ~nq − ~uq · ~nq)^{p−1} nkq δ_{~r,~rq},\n\nwhere k = 1, 2, and δ_{~r,~rq} = 1 if ~r = ~rq and δ_{~r,~rq} = 0 if ~r ≠ ~rq. Since the powers α − 2, β − 2, γ − 2 become negative when the positive exponents α, β, ... take value 1, we include a small ε = 10^{−6} inside the square roots to avoid division by zero (when calculating terms like |.|). The algorithm stops when the difference between two consecutive energy estimates is close to zero (i.e. the stopping criterion is based on thresholding the energy change).\n\nOur implementation discretized the Euler-Lagrange equations, as specified below. 
Let ~B(l) = |∇~v(l)|^{β−2}, ~C(l) = |Δ~v(l)|^{γ−2}, ~A(l) = |~v(l)|^{α−2}, where l denotes time discretization with Δt the time-step, and (i, j) denotes space discretization with h = Δr1 = Δr2 being the space-step. Then the above PDEs can be discretized as\n\n(vk(l+1)_{i,j} − vk(l)_{i,j}) / Δt = Fidk(l)_{i,j} − λ ~A(l)_{i,j} vk(l+1)_{i,j}\n+ (µ/h^2) [ ~B(l)_{i,j} vk(l)_{i+1,j} + ~B(l)_{i−1,j} vk(l)_{i−1,j} + ~B(l)_{i,j} vk(l)_{i,j+1} + ~B(l)_{i,j−1} vk(l)_{i,j−1} − (~B(l)_{i,j−1} + ~B(l)_{i−1,j} + 2~B(l)_{i,j}) vk(l+1)_{i,j} ]\n− (η/h^4) { (~C(l)_{i+1,j} + ~C(l)_{i−1,j} + 16~C(l)_{i,j} + ~C(l)_{i,j+1} + ~C(l)_{i,j−1}) vk(l+1)_{i,j}\n− 4[ (~C(l)_{i+1,j} + ~C(l)_{i,j}) vk(l)_{i+1,j} + (~C(l)_{i−1,j} + ~C(l)_{i,j}) vk(l)_{i−1,j} + (~C(l)_{i,j+1} + ~C(l)_{i,j}) vk(l)_{i,j+1} + (~C(l)_{i,j−1} + ~C(l)_{i,j}) vk(l)_{i,j−1} ]\n+ (~C(l)_{i+1,j} + ~C(l)_{i,j+1}) vk(l)_{i+1,j+1} + (~C(l)_{i+1,j} + ~C(l)_{i,j−1}) vk(l)_{i+1,j−1} + (~C(l)_{i−1,j} + ~C(l)_{i,j+1}) vk(l)_{i−1,j+1} + (~C(l)_{i−1,j} + ~C(l)_{i,j−1}) vk(l)_{i−1,j−1}\n+ ~C(l)_{i+1,j} vk(l)_{i+2,j} + ~C(l)_{i−1,j} vk(l)_{i−2,j} + ~C(l)_{i,j+1} vk(l)_{i,j+2} + ~C(l)_{i,j−1} vk(l)_{i,j−2} },\n\nwhere Fidk_{i,j} = −c (~v(~rq) · ~nq − ~uq · ~nq)^{p−1} nkq if ~rq = (i, j), and 0 otherwise. Letting\n\n~E1_{i,j} = ~B_{i,j−1} + ~B_{i−1,j} + 2~B_{i,j}, ~E2_{i,j} = ~C_{i+1,j} + ~C_{i−1,j} + 16~C_{i,j} + ~C_{i,j+1} + ~C_{i,j−1},\n~E3_{i,j} = ~C_{i+1,j} + ~C_{i,j}, ~E4_{i,j} = ~C_{i−1,j} + ~C_{i,j}, ~E5_{i,j} = ~C_{i,j+1} + ~C_{i,j}, ~E6_{i,j} = ~C_{i,j−1} + ~C_{i,j},\n~E7_{i,j} = ~C_{i+1,j} + ~C_{i,j+1}, ~E8_{i,j} = ~C_{i+1,j} + ~C_{i,j−1}, ~E9_{i,j} = ~C_{i−1,j} + ~C_{i,j+1}, ~E10_{i,j} = ~C_{i−1,j} + ~C_{i,j−1},\n~E11 = 1 / (1 + Δt(λ~A + (µ/h^2)~E1 + (η/h^4)~E2)),\n\nwe can solve for vk(l+1) and we obtain\n\nvk(l+1)_{i,j} = ~E11(l)_{i,j} ( vk(l)_{i,j} + Δt { Fidk(l)_{i,j} + (µ/h^2)(~B(l)_{i,j} vk(l)_{i+1,j} + ~B(l)_{i−1,j} vk(l)_{i−1,j} + ~B(l)_{i,j} vk(l)_{i,j+1} + ~B(l)_{i,j−1} vk(l)_{i,j−1})\n− (η/h^4) [ −4(~E3(l)_{i,j} vk(l)_{i+1,j} + ~E4(l)_{i,j} vk(l)_{i−1,j} + ~E5(l)_{i,j} vk(l)_{i,j+1} + ~E6(l)_{i,j} vk(l)_{i,j−1})\n+ ~E7(l)_{i,j} vk(l)_{i+1,j+1} + ~E8(l)_{i,j} vk(l)_{i+1,j−1} + ~E9(l)_{i,j} vk(l)_{i−1,j+1} + ~E10(l)_{i,j} vk(l)_{i−1,j−1}\n+ ~C(l)_{i+1,j} vk(l)_{i+2,j} + ~C(l)_{i−1,j} vk(l)_{i−2,j} + ~C(l)_{i,j+1} vk(l)_{i,j+2} + ~C(l)_{i,j−1} vk(l)_{i,j−2} ] } ).\n\n4 Experiments\n\nWe compared two possible functional forms for the motion prior: (1) the Laplace distribution with L1-norm regularization, with α = β = γ = 1; (2) the Gaussian distribution with L2-norm regularization, with α = β = γ = 2. Since the main goal of this work is to discover motion priors, we employed the same likelihood term with p = 2 for both models. We used the performance of human subjects in the first experimental session to estimate the weights of the three prior terms, λ, µ, η, for each functional form. 
We then validated the predictions of the model by comparing them with human performance in a second experimental session which uses different stimulus parameters.\n\n4.1 Stimulus\n\nWe used a multiple-aperture stimulus [13] which consists of 12 by 12 drifting sine-wave gratings within a square window subtending 8°. Each element (0.5°) was composed of an oriented sinusoidal grating of 5.6 cycles/deg spatial frequency, which was within a stationary Gaussian window. The contrast of the elements was 0.2. The motion stimulus included 20 time frames which were presented within 267 ms. The global motion stimulus was generated as follows. First, the orientation of each local grating element was randomly determined. Second, a global motion (also called 2D motion, with the speed of 1 deg/sec) direction was chosen. Third, a certain proportion of elements (signal elements) were assigned the predetermined 2D motion, while each of the remaining elements (noise elements) was assigned a random 2D motion. Finally, with its orientation and 2D motion velocity, the drifting speed for each element was computed so that the local (or component) drifting velocity was consistent with the assigned 2D motion velocity. As shown in figure 4, the global motion strength was controlled by varying the proportion of signal elements in the stimulus (i.e., the coherence ratio). Stimuli with high ratio exhibited more coherent motion, and stimuli with low ratio exhibited more random motion.\n\nIn all the experiments reported in this paper, each participant completed two experiment sessions with different stimulus parameters. The goal of session 1 was parameter estimation: to estimate the weights of the three prior terms – slowness, first-order smoothness and second-order smoothness – for each model. 
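The four stimulus-generation steps above can be sketched as follows. This is an illustrative reconstruction under our own assumptions (grid size, random seed, variable names, and the convention that the grating normal is the orientation rotated by 90°), not the actual experiment code.

```python
import numpy as np

rng = np.random.default_rng(0)

n_elements = 12 * 12   # 12-by-12 grid of grating elements
coherence = 0.7        # proportion of signal elements
speed_2d = 1.0         # global 2D speed, deg/sec

# Step 1: random orientation for each local grating element.
orientations = rng.uniform(0.0, np.pi, n_elements)

# Step 2: choose a global 2D motion direction (here: upward).
signal_dir = np.pi / 2

# Step 3: assign the global 2D motion to signal elements and a random
# 2D motion to each noise element.
is_signal = rng.random(n_elements) < coherence
directions = np.where(is_signal, signal_dir,
                      rng.uniform(0.0, 2.0 * np.pi, n_elements))

# Step 4: component (normal) drift speed consistent with the assigned 2D
# velocity: the projection of the 2D velocity onto the grating normal.
normals = orientations + np.pi / 2
component_speed = speed_2d * np.cos(directions - normals)
```

The projection in step 4 is why each element's drift speed never exceeds the assigned 2D speed, and why a single aperture is ambiguous: many 2D velocities share the same component speed.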
Session 2 was for model validation: using the weights estimated from session 1 to predict subject performance for different experimental conditions.\n\nFigure 4: Stimulus illustration. Multiple-aperture stimuli with coherence ratio of 0, 0.4, 0.8 and 1 from left to right. The blue and green arrows indicate the 2D motion directions assigned for signal and noise elements, respectively.\n\n4.2 Experiment 1\n\n4.2.1 Procedure\n\nThere were two separate sessions in Experiment 1. On each trial of the first session, observers were presented with two motion patterns, one after another. The first one was the reference motion pattern, which always moved upward (0 degrees), and the second one was the test motion pattern, whose global motion direction was either tilted towards the left or the right relative to the reference pattern. Both patterns lasted for 267 ms with a 500 ms inter-stimulus interval. The observer’s task was to determine whether the global motion direction of the test pattern was more towards the left or right relative to the reference pattern. In order to make sure observers understood the task and were able to perceive the global motion, before the beginning of the first session, observers passed a test session in which they achieved 90% accuracy in 40 consecutive trials with 80% coherence and 20 (or 45) degrees of angular difference. To allow observers to familiarize themselves with the task, before each experimental session observers went through a practice session with 10 blocks of 25 trials.\n\nThe first session consisted of 20 blocks of 50 trials. The coherence ratio was constant within each block. The observer’s discrimination performance was measured for ten coherence ratios (0, 0.1, 0.2, ..., 0.9) in the first session. The angular difference between the reference and test motion was fixed for each observer in the entire session (2 degrees for observers AL, MW and AE; 45 degrees for OQ and CC). 
The second session was identical to the first one, except that the coherence ratio was fixed at 0.7, and the angular difference between the global motion directions of the reference and the test patterns was varied across blocks (ten angular differences: 1, 5, 10, ..., 45 degrees).\n\n4.2.2 Results\n\nWe implemented motion models with the Laplace prior distribution (termed the “L1 model”) and the Gaussian prior (termed the “L2 model”). As the first step, an exhaustive search was conducted to find a set of weights for the prior terms that provided the best fit to the human psychometric performance in experimental session 1. Table 1 reports the estimated parameters for each individual subject using the L1 and L2 models. There were clear individual differences in the estimated weight values. However, across all five subjects, large weight values were found for the second-order smoothness terms, indicating that the contribution from the higher-order smoothness preference is important in perceiving global motion from multiple-aperture stimuli.\n\nFigure 5 shows the results from each individual participant and the best-fitting model performance. The results clearly show the L1 model provided the better fit to human data when compared to the L2 model. 
In general, humans appear to be sensitive to the inclusion of noise elements, and perform worse than the L2 model predicts, since the L2 model tends to strongly encourage smoothness over the entire display window.\n\nTable 1: Estimated weights λ, µ, η of the slowness, first-order smoothness and second-order smoothness prior terms, for the L1 and L2-norm models\n\nSubjects | L1 λ | L1 µ | L1 η | L2 λ | L2 µ | L2 η\nAE | 0.001 | 1 | 15000 | 0.01 | 100 | 16000\nAL | 0.01 | 100 | 16000 | 0.01 | 1 | 16000\nCC | 0.001 | 0.1 | 16000 | 0.001 | 0.1 | 16000\nMW | 0.001 | 10 | 17000 | 0.01 | 1 | 20000\nOQ | 0.01 | 100 | 18000 | 0.01 | 100 | 18000\n\nIn experimental session 2, the two models predicted performance as a function of angular difference between the reference motion and the test motion. As shown in figure 6, the L1 model yielded less error in fitting human performance than did the L2 model. This result illustrates the power of the L1 model in predicting human performance in motion tasks different from the tasks used for estimating model parameters.\n\n[Figure 5 plots: accuracy vs. coherence ratio for subjects AE and CC, with human, L1-model and L2-model curves; model error for subjects AL, AE, CC, MW, OQ.]\n\nFigure 5: Comparison between human performance and model predictions in session 1. Left two plots, accuracy as a function of coherence ratio for two representative subjects. Blue solid lines indicate human performance. Red and green dashed lines indicate L1 and L2 model predictions with the best-fitted parameters. Right plot, model error for all five subjects. 
The model error was computed as the mean absolute difference between human performance and model predictions. The L1 model consistently fits human performance better than the L2 model for all subjects.\n\n[Figure 6 plots: accuracy vs. angular difference for subjects AE and CC, with human, L1-model and L2-model curves; model error for subjects AL, AE, CC, MW, OQ.]\n\nFigure 6: Comparison between human performance and model predictions in session 2. Left two plots, accuracy as a function of angular difference between the reference and the test motion for two representative subjects. Blue solid lines indicate human performance. Red and green dashed lines indicate L1 and L2 model predictions. Right plot, model error for all five subjects. The lower errors of the L1 model indicate that it consistently fits human performance better than the L2 model for all subjects.\n\n4.3 Experiment 2\n\nThe results of Experiment 1 clearly support the conclusion that the motion model with the Laplace prior (L1-norm regularization) fits human performance better than does the model with the Gaussian prior (L2 model). In Experiment 2, we compared human motion judgment with predictions of the L1 model on each trial, rather than using the average performance as in Experiment 1. 
Such a detailed comparison can provide quantitative measures of how well the L1 model is able to predict human motion judgment for specific stimuli.\n\nIn Experiment 2, the first session was identical to that in Experiment 1, in which the angular difference between the two global motion directions was fixed (45 degrees for all observers) while the coherence ratio was varied. In the second session, observers were presented with one motion stimulus on each trial. The global motion direction of the pattern was randomly selected from 24 possible directions (with a 15-degree difference between two adjacent directions). Observers reported their perceived global motion directions by rotating a line after the motion stimulus disappeared from the screen. The experiment included 12 blocks (each with 48 trials) and six coherence ratios (0, 0.1, 0.3, ..., 0.9). A two-pass design was used to let each observer run the identical session twice in order to measure the reliability of the observer’s judgments.\n\nWe used human performance in session 1 to estimate the model parameters: weights λ, µ, η for the slowness, first-order smoothness and second-order smoothness prior terms for each individual participant. Since identical stimuli were used in the two runs of session 2, we can quantify the reliability of the observer’s judgment by computing the response correlation across trials in these two runs. As shown in the left plot of figure 7, human observers’ responses were significantly correlated in the two runs, even in the condition of random motion (coherence ratio close to 0). The correlated responses in these subthreshold conditions suggest that human observers are able to provide a consistent interpretation of the motion flow, even when the motion is random. The right plot of figure 7 shows the trial-by-trial correlation between human motion judgments and the model-predicted global motion direction. 
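A trial-by-trial correlation between two sets of reported directions can be computed as sketched below. Since direction is a circular variable, this sketch correlates the unit vectors of the reported angles rather than the raw angles; this is one simple choice we made for illustration, not necessarily the analysis used in the paper.

```python
import numpy as np

def direction_correlation(a_deg, b_deg):
    """Pearson correlation between the unit vectors of two sets of
    directions (degrees), treating cosine and sine components jointly."""
    a = np.deg2rad(np.asarray(a_deg, dtype=float))
    b = np.deg2rad(np.asarray(b_deg, dtype=float))
    va = np.concatenate([np.cos(a), np.sin(a)])
    vb = np.concatenate([np.cos(b), np.sin(b)])
    return float(np.corrcoef(va, vb)[0, 1])

# identical reports in both passes give correlation 1;
# reports rotated by 90 degrees are uncorrelated under this measure
print(direction_correlation([0, 90, 180, 270], [0, 90, 180, 270]))
print(direction_correlation([0, 90, 180, 270], [90, 180, 270, 360]))
```

Correlating unit vectors avoids the wrap-around artifact of raw angles, where reports of 1° and 359° would look maximally different despite being nearly identical directions.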
The model-human correlations were comparable to human self-correlations. Even in the random motion condition (where the coherence ratio is 0), the correlation between the model and human judgments is greater than 0.5, indicating the predictive power of the model. We also noticed that the correlation between humans and the L2 model was around 8 percent worse than the human self-correlation and the correlation between the L1 model and humans. This finding further demonstrated that the L1 model provided a better fit to human data than did the L2 model.\n\n[Figure 7 plots: human self-correlation and human-model correlation as a function of coherence ratio for subjects AP, MS, SG and XD.]\n\nFigure 7: Comparison between human performance and model predictions using trial-by-trial correlation. Left plot, human self-correlation between two runs of identical experimental sessions. Right plot, correlation between human motion judgment and model-predicted global motion direction. The significant correlation between humans and the model indicates that the L1 model is able to predict human motion judgment for specific stimuli, even in the random display, i.e., coherence ratio close to 0.\n\n5 Conclusions\n\nWe found that a motion prior in the form of the Laplace distribution with L1-norm regularization provided significantly better agreement with human performance than did Gaussian priors with the L2-norm. We also showed that humans weighted second-order motion smoothness much higher than first-order smoothness and slowness. 
Furthermore, model predictions using this Laplace prior were consistent with human perception of coherent motion, even for random displays. Overall our results suggest that human motion perception for these types of stimuli can be well modeled using Laplace priors.\n\nAcknowledgments\n\nThis research was supported by NSF grants IIS-613563 to AY and BCS-0843880 to HL.\n\nReferences\n\n[1] R. Sekuler, S.N.J. Watamaniuk and R. Blake. Perception of Visual Motion. In Steven’s Handbook of Experimental Psychology. Third edition. H. Pashler, series editor. S. Yantis, volume editor. J. Wiley Publishers. New York. 2002.\n\n[2] L. Welch. The perception of moving plaids reveals two processing stages. Nature, 337, 734-736. 1989.\n\n[3] P. Schrater, D. Knill and E. Simoncelli. Mechanisms of visual motion detection. Nature Neuroscience, 3, 64-68. 2000.\n\n[4] J. A. Movshon and W. T. Newsome. Visual response properties of striate cortical neurons projecting to area MT in macaque monkeys. Visual Neuroscience, 16, 7733-7741. 1996.\n\n[5] N. C. Rust, V. Mante, E. P. Simoncelli and J. A. Movshon. How MT cells analyze the motion of visual patterns. Nature Neuroscience, 9(11), 1421-1431. 2006.\n\n[6] A.L. Yuille and N.M. Grzywacz. A computational theory for the perception of coherent visual motion. Nature, 333, 71-74. 1988.\n\n[7] A.L. Yuille and N.M. Grzywacz. A Mathematical Analysis of the Motion Coherence Theory. International Journal of Computer Vision, 3, pp. 155-175. 1989.\n\n[8] Y. Weiss, E.P. Simoncelli, and E.H. Adelson. Motion illusions as optimal percepts. Nature Neuroscience, 5, 598-604. 2002.\n\n[9] H. Lu and A.L. Yuille. Ideal Observers for Detecting Motion: Correspondence Noise. Advances in Neural Information Processing Systems 7, pp. 827-834. 2005.\n\n[10] A.A. Stocker and E.P. Simoncelli. Noise characteristics and prior expectations in human visual speed perception. Nature Neuroscience, 9(4), pp. 578-585, 2006.\n\n[11] S. 
Roth and M. J. Black. On the spatial statistics of optical flow. International Journal of Computer Vision, 74(1), pp. 33-50, 2007.\n\n[12] C. Liu, W. T. Freeman, E. H. Adelson and Y. Weiss. Human-assisted motion annotation. IEEE Conference on Computer Vision and Pattern Recognition, 2008.\n\n[13] Amano, K., Edwards, M., Badcock, D. R. and Nishida, S. Adaptive pooling of visual motion signals by the human visual system revealed with a novel multi-element stimulus. Journal of Vision, 9(3), 4, 1-25, 2009.\n", "award": [], "sourceid": 847, "authors": [{"given_name": "Hongjing", "family_name": "Lu", "institution": null}, {"given_name": "Tungyou", "family_name": "Lin", "institution": null}, {"given_name": "Alan", "family_name": "Lee", "institution": null}, {"given_name": "Luminita", "family_name": "Vese", "institution": null}, {"given_name": "Alan", "family_name": "Yuille", "institution": null}]}