{"title": "Constraining a Bayesian Model of Human Visual Speed Perception", "book": "Advances in Neural Information Processing Systems", "page_first": 1361, "page_last": 1368, "abstract": null, "full_text": "Constraining a Bayesian Model of Human Visual\n Speed Perception\n\n\n\n Alan A. Stocker and Eero P. Simoncelli\n Howard Hughes Medical Institute,\n Center for Neural Science, and Courant Institute of Mathematical Sciences\n New York University, U.S.A.\n\n\n\n Abstract\n\n It has been demonstrated that basic aspects of human visual motion per-\n ception are qualitatively consistent with a Bayesian estimation frame-\n work, where the prior probability distribution on velocity favors slow\n speeds. Here, we present a refined probabilistic model that can account\n for the typical trial-to-trial variabilities observed in psychophysical speed\n perception experiments. We also show that data from such experiments\n can be used to constrain both the likelihood and prior functions of the\n model. Specifically, we measured matching speeds and thresholds in a\n two-alternative forced choice speed discrimination task. Parametric fits\n to the data reveal that the likelihood function is well approximated by\n a LogNormal distribution with a characteristic contrast-dependent vari-\n ance, and that the prior distribution on velocity exhibits significantly\n heavier tails than a Gaussian, and approximately follows a power-law\n function.\n\n\nHumans do not perceive visual motion veridically. Various psychophysical experiments\nhave shown that the perceived speed of visual stimuli is affected by stimulus contrast,\nwith low contrast stimuli being perceived to move slower than high contrast ones [1, 2].\nComputational models have been suggested that can qualitatively explain these perceptual\neffects. 
Commonly, they assume the perception of visual motion to be optimal either within\na deterministic framework with a regularization constraint that biases the solution toward\nzero motion [3, 4], or within a probabilistic framework of Bayesian estimation with a prior\nthat favors slow velocities [5, 6].\n\nThe solutions resulting from these two frameworks are similar (and in some cases identical),\nbut the probabilistic framework provides a more principled formulation of the problem\nin terms of meaningful probabilistic components. Specifically, Bayesian approaches rely\non a likelihood function that expresses the relationship between the noisy measurements\nand the quantity to be estimated, and a prior distribution that expresses the probability of\nencountering any particular value of that quantity. A probabilistic model can also provide a\nricher description, by defining a full probability density over the set of possible \"percepts\",\nrather than just a single value. Numerous analyses of psychophysical experiments have\nmade use of such distributions within the framework of signal detection theory in order to\nmodel perceptual behavior [7].\n\n[Figure 1 panels: a) high contrast, b) low contrast; each plots probability density against visual speed and shows the prior, likelihood, and posterior.]\n\nFigure 1: Bayesian model of visual speed perception. a) For a high contrast stimulus, the\nlikelihood has a narrow width (a high signal-to-noise ratio) and the prior induces only a\nsmall shift of the mean $\\hat{v}$ of the posterior. b) For a low contrast stimulus, the measurement\nis noisy, leading to a wider likelihood. The shift is much larger and the perceived speed\nlower than under condition (a).\n\nPrevious work has shown that an ideal Bayesian observer model based on Gaussian forms\nfor both likelihood and prior is sufficient to capture the basic qualitative features of global\ntranslational motion perception [5, 6]. 
But the behavior of the resulting model deviates\nsystematically from human perceptual data, most importantly with regard to trial-to-trial\nvariability and the precise form of interaction between contrast and perceived speed. A\nrecent article achieved better fits for the model under the assumption that human contrast\nperception saturates [8]. In order to advance the theory of Bayesian perception and provide\nsignificant constraints on models of neural implementation, it seems essential to constrain\nquantitatively both the likelihood function and the prior probability distribution. In previous\nwork, the proposed likelihood functions were derived from the brightness constancy con-\nstraint [5, 6] or other generative principles [9]. Also, previous approaches defined the prior\ndistribution based on general assumptions and computational convenience, typically choos-\ning a Gaussian with zero mean, although a Laplacian prior has also been suggested [4]. In\nthis paper, we develop a more general form of Bayesian model for speed perception that\ncan account for trial-to-trial variability. We use psychophysical speed discrimination data\nin order to constrain both the likelihood and the prior function.\n\n\n1 Probabilistic Model of Visual Speed Perception\n\n1.1 Ideal Bayesian Observer\n\nAssume that an observer wants to obtain an estimate for a variable v based on a measure-\nment m that she/he performs. A Bayesian observer \"knows\" that the measurement device\nis not ideal and therefore, the measurement m is affected by noise. Hence, this observer\ncombines the information gained by the measurement m with a priori knowledge about v.\nDoing so (and assuming that the prior knowledge is valid), the observer will on average \nperform better in estimating v than just trusting the measurements m. 
According to Bayes'\nrule\n\n $$p(v|m) = \\frac{1}{\\alpha} p(m|v) p(v) \\qquad (1)$$\n\nthe probability of perceiving v given m (posterior) is the product of the likelihood of v for\na particular measurement m and the a priori knowledge about the estimated variable v\n(prior). $\\alpha$ is a normalization constant independent of v that ensures that the posterior is a\nproper probability distribution.\n\n[Figure 2 panels: a) stimulus display; b) psychometric curve plotting $P(\\hat{v}_2 > \\hat{v}_1)$ from 0 to 1 against v2, with vmatch marked at Pcum = 0.5 and vthresh at Pcum = 0.875.]\n\nFigure 2: 2AFC speed discrimination experiment. a) Two patches of drifting gratings were\ndisplayed simultaneously (motion without movement). The subject was asked to fixate\nthe center cross and decide after the presentation which of the two gratings was moving\nfaster. b) A typical psychometric curve obtained under such a paradigm. The dots represent\nthe empirical probability that the subject perceived stimulus2 moving faster than stimulus1.\nThe speed of stimulus1 was fixed while v2 was varied. The point of subjective equality, vmatch,\nis the value of v2 for which Pcum = 0.5. The threshold velocity vthresh is the velocity for\nwhich Pcum = 0.875.\n\n\nIt is important to note that the measurement m is an internal variable of the observer and\nis not necessarily represented in the same space as v. The likelihood embodies both the\nmapping from v to m and the noise in this mapping. For now, we assume that there is a\nmonotonic function $f(v) : v \\rightarrow v_m$ that maps v into the same space as m (m-space).\nDoing so allows us to analytically treat m and $v_m$ in the same space. We will later propose\na suitable form of the mapping function f(v).\n\nAn ideal Bayesian observer selects the estimate that minimizes the expected loss, given the\nposterior and a loss function. We assume a least-squares loss function. Then, the optimal\nestimate $\\hat{v}$ is the mean of the posterior in Equation (1). It is easy to see why this model\nof a Bayesian observer is consistent with the fact that perceived speed decreases with\ndecreasing contrast. 
The width of the likelihood varies inversely with the accuracy of the measurements\nperformed by the observer, which presumably decreases with decreasing contrast due to\na decreasing signal-to-noise ratio. As illustrated in Figure 1, the shift in perceived speed\ntowards slow velocities grows with the width of the likelihood, and thus a Bayesian model\ncan qualitatively explain the psychophysical results [1].\n\n\n1.2 Two Alternative Forced Choice Experiment\n\nWe would like to examine perceived speeds under a wide range of conditions in order to\nconstrain a Bayesian model. Unfortunately, perceived speed is an internal variable, and it is\nnot obvious how to design an experiment that would allow subjects to express it directly\n(although see [10] for an example of determining and even changing the prior of a Bayesian\nmodel for a sensorimotor task, where the estimates are more directly accessible).\nPerceived speed can only be accessed indirectly by asking the subject to compare the speeds\nof two stimuli. For a given trial, an ideal Bayesian observer in such a two-alternative forced\nchoice (2AFC) experimental paradigm simply decides on the basis of the two trial estimates\n$\\hat{v}_1$ (stimulus1) and $\\hat{v}_2$ (stimulus2) which stimulus moves faster. Each estimate $\\hat{v}$ is based\non a particular measurement m. For a given stimulus with speed v, an ideal Bayesian\nobserver will produce a distribution of estimates $p(\\hat{v}|v)$ because m is noisy. Over trials,\nthe observer's behavior can be described by classical signal detection theory based on the\ndistributions of the estimates. For example, the probability of perceiving stimulus2 moving\nfaster than stimulus1 is given by the cumulative probability\n\n $$P_{cum}(\\hat{v}_2 > \\hat{v}_1) = \\int_0^{\\infty} p(\\hat{v}_2|v_2) \\int_0^{\\hat{v}_2} p(\\hat{v}_1|v_1) \\, d\\hat{v}_1 \\, d\\hat{v}_2 \\qquad (2)$$\n\nPcum describes the full psychometric curve. 
Figure 2b illustrates the measured psychometric\ncurve and its fit from such an experimental situation.\n\n\n2 Experimental Methods\n\nWe measured matching speeds (Pcum = 0.5) and thresholds (Pcum = 0.875) in a 2AFC\nspeed discrimination task. Subjects were presented simultaneously with two circular\npatches of horizontally drifting sine-wave gratings for the duration of one second\n(Figure 2a). Patches were 3 deg in diameter, and were displayed at 6 deg eccentricity to either\nside of a fixation cross. The stimuli had an identical spatial frequency of 1.5 cycles/deg. One\nstimulus was considered to be the reference stimulus having one of two different contrast\nvalues (c1=[0.075 0.5]) and one of five different speed values (u1=[1 2 4 8 12] deg/sec)\nwhile the second stimulus (test) had one of five different contrast values (c2=[0.05 0.1 0.2\n0.4 0.8]) and a varying speed that was determined by an interleaved staircase procedure.\nFor each condition there were 96 trials. Conditions were randomly interleaved, including\na random choice of stimulus identity (test vs. reference) and motion direction (right vs.\nleft). Subjects were asked to fixate during stimulus presentation and select the faster moving\nstimulus. The threshold experiment differed only in that auditory feedback was given\nto indicate the correctness of their decision. This did not change the outcome of the\nexperiment but significantly increased the quality of the data and thus reduced the number of\ntrials needed.\n\n\n3 Analysis\n\nWith the data from the speed discrimination experiments we could in principle apply a\nparametric fit using Equation (2) to derive the prior and the likelihood, but the optimization\nis difficult, and the fit might not be well constrained given the amount of data we have\nobtained. 
The problem becomes much more tractable given the following weak assumptions:\n\n • We consider the prior to be relatively smooth.\n • We assume that the measurement m is corrupted by additive Gaussian noise with\n a variance whose dependence on stimulus speed and contrast is separable.\n • We assume that there is a mapping function $f(v) : v \\rightarrow v_m$ that maps v into the\n space of m (m-space). In that space, the likelihood is convolutional, i.e. the noise\n in the measurement directly defines the width of the likelihood.\n\nThese assumptions allow us to relate the psychophysical data to our probabilistic model in\na simple way. The following analysis is in the m-space. The point of subjective equality\n(Pcum = 0.5) is defined as the point where the expected values of the speed estimates are equal. We\nwrite\n\n $$E\\langle \\hat{v}_{m,1} \\rangle = E\\langle \\hat{v}_{m,2} \\rangle \\qquad (3)$$\n $$v_{m,1} - E\\langle \\Delta_1 \\rangle = v_{m,2} - E\\langle \\Delta_2 \\rangle$$\n\nwhere $E\\langle \\Delta \\rangle$ is the expected shift of the perceived speed compared to the veridical speed.\nFor the discrimination threshold experiment, the above assumptions imply that the variance\n$\\mathrm{var}\\langle \\hat{v}_m \\rangle$ of the speed estimates $\\hat{v}_m$ is equal for both stimuli. Then, (2) predicts that the\ndiscrimination threshold is proportional to the standard deviation, thus\n\n $$v_{m,2} - v_{m,1} = \\gamma \\sqrt{\\mathrm{var}\\langle \\hat{v}_m \\rangle} \\qquad (4)$$\n\nwhere $\\gamma$ is a constant that depends on the threshold criterion Pcum and the exact shape of\n$p(\\hat{v}_m|v_m)$.\n\n[Figure 3: sketch of the Gaussian likelihood and the locally linear prior, with parameters a and b, plotted over $v_m$.]\n\nFigure 3: Piece-wise approximation. We perform a parametric fit by assuming the prior to\nbe piece-wise linear and the likelihood to be LogNormal (Gaussian in the m-space).\n\n3.1 Estimating the prior and likelihood\n\nIn order to extract the prior and the likelihood of our model from the data, we have to find\na generic local form of the prior and the likelihood and relate them to the mean and the\nvariance of the speed estimates. As illustrated in Figure 3, we assume that the likelihood is\nGaussian with a standard deviation $\\sigma(c, v_m)$. 
Furthermore, the prior is assumed to be well\napproximated by a first-order Taylor series expansion over the velocity ranges covered by\nthe likelihood. We parameterize this linear expansion of the prior as $p(v_m) = a v_m + b$.\n\nWe can now derive a posterior for this local approximation of likelihood and prior and then\ndefine the perceived speed shift $\\Delta(m)$. The posterior can be written as\n\n $$p(v_m|m) = \\frac{1}{\\alpha} p(m|v_m) p(v_m) = \\frac{1}{\\alpha} \\left[ \\exp\\left( -\\frac{v_m^2}{2\\sigma(c, v_m)^2} \\right) (a v_m + b) \\right] \\qquad (5)$$\n\nwhere $\\alpha$ is the normalization constant\n\n $$\\alpha = \\int_{-\\infty}^{\\infty} p(m|v_m) p(v_m) \\, dv_m = b \\sqrt{2\\pi \\sigma(c, v_m)^2} \\qquad (6)$$\n\nWe can compute $\\Delta(m)$ as the first-order moment of the posterior for a given m. Exploiting\nthe symmetries around the origin, we find\n\n $$\\Delta(m) = \\int_{-\\infty}^{\\infty} v_m \\, p(v_m|m) \\, dv_m = \\frac{a(m)}{b(m)} \\sigma(c, v_m)^2 \\qquad (7)$$\n\nThe expected value of $\\Delta(m)$ is equal to the value of $\\Delta$ at the expected value of the\nmeasurement m (which is the stimulus velocity $v_m$), thus\n\n $$E\\langle \\Delta \\rangle = \\Delta(m)|_{m=v_m} = \\frac{a(v_m)}{b(v_m)} \\sigma(c, v_m)^2 \\qquad (8)$$\n\nSimilarly, we derive $\\mathrm{var}\\langle \\hat{v}_m \\rangle$. Because the estimator is deterministic, the variance of the\nestimate only depends on the variance of the measurement m. For a given stimulus, the\nvariance of the estimate can be well approximated by\n\n $$\\mathrm{var}\\langle \\hat{v}_m \\rangle = \\mathrm{var}\\langle m \\rangle \\left( \\left. \\frac{\\partial \\hat{v}_m(m)}{\\partial m} \\right|_{m=v_m} \\right)^2 \\qquad (9)$$\n $$= \\mathrm{var}\\langle m \\rangle \\left( 1 - \\left. \\frac{\\partial \\Delta(m)}{\\partial m} \\right|_{m=v_m} \\right)^2 \\approx \\mathrm{var}\\langle m \\rangle$$\n\nUnder the assumption of a locally smooth prior, the perceived velocity shift remains locally\nconstant. The variance of the perceived speed $\\hat{v}_m$ becomes equal to the variance of the\nmeasurement m, which is the variance of the likelihood (in the m-space), thus\n\n $$\\mathrm{var}\\langle \\hat{v}_m \\rangle = \\sigma(c, v_m)^2 \\qquad (10)$$\n\nWith (3) and (4), the above derivations provide a simple dependency of the psychophysical\ndata on the local parameters of the likelihood and the prior.\n\n\n3.2 Choosing a Logarithmic speed representation\n\nWe now want to choose the appropriate mapping function f(v) that maps v to the m-space.\nWe define the m-space as the space in which the likelihood is Gaussian with a\nspeed-independent width. 
We have shown that the discrimination threshold is proportional to the\nwidth of the likelihood (4), (10). Also, we know from the psychophysics literature that\nvisual speed discrimination approximately follows a Weber-Fechner law [11, 12], that is,\nthe discrimination threshold increases roughly in proportion to speed, and so would the\nwidth of the likelihood. A logarithmic speed representation would be compatible with the data and our\nchoice of the likelihood. Hence, we transform the linear speed-domain v into a normalized\nlogarithmic domain according to\n\n $$v_m = f(v) = \\ln\\left( \\frac{v + v_0}{v_0} \\right) \\qquad (11)$$\n\nwhere $v_0$ is a small normalization constant. The normalization is chosen to account for\nthe expected deviation from equal-variance behavior at the low end. Surprisingly, it has been\nfound that neurons in the Medial Temporal area (Area MT) of macaque monkeys have\nspeed-tuning curves that are very well approximated by Gaussians of constant width in\nthe above normalized logarithmic space [13]. These neurons are known to play a central role\nin the representation of motion. It seems natural to assume that they are strongly involved\nin tasks such as our psychophysical experiments.\n\n\n4 Results\n\nFigure 4 shows the contrast dependent shift of speed perception and the speed discrimination\nthreshold data for two subjects. Data points connected with a dashed line represent\nthe relative matching speed (v2/v1) for a particular contrast value c2 of the test stimulus\nas a function of the speed of the reference stimulus. Error bars are the empirical standard\ndeviation of fits to bootstrapped samples of the data. Clearly, low contrast stimuli\nare perceived to move slower. The effect, however, varies across the tested speed range\nand tends to become smaller for higher speeds. The relative discrimination thresholds for\ntwo different contrasts as a function of speed show that the Weber-Fechner law holds only\napproximately. 
The data are in good agreement with other data from the psychophysics\nliterature [1, 8, 11].\n\nFor each subject, data from both experiments were used to compute a parametric least-squares\nfit according to (3), (4), (7), and (10). In order to test the assumption of a LogNormal\nlikelihood we allowed the standard deviation to be dependent on contrast and speed,\nthus $\\sigma(c, v_m) = g(c) h(v_m)$. We split the speed range into six bins (subject2: five) and\nparameterized $h(v_m)$ and the ratio a/b accordingly. Similarly, we parameterized g(c) for\nthe seven contrast values. The resulting fits are superimposed as bold lines in Figure 4.\n\nFigure 5 shows the fitted parametric values for g(c) and h(v) (plotted in the linear domain),\nand the reconstructed prior distribution p(v) transformed back to the linear domain. The\napproximately constant values for h(v) provide evidence that a LogNormal distribution\nis an appropriate functional description of the likelihood. The resulting values for g(c)\nsuggest that the likelihood width decays roughly exponentially with contrast,\nwith strong saturation at higher contrasts.\n\n[Figure 4 panels, one row per subject (subject 1, subject 2): a) normalized matching speed against speed of reference stimulus [deg/sec], for reference stimulus contrast c1 = 0.075 and c1 = 0.5, one curve per test contrast c2; b) relative discrimination threshold against stimulus speed [deg/sec].]\n\nFigure 4: Speed discrimination data for two subjects. a) The relative matching speed of\na test stimulus with different contrast levels (c2=[0.05 0.1 0.2 0.4 0.8]) to achieve subjective\nequality with a reference stimulus (two different contrast values c1). 
b) The relative\ndiscrimination threshold for two stimuli with equal contrast (c1,2=[0.075 0.5]).\n\n[Figure 5 panels, one row per subject (subject 1, subject 2): reconstructed prior p(v) [unnormalized] against speed [deg/sec], with Gaussian and power-law fits (subject 1: n=-1.41; subject 2: n=-1.35); g(c) against contrast; h(v) against speed [deg/sec].]\n\nFigure 5: Reconstructed prior distribution and parameters of the likelihood function. The\nreconstructed priors for both subjects show much heavier tails than a Gaussian (dashed fit),\napproximately following a power-law function with exponent $n \\approx -1.4$ (bold line).\n\n\n5 Conclusions\n\nWe have proposed a probabilistic framework based on a Bayesian ideal observer and standard\nsignal detection theory. We have derived a likelihood function and prior distribution\nfor the estimator, with a fairly conservative set of assumptions, constrained by psychophysical\nmeasurements of speed discrimination and matching. The width of the resulting likelihood\nis nearly constant in the logarithmic speed domain, and decreases approximately\nexponentially with contrast. The prior expresses a preference for slower speeds, and\napproximately follows a power-law distribution, and thus has much heavier tails than a Gaussian.\n\nIt would be interesting to compare the prior distributions derived here with measured true\ndistributions of local image velocities that impinge on the retina. Although a number of\nauthors have measured the spatio-temporal structure of natural images [e.g., 14], it is\nclearly difficult to extract the true prior distribution from these measurements because of the feedback loop\nformed through movements of the body, head and eyes.\n\n\nAcknowledgments\n\nThe authors thank all subjects for their participation in the psychophysical experiments.\n\n\nReferences\n\n [1] P. Thompson. 
Perceived rate of movement depends on contrast. Vision Research, 22:377–380,\n 1982.\n\n [2] L.S. Stone and P. Thompson. Human speed perception is contrast dependent. Vision Research,\n 32(8):1535–1549, 1992.\n\n [3] A. Yuille and N. Grzywacz. A computational theory for the perception of coherent visual\n motion. Nature, 333(5):71–74, May 1988.\n\n [4] Alan Stocker. Constraint Optimization Networks for Visual Motion Perception - Analysis and\n Synthesis. PhD thesis, Dept. of Physics, Swiss Federal Institute of Technology, Zurich,\n Switzerland, March 2002.\n\n [5] Eero Simoncelli. Distributed analysis and representation of visual motion. PhD thesis, MIT,\n Dept. of Electrical Engineering, Cambridge, MA, 1993.\n\n [6] Y. Weiss, E. Simoncelli, and E. Adelson. Motion illusions as optimal percept. Nature\n Neuroscience, 5(6):598–604, June 2002.\n\n [7] D.M. Green and J.A. Swets. Signal Detection Theory and Psychophysics. Wiley, New York,\n 1966.\n\n [8] F. Hurlimann, D. Kiper, and M. Carandini. Testing the Bayesian model of perceived speed.\n Vision Research, 2002.\n\n [9] Y. Weiss and D.J. Fleet. Probabilistic Models of the Brain, chapter Velocity Likelihoods in\n Biological and Machine Vision, pages 77–96. Bradford, 2002.\n\n[10] K. Koerding and D. Wolpert. Bayesian integration in sensorimotor learning. Nature,\n 427(15):244–247, January 2004.\n\n[11] Leslie Welch. The perception of moving plaids reveals two motion-processing stages. Nature,\n 337:734–736, 1989.\n\n[12] S. McKee, G. Silvermann, and K. Nakayama. Precise velocity discrimination despite random\n variations in temporal frequency and contrast. Vision Research, 26(4):609–619, 1986.\n\n[13] C.H. Anderson, H. Nover, and G.C. DeAngelis. Modeling the velocity tuning of macaque MT\n neurons. Journal of Vision/VSS abstract, 2003.\n\n[14] D.W. Dong and J.J. Atick. Statistics of natural time-varying images. 
Network: Computation in\n Neural Systems, 6:345–358, 1995.\n", "award": [], "sourceid": 2570, "authors": [{"given_name": "Alan", "family_name": "Stocker", "institution": null}, {"given_name": "Eero", "family_name": "Simoncelli", "institution": null}]}