{"title": "Forward dynamic models in human motor control: Psychophysical evidence", "book": "Advances in Neural Information Processing Systems", "page_first": 43, "page_last": 50, "abstract": null, "full_text": "Forward dynamic models in human motor control: Psychophysical evidence \n\nDaniel M. Wolpert \nwolpert@psyche.mit.edu \n\nZoubin Ghahramani \nzoubin@psyche.mit.edu \n\nMichael I. Jordan \njordan@psyche.mit.edu \n\nDepartment of Brain & Cognitive Sciences \nMassachusetts Institute of Technology \nCambridge, MA 02139 \n\nAbstract \n\nBased on computational principles, with as yet no direct experimental validation, it has been proposed that the central nervous system (CNS) uses an internal model to simulate the dynamic behavior of the motor system in planning, control and learning (Sutton and Barto, 1981; Ito, 1984; Kawato et al., 1987; Jordan and Rumelhart, 1992; Miall et al., 1993). We present experimental results and simulations based on a novel approach that investigates the temporal propagation of errors in the sensorimotor integration process. Our results provide direct support for the existence of an internal model. \n\n1 Introduction \n\nThe notion of an internal model, a system which mimics the behavior of a natural process, has emerged as an important theoretical concept in motor control (Jordan, 1995). There are two varieties of internal models: \"forward models,\" which mimic the causal flow of a process by predicting its next state given the current state and the motor command, and \"inverse models,\" which are anticausal, estimating the motor command that causes a particular state transition. Forward models, the focus of this article, have been shown to be of potential use for solving four fundamental problems in computational motor control. 
First, the delays in most sensorimotor loops are large, making feedback control infeasible for rapid movements. By using a forward model for internal feedback, the outcome of an action can be estimated and used before sensory feedback is available (Ito, 1984; Miall et al., 1993). Second, a forward model is a key ingredient in a system that uses motor outflow (\"efference copy\") to anticipate and cancel the reafferent sensory effects of self-movement (Gallistel, 1980; Robinson et al., 1986). Third, a forward model can be used to transform errors between the desired and actual sensory outcome of a movement into the corresponding errors in the motor command, thereby providing appropriate signals for motor learning (Jordan and Rumelhart, 1992). Similarly, by predicting the sensory outcome of the action without actually performing it, a forward model can be used in mental practice to learn to select between possible actions (Sutton and Barto, 1981). Finally, a forward model can be used for state estimation in which the model's prediction of the next state is combined with a reafferent sensory correction (Goodwin and Sin, 1984). Although shown to be of theoretical importance, the existence and use of an internal forward model in the CNS is still a major topic of debate. \n\nWhen a subject moves his arm in the dark, he is able to estimate the visual location of his hand with some degree of accuracy. Observer models from engineering formalize the sources of information which the CNS could use to construct this estimate (Figure 1). This framework consists of a state estimation process (the observer) which is able to monitor both the inputs and outputs of the system. In particular, for the arm, the inputs are motor commands and the output is sensory feedback (e.g. vision and proprioception). 
There are three basic methods whereby the observer can estimate the current state (e.g. position and velocity) of the hand from these sources: it can make use of sensory inflow, it can make use of integrated motor outflow (dead reckoning), or it can combine these two sources of information via the use of a forward model. \n\nFigure 1. Observer model of state estimation. [Diagram labels: input u(t) (motor command), system with state x(t), output y(t) (sensory feedback), observer, state estimate.] \n\n2 State Estimation Experiment \n\nTo test between these possibilities, we carried out an experiment in which subjects made arm movements in the dark. The full details of the experiment are described in the Appendix. Three experimental conditions were studied, involving the use of null, assistive and resistive force fields. The subjects' internal estimate of hand location was assessed by asking them to localize visually the position of their hand at the end of the movement. The bias of this location estimate, plotted as a function of movement duration, shows a consistent overestimation of the distance moved (Figure 2). This bias shows two distinct phases as a function of movement duration: an initial increase reaching a peak of 0.9 cm after one second, followed by a sharp transition to a region of gradual decline. The variance of the estimate also shows an initial increase during the first second of movement, after which it plateaus at about 2 cm$^2$. External forces had distinct effects on the bias and variance propagation. Whereas the bias was increased by the assistive force and decreased by the resistive force (p < 0.05), the variance was unaffected. 
\n\nFigure 2. The propagation of the (a) bias and (b) variance of the state estimate is shown, with standard error lines, against movement duration. The differential effects on (c) bias and (d) variance of the external force, assistive (dotted lines) and resistive (solid lines), are also shown relative to zero (dashed line). A positive bias represents an overestimation of the distance moved. \n\n3 Observer Model Simulation \n\nThese experimental results can be fully accounted for only if we assume that the motor control system integrates the efferent outflow and the reafferent sensory inflow. To establish this conclusion we have developed an explicit model of the sensorimotor integration process which contains as special cases all three of the methods referred to above. The model, a Kalman filter (Kalman and Bucy, 1961), is a linear dynamical system that produces an estimate of the location of the hand by monitoring both the motor outflow and the feedback as sensed, in the absence of vision, solely by proprioception. Based on these sources of information the model estimates the arm's state, integrating sensory and motor signals to reduce the overall uncertainty in its estimate. \n\nRepresenting the state of the hand at time t as x(t) (a $2 \\times 1$ vector of position and velocity) acted on by a force $u = [u_{\\text{int}}, u_{\\text{ext}}]^T$, combining both internal motor commands and external forces, the system dynamic equations can be written in the general form of \n\n$\\dot{x}(t) = A x(t) + B u(t) + w(t), \\qquad (1)$ \n\nwhere A and B are matrices with appropriate dimension. The vector w(t) represents the process white noise with an associated covariance matrix given by $Q = E[w(t)w(t)^T]$. 
The system has an observable output y(t) which is linked to the actual hidden state x(t) by \n\n$y(t) = C x(t) + v(t), \\qquad (2)$ \n\nwhere C is a matrix with appropriate dimension and the vector v(t) represents the output white noise which has the associated covariance matrix $R = E[v(t)v(t)^T]$. In our paradigm, y(t) represents the proprioceptive signals (e.g. from muscle spindles and joint receptors). \n\nIn particular, for the hand we approximate the system dynamics by a damped point mass moving in one dimension acted on by a force u(t). Equation 1 becomes \n\n$\\dot{x}(t) = \\begin{bmatrix} 0 & 1 \\\\ 0 & -\\beta/m \\end{bmatrix} x(t) + \\frac{1}{m} \\begin{bmatrix} 0 & 0 \\\\ 1 & 1 \\end{bmatrix} u(t) + w(t), \\qquad (3)$ \n\nwhere the hand has mass m and damping coefficient $\\beta$. We assume that this system is fully observable and choose C to be the identity matrix. The parameters in the simulation, $\\beta = 3.9$ N\u00b7s/m, m = 4 kg and $u_{\\text{int}} = 1.5$ N, were chosen based on the mass of the arm and the observed relationship between time and distance traveled. The external force $u_{\\text{ext}}$ was set to -0.3, 0 and 0.3 N for the resistive, null and assistive conditions respectively. To end the movement the sign of the motor command $u_{\\text{int}}$ was reversed until the arm was stationary. Noise covariance matrices of $Q = 9.5 \\times 10^{-5} I$ and $R = 3.3 \\times 10^{-4} I$ were used, representing a standard deviation of 1.0 cm for the position output noise and 1.8 cm s$^{-1}$ for the position component of the state noise. \n\nAt time t = 0 the subject is given full view of his arm and, therefore, starts with an estimate $\\hat{x}(0) = x(0)$ with zero bias and variance; we assume that vision calibrates the system. At this time the light is extinguished and the subject must rely on the inputs and outputs to estimate the system's state. The Kalman filter, using a 
The Kalman filter, using a \n\n\fForward Dynamic Models in Human Motor Control \n\n47 \n\nmodel of the system A, Band C, provides an optimal linear estimator of the state \ngiven by \n\ni(t) = Ax(t) + Bu(t) + K(t)[y(t) - Cx(t)] \nsensory correction \n\nforward model \n\n'V' \n\n.I \n\n, \n\n, \n\nV \n\nI \n\nwhere K(t) is the recursively updated gain matrix. The model is, therefore, a com(cid:173)\nbination of two processes which together contribute to the state estimate. The first \nprocess uses the current state estimate and motor command to predict the next \nstate by simulating the movement dynamics with a forward model. The second \nprocess uses the difference between actual and predicted reafferent sensory feed(cid:173)\nback to correct the state estimate resulting from the forward model. The relative \ncontributions of the internal simulation and sensory correction processes to the final \nestimate are modulated by the Kalman gain matrix K(t) so as to provide optimal \nstate estimates. We used this state update equation to model the bias and variance \npropagation and the effects of the external force. \n\nBy making particular choices for the parameters of the Kalman filter, we are able \nto simulate dead reckoning, sensory inflow-based estimation, and forward model(cid:173)\nbased sensorimotor integration. Moreover, to accommodate the observation that \nsubjects generally tend to overestimate the distance that their arm has moved, we \nset the gain that couples force to state estimates to a value that is larger than \nits veridical value; B = ~ [1~4 1~6] while both A and C accurately reflected \nthe true system. This is consistent with the independent data that subjects tend \nto under-reach in pointing tasks suggesting an overestimation of distance traveled \n(Soechting and Flanders, 1989). \n\nSimulations of the Kalman filter demonstrate the two distinct phases of bias prop(cid:173)\nagation observed (Figure 3). 
By overestimating the force acting on the arm the forward model overestimates the distance traveled, an integrative process eventually balanced by the sensory correction. The model also captures the differential effects on bias of the externally imposed forces. By overestimating an increased force under the assistive condition, the bias in the forward model accrues more rapidly and is balanced by the sensory feedback at a higher level. The converse applies to the resistive force. In accord with the experimental results the model predicts no change in variance under the two force conditions. \n\n4 Discussion \n\nWe have shown that the Kalman filter is able to reproduce the propagation of the bias and variance of estimated position of the hand as a function of both movement duration and external forces. The Kalman filter also simulates the interesting and novel empirical result that while the variance asymptotes, the bias peaks after about one second and then gradually declines. This behavior is a consequence of a trade-off between the inaccuracies accumulating in the internal simulation of the arm's dynamics and the feedback of actual sensory information. Simple models which do not trade off the contributions of a forward model with sensory feedback, such as those based purely on sensory inflow or on motor outflow, are unable to reproduce the observed pattern of bias and variance propagation. The ability of the Kalman filter to parsimoniously model our data suggests that the processes embodied in the \n\n[Figure 3. Plot panels against Time (s); axis values and caption not recoverable.] 
\n\nFigure 4. Experimental setup. [Diagram labels: screen, finger, semi-silvered mirror, bulb, torque motors, manipulandum.] \n\nEight subjects participated and performed 300 trials each. Each trial started with the subject visually placing his thumb at a target square projected randomly on the movement line. The arm was then illuminated for two seconds, thereby allowing the subject to perceive visually his initial arm configuration. The light was then extinguished leaving just the initial target. The subject was then required to move his hand either to the left or right, as indicated by an arrow in the initial starting square. This movement was made in the absence of visual feedback of arm configuration. The subject was instructed to move until he heard a tone, at which point he stopped. The timing of the tone was controlled to produce a uniform distribution of path lengths from 0-30 cm. During this movement the subject moved in a randomly selected null, constant assistive, or constant resistive 0.3 N force field generated by the torque motors. Although it is not possible to directly probe a subject's internal representation of the state of his arm, we can examine a function of this state: the estimated visual location of the thumb. (The relationship between the state of the arm and the visual coordinates of the hand is known as the kinematic transformation; Craig, 1986.) Therefore, once at rest the subject indicated the visual estimate of his unseen thumb position using a trackball, held in his other hand, to move a cursor projected in the plane of the thumb along the movement line. The discrepancy between the actual and visual estimate of thumb location was recorded as a measure of the state estimation error. 
The bias and variance propagation of the state estimate was analyzed as a function of movement duration and external forces. A generalized additive model (Hastie and Tibshirani, 1990) with smoothing splines (five effective degrees of freedom) was fit to the bias and variance as a function of final position, movement duration and the interaction of the two forces with movement duration, simultaneously for main effects and for each subject. This procedure factors out the additive effects specific to each subject and, through the final position factor, the position-dependent inaccuracies in the kinematic transformation. \n\nReferences \n\nCraig, J. (1986). Introduction to robotics. Addison-Wesley, Reading, MA. \n\nFaye, I. (1986). An impedence controlled manipulandum for human movement studies. MS Thesis, MIT Dept. Mechanical Engineering, Cambridge, MA. \n\nGallistel, C. (1980). The organization of action: A new synthesis. Erlbaum, Hillsdale, NJ. \n\nGoodwin, G. and Sin, K. (1984). Adaptive filtering prediction and control. Prentice-Hall, Englewood Cliffs, NJ. \n\nHastie, T. and Tibshirani, R. (1990). Generalized Additive Models. Chapman and Hall, London. \n\nIto, M. (1984). The cerebellum and neural control. Raven Press, New York. \n\nJordan, M. and Rumelhart, D. (1992). Forward models: Supervised learning with a distal teacher. Cognitive Science, 16:307-354. \n\nJordan, M. I. (1995). Computational aspects of motor control and motor learning. In Heuer, H. and Keele, S., editors, Handbook of Perception and Action: Motor Skills. Academic Press, New York. \n\nKalman, R. and Bucy, R. S. (1961). New results in linear filtering and prediction. Journal of Basic Engineering (ASME), 83D:95-108. \n\nKawato, M., Furukawa, K., and Suzuki, R. (1987). A hierarchical neural network model for the control and learning of voluntary movements. Biol. Cybern., 56:1-17. \n\nMiall, R., Weir, D., Wolpert, D., and Stein, J. (1993). 
Is the cerebellum a Smith Predictor? Journal of Motor Behavior, 25(3):203-216. \n\nRobinson, D., Gordon, J., and Gordon, S. (1986). A model of the smooth pursuit eye movement system. Biol. Cybern., 55:43-57. \n\nSoechting, J. and Flanders, M. (1989). Sensorimotor representations for pointing to targets in three-dimensional space. J. Neurophysiol., 62:582-594. \n\nSutton, R. and Barto, A. (1981). Toward a modern theory of adaptive networks: expectation and prediction. Psychol. Rev., 88:135-170. \n\n", "award": [], "sourceid": 909, "authors": [{"given_name": "Daniel", "family_name": "Wolpert", "institution": null}, {"given_name": "Zoubin", "family_name": "Ghahramani", "institution": null}, {"given_name": "Michael", "family_name": "Jordan", "institution": null}]}