{"title": "Forward dynamic models in human motor control: Psychophysical evidence", "book": "Advances in Neural Information Processing Systems", "page_first": 43, "page_last": 50, "abstract": null, "full_text": "Forward dynamic models in human motor control: Psychophysical evidence\n\nDaniel M. Wolpert\nwolpert@psyche.mit.edu\n\nZoubin Ghahramani\nzoubin@psyche.mit.edu\n\nMichael I. Jordan\njordan@psyche.mit.edu\n\nDepartment of Brain & Cognitive Sciences\nMassachusetts Institute of Technology\nCambridge, MA 02139\n\nAbstract\n\nBased on computational principles, with as yet no direct experimental\nvalidation, it has been proposed that the central nervous\nsystem (CNS) uses an internal model to simulate the dynamic behavior\nof the motor system in planning, control and learning (Sutton\nand Barto, 1981; Ito, 1984; Kawato et al., 1987; Jordan and\nRumelhart, 1992; Miall et al., 1993). We present experimental results\nand simulations based on a novel approach that investigates\nthe temporal propagation of errors in the sensorimotor integration\nprocess. Our results provide direct support for the existence of an\ninternal model.\n\n1 Introduction\n\nThe notion of an internal model, a system which mimics the behavior of a natural\nprocess, has emerged as an important theoretical concept in motor control (Jordan,\n1995). There are two varieties of internal models: \"forward models,\" which mimic\nthe causal flow of a process by predicting its next state given the current state\nand the motor command, and \"inverse models,\" which are anticausal, estimating\nthe motor command that causes a particular state transition. Forward models,\nthe focus of this article, have been shown to be of potential use for solving\nfour fundamental problems in computational motor control. 
First, the delays in\nmost sensorimotor loops are large, making feedback control infeasible for rapid\nmovements. By using a forward model for internal feedback, the outcome of an action\ncan be estimated and used before sensory feedback is available (Ito, 1984; Miall\net al., 1993). Second, a forward model is a key ingredient in a system that uses motor\noutflow (\"efference copy\") to anticipate and cancel the reafferent sensory effects of\nself-movement (Gallistel, 1980; Robinson et al., 1986). Third, a forward model can\nbe used to transform errors between the desired and actual sensory outcome of a\nmovement into the corresponding errors in the motor command, thereby providing\nappropriate signals for motor learning (Jordan and Rumelhart, 1992). Similarly,\nby predicting the sensory outcome of an action without actually performing it, a\nforward model can be used in mental practice to learn to select between possible\nactions (Sutton and Barto, 1981). Finally, a forward model can be used for state\nestimation in which the model's prediction of the next state is combined with a\nreafferent sensory correction (Goodwin and Sin, 1984). Although shown to be of\ntheoretical importance, the existence and use of an internal forward model in the\nCNS is still a major topic of debate.\n\nWhen a subject moves his arm in the dark, he is able to estimate the visual location\nof his hand with some degree of accuracy. Observer models from engineering\nformalize the sources of information which the CNS could use to construct this\nestimate (Figure 1). This framework consists of a state estimation process (the\nobserver) which is able to monitor both the inputs and outputs of the system. 
In\nparticular, for the arm, the inputs are motor commands and the output is sensory\nfeedback (e.g. vision and proprioception). There are three basic methods whereby\nthe observer can estimate the current state (e.g. position and velocity) of the hand\nfrom these sources: it can make use of sensory inflow, it can make use of integrated\nmotor outflow (dead reckoning), or it can combine these two sources of information\nvia the use of a forward model.\n\nFigure 1. Observer model of state estimation. (Block diagram: the motor command u(t) is the input to the system, whose state is x(t) and whose output is the sensory feedback y(t); the observer monitors both input and output and produces the state estimate \\hat{x}(t).)\n\n2 State Estimation Experiment\n\nTo test between these possibilities, we carried out an experiment in which subjects\nmade arm movements in the dark. The full details of the experiment are described\nin the Appendix. Three experimental conditions were studied, involving the use\nof null, assistive and resistive force fields. The subjects' internal estimate of hand\nlocation was assessed by asking them to localize visually the position of their hand at\nthe end of the movement. The bias of this location estimate, plotted as a function\nof movement duration, shows a consistent overestimation of the distance moved\n(Figure 2). This bias shows two distinct phases as a function of movement duration:\nan initial increase reaching a peak of 0.9 cm after one second, followed by a sharp\ntransition to a region of gradual decline. The variance of the estimate also shows an\ninitial increase during the first second of movement, after which it plateaus at about\n2 cm^2. External forces had distinct effects on the bias and variance propagation. 
\nWhereas the bias was increased by the assistive force and decreased by the resistive\nforce (p < 0.05), the variance was unaffected.\n\nFigure 2. The propagation of the (a) bias and (b) variance of the state\nestimate is shown, with standard error lines, against movement duration.\nThe differential effects on (c) bias and (d) variance of the external force,\nassistive (dotted lines) and resistive (solid lines), are also shown relative\nto zero (dashed line). A positive bias represents an overestimation of\nthe distance moved. (Axes: bias in cm for panels a and c, variance in cm^2 for panels b and d, each plotted against time in s.)\n\n3 Observer Model Simulation\n\nThese experimental results can be fully accounted for only if we assume that the motor\ncontrol system integrates the efferent outflow and the reafferent sensory inflow.\nTo establish this conclusion we have developed an explicit model of the sensorimotor\nintegration process which contains as special cases all three of the methods referred\nto above. The model, a Kalman filter (Kalman and Bucy, 1961), is a linear dynamical\nsystem that produces an estimate of the location of the hand by monitoring\nboth the motor outflow and the feedback as sensed, in the absence of vision, solely\nby proprioception. 
Based on these sources of information the model estimates the\narm's state, integrating sensory and motor signals to reduce the overall uncertainty\nin its estimate.\n\nRepresenting the state of the hand at time t as x(t) (a 2 x 1 vector of position\nand velocity) acted on by a force u = [u_int, u_ext]^T, combining both internal motor\ncommands and external forces, the system dynamic equations can be written in the\ngeneral form of\n\n\\dot{x}(t) = A x(t) + B u(t) + w(t),    (1)\n\nwhere A and B are matrices with appropriate dimension. The vector w(t) represents\nthe process white noise with an associated covariance matrix given by Q =\nE[w(t)w(t)^T]. The system has an observable output y(t) which is linked to the\nactual hidden state x(t) by\n\ny(t) = C x(t) + v(t),    (2)\n\nwhere C is a matrix with appropriate dimension and the vector v(t) represents the\noutput white noise which has the associated covariance matrix R = E[v(t)v(t)^T]. In\nour paradigm, y(t) represents the proprioceptive signals (e.g. from muscle spindles\nand joint receptors).\n\nIn particular, for the hand we approximate the system dynamics by a damped point\nmass moving in one dimension acted on by a force u(t). Equation 1 becomes\n\n\\dot{x}(t) = \\begin{bmatrix} 0 & 1 \\\\ 0 & -\\beta/m \\end{bmatrix} x(t) + \\frac{1}{m} \\begin{bmatrix} 0 & 0 \\\\ 1 & 1 \\end{bmatrix} u(t) + w(t),    (3)\n\nwhere the hand has mass m and damping coefficient \\beta. We assume that this system\nis fully observable and choose C to be the identity matrix. The parameters in\nthe simulation, \\beta = 3.9 N s/m, m = 4 kg and u_int = 1.5 N, were chosen based\non the mass of the arm and the observed relationship between time and distance\ntraveled. The external force u_ext was set to -0.3, 0 and 0.3 N for the resistive, null\nand assistive conditions respectively. To end the movement the sign of the motor\ncommand u_int was reversed until the arm was stationary. 
Noise covariance matrices\nof Q = 9.5 x 10^{-5} I and R = 3.3 x 10^{-4} I were used, representing a standard deviation\nof 1.0 cm for the position output noise and 1.8 cm s^{-1} for the position component\nof the state noise.\n\nAt time t = 0 the subject is given full view of his arm and, therefore, starts with an\nestimate \\hat{x}(0) = x(0) with zero bias and variance; we assume that vision calibrates\nthe system. At this time the light is extinguished and the subject must rely on\nthe inputs and outputs to estimate the system's state. The Kalman filter, using a\nmodel of the system A, B and C, provides an optimal linear estimator of the state\ngiven by\n\n\\dot{\\hat{x}}(t) = \\underbrace{A \\hat{x}(t) + B u(t)}_{\\text{forward model}} + \\underbrace{K(t) [y(t) - C \\hat{x}(t)]}_{\\text{sensory correction}}\n\nwhere K(t) is the recursively updated gain matrix. The model is, therefore, a combination\nof two processes which together contribute to the state estimate. The first\nprocess uses the current state estimate and motor command to predict the next\nstate by simulating the movement dynamics with a forward model. The second\nprocess uses the difference between actual and predicted reafferent sensory feedback\nto correct the state estimate resulting from the forward model. The relative\ncontributions of the internal simulation and sensory correction processes to the final\nestimate are modulated by the Kalman gain matrix K(t) so as to provide optimal\nstate estimates. We used this state update equation to model the bias and variance\npropagation and the effects of the external force.\n\nBy making particular choices for the parameters of the Kalman filter, we are able\nto simulate dead reckoning, sensory inflow-based estimation, and forward model-based\nsensorimotor integration. 
Moreover, to accommodate the observation that\nsubjects generally tend to overestimate the distance that their arm has moved, we\nset the gain that couples force to state estimates to a value that is larger than\nits veridical value, B = \\frac{1}{m} \\begin{bmatrix} 0 & 0 \\\\ 1.4 & 1.6 \\end{bmatrix}, while both A and C accurately reflected\nthe true system. This is consistent with the independent data that subjects tend\nto under-reach in pointing tasks, suggesting an overestimation of distance traveled\n(Soechting and Flanders, 1989).\n\nSimulations of the Kalman filter demonstrate the two distinct phases of bias propagation\nobserved (Figure 3). By overestimating the force acting on the arm the\nforward model overestimates the distance traveled, an integrative process eventually\nbalanced by the sensory correction. The model also captures the differential\neffects on bias of the externally imposed forces. By overestimating an increased\nforce under the assistive condition, the bias in the forward model accrues more\nrapidly and is balanced by the sensory feedback at a higher level. The converse\napplies to the resistive force. In accord with the experimental results the model\npredicts no change in variance under the two force conditions.\n\n4 Discussion\n\nWe have shown that the Kalman filter is able to reproduce the propagation of the\nbias and variance of estimated position of the hand as a function of both movement\nduration and external forces. The Kalman filter also simulates the interesting and\nnovel empirical result that while the variance asymptotes, the bias peaks after about\none second and then gradually declines. This behavior is a consequence of a trade-off\nbetween the inaccuracies accumulating in the internal simulation of the arm's\ndynamics and the feedback of actual sensory information. 
Simple models which do\nnot trade off the contributions of a forward model with sensory feedback, such as\nthose based purely on sensory inflow or on motor outflow, are unable to reproduce\nthe observed pattern of bias and variance propagation. The ability of the Kalman\nfilter to parsimoniously model our data suggests that the processes embodied in the\nfilter, namely internal simulation through a forward model together with sensory\ncorrection, are likely to be embodied in the sensorimotor integration process. We\nfeel that the results of this state estimation study provide strong evidence that a\nforward model is used by the CNS in maintaining its estimate of the hand location.\n\nFigure 3. Simulated bias and variance propagation, in the same representation\nand scale as Figure 2, from a Kalman filter model of the\nsensorimotor integration process.\n\nFurthermore, the state estimation paradigm provides a framework to study the sensorimotor\nintegration process in both normal and patient populations. 
For example,\nthe specific predictions of the sensorimotor integration model can be tested both in\npatients with sensory neuropathies, who lack proprioceptive reafference, and in patients\nwith damage to the cerebellum, a proposed site for the forward model (Miall\net al., 1993).\n\nAcknowledgements\n\nWe thank Peter Dayan for suggestions about the manuscript. This project was supported\nby grants from the McDonnell-Pew Foundation, ATR Human Information\nProcessing Research Laboratories, Siemens Corporation, and by grant N00014-94-1-0777\nfrom the Office of Naval Research. Daniel M. Wolpert and Zoubin Ghahramani\nare McDonnell-Pew Fellows in Cognitive Neuroscience. Michael I. Jordan is an NSF\nPresidential Young Investigator.\n\nAppendix: Experimental Paradigm\n\nTo investigate the way in which errors in the state estimate change over time and\nwith external forces we used a setup (Figure 4) consisting of a combination of planar\nvirtual visual feedback with a two degree of freedom torque motor driven manipulandum\n(Faye, 1986). The subject held a planar manipulandum on which his\nthumb was mounted. The manipulandum was used both to accurately measure the\nposition of the subject's thumb and also, using the torque motors, to constrain the\nhand to move along a line across the subject's body. A projector was used to create\nvirtual images in the plane of the movement by projecting a computer VGA screen\nonto a horizontal rear projection screen suspended above the manipulandum. A\nhorizontal semi-silvered mirror was placed midway between the screen and manipulandum. 
The subject viewed the reflected image of the rear projection screen by\nlooking down at the mirror; all projected images, therefore, appeared to be in the\nplane of the thumb, independent of head position.\n\nFigure 4. Experimental setup. (Labeled components: projector, screen, image, cursor, semi-silvered mirror, bulb, finger, manipulandum, torque motors.)\n\nEight subjects participated and performed 300 trials each. Each trial started with\nthe subject visually placing his thumb at a target square projected randomly on\nthe movement line. The arm was then illuminated for two seconds, thereby allowing\nthe subject to perceive visually his initial arm configuration. The light was\nthen extinguished leaving just the initial target. The subject was then required to\nmove his hand either to the left or right, as indicated by an arrow in the initial\nstarting square. This movement was made in the absence of visual feedback of arm\nconfiguration. The subject was instructed to move until he heard a tone, at which\npoint he stopped. The timing of the tone was controlled to produce a uniform distribution\nof path lengths from 0-30 cm. During this movement the subject moved in a\nrandomly selected force field, either null or a constant 0.3 N assistive or resistive field,\ngenerated by the torque motors. Although it is not possible to directly probe a\nsubject's internal representation of the state of his arm, we can examine a function\nof this state: the estimated visual location of the thumb. (The relationship\nbetween the state of the arm and the visual coordinates of the hand is known as\nthe kinematic transformation; Craig, 1986.) Therefore, once at rest the subject\nindicated the visual estimate of his unseen thumb position using a trackball, held\nin his other hand, to move a cursor projected in the plane of the thumb along the\nmovement line. The discrepancy between the actual and visual estimate of thumb\nlocation was recorded as a measure of the state estimation error. The bias and\nvariance propagation of the state estimate was analyzed as a function of movement\nduration and external forces. A generalized additive model (Hastie and Tibshirani,\n1990) with smoothing splines (five effective degrees of freedom) was fit to the bias\nand variance as a function of final position, movement duration and the interaction\nof the two forces with movement duration, simultaneously for main effects and for\neach subject. This procedure factors out the additive effects specific to each subject\nand, through the final position factor, the position-dependent inaccuracies in the\nkinematic transformation.\n\nReferences\n\nCraig, J. (1986). Introduction to robotics. Addison-Wesley, Reading, MA.\nFaye, I. (1986). An impedance controlled manipulandum for human movement studies. MS Thesis, MIT Dept. Mechanical Engineering, Cambridge, MA.\nGallistel, C. (1980). The organization of action: A new synthesis. Erlbaum, Hillsdale, NJ.\nGoodwin, G. and Sin, K. (1984). Adaptive filtering prediction and control. Prentice-Hall, Englewood Cliffs, NJ.\nHastie, T. and Tibshirani, R. (1990). Generalized Additive Models. Chapman and Hall, London.\nIto, M. (1984). The cerebellum and neural control. Raven Press, New York.\nJordan, M. and Rumelhart, D. (1992). Forward models: Supervised learning with a distal teacher. Cognitive Science, 16:307-354.\nJordan, M. I. 
(1995). Computational aspects of motor control and motor learning. In Heuer, H. and Keele, S., editors, Handbook of Perception and Action: Motor Skills. Academic Press, New York.\nKalman, R. and Bucy, R. S. (1961). New results in linear filtering and prediction. Journal of Basic Engineering (ASME), 83D:95-108.\nKawato, M., Furukawa, K., and Suzuki, R. (1987). A hierarchical neural network model for the control and learning of voluntary movements. Biol. Cybern., 56:1-17.\nMiall, R., Weir, D., Wolpert, D., and Stein, J. (1993). Is the cerebellum a Smith Predictor? Journal of Motor Behavior, 25(3):203-216.\nRobinson, D., Gordon, J., and Gordon, S. (1986). A model of the smooth pursuit eye movement system. Biol. Cybern., 55:43-57.\nSoechting, J. and Flanders, M. (1989). Sensorimotor representations for pointing to targets in three-dimensional space. J. Neurophysiol., 62:582-594.\nSutton, R. and Barto, A. (1981). Toward a modern theory of adaptive networks: expectation and prediction. Psychol. Rev., 88:135-170.\n", "award": [], "sourceid": 909, "authors": [{"given_name": "Daniel", "family_name": "Wolpert", "institution": null}, {"given_name": "Zoubin", "family_name": "Ghahramani", "institution": null}, {"given_name": "Michael", "family_name": "Jordan", "institution": null}]}