Part of Advances in Neural Information Processing Systems 3 (NIPS 1990)
Peter Dayan
Barto, Sutton and Watkins [2] introduced a grid task as a didactic example of temporal difference planning and asynchronous dynamical programming. This paper considers the effects of changing the coding of the input stimulus, and demonstrates that the self-supervised learning of a particular form of hidden unit representation improves performance.