Barycentric Interpolators for Continuous Space and Time Reinforcement Learning

Part of Advances in Neural Information Processing Systems 11 (NIPS 1998)


Rémi Munos, Andrew Moore


To find the optimal control of continuous state-space and time reinforcement learning (RL) problems, we approximate the value function (VF) with a particular class of functions called barycentric interpolators. We establish sufficient conditions under which an RL algorithm converges to the optimal VF, even when we use approximate models of the state dynamics and the reinforcement functions.
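
As a rough illustration of what a barycentric interpolator computes, the sketch below evaluates a value function at a point inside a simplex as a convex combination of the values stored at the simplex vertices. It is a minimal sketch under stated assumptions, not the algorithm from the paper: the function names `barycentric_coordinates` and `interpolate_value`, the triangular cell, and the vertex values are all hypothetical and chosen only for the example.

```python
import numpy as np


def barycentric_coordinates(vertices, x):
    """Return weights lam with sum(lam) == 1 and sum(lam[i] * vertices[i]) == x.

    vertices: (d + 1) points in d dimensions forming a non-degenerate simplex.
    x: query point in d dimensions.
    """
    vertices = np.asarray(vertices, dtype=float)  # shape (d + 1, d)
    x = np.asarray(x, dtype=float)                # shape (d,)
    # Each column holds one vertex; the extra row of ones enforces sum(lam) == 1.
    A = np.vstack([vertices.T, np.ones(len(vertices))])
    b = np.append(x, 1.0)
    return np.linalg.solve(A, b)


def interpolate_value(vertices, vertex_values, x, tol=1e-12):
    """Interpolate a value at x from values stored at the simplex vertices."""
    lam = barycentric_coordinates(vertices, x)
    # Inside the simplex all weights are non-negative, so the result is a
    # convex combination of the vertex values.
    if np.any(lam < -tol):
        raise ValueError("x lies outside the simplex")
    return float(lam @ np.asarray(vertex_values, dtype=float))


# Hypothetical example: a triangle in 2D with made-up vertex values.
triangle = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
values = [0.0, 1.0, 2.0]
v = interpolate_value(triangle, values, [0.25, 0.25])
```

Because the weights are non-negative and sum to one, the interpolated value always lies between the smallest and largest vertex values.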