TD(0) Leads to Better Policies than Approximate Value Iteration

Part of Advances in Neural Information Processing Systems 18 (NIPS 2005)

Authors

Benjamin Van Roy

Abstract

Abstract Unavailable