NeurIPS 2019
Sun Dec 8th to Sat Dec 14th, 2019, at the Vancouver Convention Center
The reviewers felt that this paper was well executed, even though the proposed approach is a rather straightforward application of techniques from the robust MDP literature (specifically, minimax planning with appropriately defined uncertainty sets derived from a Lipschitz assumption). For the final version, the authors should improve the discussion of related work on robust MDPs (e.g., "Reinforcement Learning in Robust Markov Decision Processes" by Lim et al., NIPS 2013, and references therein) and on MDPs with non-stationary transitions (e.g., "Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions" by Abbasi-Yadkori et al., NIPS 2013, and references therein).
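For concreteness, the kind of minimax planning referenced above can be sketched as follows. This is only an illustrative toy, not the authors' algorithm: a finite MDP with L1 uncertainty balls around the nominal transitions, where the radius eps is a hypothetical stand-in for whatever set the paper's Lipschitz assumption would induce.

```python
# Minimal sketch: robust (minimax) value iteration with L1 uncertainty sets.
# Assumption: eps is a hypothetical L1 radius, not the paper's construction.
import numpy as np

def worst_case_value(p, V, eps):
    """Minimize q @ V over distributions q with ||q - p||_1 <= eps.

    Closed form: shift up to eps/2 probability mass from the
    highest-value states onto the lowest-value state.
    """
    q = p.copy()
    lo = np.argmin(V)
    budget = min(eps / 2.0, 1.0 - q[lo])  # mass we may move onto the worst state
    q[lo] += budget
    for s in np.argsort(V)[::-1]:         # drain mass from high-value states first
        if s == lo:
            continue
        take = min(q[s], budget)
        q[s] -= take
        budget -= take
        if budget <= 0.0:
            break
    return q @ V

def robust_value_iteration(P, R, gamma=0.95, eps=0.1, n_iter=500):
    """P: (S, A, S) nominal transitions; R: (S, A) rewards."""
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(n_iter):
        # Robust Bellman backup: each action is evaluated against the
        # worst transition distribution in its uncertainty set.
        Q = np.array([[R[s, a] + gamma * worst_case_value(P[s, a], V, eps)
                       for a in range(A)] for s in range(S)])
        V = Q.max(axis=1)
    return V, Q.argmax(axis=1)

# Tiny random example: 4 states, 2 actions.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(4), size=(4, 2))
R = rng.random((4, 2))
V, policy = robust_value_iteration(P, R)
print(V, policy)
```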