NeurIPS 2020

POMO: Policy Optimization with Multiple Optima for Reinforcement Learning

Meta Review

Three reviewers support accepting the paper; one argues for rejection. From the reviews, rebuttal, and discussion, the consensus seemed to be that the paper has an interesting new idea and good empirical results. The debate centered on how much novelty there is and how likely the idea is to be useful in the future, which are slightly more subjective concerns. I recommend acceptance, and I hope future work will show that this was a valuable stepping stone. I still recommend that the authors revise the paper according to the reviewers' suggestions, in particular by avoiding overstated claims and giving the reader broader context.