NeurIPS 2020

Meta-Gradient Reinforcement Learning with an Objective Discovered Online

Meta Review

The reviewers agreed that this is an interesting, novel, and well-executed contribution. Congratulations! I would like to bring up two issues that were raised in the discussion and ask the authors to address them in their final version:

- It would be good to add some insight into why the meta-learned update actually performs better.
- The improvement over IMPALA, although clearly evident, comes at the cost of a much more complex algorithm. This should at least be mentioned and discussed.