NeurIPS 2020

Learning to Play No-Press Diplomacy with Best Response Policy Iteration

Meta Review

The paper proposes an approach for the complex 7-player multi-agent game Diplomacy (the no-communication variant). All the reviewers liked the clear writing and thorough evaluation. Some concerns were raised about the fairness of the comparison to the prior state of the art, which the authors' rebuttal partially addressed. The paper was discussed among the reviewers, and everyone agrees that it has valuable insights to share with the community. Please incorporate appropriate changes in the camera-ready version of the paper in response to the reviewers' final comments. For instance, clarify the comparison setup against the prior state of the art (DipNet), in particular that the approach is initialized by effectively computing a best response to the baseline. Please refer to the reviews for more details.