NeurIPS 2020

Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration

Meta Review

After considerable discussion, I feel that this paper does meet the acceptance threshold. The authors need to add discussion of the necessity of explorability, since the lower bound is not convincing. Further, in the tabular case, explorability is not needed, but this algorithm would fail. These points should all be carefully discussed.