NeurIPS 2020

Efficient Planning in Large MDPs with Weak Linear Function Approximation


Meta Review

All reviewers agree that the paper makes a nice contribution to planning with function approximation. In particular, the paper considers an important open problem, and while the problem is solved under a few assumptions (most notably the existence of core states), the results represent significant progress on it. The reviewers also appreciate the use of precise language and the careful description of related work. Among the remaining concerns, R2 wants to see some evidence of robustness against the failure of the "core state" assumption. While empirical experiments may not fit the theoretical nature of the paper, the authors can consider a theoretical justification: namely, define a notion of error that measures how much the core-states assumption is violated, and show how such an error manifests itself in the final guarantee. How much of a blow-up (hopefully polynomial) will we get? Sketching out such an analysis (possibly in the appendix) would help answer the robustness question.