NeurIPS 2020

Multimodal Graph Networks for Compositional Generalization in Visual Question Answering


Meta Review

After the author response and discussion, all reviewers recommend (weak) acceptance of this paper for its contributions, including:
- Significant improvements on the synthetic CLEVR/CLOSURE task
- An overall novel and interesting method

I accept the paper with the expectation that the authors will improve and clarify it according to the author response and the reviewers' suggestions, including the discussion of related work.

The main concern shared by the reviewers and me is that the paper limits its experimental evaluation to the synthetic CLEVR dataset. The authors are strongly encouraged to include results on a non-synthetic dataset (e.g., VQA-CP, NLVR/2, GQA, or subsets if necessary) in the final version, even if this yields a negative result, which the authors could then analyze.