Review for NeurIPS paper: Beyond accuracy: quantifying trial-by-trial behaviour of CNNs and humans by measuring error consistency

NeurIPS 2020

Meta Review

The rebuttal and additional analyses seem to have satisfied most of the reviewers, so thanks to the authors for their thorough and responsive reply. Overall, this seems like important, well-presented work, and hence I'm recommending acceptance. That said, there was a suggestion to apply the analysis of Figure 4 to natural images (judging from the initial analysis included in the rebuttal, the results shouldn't be surprising). I sympathize with the argument that artificial stimuli are often used in neuroscience, but in this case I think performing the analysis on natural images would provide a nice sanity check, and I think many NeurIPS readers would be curious about it as well. I don't think the lack of this analysis is enough to reject the paper, but I'd like to ask the authors to include it in the final version.