NeurIPS 2020

Adversarially Robust Few-Shot Learning: A Meta-Learning Approach


Meta Review

The submission proposes a method called adversarial querying (AQ) to tackle the problem of adversarial robustness in few-shot learning. Adversarial querying applies an adversarial perturbation to the query set during meta-training in an effort to find a few-shot learner parameterization that remains robust to adversarial attacks once tuned on the support set of a given learning problem (see the sketch at the end of this review). Results in the paper show that naturally trained few-shot learners are very sensitive to adversarial attacks. Adversarial robustness results are presented for a variety of benchmarks (mini-ImageNet, CIFAR-FS, Omniglot) and learners (Prototypical Networks, R2-D2, MetaOptNet, MAML). The proposed approach is shown to yield better adversarial robustness than competing approaches (transfer learning from an adversarially trained backbone, ADML) while maintaining higher natural accuracy.

Strengths identified by reviewers include the algorithm-agnostic nature of the proposed approach and the systematic nature of the empirical investigation (e.g., experiments with many meta-learners and adversarial attacks, and a comparison against a transfer learning baseline). Several questions raised by reviewers were satisfactorily addressed in the rebuttal through additional experiments (Reviewer 1: transfer learning with a deeper backbone and robustness across different norms; Reviewer 2: testing with state-of-the-art meta-learning algorithms; Reviewer 4: applying AQ to Reptile and testing on Meta-Dataset).

Multiple reviewers noted that the proposed approach is a very straightforward application of adversarial training to the meta-learning setting, and that its technical contribution is therefore somewhat thin. While the experiments are extensive, they do not yield conclusions that are surprising or that challenge existing beliefs, so the submission's contribution must be judged mainly from an empirical perspective.

Reviewers 4 and 5 were concerned with the large gap in natural accuracy between AQ and its non-adversarially-trained baseline. The authors agreed in the rebuttal that this is an obstacle for deployment and pointed out that the same obstacle exists in the standard adversarial robustness setting. I think it would be unfair to ask that the submission completely overcome this obstacle, but it is fair to ask that it acknowledge and discuss the natural accuracy gap, especially in the context of few-shot classification, where the small number of labeled examples already contributes to accuracy degradation. Given that the authors have already acted on many of the reviewers' suggestions (for instance via additional experiments), I am fairly confident that they would follow through on this suggestion if the submission were accepted.

Several reviewers also expressed concerns with the paper's justification for perturbing only the query set. The authors justify this choice by pointing out that the two alternatives either do not target the right objective (support-only perturbations) or do not offer empirical benefits over adversarial querying (support-plus-query perturbations; Tables 7, 8, 14, and 15). Reviewers found these empirical results counter-intuitive and lacking a clear explanation. I agree that the observation is puzzling, and that providing a convincing explanation would greatly strengthen the submission (although the observation is in itself a valuable contribution). What needs to be determined is whether the paper has enough merit as it stands to justify acceptance.
While there’s clear room for improvement, I think the NeurIPS community would still benefit from the submission due to its extensive empirical evaluation. I therefore recommend acceptance.
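For concreteness, the following is a minimal sketch of the adversarial querying idea as I understand it, using a prototypical-network-style learner and an L-infinity PGD attack applied to the query set only. The toy encoder, the episode_loss and pgd_on_query helpers, and all hyperparameter values are illustrative assumptions on my part, not details taken from the submission.

```python
# Illustrative sketch of adversarial querying (AQ): adapt on a clean support
# set, attack only the query set, and meta-update on the adversarial query loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

embed = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64))  # toy encoder

def episode_loss(support_x, support_y, query_x, query_y, n_way):
    # Class prototypes are computed from the (clean) support set.
    z_s = embed(support_x)
    protos = torch.stack([z_s[support_y == c].mean(0) for c in range(n_way)])
    # Query loss: softmax over negative squared distances to the prototypes.
    z_q = embed(query_x)
    logits = -torch.cdist(z_q, protos) ** 2
    return F.cross_entropy(logits, query_y)

def pgd_on_query(support_x, support_y, query_x, query_y, n_way,
                 eps=8 / 255, alpha=2 / 255, steps=7):
    # Perturb only the query images, maximizing the episode loss (hypothetical
    # PGD settings; the paper's attack configuration may differ).
    x_adv = (query_x + torch.empty_like(query_x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = episode_loss(support_x, support_y, x_adv, query_y, n_way)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()
        x_adv = query_x + (x_adv - query_x).clamp(-eps, eps)  # project to eps-ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

# One meta-training step on a synthetic 5-way episode.
opt = torch.optim.Adam(embed.parameters(), lr=1e-3)
n_way, k_shot, n_query = 5, 1, 15
support_x = torch.rand(n_way * k_shot, 3, 32, 32)
support_y = torch.arange(n_way).repeat_interleave(k_shot)
query_x = torch.rand(n_way * n_query, 3, 32, 32)
query_y = torch.arange(n_way).repeat_interleave(n_query)

query_adv = pgd_on_query(support_x, support_y, query_x, query_y, n_way)
loss = episode_loss(support_x, support_y, query_adv, query_y, n_way)
opt.zero_grad()
loss.backward()
opt.step()
```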