Summary and Contributions: The authors propose a graph policy network for active learning problems on graphs. The query strategy is formalized as a Markov decision process, and a GNN-based policy network is trained with reinforcement learning (REINFORCE) to select the most informative nodes, so that the classifier can reach its best performance with the least labeling budget.
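For readers unfamiliar with this setup, the query loop is a sequential decision process: at each step the policy scores the unlabeled nodes, samples one to query, and updates its parameters with the REINFORCE policy gradient. A minimal, purely illustrative sketch follows; the toy per-node scores and the constant placeholder reward stand in for the paper's GNN policy and classifier-based reward, and are not taken from the submission:

```python
import math
import random

random.seed(0)

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def sample(probs):
    # Draw one index according to the given probabilities.
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1

# Toy setup: one score per unlabeled node stands in for the policy
# network's output; in the paper these would come from a GNN over the
# current graph state.
scores = [0.0, 0.0, 0.0, 0.0]   # hypothetical learnable parameters
budget, lr = 2, 0.1

selected = []
for _ in range(budget):
    probs = softmax(scores)
    a = sample(probs)           # query one node per step
    selected.append(a)
    reward = 1.0                # placeholder for the classifier's gain
    # REINFORCE gradient of log pi(a): d/d score_i = 1[i == a] - probs[i]
    for i in range(len(scores)):
        grad = (1.0 if i == a else 0.0) - probs[i]
        scores[i] += lr * reward * grad
```

The key point the reviewers raise is that, unlike per-step heuristics, the reward here can reflect the classifier's long-term performance over the whole query sequence.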
Strengths: The proposed graph policy network is trained with reinforcement learning, so it takes into account the long-term performance of active learning. In this respect, the proposed method advances over previous methods that only optimize a short-term criterion at each step. Node interactions are also captured by the GNN-based policy network to better measure node informativeness within the local neighborhood. The effectiveness of the proposed method is evaluated on transferable active learning on graphs from the same domain and across different domains.
Weaknesses: The proposed framework aims to reduce the labeling budget for transferable active learning problems on graphs. However, the experiments have issues with clarity.
- In Sections 4.2 & 4.3, it is unclear how many labeled nodes are used to train the policy network.
- Why does ANRMAB perform much worse than AGE? In theory, ANRMAB is expected to outperform AGE, as it uses a multi-armed bandit to adjust the weights of three heuristics.
- Given that this paper addresses transferable active learning, the experiments should focus on measuring classification performance w.r.t. different query budgets, rather than the overall performance reported in Tables 1 & 2.
- In Section 4.4, it is hard to see the effect of the labeling budget for training the policy network and, more importantly, the effect of the query budget on classification performance. This part of the experiments could be better designed to support the claims made.
Correctness: Most of the claims and the method are correct, but the empirical evaluation could be better set up to support the claims made.
Clarity: In general, this paper is well written. The methodology is clearly presented, making it relatively easy to follow.
Relation to Prior Work: Yes, this work has done a good job in reviewing the previous work.
Additional Feedback: I have read the author rebuttal. It would be good if the authors could explain how the model is transferred to the test graph more explicitly.
Summary and Contributions: The paper studies active learning for graphs using reinforcement learning.
Strengths: This is an interesting paper that aims at reducing the label annotation for GNNs. The proposed algorithm seems sound and results are good.
Weaknesses: How was the improvement over the identified limitation "Ignoring Long-term Performance" (page 1) tested in the numerical experiments? It was also not clear to me what the definitions of the reward R and the measure M in Eq. 1 and Eq. 4 are.
Clarity: The paper is nicely written.
Relation to Prior Work: Comparisons with previous graph active learning works [8,13,12,5] could be better explained. What are the new advantages of the newly proposed technique?
Summary and Contributions: This submission proposes a graph policy network for zero-shot transferable active learning, which queries node labels for semi-supervised node classification on a "target" graph by training the policy network on multiple labeled "source" graphs. Experimental results on graphs from the same or different domains demonstrate the efficacy of the proposed method. Updated during rebuttal: I truly appreciate the authors' extra efforts after the rebuttal for clarification. As it is critical to clearly present transferable active learning with the policy network, I would strongly suggest that the authors rewrite Section 3 to better present all the transferable active learning components, especially by clearly stating that the learned policy network is applied to node-based representations.
Strengths: 1. Collecting labeled training data can in general be difficult and costly. The paper provides a transferable active learning strategy for semi-supervised node classification on graphs, which can potentially address this challenge in this specific setup.
Weaknesses: 1. The paper focuses only on node classification for performance evaluation. There are more tasks with graphs, including link prediction, as mentioned by the authors. It would be interesting to check the performance of the proposed active learning strategy on these tasks. 2. More critically, the presentation needs to be significantly improved. Specifically, it is not clear how the graph policy network and the node classification network interact with each other. It is not clear either how the learned policy network can be applied to the target graph if the source and target graphs have different topologies, or different numbers of classes.
Correctness: The idea of training a graph policy network does make sense for active learning in graph neural networks. However, the clarity is lacking.
Clarity: In addition to the aforementioned presentation problems, there are numerous others: 1) For the node class prediction probabilities, there is no clear description of how they are computed. 2) In equation (2), and throughout the paper, it is not clear which evaluation metric is used to estimate the reward. Is it micro-F1, macro-F1, or something else? 3) The definition in equation (3) is not clear, as the authors did not even define the action space. Is it defined for each node or for the whole graph? The paper definitely needs careful rewriting.
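To make point 2) concrete: the two candidate metrics differ in how they aggregate per-class errors, and on class-imbalanced graphs the choice materially changes the reward signal, so the paper should state which one is used. A small self-contained sketch of both metrics for single-label multi-class predictions (the toy labels below are hypothetical and not from the paper):

```python
def f1_per_class(y_true, y_pred, cls):
    # Standard precision/recall/F1 for a single class.
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    prec = tp / (tp + fp) if (tp + fp) else 0.0
    rec = tp / (tp + fn) if (tp + fn) else 0.0
    return 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0

def macro_f1(y_true, y_pred):
    # Unweighted mean of per-class F1: sensitive to rare classes.
    classes = sorted(set(y_true))
    return sum(f1_per_class(y_true, y_pred, c) for c in classes) / len(classes)

def micro_f1(y_true, y_pred):
    # For single-label multi-class prediction, micro-F1 reduces to accuracy.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [0, 0, 0, 1, 1, 2]   # toy ground-truth labels
y_pred = [0, 0, 1, 1, 1, 0]   # toy predictions
```

On this toy example micro-F1 is 4/6 while macro-F1 is roughly 0.49, because the rare class 2 is never predicted correctly and drags the macro average down.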
Relation to Prior Work: The authors implemented a deep reinforcement learning strategy for active learning, as in , for the graph node classification problem. There are also existing active learning methods for graph data, including the ones in , , and , as the authors discussed in the related work. However, the performance comparison with  was not reported.
Additional Feedback: Update during rebuttal: The authors have clarified many points during the rebuttal, which is highly appreciated. I do hope they can significantly rewrite Section 3 (with an updated Figure 1) and the Appendix to make the presentation of the proposed method more self-explanatory.
Summary and Contributions: This paper presents a novel active learning strategy for GNNs which leverages a GNN-based policy network. Experiments show that in a semi-supervised node classification setting, the proposed approach can greatly reduce the label budget.
Strengths:
- GPA is an elegant approach to address the widespread scarcity of labels in graph ML tasks.
- The experiments are very convincing, both in terms of raw performance and in terms of the ablation study, query budgets, etc. After reading the paper, I was left with the impression that each design choice made in GPA has been carefully vetted by the authors.
Weaknesses:
- The 5 graphs from "different domains" in fact cover the same domain (i.e., they are all citation networks).
- Figure 2 (left) would have greatly benefited from an additional curve showing the performance of a recent, SotA GNN architecture.
- There is a lack of discussion of the training overhead induced by GPA.
Correctness: To the best of my knowledge, the proposed method is sound. The empirical methodology is exhaustive and robust. I inspected the code as well, and it appears to be of good quality.
Clarity: Yes, the paper is well written and well structured. It was honestly a pleasure to read.
Relation to Prior Work: The discussion on the related work is reasonably extensive. I would have liked to see more recent GNN architectures, both in the related work and in the experiments (to find out if GPA would carry the same benefits).
Additional Feedback: I thoroughly enjoyed reading this paper, and I found the GNN-based policy network to be a very elegant and effective idea. I raised a few concerns in the previous sections of my review, which I'm confident the authors will be able to address in the next revision of the manuscript. As an additional minor comment, Figure 1 could be made more readable -- for instance, it is hard to understand that the 4 graphs in the policy network \pi panel represent different feature aggregations. === Response === I would like to thank the authors for their thorough rebuttal, and for their quick responses. The additional details provided by the authors confirmed my understanding of the paper, and solidified my very positive score.