The reviewers found this paper interesting and compelling, as R2 nicely summarized in discussion: "I think the method is sound and exciting and the key challenges in transferability live in the availability of (high-accuracy) training data and in the challenges of representation learning for molecules (GCNs need to be exposed to a lot of chemical variability to be able to interpolate in chemical space). The alkanes are essentially the same bond over and over and lignin is trained and tested in the same chemical space. I insist that these are representation learning challenges to be solved by the community and improvements there could be combined with this RL approach." That said, the reviewers did find several areas where the paper can be improved. I understand that page limits may prevent the authors from incorporating every suggestion, but I expect them to address as many as possible in the final main text, and to address all feedback either in the main text or in a supplementary appendix.