Review for NeurIPS paper: Barking up the right tree: an approach to search over molecule synthesis DAGs

NeurIPS 2020

Review 1

Summary and Contributions: This submission describes an autoencoder for directed acyclic graphs that can be applied to molecular synthesis graphs. Synthesizability is an underappreciated aspect of molecular generation. The decoder ensures that valid synthetic pathways are generated, subject to an external oracle that predicts reaction products from reactants. An optimization setting is also demonstrated where reinforcement learning is used to bias generation, neglecting the latent encoding.

Strengths: The complex process of generating a DAG is cleverly posed as a sequential decision-making process while imposing some constraints on the action space (e.g., selecting commercially-available molecules, forbidding cyclic pathways) informed by the domain. Remarkably, this actually works as demonstrated through Figure 5’s walk in the latent space. Empirical results for goal-directed optimization are thorough and convincing.

Weaknesses: I do not see any weaknesses with the current work, only exciting potential extensions. I can tell that the authors were constrained by the page limits, but the Appendix contains more complete explanations of the work.

Correctness: Yes, I have no concerns.

Clarity: The only suggestion I would make is to add a caveat to the main text that both the Molecular Transformer and the CASP oracle are imperfect; computer scientists unfamiliar with chemistry might not otherwise appreciate that point. Minor point: the (lack of) capitalization of article titles in the references is odd. Typos include “WAE” and a missing space in Figure 5’s caption.

Relation to Prior Work: Yes.

Reproducibility: Yes

Additional Feedback: I would highly recommend discussing the reconstruction accuracy of DoG-AE (Appendix L168) in the main text. I know it’s hard to contextualize whether 65% is good or bad (and there are no other approaches to benchmark against), but this is an important statistic to mention. Further, because this is an autoregressive model, it would be appropriate to examine reconstruction accuracy on the test set when using a beam search of some kind. --- I've read the rebuttal - still strongly support acceptance
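To make the beam search suggestion concrete, here is a minimal, generic sketch of beam search over an autoregressive decoder. It is not the authors' decoder: `step_log_probs`, `toy_model`, and the `"<end>"` token are hypothetical stand-ins for whatever next-action distribution the real model exposes. Under this setup, a test-set reconstruction could be counted as correct if any of the returned sequences matches the input.

```python
import math
from typing import Callable, Dict, List, Tuple

Sequence = Tuple[str, ...]


def beam_search(
    step_log_probs: Callable[[Sequence], Dict[str, float]],
    beam_width: int,
    max_len: int,
    end_token: str = "<end>",
) -> List[Tuple[Sequence, float]]:
    """Return candidate sequences sorted by total log-probability (best first)."""
    beams: List[Tuple[Sequence, float]] = [((), 0.0)]
    finished: List[Tuple[Sequence, float]] = []
    for _ in range(max_len):
        # Expand every live prefix by every possible next token.
        candidates = [
            (seq + (tok,), score + lp)
            for seq, score in beams
            for tok, lp in step_log_probs(seq).items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        # Keep only the top `beam_width` expansions; set aside completed ones.
        for seq, score in candidates[:beam_width]:
            if seq[-1] == end_token:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # prefixes cut off by max_len are returned as-is
    return sorted(finished, key=lambda c: c[1], reverse=True)


def toy_model(prefix: Sequence) -> Dict[str, float]:
    """Hypothetical next-token distribution in which greedy decoding is suboptimal."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 1.0},
    }
    probs = table.get(prefix, {"<end>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


greedy = beam_search(toy_model, beam_width=1, max_len=5)
wide = beam_search(toy_model, beam_width=2, max_len=5)
```

With `beam_width=1` the search commits to the locally best first token `"a"` and ends with total probability 0.3, whereas `beam_width=2` recovers the sequence starting with `"b"`, which has total probability 0.4. The same effect could raise reconstruction accuracy relative to greedy decoding.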

Review 2

Summary and Contributions: The submission presents a molecular generation framework that incorporates synthesizability information by directly generating chemical reaction pathways (more specifically, reaction DAGs) that specify how the molecule is made. The model is shown to have competitive performance on a variety of molecule generation and optimization benchmark metrics.

Strengths: The problem is interesting, and improvements in molecular property optimization could have a significant impact on drug discovery, which is very relevant in the current environment. The proposed architecture for generating synthesis DAGs is very interesting and allows for multi-step, non-linear synthesis routes. A wide range of molecular generation baselines are benchmarked. The proposed model has competitive performance on a variety of benchmark metrics. Code is provided.

Weaknesses: The model assumes that a perfect forward reaction predictor oracle exists for the generation process. Given that this cannot be true, it would have been interesting to see how sensitive the proposed model is to varying error rates of the reaction predictor module, or at least some comment on this issue. E.g., does the reaction DAG generation process simply break if a spurious reaction prediction is made?
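As a back-of-the-envelope illustration of why this matters, the sketch below assumes that each reaction step in a route is mispredicted independently with the same probability. Both the independence assumption and the function name are mine, not the paper's; real predictor errors are likely correlated with reaction type.

```python
def route_survival_probability(error_rate: float, n_reactions: int) -> float:
    """Probability that every reaction in a route is predicted correctly,
    assuming independent per-step errors (an illustrative assumption)."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie in [0, 1]")
    if n_reactions < 0:
        raise ValueError("n_reactions must be non-negative")
    return (1.0 - error_rate) ** n_reactions


# Tabulate survival for a few hypothetical error rates and route lengths.
table = {
    (eps, n): route_survival_probability(eps, n)
    for eps in (0.05, 0.10, 0.20)
    for n in (1, 3, 5)
}
```

Under this simplification, a 10% per-step error rate already leaves only about 73% of three-step routes fully correct, so longer synthesis DAGs would be disproportionately affected.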

Correctness: Seems to be correct

Clarity: The paper is well written and clearly arranged. The fonts and structures in the figures are too small, e.g., Figures 3, 4, and 5.

Relation to Prior Work: Related work section is quite comprehensive

Reproducibility: Yes

Additional Feedback: The model requires a large reaction data set in order to generate sufficient quantities of synthesis DAGs. From the largest public reaction dataset (USPTO), only 72008 reaction DAGs could be extracted. This seems like quite a small training dataset. Do you think it is a significant constraining factor? Once again, the font size in the figures is far too small and hard to read, especially in print. --- After rebuttal and reviewer discussion --- Still support acceptance of the paper.

Review 3

Summary and Contributions: The authors propose a method for generating multi-step molecular synthesis routes, given a target molecule as input. Specifically, by representing the synthesis route as a directed acyclic graph (DAG), the authors propose a hierarchical neural message passing procedure that exchanges information not only among atoms in molecules but also among the nodes in the DAG, and develop an efficient serialization procedure for DAGs.

Strengths: (1) The paper tackles a very interesting problem: reverse-engineering the synthesis procedure that generates the molecule of interest. (2) The paper is very well written, and the figures are very illustrative.

Weaknesses: (1) Each DAG may correspond to multiple topological orderings; it is unclear whether they are all valid synthesis procedures. If so, are they all enumerated in the training step? (2) It is unclear whether every partial synthesis procedure (i.e., a subset of actions up to some moment) also corresponds to a valid, in-distribution action embedding.
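To illustrate point (1), the sketch below enumerates every topological ordering of a small DAG by backtracking. The toy DAG and its node names are hypothetical and are not taken from the paper; the example only shows how quickly the number of valid serializations grows.

```python
from typing import Dict, Iterator, List, Set


def all_topological_orders(parents: Dict[str, List[str]]) -> Iterator[List[str]]:
    """Yield every ordering in which each node appears after all of its parents."""
    nodes = list(parents)

    def extend(prefix: List[str], placed: Set[str]) -> Iterator[List[str]]:
        if len(prefix) == len(nodes):
            yield list(prefix)
            return
        for node in nodes:
            # A node may be placed once all of its parents are already placed.
            if node not in placed and all(p in placed for p in parents[node]):
                prefix.append(node)
                placed.add(node)
                yield from extend(prefix, placed)
                prefix.pop()
                placed.remove(node)

    yield from extend([], set())


# Hypothetical toy synthesis DAG: building blocks A and B react to give the
# intermediate I; I and building block C then react to give the product P.
toy_dag = {"A": [], "B": [], "C": [], "I": ["A", "B"], "P": ["I", "C"]}
orders = list(all_topological_orders(toy_dag))
```

Even this five-node DAG admits 8 distinct orderings, each ending in the product `P`, so whether training uses one canonical ordering or samples among them is a meaningful design choice.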

Correctness: It seems reasonable, though I must admit it’s out of my expertise domain.

Clarity: Yes

Relation to Prior Work: I don't have a judgement because it’s out of my expertise domain.

Reproducibility: No

Additional Feedback:

Review 4

Summary and Contributions: The work fills an important gap in molecular design and optimization by incorporating the synthesis pathways of the molecules into the process, which is a prerequisite for experimental testing. The contributions include: 1. Representing a synthesis pathway as a DAG and serializing it so that an RNN can be used. 2. A new message passing procedure over atoms in the molecules as well as nodes in the DAG, so that a DAG can be encoded into a latent vector. 3. Competitive results compared to other baselines.

Strengths: 1. It fills an important application gap in molecular design and optimization. 2. Many technical contributions are made in order to use synthesis pathways. 3. I believe it will be an important work for the molecule design and optimization community.

Weaknesses: Incorporating synthesis pathways into molecule design and optimization makes sense to me, but it comes with extra complexity. Apart from the good empirical results the authors have shown, can you also comment on the complexity with regard to implementation, training time, and inference time, especially considering that JT-VAE is better on 3/5 metrics in Table 1?

Correctness: Yes.

Clarity: The paper is very well written.

Relation to Prior Work: Yes.

Reproducibility: Yes

Additional Feedback: There are many modelling choices, and an ablation study or a more detailed study of those components would shed more light on the method. For example: in the action embedding, a GNN is preferred over fingerprints because other work shows that GNNs perform well. But is that really true in your setting? If so, how much performance degradation is observed with fingerprints? If the difference is small, fingerprints are much simpler to use. --- After reading the author's feedback --- I am keeping the accept suggestion.