Reviews are all on the accept side: 1 top 50% of accepted and 3 marginally above threshold. Only R4 (strong accept) took part in the discussion. As the main reason for calling this paper borderline was limited novelty compared to , I had to proceed to a detailed comparative rereading of this paper against . In my opinion, this approach is very different from . While the authors presented it as only introducing a small modeling difference from , this difference has a large impact throughout, in particular on the resulting DNN architecture and the inference process.

The failure to appreciate this novelty may, however, stem from clarity issues. As pointed out by R1, "the equations could be more organized to see the contribution compared to the previous world.", and I do not think one can fully understand and reproduce the model from its present description. We note that the authors also attached their code and stated in the rebuttal: "R1: We will reorganize the equations to make the analytic comparisons between our approach and previous work more clear", and "R4: We will add a diagram to clarify the encoding/decoding process."

In summary:
- I believe there is enough novelty in this paper for it to be accepted.
- The absence of fully understandable descriptions seems to be an issue at every level of this paper: equations, model description, training procedure, experiments. The authors have stated they will clarify all of this, but will they have enough space?
- 2 reviewers mentioned the confusing use of 'reparameterization' in the title: as the authors have not addressed it in their rebuttal, let me raise this as a third reminder.

Another reason to accept: language modeling approaches with explicit topic modeling put a much stronger burden of description on the authors than black boxes such as GPT-3 or T5, and authors should not be over-penalized for that effort.