NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID:295
Title:AGEM: Solving Linear Inverse Problems via Deep Priors and Sampling

Reviewer 1

The paper deals with inverse problems addressed within the Bayesian framework. More precisely, the prior for the unknown object is defined through a given set of candidate objects (data-driven prior). The paper proposes a solution to the problem of hyperparameter tuning, and it is original in this respect. It clearly advances the state of the art for an important class of problems. The paper is technically sound and clearly written.

Reviewer 2

Quality: The overall approach proposed by the paper seems to be technically sound. One weakness of the proposed AGEM is that the result is sensitive to the sigma of the proposal distribution in the MH-E step. I wonder whether the authors tried other MCMC sampling algorithms, such as HMC, or adaptive MH proposals (such as Haario et al., 2001). Overall, the paper did a great job of clearly stating the experimental details, making the results reproducible.

Clarity: The paper is well written and easy to follow. The details of the datasets and experiments are well documented.

Originality: The proposed AGEM algorithm for simultaneously solving a linear inverse problem and estimating the noise level is novel to my knowledge.

References: Haario, H., Saksman, E., & Tamminen, J. (2001). An adaptive Metropolis algorithm. Bernoulli, 7(2), 223-242.
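To make the suggestion about adaptive MH proposals concrete, here is a minimal sketch of the adaptive Metropolis idea of Haario et al. (2001), in which the Gaussian proposal covariance is re-estimated from the chain history instead of relying on a hand-tuned sigma. This is not the paper's MH-E step: the function name `adaptive_metropolis`, the toy Gaussian target, the identity initial covariance, and the values of `t0` and `eps` are all assumptions made for illustration, and the covariance is recomputed from scratch at every step for readability rather than with the recursive update used in practice.

```python
import numpy as np

def adaptive_metropolis(log_target, x0, n_steps=5000, t0=500, eps=1e-6, seed=0):
    """Random-walk Metropolis whose proposal covariance adapts to the chain history.

    Hypothetical helper for illustration; not taken from the reviewed paper.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    s_d = 2.4 ** 2 / d            # scaling factor suggested by Haario et al. (2001)
    cov = np.eye(d)               # fixed initial proposal covariance (assumption)
    chain = np.empty((n_steps, d))
    log_p = log_target(x)
    accepted = 0
    for t in range(n_steps):
        if t >= t0:
            # After the initial period, use the empirical covariance of the
            # past samples, regularised by eps * I so it stays positive definite.
            emp_cov = np.cov(chain[:t], rowvar=False).reshape(d, d)
            cov = s_d * (emp_cov + eps * np.eye(d))
        proposal = rng.multivariate_normal(x, cov)
        log_p_new = log_target(proposal)
        # Symmetric proposal, so the acceptance ratio is just the target ratio.
        if np.log(rng.uniform()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
            accepted += 1
        chain[t] = x
    return chain, accepted / n_steps

# Toy target (assumption): zero-mean Gaussian with variances 1 and 25.
variances = np.array([1.0, 25.0])
toy_log_target = lambda x: -0.5 * np.sum(x ** 2 / variances)
chain, acceptance_rate = adaptive_metropolis(toy_log_target, np.zeros(2))
print("acceptance rate:", acceptance_rate)
print("per-coordinate std of the chain:", chain.std(axis=0))
```

With a scheme of this kind the proposal scale follows the shape of the target, which is the property that might reduce the sensitivity to sigma noted above; whether it helps in the AGEM setting would have to be checked experimentally.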

Reviewer 3

Response to authors: I thank the authors for their answers. I appreciate the additional results provided by the authors, which show the stability of their approach. I still think that the paper needs more work before publication (see below). However, after discussions with the other reviewers, I will upgrade my original suggestion.
* Currently the paper does not show its contributions very well, and does not put them in perspective w.r.t. prior work (MMSE estimators and noise estimation methods).
* As the authors mention, the paper proposes a new sampling technique for the posterior: we should be able to see some examples of these samples, to check whether they have "visual" quality.
* The sampling quality is not very well supported by the proposed method. The ADMM approach (which is built on top of MAP) achieves better PSNR values on average than the MMSE solution. This shows that the method is not performing well (at least compared to MAP solutions). I do not expect the method to perform well here, but I think it is necessary that we see the evidence and hear what the authors think and have investigated in this regard.
-----------------------------
This paper presents a method for sampling a posterior distribution using DAEs for solving linear inverse problems. The sampling technique is based on two approximations: kernel density estimation, as in the DMSP approach, and a linear approximation of the density (in the log domain) for sampling. The paper is well written and the proposed technique is novel. I have, however, one major concern about the theoretical explanations and experiments. The proposed method, AGEM, is not a MAP estimator as far as I understand: averaging samples from the posterior leads to the MMSE solution and not the most probable one (MAP). This, however, is not stated at the beginning (Eq. 2), where the method is presented as a MAP solution. I think that having a Minimum Mean Squared Error (MMSE) estimator is, by itself, very interesting and a good contribution. However, the presentation of the method suggests that the authors did not consider this fact. I ask the authors to please address this issue and to clarify if this is not the case. Placed in this context, the experiments and results are missing the more important comparisons to other MMSE estimators.
Summary of the points:
0) The AGEM method is an MMSE estimator, not MAP.
1) Equation 15 seems to be missing the q() terms from Eq. 14; please clarify.
2) The linear approximation in Eq. 16 needs to be discussed and clarified. What are the consequences? Alain and Bengio use similar approximations by taking a summation over interpolated samples. How do the two approaches compare?
3) I think the paper is missing a convergence visualization (change in results/PSNR) of AGEM for different numbers of samples.
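To illustrate points 0) and 3), the following toy sketch shows why averaging posterior samples yields the posterior mean (the MMSE estimate) rather than the mode (the MAP estimate), and how the sample average changes with the number of samples. The skewed Gamma "posterior" and all variable names are assumptions chosen only because the mean and mode are known in closed form; nothing here reproduces AGEM or its experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy skewed "posterior" (assumption): Gamma with shape 3 and scale 1.
# Its mean is shape * scale = 3 and its mode is (shape - 1) * scale = 2.
shape, scale = 3.0, 1.0
samples = rng.gamma(shape, scale, size=100_000)

mmse_estimate = samples.mean()          # the sample average approximates the posterior mean
map_estimate = (shape - 1.0) * scale    # closed-form mode, valid for shape >= 1

print("sample average (MMSE):", mmse_estimate)
print("posterior mode (MAP): ", map_estimate)

# Convergence of the sample average as more samples are used.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, samples[:n].mean())
```

For a symmetric unimodal posterior the two estimates coincide, but for a skewed or multimodal one they differ, which is why the distinction matters for the presentation of Eq. 2 and for the choice of baselines.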