Review for NeurIPS paper: Constant-Expansion Suffices for Compressed Sensing with Generative Priors

NeurIPS 2020

Constant-Expansion Suffices for Compressed Sensing with Generative Priors

Review 1

Summary and Contributions: This paper is about compressed sensing (CS) under generative priors. In such a problem, undersampled linear measurements of a signal of interest are provided, and the signal is sought. The mathematical ambiguity is resolved by finding the feasible point that is in the range of a trained generative model (such as a GAN), which is itself computed by solving an empirical risk minimization. Existing theory establishes a convergence guarantee of an efficient algorithm under an appropriate random model for the weights of the generative prior. The convergence guarantee assumes that the generative model is a multilayer perceptron where the width of each layer grows log-linearly. As real neural networks can succeed despite expanding at every layer, it is important to extend the existing theory to less restrictive assumptions regarding network expansivity. The present work establishes comparable convergence guarantees for CS with generative priors with only linear expansivity over the layers of a network. To do this, the authors introduce a novel technique for establishing concentration of random functions. The work of this paper immediately generalizes several existing works, including compressive sensing with generative priors and phase retrieval with generative priors. The authors additionally contribute a tight lower bound on the expansivity of one-layer generative models in order to allow invertibility (of generative models) without noise or compressive measurements.

Strengths: This paper presents an original and creative new proof technique for establishing concentration of random functions. Such concentration results are common in machine learning and signal processing, and this technique could potentially be extended to many new contexts. The paper broadens the applicability of rigorous recovery guarantees for deep generative neural networks when used as priors for inverse problems. These improvements immediately provide advances to multiple important inverse problems.

Weaknesses: Along the main theme of the paper, I do not see any significant weaknesses. My only relatively minor issue with the paper is regarding the lower bound on generative model expansivity. There is an apparent contradiction between the provided lower bound that expansivity must be at least a factor of 2 at each layer (Line 165 and Section C) and the comment on line 65 that "The expansivity condition is a major limitation of the prior theory, since in practice neural networks do not expand at every layer." The paper could be improved by commenting on / resolving this contradiction. In Section C, the authors state that "it suffices to consider a one-layer neural network." Under the presumption that the expansivity at each layer is constant, this comment is true, but it is misleading because expansivity can in generally vary between layers, as it does in real networks. If the first layer is expansive but the subsequent layers are not expansive, then there is not necessarily a problem. If two different activations of the second layer give identical activations in the third layer, it is possible that only one of these second-layer activations lies in the range of the earlier layers).

Correctness: Yes

Clarity: Yes

Relation to Prior Work: Yes, though I think that in Line 84, the authors should add a reference to "A Geometric Analysis of Phase Retrieval" by Sun, Qu, Wright, Foundations of Computational Mathematics, 2017. https://dx.doi.org/10.1007/s10208-017-9365-9

Reproducibility: Yes

Additional Feedback: Overall, great work. Unclear sentence: Line 79 "The Gaussian noise setting admits an efficient algorithm with recovery error Theta(k/m)" does not provide sufficient clarity of context. You may want to mention the denoising problem for increased clarity. Typos: Line 225 - "family of functions f_w(x) = w.x is (eps, c eps , 1/2)-pseudo-Lipschitz". A factor of two might be missing (or perhaps a different constant c is used with the previous line). Line 562 in Supplemental materials is missing a ")" ******** After Reading Author Responses I am satisfied with the author response, and I still believe the paper is a good submission that should be accepted to this conference.

Review 2

Summary and Contributions: This is in fact a paper on uniform concentration. The question is as follows. Given a family of functions {f_t}_{t in T} (where T is a probability space) over some space X, we know that f_t(x) is close to f(x) with high probability for a random t, and wish to show that with high probability, f_t(x) is uniformly close to f(x) simultaneously for all x in X. Usually one would assume that f_t and f are Lipschitz (over T x X and over X, respectively), and use a net argument. However, this fails to give the tight bound of the dimension of a Gaussian random matrix that satisfies the weight distribution condition (WDC). The looseness comes from the fact that treating f_t(x) as a function over T x X and requiring Lipschitzness would incur a large Lipschitz constant. The innovation in this paper is that, while it still resorts to an eps-net type argument, it does not use the same standard ball as the neighbourhood for the net, and does not use the same net for all f_t. Instead, for each f_t, it uses potentially a different shape of neighbourhood to form the net and randomizes the net points. This sort of decouples the net from the parameter t, allowing for a tighter bound.

Strengths: This technique is potentially useful for many other problems which relies on uniform concentration. Various communities may be benefited from this technique.

Weaknesses: Nothing really.

Correctness: I read the main body of the paper and found the approach convincing and so I believe it is correct. I did not read the proofs in the supplementary material.

Clarity: The paper is also well-written. I find it very pleasant to read. Small typos: Line 101: “uniformly near” sounds incomplete. Maybe “uniformly concentrated near”? Line 255: u^T G(x,y) u^T should be u^T G(x,y) u Line 258: both bounds (1) and (2) refer to the two bounds listed above, instead of equations (1) and (2), right? Maybe use (a) and (b) here to avoid confusion

Relation to Prior Work: The difference in approach is clearly stated.

Reproducibility: Yes

Additional Feedback:

Review 3

Summary and Contributions: This paper considers the problem of retrieving a signal that can be expressed as the output of a neural network with random weights applied to a low-dimensional vector. Previous work developed guarantees for this problem under the condition that the dimension of the weight matrices in the network grow by a logarithmic factor. Here the authors show that it is sufficient for the dimension to grow by a constant factor. To this end they introduce a concept called pseudo-Lipschitzness and use it to prove a concentration bound.

Strengths: The paper is written beautifully. The authors explain their contributions extremely clearly. The mathematical content is quite interesting and well explained. The concentration bound may be useful to analyze other problems.

Weaknesses: The signal model considered in the paper is essentially of theoretical interest: the signal is the output of a neural network with random weights, encoded by a low-dimensional vector at the input. Its connection to signals or measurement models of practical interest is unclear, which will limit the impact of the paper and its potential audience to very mathematically-oriented readers. The mathematical contribution may be interesting in itself but its applications seem somewhat restricted in scope.

Correctness: Yes, as far as I can tell.

Clarity: Yes, extremely well written.

Relation to Prior Work: Yes.

Reproducibility: Yes

Additional Feedback: I have updated my score after reading the author feedback.