NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID:2565
Title:Efficient Deep Approximation of GMMs

Reviewer 1


		
Clarity: The paper is very well written and the contributions are clearly expressed. Quality: The technical quality of the presentation is high. Originality: The techniques seem to be original (although I am not an expert in this area, so my knowledge of the surrounding literature is somewhat limited). Significance: Unclear. The authors use the adjectives "shallow" and "deep" in several places, but they really seem to be contrasting networks with one vs. two hidden layers. It is not clear how, or whether, the approach can lead to non-trivial separation results for *actual* deep networks. === Update after rebuttal period: I am bumping the score up. Please add formal comparisons with Eldan-Shamir '16.

Reviewer 2


		
This paper takes the discriminant function of a commonly used machine learning model, the Gaussian mixture model (GMM), in the classification setting as the target for approximation by neural networks, and provides a clear theoretical analysis of it. Specifically, it proves that a linear number of nodes is both necessary and sufficient for a two-layer neural network to approximate the GMM discriminant function, and that an exponential number of nodes is necessary for a single-layer network. The paper makes a good contribution toward consolidating the theoretical basis of deep learning methods. === After rebuttal: no change to the score. Theoretical work on deep learning is still far from sufficient, and this paper counts as one piece of it.

Reviewer 3


		
In this article, the authors study the approximation complexity of the discriminant function of GMMs and show that deeper networks (i.e., neural network models with two hidden layers) can exponentially reduce the number of parameters needed, compared to shallow ones with a single hidden layer. In this vein, I find that this work complements the existing literature by considering the practically relevant context of separating GMMs. If the authors had managed to relax Assumptions 1-3 to cover more practical settings, my rating would be higher, but I think it is entirely appropriate to postpone such improvements to future work. **After rebuttal**: I have read the author response and my score remains the same.