Authors

Ananda Theertha Suresh, Alon Orlitsky, Jayadev Acharya, Ashkan Jafarpour

Abstract

Many important distributions are high dimensional, and often they can be modeled as Gaussian mixtures. We derive the first sample-efficient polynomial-time estimator for high-dimensional spherical Gaussian mixtures. Based on intuitive spectral reasoning, it approximates mixtures of $k$ spherical Gaussians in $d$-dimensions to within$\ell_1$ distance $\epsilon$ using $\mathcal{O}({dk^9(\log^2 d)}/{\epsilon^4})$ samples and $\mathcal{O}_{k,\epsilon}(d^3\log^5 d)$ computation time. Conversely, we show that any estimator requires $\Omega\bigl({dk}/{\epsilon^2}\bigr)$ samples, hence the algorithm's sample complexity is nearly optimal in the dimension. The implied time-complexity factor \mathcal{O}_{k,\epsilon}$is exponential in$k$, but much smaller than previously known. We also construct a simple estimator for one-dimensional Gaussian mixtures that uses$\tilde\mathcal{O}(k /\epsilon^2)$samples and$\tilde\mathcal{O}((k/\epsilon)^{3k+1})\$ computation time.