Processing math: 80%

A Comprehensively Tight Analysis of Gradient Descent for PCA

Part of Advances in Neural Information Processing Systems 34 (NeurIPS 2021)

Bibtex Paper Reviews And Public Comment » Supplemental

Authors

Zhiqiang Xu, Ping Li

Abstract

We study the Riemannian gradient method for PCA on which a crucial fact is that despite the simplicity of the considered setting, i.e., deterministic version of Krasulina's method, the convergence rate has not been well-understood yet. In this work, we provide a general tight analysis for the gap-dependent rate at O(1Δlog1ϵ) that holds for any real symmetric matrix. More importantly, when the gap Δ is significantly smaller than the target accuracy ϵ on the objective sub-optimality of the final solution, the rate of this type is actually not tight any more, which calls for a worst-case rate. We further give the first worst-case analysis that achieves a rate of convergence at O(1ϵlog1ϵ). The two analyses naturally roll out a comprehensively tight convergence rate at O(1max. Particularly, our gap-dependent analysis suggests a new promising learning rate for stochastic variance reduced PCA algorithms. Experiments are conducted to confirm our findings as well.