Paper ID: | 3379 |
---|---|

Title: | UniXGrad: A Universal, Adaptive Algorithm with Optimal Guarantees for Constrained Optimization |

originality: I don't know of any other results of this form, although the non-adaptive problem has been solved previously, and the unconstrained problem has been solved up to logs. minor nit: I don't think the version of lemma 1 in cutkosky2019 has log dependencies - it looks like lemma 1 is a special case of theorem 1 in cutkosky2019. quality: This paper presents a clean solution to an important open problem in adaptive convex optimization. The techniques seem interesting and may be useful for other analyses. I liked the use of optimism here and I think this in particular is an under-utilized type of guarantee that might have more applications. clarity: the paper is clearly written. I found even the proofs in the appendices to be fairly painless to follow! significance: This is paper solves an interesting open problem in adaptive stochastic convex optimization and so I believe its results to be significant. -------------- I have read the author response and maintain my score.

I have read the author's response and appreciate the proof provided in this work. I updated my score to 8. ================================ The paper is very cleanly written. Problem setup, proofs and convergence guarantees are easy to read. The main result is new. The theoretical significance of this result is high as it provides a clean way to achieve acceleration for constrained smooth/nonsmooth convex optimization problems. The empirical significance is limited but it's fine as experiment improvement is not the main claim of the paper.

Overall, I think the result is nice and the analysis has some new techniques which I have not seen before. The only issue is that the authors do not consider the strongly convex case, which is more challenging. In the previous work, for example, the work by Guanghui Lan, achieving the uniformly optimal rate for non-strongly convex case is easy but for strongly convex case is much harder. I wonder if the authors can discuss the main difficulty of extending their results to the strongly convex case. Another suggestion is to explicitly define alpha_t=t. I can only find alpha_t defined in Lemma 1 and have to guess that it is still defined as alpha_t=t for the rest of the theorem and their proofs. Please define alpha just like how you define eta_t. If alpha_t varies with theorems, please define them separately. ++++++++++++++ I have read the comments from other reviewers and the responses. I am satisfied with the responses and will keep my score still at 7.