Reviews: Primal-Dual Block Generalized Frank-Wolfe

After considering the rebuttal and discussing the paper, the reviewers significantly updated their scores and reached a consensus to accept the paper, agreeing that it makes a nice contribution to NeurIPS, *assuming* the clarifications provided in the rebuttal are implemented in the camera-ready version. It is important that the authors carefully update their camera-ready version in light of the reviewers' comments (I will check!).

Some side comments:

- I fully agree with R4 that calling the method "Frank-Wolfe" is quite misleading. [1] had the excuse of considering a more specific problem (where the k-SVD can be argued to cost k 1-SVDs). I suggest the authors change the title to, e.g., "Primal-Dual Block Generalized Frank-Wolfe", as generalized FW (e.g. "Generalized Conditional Gradient for Sparse Estimation", JMLR 2017) already exists with more powerful oracles than an LMO.
- R1 and R4 mention some papers that should be cited in the related work. For example, for L64-66, [22] was extended in Osokin et al. ICML 2016 to obtain (almost) a linear convergence rate. Also, Gidel et al. AISTATS 2017 presents a FW method for saddle point problems which could be applied to (7) when the Fenchel dual function has compact support (e.g. with f_i being the hinge loss), and which has a linear convergence rate when the operator is strongly monotone.
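
For readers less familiar with the terminology in the side comments, here is a minimal sketch of what a linear minimization oracle (LMO) is in classical Frank-Wolfe. It is not the paper's algorithm: the l1-ball constraint, the toy quadratic objective, the step size 2/(t+2), and the names `lmo_l1_ball` and `frank_wolfe` are all assumptions chosen for illustration.

```python
def lmo_l1_ball(grad, radius):
    """LMO for the l1 ball: return argmin of <grad, s> over ||s||_1 <= radius.

    The minimizer is a signed, scaled coordinate vector at the entry of
    grad with the largest magnitude (illustrative helper, not from the paper).
    """
    i = max(range(len(grad)), key=lambda j: abs(grad[j]))
    s = [0.0] * len(grad)
    if grad[i] != 0.0:
        s[i] = -radius if grad[i] > 0 else radius
    return s


def frank_wolfe(grad_fn, x0, radius, num_iters=200):
    """Classical Frank-Wolfe using only LMO calls and the 2/(t+2) step size."""
    x = list(x0)
    for t in range(num_iters):
        g = grad_fn(x)
        s = lmo_l1_ball(g, radius)  # the only access to the constraint set
        gamma = 2.0 / (t + 2.0)
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x


# Toy problem (assumed): minimize 0.5 * ||x - target||^2 over the unit l1 ball.
target = [2.0, 0.0, 0.0]
grad_fn = lambda x: [xi - ti for xi, ti in zip(x, target)]
x_hat = frank_wolfe(grad_fn, [0.0, 0.0, 0.0], radius=1.0)
```

Each iteration touches the constraint set only through one LMO call, which returns a single vertex. A method whose per-iteration oracle does more than this is the kind of step that the "generalized" terminology is meant to signal.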

Paper ID:	7759
Title:	Primal-Dual Block Generalized Frank-Wolfe