Paper ID: 365

Title: Multiway clustering via tensor block models

Comment on rebuttal: Thanks for addressing my concerns. I have increased my score. By my comment on \lambda in Table 1, I meant that the bottom of the table hides the bar, not that the definition of \bar{\lambda} is unclear.

---------------------------------------------------------------------------------------------------------

The paper studies the problem of recovering block structure from a noisy tensor, which can be seen as a higher-order extension of stochastic block models. A least-squares estimator with good convergence properties is proposed, together with an alternating least squares implementation. Extensions are proposed to select the number of blocks and to model sparse data.

This is a well-executed paper. The formulation is clear and the flow of ideas is natural. Theoretical analysis and comprehensive experiments are provided. I did not check the correctness of the proofs. I do have some minor comments, though:

- L59: I believe by "fiber" the authors mean "slice".
- L131: How do you know your estimator is "nearly optimal"? If this is based on equation (6), I would remove the claim and perhaps mention later that equation (6) provides *suggestive evidence* for optimality.
- L230: Please indicate that this is the *RMSE* rate and use big-O notation to make it clear that a term was dropped.
- In the sparsity experiment, it seems you are using \rho instead of p to indicate sparsity. Please fix that and specify the value of \rho (C norm) you used.
- Also in the sparsity experiment, it does not make much sense to report baseline results; they add no information and only make the table more difficult to read.
- In the description of Table 1: the bar of \bar{\lambda} is not clear.

I have read the author response and found that the authors substantially addressed my concerns: they provided new theoretical and empirical results. The results are convincing, and I have increased my score.

-------

In this paper, a tensor block model, a multiway extension of the stochastic block model, is studied. The main contribution is a statistical convergence rate for the least-squares estimator under sub-Gaussian noise. The authors try to confirm the theoretical results with numerical simulations. The strength of this paper is its theoretical result, which improves the existing convergence rate and also proves consistency of the clustering. However, my current evaluation is slightly below the acceptance threshold. The following are my major concerns.

Originality. The tensor block model has been studied at least since Jegelka et al. (2009), who proposed an efficient algorithm with an approximation guarantee. Chi et al. (2018) derived a statistical convergence rate earlier than this work. Strictly speaking, the originality of this paper is therefore the improved convergence rate. Although a few extensions, such as sparse estimation, are proposed, I feel they are somewhat incremental.

Measuring MSE in a clustering problem. The convergence rate studied in this paper is for the mean squared error between the true (noiseless) tensor and the recovered tensor. I agree with this setting for a tensor recovery problem, but is MSE the right measure for a clustering problem? For example, suppose we have two estimators A and B such that MSE(A) <= MSE(B). Can we say the clustering result of A is always better than that of B? There seems to be a gap between MSE and the correctness of the clustering, and I am not sure measuring MSE is the right criterion for clustering.

Toy data experiments. In Figure 2, I am not convinced that the results are consistent with the theory. Specifically, I see a gap between (4,4,4) and the other settings. This may be because the range of the x-axis differs between (4,4,4) and the others after rescaling by N. Also, the number of R settings is not large enough.

Real data experiments. The baselines, CP and Tucker decompositions, are general tensor decomposition methods, not methods designed for clustering. The method should be compared with Jegelka et al. (2009) and Chi et al. (2018).
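The gap between MSE and clustering correctness raised above can be made concrete with a toy one-dimensional sketch (hypothetical numbers, not taken from the paper): an estimator can win on MSE while losing the partition entirely.

```python
# Hypothetical illustration: lower MSE does not imply better cluster recovery.
# True signal has two blocks: [0, 0, 1, 1].
truth = [0.0, 0.0, 1.0, 1.0]

# Estimator A collapses everything into one block at the grand mean 0.5.
est_a = [0.5, 0.5, 0.5, 0.5]

# Estimator B recovers the partition exactly, but with biased block means.
est_b = [-0.6, -0.6, 1.6, 1.6]

def mse(est, tru):
    """Mean squared error between an estimate and the true signal."""
    return sum((e - t) ** 2 for e, t in zip(est, tru)) / len(tru)

mse_a = mse(est_a, truth)  # 0.25
mse_b = mse(est_b, truth)  # 0.36

# A is strictly better in MSE, yet B separates the two blocks perfectly
# while A merges them into a single cluster.
print(mse_a, mse_b, mse_a < mse_b)
```

So a small MSE by itself does not certify a good clustering; this is why a separate consistency result for the cluster assignments, of the kind the authors provide, is needed to close the gap.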

Thank you for addressing my comments in the rebuttal. I have increased my score.

-----------------------------------------------------

This paper proposes a tensor block model clustering method with applications to multiway clustering. The authors propose an optimization method for the clustering model based on least squares. The proposed method is supported by a theoretical convergence analysis. Furthermore, the paper is well written, with supporting experiments. I am not an expert on this topic; however, my judgment is that the paper borrows similar ideas from [15] and other papers and gives an extension. Still, I feel the paper has a certain amount of novelty in the proposed clustering method and shows good performance. I feel the paper makes a reasonable contribution.

Some issues in the supplementary section: What do the authors mean by A.2 in the second proof on page 1 of the appendix? Also, I did not understand "Combining (A.2), (A.2) and (A.2), we have". In fact, some of the equation references in the appendix are confusing.