Paper ID: 1819
Title: A class of network models recoverable by spectral clustering
Current Reviews

Submitted by Assigned_Reviewer_1

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
In this paper, a generalization of known results on the Stochastic Block Model for a different type of Network generative model is presented, namely, the fact that spectral clustering based on the Laplacian works well for large degree (log(n)).

I think that the paper is not presenting the current knowledge for the SBM accurately, and the discussion of state-of-the art is flawed: It ignores all the spectacular development of spectral clustering for the Stochastic Block model in the last 3 years, where it has already been proven that a large class of networks are recoverable either exactly or with a small error after a sharp bound, see eg:

For detectability: http://arxiv.org/abs/1306.5550 http://arxiv.org/abs/1311.3085 http://arxiv.org/abs/1311.4115

For exact recovery: http://arxiv.org/abs/1405.3267

Given all these strong results, the novelty of the presented paper seem rather weak, and both the motivation and the title are misleading. In fact, rather than giving new results about the detectability of the identification of clusters in the plethora of models for networks (SMB, degree-corrected, preferential attachement) the authors instead invented yet another model for which their proof work.

Q2: Please summarize your review in 1-2 sentences
A generalization of known results on the Stochastic Block Model for a different type of Network generative model. Interesting, but only the weakest result known for the SBM are adapted to new class of models, while all the recent progress is ignored. Because of the lack of proper depiction of the current state-of-the-art, the reviewer finds also the presentation to be misleading. Also, it seems that a new kind of generative model is invented only so that the proof could apply.

Submitted by Assigned_Reviewer_2

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
The term KPFM is never well defined.

After reading line 66, I thought I understood, "A random graph family over node set V admits a K-preference frame H, and is called a Preference Frame Model (KPFM), if the edges i, j, i < j are sampled independently from Bernoulli distributions with parameters Sij."

Then, line 104 says, "We note that the the KPFM model does not specifically require that the edges are sampled independently."

This is fine, but what appeared to be a definition on line 66 is no longer a definition.

This is super frustrating.

Given that the aim of this paper is to {increase the clarity of the of the parameterizations}, it seems that the paper should excel in defining its model.

This referee agrees that there may be space to generalize the Stochastic Blockmodel and its degree correction in a way that more closely aligns with spectral clustering.

For example, I'd suggest the paper starts by saying that ''spectral clustering is not a model based algorithm, but can we derive a model that more naturally aligns with its properties?''

This referee is eager to up his quality score.

This paper has a lot of promise, but the organization of the paper is sorely lacking.

Where is the definition of misclustered?

Where is the definition of the model? Are edges independent or not?

Minor point: Because the algorithm uses a normalization step, I believe that the correct citation for this version of the algorithm is not 13 and 21, but rather 16.

Q2: Please summarize your review in 1-2 sentences
This is an interesting problem and the approach in this paper might be exactly correct.

However, the organization and clarity of the paper is very frustrating.

Submitted by Assigned_Reviewer_3

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
The authors present an analysis that establishes the applicability of the classical spectral clustering algorithm on a wide variety of generalized block stochastic models (various communities, sizes, transition probabilities, etc). Their results consider the weak recovery regime, where "most of the nodes" are clustered correctly; certainly a more realistic setup.

The ideas presented seem novel, however I am not an expert in the area. The paper is well-written, however it is extremely heavy in notation, making the technical result inaccessible.

One example where the technical discussion is confusing: It is not clear how parameter kappa should be thought of. For example, what is a maximum degree bound with a reasonable kappa, where the probability bounds are meaningful?

Also, there are many relevant and important recent results on the subject, and a comparison against them will be very useful - Results by Mossel, Neeman, Sly et al. http://arxiv.org/abs/1407.1591 http://arxiv.org/abs/1404.6325 http://arxiv.org/abs/1311.4115 - Results by Abbe, Bandeira, Sandon et al. http://arxiv.org/abs/1506.03729 http://arxiv.org/abs/1503.00609 http://arxiv.org/abs/1405.3267 - Results by Lelarge, Massoulie et al. http://arxiv.org/abs/1506.08621 http://arxiv.org/abs/1406.6897

Finally, a minor comment: the authors mention at some point that: "there are re strong reasons to believe that the KPFM is the most general class for which community recovery by direct spectral clustering is possible." Why is this true?

Overall, it seems that the contribution could be a valuable addition to the literature of community detection on SBMs, however it is not clear how it compares against the state of the art spectral methods for extended SBMs as noted above.

Some typos: - [19,18,9].In particular -> space missing

- overfill in line 090

- "degree model"[8]

-> space missing

Q2: Please summarize your review in 1-2 sentences
The authors establish that the spectral clustering algorithm that works for SBM, is also applicable to a more general class of stochastic community models. The ideas seem novel, however the technical material is not very accessible, and more comparisons to prior art seem necessary.

Submitted by Assigned_Reviewer_4

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
Typos:

Line 91/92 Equation has indexing typo

Line 116:

This type of interactions

Line 125: consists with

Line 160/161: recover(y)
Q2: Please summarize your review in 1-2 sentences
The paper analyzes stochastic block models and related models under an umbrella model of preference frame models. Particularly focusing on recovery using spectral methods. The generalization seems useful for the overall problem. The paper is well written and assumptions clearly stated.

Author Feedback
Author Feedback
Q1:Author rebuttal: Please respond to any concerns raised in the reviews. There are no constraints on how you want to argue your case, except for the fact that your text should be limited to a maximum of 5000 characters. Note however, that reviewers and area chairs are busy and may not read long vague rebuttals. It is in your own interest to be concise and to the point.
REV 1: Apologies for the eggregious and frustrating mistake(s). The KPFM random graph model is as defined in lines 66-69 where ([n], S) is a weighted graph that admits a pref. frame (l 57-58).
What we **should have** said at the bottom of p.2 is that we use independent sampling of edges only to prove concentration of hat{L}. The result extends to other graph models with dependent edges (e.g graph lifts) if one can prove concentration.

"Misclustered" is defined the usual way: p_err = (1/n)*(min over all permutations of cluster labels of the Hamming distance between label vectors).

Minor: [16] normalizes the _rows_ in step 3 to unit length. Alg 1 is almost that of [13,21] (there, columns are normalized after step 3 not before). This variation amounts to the re-definition of c_max, c_min (in l 252).

Minor correction: lambda_2 of block M^kl (Thm 6) is the third eigenvalue (the first 2 e-values of M^kl have same magnitude and opposite signs).

"spectral clustering is not model-based but can we derive...": this could well be the opening statement of our paper.

We hope that with these clarifications the reviewer will be motivated to continue reading the paper. It is a privilege to have an expert reviewer for our paper and we would appreciate more in depth feedback.

Rev 4:
Thank you for feedback and insightful questions. kappa is needed to prove concentration of hat{L} , Supplement, Prop 4. We suspect kappa could also grow SLOWLY with n.
For reasonable p_err kappa (eq 7) is not the critical factor in our exps. One needs the expression [C_0....] to get small enough which requires fairly large n.

regarding the 8 references:

[Mossel,Neeman,Sly,et al]
**Indeed state of the art, but for the *planted bisection* problem. This model has 2 parameters (p,q) whereas our model has O(n^2) parameters. DC-SBM has O(n). One of our contributions is to show that the vast majority of these are nuisance params (lines 149-155 and proof of Prop 2).
** are on the very sparse regime d_i = O(1) and on the problem of "detection"/"correlated partition". The recovery regime of our Thm 3 is impossible below the d_i=Omega(log n) threshold.

1506.03729 and 1503.00609 (Abbe&Sandon) deal strictly with SBM, i.e O(K^2) params. The sharp threshold obtained does not apply to KPFM or even HKPFM.
1405.3267 (Abbe,Bandeira,Hall) is on planted bisection.

1506.08261 (posted 3 weeks after the NIPS submission) -- recovery of degree-corrected SBM (i.e HKPFM), under very weak conditions (more below). Implemented on our example, got p_err=0.26, lambda_K of hat{H} has multiplicity 6.

1406.6897 ### impressively general model. Technically, this model is hierarchical (parameters are random, edges random) while our model has fixed parameters. Besides, the bound obtained (Thm 1, eq 9) vanishes only when epsilon_r vanishes, i.e. when the relevant operator T (properly normalized) tends to a rank K operator. Our KPFM does not require this assumption. If we were to construct the T operator for KPFM, the residual spectrum will stay bounded away from 0, so the the recovery error will not vanish asymptotically.

To put things in perspective, the models cited above relie more or less on a similar condition (e.g that there is an "ideal" matrix of rank K). This is true for DC-SBM (and exploited implicitly by e.g Gulikers & al). The KPFM departs from this assumption; we use e-value separation instead (much weaker than controlling epsilon_r). In Theorem 6 we show how one can control the eigenvalue separation using only the natural condition that clusters cannot be further partitioned (see bottom of pag 6). This is a novel result by itself.

"Finally a minor comment..." a precise answer requires more space. Very simplistically, the rows in alg 1, step 3 will concentrate iff the matrix hat{P} is close to satisfying (1), which essentially defines KPFM.

Rev 5:

1306.5550 Krzakala&al,1311.3085 Massoulie, 1311.4115 (see Rev 3), 1405.3267 (see R3). Remarkable results, yet they apply only to the SBM (K^2 params or fewer). We agree with the reviewer that the non-backtracking r.w. holds great promise and will including this + citations in the next version. But we are not aware of this approach having been applied to anything beyond SBM.

In a NIPS paper there is little room to cover progress that is not _directly_ related to our contribution. Thus accusation of "ignorance" and "misleading" should wait until we post on arxiv. What is fair to say is that work on planted bisection is making spectacular progress, but that more complex models, and in particular geometric aspects that matter once we get away O(K^2) models, have been neglected and our paper advances the state of the art in that direction.

KPFM is much more flexible than the above models, yet we don't make stronger assumptions for recovery. In fact our assumptions are a little simpler, clearer, and in practice weaker as we show by the comparison in sec 4.2.