{"title": "Latent distance estimation for random geometric graphs", "book": "Advances in Neural Information Processing Systems", "page_first": 8724, "page_last": 8734, "abstract": "Random geometric graphs are a popular choice for a latent points generative model for networks. Their definition is based on a sample of $n$ points $X_1,X_2,\\cdots,X_n$ on the Euclidean sphere~$\\mathbb{S}^{d-1}$ which represents the latent positions of nodes of the network. The connection probabilities between the nodes are determined by an unknown function (referred to as the ``link'' function) evaluated at the distance between the latent points. We introduce a spectral estimator of the pairwise distance between latent points and we prove that its rate of convergence is the same as the nonparametric estimation of a function on $\\mathbb{S}^{d-1}$, up to a logarithmic factor. In addition, we provide an efficient spectral algorithm to compute this estimator without any knowledge on the nonparametric link function. As a byproduct, our method can also consistently estimate the dimension $d$ of the latent space.", "full_text": "Latent Distance Estimation for Random Geometric\n\nGraphs\n\nErnesto Araya\n\nLaboratoire de Math\u00e9matiques d\u2019Orsay (LMO)\n\nUniversit\u00e9 Paris-Sud\n\n91405 Orsay Cedex, France\n\nernesto.araya-valdivia@u-psud.fr\n\nYohann De Castro\n\nInstitut Camille Jordan\n\u00c9cole Centrale de Lyon\n69134 \u00c9cully, France\n\nyohann.de-castro@ec-lyon.fr\n\nAbstract\n\nRandom geometric graphs are a popular choice for a latent points generative model\nfor networks. Their de\ufb01nition is based on a sample of n points X1, X2,\u00b7\u00b7\u00b7 , Xn\non the Euclidean sphere Sd\u22121 which represents the latent positions of nodes of\nthe network. The connection probabilities between the nodes are determined by\nan unknown function (referred to as the \u201clink\u201d function) evaluated at the distance\nbetween the latent points. 
We introduce a spectral estimator of the pairwise distance between latent points and we prove that its rate of convergence is the same as that of the nonparametric estimation of a function on S^{d-1}, up to a logarithmic factor. In addition, we provide an efficient spectral algorithm to compute this estimator without any knowledge of the nonparametric link function. As a byproduct, our method can also consistently estimate the dimension d of the latent space.

1 Introduction

Random geometric graph (RGG) models have received attention lately as an alternative to simpler yet unrealistic models such as the ubiquitous Erdős-Rényi model [12]. They are generative latent point models for graphs, where it is assumed that each node is associated with a latent point in a metric space (usually the Euclidean unit sphere or the unit cube in R^d) and the connection probability between two nodes depends on the positions of their associated latent points. In many cases, the connection probability depends only on the distance between the latent points and is determined by a one-dimensional "link" function.

Because of its geometric structure, this model is appealing for applications in wireless network modeling [18], social networks [17] and biological networks [15], to name a few. In many of these real-world networks, the probability that a tie exists between two agents (nodes) depends on the similarity of their profiles.
In other words, the connection probability depends on some notion of distance between the positions of the agents in a metric space, which in the social network literature has been called the social space.

In the classical RGG model, as introduced by Gilbert in [13], we consider n independent and identically distributed latent points {Xi}_{i=1}^n in R^d and then construct the graph with vertex set V = {1, 2, ..., n}, where nodes i and j are connected if and only if the Euclidean distance ‖Xi − Xj‖ is smaller than a certain predefined threshold τ. The seminal reference on the classical RGG model, from the probabilistic point of view, is the monograph [27]. Another good reference is the survey paper [30]. In such a case, the "link" function, which we have not yet formally defined, is the threshold function 1_{t≤τ}(t). Otherwise stated, two points are connected if and only if their distance is smaller than τ. In that case, all the randomness lies in the fact that the latent points are sampled from a certain distribution. We keep the name random geometric graphs for more general "link" functions.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

The angular version of the RGG model has also received attention. In that model, the latent points are uniformly distributed on S^{d−1} (the unit sphere in R^d), and two points are connected if their angle is below a certain threshold. This model has been used in the context of sensor and wireless networks [14]. In [9] the authors show that when the size of the graph n is fixed and the dimension d goes to infinity, the RGG model on the sphere is indistinguishable from the Erdős-Rényi model, in the sense that the total variation distance between the two graph distributions converges to zero.
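To make the angular model just described concrete, here is a minimal simulation sketch in Python with NumPy (the function names are ours, for illustration only): uniform latent points on the sphere are obtained by normalizing Gaussian vectors, and two nodes are joined when the angle between their latent points is below a threshold, i.e. when their inner product exceeds the cosine of that threshold.

```python
import numpy as np

def sample_sphere(n, d, rng):
    # Uniform points on S^{d-1}: normalize standard Gaussian vectors.
    X = rng.standard_normal((n, d))
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def angular_rgg(n, d, angle_threshold, seed=0):
    # Connect i and j iff the angle between X_i and X_j is below the
    # threshold, i.e. <X_i, X_j> >= cos(angle_threshold).
    rng = np.random.default_rng(seed)
    X = sample_sphere(n, d, rng)
    G = X @ X.T                                   # pairwise inner products
    A = (G >= np.cos(angle_threshold)).astype(int)
    np.fill_diagonal(A, 0)                        # simple graph: no self-loops
    return X, A
```

Here all randomness comes from the latent points; given the sample, the edges are deterministic, exactly as in the threshold "link" function above.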
On the other hand, in [5] the authors prove that in the dense case, if d satisfies a bound with respect to n (specifically, if d/n^3 → 0), then the two models can be distinguished by counting triangles. The angular RGG model has also been used in the context of approximate graph coloring [19].

We are interested in the problem of recovering the pairwise distances between the latent points {Xi}_{i=1}^n for geometric graphs on S^{d−1} given a single observation of the network. We limit ourselves to the case when the network is a simple graph. Furthermore, we will assume that the dimension d is fixed and that the "link" function is not known. This problem and some of its variants have been studied for different versions of the model and under different sets of hypotheses; see for example the recent work [1] and the references therein. In that work the authors propose a method for estimating the latent distances based on the graph-theoretic distance between two nodes (that is, the length of the shortest path between them). Independently, in [10] the authors develop a similar method which has slightly smaller recovery error, but for a less general model. In both cases, the authors consider the cube in R^d (or the whole of R^d) but not the sphere. Our strategy is similar to the one developed in [28], where the latent point estimation problem is considered for random dot product graphs, a more restricted model than the one considered here, although for more general Euclidean spaces and for latent point distributions other than the uniform one. Similar ideas have been used in the context of vertex classification for latent position graphs [29].

We will use the notion of a graphon function to formalize the concept of "link" function. Graphons are central objects in the theory of dense graph limits.
They were introduced by Lovász and Szegedy in [25] and further developed in a series of papers; see [3], [4]. Formally, they are symmetric kernels that take values in [0, 1], and thus they will act as the "link" function for the latent points. The spectrum of a graphon is defined as the spectrum of an associated integral operator, as in [24, Chap. 7]. In this paper, graphons will play the role of limit models for the adjacency matrix of a graph, as its size goes to infinity. This is justified in light of the work of Koltchinskii and Giné [22] and Koltchinskii [21]. In particular, the adjacency matrix of the observed graph can be thought of as a finite perturbed version of this operator, combining results from [22] and [2].

We will focus on the case of dense graphs on the sphere S^{d−1} where the connection probability depends only on the angle between two nodes. This allows us to use harmonic analysis on the sphere to obtain a nice characterization of the graphon spectrum, which has a very particular structure. More specifically, the following two key facts hold: first, the basis of eigenfunctions is fixed (it does not depend on the particular graphon considered) and equal to the well-known spherical harmonic polynomials. Second, the multiplicity of each eigenvalue is determined by a sequence of integers that depends only on the dimension d of the sphere and is given by a known formula, and the associated eigenspaces are composed of spherical harmonics of the same polynomial degree.

The graphon eigenspace composed only of linear eigenfunctions (harmonic polynomials of degree one) will play an important role in the recovery of the latent distances matrix, as all the information we need to reconstruct the distances matrix is contained in those eigenfunctions.
We will prove that it is possible to approximately recover this information from the observed adjacency matrix of the graph under regularity conditions (of Sobolev type) on the graphon and assuming an eigenvalue gap condition (similar hypotheses are made in [6] in the context of matrix estimation and in [23] in the context of manifold learning). We do this by proving that a suitable projection of the adjacency matrix, onto a space generated by exactly d of its eigenvectors, approximates the latent distances matrix well with respect to the mean squared error in the Frobenius norm. We give nonasymptotic bounds for this quantity, obtaining the same rate as the nonparametric rate of estimation of a function on the sphere S^{d−1}; see [11, Chp. 2] for example. Our approach includes the adaptation of perturbation theorems for matrix projections from the orthogonal to a "nearly" orthogonal case, which combined with concentration inequalities for the spectrum gives a probabilistic finite-sample bound, novel to the best of our knowledge. More specifically, we prove concentration inequalities for the sampled eigenfunctions of the integral operator associated with a geometric graphon, which are not necessarily orthogonal as vectors in R^n. Our method shares some similarities with the celebrated USVT method, introduced by Chatterjee in [6], but there the estimator is of the probability matrix described in Section 2.2, not of the population Gram matrix as in our method. We develop an efficient algorithm, which we call Harmonic EigenCluster (HEiC), to reconstruct the latent positions from the data, and illustrate its usefulness on synthetic data.

2 Preliminaries

2.1 Notation

We consider R^d with the Euclidean norm ‖·‖ and the Euclidean scalar product ⟨·,·⟩. We define the sphere S^{d−1} := {x ∈ R^d : ‖x‖ = 1}.
For a set A ⊂ R its diameter is diam(A) := sup_{x,y∈A} |x − y|, and if B ⊂ R the distance between A and B is dist(A, B) := inf_{x∈A, y∈B} |x − y|. We write ‖·‖_F for the Frobenius norm of matrices and ‖·‖_op for the operator norm. The identity matrix in R^{d×d} is Id_d. If X is a real-valued random variable and α ∈ (0, 1), X ≤_α C means that P(X ≤ C) ≥ 1 − α.

2.2 Generative model

We describe the generative model for networks, which is a generalization of the classical random geometric graph model introduced by Gilbert in [13]. We base our definition on the W-random graph model described in [24, Sec. 10.1]. The central objects will be graphon functions on the sphere, which are symmetric measurable functions of the form W : S^{d−1} × S^{d−1} → [0, 1]. Throughout this paper, we consider the measurable space (S^{d−1}, σ), where σ is the uniform measure on the sphere. On S^{d−1} × S^{d−1} we consider the product measure σ × σ.

To generate a simple graph from a graphon function, we first sample n points {Xi}_{i=1}^n independently on the sphere S^{d−1}, according to the uniform measure σ. These are the so-called latent points. Secondly, we construct the matrix of distances between these points, called the Gram matrix G* (we will often call it the population Gram matrix), defined by

G*_ij := ⟨Xi, Xj⟩

and the so-called probability matrix

Θ_ij = ρ_n W(Xi, Xj)

which is also an n × n matrix. The function W gives the precise meaning of the "link" function, because it determines the connection probability between Xi and Xj. The introduction of the scale parameter 0 < ρ_n ≤ 1 allows us to control the edge density of the sampled graph given a function W; see [20] for instance.
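The sampling scheme above can be sketched in a few lines of NumPy (an illustrative sketch with names of our own choosing; the smooth link function suggested in the comment is a toy example, not one used in the paper):

```python
import numpy as np

def sample_w_random_graph(n, d, f, rho_n=1.0, seed=0):
    """Sample latent points, the Gram matrix, the probability matrix and an
    adjacency matrix from a geometric graphon W(x, y) = f(<x, y>)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)   # uniform points on S^{d-1}
    G_star = X @ X.T                                # population Gram matrix <X_i, X_j>
    Theta = rho_n * f(G_star)                       # probability matrix
    np.fill_diagonal(Theta, 0.0)                    # simple graph: no self-loops
    U = np.triu(rng.random((n, n)), k=1)            # one uniform draw per pair i < j
    A = (U < np.triu(Theta, k=1)).astype(int)       # Bernoulli(Theta_ij) edges
    A = A + A.T                                     # symmetrize
    return X, G_star, Theta, A

# A smooth illustrative link function (our toy choice, not from the paper):
# f = lambda t: (1.0 + t) / 2.0
```

Only the upper triangle is drawn independently and then mirrored, matching the symmetry constraint on the adjacency matrix.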
The case ρ_n = 1 corresponds to the dense case (the parameter Θ_ij does not depend on n), and when ρ_n → 0 the graph becomes sparser. Our main results will hold in the regime ρ_n = Ω(log n / n), which we call relatively sparse. Most of the time we will work with the normalized version of the probability matrix T_n := (1/n) Θ. If there exists a function f : [−1, 1] → [0, 1] such that W(x, y) = f(⟨x, y⟩) for all x, y ∈ S^{d−1}, we will say that W is a geometric graphon.

Finally, we define the random adjacency matrix T̂_n, which is an n × n symmetric random matrix with independent entries (except for the symmetry constraint T̂_n = T̂_nᵀ), conditional on the probability matrix, with laws

n(T̂_n)_ij ∼ B(Θ_ij)

where B(m) is the Bernoulli distribution with mean parameter m. Since the probability matrix contains the mean parameters of the Bernoulli distributions that define the random adjacency matrix, it has also been called the parameter matrix [6]. Observe that the classical RGG model on the sphere is a particular case of the described W-random graph model when W(x, y) = 1_{⟨x,y⟩≥τ}. In that case, since the entries of the probability matrix only take values in {0, 1}, the adjacency matrix and the probability matrix are equal. Depending on the context, we use T̂_n for the random matrix described above or for an instance of this random matrix, that is, for the adjacency matrix of the observed graph. This will be clear from the context.

It is worth noting that graphons can be, without loss of generality, defined on [0, 1]^2. This means that for any graphon there exists a graphon on [0, 1]^2 that generates the same distribution on graphs for any given number of nodes.
However, in many cases the [0, 1]^2 representation can be less revealing than other representations using a different underlying space. This is illustrated in the case of the prefix attachment model in [24, example 11.41].

In the sequel we use the notation λ_0, λ_1, ..., λ_{n−1} for the eigenvalues of the normalized probability matrix T_n. Similarly, we denote by λ̂_0, λ̂_1, ..., λ̂_{n−1} the eigenvalues of the matrix T̂_n. We recall that T_n (resp. T̂_n) and (1/ρ_n)T_n (resp. (1/ρ_n)T̂_n) have the same set of eigenvectors. We will denote by v_j, for 1 ≤ j ≤ n, the eigenvector of T_n associated with λ_j, which is also the eigenvector of (1/ρ_n)T_n associated with (1/ρ_n)λ_j. Similarly, we denote by v̂_j the eigenvector associated with the eigenvalue λ̂_j of T̂_n.

Our main result is that we can recover the Gram matrix using the eigenvectors of T̂_n, as follows.

Theorem 1 (Informal statement). There exists a constant c_1 > 0 that depends only on the dimension d such that the following is true. Given a graphon W on the sphere such that W(x, y) = f(⟨x, y⟩) with f : [−1, 1] → [0, 1] unknown, which satisfies an eigenvalue gap condition and has Sobolev regularity s, there exists a subset of the eigenvectors of T̂_n such that Ĝ := (1/c_1) V̂ V̂ᵀ converges to the population Gram matrix G* := (1/n)(⟨Xi, Xj⟩)_{i,j} at rate n^{−s/(2s+d−1)} (up to a log factor). This estimate V̂ V̂ᵀ can be found in linear time given the spectral decomposition of T̂_n.

We will say that a geometric graphon W(x, y) = f(⟨x, y⟩) on S^{d−1} has regularity s if f belongs to the weighted Sobolev space Z^s_γ([−1, 1]) with weight function w_γ(t) = (1 − t²)^{γ−1/2}, as defined in [26]. In order to make the statement of Theorem 1 rigorous, we need to make precise the eigenvalue gap condition and define the graphon eigensystem.

2.3 Geometric graphon eigensystem

Here we gather some asymptotic and concentration properties of the eigenvalues and eigenfunctions of the matrices T̂_n, T_n and of the operator T_W, which allow us to recover the Gram matrix from data. The key fact is that the eigenvalues (resp. eigenvectors) of the matrices (1/ρ_n)T̂_n and (1/ρ_n)T_n converge to the eigenvalues (resp. sampled eigenfunctions) of the integral operator T_W : L²(S^{d−1}) → L²(S^{d−1}),

T_W g(x) = ∫_{S^{d−1}} g(y) W(x, y) dσ(y),

which is compact [16, Sec. 6, example 1] and self-adjoint (which follows directly from the symmetry of W). Then by a classic theorem in functional analysis [16, Sec. 6, Thm. 1.8] its spectrum is a discrete set {λ*_k}_{k∈N} ⊂ R whose only accumulation point is zero. In consequence, we can see the spectra of T̂_n, T_n and T_W (which we denote λ(T̂_n), λ(T_n) and λ(T_W) resp.) as elements of the space C_0 of infinite sequences that converge to 0 (where we complete finite sequences with zeros). It is worth noting that in the case of geometric graphons with regularity s (in the Sobolev sense defined above) the rate of convergence of λ(T_W) is determined by the regularity parameter s.
We have the following:

• The spectrum λ((1/ρ_n)T_n) converges to λ(T_W) (almost surely) in the δ_2 metric, defined as

δ_2(x, y) = inf_{p∈P} ( Σ_{i∈N} (x_i − y_{p(i)})² )^{1/2}

where P is the set of all permutations of the non-negative integers. This is proved in [22]. In [8] the following is proved:

δ_2( λ((1/ρ_n)T_n), λ(T_W) ) ≤_{α/4} C (log n / n)^{s/(2s+d−1)}    (1)

• The matrices T̂_n approach the matrix T_n in operator norm as n gets larger. Applying [2, Cor. 3.3] to the centered matrix Y = T̂_n − T_n we get

E(‖T̂_n − T_n‖_op) ≲ √D_0 / n + D*_0 √(log n) / n    (2)

where ≲ denotes inequality up to constant factors, D_0 = max_{0≤i≤n} Σ_{j=1}^n Y_ij(1 − Y_ij) and D*_0 = max_{ij} |Y_ij|. We clearly have that D_0 = O(nρ_n) and D*_0 ≤ 1, which implies that

E‖T̂_n − T_n‖_op ≲ max{ √(ρ_n/n), √(log n)/n }

We see that this inequality does not improve if ρ_n is smaller than in the relatively sparse case, that is, ρ_n = Ω(log n / n). We prove that, as a corollary of the results in [2], we have

(1/ρ_n)‖T̂_n − T_n‖_op ≤_{α/4} C max{ 1/√(ρ_n n), √(log n)/(ρ_n n) }    (3)

An analogous bound can be obtained in the Frobenius norm by replacing T̂_n with T̂^usvt_n, the USVT estimator defined in [6].
For our main results, Proposition 3 and Theorem 4, the operator norm bound will suffice.

A remarkable fact in the case of geometric graphons on S^{d−1} is that the eigenfunctions {φ_k}_{k∈N} of the integral operator T_W form a fixed set that does not depend on the particular function f considered. This comes from the fact that T_W is a convolution operator on the sphere and its eigenfunctions are the well-known spherical harmonics of dimension d, which are harmonic polynomials in d variables defined on S^{d−1}, corresponding to the eigenfunctions of the Laplace-Beltrami operator on the sphere. This follows from [7, Thm. 1.4.5] and from the Funk-Hecke formula given in [7, Thm. 1.2.9]. Let d_k denote the dimension of the k-th spherical harmonic space. It is well known [7, Cor. 1.1.4] that d_0 = 1, d_1 = d and d_k = (k+d−1 choose k) − (k+d−3 choose k−2). Another important fact, known as the addition theorem [7, Lem. 1.2.3 and Thm. 1.2.6], is that

Σ_j φ_j(x) φ_j(y) = c_k G^γ_k(⟨x, y⟩)

where the sum runs over the d_k spherical harmonics of degree k, G^γ_k are the Gegenbauer polynomials of degree k with parameter γ = (d−2)/2 and c_k = (2k+d−2)/(d−2). The Gegenbauer polynomial of degree one is G^γ_1(t) = 2γt (see [7, Appendix B2]), hence we have G^γ_1(⟨Xi, Xj⟩) = 2γ⟨Xi, Xj⟩ for every i and j. In consequence, by the addition theorem,

G^γ_1(⟨Xi, Xj⟩) = (1/c_1) Σ_{k=1}^d φ_k(Xi) φ_k(Xj)

where we recall that d_1 = d.
This implies the following relation for the Gram matrix, observing that 2γc_1 = d:

G* := (1/n)(⟨Xi, Xj⟩)_{i,j} = (1/d) Σ_{j=1}^d v*_j (v*_j)ᵀ = (1/d) V* V*ᵀ    (4)

where v*_j is the vector in R^n with i-th coordinate φ_j(Xi)/√n and V* is the matrix with columns v*_j. In a similar way, we define for any matrix U in R^{n×d} with columns u_1, u_2, ..., u_d the matrix G_U := (1/d) U Uᵀ. As part of our main theorem we prove that for n large enough there exists a matrix V̂ in R^{n×d}, each column of which is an eigenvector of T̂_n, such that Ĝ := G_V̂ approximates G* well, in the sense that the norm ‖Ĝ − G*‖_F converges to 0 at the rate of nonparametric estimation of a function on S^{d−1}.

2.4 Eigenvalue gap condition

In this section we describe one of our main hypotheses on W, needed to ensure that the space span{v*_1, v*_2, ..., v*_d} can be effectively recovered from the vectors v̂_1, v̂_2, ..., v̂_d using our algorithm. Informally, we assume that the eigenvalue λ*_1 is sufficiently isolated from the rest of the spectrum of T_W (not counting multiplicity). We assume without loss of generality that λ*_1 = λ*_2 = ··· = λ*_{d_1}. Given a geometric graphon W, we define the spectral gap of W relative to the eigenvalue λ*_1 by

Gap_1(W) := min_{j∉{1,...,d_1}} |λ*_1 − λ*_j|

which quantifies the distance between the eigenvalue λ*_1 and the rest of the spectrum. In particular, we have the following elementary proposition.

Proposition 2.
It holds that Gap1(W ) = 0 if and only if there exists j /\u2208 {1,\u00b7\u00b7\u00b7 , d1} such that\n\u03bb\u2217\nj = \u03bb\u2217\n\n1 or \u03bb\u2217\n\n1 = 0.\n\nProof. Observe that the unique accumulation point of the spectrum of TW is zero. The proposition\nfollows from this observation.\nTo recover the population Gram matrix G\u2217 with our Gram matrix estimator \u02c6G we require the spectral\ngap \u2206\u2217 := Gap1(W ) to be different from 0. This assumption have been made before in the\nliterature, in results that are based in some version of the Davis-Kahan sin \u03b8 theorem (see for\ninstance [6], [23], [29]). More precisely, our results will hold on the following event\n\n(cid:110)\n\n(cid:16)\n\n\u03bb(cid:0) 1\n\nE :=\n\n\u03b42\n\n(cid:1), \u03bb(TW )\n\n(cid:17) \u2228 2 9\n\u221a\n\u03c1n\u2206\u2217 (cid:107)Tn \u2212 \u02c6Tn(cid:107)op \u2264 \u2206\u2217\n\nd\n\n4\n\n2\n\n(cid:111)\n\n,\n\nTn\n\n\u03c1n\n\nfor which we prove the following: given an arbitrary \u03b1 we have that\n\nfor n large enough (depending on W and \u03b1). This dependence can be made explicit using (1) and (3)\n\n} \u2264\n\nlog n\nn\n\n\u2206\u22172\n\u221a\n215/2C\n\nd\n\nand\n\nlog n\n\nn\n\n\u2264(cid:0) \u2206\u2217\n\n8C(cid:48)\n\n(cid:1) 2s+d\u22121\n\ns\n\nP(E) \u2265 1 \u2212 \u03b1\n2\n\n(cid:114) \u03c1n\n\n,\n\nn\n\n\u221a\n\nmax{\n\nwhere C, C(cid:48) > 0. The following theorems are the main results of this paper. Their proofs can be\nfound in the supplementary material.\nProposition 3. On the event E, there exists one and only one set \u039b1, consisting of d eigenvalues of\n\u02c6Tn, whose diameter is smaller than \u03c1n\u2206\u2217/2 and whose distance to the rest of the spectrum of \u02c6Tn\nis at least \u03c1n\u2206\u2217/2. Furthermore, on the event E, our algorithm (Algorithm 1) returns the matrix\n\u02c6G = (1/c1) \u02c6V \u02c6V T , where \u02c6V has by columns the eigenvectors corresponding to the eigenvalues on \u039b1.\nTheorem 4. 
Let W be a regular geometric graphon on S^{d−1} with regularity parameter s and such that ∆* > 0. Then there exists a set of eigenvectors v̂_1, ..., v̂_d of T̂_n such that

‖G* − Ĝ‖_F = O(n^{−s/(2s+d−1)})

where Ĝ = G_V̂ and V̂ is the matrix with columns v̂_1, ..., v̂_d. Moreover, this rate is the minimax rate of nonparametric estimation of a regression function f with Sobolev regularity s in dimension d − 1.

The condition ∆* > 0 allows us to use Davis-Kahan type results from matrix perturbation theory to prove Theorem 4. With this and concentration for the spectrum we are able to control the terms ‖Ĝ − G‖_F and ‖G − G*‖_F with high probability. The rate of nonparametric estimation of a function on S^{d−1} can be found in [11, Chp. 2].

3 Algorithms

The Harmonic EigenCluster algorithm (HEiC) (see Algorithm 1 below) receives the observed adjacency matrix T̂_n and the sphere dimension as its inputs, and reconstructs the eigenspace associated with the eigenvalue λ*_1. In order to do so, the algorithm selects d vectors in the set v̂_1, v̂_2, ..., v̂_n whose linear span is close to the span of the vectors v*_1, v*_2, ..., v*_d defined in Section 2.3. The main idea is to find a subset of {λ̂_0, λ̂_1, ..., λ̂_{n−1}}, which we call Λ_1, consisting of d_1 elements (recall that d_1 = d), all of which are close to λ*_1. This can be done assuming that the event E defined above holds (which occurs with high probability).
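As a rough numerical illustration, the eigenvalue-search step can be sketched as follows. This is a simplified reading of the algorithm, restricted to windows of consecutive sorted eigenvalues and scoring each window by its minimal distance to the rest of the spectrum; it is not the authors' reference implementation, and the 1/d normalization of the Gram estimate follows the definition of G_U in Section 2.3.

```python
import numpy as np

def heic_cluster(A_hat, d):
    """Locate a well-separated cluster of d eigenvalues of A_hat and return
    the associated Gram-matrix estimate together with the detected gap.

    Simplified sketch: among all windows of d consecutive eigenvalues
    (sorted decreasingly, the largest one excluded), pick the window best
    separated from the rest of the spectrum.
    """
    evals, evecs = np.linalg.eigh(A_hat)
    order = np.argsort(evals)[::-1]              # decreasing order
    lam, V = evals[order], evecs[:, order]
    n = len(lam)
    best_gap, best_i = -np.inf, 1
    for i in range(1, n - d + 1):                # index 0 (top eigenvalue) excluded
        inside = lam[i:i + d]
        outside = np.concatenate([lam[1:i], lam[i + d:]])
        if outside.size == 0:
            continue
        # distance from the candidate window to the rest of the spectrum
        gap = np.min(np.abs(outside[None, :] - inside[:, None]))
        if gap > best_gap:
            best_gap, best_i = gap, i
    V_hat = V[:, best_i:best_i + d]
    return V_hat @ V_hat.T / d, best_gap         # Gram estimate (1/d) V V^T
```

On a matrix with a planted triple eigenvalue well separated from the rest, this sketch recovers exactly that window and its separation.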
Once we have the set Λ_1, we return the span of the eigenvectors associated with the eigenvalues in Λ_1. For a given set of indices i_1, ..., i_d we define

Gap_1(T̂_n; i_1, ..., i_d) := min_{i∉{i_1,...,i_d}} max_{j∈{i_1,...,i_d}} |λ̂_j − λ̂_i|

and

Gap_1(T̂_n) := max_{{i_1,...,i_d}∈S^n_d} Gap_1(T̂_n; i_1, ..., i_d)

Algorithm 1: Harmonic EigenCluster (HEiC) algorithm
Input: (T̂_n, d) adjacency matrix and sphere dimension
  Λ^sort ← eigenvalues of T̂_n sorted in decreasing order
  Λ_1 ← {Λ^sort_1, ..., Λ^sort_{1+d}}, where Λ^sort_i is the i-th element of Λ^sort
  Initialize i = 2, gap = Gap_1(T̂_n; 1, 2, ..., d)
  while i ≤ n − d do
    if Gap_1(T̂_n; i, i+1, ..., i+d) > gap then
      Λ_1 ← {Λ^sort_i, ..., Λ^sort_{i+d}}
      gap ← Gap_1(T̂_n; i, i+1, ..., i+d)
    end if
    i ← i + 1
  end while
Return: Λ_1, gap

where S^n_d contains all the subsets of {1, ..., n−1} of size d. This definition parallels that of Gap_1(W) for the graphon. Observe that any set of indices in S^n_d will not include 0. Otherwise stated, we can leave λ̂^sort_0 out of this definition and it will not be a candidate for Λ_1. In the supplementary material we prove that the largest eigenvalue of the adjacency matrix will be close to the eigenvalue λ*_0 and consequently cannot be close enough to λ*_1 to be in the set Λ_1, given the definition of the event E.

To compute Gap_1(T̂_n) we consider the eigenvalues λ̂_j sorted in decreasing order. We use the notation λ̂^sort_j to emphasize this fact.
We define the right and left differences on the sorted set by

left(i) = |λ̂^sort_i − λ̂^sort_{i−1}|
right(i) = left(i + 1)

where left(·) is defined for 1 ≤ i ≤ n and right(·) is defined for 0 ≤ i ≤ n − 1. With these definitions, we have the following lemma, which we prove in the supplementary material.

Lemma 5. On the event E, the following equality holds:

Gap_1(T̂_n) = max{ max_{1≤i≤n−d−1} min{left(i), right(i + d)}, left(n − d + 1) }

The set Λ_1 has the form Λ_1 = {λ̂^sort_{i*}, λ̂^sort_{i*+1}, ..., λ̂^sort_{i*+d}} for some 1 ≤ i* ≤ n − d − 1. We have that either

i* = argmax_{1≤i≤n−d−1} min{left(i), right(i + d)}

or i* = n − d, depending on whether or not max_{1≤i≤n−d−1} min{left(i), right(i + d)} > left(n − d + 1). The algorithm then constructs the matrix V̂ with columns {v̂_{i*}, v̂_{i*+1}, ..., v̂_{i*+d}} and returns V̂ V̂ᵀ.

It is worth noting that Algorithm 1 has time complexity O(n³ + n): the n³ term comes from computing the eigenvalues and eigenvectors of the n × n matrix T̂_n, and the linear term comes from exploring the whole set of eigenvalues to find the maximum gap for the size d. In terms of space, the algorithm requires O(n²), because we need to store the matrix T̂_n.

Remark 1. If we change T̂_n in the input of Algorithm 1 to T̂^usvt_n (obtained by the USVT algorithm [6]) we predict that the algorithm will give similar results. This is because discarding some eigenvalues below a prescribed threshold does not affect our method.
However, as a preprocessing step the USVT might help in speeding up the eigenspace detection, although this step is already linear in time.

3.1 Estimation of the dimension d

So far we have focused on the estimation of the population Gram matrix G*. We now give an algorithm to find the dimension d when it is not provided as input. This method receives the matrix T̂_n as input and uses Algorithm 1 as a subroutine to compute a score, which is simply the value of the variable Gap_1(T̂_n) returned by Algorithm 1. We do this for each d in a set of candidates, which we call D. This set of candidates will usually, but not necessarily, be fixed to {1, 2, 3, ..., d_max}. Once we have computed the scores, we pick the candidate that has the maximum score.

Given the guarantees provided by Theorem 4, the previously described procedure will find the correct dimension with high probability (on the event E) if the true dimension of the graphon is in the candidate set D. This will happen, in particular, if the assumptions of Theorem 4 are satisfied. We recall that the main hypothesis on the graphon is that the spectral gap Gap_1(W) should be different from 0.

4 Experiments

We generate synthetic data using different geometric graphons. In the first set of examples, we focus on recovering the Gram matrix when the dimension is provided. In the second set, we recover the dimension as well.

4.1 Recovering the Gram matrix

We start by considering the graphon W_1(x, y) = 1_{⟨x,y⟩≥0}, which defines, through the sampling scheme given in Section 2.2, the same random graph model as the classical RGG model on S^{d−1} with threshold 0. Thus two sampled points Xi, Xj ∈ S^{d−1} will be connected if and only if they lie in the same hemisphere.

Figure 1: On the left, a boxplot of MSE_n for different values of n.
On the right, the score for a set of candidate dimensions D = {1, ···, 19}. Data were sampled from W₁ on S^{d−1} with d = 3.

We consider different values of the sample size n; for each of them we sample 100 Gram matrices in the case d = 3 and run Algorithm 1 on each. Each time we compute the mean squared error, defined by

$$MSE_n = \frac{1}{n^2}\,\|\hat G - G^*\|_F^2.$$

In Figure 1 we report MSE_n for different values of n, showing how it decreases as n grows. For each n, the plotted MSE_n is the mean over the 100 sampled graphs.

4.2 Recovering the dimension d

We conducted a simulation study using the graphon W₁, sampling 1000 points on the sphere of dimension d = 3, and used Algorithm 1 to compute a score and recover d. We consider a set of candidates with d_max = 15. In Figure 1 we provide a boxplot of the score of each candidate, repeating the procedure 50 times. We see that for this graphon, the algorithm distinguishes the true dimension from the "noise" every time.

Figure 2: The mean (over 25 repetitions) runtime of the HEiC algorithm for the graphon W₁. The experiments were performed on a 3.3 GHz Intel i5 with 16 GB RAM.

5 Discussion

Although in this paper we have focused on the sphere as the latent metric space, our main result can be extended to other latent spaces where the distance is translation invariant, such as compact Lie groups or compact symmetric spaces. In that case, the geometric graphon will be of the form W(x, y) = f(cos ρ(x, y)), where x, y are points in the compact Lie group S and ρ(·, ·) is the metric. We then have

$$f(\cos\rho(x,y)) = f(\cos\rho(x\cdot y^{-1}, e_1)) = \tilde f(x\cdot y^{-1}),$$

where e₁ is the identity element of S and f̃(x) = f(cos ρ(x, e₁)).
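As a concrete illustration of this reduction (our example; it is not worked out in the paper), take S = S¹ ≅ SO(2) and write θ_x for the angle of the point x, so that cos ρ(x, y) = cos(θ_x − θ_y) for the geodesic metric. Then

```latex
W(x,y) \;=\; f\bigl(\cos(\theta_x-\theta_y)\bigr) \;=\; \tilde f(x\cdot y^{-1}),
\qquad
\cos\bigl(k(\theta_x-\theta_y)\bigr)
  \;=\; \cos(k\theta_x)\cos(k\theta_y)+\sin(k\theta_x)\sin(k\theta_y).
```

The second identity plays the role of the addition theorem in this case: it implies that the integral operator associated with W is diagonalized by the Fourier basis, each frequency k ≥ 1 contributing an eigenspace of dimension 2.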
Consequently, W(x, y) = f̃(x · y⁻¹). Furthermore, an addition theorem exists in this case as well (it is central to our recovery result). Regularity notions analogous to the ones considered in this work are also worth exploring. In [8] the authors give more details on the model of geometric graphons on compact Lie groups, with a focus on graphon estimation.

In principle, it would be possible to extend most of the results of this paper to the case where the underlying space is B_d = {x ∈ R^d : ‖x‖ ≤ 1} and the link function depends only on the inner products of the points in B_d. As detailed in [7], the harmonic analysis on the sphere can be extended to the unit ball; in particular, an analogous addition theorem exists. Besides, one fundamental fact used in the proof of Theorem 1 is the control on the growth of the L²(S^{d−1}) norm of the spherical harmonics, which has its analogue for the polynomial basis in L²(B_d). Despite the similarities between the model on the unit sphere and the model on the unit ball, they might generate very different graphs. For instance, an interesting feature of the model on B_d is that it is not only angle dependent (as in the case of the unit sphere) but also norm dependent. This would allow generating graphs with more heterogeneous node distributions. An in-depth study of this model, as well as of the sparser case, is left for future work.

References

[1] E. Arias-Castro, A. Channarond, B. Pelletier, and N. Verzelen. On the estimation of latent distances using graph distances. arXiv:1804.10611, 2018.

[2] A. Bandeira and R. Van Handel. Sharp nonasymptotic bounds on the norm of random matrices with independent entries. Annals of Probability, 44(4):2479–2506, 2016.

[3] C. Borgs, J. T. Chayes, L. Lovász, V. T. Sós, and K. Vesztergombi. Convergent sequences of dense graphs I: subgraph frequencies, metric properties and testing. Adv.
Math., 219(6):1801–1851, 2008.

[4] C. Borgs, J. T. Chayes, L. Lovász, V. T. Sós, and K. Vesztergombi. Convergent sequences of dense graphs II: multiway cuts and statistical physics. Annals of Mathematics, 176(1):151–219, 2012.

[5] S. Bubeck, J. Ding, R. Eldan, and M. Rácz. Testing for high dimensional geometry in random graphs. Random Structures and Algorithms, 49:503–532, 2016.

[6] S. Chatterjee. Matrix estimation by universal singular value thresholding. Annals of Statistics, 43(1):177–214, 2015.

[7] F. Dai and Y. Xu. Approximation Theory and Harmonic Analysis on Spheres and Balls. Springer Monographs in Mathematics, 2013.

[8] Y. De Castro, C. Lacour, and T. M. Pham Ngoc. Adaptive estimation of nonparametric geometric graphs. arXiv:1708.02107.

[9] L. Devroye, A. György, L. Backstrom, and C. Marlow. High-dimensional random geometric graphs and their clique number. Electronic Journal of Probability, 16(90):2481–2508, 2011.

[10] J. Diaz, C. McDiarmid, and D. Mitsche. Learning random points from geometric graphs or orderings. arXiv:1804.10611, 2018.

[11] M. Emery, A. Nemirovski, and D. Voiculescu. Lectures on Probability Theory. Springer-Verlag Berlin Heidelberg, École d'été de probabilités de Saint-Flour XXVIII edition, 1998.

[12] P. Erdös and A. Rényi. On the evolution of random graphs. Publ. Math. Inst. Hungar. Acad. Sci., 5:17–60, 1960.

[13] E. N. Gilbert. Random plane networks. J. Soc. Industrial Applied Mathematics, 9(5):533–543, 1961.

[14] R. Gupta, T. Roughgarden, and C. Sheshadhri. Decomposition of triangle-dense graphs. SIAM Journal on Computing, 45(2):197–215, 2016.

[15] D. J. Higham, M. Rasajski, and N. Przulj. Fitting a geometric graph to a protein-protein interaction network. Bioinformatics, 24(8):1093–1099, 2008.

[16] F. Hirsch and G. Lacombe. Elements of Functional Analysis.
Springer-Verlag New York, 1999.

[17] P. Hoff, A. Raftery, and M. Handcock. Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460):1090–1098, 2002.

[18] X. Jia. Wireless networks and random geometric graphs. Proc. Int. Symp. Parallel Architectures, Algorithms and Networks, pages 575–579, 2004.

[19] D. Karger, R. Motwani, and M. Sudan. Approximate graph coloring by semidefinite programming. Journal of the ACM (JACM), 45(2):246–265, 1998.

[20] O. Klopp, A. Tsybakov, and N. Verzelen. Oracle inequalities for network models and sparse graphon estimation. Annals of Statistics, 45(1):316–354, 2017.

[21] V. Koltchinskii. Asymptotics of spectral projections of some random matrices approximating integral operators. Progress in Probability, 43 (In: Eberlein E., Hahn M., Talagrand M. (eds.) High Dimensional Probability):191–227, 1998.

[22] V. Koltchinskii and E. Giné. Random matrix approximation of spectra of integral operators. Bernoulli, pages 113–167, 2000.

[23] K. Levin and V. Lyzinski. Laplacian eigenmaps from sparse, noisy similarity measurements. IEEE Transactions on Signal Processing, 65:1998–2003, 2017.

[24] L. Lovász. Large Networks and Graph Limits. Colloquium Publications (AMS), 2012.

[25] L. Lovász and B. Szegedy. Limits of dense graph sequences. J. Combin. Theory Ser. B, 96(6):197–215, 2006.

[26] S. Nicaise. Jacobi polynomials, weighted Sobolev spaces and approximation results of some singularities. Math. Nachr., 213:117–140, 2000.

[27] M. Penrose. Random Geometric Graphs. Oxford University Press, first edition, 2003.

[28] D. L. Sussman, M. Tang, and C. E. Priebe.
Consistent latent position estimation and vertex classification for random dot product graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36:48–57, 2014.

[29] M. Tang, D. L. Sussman, and C. E. Priebe. Universally consistent vertex classification for latent position graphs. Annals of Statistics, 41:1406–1430, 2013.

[30] M. Walters. Random geometric graphs. Surveys in Combinatorics, pages 365–402, 2011.