{"title": "Blind Deconvolutional Phase Retrieval via Convex Programming", "book": "Advances in Neural Information Processing Systems", "page_first": 10030, "page_last": 10040, "abstract": "We consider the task of recovering two real or complex $m$-vectors from phaseless Fourier measurements of their circular convolution.  Our method is a novel convex relaxation that is based on a lifted matrix recovery formulation that allows a nontrivial convex relaxation of the bilinear measurements from convolution.    We prove that if  the two signals belong to known random subspaces of dimensions $k$ and $n$, then they can be recovered up to the inherent scaling ambiguity with $m  >> (k+n) \\log^2 m$  phaseless measurements.  Our method provides the first theoretical recovery guarantee for this problem by a computationally efficient algorithm and does not require a solution estimate to be computed for initialization. Our proof is based Rademacher complexity estimates.  Additionally, we provide an ADMM implementation of the method and provide numerical experiments that verify the theory.", "full_text": "Blind Deconvolutional Phase Retrieval via Convex\n\nProgramming\n\nAli Ahmed\n\nDepartment of Electrical Engineering\nInformation Technology University\n\nLahore, Pakistan.\n\nali.ahmed@itu.edu.pk\n\nAlireza Aghasi\n\nDepartment of Business Analytics\n\nGeorgia State University\n\nAtlanta, GA.\n\naaghasi@gsu.edu\n\nCollege of Computer and Information Science\n\nPaul Hand\n\nNortheastern University\n\nBoston, MA.\n\np.hand@northeastern.edu\n\nAbstract\n\nWe consider the task of recovering two real or complex m-vectors from phaseless\nFourier measurements of their circular convolution. Our method is a novel convex\nrelaxation that is based on a lifted matrix recovery formulation that allows a\nnontrivial convex relaxation of the bilinear measurements from convolution. 
We prove that if the two signals belong to known random subspaces of dimensions k and n, then they can be recovered up to the inherent scaling ambiguity with m >> (k + n) log^2 m phaseless measurements. Our method provides the first theoretical recovery guarantee for this problem by a computationally efficient algorithm and does not require a solution estimate to be computed for initialization. Our proof is based on Rademacher complexity estimates. Additionally, we provide an ADMM implementation of the method and provide numerical experiments that verify the theory.

1 Introduction

This paper considers the recovery of two unknown signals (real- or complex-valued) from the magnitude-only measurements of their convolution. Let $w$ and $x$ be vectors residing in $\mathbb{H}^m$, where $\mathbb{H}$ denotes either $\mathbb{R}$ or $\mathbb{C}$. Moreover, denote by $F$ the DFT matrix with entries $F[\omega, t] = \frac{1}{\sqrt{m}} e^{-j 2\pi \omega t / m}$, $1 \leq \omega, t \leq m$. We observe the phaseless Fourier coefficients of the circular convolution $w \circledast x$ of $w$ and $x$:

$$y = |F(w \circledast x)|, \qquad (1)$$

where $|z|$ returns the element-wise absolute value of the vector $z$. We are interested in recovering $w$ and $x$ from the phaseless measurements $y$ of their circular convolution. In other words, the problem concerns blind deconvolution of two signals from phaseless measurements. The problem can also be viewed as identifying the structural properties on $w$ such that its convolution with the signal/image of interest $x$ makes the phase retrieval of $x$ well-posed. Since $w$ and $x$ are both unknown, and in addition the measurements are phaseless, the inverse problem becomes severely ill-posed, as many pairs of $w$ and $x$ correspond to the same $y$. We show that this non-linear problem can be efficiently solved, under Gaussian measurements, using a semidefinite program, and we also theoretically prove this assertion.
We also propose a heuristic approach to solve the proposed semidefinite program computationally efficiently. Numerical experiments show that, using this algorithm, one can successfully recover a blurred image from the magnitude-only measurements of its Fourier spectrum.

32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada.

Phase retrieval has been of continued interest in the fields of signal processing, imaging, physics, computational science, etc. Perhaps the single most important context in which phase retrieval arises is X-ray crystallography Harrison [1993], Millane [1990], where the far-field pattern of X-rays scattered from a crystal forms the Fourier transform of its image, and it is only possible to measure the intensities of the electromagnetic radiation. However, with the advancement of imaging technologies, the phase retrieval problem continues to arise in several other imaging modalities such as diffraction imaging Bunk et al. [2007], microscopy Miao et al. [2008], and astronomical imaging Fienup and Dainty [1987]. In the imaging context, the result in this paper would mean that if rays are convolved with a generic pattern (either man-made or naturally arising due to propagation of light through some unknown medium) w prior to being scattered/reflected from the object, the image of the object can be recovered from the Fourier intensity measurements later on. As is well known from Fourier optics Goodman [2008], the convolution of visible light with a generic pattern can be implemented using a lens-grating-lens setup.

Despite recent advances in the theoretical understanding of phase retrieval Candes et al. [2013, 2015a], the application to actual problems such as crystallography remains challenging, owing partly to simplistic mathematical models that may not fully capture the actual physical problem at hand.
Our comparatively more complex model in (1) more faithfully captures the structure of actual physical problems, for example, crystallography, where due to the natural periodic arrangement of a crystal structural unit, the observed electron density function of the crystal exactly takes the form (1); for details, see Section 2 of Elser et al. [2017].

Blind deconvolution is a fundamental problem in signal processing, communications, and general system theory. Visible light communication has been proposed as a standard in 5G communications for local area networks Azhar et al. [2013], Retamal et al. [2015], Azhar et al. [2010]. Propagation of information-carrying light through an unknown communication medium is modeled as a convolution. The channel is unknown, and at the receiver it is generally difficult to measure the phase information in the propagated light. The result in this paper says that the transmitted signal can be blindly deconvolved from the unknown channel using only the Fourier intensity measurements of the light. The reader is referred to the first section of the supplementary note for a detailed description of visible light communication and its connection to our formulation.

1.1 Observations in Matrix Form

The phase retrieval and blind deconvolution problems have been extensively studied in the signal processing community in recent years Candes et al. [2015b], Ahmed et al. [2014] by lifting the unknown vectors to a higher-dimensional matrix space formed by their outer products. The resulting rank-1 matrix is recovered using the nuclear norm as a convex relaxation of the non-convex rank constraint. Recently, other forms of convex relaxations have been proposed Bahmani and Romberg [2017a], Goldstein and Studer [2018], Aghasi et al. [2017a,b] that solve both problems in the native (unlifted) space, leading to computationally efficiently solvable convex programs.
This paper handles the non-linear convolutional phase retrieval problem by lifting it into a bilinear problem. The resulting problem, though still non-convex, gives way to an effective convex relaxation that provably recovers $w$ and $x$ exactly.

It is clear from (1) that uniquely recovering $w$ and $x$ is not possible without extra knowledge or information about the problem. We will address the problem under the broad and generally applicable structural assumption that both vectors $w$ and $x$ are members of known subspaces of $\mathbb{H}^m$. This means that $w$ and $x$ can be parameterized in terms of unknown lower-dimensional vectors $h \in \mathbb{H}^k$ and $m \in \mathbb{H}^n$, respectively, as follows:

$$w = Bh, \quad x = Cm, \qquad (2)$$

where $B \in \mathbb{H}^{m \times k}$ and $C \in \mathbb{H}^{m \times n}$ are known matrices whose columns span the subspaces in which $w$ and $x$ reside, respectively. Recovering $h$ and $m$ would imply the recovery of $w$ and $x$; therefore, we take $h$ and $m$ as the unknowns in the inverse problem henceforth. Since the circular convolution operator diagonalizes in the Fourier domain, the measurements in (1) take the following form after incorporating the subspace constraints in (2):

$$y = \tfrac{1}{\sqrt{m}}\,|\hat{B}h \odot \hat{C}m|,$$

where $\hat{B} = \sqrt{m}\, F B$, $\hat{C} = \sqrt{m}\, F C$, and $\odot$ represents the Hadamard product. Denoting by $b_\ell^*$ and $c_\ell^*$ the rows of $\hat{B}$ and $\hat{C}$, respectively, the entries of the measurements $y$ can be expressed as

$$y_\ell^2 = \tfrac{1}{m}\,|\langle b_\ell, h\rangle \langle c_\ell, m\rangle|^2, \quad \ell = 1, 2, 3, \ldots, m.$$

Evidently the problem is non-linear in both unknowns. However, it reduces to a bilinear problem in the lifted variables $hh^*$ and $mm^*$, taking the form

$$y_\ell^2 = \tfrac{1}{m}\,\langle b_\ell b_\ell^*, hh^*\rangle \langle c_\ell c_\ell^*, mm^*\rangle = \tfrac{1}{m}\,\langle b_\ell b_\ell^*, H\rangle \langle c_\ell c_\ell^*, M\rangle, \qquad (3)$$

where $H$ and $M$ are the rank-1 matrices $hh^*$ and $mm^*$, respectively.
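The diagonalized measurement model and the lifted identity (3) are easy to verify numerically. Below is a minimal NumPy sketch (all dimensions and names are illustrative; the unitary DFT is taken as `np.fft.fft` divided by the square root of the length):

```python
import numpy as np

rng = np.random.default_rng(0)
m, k, n = 64, 4, 3

# Known subspace matrices and unknown low-dimensional coefficient vectors.
B = rng.normal(0.0, 1.0 / np.sqrt(m), (m, k))
C = rng.normal(0.0, 1.0 / np.sqrt(m), (m, n))
h = rng.normal(size=k)
mm = rng.normal(size=n)            # the coefficient vector called "m" in the text
w, x = B @ h, C @ mm

# Phaseless Fourier measurements y = |F(w ⊛ x)|, with F the unitary DFT.
conv = np.fft.ifft(np.fft.fft(w) * np.fft.fft(x))   # circular convolution w ⊛ x
y = np.abs(np.fft.fft(conv)) / np.sqrt(m)

# Diagonalized form: y = (1/√m) |B̂h ⊙ Ĉm| with B̂ = √m F B, Ĉ = √m F C.
Bhat = np.fft.fft(B, axis=0)       # equals √m · F · B for the unitary F above
Chat = np.fft.fft(C, axis=0)
assert np.allclose(y, np.abs((Bhat @ h) * (Chat @ mm)) / np.sqrt(m))

# Lifted bilinear form (3): y_l^2 = (1/m) <b_l b_l*, H> <c_l c_l*, M>.
u = np.abs(Bhat @ h) ** 2          # <b_l b_l*, h h*>
v = np.abs(Chat @ mm) ** 2         # <c_l c_l*, m m*>
assert np.allclose(y ** 2, u * v / m)
```

Both assertions pass because the DFT diagonalizes circular convolution, so the spectrum of $w \circledast x$ factors into the entrywise product of the spectra of $w$ and $x$.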
Treating the lifted variables $H$ and $M$ as unknowns makes the measurements bilinear in the unknowns; this is a structure that will help us formulate an effective convex relaxation.

1.2 Novel Convex Relaxation

The task of recovering $H$ and $M$ from $y$ in (3) can be naturally posed as an optimization program

$$\begin{array}{ll} \text{find} & H, M \\ \text{subject to} & \tfrac{1}{m}\,\langle b_\ell b_\ell^*, H\rangle \langle c_\ell c_\ell^*, M\rangle = y_\ell^2, \quad \ell = 1, 2, 3, \ldots, m, \\ & \operatorname{rank}(H) = 1, \;\; \operatorname{rank}(M) = 1. \end{array} \qquad (4)$$

However, both the measurement and the rank constraints are non-convex. Further, the immediate convex relaxation of each measurement constraint is trivial, as the convex hull of the set of $(H, M)$ satisfying $y_\ell^2 = \tfrac{1}{m}\langle b_\ell b_\ell^*, H\rangle \langle c_\ell c_\ell^*, M\rangle$ is the set of all possible $(H, M)$.

To derive our convex relaxation, recall that the true $H = hh^*$ and $M = mm^*$ are also positive semidefinite (PSD). This means that incorporating the PSD constraint in the optimization program translates into the fact that the variables $u_\ell = \langle b_\ell b_\ell^*, H\rangle$ and $v_\ell = \langle c_\ell c_\ell^*, M\rangle$ are necessarily non-negative. That is,

$$H \succeq 0 \;\text{ and }\; M \succeq 0 \;\Longrightarrow\; u_\ell \geq 0 \;\text{ and }\; v_\ell \geq 0,$$

where the implication simply follows by the definition of PSD matrices. This observation restricts the hyperbolic constraint set in Figure 1 to the first quadrant only.
For a fixed $\ell$, we propose replacing the non-convex hyperbolic set $\{(u_\ell, v_\ell) \in \mathbb{R}^2 \,|\, \tfrac{1}{m} u_\ell v_\ell = y_\ell^2,\; u_\ell \geq 0,\; v_\ell \geq 0\}$ with its convex hull $\{(u_\ell, v_\ell) \in \mathbb{R}^2 \,|\, \tfrac{1}{m} u_\ell v_\ell \geq y_\ell^2,\; u_\ell \geq 0,\; v_\ell \geq 0\}$. In short, our convex relaxation is possible because the PSD constraint from lifting happens to select a specific branch of the hyperbola given by any particular bilinear measurement, and this single branch has a nontrivial convex hull.

The rest of the convex relaxation is standard, as the rank constraint in (4) is then relaxed with nuclear-norm minimization, which reduces to trace minimization in the PSD case. Hence, we study the convex program

$$\begin{array}{ll} \text{minimize} & \operatorname{Tr}(H) + \operatorname{Tr}(M) \\ \text{subject to} & \tfrac{1}{m}\,\langle b_\ell b_\ell^*, H\rangle \langle c_\ell c_\ell^*, M\rangle \geq y_\ell^2, \quad \ell = 1, 2, \ldots, m, \\ & H \succeq 0, \;\; M \succeq 0. \end{array} \qquad (5)$$

The following lemma formally proves the convexity of the optimization program above.

Lemma 1. If $y \in \mathbb{R}^m$ is such that $y_\ell > 0$, then the optimization program in (5) is a convex program.

Proof. The objective of (5) is linear, so we focus on the constraints. For a fixed $\ell$, let

$$S_\ell := \{(H, M) \in \mathbb{H}^{k\times k} \times \mathbb{H}^{n\times n} \,|\, \tfrac{1}{m}\langle b_\ell b_\ell^*, H\rangle \langle c_\ell c_\ell^*, M\rangle \geq y_\ell^2,\; H \succeq 0,\; M \succeq 0\},$$
$$S_{\ell,1} := \{(u_\ell, v_\ell) \in \mathbb{R}^2 \,|\, \tfrac{1}{m} u_\ell v_\ell \geq y_\ell^2,\; u_\ell \geq 0,\; v_\ell \geq 0\}, \;\text{ and }\; S_{\ell,2} := \{(H, M) \in \mathbb{H}^{k\times k} \times \mathbb{H}^{n\times n} \,|\, (\langle b_\ell b_\ell^*, H\rangle, \langle c_\ell c_\ell^*, M\rangle) \in S_{\ell,1}\}.$$
To show that $S_\ell$ is convex, it suffices to show that $S_{\ell,1}$ and $S_{\ell,2}$ are convex.

Figure 1: Left: Restriction of the hyperbolic constraint $\tfrac{1}{m} u_\ell v_\ell = y_\ell^2$, $u_\ell > 0$, to the first quadrant, and its convex hull. Right: Abstract illustration of the geometry of the convex relaxation. PSD cone (blue) and the surface of the hyperbolic set (red) formed by two intersecting hyperbolas (m = 2). Evidently, there are multiple points on the surface and also in the convex hull of the hyperbolic set that lie on the PSD cone. The minimizer of the optimization program (5) picks the one with minimum trace, which happens to lie at the intersection of the hyperbolic ridge and the PSD cone (pointed out by an arrow). The gray envelope of two (m = 2) hyperplanes surrounding the hyperbolic set corresponds to the linearization of the hyperbolic set at the minimizer; this forms the basis of a closely connected linearly constrained program later in (9).

Fix $(u_1, v_1), (u_2, v_2) \in S_{\ell,1}$, and let $\alpha \in [0, 1]$. Note that $u_1 > 0$ and $u_2 > 0$ since $y_\ell > 0$.
Consider

$$\tfrac{1}{m}(\alpha u_1 + (1-\alpha)u_2)(\alpha v_1 + (1-\alpha)v_2) = \tfrac{1}{m}\big((\alpha^2 u_1 v_1 + (1-\alpha)^2 u_2 v_2) + \alpha(1-\alpha)(u_1 v_2 + u_2 v_1)\big)$$
$$\geq \big(\alpha^2 y_\ell^2 + (1-\alpha)^2 y_\ell^2\big) + \alpha(1-\alpha)\Big(\frac{y_\ell^2 u_2}{u_1} + \frac{y_\ell^2 u_1}{u_2}\Big)$$
$$= y_\ell^2 + \frac{y_\ell^2}{u_1 u_2}\Big(2\alpha^2 u_1 u_2 - 2\alpha u_1 u_2 + \alpha(1-\alpha)(u_1^2 + u_2^2)\Big) = y_\ell^2\Big(1 + \frac{(\alpha - \alpha^2)(u_1 - u_2)^2}{u_1 u_2}\Big) \geq y_\ell^2,$$

where the last inequality follows from the fact that $\alpha \in [0, 1]$ and $u_1 u_2 > 0$. This shows that $S_{\ell,1}$ is convex.

The set $S_{\ell,2}$ is convex because the inverse image of a convex set under a linear map is convex. This implies that $S_\ell$ is convex. Finally, since the intersection of any number of convex sets is convex, the constraint set of (5) is convex. This proves that (5) is a convex program.

1.3 Main Result

As we are presenting the first analytical results on this problem, we choose the subspace matrices $B$ and $C$ to be standard Gaussian:

$$B[\ell, i] \sim \text{Normal}\big(0, \tfrac{1}{m}\big), \;(\ell, i) \in [m]\times[k], \quad \text{and} \quad C[\ell, i] \sim \text{Normal}\big(0, \tfrac{1}{m}\big), \;(\ell, i) \in [m]\times[n]. \qquad (6)$$

Note that this choice results in $b_\ell, c_\ell \sim \text{Normal}(0, I)$. We show that with this choice the optimization program in (5) recovers a global scaling $(\alpha H^\natural, \alpha^{-1} M^\natural)$ of the true solution $(H^\natural, M^\natural)$. We will interchangeably use the notation $(H, M) \in (\mathbb{H}^{k\times k}, \mathbb{H}^{n\times n})$ to denote the pair of matrices $H$ and $M$, or the block diagonal matrix

$$(H, M) = \begin{bmatrix} H & 0 \\ 0 & M \end{bmatrix}. \qquad (7)$$

The exact value of the unknown scalar multiple $\alpha$ can be characterized for the solution of (5).
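The inherent scaling ambiguity is easy to see numerically: replacing $(h, m)$ by $(\alpha h, \alpha^{-1} m)$ leaves the measurements unchanged, and the trace-balanced representative defined in (8) below is the one whose two blocks have equal trace. A small illustrative NumPy sketch (all names and dimensions are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(5)
m_dim, k, n = 32, 3, 3
B = rng.normal(0.0, 1.0 / np.sqrt(m_dim), (m_dim, k))
C = rng.normal(0.0, 1.0 / np.sqrt(m_dim), (m_dim, n))
h = rng.normal(size=k)
x = rng.normal(size=n)

def measurements(hc, xc):
    # y = |F(w ⊛ x)| for w = B hc, x = C xc, with F the unitary DFT.
    w, s = B @ hc, C @ xc
    return np.abs(np.fft.fft(np.fft.ifft(np.fft.fft(w) * np.fft.fft(s)))) / np.sqrt(m_dim)

# Scaling ambiguity: (alpha h, x / alpha) produces identical phaseless data.
alpha = 2.7
assert np.allclose(measurements(h, x), measurements(alpha * h, x / alpha))

# Trace-balanced scaling: H~ = sqrt(tr M / tr H) H and M~ = sqrt(tr H / tr M) M
# have equal traces, singling out one representative of the equivalence class.
H, M = np.outer(h, h), np.outer(x, x)
Ht = np.sqrt(np.trace(M) / np.trace(H)) * H
Mt = np.sqrt(np.trace(H) / np.trace(M)) * M
assert np.isclose(np.trace(Ht), np.trace(Mt))
```

This is only a sanity check of the ambiguity itself, not of the recovery algorithm.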
Observe that the solution $(\widehat{H}, \widehat{M})$ of the convex optimization program in (5) obeys $\operatorname{Tr}(\widehat{H}) = \operatorname{Tr}(\widehat{M})$. We aim to show that the solution of the optimization program recovers the scaling $(\tilde{H}, \tilde{M})$ of the true solution $(H^\natural, M^\natural)$:

$$\tilde{H} = \sqrt{\frac{\operatorname{Tr}(M^\natural)}{\operatorname{Tr}(H^\natural)}}\, H^\natural, \qquad \tilde{M} = \sqrt{\frac{\operatorname{Tr}(H^\natural)}{\operatorname{Tr}(M^\natural)}}\, M^\natural. \qquad (8)$$

Note that $\operatorname{Tr}(\tilde{H}) = \operatorname{Tr}(\tilde{M})$. The main result can now be stated as follows.

Theorem 1 (Exact Recovery). Given the magnitude-only spectrum measurements (1) of the convolution of two unknown vectors $w^\natural$ and $x^\natural$ in $\mathbb{H}^m$, suppose that $w^\natural$ and $x^\natural$ are generated as in (2), where $B$ and $C$ are known standard Gaussian matrices as in (6). Then the convex optimization program in (5) uniquely recovers $(\alpha H^\natural, \alpha^{-1} M^\natural)$ for $\alpha = \sqrt{\operatorname{Tr} M^\natural / \operatorname{Tr} H^\natural}$ with probability at least $1 - \exp(-\tfrac{1}{2} m t^2)$ whenever $m \geq c\big(\sqrt{(k+n)\log m} + t\big)^2$, where $c$ is an absolute constant.

1.4 Main Contributions

In this paper, we study the combination of two important and notoriously challenging signal recovery problems: phase retrieval and blind deconvolution. We introduce a novel convex formulation that is possible because the algebraic structure from lifting resolves the bilinear ambiguity just enough to permit a nontrivial convex relaxation of the measurements. The strength of our approach is that it allows a novel convex program that is the first to provably permit recovery guarantees with optimal sample complexity for the joint task of phase retrieval and blind deconvolution when the signals belong to known random subspaces.
Additionally, unlike many recent convex relaxations and nonconvex approaches, our approach does not require an initialization or estimate of the true solution in order to be stated or solved. Admittedly, our method, directly interpreted, is computationally prohibitive for large problem sizes because lifting squares the dimensionality of the problem. Nonetheless, techniques such as Burer-Monteiro approaches, which only maintain low-rank representations Burer and Monteiro [2003], have been developed for similar problems. The current work provides the theoretical justification for the exploration of such problems in this difficult combination of phase retrieval and blind deconvolution, and we leave such work for future research.

We do not want to give the reader the impression that the present paper solves the problem of blind deconvolutional phase retrieval in practice. The numerical experiments we perform do indeed show excellent agreement with the theorem in the case of random subspaces. Such subspaces are unlikely to appear in practice, and typically appropriate subspaces would be deterministic, including partial Discrete Cosine Transforms or partial Discrete Wavelet Transforms. Numerical experiments, not shown, indicate that our convex relaxation is less effective for these deterministic subspaces. We suspect this is because the subspaces for both measurements should be mutually incoherent, in addition to both being incoherent with respect to the Fourier basis given by the measurements. As with the initial recovery theory for the problems of compressed sensing and phase retrieval, we have studied the random case in order to show that information-theoretically optimal sample complexity is possible by efficient algorithms.
Based on this work, it is clear that blind deconvolutional phase retrieval remains a very challenging problem in the presence of deterministic matrices, and one for which the development of convex or nonconvex methods may provide substantial progress in applications.

2 Proof of Theorem 1

To prove Theorem 1, we will show that $(\tilde{H}, \tilde{M})$ is the unique minimizer of an optimization program with a larger feasible set defined by linear constraints.

Lemma 2. If $(\tilde{H}, \tilde{M})$ is the unique solution to

$$\begin{array}{ll} \text{minimize} & \|H\|_* + \|M\|_* \\ \text{subject to} & \tfrac{1}{m}\big(\langle b_\ell b_\ell^*, H\rangle \langle c_\ell c_\ell^*, \tilde{M}\rangle + \langle b_\ell b_\ell^*, \tilde{H}\rangle \langle c_\ell c_\ell^*, M\rangle\big) \geq 2 y_\ell^2, \quad \ell = 1, 2, 3, \ldots, m, \end{array} \qquad (9)$$

then $(\tilde{H}, \tilde{M})$ is the unique solution to (5).

Proof. Start by observing that the trace in (5) can be replaced with the nuclear norm, as the two are equivalent on the set of PSD matrices. This gives

$$\begin{array}{ll} \text{minimize} & \|H\|_* + \|M\|_* \\ \text{subject to} & \tfrac{1}{m}\,\langle b_\ell b_\ell^*, H\rangle \langle c_\ell c_\ell^*, M\rangle \geq y_\ell^2, \quad \ell = 1, 2, \ldots, m, \\ & H \succeq 0, \;\; M \succeq 0. \end{array} \qquad (10)$$

It suffices now to show that the feasible set of (9) contains the feasible set of (10). Recall the notations $u_\ell = \langle b_\ell b_\ell^*, H\rangle$ and $v_\ell = \langle c_\ell c_\ell^*, M\rangle$. We use the fact that a convex set with smooth boundary is contained in the half space defined by the tangent hyperplane at any point on the boundary of the set.
Consider the point $(\tilde{u}_\ell, \tilde{v}_\ell) \in \mathbb{R}^2$, where $\tilde{u}_\ell = \langle b_\ell b_\ell^*, \tilde{H}\rangle$ and $\tilde{v}_\ell = \langle c_\ell c_\ell^*, \tilde{M}\rangle$, and observe that

$$\big\{(u_\ell, v_\ell) \in \mathbb{R}^2 \;\big|\; \tfrac{1}{m} u_\ell v_\ell \geq y_\ell^2,\; u_\ell \geq 0,\; v_\ell \geq 0\big\} \subseteq \Big\{(u_\ell, v_\ell) \in \mathbb{R}^2 \;\Big|\; \tfrac{1}{m}\begin{bmatrix} \tilde{v}_\ell \\ \tilde{u}_\ell \end{bmatrix} \cdot \begin{bmatrix} u_\ell - \tilde{u}_\ell \\ v_\ell - \tilde{v}_\ell \end{bmatrix} \geq 0\Big\}.$$

Rewriting $u_\ell$ and $v_\ell$ in the form of the original constraints, we have that any feasible point $(H, M)$ of (10) satisfies $\tfrac{1}{m}\big(\langle b_\ell b_\ell^*, H\rangle \langle c_\ell c_\ell^*, \tilde{M}\rangle + \langle b_\ell b_\ell^*, \tilde{H}\rangle \langle c_\ell c_\ell^*, M\rangle\big) \geq 2 y_\ell^2$, $\ell = 1, 2, 3, \ldots, m$.

The geometry of the linearly constrained program (9) is also shown in Figure 1 (Right), where the hyperbolic set is replaced by an envelope of hyperplanes defined by the linear constraints of (9). Visually it is clear from Figure 1 that the feasible set of (9) is larger than that of (5).

Define the set $S := \{(H, M) \,|\, (H, M) = \alpha(-\tilde{H}, \tilde{M}),\; \alpha \in [-1, 1]\}$ and $A_\ell = (\tilde{v}_\ell b_\ell b_\ell^*, \tilde{u}_\ell c_\ell c_\ell^*) \in \mathbb{H}^{(k+n)\times(k+n)}$, and define a linear map $\mathcal{A}: \mathbb{H}^{(k+n)\times(k+n)} \to \mathbb{H}^m$ as $\mathcal{A}((H, M)) = [\langle A_1, (H, M)\rangle, \ldots, \langle A_m, (H, M)\rangle]^T$; one can imagine $\mathcal{A}$ as a matrix with vectorized $A_\ell$ as its rows. The linear constraints in (9) are $\mathcal{A}((H, M)) \geq 2 y^2$; the inequality here applies elementwise.
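The containment of feasible sets used in the proof of Lemma 2 (every hyperbolic constraint of (10) implies the corresponding tangent-hyperplane constraint of (9)) can be spot-checked numerically with scalar constraint values; the following sketch uses arbitrary illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(2)
m = 16
y = rng.uniform(0.5, 2.0, m)

# A point on each hyperbola branch: (1/m) u_t v_t = y^2 exactly (tangency point).
u_t = rng.uniform(0.2, 4.0, m)
v_t = m * y**2 / u_t

for _ in range(500):
    # Random points feasible for (10): (1/m) u v >= y^2 with u, v > 0.
    u = rng.uniform(0.01, 6.0, m)
    v = (m * y**2 / u) * rng.uniform(1.0, 4.0, m)
    # They also satisfy the linearized constraint of (9):
    # (1/m)(v_t u + u_t v) >= 2 y^2, which follows from AM-GM.
    assert np.all((v_t * u + u_t * v) / m >= 2 * y**2 - 1e-9)
```

The assertion never fails because $(\tilde{v} u + \tilde{u} v)/m \geq 2\sqrt{\tilde{u}\tilde{v}\, u v}/m \geq 2 y^2$ by the AM-GM inequality, mirroring the tangent-hyperplane argument above.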
Furthermore, define $\mathcal{N} := \operatorname{span}((-\tilde{H}, \tilde{M}))$; it is easy to see that $S \subset \mathcal{N} \subseteq \operatorname{Null}(\mathcal{A})$.

We want to show that any feasible perturbation $(\delta H, \delta M)$ around the truth $(\tilde{H}, \tilde{M})$ strictly increases the objective. From the discussion above, it is clear that perturbations $(\delta H, \delta M) \in S$ do not change the objective and also lead to feasible points of (9). Our general strategy will be to resolve any perturbation $(\delta H, \delta M)$ into two components, one in $\mathcal{N}$ and the other in $\mathcal{N}^\perp$, where $\mathcal{N}^\perp$ is the orthogonal complement of the subspace $\mathcal{N}$. The component in $\mathcal{N}$ does not affect the objective. We show that the components in $\mathcal{N}^\perp$ of all feasible perturbations lead to a strict increase in the objective of (9). This implies that the minimizer of (9) can be anywhere in the set¹ $(\tilde{H}, \tilde{M}) \oplus \mathcal{N}$. However, as we are minimizing the (trace) norms, an arbitrarily large scaling of the solution is prevented, and the minimizer is restricted to the subset $(\tilde{H}, \tilde{M}) \oplus S$. Moreover, among these solutions only $(\tilde{H}, \tilde{M})$ lies in the feasible set of (10). This, together with the fact that $(\tilde{H}, \tilde{M})$ is a minimizer of (9), implies that $(\tilde{H}, \tilde{M})$ is the unique minimizer of (10).

We begin by characterizing the set of descent directions for the objective function of the optimization program (9). Let $T_{\tilde{h}}$ and $T_{\tilde{m}}$ be the sets of symmetric matrices of the form

$$T_{\tilde{h}} := \{X = \tilde{h}z^* + z\tilde{h}^*\}, \qquad T_{\tilde{m}} := \{X = \tilde{m}z^* + z\tilde{m}^*\},$$

and denote the orthogonal complements by $T^\perp_{\tilde{h}}$ and $T^\perp_{\tilde{m}}$, respectively. Note that $X \in T^\perp_{\tilde{h}}$ iff both the row and column spaces of $X$ are perpendicular to $\tilde{h}$. $P_{T_{\tilde{h}}}$ denotes the orthogonal projection onto the set $T_{\tilde{h}}$, and a matrix $X$ of appropriate dimensions can be projected onto $T_{\tilde{h}}$ as
$$P_{T_{\tilde{h}}}(X) := \frac{\tilde{h}\tilde{h}^*}{\|\tilde{h}\|_2^2}\, X + X\, \frac{\tilde{h}\tilde{h}^*}{\|\tilde{h}\|_2^2} - \frac{\tilde{h}\tilde{h}^*}{\|\tilde{h}\|_2^2}\, X\, \frac{\tilde{h}\tilde{h}^*}{\|\tilde{h}\|_2^2}.$$

Similarly, define the projection operator $P_{T_{\tilde{m}}}$. The projections onto the orthogonal complements are then simply $P_{T^\perp_{\tilde{h}}} := I - P_{T_{\tilde{h}}}$ and $P_{T^\perp_{\tilde{m}}} := I - P_{T_{\tilde{m}}}$, where $I$ is the identity operator. We use $X^{T_{\tilde{h}}}$ as a shorthand for $P_{T_{\tilde{h}}}(X)$.

¹For a point $x$ and a set $S$, the notation $x \oplus S$ denotes the set of points $x + s_i$ for every $s_i \in S$.
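The projection $P_{T_{\tilde{h}}}$ above is straightforward to implement and verify; a minimal NumPy sketch (names are illustrative) checks idempotence and that the residual $X - P_{T_{\tilde{h}}}(X)$ has rows and columns orthogonal to $\tilde{h}$:

```python
import numpy as np

rng = np.random.default_rng(3)
k = 6
h = rng.normal(size=k)
P = np.outer(h, h) / np.dot(h, h)       # rank-1 projector h h* / ||h||^2

def proj_T(X):
    # P_{T_h}(X) = P X + X P - P X P, the projection onto T_h = {h z* + z h*}.
    return P @ X + X @ P - P @ X @ P

X = rng.normal(size=(k, k))
Y = proj_T(X)
R = X - Y                                # component in the orthogonal complement
assert np.allclose(proj_T(Y), Y)         # idempotent: projecting twice changes nothing
assert np.allclose(R @ h, 0.0)           # columns of the residual are ⊥ h
assert np.allclose(h @ R, 0.0)           # rows of the residual are ⊥ h
```

Algebraically the residual is $(I - P) X (I - P)$, which annihilates $\tilde{h}$ on both sides, matching the characterization of $T^\perp_{\tilde{h}}$ given above.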
Using the notation in (7), the objective of (9) is $\|(H, M)\|_*$, and the subdifferential of the objective at the proposed solution $(\tilde{H}, \tilde{M})$ is

$$\partial\|(\tilde{H}, \tilde{M})\|_* := \Big\{ G = \Big(\tfrac{\tilde{h}\tilde{h}^*}{\|\tilde{h}\|_2^2}, \tfrac{\tilde{m}\tilde{m}^*}{\|\tilde{m}\|_2^2}\Big) + \big(W^{T^\perp_{\tilde{h}}}, W^{T^\perp_{\tilde{m}}}\big) : \big\|\big(W^{T^\perp_{\tilde{h}}}, W^{T^\perp_{\tilde{m}}}\big)\big\| \leq 1 \Big\}.$$

The set $Q$ of descent directions of the objective of (9) is defined through the containments

$$\big\{(\delta H, \delta M) \in \mathcal{N}^\perp : \langle G, (\delta H, \delta M)\rangle \leq 0 \;\; \forall G \in \partial\|(\tilde{H}, \tilde{M})\|_*\big\}$$
$$\subseteq \Big\{(\delta H, \delta M) \in \mathcal{N}^\perp : \Big\langle\Big(\tfrac{\tilde{h}\tilde{h}^*}{\|\tilde{h}\|_2^2}, \tfrac{\tilde{m}\tilde{m}^*}{\|\tilde{m}\|_2^2}\Big), (\delta H, \delta M)\Big\rangle + \big\|\big(\delta H^{T^\perp_{\tilde{h}}}, \delta M^{T^\perp_{\tilde{m}}}\big)\big\|_* \leq 0\Big\}$$
$$\subset \big\{(\delta H, \delta M) \in \mathcal{N}^\perp : \big\|\big(\delta H^{T^\perp_{\tilde{h}}}, \delta M^{T^\perp_{\tilde{m}}}\big)\big\|_* \leq \big\|\big(\delta H^{T_{\tilde{h}}}, \delta M^{T_{\tilde{m}}}\big)\big\|_F\big\} =: Q. \qquad (11)$$

We quantify the "width" of the set of descent directions $Q$ through a Rademacher complexity and a probability that the gradients of the constraint functions of (9) lie in a certain half space. This enables us to build an argument, using the small ball method Koltchinskii and Mendelson [2015], Mendelson [2014], that it is unlikely for a point to meet the constraints in (9) and still be in $Q$. Before moving forward, we introduce the above-mentioned Rademacher complexity and probability term.

Denote the constraint functions as² $f_\ell(H, M) = \tilde{u}_\ell \langle c_\ell c_\ell^*, M\rangle + \tilde{v}_\ell \langle b_\ell b_\ell^*, H\rangle$. For a set $Q \subset (\mathbb{H}^{k\times k}, \mathbb{H}^{n\times n})$, the Rademacher complexity of the gradients $\nabla f_\ell = \big(\tfrac{\partial f_\ell}{\partial H}, \tfrac{\partial f_\ell}{\partial M}\big) = (\tilde{v}_\ell\, b_\ell b_\ell^*, \tilde{u}_\ell\, c_\ell c_\ell^*)$ is defined as

$$\mathcal{C}(Q) := \mathbb{E} \sup_{(H, M)\in Q} \frac{1}{\sqrt{m}} \sum_{\ell=1}^m \varepsilon_\ell \Big\langle \nabla f_\ell, \frac{(H, M)}{\|(H, M)\|_F} \Big\rangle, \qquad (12)$$

where $\varepsilon_\ell$, $\ell = 1, 2, 3, \ldots, m$, are iid Rademacher random variables independent of everything else in the expression. For a convex set $Q$, $\mathcal{C}(Q)$ is a measure of the width of $Q$ around the origin in terms of the gradients $\nabla f_\ell$, $\ell = 1, 2, 3, \ldots, m$.
For example, a random choice of gradient might yield little overlap with a structured set $Q$, leading to a smaller complexity $\mathcal{C}(Q)$.

Our result also depends on a probability $p_\tau(Q)$ and a positive parameter $\tau$ defined as

$$p_\tau(Q) := \inf_{(H, M)\in Q} \mathbb{P}\big(\langle \nabla f, (H, M)\rangle \geq \tau \|(H, M)\|_F\big). \qquad (13)$$

The probability $p_\tau(Q)$ quantifies the visibility of the set $Q$ through the gradient vectors $\nabla f$. A small value of $\tau$ and $p_\tau(Q)$ means that the set $Q$ mainly remains invisible through the lenses of the $\nabla f_\ell$, $\ell = 1, 2, 3, \ldots, m$. This can be appreciated just by noting that $p_\tau(Q)$ depends on the correlation of the elements of $Q$ with the gradient vectors $\nabla f_\ell$.

The following lemma shows that the minimizer of the linear program (9) almost always resides in the desired set $(\tilde{H}, \tilde{M}) \oplus S$ for a sufficiently large $m$, quantified in terms of $\mathcal{C}(Q)$, $p_\tau(Q)$, and $\tau$.

Lemma 3. Consider the optimization program in (9), and let $Q$, characterized in (11), be the set of descent directions for which $\mathcal{C}(Q)$ and $p_\tau(Q)$ can be determined using (12) and (13). Choose

$$m \geq \Big(\frac{2\mathcal{C}(Q) + t\tau}{\tau\, p_\tau(Q)}\Big)^2$$

for any $t > 0$. Then the minimizer $(\widehat{H}, \widehat{M})$ of (9) lies in the set $(\tilde{H}, \tilde{M}) \oplus S$ with probability at least $1 - e^{-2mt^2}$.

The proof of this lemma is based on the small ball method developed in Koltchinskii and Mendelson [2015], Mendelson [2014] and further studied in Lecué et al. [2018], Lecué and Mendelson [2017]. The proof mainly repeats the argument in Bahmani and Romberg [2017b], and is provided in the supplementary material for completeness.

With Lemma 3 in place, an application of Lemma 2 and the discussion after it proves that for the choice of $m$ outlined in Lemma 3, $(\tilde{H}, \tilde{M})$ is the unique minimizer of (5).
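To build intuition for definition (12), a Rademacher complexity can be estimated empirically. For the much larger set consisting of the unit Frobenius ball (not the descent cone $Q$ of (11)), the supremum in (12) has a closed form: it is the Frobenius norm of the Rademacher average of the gradients. A Monte Carlo sketch, with all parameters illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
m, k, n = 128, 4, 4

# Gradients of the constraint functions: grad f_l = (v_t[l] b_l b_l^T, u_t[l] c_l c_l^T).
b = rng.normal(size=(m, k))
c = rng.normal(size=(m, n))
h_t = rng.normal(size=k)
m_t = rng.normal(size=n)
u_t = (b @ h_t) ** 2               # <b_l b_l^T, h h^T>
v_t = (c @ m_t) ** 2               # <c_l c_l^T, m m^T>

def sup_over_unit_ball():
    # For Q = unit Frobenius ball, the sup over (H, M) of the inner product with
    # (1/sqrt(m)) sum_l eps_l grad f_l is the Frobenius norm of that average.
    eps = rng.choice([-1.0, 1.0], size=m)
    GH = np.einsum('l,l,li,lj->ij', eps, v_t, b, b) / np.sqrt(m)
    GM = np.einsum('l,l,li,lj->ij', eps, u_t, c, c) / np.sqrt(m)
    return np.sqrt(np.linalg.norm(GH, 'fro')**2 + np.linalg.norm(GM, 'fro')**2)

C_est = np.mean([sup_over_unit_ball() for _ in range(50)])
assert np.isfinite(C_est) and C_est > 0
```

For the true descent cone $Q$ the supremum has no such closed form, which is why the proof bounds $\mathcal{C}(Q)$ analytically rather than computing it.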
The last missing piece in the proof of Theorem 1 is the computation of the Rademacher complexity C(Q) and of the probability p_τ(Q) for this Q.²

²For brevity, we will often drop the dependence on H and M in the notation f_ℓ(H, M).

2.1 Rademacher Complexity

We begin with the evaluation of the complexity

\[
C(\mathcal{Q}) := \mathrm{E} \sup_{(\delta H, \delta M) \in \mathcal{Q}} \Big\langle \frac{1}{\sqrt{m}} \sum_{\ell=1}^{m} \varepsilon_\ell \nabla f_\ell,\ \frac{(\delta H, \delta M)}{\|(\delta H, \delta M)\|_F} \Big\rangle.
\]

Splitting (δH, δM) between (T_h̃, T_m̃) and (T_h̃⊥, T_m̃⊥), and using Hölder's inequality, we obtain

\[
C(\mathcal{Q}) \le \mathrm{E}\Big\| \frac{1}{\sqrt{m}} \sum_{\ell=1}^{m} \varepsilon_\ell \big(\tilde v_\ell P_{T_{\tilde h}}(b_\ell b_\ell^*),\ \tilde u_\ell P_{T_{\tilde m}}(c_\ell c_\ell^*)\big) \Big\|_F \cdot \sup_{(\delta H,\delta M)\in\mathcal{Q}} \frac{\|(\delta H_{T_{\tilde h}}, \delta M_{T_{\tilde m}})\|_F}{\|(\delta H,\delta M)\|_F} + \mathrm{E}\Big\| \frac{1}{\sqrt{m}} \sum_{\ell=1}^{m} \varepsilon_\ell \big(\tilde v_\ell b_\ell b_\ell^*,\ \tilde u_\ell c_\ell c_\ell^*\big) \Big\| \cdot \sup_{(\delta H,\delta M)\in\mathcal{Q}} \frac{\|(\delta H_{T_{\tilde h}^\perp}, \delta M_{T_{\tilde m}^\perp})\|_*}{\|(\delta H,\delta M)\|_F}.
\]

On the set Q, defined in (11), we have

\[
\frac{\|(\delta H_{T_{\tilde h}}, \delta M_{T_{\tilde m}})\|_F}{\|(\delta H,\delta M)\|_F} \le 1, \qquad \frac{\|(\delta H_{T_{\tilde h}^\perp}, \delta M_{T_{\tilde m}^\perp})\|_*}{\|(\delta H,\delta M)\|_F} \le 1.
\]

Using Jensen's inequality, the first expectation simply becomes

\[
\mathrm{E}\Big\| \frac{1}{\sqrt{m}} \sum_{\ell=1}^{m} \varepsilon_\ell \big(\tilde v_\ell P_{T_{\tilde h}}(b_\ell b_\ell^*),\ \tilde u_\ell P_{T_{\tilde m}}(c_\ell c_\ell^*)\big) \Big\|_F \le \sqrt{ \frac{1}{m} \sum_{\ell=1}^{m} \mathrm{E}\big\| \big(\tilde v_\ell P_{T_{\tilde h}}(b_\ell b_\ell^*),\ \tilde u_\ell P_{T_{\tilde m}}(c_\ell c_\ell^*)\big) \big\|_F^2 } = \sqrt{ \frac{1}{m} \sum_{\ell=1}^{m} \mathrm{E}\Big( \|\tilde v_\ell P_{T_{\tilde h}}(b_\ell b_\ell^*)\|_F^2 + \|\tilde u_\ell P_{T_{\tilde m}}(c_\ell c_\ell^*)\|_F^2 \Big) },
\]

where the last equality follows by carrying out the expectation over the ε_ℓ's. Recall from the definition of the projection operator that

\[
P_{T_{\tilde h}}(b_\ell b_\ell^*) := \frac{\tilde h \tilde h^*}{\|\tilde h\|_2^2}\, b_\ell b_\ell^* + b_\ell b_\ell^* \frac{\tilde h \tilde h^*}{\|\tilde h\|_2^2} - \frac{\tilde h \tilde h^*}{\|\tilde h\|_2^2}\, b_\ell b_\ell^*\, \frac{\tilde h \tilde h^*}{\|\tilde h\|_2^2},
\]

and that ṽ_ℓ = |c_ℓ* m̃|². It can easily be verified that

\[
\|P_{T_{\tilde h}}(b_\ell b_\ell^*)\|_F^2 = 2\, \frac{|b_\ell^* \tilde h|^2}{\|\tilde h\|_2^2}\, \|b_\ell\|_2^2 - \frac{|b_\ell^* \tilde h|^4}{\|\tilde h\|_2^4},
\]

and, therefore,

\[
\mathrm{E}\, \|\tilde v_\ell P_{T_{\tilde h}}(b_\ell b_\ell^*)\|_F^2 \le \mathrm{E}|c_\ell^* \tilde m|^4 \cdot \mathrm{E}\Big( 2\, \frac{|b_\ell^* \tilde h|^2}{\|\tilde h\|_2^2}\, \|b_\ell\|_2^2 - \frac{|b_\ell^* \tilde h|^4}{\|\tilde h\|_2^4} \Big) \le 3\|\tilde m\|_2^4\, (6k - 3),
\]

where we used a simple calculation involving fourth moments of Gaussians, namely E|c_ℓ* m̃|⁴ = 3‖m̃‖₂⁴, E|b_ℓ* h̃|⁴ = 3‖h̃‖₂⁴, and E|b_ℓ* h̃|² ‖b_ℓ‖₂² ≤ 3k‖h̃‖₂². In an exactly similar manner, we can also show that E‖ũ_ℓ P_{T_m̃}(c_ℓ c_ℓ*)‖_F² ≤ 3‖h̃‖₂⁴ (6n − 3). Putting these together gives us

\[
\mathrm{E}\Big\| \frac{1}{\sqrt{m}} \sum_{\ell=1}^{m} \varepsilon_\ell \big(\tilde v_\ell P_{T_{\tilde h}}(b_\ell b_\ell^*),\ \tilde u_\ell P_{T_{\tilde m}}(c_\ell c_\ell^*)\big) \Big\|_F \le 5 \max(\|\tilde h\|_2^2, \|\tilde m\|_2^2)\, \sqrt{k+n}. \tag{14}
\]

Moreover,

\[
\mathrm{E}\Big\| \frac{1}{\sqrt{m}} \sum_{\ell=1}^{m} \varepsilon_\ell \big(\tilde v_\ell b_\ell b_\ell^*,\ \tilde u_\ell c_\ell c_\ell^*\big) \Big\| \le \mathrm{E} \max_{\ell} (\tilde u_\ell, \tilde v_\ell) \cdot \mathrm{E}\Big\| \frac{1}{\sqrt{m}} \sum_{\ell=1}^{m} \varepsilon_\ell \big(b_\ell b_\ell^*, c_\ell c_\ell^*\big) \Big\|.
\]

Standard net arguments (see, for example, Sec. 5.4.1 of Eldar and Kutyniok [2012]) show that

\[
\mathrm{P}\Big( \Big\| \frac{1}{\sqrt{m}} \sum_{\ell=1}^{m} \varepsilon_\ell \big(b_\ell b_\ell^*, c_\ell c_\ell^*\big) \Big\| \ge c\sqrt{k+n} \Big) \le e^{-cm},
\]

provided that m ≥ c(k + n). This directly implies that E‖(1/√m) Σ_ℓ ε_ℓ (b_ℓ b_ℓ*, c_ℓ c_ℓ*)‖ ≤ c√(k+n). The random variables ũ_ℓ and ṽ_ℓ, being sub-exponential, have Orlicz-1 norms bounded by c max(‖h̃‖₂², ‖m̃‖₂²). Using standard results, such as Lemma 3 in van de Geer and Lederer [2013], we then have E max_ℓ(ũ_ℓ, ṽ_ℓ) ≤ c max(‖h̃‖₂², ‖m̃‖₂²) log m. Putting these together yields

\[
\mathrm{E}\Big\| \frac{1}{\sqrt{m}} \sum_{\ell=1}^{m} \varepsilon_\ell \big(\tilde v_\ell b_\ell b_\ell^*,\ \tilde u_\ell c_\ell c_\ell^*\big) \Big\| \le c \max(\|\tilde h\|_2^2, \|\tilde m\|_2^2)\, \sqrt{(k+n)\log^2 m}.
\]

We now have all the ingredients for the final bound on C(Q):

\[
C(\mathcal{Q}) \le c \max(\|\tilde h\|_2^2, \|\tilde m\|_2^2)\, \sqrt{(k+n)\log^2 m}. \tag{15}
\]

[Figure 2: Phase portraits highlighting the frequency of successful recoveries of the proposed convex program for random and deterministic channel subspaces (see the text for the experiment details). Axes: m and n + k.]

2.2 Probability p_τ(Q)

The calculations for the probability p_τ(Q) and the positive parameter τ are given in the supplementary material due to limitation
of space. We find that

\[
p_\tau(\mathcal{Q}) \ge c > 0, \qquad \tau = c \max(\|\tilde h\|_2^2, \|\tilde m\|_2^2). \tag{16}
\]

The complexity estimate in (15), the value of τ computed above, and the bound on p_τ(Q) stated in (16), together with an application of Lemma 3, prove Theorem 1.

3 Convex Implementation and Phase Transition

To implement the semidefinite convex program (5), we propose a numerical scheme based on the alternating direction method of multipliers (ADMM). Due to the space limit, the technical details of the algorithm are deferred to Section 4 of the supplementary note.

To illustrate the perfect-recovery region, in Figure 2 we present the phase portraits associated with the proposed convex framework. To obtain the diagram in the left panel, for each fixed value of m we run the algorithm for 100 different combinations of n and k, each time using a different set of Gaussian matrices B and C. If the algorithm converges to a sufficiently close neighborhood of the ground-truth solution (a distance less than 1% of the solution's ℓ₂ norm), we label the experiment as successful. Figure 2 shows the collected success frequencies, where solid black corresponds to 100% success and solid white corresponds to 0% success. For an empirically selected constant c, the success region lies almost perfectly to the left of the curve n + k = c m log⁻² m.

While the analysis in this paper is specifically focused on Gaussian subspace embeddings for w and x, in the right panel of Figure 2 we have plotted the phase diagram for the case where B is deterministic, a subset of the columns of the identity matrix (equispaced sampling of the columns), and C is Gaussian as before. This importantly hints that the convex framework is applicable to more realistic deterministic subspace models.

Acknowledgments

PH acknowledges support from NSF DMS 1464525.

References

Robert W Harrison.
Phase problem in crystallography. JOSA A, 10(5):1046–1055, 1993.

Rick P Millane. Phase retrieval in crystallography and optics. JOSA A, 7(3):394–411, 1990.

Oliver Bunk, Ana Diaz, Franz Pfeiffer, Christian David, Bernd Schmitt, Dillip K Satapathy, and J Friso van der Veen. Diffractive imaging for periodic samples: retrieving one-dimensional concentration profiles across microfluidic channels. Acta Crystallographica Section A: Foundations of Crystallography, 63(4):306–314, 2007.

Jianwei Miao, Tetsuya Ishikawa, Qun Shen, and Thomas Earnest. Extending X-ray crystallography to allow the imaging of noncrystalline materials, cells, and single protein complexes. Annu. Rev. Phys. Chem., 59:387–410, 2008.

C Fienup and J Dainty. Phase retrieval and image reconstruction for astronomy. Image Recovery: Theory and Application, 231:275, 1987.

Joseph Goodman. Introduction to Fourier Optics. 2008.

Emmanuel J Candes, Thomas Strohmer, and Vladislav Voroninski. PhaseLift: Exact and stable signal recovery from magnitude measurements via convex programming. Communications on Pure and Applied Mathematics, 66(8):1241–1274, 2013.

Emmanuel J Candes, Xiaodong Li, and Mahdi Soltanolkotabi. Phase retrieval from coded diffraction patterns. Applied and Computational Harmonic Analysis, 39(2):277–299, 2015a.

Veit Elser, Ti-Yen Lan, and Tamir Bendory. Benchmark problems for phase retrieval. arXiv preprint arXiv:1706.00399, 2017.

Ahmad Helmi Azhar, Thomas Tran, and Dominic O'Brien. A gigabit/s indoor wireless transmission using MIMO-OFDM visible-light communications. IEEE Photonics Technology Letters, 25(2):171–174, 2013.

José Ramón Durán Retamal, Hassan Makine Oubei, Bilal Janjua, Yu-Chieh Chi, Huai-Yung Wang, Cheng-Ting Tsai, Tien Khee Ng, Dan-Hua Hsieh, Hao-Chung Kuo, Mohamed-Slim Alouini, et al.
4-Gbit/s visible light communication link based on 16-QAM OFDM transmission over remote phosphor-film converted white light by using blue laser diode. Optics Express, 23(26):33656–33666, 2015.

Ahmad Helmi Azhar, Tuan-Anh Tran, and Dominic O'Brien. Demonstration of high-speed data transmission using MIMO-OFDM visible light communications. In GLOBECOM Workshops (GC Wkshps), 2010 IEEE, pages 1052–1056. IEEE, 2010.

Emmanuel J Candes, Yonina C Eldar, Thomas Strohmer, and Vladislav Voroninski. Phase retrieval via matrix completion. SIAM Review, 57(2):225–251, 2015b.

Ali Ahmed, Benjamin Recht, and Justin Romberg. Blind deconvolution using convex programming. IEEE Transactions on Information Theory, 60(3):1711–1732, 2014.

Sohail Bahmani and Justin Romberg. Phase retrieval meets statistical learning theory: A flexible convex relaxation. In Aarti Singh and Jerry Zhu, editors, Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, volume 54 of Proceedings of Machine Learning Research, pages 252–260, Fort Lauderdale, FL, USA, 20–22 Apr 2017a. PMLR. URL http://proceedings.mlr.press/v54/bahmani17a.html.

Tom Goldstein and Christoph Studer. PhaseMax: Convex phase retrieval via basis pursuit. IEEE Transactions on Information Theory, 2018.

Alireza Aghasi, Ali Ahmed, and Paul Hand. BranchHull: Convex bilinear inversion from the entrywise product of signals with known signs. arXiv preprint arXiv:1702.04342, 2017a.

Alireza Aghasi, Ali Ahmed, and Paul Hand. Convex inversion of the entrywise product of real signals with known signs. In Signals, Systems, and Computers, 2017 51st Asilomar Conference on, pages 1622–1626. IEEE, 2017b.

Samuel Burer and Renato D.C. Monteiro. A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming, 95(2):329–357, Feb 2003. ISSN 1436-4646.
doi: 10.1007/s10107-002-0352-8. URL https://doi.org/10.1007/s10107-002-0352-8.

Vladimir Koltchinskii and Shahar Mendelson. Bounding the smallest singular value of a random matrix without concentration. International Mathematics Research Notices, 2015(23):12991–13008, 2015.

Shahar Mendelson. Learning without concentration. In Conference on Learning Theory, pages 25–39, 2014.

Guillaume Lecué, Shahar Mendelson, et al. Regularization and the small-ball method I: Sparse recovery. The Annals of Statistics, 46(2):611–641, 2018.

Guillaume Lecué and Shahar Mendelson. Regularization and the small-ball method II: Complexity dependent error rates. The Journal of Machine Learning Research, 18(1):5356–5403, 2017.

Sohail Bahmani and Justin Romberg. Anchored regression: Solving random convex equations via convex programming. arXiv preprint arXiv:1702.05327, 2017b.

Yonina C Eldar and Gitta Kutyniok. Compressed Sensing: Theory and Applications. Cambridge University Press, 2012.

Sara van de Geer and Johannes Lederer. The Bernstein–Orlicz norm and deviation inequalities. Probability Theory and Related Fields, 157(1-2):225–250, 2013.