{"title": "Superset Technique for Approximate Recovery in One-Bit Compressed Sensing", "book": "Advances in Neural Information Processing Systems", "page_first": 10387, "page_last": 10396, "abstract": "One-bit compressed sensing (1bCS) is a method of signal acquisition under extreme measurement quantization that gives important insights on the limits of signal compression and analog-to-digital conversion. The setting is also equivalent to the problem of learning a sparse hyperplane-classifier. In this paper, we propose a generic  approach for signal recovery in nonadaptive 1bCS that leads to improved sample complexity for approximate recovery for a variety of signal models, including nonnegative signals and binary signals. We construct 1bCS matrices that are universal - i.e. work for all signals under a model - and at the same time recover very general random sparse signals with high probability. In our approach, we divide the set of samples (measurements) into two parts, and use the first part to recover the superset of the support of a sparse vector. The second set of measurements is then used to approximate the signal within the superset. While support recovery in 1bCS is well-studied, recovery of superset of the support requires fewer samples, which then leads to an overall reduction in sample complexity for approximate recovery.", "full_text": "Superset Technique for Approximate Recovery in\n\nOne-Bit Compressed Sensing\n\nUniversity of Massachusetts Amherst\n\nUniversity of Massachusetts Amherst\n\nLarkin Flodin\n\nAmherst, MA 01003\n\nlflodin@cs.umass.edu\n\nVenkata Gandikota\n\nAmherst, MA 01003\n\ngandikota.venkata@gmail.com\n\nArya Mazumdar\n\nUniversity of Massachusetts Amherst\n\nAmherst, MA 01003\narya@cs.umass.edu\n\nAbstract\n\nOne-bit compressed sensing (1bCS) is a method of signal acquisition under ex-\ntreme measurement quantization that gives important insights on the limits of\nsignal compression and analog-to-digital conversion. 
The setting is also equivalent\nto the problem of learning a sparse hyperplane-classifier. In this paper, we\npropose a generic approach for signal recovery in nonadaptive 1bCS that leads\nto improved sample complexity for approximate recovery for a variety of signal\nmodels, including nonnegative signals and binary signals. We construct 1bCS\nmatrices that are universal - i.e. work for all signals under a model - and at the\nsame time recover very general random sparse signals with high probability. In\nour approach, we divide the set of samples (measurements) into two parts, and\nuse the first part to recover the superset of the support of a sparse vector. The\nsecond set of measurements is then used to approximate the signal within the\nsuperset. While support recovery in 1bCS is well-studied, recovery of superset\nof the support requires fewer samples, which then leads to an overall reduction in\nsample complexity for approximate recovery.\n\n1 Introduction\n\nSparsity is a natural property of many real-world signals. For example, image and speech signals\nare sparse in the Fourier basis, which led to the theory of compressed sensing, and more broadly,\nsampling theory [12, 7]. In some important multivariate optimization problems with many optimal\npoints, sparsity of the solution is also a measure of \u2018simplicity\u2019 and insisting on sparsity is a common\nmethod of regularization [19]. While recovering sparse vectors from linear measurements is a\nwell-studied topic, technological advances and increasing data size raise new questions. These\ninclude quantized and nonlinear signal acquisition models, such as 1-bit compressed sensing [4]. In\n1-bit compressed sensing, linear measurements of a sparse vector are quantized to only 1 bit, e.g.\nindicating whether the measurement outcome is positive or not, and the task is to recover the vector\nup to a prescribed Euclidean error with a minimum number of measurements. 
As in compressed sensing,\nthe overwhelming majority of the literature, including this paper, focuses on the nonadaptive setting\nfor the problem.\nOne of the ways to approximately recover a sparse vector from 1-bit measurements is to use a subset\nof all the measurements to identify the support of the vector. Next, the remainder of the measurements\ncan be used to approximate the vector within the support. Note that this second set of measurements\nis also predefined, and therefore the entire scheme is still nonadaptive. Such a method appears in the\ncontext of \u2018universal\u2019 matrix designs in [9, 1]. The resulting schemes are the best known, in some\nsense, but still result in a large gap between the upper and lower bounds for approximate recovery of\nvectors.\n\nPreprint. Under review.\n\nIn this paper we take steps to close these gaps, by presenting a simple yet powerful idea. Instead\nof using a subset of the measurements to recover the support of the vector exactly, we propose\nusing a (smaller) set of measurements to recover a superset of the support. The remainder of the\nmeasurements can then be used to better approximate the vector within the superset. It turns out that this\nidea, which we call the \u201csuperset technique\u201d, leads to an optimal number of measurements for universal\nschemes for several important classes of sparse vectors (for example, nonnegative vectors). We also\npresent theoretical results providing a characterization of matrices that would yield universal schemes\nfor all sparse vectors.\n\nPrior Results. While the compressed sensing framework was introduced in [7], it was not until [4]\nthat 1-bit quantization of the measurements was considered as well, to combat the fact that\ntaking real-valued measurements to arbitrary precision may not be practical in applications. 
Initially,\nthe focus was primarily on approximately reconstructing the direction of the signal x (the quantization\ndoes not preserve any information about the magnitude of the signal, so all we can hope to reconstruct\nis the direction). However, in [10] the problem of support recovery, as opposed to approximate vector\nreconstruction, was first considered and it was shown that O(k log n) measurements are sufficient to\nrecover the support of a k-sparse signal in R^n with high probability. This was subsequently shown to\nbe tight with the lower bound proven in [3].\nAll the above results assume that a new measurement matrix is constructed for each sparse signal, and\nsuccess is defined as either approximately recovering the signal up to error \u03b5 in the \u2113_2 norm (for the\napproximate vector recovery problem), or exactly recovering the support of the signal (for the support\nrecovery problem), with high probability. Generating a new matrix for each instance is not practical\nin all applications, which has led to interest in the \u201cuniversal\u201d versions of the above two problems,\nwhere a single matrix must work for support recovery or approximate recovery of all k-sparse real\nsignals, with high probability.\n\nPlan and Vershynin showed in [15] that both O((k/\u03b5^6) log(n/k)) and O((k/\u03b5^5) log^2(n/k)) measurements suffice\nfor universal approximate recovery. The dependence on \u03b5 was then improved significantly to\nO(k^3 log(n/k) + k/\u03b5) in [9], who also considered the problem of universal support recovery, and showed\nthat for that problem, O(k^3 log n) measurements are sufficient. They showed as well that if we restrict\nthe entries of the signal to be nonnegative (which is the case for many real-world signals such as\nimages), then O(k^2 log n) is sufficient for universal support recovery. 
The constructions of their\nmeasurement matrices are based primarily on combinatorial objects, specifically expanders and Union\nFree Families (UFFs).\nMost recently, [1] showed that a modified version of the UFFs used in [9] called \u201cRobust UFFs\u201d\n(RUFFs) can be used to improve the upper bound on universal support recovery to O(k^2 log n)\nfor all real-valued signals, matching the previous upper bound for nonnegative signals, and showed\nthis is nearly tight with a lower bound of \u03a9(k^2 log n / log k) for real signals. They also show that\nO(k^2 log n + k/\u03b5) measurements suffice for universal approximate recovery.\n\nIn tandem with the development of these theoretical results providing necessary and sufficient\nnumbers of measurements for support recovery and approximate vector recovery, there has been a\nsignificant body of work in other directions on 1-bit compressed sensing, such as heuristic algorithms\nthat perform well empirically, and tradeoffs between different parameters. More specifically, [11]\nintroduced a gradient-descent based algorithm called Binary Iterative Hard Thresholding (BIHT)\nwhich performs very well in practice; later, [13] gave another heuristic algorithm which performs\ncomparably well or better, and aims to allow for very efficient decoding after the measurements are\ntaken. Other papers such as [18] have studied the tradeoff between the amount of quantization of the\nsignal, and the necessary number of measurements.\n\nOur Results. We focus primarily on upper bounds in the universal setting, aiming to give constructions\nthat work with high probability for all sparse vectors. In [1], 3 major open questions are given\nregarding Universal 1-bit Compressed Sensing, which, paraphrasing, are as follows:\n\n1. 
How many measurements are necessary and sufficient for a matrix to be used to exactly\nrecover all k-sparse binary vectors?\n\n2. What is the correct complexity (in terms of number of measurements) of universal \u03b5-approximate\nvector recovery for real signals?\n\n3. Can we obtain explicit (i.e. requiring time polynomial in n and k) constructions of the\nRobust UFFs used for universal support recovery (yielding measurement matrices with\nO(k^2 log n) rows)?\n\nTable 1: Upper and lower bounds for 1bCS problems with k-sparse signals\n\nProblem | UB | Explicit UB | LB\nUniversal Support Recovery (x \u2208 R^n) | O(k^2 log n) [1] | O(k^2 log n)* | \u03a9(k^2 log n / log k) [1]\nUniversal \u03b5-approximate Recovery (x \u2208 R^n) | \u00d5(min(k^2 log(n/k) + k/\u03b5, (k/\u03b5) log(n/k))) [1], [11] | \u2013 | \u03a9(k log(n/k) + k/\u03b5) [1]\nUniversal \u03b5-approximate Recovery (x \u2208 R^n_{\u22650}) | O(k log(n/k) + k/\u03b5)* | \u2013 | \u03a9(k log(n/k))\nUniversal Exact Recovery (x \u2208 {0, 1}^n) | O(k log(n/k) + k^{3/2})* | \u2013 | \u03a9(k log(n/k))\nNon-Universal Support Recovery (x \u2208 R^n) | O(k log n) [3] | O(k log^2 n)* | \u03a9(k log(n/k)) [3]\n\n*Bound proved in this work.\n\nIn this work we make progress towards solutions to all three Open Questions. Our primary contribution\nis the \u201csuperset technique\u201d which relies on ideas from the closely related sparse recovery problem\nof group testing [8]; in particular, we show in Theorem 6 that for a large class of signals including\nall nonnegative (and thus all binary) signals, we can improve the upper bound for approximate\nrecovery by first recovering an O(k)-sized superset of the support rather than the exact support, then\nsubsequently using Gaussian measurements. 
The previous best upper bound for binary signals from\n[11] was O(k^{3/2} log n), which we improve to O(k^{3/2} + k log(n/k)), and for nonnegative signals was\nO(min(k^2 log(n/k) + k/\u03b5, (k/\u03b5) log n)), which we improve to O(k log(n/k) + k/\u03b5).\n\nRegarding Open Question 3, using results of Porat and Rothschild regarding weakly explicit constructions\nof Error-Correcting Codes (ECCs) on the Gilbert-Varshamov bound [16], we give a construction\nof Robust UFFs yielding measurement matrices for support recovery with O(k^2 log n) rows in time\nthat is polynomial in n (though not in k) in Theorem 12. Based on a similar idea, we also give a\nweakly explicit construction for non-universal approximate recovery using only slightly more measurements\nthan is optimal (O(k log^2 n) as opposed to O(k log(n/k))) in Section 4.2; to our knowledge,\nexplicit constructions in the non-universal setting have not been studied previously. Furthermore, this\nresult gives a single measurement matrix which works for almost all vectors, as opposed to typical\nnon-universal results which work with high probability for a particular vector and matrix pair.\nIn Appendix C, we give a sufficient condition generalizing the notion of RUFFs for a matrix to be\nused for universal recovery of a superset of the support for all real signals; while we do not provide\nconstructions, this seems to be a promising direction for resolving Open Question 2.\nThe best known upper and lower bounds for the various compressed sensing problems considered in\nthis work are presented in Table 1.\n2 Definitions\nWe write M_i for the ith row of the matrix M, and M_{i,j} for the entry of M in the ith row and jth\ncolumn. We write vectors x in boldface, and write x_i for the ith component of the vector x. The set\n{1, 2, ..., n} will be denoted by [n], and for any set S we write P(S) for the power set of S (i.e. 
the\nset of all subsets of S).\nWe will write supp(x) \u2286 [n] to mean the set of indices of nonzero components of x (so supp(x) =\n{i : x_i \u2260 0}), and ||x||_0 to denote |supp(x)|.\nFor a real number y, sign(y) returns 1 if y is strictly positive, -1 if y is strictly negative, and 0 if\ny = 0. While this technically returns more than one bit of information, if we had instead defined\nsign(y) to be 1 when y \u2265 0 and -1 otherwise, we could still determine whether y = 0 by looking at\nsign(y), sign(-y), so this affects the number of measurements by only a constant factor. We will\nnot concern ourselves with the constants involved in any of our results, so we have chosen to instead\nuse the more convenient definition.\nWe will sometimes refer to constructions from the similar \u201cgroup testing\u201d problem in our results.\nTo this end, we will use the symbol \u201c\u2299\u201d to represent the group testing measurement between a\nmeasurement vector and a signal vector. Specifically, for a measurement m of length n and signal x\nof length n, m \u2299 x is equal to 1 if supp(m) \u2229 supp(x) is nonempty, and 0 otherwise. We will also\nmake use of the \u201clist-disjunct\u201d matrices used in some group testing constructions.\nDefinition 1. An m \u00d7 n binary matrix M is (k, l)-list disjunct if for any two disjoint sets S, T \u2286\ncol(M) with |S| = k, |T| = l, there exists a row in M in which some column from T has a nonzero\nentry, but every column from S has a zero.\n\nThe primary use of such matrices is that in the group testing model, they can be used to recover a\nsuperset of size at most k + l of the support of any k-sparse signal x by applying a simple decoding\nto the measurement results M \u2299 x.\nIn the following definitions, we write S for a generic set that is the domain of the signal. In this paper\nwe consider signals with domain R, R_{\u22650} (nonnegative reals), and {0, 1}.\nDefinition 2. 
An m \u00d7 n measurement matrix M can be used for Universal Support Recovery of\nk-sparse x \u2208 S^n (in m measurements) if there exists a decoding function f : {-1, 0, 1}^m \u2192 P([n])\nsuch that f(sign(Mx)) = supp(x) for all x satisfying ||x||_0 \u2264 k.\nDefinition 3. An m \u00d7 n measurement matrix M can be used for Universal \u03b5-Approximate Recovery\nof k-sparse x \u2208 S^n (in m measurements) if there exists a decoding function f : {-1, 0, 1}^m \u2192 S^n\nsuch that\n\n|| x / ||x||_2 - f(sign(Mx)) / ||f(sign(Mx))||_2 ||_2 \u2264 \u03b5,\n\nfor all x with ||x||_0 \u2264 k.\n3 Upper Bounds for Universal Approximate Recovery\nHere we present our main result, an upper bound on the number of measurements needed to perform\nuniversal \u03b5-approximate recovery for a large class of real vectors that includes all binary vectors and\nall nonnegative vectors. The general technique will be to first use what are known as \u201clist-disjunct\u201d\nmatrices from the group testing literature to recover a superset of the support of the signal, then use\nGaussian measurements to approximate the signal within the superset. Because the measurements in\nthe second part are Gaussian, we can perform the recovery within the (initially unknown) superset\nnonadaptively. When restricting to the class of binary or nonnegative signals, our upper bound\nimproves on existing results and is close to known lower bounds.\nFirst, we need a lemma stating the necessary and sufficient conditions on a signal vector x in\norder to be able to reconstruct the results of a single group testing measurement m \u2299 x using sign\nmeasurements. To concisely state the condition, we introduce some notation: for a subset S \u2286 [n]\nand vector x of length n, we write x|_S to mean the restriction of x to the indices of S.\nLemma 1. Let m \u2208 {0, 1}^n and x \u2208 R^n. Define S = supp(m) \u2229 supp(x). 
If either S is empty or\nS is nonempty and m|_S^T x|_S \u2260 0, we can reconstruct the result of the group testing measurement\nm \u2299 x from the sign measurement sign(m^T x).\nProof. We observe sign(m^T x) and based on that must determine the value of m \u2299 x, or equivalently\nwhether S is empty or nonempty. If sign(m^T x) \u2260 0 then m^T x \u2260 0, so S is nonempty and\nm \u2299 x = 1. Otherwise we have sign(m^T x) = 0, in which case we must have m^T x = 0. If S were\nnonempty then we would have m|_S^T x|_S = 0, contradicting our assumption. Therefore in this case\nwe must have S empty and m \u2299 x = 0, so for x satisfying the above condition we can reconstruct\nthe results of a group testing measurement.\n\nFor convenience, we use the following property to mean that a signal x has the necessary property\nfrom Lemma 1 with respect to every row of a matrix M.\nProperty 1. Let M be an m \u00d7 n matrix, and x a signal of length n. Define S_i = supp(M_i) \u2229 supp(x).\nThen for every row M_i of M, either S_i is empty, or M_i|_{S_i}^T x|_{S_i} \u2260 0.\nCorollary 2. Let M be a (k, l)-list disjunct matrix, and x \u2208 R^n be a k-sparse real signal. If\nProperty 1 holds for M and x, then we can use the measurement matrix M to recover a superset of\nsize at most k + l of the support of x using sign measurements.\nCombining this corollary with results of [6], there exist matrices with O(k log(n/k)) rows which we\ncan use to recover an O(k)-sized superset of the support of x using sign measurements, provided x\nsatisfies the above condition. Strongly explicit constructions of these matrices exist also, although\nrequiring O(k^{1+o(1)} log n) rows [5].\n\nThe other result we need is one that tells us how many Gaussian measurements are necessary to\napproximately recover a real signal using maximum likelihood decoding. Similar results have\nappeared elsewhere, such as [11], but we include the proof for completeness.\nLemma 3. 
There exists a measurement matrix A for 1-bit compressed sensing such that for every\npair of k-sparse x, y \u2208 R^n with ||x||_2 = ||y||_2 = 1, sign(Ax) \u2260 sign(Ay) whenever ||x - y||_2 > \u03b5,\nprovided that\n\nm = O((k/\u03b5) log(n/k)).\n\nWe will make use of the following facts in the proof.\nFact 4. For all x \u2208 R, 1 - x \u2264 e^{-x}.\nFact 5. For all x \u2208 [0, 1], cos^{-1}(x) \u2265 sqrt(2(1 - x)).\nProof of Lemma 3. Let A \u223c N^{m \u00d7 n}(0, 1). For a measurement to separate x and y, it is necessary\nthat the hyperplane corresponding to some row a of A lies between x and y. Thus our goal here is to\nshow that if we take m to be large enough, all pairs of points at distance > \u03b5 will be separated\nwith high probability. Since the rows of A are chosen independently and have Gaussian entries, they\nare spherically symmetric, and thus the probability that the random hyperplane a lies between x and\ny is proportional to the angle between them. Let ||x - y||_2 > \u03b5, then we start out by upper bounding\nthe probability that no measurement separates a particular pair x and y.\nBefore beginning, recall that for unit vectors 1 - x^T y = ||x - y||_2^2 / 2, so given that ||x - y||_2 > \u03b5,\nwe have x^T y < 1 - \u03b5^2/2.\n\nPr[sign(ax) = sign(ay)] = 1 - cos^{-1}(x^T y)/\u03c0\n< 1 - cos^{-1}(1 - \u03b5^2/2)/\u03c0\n\u2264 exp(-cos^{-1}(1 - \u03b5^2/2)/\u03c0) (from Fact 4)\n\u2264 exp(-\u03b5/\u03c0) (from Fact 5).\n\nAs there are m independent measurements, the probability that x and y are not separated by any of\nthe m measurements is at most\n\nexp(-m\u03b5/\u03c0),\n\nso union bounding over all (n choose k)^2 pairs of k-sparse x and y, the total probability of error is strictly less\nthan\n\n(n choose k)^2 exp(-m\u03b5/\u03c0).\n\nThis probability becomes less than 1 for m \u2265 (\u03c0/\u03b5)(2k) log(n/k), so with this number of measurements\nthere exists a matrix that can perform \u03b5-approximate recovery for all pairs of sparse vectors.\n\n
Note that in the case that we already have a superset of the support of size O(k), the previous result\ntells us there exists a matrix with O((k/\u03b5) log(O(k)/k)) = O(k/\u03b5) rows which can be used to perform\n\u03b5-approximate recovery within the superset. We can do this even nonadaptively, because the rows of\nthe matrix for approximate recovery are Gaussian. Combining this with Corollary 2 and the group\ntesting constructions of [6], we have the following theorem.\n\nTheorem 6. Let M = [M^(1); M^(2)] (the rows of M^(1) stacked above those of M^(2)) where M^(1) is a (k, O(k))-list disjunct matrix with O(k log(n/k))\nrows, and M^(2) is a matrix with O(k/\u03b5) rows that can be used for \u03b5-approximate recovery within the\nsuperset as in Lemma 3, so M consists of O(k log(n/k) + k/\u03b5) rows. Let x \u2208 R^n be a k-sparse signal.\nIf Property 1 holds for M^(1) and x, then M can be used for \u03b5-approximate recovery of x.\n\nRemark. We note that the class of signal vectors x which satisfy the condition in Theorem 6 is\nactually quite large, in the sense that there is a natural probability distribution over all sparse signals\nx for which vectors violating the condition occur with probability 0. The details are laid out in\nLemma 14.\n\nAs special cases, we have improved upper bounds for nonnegative and binary signals. For ease of\ncomparison with the other results, we assume the binary signal is rescaled to have unit norm, so has\nall entries either 0 or equal to 1/sqrt(||x||_0).\nCorollary 7. Let M = [M^(1); M^(2)] where M^(1) is a (k, O(k))-list disjunct matrix with O(k log(n/k))\nrows, and M^(2) is a matrix with O(k/\u03b5) rows that can be used for \u03b5-approximate recovery within the\nsuperset as in Lemma 3, so M consists of O(k log(n/k) + k/\u03b5) rows. Let x \u2208 R^n be a k-sparse signal.\nIf all entries of x are nonnegative, then M can be used for \u03b5-approximate recovery of x.\n\nProof. 
In light of Theorem 6, we need only note that as all entries of M^(1) and x are nonnegative,\nProperty 1 is satisfied for M^(1) and x.\n\nCorollary 8. Let M = [M^(1); M^(2)] where M^(1) is a (k, O(k))-list disjunct matrix with O(k log(n/k))\nrows, and M^(2) is a matrix with O(k^{3/2}) rows that can be used for \u03b5-approximate recovery (with\n\u03b5 < 1/sqrt(k)) within the superset as in Lemma 3, so M consists of O(k log(n/k) + k^{3/2}) rows. Let\nx \u2208 R^n be the k-sparse signal vector. If all nonzero entries of x are equal, then M can be used for\nexact recovery of x.\nProof. Here we use the fact that if we perform \u03b5-approximate recovery using \u03b5 < 1/sqrt(k) then, as\nthe minimum possible distance between any two k-sparse rescaled binary vectors is 1/sqrt(k), we will\nrecover the signal vector exactly.\n\n4 Explicit Constructions\n4.1 Explicit Robust UFFs from Error-Correcting Codes\nIn this section we explain how to combine several existing results in order to explicitly construct\nRobust UFFs that can be used for support recovery of real vectors. This partially answers Open\nQuestion 3 from [1].\nDefinition 4. A family of sets F = {B_1, B_2, ..., B_n} with each B_i \u2286 [m] is an (n, m, d, k, \u03b1)-\nRobust-UFF if |B_i| = d for all i, and for every distinct j_0, j_1, ..., j_k \u2208 [n], |B_{j_0} \u2229 (B_{j_1} \u222a B_{j_2} \u222a ... \u222a\nB_{j_k})| < \u03b1 |B_{j_0}|.\nIt is shown in [1] that nonexplicit (n, m, d, k, 1/2)-Robust UFFs exist with m = O(k^2 log n), d =\nO(k log n) which can be used to exactly recover the support of any k-sparse real vector of length n\nin m measurements.\nThe results we will need are the following, where the q-ary entropy function H_q is defined as\n\nH_q(x) = x log_q(q - 1) - x log_q x - (1 - x) log_q(1 - x).\n\nTheorem 9 ([16] Thm. 2). Let q be a prime power, m and k positive integers, and \u03b4 \u2208 [0, 1]. Then if\nk \u2264 (1 - H_q(\u03b4))m, we can construct a q-ary linear code with rate k/m and relative distance \u03b4 in time\nO(m q^k).\nTheorem 10 ([1] Prop. 17). 
Given a q-ary error correcting code with rate r and relative distance\n(1 - \u03b4), we can construct a (q^{rd}, qd, d, 1, \u03b4)-Robust-UFF.\nTheorem 11 ([1] Prop. 15). If F is an (n, m, d, 1, \u03b1/k)-Robust-UFF, then F is also an\n(n, m, d, k, \u03b1)-Robust-UFF.\n\nBy combining the above three results, we have the following.\n\nTheorem 12. We can explicitly construct an (n, m, d, k, \u03b1)-Robust UFF with m = O(k^2 log n / \u03b1^2) and\nd = O(k log n / \u03b1) in time O((k/\u03b1)^k).\nProof. First, we instantiate Theorem 9 to obtain a q-ary code C of length d with q = O(k/\u03b1), relative\ndistance \u03b4 = (k - \u03b1)/k, and rate r = 1 - H_q(\u03b4) in time O(q^k).\nApplying Theorem 10 to this code results in an (n, m, d, 1, \u03b4')-Robust-UFF F where n = q^{rd},\nm = qd, \u03b4' = 1 - \u03b4. By Theorem 11, F is also an (n, m, d, k, k\u03b4')-Robust UFF. Plugging back in\nthe parameters of the original code,\n\nm = qd = q log n / (r log q) = q log n / ((1 - H_q((k - \u03b1)/k)) log q) = O(k^2 log n / \u03b1^2),\n\nk\u03b4' = (1 - \u03b4)k = (1 - (k - \u03b1)/k)k = k - (k - \u03b1) = \u03b1.\n\nWhile the time needed for this construction is not polynomial in k (and therefore the construction is\nnot strongly explicit) as asked for in Open Question 3 of [1], this at least demonstrates that there exist\ncodes with sufficiently good parameters to yield Robust UFFs with m = O(k^2 log n).\n\n4.2 Non-Universal Approximate Recovery\nInstead of requiring our measurement matrices to be able to recover all k-sparse signals simultaneously\n(i.e. to be universal), we can require only that they are able to recover \u201cmost\u201d k-sparse\nsignals. Specifically, in this section we will assume that the sparse signal is generated in the following\nway: first a set of k indices is chosen to be the support of the signal uniformly at random. Then, the\nsignal is chosen to be a uniformly random vector from the unit sphere on those k indices. 
We relax\nthe requirement that the supports of all k-sparse signals can be recovered exactly (by some decoding)\nto the requirement that we can identify the support of a k-sparse signal with probability at least 1 - \u03b4,\nwhere \u03b4 \u2208 [0, 1). Note that even when \u03b4 = 0, this is a weaker condition than universality, as the\nspace of possible k-sparse signals is infinite.\nIt is shown in [3] that a random matrix construction using O(k log n) measurements suffices to\nrecover the support with error probability approaching 0 as k and n approach infinity. The following\ntheorem shows that we can explicitly construct a matrix which works in this setting, at the cost of\nslightly more measurements (about O(k log^2(n))).\n\nTheorem 13. We can explicitly construct measurement matrices for Support Recovery (of real\nvectors) with m = O((k log(n) / log k) log(n/\u03b4)) rows that can exactly determine the support of a k-sparse\nsignal with probability at least 1 - \u03b4, where the signals are generated by first choosing the size k\nsupport uniformly at random, then choosing the signal to be a uniformly random vector on the sphere\non those k coordinates.\n\nTo prove this theorem, we need a lemma which explains how we can use sign measurements to\n\u201csimulate\u201d group testing measurements with high probability. Both the result and proof are similar\nto Lemma 1, with the main difference being that given the distribution described above, the vectors\nviolating the necessary condition in Lemma 1 occur with zero probability and so can be safely ignored.\nFor this lemma, we do not need the further assumption made in Theorem 13 that the distribution over\nsupport sets is uniform. The proof is presented in Appendix A.\nLemma 14. 
Suppose we have a measurement vector m \u2208 {0, 1}^n, and a k-sparse signal x \u2208 R^n.\nThe signal x is generated randomly by first picking a subset of size k from [n] (using any distribution)\nto be the support, then taking x to be a uniformly random vector on the sphere on those k coordinates.\nThen from sign(m^T x), we can determine the value of m \u2299 x with probability 1.\nAs the above argument works with probability 1, we can easily extend it to an entire measurement\nmatrix M with any finite number of rows by a union bound, and recover all the group testing\nmeasurement results M \u2299 x with probability 1 as well. This means we can leverage the following\nresult from [14]:\n\nTheorem 15 ([14] Thm. 5). When x \u2208 {0, 1}^n is drawn uniformly at random among all k-sparse\nbinary vectors, there exists an explicitly constructible group testing matrix M with m =\nO((k / log k) log(n) log(n/\u03b4)) rows which can exactly identify x from observing the measurement results\nM \u2299 x with probability at least 1 - \u03b4.\nCombining this with the lemma above, we can use the matrix M from Theorem 15 with m =\nO((k / log k) log(n) log(n/\u03b4)) rows (now representing sign measurements) to exactly determine the support\nof x with probability at least 1 - \u03b4; we first use Lemma 14 to recover the results of the group testing\ntests M \u2299 x with probability 1, and can then apply the above theorem using the results of the group\ntesting measurements.\nWe can also use this construction for approximate recovery rather than support recovery using\nLemma 3, by appending O(k/\u03b5) rows of Gaussian measurements to M, first recovering the exact\nsupport, then doing approximate recovery within that support. This gives a matrix with about\nO(k log^2(n) + k/\u03b5) rows for non-universal approximate recovery of real signals, where the top\nportion is explicit.\nRemark. 
Above, we have shown that in the non-universal setting, we can use constructions from\ngroup testing to recover the exact support with high probability, and then subsequently perform\napproximate recovery within that exact support. If we are interested only in performing approximate\nrecovery, we can apply our superset technique here as well; Lemma 14 implies also that using\na (k, O(k))-list disjunct matrix we can with probability 1 recover an O(k)-sized superset of the\nsupport, and such matrices exist with O(k log(n/k)) rows. Following this, we can use O(k/\u03b5) more\nGaussian measurements to recover the signal within the superset. This gives a non-universal matrix\nwith O(k log(n/k) + k/\u03b5) rows for approximate recovery, the top part of which can be made strongly\nexplicit with only slightly more measurements (O(k^{1+o(1)} log(n/k)) vs. O(k log(n/k))).\n\n5 Experiments\nIn this section, we present some empirical results relating to the use of our superset technique in\napproximate vector recovery for real-valued signals. To do so, we compare the average error (in \u2113_2\nnorm) of the reconstructed vector from using an \u201call Gaussian\u201d measurement matrix to first using\na small number of measurements to recover a superset of the support of the signal, then using the\nremainder of the measurements to recover the signal within that superset via Gaussian measurements.\nWe have used the well-known BIHT algorithm of [11] for recovery of the vector both using the all\nGaussian matrix and within the superset, but we emphasize that this superset technique is highly\ngeneral, and could just as easily be applied on top of other decoding algorithms that use only Gaussian\nmeasurements, such as the \u201cQCoSaMP\u201d algorithm of [17].\n\n
To generate random signals x, we first choose a size k support uniformly at random among the (n choose k)\npossibilities, then for each coordinate in the chosen support, generate a random value from N(0, 1).\nThe vector is then rescaled so that ||x||_2 = 1.\nFor the dotted lines in Figure 1 labeled \u201call Gaussian,\u201d for each value of (n, m, k) we performed 500\ntrials in which we generated an m \u00d7 n matrix with all entries drawn from N(0, 1). We then used BIHT (run\neither until convergence or 1000 iterations, as there is no convergence guarantee) to recover the signal\nfrom the measurement matrix and measurement outcomes.\nFor the solid lines in Figure 1 labeled \u201c4k log n Superset,\u201d we again performed 500 trials for each\nvalue of (n, m, k) where in each trial we generated a measurement matrix M = [M^(1); M^(2)] with m\nrows in total. Each entry of M^(1) is a Bernoulli random variable that takes value 1 with probability\n1/(k+1) and value 0 with probability k/(k+1); there is evidence from the group testing literature [3, 2] that\nthis probability is near-optimal in some regimes, and it appears also to perform well in practice;\nsee Appendix B for some empirical evidence. The entries of M^(2) are drawn from N(0, 1). We\nuse a standard group testing decoding (i.e., remove any coordinates that appear in a test with result\n0) to determine a superset based on y_1 = sign(M^(1) x), then use BIHT (again run either until\nconvergence or 1000 iterations) to reconstruct x within the superset using the measurement results\ny_2 = sign(M^(2) x). 
The number of rows in M^(1) is taken to be m_1 = 4k log_10(n), based on the fact that with high probability Ck log n rows for some constant C should be sufficient to recover an O(k)-sized superset, and the remainder m_2 = m \u2212 m_1 of the measurements are used in M^(2).\n\nFigure 1: Average error of reconstruction for different sparsity levels, with and without use of a matrix for superset of support recovery. Panels: (a) n = 1000, k = 5; (b) n = 1000, k = 10; (c) n = 1000, k = 20; (d) n = 1000, k = 40.\n\nWe display data only for larger values of m, to ensure there are sufficiently many rows in both portions of the measurement matrix. From Figure 1 one can see that in this regime, using a small number of measurements to first recover a superset of the support provides a modest improvement in reconstruction error compared to the alternative. In the higher-error regime, when there are simply not enough measurements to obtain an accurate reconstruction, as can be seen on the left side of the graph in Figure 1d, the two methods perform about the same. In the empirical setting, our superset of support recovery technique can be viewed as a very flexible and low-overhead method of extending other existing 1bCS algorithms which use only Gaussian measurements, which are quite common.\n\nAcknowledgements: This research is supported in part by NSF CCF awards 1618512, 1642658, and 1642550 and the UMass Center for Data Science.\n\nReferences\n\n[1] Jayadev Acharya, Arnab Bhattacharyya, and Pritish Kamath. Improved bounds for universal one-bit compressive sensing. In 2017 IEEE International Symposium on Information Theory (ISIT), pages 2353\u20132357. IEEE, 2017.\n\n[2] Matthew Aldridge, Leonardo Baldassini, and Oliver Johnson. Group testing algorithms: Bounds and simulations. IEEE Trans. Information Theory, 60(6):3671\u20133687, 2014.\n\n[3] George K. Atia and Venkatesh Saligrama. Boolean compressed sensing and noisy group testing. IEEE Trans.\n
Information Theory, 58(3):1880\u20131901, 2012.\n\n[4] Petros Boufounos and Richard G. Baraniuk. 1-bit compressive sensing. In 42nd Annual Conference on Information Sciences and Systems, CISS 2008, Princeton, NJ, USA, 19-21 March 2008, pages 16\u201321. IEEE, 2008.\n\n[5] Mahdi Cheraghchi. Noise-resilient group testing: Limitations and constructions. Discrete Applied Mathematics, 161(1-2):81\u201395, 2013.\n\n[6] Annalisa De Bonis, Leszek Gasieniec, and Ugo Vaccaro. Optimal two-stage algorithms for group testing problems. SIAM Journal on Computing, 34(5):1253\u20131270, 2005.\n\n[7] David L. Donoho. Compressed sensing. IEEE Trans. Information Theory, 52(4):1289\u20131306, 2006.\n\n[8] D. Du and F. Hwang. Combinatorial Group Testing and Its Applications. Applied Mathematics. World Scientific, 2000.\n\n[9] Sivakant Gopi, Praneeth Netrapalli, Prateek Jain, and Aditya Nori. One-bit compressed sensing: Provable support and vector recovery. In International Conference on Machine Learning, pages 154\u2013162, 2013.\n\n[10] Jarvis D. Haupt and Richard G. Baraniuk. Robust support recovery using sparse compressive sensing matrices. In 45th Annual Conference on Information Sciences and Systems, CISS 2011, The John Hopkins University, Baltimore, MD, USA, 23-25 March 2011, pages 1\u20136. IEEE, 2011.\n\n[11] Laurent Jacques, Jason N Laska, Petros T Boufounos, and Richard G Baraniuk. Robust 1-bit compressive sensing via binary stable embeddings of sparse vectors. IEEE Transactions on Information Theory, 59(4):2082\u20132102, 2013.\n\n[12] HJ Landau. Sampling, data transmission, and the Nyquist rate. Proceedings of the IEEE, 55(10):1701\u20131706, 1967.\n\n[13] Ping Li. One scan 1-bit compressed sensing. In Arthur Gretton and Christian C. Robert, editors, Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, AISTATS 2016, Cadiz, Spain, May 9-11, 2016, volume 51 of JMLR Workshop and Conference Proceedings, pages 1515\u20131523. JMLR.org, 2016.\n\n[14] Arya Mazumdar. Nonadaptive group testing with random set of defectives. IEEE Trans. Information Theory, 62(12):7522\u20137531, 2016.\n\n[15] Yaniv Plan and Roman Vershynin. Robust 1-bit compressed sensing and sparse logistic regression: A convex programming approach. IEEE Trans. Information Theory, 59(1):482\u2013494, 2013.\n\n[16] Ely Porat and Amir Rothschild. Explicit nonadaptive combinatorial group testing schemes. IEEE Trans. Information Theory, 57(12):7982\u20137989, 2011.\n\n[17] Hao-Jun Michael Shi, Mindy Case, Xiaoyi Gu, Shenyinying Tu, and Deanna Needell. Methods for quantized compressed sensing. In 2016 Information Theory and Applications Workshop, ITA 2016, La Jolla, CA, USA, January 31 - February 5, 2016, pages 1\u20139. IEEE, 2016.\n\n[18] Martin Slawski and Ping Li. b-bit marginal regression. In Corinna Cortes, Neil D. Lawrence, Daniel D. Lee, Masashi Sugiyama, and Roman Garnett, editors, Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, pages 2062\u20132070, 2015.\n\n[19] Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1):267\u2013288, 1996.\n", "award": [], "sourceid": 5490, "authors": [{"given_name": "Larkin", "family_name": "Flodin", "institution": "University of Massachusetts Amherst"}, {"given_name": "Venkata", "family_name": "Gandikota", "institution": "University of Massachusetts, Amherst"}, {"given_name": "Arya", "family_name": "Mazumdar", "institution": "University of Massachusetts Amherst"}]}