{"title": "Linear readout from a neural population with partial correlation data", "book": "Advances in Neural Information Processing Systems", "page_first": 2469, "page_last": 2477, "abstract": "How much information does a neural population convey about a stimulus? Answers to this question are known to strongly depend on the correlation of response variability in neural populations. These noise correlations, however, are essentially immeasurable, as the number of parameters in a noise correlation matrix grows quadratically with population size. Here, we suggest bypassing this problem by imposing a parametric model on the noise correlation matrix. Our basic assumption is that noise correlations arise due to common inputs between neurons. On average, noise correlations will therefore reflect signal correlations, which can be measured in neural populations. We suggest an explicit parametric dependency between signal and noise correlations. We show how this dependency can be used to \"fill the gaps\" in noise correlation matrices using an iterative application of the Wishart distribution over positive definite matrices. We apply our method to data from the primary somatosensory cortex of monkeys performing a two-alternative forced-choice task. 
We compare the discrimination thresholds read out from the population of recorded neurons with the discrimination threshold of the monkey and show that our method predicts different results than simpler, average schemes of noise correlations.", "full_text": "Linear readout from a neural population with partial correlation data\n\nAdrien Wohrer(1), Ranulfo Romo(2), Christian Machens(1)\n\n(1) Group for Neural Theory\nLaboratoire de Neurosciences Cognitives\n\u00c9cole Normale Sup\u00e9rieure\n75005 Paris, France\n{adrien.wohrer,christian.machens}@ens.fr\n\n(2) Instituto de Fisiolog\u00eda Celular\nUniversidad Nacional Aut\u00f3noma de M\u00e9xico\nMexico City, Mexico\nrromo@ifc.unam.mx\n\nAbstract\n\nHow much information does a neural population convey about a stimulus? Answers to this question are known to strongly depend on the correlation of response variability in neural populations. These noise correlations, however, are essentially immeasurable, as the number of parameters in a noise correlation matrix grows quadratically with population size. Here, we suggest bypassing this problem by imposing a parametric model on the noise correlation matrix. Our basic assumption is that noise correlations arise due to common inputs between neurons. On average, noise correlations will therefore reflect signal correlations, which can be measured in neural populations. We suggest an explicit parametric dependency between signal and noise correlations. We show how this dependency can be used to \u201cfill the gaps\u201d in noise correlation matrices using an iterative application of the Wishart distribution over positive definite matrices. We apply our method to data from the primary somatosensory cortex of monkeys performing a two-alternative forced-choice task. 
We compare the discrimination thresholds read out from the population of recorded neurons with the discrimination threshold of the monkey and show that our method predicts different results than simpler, average schemes of noise correlations.\n\n1 Introduction\n\nIn the field of population coding, a recurring question is the impact on coding efficiency of so-called noise correlations, i.e., trial-to-trial covariation of different neurons\u2019 activities due to shared connectivity. Noise correlations have been proposed to be either detrimental or beneficial to the quantity of information conveyed by a population [1, 2, 3]. Also, some proposed neural coding schemes, such as those based on synchronous spike waves, fundamentally rely on second- and higher-order correlations in the population\u2019s spikes [4].\n\nThe problem of noise correlations is made particularly difficult by its high dimensionality along two distinct axes: time, and number of neurons. Ideally, one should describe the probabilistic structure of any set of spike trains, at any times, for any ensemble of neurons in the population; which is clearly impossible experimentally. As a result, when recording from a population of neurons with a finite number of trials, one only has access to very partial correlation data. First, studies based on experimental data are most often limited to second-order (pairwise) correlations. Second, the temporal correlation structure is generally simplified (e.g., by assuming stationarity) or forgotten altogether (by studying only correlations in overall spike counts). Third and most importantly, even with modern multi-electrode arrays, one is limited in the number of neurons which can be recorded simultaneously during an experiment. 
Thus, when data are pooled over experiments involving different neurons, most pairwise noise correlation indices remain unknown. In consequence, there is always a strong need to \u201cfill the gaps\u201d in the partial correlation data extracted experimentally from a population.\n\nIn contrast to noise-correlation data, the first-order probabilistic data are easily extracted from a population: They simply consist of the trial-averaged firing rates of the neurons, generally referred to as their \u201csignal\u201d. In particular, one can easily measure so-called signal correlations, which quantify how different neurons\u2019 trial-averaged firing rates covary with changes in the stimulus.\n\nIn this paper, we propose a method to \u201cfill the gaps\u201d in noise correlation data, based on signal correlation data. This approach can be summarized by the notion that \u201csimilar tuning reveals shared inputs\u201d. Indeed, noise correlations reveal a proximity of connection between neurons (through shared inputs and/or reciprocal connections) which, in turn, will generally result in some covariation of the neurons\u2019 first-order responses to stimuli. When browsing through neural pairs in the population, one should thus expect to find a statistical link between their signal and noise correlations; and this has indeed been reported several times [5, 6]. If this statistical structure is well described, it can serve as a basis to randomly generate noise correlation structures compatible with the measured signal correlation. Furthermore, to assess the impact of this randomness, one can perform repeated draws of potential noise correlation structures, each time observing the resulting impact on the coding capacity of the population. 
Then, this method will provide reliable estimates (average + error bar) of the impact of noise correlations on population coding, given partial noise correlation data.\n\nWe present this general approach in a simplified setting in Section 2. The input stimulus is a single parameter which can take a finite number of values. The population\u2019s response is summarized by a single number for each neuron (its mean firing rate during the trial), so that in turn a correlation structure is simply given by a symmetric, positive N\u00d7N matrix. In Section 3, we detail the method used to generate random noise correlation matrices compatible with the population\u2019s signal correlation, which we believe to be novel. In Section 4, we apply this procedure to assess the amount of information about the stimulus in the somatosensory cortex of macaques responding to tactile stimulation.\n\n2 Model of the neural population\n\nPopulation activity R. We consider a population of N neurons tested over a discrete set of possible stimuli f \u2208 {f1, . . . , fK}, lasting for a period of time T. The spike train of neuron i can be described by a series of Dirac pulses Si(t) = \u2211_{k=1}^{n_i} \u03b4(t \u2212 t_k^{(i)}). Due to trial-to-trial variability, the number of emitted spikes ni and the spike times t_k^{(i)} are random variables, whose distribution depends (amongst other things) on the value of stimulus f.\n\nAt each trial, information about f can be extracted from the spike trains Si(t) using several possible readout mechanisms. In this article, we limit ourselves to the simplest type of readout: The population activity is summarized by the N-dimensional vector R = {Ri}i=1...N, where Ri = ni/T is the mean firing rate of neuron i on this trial. A more plausible readout, based on sliding-window estimates of the instantaneous firing rate, has been presented elsewhere [7].\n\nFirst-moment measurements. 
Given a particular stimulus f, we note \u03bbi(t, f) the probability of observing a spike from neuron i at time t regardless of other neurons\u2019 spikes (i.e., the first-moment density, in the nomenclature of point processes): E(Si(t) | f) = \u03bbi(t, f). Experimentally, \u03bbi(t, f) is measured fairly easily, as the trial-averaged firing rate of neuron i in stimulus condition f.\n\nSince Ri = (1/T) \u2211_{t=0}^{T} Si(t), its expectation is given by\n\nE(Ri | f) = (1/T) \u2211_{t=0}^{T} \u03bbi(t, f) \u225c \u03bbi(f). (1)\n\nThis function of f is generally called the tuning curve of neuron i.\n\nThe trial-averaged firing rates \u03bbi(t, f) can also be used to define the signal correlation matrix \u03c3 = {\u03c3ij}i,j=1...N, as:\n\n\u03c3ij = [\u2211_{f,t} \u03bbi(t, f) \u03bbj(t, f) \u2212 KT \u03bb\u0304i \u03bb\u0304j] / \u221a([\u2211_{f,t} \u03bbi(t, f)\u00b2 \u2212 KT \u03bb\u0304i\u00b2][\u2211_{f,t} \u03bbj(t, f)\u00b2 \u2212 KT \u03bb\u0304j\u00b2]),\n\nwhere \u03bb\u0304i = 1/(KT) \u2211_{f,t} \u03bbi(t, f) is the overall average firing rate of neuron i across trials and stimuli. The Pearson correlation \u03c3ij measures how much the first-order responses of neurons i and j \u201clook alike\u201d, both in their temporal course and across stimuli. Being a correlation matrix, \u03c3 is positive definite, with 1s on its diagonal, and off-diagonal elements between \u22121 and 1. As opposed to most studies, which define signal correlation only based on tuning curves, it is important for our purpose to also include the time course of the response in the measure of signal similarity. Indeed, similar temporal courses are more likely to reveal shared input, and thus possible noise correlation.\n\nA model for noise correlations. 
While first-moment (\u201csignal\u201d) statistics can be measured experimentally with good precision, second-moment statistics (noise correlations) can never be totally measured in a large population. For this reason, a parametric model must be introduced that will allow us to infer the correlation parameters that could not be measured.\n\nWe introduce a simple model in which the noise correlation matrix \u03c1 is independent of stimulus f: For a given stimulus f, the population activity R is supposed to follow the multivariate Gaussian N(\u00b5(f), Q(f)), with\n\n\u00b5i(f) = \u03bbi(f), (2)\n\nQij(f) = \u03c1ij \u221a(\u03bbi(f) \u03bbj(f)). (3)\n\nLet us make a few remarks about this model. The first line is imposed by eq. (1). The second line implies that var(Ri | f) = Qii(f) = \u03bbi(f), meaning that all neurons in this model are supposed to have a Fano factor of one. This model is the simplest possible for our purpose, as its only free parameter is the chosen noise correlation matrix \u03c1, and it has often been used in the literature [8]. Naturally, the assumption of Gaussianity is a simplifying approximation, as the values for R really come from a discretized spike count.\n\n3 Inferring the full noise correlation structure\n\n3.1 Statistical link between signal and noise correlation\n\nWe propose that, across all pairs (i, j) of distinct cells in the population, the noise correlation index is linked to the signal correlation index by the following statistical relationship:\n\n\u03c1ij \u223c N(F(\u03c3ij), c\u00b2), (4)\n\nwhere the function F(\u03c3ij) provides the expected value for \u03c1ij if \u03c3ij is known, and c measures the statistical variations of \u03c1ij across pairs of cells sharing the same signal correlation \u03c3ij. 
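As an illustrative aside, the population model of eqs. (2)-(3) is easy to sample numerically. The following is a minimal sketch, not the authors' code; the tuning values `lam` and the correlation matrix `rho` are made up for the example:

```python
import numpy as np

def sample_population_response(lam_f, rho, rng):
    """One trial of population rates R ~ N(mu(f), Q(f)), with
    mu_i(f) = lambda_i(f) and Q_ij(f) = rho_ij * sqrt(lambda_i lambda_j),
    i.e. unit Fano factor (eqs. 2-3)."""
    lam_f = np.asarray(lam_f, dtype=float)
    sd = np.sqrt(lam_f)                # per-neuron std = sqrt(rate)
    Q = rho * np.outer(sd, sd)         # noise covariance matrix Q(f)
    return rng.multivariate_normal(lam_f, Q)

rng = np.random.default_rng(0)
lam = np.array([10.0, 20.0, 30.0])     # hypothetical tuning values at one stimulus
rho = np.array([[1.0, 0.2, 0.1],
                [0.2, 1.0, 0.3],
                [0.1, 0.3, 1.0]])      # hypothetical noise correlation matrix
R = sample_population_response(lam, rho, rng)
```

Averaged over many trials, the empirical mean and variance of R recover \u03bbi(f), as required by the unit Fano factor assumption.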
By extension, we note F(\u03c3) the matrix with 1s on its diagonal, and non-diagonal elements F(\u03c3ij).\n\nThe choice of F and c is dictated by the experimental data under study. In our case, these are neural recordings in the primary somatosensory cortex (S1) of monkeys performing a frequency discrimination task (see Section 4). For all pairs (i, j) of simultaneously recorded neurons (several hundred pairs in total), we computed the two correlation coefficients (\u03c3ij, \u03c1ij). This allowed us to compute an experimental estimate of the distribution of \u03c1ij given \u03c3ij (Figure 1). We find that\n\nF(x) = b + a exp(\u03b1(x \u2212 1)) (5)\n\nprovides a good fit, with a \u2248 0.6, \u03b1 \u2248 2.5 and b \u2248 0.05. For the standard deviation in eq. (4), we choose c = 0.1. This value is slightly reduced compared to the experimental data (Figure 1, white vs. yellow confidence intervals), because part of the variability of \u03c1ij observed experimentally is due to finite-sample errors in its measurement.\n\nFigure 1: Statistical link between signal and noise correlations. A: Experimental distribution of (\u03c3ij, \u03c1ij) across simultaneously recorded neural pairs in population data from cortical area S1 (dark gray: noise correlation coefficients significantly different from 0). B: Same data transformed into a conditional distribution for \u03c1ij given \u03c3ij. Solid lines: experimental mean (green) and error bars (white). Dotted lines: model mean F(\u03c3ij) (red) and standard deviation c (yellow).\n\n
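In code, the fitted link of eq. (5) and the corresponding \u201caverage\u201d matrix F(\u03c3) could be sketched as follows (the parameter values are those reported above; the function and variable names are ours):

```python
import numpy as np

# Parameters of the fitted signal-to-noise-correlation link (eq. 5)
a, alpha, b, c = 0.6, 2.5, 0.05, 0.1

def F(x):
    """Expected noise correlation given a signal correlation x (eq. 5)."""
    return b + a * np.exp(alpha * (np.asarray(x, dtype=float) - 1.0))

def F_matrix(sigma):
    """Apply F elementwise to a signal correlation matrix sigma,
    forcing 1s on the diagonal: the 'average' noise correlation F(sigma)."""
    M = F(sigma)
    np.fill_diagonal(M, 1.0)
    return M
```

Per the Schur-product argument below, `F_matrix(sigma)` is positive semi-definite whenever `sigma` is, since a + b < 1.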
We also note that the value found here for a is higher than values generally reported for noise correlations in the literature [2], possibly due to experimental limitations; however, this has no influence on the method proposed here, only on its quantitative results.\n\nOnce the function F has been fitted on the subset of simultaneously recorded neural pairs, we can use the statistical relation (4)-(5) to randomly generate noise correlation matrices \u03c1 for the full neural population, on the basis of its signal correlation matrix \u03c3. However, such random generation is not trivial, as one must ensure at the same time that the individual coefficients \u03c1ij follow relation (4), and that \u03c1 remains a (positive definite) correlation matrix.\n\nAs a first step towards this generation, note that the \u201caverage\u201d noise correlation matrix predicted by the model, that is F(\u03c3), is itself a correlation matrix. First, by construction, it has 1s on the diagonal and all its elements belong to [\u22121, 1]. Second, F(\u03c3) can be written as a Taylor expansion in element-wise powers of \u03c3 (plus the diagonal term (1 \u2212 a \u2212 b)Id), with only positive coefficients (due to the exponential in eq. (5)). Since the element-wise (or Hadamard) product of two symmetric positive semi-definite matrices is itself positive semi-definite (\u201cSchur\u2019s product theorem\u201d [9]), all matrices in the expansion are positive semi-definite, and so is F(\u03c3). This property is fundamental for applying the method of random matrix generation that we propose now.\n\n3.2 Generating random correlation matrices\n\nWishart and anti-Wishart distributions. The Wishart distribution is probably the most straightforward way of generating a random symmetric, positive definite matrix with an imposed expectation matrix. 
Let \u03a3 be an N\u00d7N symmetric positive definite matrix, let k be an integer giving the number of degrees of freedom, and introduce the sample covariance matrix of k i.i.d. Gaussian samples Xi drawn according to N(0, \u03a3): \u03a9 = (1/k) \u2211_{i=1}^{k} Xi Xi^T. When k \u2265 N, the matrix \u03a9 almost surely has full rank. In that case, its pdf has a relatively simple expression, and the distribution of \u03a9 is called the Wishart distribution [10]. When k < N, the matrix \u03a9 is almost surely of rank k, so it is no longer invertible. In that case, its pdf has a much more intricate expression. This distribution has sometimes been referred to as the anti-Wishart distribution [11].\n\nIn both cases, the resulting distribution of the random matrix \u03a9, which we note W(\u03a3, k), can be proven to have the following characteristic function [11]:\n\n\u03c6(T) = E(e^{\u2212i Tr(\u03a9T)}) = det(Id + (2i/k) \u03a3T)^{\u2212k/2}\n\n(where T is a real symmetric matrix). This result can be used to find the first two moments of \u03a9:\n\nE(\u03a9ij) = \u03a3ij, (6)\n\ncov(\u03a9ij, \u03a9kl) = (1/k)(\u03a3ik \u03a3jl + \u03a3il \u03a3jk), (7)\n\nwith a variance naturally scaling as 1/k.\n\nThen, a second step consists in renormalizing \u03a9 by its diagonal elements, to produce a correlation matrix \u03c1. The resulting distribution for \u03c1, which we note W(\u03a3, k), has been studied by Fisher and others [12, 10], and is quite intricate to describe analytically. If one takes the generating matrix \u03a3 = F(\u03c3) to be itself a correlation matrix, then E(\u03c1) \u2248 F(\u03c3) still holds approximately, albeit with a small bias, and the variance of \u03c1 still scales with 1/k.\n\nThe distribution W(F(\u03c3), k) could be a good candidate to generate a random correlation matrix \u03c1 that would approximately verify E(\u03c1) = F(\u03c3). Unfortunately, this method presents a problem in our case. To fit the statistical relation eq. 
(4), we need the variance of an element \u03a9ij to be on the order of c\u00b2 \u2248 0.01. But this implies (through eq. 7) that k must be small (typically, around 20), so that noise correlation matrices \u03c1 generated in this way necessarily have a very low rank (anti-Wishart distribution; Figure 2, blue traces). This creates an artificial feature of the noise correlation structure which is not at all desirable.\n\nIterated Wishart. We propose here an alternative method for generating random correlation matrices, based on iterative applications of the Wishart distribution. This method makes it possible to create random correlation matrices with a higher variance than a Wishart distribution, while retaining a much wider eigenvalue spectrum than the simpler anti-Wishart distribution.\n\nThe distribution has two positive integer parameters k and m (plus the generating matrix F(\u03c3)). It is based on the following recursive procedure:\n\n1. Start from the deterministic matrix \u03c10 = F(\u03c3).\n2. For n = 1 . . . m, pick \u03c1n following the Wishart-correlation distribution W(\u03c1n\u22121, k).\n3. Take \u03c1 = \u03c1m as the output random matrix.\n\nSince E(\u03c1n) \u2248 E(\u03c1n\u22121), one expects approximately E(\u03c1) \u2248 F(\u03c3). Furthermore, by taking a large k, one can produce full-rank matrices, circumventing the \u201clow-rank problem\u201d of the anti-Wishart distribution. Because k is large, the variance added at each step is small (proportional to 1/k), which is compensated by iterating the procedure a large number m of times.\n\nSimulations allowed us to study the resulting distribution for \u03c1 (Figure 2, red traces) and compare it to the more standard \u201canti-Wishart-based\u201d distribution for \u03c1 (Figure 2, blue traces). 
We used the signal correlation data \u03c3 observed in a 100-neuron recorded sample from area S1, and the average noise correlation F(\u03c3) given by our experimental fit of F in that same area (Figure 1). As a simple investigation into the expectation and variance of these distributions, we computed the empirical distribution for \u03c1ij conditioned on \u03c3ij, for both distributions (Panel A). On this aspect the two distributions lead to very similar results, with a mean sticking closely to F(\u03c3ij), except for low values of \u03c3ij, where the slight bias mentioned previously is observed in both cases. In contrast, the two distributions lead to very different results in terms of their spectra (Panel B). The iterated Wishart, used with a large value of k, preserves a nonzero spectrum across all its dimensions. It should be noted, though, that the spectrum is markedly more concentrated on the first eigenvalues than the spectrum of F(\u03c3) (dotted line). However, this tendency towards dimensional reduction is much milder than in the anti-Wishart case!\n\nAs long as m is substantially smaller than k, the variances added at each step (of order 1/k) simply sum up, so that m/k is the main factor defining the variance of the distribution. For example, in Figure 2, k/m equals 20, precisely the number of degrees of freedom in the equivalent anti-Wishart distribution. Also, the eigenvalue spectrum of \u03c1 appears to follow a quasi-perfect exponential decay (even on a trial-by-trial basis), a result for which we as yet have no explanation.\n\nFigure 2: Random generation of noise correlation matrices. N = 100 neurons from our recorded sample (area S1). A: Empirical distribution of noise correlation \u03c1ij conditioned on signal correlation \u03c3ij (mean \u00b1 std). B: Empirical distribution of the eigenvalue spectrum (mean \u00b1 std in log domain).\n\n
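The recursive procedure above can be sketched as follows. This is an illustrative implementation rather than the authors' code; `wishart_correlation` draws the sample covariance of k Gaussian vectors and renormalizes it by its diagonal, as described in Section 3.2:

```python
import numpy as np

def wishart_correlation(Sigma, k, rng):
    """One draw from the Wishart-correlation distribution W(Sigma, k):
    sample covariance of k Gaussian vectors ~ N(0, Sigma),
    renormalized by its diagonal into a correlation matrix."""
    N = Sigma.shape[0]
    X = rng.multivariate_normal(np.zeros(N), Sigma, size=k)  # k x N samples
    Omega = X.T @ X / k                                      # sample covariance
    d = np.sqrt(np.diag(Omega))
    return Omega / np.outer(d, d)                            # correlation matrix

def iterated_wishart(Sigma0, k, m, rng):
    """Iterated Wishart: start from Sigma0 (e.g. F(sigma)) and apply
    the Wishart-correlation step m times with k degrees of freedom."""
    rho = Sigma0.copy()
    for _ in range(m):
        rho = wishart_correlation(rho, k, rng)
    return rho
```

With k larger than N, every step yields a full-rank matrix, while the element-wise variance accumulated around Sigma0 is governed by the ratio m/k, as discussed above.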
The theoretical study of the \u201citerated Wishart\u201d distribution, especially when k and m tend to infinity in a fixed ratio, might yield an interesting new type of distribution over positive symmetric matrices.\n\n4 Linear encoding of tactile frequency in somatosensory cortex\n\nTo illustrate the usefulness of random noise correlation matrix generation, we come back to our experimental data. They consist of neural recordings in the somatosensory cortex of macaques during a two-frequency discrimination task. Two tactile vibrations are successively applied to the fingertips of a monkey. The monkey must then decide which vibration had the higher frequency (the detailed experimental protocol has been described elsewhere). Here, we analyze neural responses to the first presented frequency, in primary somatosensory cortex (S1). Most neurons there have a positive tuning (\u03bbi(f) grows with f) and positive noise correlations; however, negative tunings (resulting in the appearance of negative signal correlations) and significant negative noise correlations can also be found (Figure 1-A).\n\nIn the notations of Section 2, stimulus f is the vibration frequency, which can take K = 5 possible values (14, 18, 22, 26 and 30 Hz). The neural activities Ri consist of each neuron\u2019s mean firing rate over the duration of the stimulation, with T = 250 ms. Our goal is to estimate the amount of information about stimulus f which can be extracted from a linear readout of the neural activities, depending on the number of neurons N in the population. This requires estimating the impact of noise correlations. We thus generate a random noise correlation structure \u03c1 following the above procedure, and assume the resulting distribution for neural activity R to follow eqs. (2)-(3). 
This being given, one can estimate the sensitivity \u2206f of a linear readout of f from R, as we now present.\n\n4.1 Linear stimulus discriminability in a neural population\n\nLinear readout from the population. To predict the value of f given R, we resort to a simple one-dimensional linear readout, based on a prediction variable \u02c6f = \u2211_{i=1}^{N} ai Ri. The set of neural weights a = {ai}i=1...N must be chosen in order to maximize the readout performance. We find it through one-dimensional Linear Discriminant Analysis (LDA), as the direction which maximizes (a^T M a)/(a^T Q a), where M is the inter-class covariance matrix of the class centroids {\u00b5(f)}f=f1...fK, and Q = (1/K) \u2211_f Q(f) is the average intra-class covariance matrix. Then, the norm of a is chosen so that the variable \u02c6f is the best possible predictor of the stimulus value f, in terms of mean square error.\n\nReadout discriminability. The previous procedure produces a prediction variable \u02c6f which is normally distributed, with E(\u02c6f | f) = a^T \u00b5(f) and var(\u02c6f | f) = a^T Q(f) a. As a result, one can compute analytically the neurometric curve giving the probability that two successive stimuli are correctly compared by the prediction variable:\n\nG(\u2206) = P(\u02c6f2 > \u02c6f1 | f2 \u2212 f1 = \u2206). (8)\n\nFinally, a sigmoid can be fit to this curve, providing a single neurometric index \u2206f, defined as half its 25%\u201375% interval. \u2206f measures what we call the linear discriminability of stimulus f in this neural population. It provides an estimate of the amount of information about the stimulus linearly present in the population activity R.\n\n4.2 Discriminability curves\n\nDiscriminability versus population size. The previous paragraphs have described a means to estimate the linear discriminability \u2206f of a given neural population, with a given noise correlation structure. 
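The two ingredients of this readout can be sketched in code: the 1-D LDA direction, computed here as the leading eigenvector of Q^{-1}M (a standard identity for the maximizer of the Rayleigh quotient), and the Gaussian comparison probability underlying eq. (8). This is our own illustrative sketch; the norm rescaling of a and the sigmoid fit are omitted:

```python
import numpy as np
from math import erf, sqrt

def lda_weights(mu, Q):
    """Direction a maximizing (a^T M a)/(a^T Q a), where M is the
    covariance of the class centroids mu(f) (rows of mu) and Q the
    average intra-class covariance: leading eigenvector of Q^{-1} M."""
    mu = np.asarray(mu, dtype=float)     # K x N array of centroids
    M = np.cov(mu.T, bias=True)          # inter-class covariance matrix
    evals, evecs = np.linalg.eig(np.linalg.solve(Q, M))
    return np.real(evecs[:, np.argmax(np.real(evals))])

def compare_prob(m1, v1, m2, v2):
    """P(f2_hat > f1_hat) for independent Gaussian readouts, the
    analytic ingredient of the neurometric curve G (eq. 8):
    f2_hat - f1_hat ~ N(m2 - m1, v1 + v2)."""
    z = (m2 - m1) / sqrt(v1 + v2)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))
```

For two classes with equal covariance Q, this reduces to the familiar a \u221d Q^{-1}(\u00b52 \u2212 \u00b51) Fisher discriminant.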
We apply this method to estimate \u2206f(N) in growing populations of size N = 1, 2, . . . , up to the full recorded neural sample (approx. 100 neurons in S1, Brodmann area 1). For each N, \u2206f(N) is computed to approximate the linear discriminability of the best N-tuple of neurons available from our recorded sample. As it is not tractable to test all possible N-tuples, we resort to the following recursive scheme: search for the neuron i1 with the best discriminability, then search for the neuron i2 giving the best discriminability for the 2-tuple {i1, i2}, etc. We term the resulting curve \u2206f(N) the discriminability curve of the population. Note that this curve is not necessarily decreasing, as the last neurons to be included in the population can actually deteriorate the overall readout, through their influence on the LDA axis a.\n\nEach draw of a sample noise correlation structure gives rise to a different discriminability curve. To better assess the possible impact of noise correlations, we performed 20 random draws of possible noise correlation structures, each time computing the discriminability curve. This produces an average discriminability curve flanked by a confidence interval modelling our ignorance of the exact full correlation structure in the population (Figure 3, red lines). The confidence interval is found to be rather small. This means that, if our statistical model (4)-(5) for the link between signal and noise correlation is correct, it is possible to assess with good precision the amount of information present in a neural population, even with very partial knowledge of its correlation structure.\n\nSince the resulting confidence interval on \u2206f(N) is small, one could assume that the impact of noise correlations is only driven by the \u201cstatistical average\u201d matrix F(\u03c3). In this particular application, however, this is not the case. 
When the noise correlation matrix \u03c1 is (deterministically) set equal to F(\u03c3), the resulting linear discriminability is underestimated (blue curve in Figure 3). Indeed, the statistical fluctuations of \u03c1ij around F(\u03c3ij), of magnitude c \u2248 0.1, induce an overcorrelation of certain neural pairs, and a decorrelation of other pairs (including a significant minority of negative correlation indices, as observed in our data, Figure 1). The net effect of the decorrelated pairs is stronger and improves the overall discriminability in the population as compared to the \u201cstatistical average\u201d.\n\nIn our particular case, the predicted discriminability curve is actually closer to what it would be in a totally decorrelated population (\u03c1 = 0, green curve). This result is not generic (it depends on the parameter values in this particular example), but it illustrates how noise correlations are not necessarily detrimental to coding efficiency [2], in neural populations with balanced tuning and/or balanced noise correlations (as is the case here, for a minority of cells).\n\nComparison with monkey behavior. The measure of discriminability through G(\u2206) (eq. 8) mimics the two-stimulus comparison which is actually performed by the monkey. Indeed, one can build in the same fashion a psychometric curve for the monkey, describing its behavioral accuracy in correctly comparing f1 and f2 across trials, depending on \u2206 = f2 \u2212 f1. The resulting psychometric index \u2206fmonkey can then be compared directly with \u2206f, to assess the behavioral relevance of the proposed linear readout (Figure 3, black dotted line). In our model, the neurometric discriminability curve crosses the monkey\u2019s psychometric index at around N \u2248 8. If neurons are assumed to be decorrelated, the crossing occurs at N \u2248 5. 
Using the \u201cstatistical average\u201d of the noise correlation structure, the monkey\u2019s psychometric index is approached around N \u2248 20.\n\nFigure 3: Discriminability curves for various correlation structures. Neural data: mean firing rates over T = 250 msec, for N = 100 neurons from our recorded sample (area S1). Green: no noise correlations. Red: random noise correlation structure (mean + std). Blue: statistical average of the noise correlation structure. Black: psychometric index for the monkey.\n\nThese results illustrate a number of important qualitative points. First, a known fact: the chosen noise correlation structure in a model can have a strong impact on the neural readout. Less well known is the fact that considering a simplified, \u201cstatistical average\u201d of the noise correlations may lead to dramatically different results in the estimation of certain quantities such as discriminability. Thus, a noise correlation structure must be inferred with as much care as possible, sticking to the structure available in the data. We think the method of extrapolation of noise correlation matrices proposed here offers a means to stick closer to the statistical structure (partially) observed in the data than more simplistic methods do.\n\nSecond, a comment must be made on the typical number of neurons required to attain the monkey\u2019s behavioral level of performance (N \u2264 10 using our extrapolation method for noise correlations). No matter the exact computation and sensory modality, it is a known fact that a few sensory neurons are sufficient to convey as much information about the stimulus as the monkey seems to be using, when their spikes are counted over long periods of time (typically, several hundred ms) [13, 14]. This is paradoxical when considering the number of neurons involved, even in such a simple task as that studied here. 
The simplest explanation of this paradox is that this spike count over several hundred milliseconds is not accessible behaviorally to the animal. Most likely, the animal\u2019s percept relies on much more instantaneous integrations of its sensory areas\u2019 activities, so that the contributions of many more neurons are required to achieve the animal\u2019s level of accuracy. In this light, we have started to study an alternative type of linear readout from a neural population, based on its instantaneous spiking activity, which we term \u2018online readout\u2019 [7]. We believe that such an approach, combined with the method proposed here to account for noise correlations with more accuracy, will lead to better approximations of the number of neurons and typical integration times used by the monkey in solving this type of task.\n\n5 Conclusion\n\nWe have proposed a new method to account for the noise correlation structure in a neural population, on the basis of partial correlation data. The method is based on the statistical link between signal and noise correlation, which is a reflection of the underlying neural connectivity, and can be estimated through pairwise simultaneous recordings. Noise correlation matrices generated in accordance with this statistical link display robust properties across possible configurations, and thus provide reliable estimates of the impact of noise correlation \u2013 if, naturally, the statistical model linking signal and noise correlation is accurate enough. We applied this method to estimate the linear discriminability in N-tuples of neurons from area S1 when their spikes are counted over 250 msec. We found that fewer than 10 neurons can account for the monkey\u2019s behavioral accuracy, suggesting that percepts based on full neural populations are likely based on much shorter integration times.\n\nReferences\n\n[1] Zohary, E. and Shadlen, M.N. 
(1994) Correlated neuronal discharge rate and its implications for psychophysical performance, Nature 370(6485): 140\u2013143\n\n[2] Romo, R., Hern\u00e1ndez, A., Zainos, A. and Salinas, E. (2003) Correlated neuronal discharges that increase coding efficiency during perceptual discrimination, Neuron 38(4): 649\u2013657\n\n[3] Averbeck, B.B., Latham, P.E. and Pouget, A. (2006) Neural correlations, population coding and computation, Nature Reviews Neuroscience 7(5): 358\u2013366\n\n[4] Abeles, M. (1991) Corticonics: Neural circuits of the cerebral cortex, Cambridge University Press\n\n[5] Lee, D., Port, N.L., Kruse, W. and Georgopoulos, A.P. (1998) Variability and correlated noise in the discharge of neurons in motor and parietal areas of the primate cortex, Journal of Neuroscience 18(3)\n\n[6] Petersen, R.S., Panzeri, S. and Diamond, M.E. (2001) Population coding of stimulus location in rat somatosensory cortex, Neuron 32(3): 503\u2013514\n\n[7] Wohrer, A., Romo, R. and Machens, C.K. (2010) Online readout of frequency information in areas SI and SII, Computational and Systems Neuroscience 2010 (CoSyne)\n\n[8] Abbott, L.F. and Dayan, P. (1999) The effect of correlated variability on the accuracy of a population code, Neural Computation 11(1): 91\u2013101\n\n[9] Horn, R.A. and Johnson, C.R. (1990) Matrix analysis, Cambridge University Press\n\n[10] Johnson, R.A. and Wichern, D.W. (1998) Applied multivariate statistical analysis, Prentice Hall, Englewood Cliffs, NJ\n\n[11] Janik, R.A. and Nowak, M.A. (2003) Wishart and anti-Wishart random matrices, Journal of Physics A: Mathematical and General 36: 3629\u20133637\n\n[12] Fisher, R.A. 
(1915) Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population, Biometrika 10(4)\n\n[13] Britten, K.H., Shadlen, M.N., Newsome, W.T. and Movshon, J.A. (1992) The analysis of visual motion: a comparison of neuronal and psychophysical performance, Journal of Neuroscience 12(12)\n\n[14] Romo, R. and Salinas, E. (2003) Flutter discrimination: neural codes, perception, memory and decision making, Nature Reviews Neuroscience 4(3): 203\u2013218", "award": [], "sourceid": 1195, "authors": [{"given_name": "Adrien", "family_name": "Wohrer", "institution": null}, {"given_name": "Ranulfo", "family_name": "Romo", "institution": null}, {"given_name": "Christian", "family_name": "Machens", "institution": null}]}