{"title": "The Effect of Correlations on the Fisher Information of Population Codes", "book": "Advances in Neural Information Processing Systems", "page_first": 167, "page_last": 173, "abstract": null, "full_text": "The Effect of Correlations on the Fisher \n\nInformation of Population Codes \n\nHyoungsoo Yoon \nhyoung@fiz.huji.ac.il \n\nHaim Sompolinsky \nhairn@fiz.huji.ac.il \n\nRacah Institute of Physics and Center for Neural Computation \n\nHebrew University, Jerusalem 91904, Israel \n\nAbstract \n\nWe study the effect of correlated noise on the accuracy of popu(cid:173)\nlation coding using a model of a population of neurons that are \nbroadly tuned to an angle in two-dimension. The fluctuations in \nthe neuronal activity is modeled as a Gaussian noise with pairwise \ncorrelations which decays exponentially with the difference between \nthe preferred orientations of the pair. By calculating the Fisher in(cid:173)\nformation of the system, we show that in the biologically relevant \nregime of parameters positive correlations decrease the estimation \ncapability of the network relative to the uncorrelated population. \nMoreover strong positive correlations result in information capac(cid:173)\nity which saturates to a finite value as the number of cells in the \npopulation grows. In contrast, negative correlations substantially \nincrease the information capacity of the neuronal population. \n\n1 \n\nIntroduction \n\nIn many neural systems, information regarding sensory inputs or (intended) motor \noutputs is found to be distributed throughout a localized pool of neurons. It is gener(cid:173)\nally believed that one of the main characteristics of the population coding scheme is \nits redundancy in representing information (Paradiso 1988; Snippe and Koenderink \n1992a; Seung and Sompolinsky 1993). 
Hence the intrinsic neuronal noise, which has a detrimental impact on the information processing capability, is expected to be compensated for by increasing the number of neurons in a pool. Although this expectation holds for an ensemble of neurons whose stochastic variabilities are statistically independent, a general theory of the efficiency of population coding when the neuronal noise is correlated within the population has been lacking. The conventional wisdom has been that correlated variability limits the information processing capacity of neuronal ensembles (Zohary, Shadlen, and Newsome 1994). However, detailed studies of simple models of a correlated population that code for a single real-valued parameter have led to apparently contradictory claims. Snippe and Koenderink (1992b) conclude that, depending on the details of the correlations, such as their spatial range, they may either increase or decrease the information capacity relative to the uncorrelated one. Recently, Abbott and Dayan (1998) claimed that in many cases correlated noise improves the accuracy of a population code, and that even when the information is decreased it still grows linearly with the size of the population. If true, this conclusion has important implications for the utility of using a large population to improve estimation accuracy. Since cross-correlations in neuronal activity are frequently observed in both primary sensory and motor areas (Fetz, Toyama, and Smith 1991; Lee, Port, Kruse, and Georgopoulos 1998), understanding the effect of noise correlation in biologically relevant situations is of great importance. 
\n\nIn this paper we present an analytical study of the effect of noise correlations on the population coding of a pool of cells that code for a single one-dimensional variable, an angle on a plane, e.g., the orientation of a visual stimulus or the direction of an arm movement. By assuming that the noise follows a multivariate Gaussian distribution, we investigate analytically the effect of correlation on the Fisher information. This model is similar to that considered in (Snippe and Koenderink 1992b; Abbott and Dayan 1998). By analyzing its behavior in the biologically relevant regime of tuning width and correlation range, we derive general conclusions about the effect of the correlations on the information capacity of the population. \n\n2 Population Coding with Correlated Noise \n\nWe consider a population of N neurons which respond to a stimulus characterized by an angle θ, where -π < θ ≤ π. The activity of each neuron (indexed by i) is assumed to be Gaussian with a mean f_i(θ), which represents its tuning curve, and a uniform variance a. The noise is assumed to be pairwise-correlated throughout the population. Hence the activity profile of the whole population, R = {r_1, r_2, ..., r_N}, given a stimulus θ, follows the multivariate Gaussian distribution \n\nP(R|θ) = N exp( -(1/2) Σ_{i,j} (r_i - f_i(θ)) [C^{-1}]_{ij} (r_j - f_j(θ)) )    (1) \n\nwhere N is a normalization constant and C_{ij} is the correlation matrix, \n\nC_{ij} = a δ_{ij} + b_{ij} (1 - δ_{ij}).    (2) \n\nIt is assumed that the tuning curves of all the neurons are identical in form but peaked at different angles, that is, f_i(θ) = f(θ - φ_i), where the preferred angles φ_i are distributed uniformly from -π to π with a lattice spacing w = 2π/N. 
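The Gaussian population model above lends itself to a direct numerical sketch. The following Python fragment is our illustration, not the authors' code: the circular Gaussian tuning curve anticipates Eq. (13) below, the exponentially decaying correlation profile anticipates Eq. (3), and all parameter values and function names are arbitrary choices for the example.

```python
import numpy as np

def build_population(N=101, a=1.0, b=0.01, rho=0.25 * np.pi):
    """Preferred angles on a ring with lattice spacing w = 2*pi/N,
    and the covariance matrix of Eq. (2)."""
    w = 2 * np.pi / N
    phi = -np.pi * (N + 1) / N + w * np.arange(1, N + 1)
    # Relative angle ||phi_i - phi_j|| on the circle (maximum value pi)
    d = np.abs(phi[:, None] - phi[None, :])
    d = np.minimum(d, 2 * np.pi - d)
    # Off-diagonal entries b_ij = b * exp(-||phi_i - phi_j|| / rho),
    # diagonal entries equal to the uniform variance a, per Eq. (2)
    C = b * np.exp(-d / rho)
    np.fill_diagonal(C, a)
    return phi, C

def tuning(theta, phi, fmax=10.0, sigma=0.2 * np.pi):
    """Circular Gaussian tuning curve f(theta - phi_i), cf. Eq. (13)."""
    return fmax * np.exp((np.cos(theta - phi) - 1.0) / sigma ** 2)

phi, C = build_population()
rng = np.random.default_rng(0)
# One noisy population response R to the stimulus theta = 0,
# drawn from the multivariate Gaussian of Eq. (1)
R = rng.multivariate_normal(mean=tuning(0.0, phi), cov=C)
```

Note that `np.fill_diagonal` enforces the δ_ij structure of Eq. (2): the exponential profile governs only the off-diagonal (pairwise) terms.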
We further assume that the noise correlation between a pair of neurons is only a function of the difference between their preferred angles, i.e., b_{ij} = b(||φ_i - φ_j||), where ||θ_1 - θ_2|| is defined to be the relative angle between θ_1 and θ_2, and hence its maximum value is π. A decrease in the magnitude of neuronal correlations with the dissimilarity in the preferred stimulus is often observed in cortical areas. We model this by exponentially decaying correlations \n\nb_{ij} = b exp(-||φ_i - φ_j||/ρ)    (3) \n\nwhere ρ specifies the angular correlation length. \n\nThe amount of information that can be extracted from the above population will depend on the decoding scheme. A convenient measure of the information capacity in the population is given by the Fisher information, which in our case is (for a given stimulus θ) \n\nJ(θ) = Σ_{i,j} g_i [C^{-1}]_{ij} g_j    (4) \n\nwhere \n\ng_i(θ) = ∂f_i(θ)/∂θ.    (5) \n\nThe utility of this measure follows from the well-known Cramér-Rao bound for the variance of any unbiased estimator, i.e., ⟨(θ - θ̂)²⟩ ≥ 1/J(θ). For the rest of this paper, we will concentrate on the Fisher information as a function of the noise correlation parameters, b and ρ, as well as the population size N. \n\n3 Results \n\nIn the case of an uncorrelated population (b = 0), the Fisher information is given by (Seung and Sompolinsky 1993) \n\nJ_0 = (N/a) Σ_n |g̃_n|²    (6) \n\nwhere g̃_n is the Fourier transform of g_j, defined by \n\ng̃_n = (1/N) Σ_j e^{-inφ_j} g_j.    (7) \n\nThe mode number n is an integer running from -(N-1)/2 to (N-1)/2 (for odd N), and φ_i = -π(N+1)/N + iw, i = 1, ..., N. Likewise, in the case of b ≠ 0, J is given by \n\nJ = N Σ_n |g̃_n|²/C_n    (8) \n\nwhere C_n are the eigenvalues of the covariance matrix, \n\nC_n = (a - 2b) + 2b [1 - λ cos(nw) - (-1)^n λ^{(N+1)/2} cos(nw)(1 - λ)] / [1 - 2λ cos(nw) + λ²]    (9) \n\nwhere w = 2π/N, λ = e^{-w/ρ}, and N is assumed to be an odd integer. Note that the covariance matrix C_{ij} remains positive definite as long as \n\n-(π/ρ) a / [N(1 - e^{-π/ρ})] < b < a    (10) \n\nwhere the lower bound holds for general N while the upper bound is valid for large N. \n\nTo evaluate the effect of correlations in a large population it is important to specify the appropriate scales of the system parameters. We consider here the biologically relevant case of broadly tuned neurons that have a smoothly varying tuning curve with a single peak. When the tuning curve is smoothly varying, |g̃_n|² will be a rapidly decaying function as n increases beyond a characteristic value which is proportional to the inverse of the tuning width, σ. We further assume a broad tuning, namely that the tuning curve spans a substantial fraction of the angular extent. This is consistent with the observed typical values of half-width at half height in visual and motor areas, which range from 20 to 60 degrees. Likewise, it is reasonable to assume that the angular correlation length ρ spans a substantial fraction of the entire angular range. This broad tuning of correlations with respect to the difference in the preferred angles is commonly observed in cortex (Fetz, Toyama, and Smith 1991; Lee, Port, Kruse, and Georgopoulos 1998). To capture these features we will consider the limit of large N while keeping the parameters ρ and σ constant. Note that keeping σ of order 1 implies that substantial contributions to Eq. (8) come only from n which remain of order 1 as N increases. 
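The predictions of Eqs. (4)-(8) can be checked directly by evaluating J(θ) = g^T C^{-1} g numerically. The sketch below assumes the circular Gaussian tuning curve of Eq. (13) and the exponential correlations of Eq. (3); the function name and the parameter values are our illustrative choices, not the paper's.

```python
import numpy as np

def fisher_info(N=101, a=1.0, b=0.0, rho=0.25 * np.pi,
                fmax=10.0, sigma=0.2 * np.pi, theta=0.0):
    """Fisher information J(theta) = g^T C^{-1} g of Eq. (4)."""
    w = 2 * np.pi / N
    phi = -np.pi * (N + 1) / N + w * np.arange(1, N + 1)
    # Tuning curve of Eq. (13) and its derivative g_i = df_i/dtheta, Eq. (5)
    f = fmax * np.exp((np.cos(theta - phi) - 1.0) / sigma ** 2)
    g = -np.sin(theta - phi) / sigma ** 2 * f
    # Covariance of Eq. (2) with the exponential correlations of Eq. (3)
    d = np.abs(phi[:, None] - phi[None, :])
    d = np.minimum(d, 2 * np.pi - d)
    C = b * np.exp(-d / rho)
    np.fill_diagonal(C, a)
    return g @ np.linalg.solve(C, g)

J0 = fisher_info(b=0.0)       # uncorrelated population, Eq. (6)
Jpos = fisher_info(b=0.1)     # positive correlations suppress J
Jneg = fisher_info(b=-0.001)  # weak negative correlations enhance J
```

Solving C x = g rather than inverting C keeps the evaluation numerically stable; b = -0.001 is safely above the lower bound of Eq. (10) for these parameters.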
On the other hand, given the enormous variability in the strength of the observed cross-correlations between pairs of neurons in cortex, we do not restrict the value of b at this point. \n\nIncorporating the above scaling we find that when N is large J is given by \n\nJ = (N/a) Σ_n |g̃_n|² (ρ^{-2} + n²) / [ρ^{-2} + n² + (Nb/(πaρ))(1 - (-1)^n e^{-π/ρ})].    (11) \n\nInspection of the denominator in the above equation clearly shows that for all positive values of b, J is smaller than J_0. On the other hand, when b is negative J is larger than J_0. To estimate the magnitude of these effects we consider below three different regimes. \n\nFigure 1: Normalized Fisher information J/J_0 as a function of N when ρ ~ O(1) (ρ = 0.25π was used). a = 1 and b = 0.1, 0.01, and 0.001 from the bottom. We used a circular Gaussian tuning curve, Eq. (13), with f_max = 10 and σ = 0.2π. \n\nStrong positive correlations: We first discuss the regime of strong positive correlations, by which we mean that b/a ~ O(1). In this case the second term in the denominator of Eq. (11) is of order N and Eq. (11) becomes \n\nJ = (πρ/b) Σ_n |g̃_n|² (ρ^{-2} + n²) / (1 - (-1)^n e^{-π/ρ}).    (12) \n\nThis result implies that in this regime the Fisher information in the entire population does not scale linearly with the population size N but saturates to a size-independent finite limit. Thus, for these strong correlations, although the number of neurons in the population may be large, the number of independent degrees of freedom is small. \n\nWe demonstrate the above phenomenon by a numerical evaluation of J for the following choice of tuning curve \n\nf(θ) = f_max exp((cos(θ) - 1)/σ²)    (13) \n\nwith σ = 0.2π. The results are shown in Fig. 1 and Fig. 2. 
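The saturation predicted by Eq. (12) can be reproduced with a short numerical check. The sketch below uses the parameters of curve (d) in Fig. 2 (a = 1, b = 0.5, ρ = 1, f_max = 1, σ = 0.2π); the code itself is our illustration, not the authors'.

```python
import numpy as np

def fisher_info(N, a=1.0, b=0.5, rho=1.0, fmax=1.0, sigma=0.2 * np.pi):
    """J(0) = g^T C^{-1} g for the model of Eqs. (1)-(3) with Eq. (13)."""
    w = 2 * np.pi / N
    phi = -np.pi * (N + 1) / N + w * np.arange(1, N + 1)
    f = fmax * np.exp((np.cos(phi) - 1.0) / sigma ** 2)  # theta = 0
    g = np.sin(phi) / sigma ** 2 * f                     # g_i = df_i/dtheta
    d = np.abs(phi[:, None] - phi[None, :])
    d = np.minimum(d, 2 * np.pi - d)
    C = b * np.exp(-d / rho)
    np.fill_diagonal(C, a)
    return g @ np.linalg.solve(C, g)

# With b/a ~ O(1), J barely grows once N exceeds ~100 (cf. Fig. 2 (d)),
# while the uncorrelated information J0 keeps growing linearly with N.
J_small, J_large = fisher_info(101), fisher_info(401)
J0_small, J0_large = fisher_info(101, b=0.0), fisher_info(401, b=0.0)
```

Quadrupling N roughly quadruples J_0 but leaves the correlated J nearly unchanged, which is the saturation of the number of independent degrees of freedom described above.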
The results of Fig. 1 clearly show the substantial decrease in J as b increases. The reduction in J/J_0 when b ~ O(1) indicates that J does not scale with N in this limit. Fig. 2 shows the saturation of J when N increases. For ρ = 0.1 and 1 ((c) and (d)), J saturates at about N = 100, which means that for these parameter values the network contains at most 100 independent degrees of freedom. When the correlation range becomes either smaller or bigger, the saturation becomes less prominent ((a) and (b)), which is further explained later in the text. \n\nFigure 2: Saturation of Fisher information J as a function of N with the correlation coefficient kept fixed: a = 1 and b = 0.5. Both ρ ~ O(1) ((c) ρ = 0.1 and (d) ρ = 1) and other extreme limits ((a) ρ = 0.01 and (b) ρ = 10) are shown. A tuning curve with f_max = 1 and σ = 0.2π was used for all four curves. \n\nWeak positive correlations: This regime is defined formally by positive values of b which scale as b/a ~ O(1/N). In this case, while J is still smaller than J_0, the suppressive effects of the correlations are not as strong as in the first case. This is shown in Fig. 3 (bottom traces) for N = 1000. While J is less than J_0, it is still a substantial fraction of J_0, indicating J is of order N. \n\nFigure 3: Normalized Fisher information J/J_0 as a function of ρ when ρ ~ O(1) and b/a ~ O(1/N). N = 1000, a = 1, f_max = 10, and σ = 0.2π. The top curves represent negative b (b = -0.005 and -0.002 from the top) and the bottom ones positive b (b = 0.01 and 0.005 from the bottom). \n\nWeak negative correlations: So far we have considered the case of positive b. As stated above, Eq. (11) implies that when b < 0, J > J_0. The lower bound of b (Eq. (10)) means that when the correlations are negative and ρ is of order 1 the amplitude of the correlations must be small. It scales as b/a = b̃/N, with b̃ of order 1 and larger than b̃_min = -(π/ρ)/(1 - exp(-π/ρ)). In this regime (J - J_0)/N retains a finite positive value even for large N. This enhancement can be made large if b̃ comes close to b̃_min. This behavior is shown in Fig. 3 (upper traces). Note that, for both positive and negative weak correlations, the curves have peaks around a characteristic length scale ρ ~ σ, which is 0.2π in this figure. \n\nExtremely long and short range correlations: Calculation with strictly uniform correlations, i.e., b_{ij} = b, shows that in this case positive correlations enhance the Fisher information of the system, leading to claims that this might be a generic result (Abbott and Dayan 1998). Here we show that this behavior is special to cases where the correlations essentially do not vary in strength. We consider the case ρ ~ O(N). This means that the strength of the correlations is the same for all the neurons up to a correction of order 1/N. In this limit Eq. (11) is not valid, and the Fisher information is obtained from Eq. (8) and Eq. (9) (Eq. (14), in terms of the parameter η = wρ/4). Note that even in this extreme regime, only for η > 1 is J guaranteed to be always larger than J_0. Below this value the sign of J - J_0 depends on the particular shape of the tuning curve and the value of b. In fact, a more detailed analysis (Yoon and Sompolinsky 1998) shows that as soon as ρ << O(√N), J - J_0 < 0, as in the case of ρ ~ O(1) discussed above. The crossover between these two opposite behaviors is shown in Fig. 4. For comparison the case with ρ ~ O(1) is also shown. \n\nFigure 4: Normalized Fisher information J/J_0 as a function of b when b/a ~ O(1). N = 1000 and a = 1. 
\nWhen ρ ~ O(1), increasing b always decreases the Fisher information (bottom curve, ρ = 0.25π). However, this trend is reversed when ρ ~ O(√N), and for ρ sufficiently large compared with √N, J - J_0 becomes always positive. From the top, ρ = 400, 50, and 25. \n\nAnother extreme regime is where the correlation length ρ scales as 1/N but the tuning width remains of order 1. This means that a given neuron is correlated with a small number of its immediate neighbors, which remains finite as N → ∞. In this limit, the Fisher information becomes, again from Eq. (8) and Eq. (9), \n\nJ = [N(λ^{-1} - 1) / (a(λ^{-1} - 1) + 2b)] Σ_n |g̃_n|².    (15) \n\nIn this case, the behavior of J is similar to the cases of weak correlations discussed above. The information remains of order N but the sign of J - J_0 depends on the sign of b. Thus, when the amplitude of the positive correlation function is O(1), J increases linearly with N in the two opposite extremes of very large and very small ρ, as shown in Fig. 2 ((a) and (b)). \n\n4 Discussion \n\nIn this paper we have studied the effect of correlated variability of neuronal activity on the maximum accuracy of population coding. We have shown that the effect of correlation on the information capacity of the population crucially depends on the scale of the correlation length. We argue that for the sensory and motor areas which are presumed to utilize population coding, the tuning of both the correlations and the mean response profile is broad and of the same order. This implies that each neuron is correlated with a finite fraction of the total number of neurons, N, and a given stimulus activates a finite fraction of N. We show that in this regime positive correlations always decrease the information. When they are strong enough in amplitude they reduce the number of independent degrees of freedom to a finite number even for a large population. 
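The special role of strictly uniform correlations can also be verified numerically: since Σ_i g_i = 0 for a symmetric lattice of preferred angles, a uniform covariance C = aI + b(11^T - I) gives J = g^T g/(a - b) > J_0, whereas broadly decaying correlations of the same amplitude suppress J. A minimal sketch (our illustration, with arbitrary parameter values):

```python
import numpy as np

def fisher_info(N=101, a=1.0, b=0.1, uniform=False, rho=0.25 * np.pi,
                fmax=10.0, sigma=0.2 * np.pi):
    """J(0) = g^T C^{-1} g; uniform=True sets b_ij = b for all i != j."""
    w = 2 * np.pi / N
    phi = -np.pi * (N + 1) / N + w * np.arange(1, N + 1)
    f = fmax * np.exp((np.cos(phi) - 1.0) / sigma ** 2)
    g = np.sin(phi) / sigma ** 2 * f
    if uniform:
        C = np.full((N, N), b)       # b_ij = b everywhere off the diagonal
    else:
        d = np.abs(phi[:, None] - phi[None, :])
        d = np.minimum(d, 2 * np.pi - d)
        C = b * np.exp(-d / rho)     # decaying correlations, Eq. (3)
    np.fill_diagonal(C, a)
    return g @ np.linalg.solve(C, g)

J0 = fisher_info(b=0.0)
# Uniform positive correlations enhance J; broadly decaying ones suppress it.
J_uniform = fisher_info(b=0.1, uniform=True)
J_decay = fisher_info(b=0.1, uniform=False)
```

Because g is orthogonal to the uniform mode, the uniform case reduces exactly to J = J_0 a/(a - b), the perfect noise subtraction described in the text.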
Only in the extreme case of almost uniform correlations is the information capacity enhanced. This is reasonable since to overcome the positive correlations one needs to subtract the responses of different neurons, but in general this will reduce the signal by a larger amount. When the correlations are uniform, the reduction of the correlated noise by subtraction is perfect and can be made in a manner that affects the signal component only slightly. \n\nAcknowledgments \n\nH.S. acknowledges helpful discussions with Larry Abbott and Sebastian Seung. This research is partially supported by the Fund for Basic Research of the Israeli Academy of Science and by a grant from the United States-Israel Binational Science Foundation (BSF), Jerusalem, Israel. \n\nReferences \n\nL. F. Abbott and P. Dayan (1998). The effect of correlated variability on the accuracy of a population code. Neural Comp., in press. \nE. Fetz, K. Toyama, and W. Smith (1991). Synaptic interactions between cortical neurons. In A. Peters and E. G. Jones (Eds.), Cerebral Cortex, Volume 9. New York: Plenum Press. \nD. Lee, N. L. Port, W. Kruse, and A. P. Georgopoulos (1998). Variability and correlated noise in the discharge of neurons in motor and parietal areas of the primate cortex. J. Neurosci. 18, 1161-1170. \nM. A. Paradiso (1988). A theory for the use of visual orientation information which exploits the columnar structure of striate cortex. Biol. Cybern. 58, 35-49. \nH. S. Seung and H. Sompolinsky (1993). Simple models for reading neuronal population codes. Proc. Natl. Acad. Sci. USA 90, 10749-10753. \nH. P. Snippe and J. J. Koenderink (1992a). Discrimination thresholds for channel-coded systems. Biol. Cybern. 66, 543-551. \nH. P. Snippe and J. J. Koenderink (1992b). Information in channel-coded systems: correlated receivers. Biol. Cybern. 67, 183-190. \nH. Yoon and H. Sompolinsky (1998). 
Population coding in neuronal systems with correlated noise. Preprint. \nE. Zohary, M. N. Shadlen, and W. T. Newsome (1994). Correlated neuronal discharge rate and its implications for psychophysical performance. Nature 370, 140-143. \n", "award": [], "sourceid": 1510, "authors": [{"given_name": "Hyoungsoo", "family_name": "Yoon", "institution": null}, {"given_name": "Haim", "family_name": "Sompolinsky", "institution": null}]}