{"title": "Correlation Coefficients are Insufficient for Analyzing Spike Count Dependencies", "book": "Advances in Neural Information Processing Systems", "page_first": 1383, "page_last": 1391, "abstract": "The linear correlation coefficient is typically used to characterize and analyze dependencies of neural spike counts. Here, we show that the correlation coefficient is in general insufficient to characterize these dependencies. We construct two-neuron spike count models with Poisson-like marginals and vary their dependence structure using copulas. To this end, we construct a copula that allows us to keep the spike counts uncorrelated while varying their dependence strength. Moreover, we employ a network of leaky integrate-and-fire neurons to investigate whether weakly correlated spike counts with strong dependencies are likely to occur in real networks. We find that the entropy of uncorrelated but dependent spike count distributions can deviate from the corresponding distribution with independent components by more than 25% and that weakly correlated but strongly dependent spike counts are very likely to occur in biological networks. Finally, we introduce a test for deciding whether the dependence structure of distributions with Poisson-like marginals is well characterized by the linear correlation coefficient and verify it for different copula-based models.", "full_text": "Correlation Coefficients Are Insufficient
for Analyzing Spike Count Dependencies

Arno Onken
Technische Universität Berlin / BCCN Berlin
Franklinstr. 28/29, 10587 Berlin, Germany
aonken@cs.tu-berlin.de

Steffen Grünewälder
University College London
Gower Street, London WC1E 6BT, UK
steffen@cs.ucl.ac.uk

Klaus Obermayer
Technische Universität Berlin / BCCN Berlin
oby@cs.tu-berlin.de

Abstract

The linear correlation coefficient is typically used to characterize and analyze dependencies of neural spike counts.
Here, we show that the correlation coefficient is in general insufficient to characterize these dependencies. We construct two-neuron spike count models with Poisson-like marginals and vary their dependence structure using copulas. To this end, we construct a copula that allows us to keep the spike counts uncorrelated while varying their dependence strength. Moreover, we employ a network of leaky integrate-and-fire neurons to investigate whether weakly correlated spike counts with strong dependencies are likely to occur in real networks. We find that the entropy of uncorrelated but dependent spike count distributions can deviate from the corresponding distribution with independent components by more than 25 % and that weakly correlated but strongly dependent spike counts are very likely to occur in biological networks. Finally, we introduce a test for deciding whether the dependence structure of distributions with Poisson-like marginals is well characterized by the linear correlation coefficient and verify it for different copula-based models.

1 Introduction

The linear correlation coefficient is of central importance in many studies that deal with spike count data of neural populations. For example, a low correlation coefficient is often used as evidence for independence in recorded data and to justify simplifying model assumptions (e.g. [1, 2]). In line with this, many computational studies have constructed distributions for observed data based solely on reported correlation coefficients [3, 4, 5, 6]. The correlation coefficient is in this sense treated as an equivalent to the full dependence.

The correlation coefficient is also extensively used in combination with information measures such as the Fisher information (for continuous variables only) and the Shannon information to assess the importance of couplings between neurons for neural coding [7].
The discussion in the literature centers on two main topics. On the one hand, it is debated whether pairwise correlations, as opposed to higher-order correlations across different neurons, are sufficient for obtaining good estimates of the information (see e.g. [8, 9, 10]). On the other hand, it is questioned whether correlations matter at all (see e.g. [11, 12, 13]). In [13], for example, it was argued, based on the correlation coefficient, that the impact of correlations is negligible for small populations of neurons.

The correlation coefficient is one measure of dependence among others. It has become common to report only the correlation coefficient of recorded spike trains without reporting any other properties of the actual dependence structure (see e.g. [3, 14, 15]). The problem with this common practice is that it is unclear beforehand whether the linear correlation coefficient suffices to describe the dependence, or at least the relevant part of the dependence. Of course, it is well known that uncorrelated does not imply statistically independent. Yet, it might seem likely that this is not important for realistic spike count distributions, which have a Poisson-like shape. Problems could be restricted to pathological cases that are very unlikely to occur in realistic biological networks. At the very least, one might expect to find a tendency toward weak dependencies for uncorrelated distributions with Poisson-like marginals. It might also seem likely that these dependencies are unimportant in terms of typical information measures, even if they are present and go unnoticed or are ignored.

In this paper we show that these assumptions are false. Indeed, the dependence structure can have a profound impact on the information of spike count distributions with Poisson-like single neuron statistics. This impact can be substantial not only for large networks of neurons but even for two-neuron distributions.
As a matter of fact, the correlation coefficient places only a weak constraint on the dependence structure. Moreover, we show that uncorrelated or weakly correlated spike counts with strong dependencies are very likely to be common in biological networks. Thus, it is not sufficient to report only the correlation coefficient or to derive strong implications like independence from a low correlation coefficient alone. At the very least, a statistical test should be applied that states, for a given significance level, whether the dependence is well characterized by the linear correlation coefficient. We will introduce such a test in this paper. The test is adjusted to the setting that a neuroscientist typically faces, namely the case of Poisson-like spike count distributions of single neurons and small numbers of samples.

In the next section, we describe state-of-the-art methods for modeling dependent spike counts, computing their entropy, and generating network models based on integrate-and-fire neurons. Section 3 shows examples of what can go wrong in entropy estimation when relying on the correlation coefficient only. The emergence of such cases in simple network models is explored. Section 4 introduces the linear correlation test, which is tailored to the needs of neuroscience applications, and examines its performance on different dependence structures. The paper concludes with a discussion of the advantages and limitations of the presented methods and cases.

2 General methods

We will now describe formal aspects of spike count models and their Shannon information.

2.1 Copula-based models with discrete marginals

A copula is a cumulative distribution function (CDF) which is defined on the unit hypercube and has uniform marginals [16]. Formally, a bivariate copula $C$ is defined as follows:

Definition 1. A copula is a function $C : [0,1]^2 \to [0,1]$ such that:

1. 
$\forall u, v \in [0,1]\colon C(u, 0) = 0 = C(0, v)$, $C(u, 1) = u$ and $C(1, v) = v$.

2. $\forall u_1, v_1, u_2, v_2 \in [0,1]$ with $u_1 \le u_2$ and $v_1 \le v_2$:
$C(u_2, v_2) - C(u_2, v_1) - C(u_1, v_2) + C(u_1, v_1) \ge 0$.

Copulas can be used to couple arbitrary marginal CDFs $F_{X_1}, F_{X_2}$ to form a joint CDF $F_{\vec X}$, such that $F_{\vec X}(r_1, r_2) = C(F_{X_1}(r_1), F_{X_2}(r_2))$ holds [16]. There are many families of copulas representing different dependence structures. One example is the bivariate Frank family [17]. Its CDF is given by

$$C_\theta(u, v) = \begin{cases} -\frac{1}{\theta} \ln\left(1 + \frac{(e^{-\theta u} - 1)(e^{-\theta v} - 1)}{e^{-\theta} - 1}\right) & \text{if } \theta \ne 0, \\ uv & \text{if } \theta = 0. \end{cases} \qquad (1)$$

The Frank family is commutative and radially symmetric: its probability density $c_\theta$ satisfies $\forall (u, v) \in [0,1]^2\colon c_\theta(u, v) = c_\theta(1-u, 1-v)$ [17]. The scalar parameter $\theta$ controls the strength of dependence. As $\theta \to \pm\infty$ the copula approaches deterministic positive/negative dependence: knowledge of one variable implies knowledge of the other (the so-called Fréchet–Hoeffding bounds [16]). The linear correlation coefficient is capable of measuring this dependence. Another example is the bivariate Gaussian copula family, defined as $C_\theta(u, v) = \phi_\theta(\phi^{-1}(u), \phi^{-1}(v))$, where $\phi_\theta$ is the CDF of the bivariate zero-mean unit-variance normal distribution with correlation $\theta$ and $\phi^{-1}$ is the inverse of the CDF of the univariate zero-mean unit-variance Gaussian distribution. This family can be used to construct multivariate distributions with Gauss-like dependencies and arbitrary marginals.

For a given realization $\vec r$, which can represent the counts of two neurons, we can set $u_i = F_{X_i}(r_i)$ and $F_{\vec X}(\vec r) = C_\theta(\vec u)$, where the $F_{X_i}$ can be arbitrary univariate CDFs.
In this way, we can generate a multivariate distribution with specific marginals $F_{X_i}$ and a dependence structure determined by $C$. Copulas allow us to attach different discrete marginal distributions [18, 19]. Typically, the Poisson distribution is a good approximation to the spike count variations of single neurons [20]. For this distribution the CDFs of the marginals take the form

$$F_{X_i}(r; \lambda_i) = \sum_{k=0}^{\lfloor r \rfloor} \frac{\lambda_i^k}{k!} e^{-\lambda_i},$$

where $\lambda_i$ is the mean spike count of neuron $i$ for a given bin size. We will also use the negative binomial distribution as a generalization of the Poisson distribution:

$$F_{X_i}(r; \lambda_i, \upsilon_i) = \sum_{k=0}^{\lfloor r \rfloor} \frac{\lambda_i^k}{k!} \frac{1}{(1 + \lambda_i/\upsilon_i)^{\upsilon_i}} \frac{\Gamma(\upsilon_i + k)}{\Gamma(\upsilon_i)(\upsilon_i + \lambda_i)^k},$$

where $\Gamma$ is the gamma function. The additional parameter $\upsilon_i$ controls the degree of overdispersion: the smaller the value of $\upsilon_i$, the greater the Fano factor, since the variance is given by $\lambda_i + \lambda_i^2/\upsilon_i$. As $\upsilon_i$ approaches infinity, the negative binomial distribution converges to the Poisson distribution.

Likelihoods of discrete vectors can be computed by applying the inclusion-exclusion principle of Poincaré and Sylvester. The probability of a realization $(x_1, x_2)$ is given by $P_{\vec X}(x_1, x_2) = F_{\vec X}(x_1, x_2) - F_{\vec X}(x_1 - 1, x_2) - F_{\vec X}(x_1, x_2 - 1) + F_{\vec X}(x_1 - 1, x_2 - 1)$. Thus, we can compute the probability mass of a realization $\vec x$ using only the CDF of $\vec X$.

2.2 Computation of information entropy

The Shannon entropy [21] of dependent spike counts $\vec X$ is a measure of the information that a decoder is missing when it does not know the value $\vec x$ of $\vec X$.
It is given by

$$H(\vec X) = E[I(\vec X)] = \sum_{\vec x \in \mathbb{N}^d} P_{\vec X}(\vec x) I(\vec x),$$

where $I(\vec x) = -\log_2(P_{\vec X}(\vec x))$ is the self-information of the realization $\vec x$.

2.3 Leaky integrate-and-fire model

The leaky integrate-and-fire neuron is a simple neuron model that describes only subthreshold membrane potentials. The equation for the membrane potential is given by

$$\tau_m \frac{dV}{dt} = E_L - V + R_m I_s,$$

where $E_L$ denotes the resting membrane potential, $R_m$ is the total membrane resistance, $I_s$ is the synaptic input current, and $\tau_m$ is the membrane time constant. The model is completed by a rule which states that whenever $V$ reaches a threshold $V_{th}$, an action potential is fired and $V$ is reset to $V_{reset}$ [22]. In all of our simulations we used $\tau_m = 20$ ms, $R_m = 20\,\mathrm{M\Omega}$, $V_{th} = -50$ mV, and $V_{reset} = V_{init} = -65$ mV, which are typical values found in [22]. Current-based synaptic input for an isolated presynaptic release that occurs at time $t = 0$ can be modeled by the so-called $\alpha$-function [22]: $I_s = I_{\max} \frac{t}{\tau_s} \exp(1 - \frac{t}{\tau_s})$. The function reaches its peak value $I_{\max}$ at time $t = \tau_s$ and then decays with time constant $\tau_s$. We can model an excitatory synapse by a positive $I_{\max}$ and an inhibitory synapse by a negative $I_{\max}$.
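A forward-Euler sketch of this neuron model (our own toy implementation using the parameter values quoted above, not the simulation code used in the paper; units are ms, mV, nA, and MOhm, so that $R_m I_s$ is in mV):

```python
import math

def alpha_current(t, i_max=1.0, tau_s=5.0):
    """Alpha-function synaptic current for a presynaptic release at t = 0.
    Peaks at i_max when t == tau_s, then decays with time constant tau_s."""
    return 0.0 if t < 0 else i_max * (t / tau_s) * math.exp(1.0 - t / tau_s)

def simulate_lif(currents, dt=0.1, tau_m=20.0, r_m=20.0, e_l=-65.0,
                 v_th=-50.0, v_reset=-65.0):
    """Forward-Euler integration of tau_m dV/dt = E_L - V + R_m * I_s
    with a threshold/reset rule; `currents` gives I_s (nA) per time step.
    Returns the spike times in ms."""
    v, spike_times = e_l, []
    for step, i_s in enumerate(currents):
        v += dt / tau_m * (e_l - v + r_m * i_s)
        if v >= v_th:
            spike_times.append(step * dt)
            v = v_reset
    return spike_times
```

For a constant 1 nA input the steady-state potential is $E_L + 20\,\mathrm{mV} = -45$ mV, above threshold, so the sketch fires periodically; with zero input it rests at $E_L$ and never spikes.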
We used $I_{\max} = 1$ nA for excitatory synapses, $I_{\max} = -1$ nA for inhibitory synapses, and $\tau_s = 5$ ms.

Figure 1: Cumulative distribution functions (a-c) and probability density functions (d-f) of selected Frank shuffle copulas. (a, d): Independence: $\theta_1 = \theta_2 = 0$. (b, e): Strong negative dependence in outer square: $\theta_1 = -30$, $\theta_2 = 5$, $\omega = 0.2$. (c, f): Strong positive dependence in inner square: $\theta_1 = -5$, $\theta_2 = 30$, $\omega = 0.2$.

3 Counter examples

In this section we describe entropy variations that can occur when relying on the correlation coefficient only. We will evaluate this effect for models of spike counts which have Poisson-like marginals and show that such effects can occur in very simple biological networks.

3.1 Frank shuffle copula

We will now introduce the Frank shuffle copula family. This copula family allows arbitrarily strong dependencies with a correlation coefficient of zero for attached Poisson-like marginals. It uses two Frank copulas (see Section 2.1) in different regions of its domain such that the linear correlation coefficient vanishes.

Proposition 1. 
The following function defines a copula $\forall \theta_1, \theta_2 \in \mathbb{R}$, $\omega \in [0, 0.5]$:

$$C_{\theta_1,\theta_2,\omega}(u, v) = \begin{cases} C_{\theta_1}(u, v) - \varsigma_{\theta_1}(\omega, \omega, u, v) + z_{\theta_1,\theta_2,\omega}(\min\{u, v\})\, \varsigma_{\theta_2}(\omega, \omega, u, v) & \text{if } (u, v) \in (\omega, 1-\omega)^2, \\ C_{\theta_1}(u, v) & \text{otherwise}, \end{cases}$$

where $\varsigma_\theta(u_1, v_1, u_2, v_2) = C_\theta(u_2, v_2) - C_\theta(u_2, v_1) - C_\theta(u_1, v_2) + C_\theta(u_1, v_1)$ and $z_{\theta_1,\theta_2,\omega}(m) = \varsigma_{\theta_1}(\omega, \omega, m, 1-\omega) / \varsigma_{\theta_2}(\omega, \omega, m, 1-\omega)$.

The proof of the copula properties is given in Appendix A. This family is capable of modeling a continuum between independence and deterministic dependence while keeping the correlation coefficient at zero. There are two regions: the outer region $[0,1]^2 \setminus (\omega, 1-\omega)^2$ contains a Frank copula with parameter $\theta_1$, and the inner square $(\omega, 1-\omega)^2$ contains a Frank copula with parameter $\theta_2$, modified by a factor $z$. If we restricted our analysis to copula-based distributions with continuous marginals, it would be sufficient to select $\theta_1 = -\theta_2$ and to adjust $\omega$ such that the correlation coefficient vanishes; in such cases, the factor $z$ would be unnecessary. For discrete marginals, however, this is not sufficient, as the CDF is no longer a continuous function of $\omega$. Different copulas of this family are shown in Fig. 1.

We will now investigate the impact of this dependence structure on the entropy of copula-based distributions with Poisson-like marginals while keeping the correlation coefficient at zero. Introducing more structure into a distribution typically reduces its entropy.
Therefore, we expect that the entropy can vary considerably for different dependence strengths, even though the correlation is always zero.

Figure 2: Entropy of distributions based on the Frank shuffle copula $C_{\theta_1,\theta_2,\omega}$ for $\omega = 0.05$ and different dependence strengths $\theta_1$. The second parameter $\theta_2$ was selected such that the absolute correlation coefficient was below $10^{-10}$. For Poisson marginals, we selected rates $\lambda_1 = \lambda_2 = 5$. For 100 ms bins this would correspond to firing rates of 50 Hz. For negative binomial marginals we selected rates $\lambda_1 = 2.22$, $\lambda_2 = 4.57$ and variances $\sigma_1^2 = 4.24$, $\sigma_2^2 = 10.99$ (values taken from experimental data recorded in macaque prefrontal cortex and 100 ms bins [18]). (a): Entropy of the $C_{\theta_1,\theta_2,\omega}$-based models. (b): Difference between the entropy of the $C_{\theta_1,\theta_2,\omega}$-based models and the model with independent elements in percent of the independent model.

Fig. 2(a) shows the entropy of the Frank shuffle-based models with Poisson and negative binomial marginals for uncorrelated but dependent elements. $\theta_1$ was varied while $\theta_2$ was estimated using the line-search algorithm for constrained nonlinear minimization [23] with the absolute correlation coefficient as the objective function. Independence is attained for $\theta_1 = 0$. With increasing dependence the entropy decreases until it reaches a minimum at $\theta_1 = -20$.
Afterward, it increases again.\nThis is due to the shape of the marginal distributions. The region of strong dependence shifts to a\nregion with small mass. Therefore, the actual dependence decreases. However, in this region the\ndependency is almost deterministic and thus does not represent a relevant case.\n\nFig. 2(b) shows the difference to the entropy of corresponding models with independent elements.\nThe entropy deviated by up to 25 % for the Poisson marginals and up to 15 % for the negative\nbinomial marginals. So the entropy varies indeed considerably in spite of \ufb01xed marginals and un-\ncorrelated elements.\n\nWe constructed a copula family which allowed us to vary the dependence strength systematically\nwhile keeping the variables uncorrelated. It could be argued that this is a pathological example. In\nthe next section, however, we show that such effects can occur even in simple biologically realistic\nnetwork models.\n\n3.2 LIF network\n\nWe will now explore the feasibility of uncorrelated spike counts with strong dependencies in a bio-\nlogically realistic network model. For this purpose, we set up a network of leaky integrate-and-\ufb01re\nneurons (see Section 2.3). The neurons have two common input populations which introduce oppo-\nsite dependencies (see Fig. 3(a)). Therefore, the correlation should vanish for the right proportion of\ninput strengths. 
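Before turning to the spiking simulation, the cancellation idea can be illustrated with a toy count model (our own illustration, not the LIF network of Fig. 3(a)): one shared source adds to both counts, while a second shared source adds to one and subtracts from the other, so the covariance contributions $\mathrm{Var}(B)$ and $-\mathrm{Var}(D)$ cancel exactly while the counts remain dependent. In this signed toy model the "counts" can dip below zero; it only illustrates the covariance arithmetic.

```python
import math, random

def poisson_knuth(rng, lam):
    """Poisson sampler (Knuth's multiplication method; fine for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def toy_counts(n, lam_priv=5.0, lam_shared=3.0, seed=1):
    """Two signed 'counts' with private sources A, C and shared sources B, D.
    Cov(X, Y) = Var(B) - Var(D) = 0, yet X and Y are not independent."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        a = poisson_knuth(rng, lam_priv)
        c = poisson_knuth(rng, lam_priv)
        b = poisson_knuth(rng, lam_shared)
        d = poisson_knuth(rng, lam_shared)
        xs.append(a + b - d)   # D subtracts: negative shared dependence
        ys.append(c + b + d)   # D adds: positive shared dependence
    return xs, ys

def corr(xs, ys):
    """Sample linear correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cxy / math.sqrt(vx * vy)
```

With large samples the estimated correlation is essentially zero, whereas a higher-order dependence measure (e.g. the covariance of the squared deviations) stays clearly positive.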
Note that the bottom input population does not contradict Dale's principle, since excitatory neurons can project to both excitatory and inhibitory neurons.

We can find a copula family which can model this relation and has two separate parameters for the strengths of the input populations:

$$C^{cm}_{\theta_1,\theta_2}(u, v) = \frac{1}{2}\left(\max\left\{u^{-\theta_1} + v^{-\theta_1} - 1,\, 0\right\}\right)^{-1/\theta_1} + \frac{1}{2}\left(u - \left(\max\left\{u^{-\theta_2} + (1 - v)^{-\theta_2} - 1,\, 0\right\}\right)^{-1/\theta_2}\right), \qquad (2)$$

where $\theta_1, \theta_2 \in (0, \infty)$. It is a mixture of the well-known Clayton copula and a one-element survival transformation of the Clayton copula [16]. As a mixture of copulas this function is again a copula. A copula of this family is shown in Fig. 3(b).

Fig. 3(c) shows the correlation coefficients of the network-generated spike counts and of $C^{cm}_{\theta_1,\theta_2}$ fits. The rate of population D that introduces negative dependence is kept constant, while the rate of population B that introduces positive dependence is varied.

Figure 3: Strong dependence with zero correlation in a biological network model. (a): Neural network model used to generate synthetic spike count data. Two leaky integrate-and-fire neurons (LIF1 and LIF2, see Section 2.3) receive spike inputs (circles for excitation, bars for inhibition) from four separate populations of neurons (rectangular boxes and circles, A-D), but only two populations (B, D) send input to both neurons. All input spike trains were Poisson-distributed. (b): Probability density of the Clayton mixture model $C^{cm}_{\theta_1,\theta_2}$ with $\theta_1 = 1.5$ and $\theta_2 = 2.0$. (c): Correlation coefficients of network-generated spike counts compared to correlations of a maximum likelihood fit of the $C^{cm}_{\theta_1,\theta_2}$ copula family to these counts. Solid line: correlation coefficients of counts generated by the network shown in (a). Each neuron had a total inhibitory input rate of 300 Hz and a total excitatory input rate of 900 Hz. Population D had a rate of 150 Hz. We increased the absolute correlation between the spike counts by shifting the rates: we decreased the rates of A and C and increased the rate of B. The total simulation time amounted to 200 s. Spike counts were calculated for 100 ms bins. Dashed line: correlation coefficients of the first mixture component of $C^{cm}_{\theta_1,\theta_2}$. Dashed-dotted line: correlation coefficients of the second mixture component of $C^{cm}_{\theta_1,\theta_2}$.

The resulting spike count statistics were close to typically recorded data. At approximately 275 Hz the dependencies cancel each other out in the correlation coefficient. Nevertheless, the mixture components of the copula reveal that there are still dependencies: the correlation coefficient of the first mixture component that models negative dependence is relatively constant, while the correlation coefficient of the second mixture component increases with the rate of the corresponding input population. Therefore, spike count correlation coefficients that do not at all reflect the true strength of dependence are very likely to occur in biological networks. Structures similar to the investigated network can be formed in any feed-forward network that contains positive and negative weights.

Typically, the network structure is unknown.
Hence, it is hard to construct an appropriate copula that\nis parametrized such that individual dependence strengths are revealed. The goal of the next section\nis to assess a test that reveals whether the linear correlation coef\ufb01cient provides an appropriate\nmeasure for the dependence.\n\n4 Linear correlation test\n\nWe will now describe a test for bivariate distributions with Poisson-like marginals that determines\nwhether the dependence structure is well characterized by the linear correlation coef\ufb01cient. This test\ncombines a variant of the \u03c72 goodness-of-\ufb01t test for discrete multivariate data with a semiparametric\nmodel of linear dependence. We \ufb01t the semiparametric model to the data and we apply the goodness-\nof-\ufb01t test to see if the model is adequate for the data.\n\nThe semiparametric model that we use consists of the empirical marginals of the sample coupled by\na parametric copula family. A dependence structure is well characterized by the linear correlation\ncoef\ufb01cient if it is Gauss-like. So one way to test for linear dependence would be to use the Gaussian\ncopula family. However, the likelihood of copula-based models relies on the CDF which has no\nclosed form solution for the Gaussian family. Fortunately, a whole class of copula families that are\nGauss-like exists. The Frank family is in this class [24] and its CDF can be computed very ef\ufb01ciently.\nWe therefore selected this family for our test (see Eq. 1). The Frank copula has a scalar parameter \u03b8.\nThe parameter relates directly to the dependence. 
With growing $\theta$ the dependence increases strictly monotonically. For $\theta = 0$ the Frank copula corresponds to independence. Therefore, the usual $\chi^2$ independence test is a special case of our linear correlation test.

Figure 4: Percent acceptance of the linear correlation hypothesis for different copula-based models with different dependence strengths and Poisson marginals with rates $\lambda_1 = \lambda_2 = 5$. We used 100 repetitions each. The number of samples was varied between 128 and 512. On the x-axis we varied the strength of the dependence by means of the copula parameters. (a): Frank shuffle family with correlation kept at zero. (b): Clayton mixture family $C^{cm}_{\theta_1,\theta_2}$ with $\theta_1 = 2\theta_2$. (c): Frank family. (d): Gaussian family.

The parameter $\theta$ of the Frank family can be estimated based on a maximum likelihood fit. However, this is time-consuming. As an alternative we propose to estimate the copula parameter $\theta$ by means of Kendall's $\tau$.
Kendall's $\tau$ is a measure of dependence defined as $\tau(\vec x, \vec y) = \frac{c - d}{c + d}$, where $c$ is the number of elements in the set $\{(i, j) \mid (x_i < x_j \text{ and } y_i < y_j) \text{ or } (x_i > x_j \text{ and } y_i > y_j)\}$ and $d$ is the number of elements in the set $\{(i, j) \mid (x_i < x_j \text{ and } y_i > y_j) \text{ or } (x_i > x_j \text{ and } y_i < y_j)\}$ [16]. For the Frank copula with continuous marginals the relation between $\tau$ and $\theta$ is given by $\tau_\theta = 1 - \frac{4}{\theta}[1 - D_1(\theta)]$, where $D_k(x)$ is the Debye function $D_k(x) = \frac{k}{x^k} \int_0^x \frac{t^k}{\exp(t) - 1}\, dt$ [25]. For discrete marginals this is an approximate relation. Unfortunately, $\tau_\theta^{-1}$ cannot be expressed in closed form, but it can easily be obtained numerically, e.g. using Newton's method.

The goodness-of-fit test that we apply for this model is based on the $\chi^2$ test [26]. It is widely applied for testing goodness-of-fit or independence of categorical variables. For the test, observed frequencies are compared to expected frequencies using the following statistic:

$$X^2 = \sum_{i=1}^{k} \frac{(n_i - m_{0i})^2}{m_{0i}}, \qquad (3)$$

where the $n_i$ are the observed frequencies, the $m_{0i}$ are the expected frequencies, and $k$ is the number of bins. For a 2-dimensional table the sum is over both indices of the table. If the frequencies are large enough, then $X^2$ is approximately $\chi^2$-distributed with $df = (N-1)(M-1) - s$ degrees of freedom, where $N$ is the number of rows, $M$ is the number of columns, and $s$ is the number of parameters in the $H_0$ model (1 for the Frank family). Thus, for a given significance level $\alpha$ the test accepts the hypothesis $H_0$ that the observed frequencies are a sample from the distribution formed by the expected frequencies if $X^2$ is less than the $(1 - \alpha)$ point of the $\chi^2$-distribution with $df$ degrees of freedom.

The $\chi^2$ statistic is an asymptotic statistic. In order to be of any value, the frequencies in each bin must be large enough.
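The $\tau$-based estimation of $\theta$ just described can be sketched as follows (our own illustration: the Debye integral is evaluated with a simple trapezoidal rule, and $\tau(\theta)$ is inverted by bisection rather than Newton's method for robustness; this sketch covers $\tau > 0$ only):

```python
import math

def kendalls_tau(xs, ys):
    """Sample Kendall's tau over strictly concordant/discordant pairs."""
    c = d = 0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if s > 0:
                c += 1
            elif s < 0:
                d += 1
    return (c - d) / (c + d)

def debye1(x, steps=2000):
    """Debye function D1(x) = (1/x) * integral_0^x t/(e^t - 1) dt (trapezoidal rule)."""
    h = x / steps
    f = lambda t: 1.0 if t == 0 else t / math.expm1(t)
    s = 0.5 * (f(0.0) + f(x)) + sum(f(k * h) for k in range(1, steps))
    return s * h / x

def frank_tau(theta):
    """Kendall's tau of the Frank copula: tau = 1 - (4/theta) * (1 - D1(theta))."""
    return 1.0 - 4.0 / theta * (1.0 - debye1(theta))

def frank_theta_from_tau(tau, lo=1e-6, hi=100.0, iters=60):
    """Invert tau(theta) for tau in (0, 1) by bisection (tau is increasing in theta)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if frank_tau(mid) < tau:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In practice one would compute the sample $\tau$ of the two spike count vectors and map it to $\theta$ through `frank_theta_from_tau`.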
As a rule of thumb, each frequency should be at least 5 [26]. This cannot be accomplished for Poisson-like marginals, since there is an infinite number of bins. For such cases Loukas and Kemp [27] propose the ordered expected-frequencies procedure. The expected frequencies $m_0$ are sorted in monotonically decreasing order into a 1-dimensional array. The corresponding observed frequencies form another 1-dimensional array. Then the frequencies in both arrays are grouped from left to right such that the grouped $m_0$ frequencies reach a specified minimum expected frequency (MEF), e.g. MEF = 1 as in [27]. The $\chi^2$ statistic is then estimated using Eq. 3 with the grouped expected and grouped observed frequencies.

To verify the test we applied it to samples from copula-based distributions with Poisson marginals and four different copula families: the Frank shuffle family (Proposition 1), the Clayton mixture family (Eq. 2), the Frank family (Eq. 1), and the Gaussian family (Section 2.1). For the Frank family and the Gaussian family the linear correlation coefficient is well suited to characterize their dependence. We therefore expected that the test should accept $H_0$, regardless of the dependence strength. In contrast, for the Frank shuffle family and the Clayton mixture family the linear correlation does not reflect the dependence strength. Hence, the test should reject $H_0$ most of the time when there is dependence.

The acceptance rates for these copulas are shown in Fig. 4. For each of the families there was no dependence when the first copula parameter was equal to zero. The Frank and the Gaussian families have only Gauss-like dependence, meaning the correlation coefficient is well suited to describe the data. In all of these cases the achieved Type I error was small, i.e. the acceptance rate of $H_0$ was close to the desired value (0.95).
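The grouping step of the test can be sketched as follows (our own illustration of the ordered expected-frequencies idea; `mef` is the minimum expected frequency, and a remainder below `mef` is folded in as its own group in this sketch):

```python
def grouped_chi2(observed, expected, mef=1.0):
    """Chi-squared statistic with ordered expected-frequencies grouping:
    sort bins by expected frequency (descending), then merge consecutive
    bins until each group's expected mass reaches mef (Eq. 3 on groups)."""
    pairs = sorted(zip(expected, observed), key=lambda p: -p[0])
    x2 = 0.0
    e_acc = o_acc = 0.0
    for e, o in pairs:
        e_acc += e
        o_acc += o
        if e_acc >= mef:  # group complete: add its chi-squared contribution
            x2 += (o_acc - e_acc) ** 2 / e_acc
            e_acc = o_acc = 0.0
    if e_acc > 0:  # fold any remainder into the statistic
        x2 += (o_acc - e_acc) ** 2 / e_acc
    return x2
```

The resulting statistic is then compared against the $(1 - \alpha)$ point of the $\chi^2$-distribution with the degrees of freedom given above.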
The plots in (a) and (b) indicate the Type II errors: $H_0$ was accepted although the dependence structure of the counts was not Gauss-like. The Type II error decreased for increasing sample sizes. This is reasonable, since $X^2$ is only asymptotically $\chi^2$-distributed. Therefore, the test is unreliable when dependencies and sample sizes are both very small.

5 Conclusion

We investigated a worst-case scenario for reliance on the linear correlation coefficient for analyzing dependent spike counts using the Shannon information. The spike counts were uncorrelated but had a strong dependence. Thus, relying solely on the correlation coefficient would lead to an oversight of such dependencies. Although the counts were uncorrelated and had fixed marginals, the information varied by more than 25 %. Therefore, the dependence was not negligible in terms of the entropy. Furthermore, we could show that similar scenarios are very likely to occur in real biological networks. Our test provides a convenient tool to verify whether the correlation coefficient is the right measure for an assessment of the dependence. If the test rejects the Gauss-like dependence hypothesis, more elaborate measures of the dependence should be applied. An adequate copula family provides one way to find such a measure. In general, however, it is hard to find the right parametric family. Directions for future research include a systematic approach for handling the alternative case, when one has to deal with the full dependence structure, and a closer look at experimentally observed dependencies.

Acknowledgments. This work was supported by BMBF grant 01GQ0410.

A Proof of Proposition 1

Proof. We show that $C_{\theta_1,\theta_2,\omega}$ is a copula. Since $C_{\theta_1,\theta_2,\omega}$ is commutative, we assume w.l.o.g. $u \le v$. For $u = 0$ or $v = 0$ and for $u = 1$ or $v = 1$ we have $C_{\theta_1,\theta_2,\omega}(u, v) = C_{\theta_1}(u, v)$. Hence, property 1 follows directly from $C_{\theta_1}$.
It remains to show that $C_{\theta_1,\theta_2,\omega}$ is 2-increasing (property 2). We will show this in two steps:

1) We show that $C_{\theta_1,\theta_2,\omega}$ is continuous: For $\omega_2 = 1 - \omega$ and $u \in (\omega, \omega_2)$:
$$\lim_{t \nearrow \omega_2} C_{\theta_1,\theta_2,\omega}(u, t) = C_{\theta_1}(u, \omega_2) - \varsigma_{\theta_1}(\omega, \omega, u, \omega_2) + \frac{\varsigma_{\theta_1}(\omega, \omega, u, \omega_2)}{\varsigma_{\theta_2}(\omega, \omega, u, \omega_2)} \, \varsigma_{\theta_2}(\omega, \omega, u, \omega_2) = C_{\theta_1}(u, \omega_2).$$
For $v \in (\omega, 1 - \omega)$:
$$\lim_{t \searrow \omega} C_{\theta_1,\theta_2,\omega}(t, v) = C_{\theta_1}(\omega, v) - \varsigma_{\theta_1}(\omega, \omega, \omega, v) + \lim_{t \searrow \omega} \frac{\varsigma_{\theta_1}(\omega, \omega, t, 1 - \omega)}{\varsigma_{\theta_2}(\omega, \omega, t, 1 - \omega)} \, \varsigma_{\theta_2}(\omega, \omega, t, v).$$
We can use l'Hôpital's rule, since $\lim_{t \searrow \omega} \varsigma_{\theta}(\omega, \omega, t, 1 - \omega) = 0$. It is easy to verify that
$$\frac{\partial C_{\theta}}{\partial u}(v) = \frac{e^{-\theta u}(e^{-\theta v} - 1)}{e^{-\theta} - 1 + (e^{-\theta u} - 1)(e^{-\theta v} - 1)}.$$
Thus, the quotient is constant and $\lim_{t \searrow \omega} C_{\theta_1,\theta_2,\omega}(t, v) = C_{\theta_1}(\omega, v) - 0 + 0$.

2) $C_{\theta_1,\theta_2,\omega}$ has non-negative density almost everywhere on $[0, 1]^2$. This is obvious for $(u_1, v_1) \notin [\omega, 1 - \omega]^2$, because $C_{\theta_1}$ is a copula. Straightforward but tedious algebra shows that $\forall (u_1, v_1) \in (\omega, 1 - \omega)^2: \frac{\partial^2 C_{\theta_1,\theta_2,\omega}}{\partial u \, \partial v}(u_1, v_1) \ge 0$.

Thus, $C_{\theta_1,\theta_2,\omega}$ is continuous and has a density almost everywhere on $[0, 1]^2$ and is therefore 2-increasing.

References

[1] M. Jazayeri and J. A. Movshon. Optimal representation of sensory information by neural populations. Nature Neuroscience, 9(5):690–696, 2006.

[2] L. Schwabe and K. Obermayer.
Adaptivity of tuning functions in a generic recurrent network model of a cortical hypercolumn. Journal of Neuroscience, 25(13):3323–3332, 2005.

[3] D. A. Gutnisky and V. Dragoi. Adaptive coding of visual information in neural populations. Nature, 452(7184):220–224, 2008.

[4] M. Shamir and H. Sompolinsky. Implications of neuronal diversity on population coding. Neural Computation, 18(8):1951–1986, 2006.

[5] P. Series, P. E. Latham, and A. Pouget. Tuning curve sharpening for orientation selectivity: coding efficiency and the impact of correlations. Nature Neuroscience, 7(10):1129–1135, 2004.

[6] L. F. Abbott and P. Dayan. The effect of correlated variability on the accuracy of a population code. Neural Computation, 11(1):91–101, 1999.

[7] B. B. Averbeck, P. E. Latham, and A. Pouget. Neural correlations, population coding and computation. Nature Reviews Neuroscience, 7(5):358–366, 2006.

[8] Y. Roudi, S. Nirenberg, and P. E. Latham. Pairwise maximum entropy models for studying large biological systems: When they can work and when they can't. PLoS Computational Biology, 5(5):e1000380, 2009.

[9] E. Schneidman, M. J. Berry II, R. Segev, and W. Bialek. Weak pairwise correlations imply strongly correlated network states in a neural population. Nature, 440:1007–1012, 2006.

[10] J. Shlens, G. D. Field, J. L. Gauthier, M. I. Grivich, D. Petrusca, E. Sher, A. M. Litke, and E. J. Chichilnisky. The structure of multi-neuron firing patterns in primate retina. Journal of Neuroscience, 26(32):8254–8266, 2006.

[11] B. B. Averbeck and D. Lee. Neural noise and movement-related codes in the macaque supplementary motor area. Journal of Neuroscience, 23(20):7630–7641, 2003.

[12] S. Panzeri, G. Pola, F. Petroni, M. P. Young, and R. S. Petersen. A critical assessment of different measures of the information carried by correlated neuronal firing.
Biosystems, 67(1-3):177–185, 2002.

[13] H. Sompolinsky, H. Yoon, K. Kang, and M. Shamir. Population coding in neuronal systems with correlated noise. Physical Review E, 64(5):051904, 2001.

[14] A. Kohn and M. A. Smith. Stimulus dependence of neuronal correlation in primary visual cortex of the macaque. Journal of Neuroscience, 25(14):3661–3673, 2005.

[15] W. Bair, E. Zohary, and W. T. Newsome. Correlated firing in macaque visual area MT: time scales and relationship to behavior. Journal of Neuroscience, 21(5):1676–1697, 2001.

[16] R. B. Nelsen. An Introduction to Copulas. Springer, New York, second edition, 2006.

[17] M. J. Frank. On the simultaneous associativity of F(x, y) and x + y − F(x, y). Aequationes Mathematicae, 19:194–226, 1979.

[18] A. Onken, S. Grünewälder, M. Munk, and K. Obermayer. Modeling short-term noise dependence of spike counts in macaque prefrontal cortex. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21, pages 1233–1240, 2009.

[19] P. Berkes, F. Wood, and J. Pillow. Characterizing neural dependencies with copula models. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21, pages 129–136, 2009.

[20] D. J. Tolhurst, J. A. Movshon, and A. F. Dean. The statistical reliability of signals in single neurons in cat and monkey visual cortex. Vision Research, 23:775–785, 1983.

[21] C. E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379–423, 1948.

[22] P. Dayan and L. F. Abbott. Theoretical Neuroscience. MIT Press, Cambridge, MA, 2001.

[23] R. A. Waltz, J. L. Morales, J. Nocedal, and D. Orban. An interior algorithm for nonlinear optimization that combines line search and trust region steps. Mathematical Programming, 107(3):391–408, 2006.

[24] C. Genest, B.
R\u00b4emillard, and D. Beaudoin. Goodness-of-\ufb01t tests for copulas: A review and a power study.\n\nInsurance: Mathematics and Economics, 44(2):199\u2013213, 2009.\n\n[25] C. Genest. Frank\u2019s family of bivariate distributions. Biometrika, 74:549\u2013555, 1987.\n[26] W. G. Cochran. The \u03c7\n2 test of goodness of \ufb01t. Annals of Mathematical Statistics, 23(3):315\u2013345, 1952.\n[27] S. Loukas and C. D. Kemp. On the chi-square goodness-of-\ufb01t statistic for bivariate discrete distributions.\n\nThe Statistician, 35:525\u2013529, 1986.\n\n\f", "award": [], "sourceid": 770, "authors": [{"given_name": "Arno", "family_name": "Onken", "institution": null}, {"given_name": "Steffen", "family_name": "Gr\u00fcnew\u00e4lder", "institution": null}, {"given_name": "Klaus", "family_name": "Obermayer", "institution": null}]}