{"title": "Modeling Short-term Noise Dependence of Spike Counts in Macaque Prefrontal Cortex", "book": "Advances in Neural Information Processing Systems", "page_first": 1233, "page_last": 1240, "abstract": "Correlations between spike counts are often used to analyze neural coding. The noise is typically assumed to be Gaussian. Yet, this assumption is often inappropriate, especially for low spike counts. In this study, we present copulas as an alternative approach. With copulas it is possible to use arbitrary marginal distributions such as Poisson or negative binomial that are better suited for modeling noise distributions of spike counts. Furthermore, copulas place a wide range of dependence structures at the disposal and can be used to analyze higher order interactions. We develop a framework to analyze spike count data by means of copulas. Methods for parameter inference based on maximum likelihood estimates and for computation of Shannon entropy are provided. We apply the method to our data recorded from macaque prefrontal cortex. The data analysis leads to three significant findings: (1) copula-based distributions provide better fits than discretized multivariate normal distributions; (2) negative binomial margins fit the data better than Poisson margins; and (3) a dependence model that includes only pairwise interactions overestimates the information entropy by at least 19% compared to the model with higher order interactions.", "full_text": "Modeling Short-term Noise Dependence of Spike Counts in Macaque Prefrontal Cortex\n\nArno Onken\n\nTechnische Universit\u00e4t Berlin\n\n/ BCCN Berlin\n\nSteffen Gr\u00fcnew\u00e4lder\n\nTechnische Universit\u00e4t Berlin\n\nFranklinstr. 28/29, 10587 Berlin, Germany\n\naonken@cs.tu-berlin.de\n\ngruenew@cs.tu-berlin.de\n\nMatthias Munk\n\nMPI for Biological Cybernetics\n\nSpemannstr. 
38, 72076 T\u00fcbingen, Germany\nmatthias.munk@tuebingen.mpg.de\n\nKlaus Obermayer\n\nTechnische Universit\u00e4t Berlin\n\n/ BCCN Berlin\n\noby@cs.tu-berlin.de\n\nAbstract\n\nCorrelations between spike counts are often used to analyze neural coding. The\nnoise is typically assumed to be Gaussian. Yet, this assumption is often inappropriate,\nespecially for low spike counts. In this study, we present copulas as an\nalternative approach. With copulas it is possible to use arbitrary marginal distributions\nsuch as Poisson or negative binomial that are better suited for modeling\nnoise distributions of spike counts. Furthermore, copulas offer a wide range of\ndependence structures and can be used to analyze higher order interactions.\nWe develop a framework to analyze spike count data by means of copulas.\nMethods for parameter inference based on maximum likelihood estimates\nand for computation of mutual information are provided. We apply the method\nto our data recorded from macaque prefrontal cortex. The data analysis leads to\nthree findings: (1) copula-based distributions provide significantly better fits than\ndiscretized multivariate normal distributions; (2) negative binomial margins fit the\ndata significantly better than Poisson margins; and (3) the dependence structure\ncarries 12% of the mutual information between stimuli and responses.\n\n1 Introduction\n\nUnderstanding neural coding is at the heart of theoretical neuroscience. Analyzing spike counts of\na population is one way to gain insight into neural coding properties. Even when the same stimulus\nis presented repeatedly, responses from the neurons vary, i.e. from trial to trial responses of neurons\nare subject to noise. The noise variations of neighboring neurons are typically correlated (noise\ncorrelations). Due to their relevance for neural coding, noise correlations have been the subject of a\nconsiderable number of studies (see [1] for a review). 
However, these studies always assumed Gaussian\nnoise. Thus, correlated spike rates were generally modeled by multivariate normal distributions with\na specific covariance matrix that describes all pairwise linear correlations.\n\nFor long time intervals or high firing rates, the average number of spikes is sufficiently large for the\ncentral limit theorem to apply and thus the normal distribution is a good approximation for the spike\ncount distributions. However, several experimental findings suggest that noise correlations as well\nas sensory information processing predominantly take place on a shorter time scale, on the order of\ntens to hundreds of milliseconds [2, 3]. It is therefore questionable if the normal distribution is still\nan appropriate approximation and if the results of studies based on Gaussian noise apply to short\ntime intervals and low firing rates.\n\n[Figure 1, panels (a)-(d). Axes: N1 [#Spikes/Bin] and N2 [#Spikes/Bin].]\n\nFigure 1: (a): Recording of correlated spike trains from two neurons and conversion to spike counts.\n(b): The distributions of the spike counts of a neuron pair from the data described in Section 4 for\n100 ms time bins. Dark squares represent a high number of occurrences of corresponding pairs of\nspike counts. One can see that the spike counts are correlated since the ratios are high near the\ndiagonal. 
The distributions of the individual spike counts are plotted below and left of the axes.\n(c): Density of a fit with a bivariate normal distribution.\n(d): Distribution of a fit with negative\nbinomial margins coupled with the Clayton copula.\n\nThis is due to several major drawbacks of the multivariate normal distribution: (1) Its margins\nare continuous with a symmetric shape, whereas empirical distributions of real spike counts tend\nto have a positive skew, i.e. the mass of the distribution is concentrated at low counts with a long\ntail toward higher counts.\nMoreover, the normal distribution allows negative values which are not meaningful for spike counts.\nEspecially for low rates, this can become a major issue, since the probability of negative values\nwill be high. (2) The dependence structure of a multivariate normal distribution is always elliptical,\nwhereas spike counts of short time bins can have a bulb-shaped dependence structure (see Fig. 1b).\n(3) The multivariate normal distribution does not allow higher order correlations of its elements.\nInstead, only pairwise correlations can be modeled. It was shown that pairwise interactions are\nsufficient for retinal ganglion cells and cortex cells in vitro [4]. However, there is evidence that\nthey are insufficient for subsequent cortex areas in vivo [5]. We will show that our data recorded in\nprefrontal cortex suggest that higher order interactions (which involve more than two neurons) do\nplay an important role in the prefrontal cortex as well.\n\nIn this paper, we present a method that addresses the above shortcomings of the multivariate normal\ndistribution. We apply copulas [6] to form multivariate distributions with a rich set of dependence\nstructures and discrete marginal distributions, including the Poisson distribution. Copulas were\npreviously applied to model the distribution of continuous first-spike-latencies [7]. 
Here we apply\nthis concept to spike counts.\n\n2 Copulas\n\nWe give an informal introduction to copulas and apply the concept to a pair of neurons from our data\nwhich are described and fully analyzed in Section 4. Formal details of copulas follow in Section 3.2.\n\nA copula is a cumulative distribution function that can couple arbitrary marginal distributions. There\nare many families of copulas, each with a different dependence structure. Some families have an\nelliptical dependence structure, similar to the multivariate normal distribution. However, it is also\npossible to use completely different dependence structures which are more appropriate for the data\nat hand.\n\nAs an example, consider the modeling of spike count dependencies of two neurons (Fig. 1). Spike\ntrains are recorded from the neurons and transformed to spike counts (Fig. 1a). Counting leads to a\nbivariate empirical distribution (Fig. 1b). The distribution of the counts depends on the length of the\ntime bin that is used to count the spikes, here 100 ms. In the case considered, the correlation at low\ncounts is higher than at high counts. This is called lower tail dependence.\n\nThe density of a typical population model based on the multivariate normal (MVN) distribution\nis shown in Fig. 1c. Here, we did not discretize the distribution since the standard approach to\ninvestigate noise correlations also uses the continuous distribution [1]. The mean and covariance\nmatrix of the MVN distribution correspond to the sample mean and the sample covariances of the\nempirical distribution. Yet, the dependence structure does not reflect the true dependence structure\nof the counts. In contrast, the spike count probabilities for a copula-based distribution (Fig. 1d) correspond\nwell to the empirical distribution in Fig. 1b.\n\nThe modeling of spike count data with the help of a copula is done in three steps: (1) A marginal\ndistribution, e.g. 
a Poisson or a negative binomial distribution, is chosen based on the spike count\ndistribution of the individual neurons.\n(2) The counts are transformed to probabilities using the\ncumulative distribution function of the marginal distribution. (3) The probabilities and thereby the\ncumulative marginal distributions are coupled with the help of a so-called copula function. As an\nexample, consider the Clayton copula family [6]. For two variables the copula is given by\n\n$$C(p_1, p_2, \\alpha) = \\frac{1}{\\sqrt[\\alpha]{\\max\\left\\{\\frac{1}{p_1^\\alpha} + \\frac{1}{p_2^\\alpha} - 1,\\, 0\\right\\}}},$$\n\nwhere $p_i$ denotes the probability of the spike count $X_i$ of the $i$th neuron being lower or equal to\n$r_i$ (i.e. $p_i = P(X_i \\le r_i)$). Note that there are generalizations to more than two margins (see\nSection 3.2). The function $C(p_1, p_2, \\alpha)$ generates a joint cumulative distribution function by coupling\nthe margins and thereby introduces correlations of second and higher order between the spike\ncount variables. The ratio of the joint probability that corresponds to statistically independent spike\ncounts $P(X_1 \\le r_1, X_2 \\le r_2) = p_1 p_2$ and the dependence introduced by the Clayton copula (for\n$\\frac{1}{p_1^\\alpha} + \\frac{1}{p_2^\\alpha} - 1 \\ge 0$) is given by\n\n$$\\frac{p_1 p_2}{C(p_1, p_2, \\alpha)} = p_1 p_2 \\sqrt[\\alpha]{\\frac{1}{p_1^\\alpha} + \\frac{1}{p_2^\\alpha} - 1} = \\sqrt[\\alpha]{p_1^\\alpha + p_2^\\alpha - p_1^\\alpha p_2^\\alpha}.$$\n\nSuppose that $\\alpha$ is positive. Since $p_i \\in [0, 1]$ the deviation from the ratio 1 will be larger for small\nprobabilities. Thus, the copula generates correlations whose strengths depend on the magnitude of\nthe probabilities. The probability mass function (Fig. 1d) can then be calculated from the cumulative\nprobability using the difference scheme as described in Section 3.4. 
Care must be taken whenever\ncopulas are applied to form discrete distributions: while for continuous distributions typical measures\nof dependence are determined by the copula function $C$ only, these measures are affected by\nthe shape of the marginal distributions in the discrete case [8].\n\n3 Parametric spike count models and model selection procedure\n\nWe will now describe the formal aspects of the multivariate normal distribution on the one hand and\ncopula-based models as the proposed alternative on the other hand, both in terms of their application\nto spike counts.\n\n3.1 The discretized multivariate normal distribution\n\nThe MVN distribution is continuous and needs to be discretized (and rectified) before it can be applied\nto spike count data (which are discrete and non-negative). The cumulative distribution function\n(cdf) of the spike count vector $\\vec{X}$ is then given by\n\n$$F_{\\vec{X}}(r_1, \\dots, r_d) = \\begin{cases} \\Phi_{\\mu,\\Sigma}(\\lfloor r_1 \\rfloor, \\dots, \\lfloor r_d \\rfloor), & \\text{if } \\forall i \\in \\{1, \\dots, d\\}: r_i \\ge 0 \\\\ 0, & \\text{otherwise} \\end{cases}$$\n\nwhere $\\lfloor \\cdot \\rfloor$ denotes the floor operation for the discretization, $\\Phi_{\\mu,\\Sigma}$ denotes the cdf of the MVN distribution\nwith mean $\\mu$ and covariance matrix $\\Sigma$, and $d$ denotes the dimension of the multivariate\ndistribution and corresponds to the number of neurons that are modeled. Note that $\\mu$ is no longer the\nmean of $\\vec{X}$. The mean is shifted to greater values as $\\Phi_{\\mu,\\Sigma}$ is rectified (negative values are cut off).\nThis deviation grows with the dimension $d$. According to the central limit theorem, the distribution\nof spike counts approaches the MVN distribution only for large counts.\n\n3.2 Copula-based models\n\nFormally, a copula $C$ is a cdf with uniform margins. It can be used to couple marginal cdf\u2019s\n$F_{X_1}, \\dots, F_{X_d}$ to form a joint cdf $F_{\\vec{X}}$, such that\n\n$$F_{\\vec{X}}(r_1, \\dots, r_d) = C(F_{X_1}(r_1), \\dots, F_{X_d}(r_d))$$\n\nholds [6]. 
There are many families of copulas with different dependence shapes and different numbers\nof parameters, e.g. the multivariate Clayton copula family with a scalar parameter $\\alpha$:\n\n$$C_\\alpha(\\vec{u}) = \\left(\\max\\left\\{1 - d + \\sum_{i=1}^{d} u_i^{-\\alpha},\\, 0\\right\\}\\right)^{-1/\\alpha}.$$\n\nThus, for a given realization $\\vec{r}$, which can represent the counts of two neurons, we can set\n$u_i = F_{X_i}(r_i)$ and $F_{\\vec{X}}(\\vec{r}) = C_\\alpha(\\vec{u})$, where $F_{X_i}$ can be arbitrary univariate cdf\u2019s. Thereby, we can generate\na multivariate distribution with specific margins $F_{X_i}$ and a dependence structure determined by $C$.\nIn the case of discrete marginal distributions, however, typical measures of dependence, such as the\nlinear correlation coefficient or Kendall\u2019s $\\tau$, are affected by the shape of these margins [8]. Note\nthat $\\alpha$ does not only control the strength of pairwise interactions but also the degree of higher order\ninteractions.\nAnother copula family is the Farlie-Gumbel-Morgenstern (FGM) copula [6]. It is special in that it\nhas $2^d - d - 1$ parameters that individually determine the pairwise and higher order interactions. Its\ncdf takes the form\n\nsubject to the constraints\n\nC~\u03b1(~u) =\uf8eb\uf8ed1 +\ndXk=2 X1\u2264j1<\u00b7\u00b7\u00b7