{"title": "Synergy and Redundancy among Brain Cells of Behaving Monkeys", "book": "Advances in Neural Information Processing Systems", "page_first": 111, "page_last": 117, "abstract": null, "full_text": "Synergy and redundancy among brain cells of behaving monkeys \n\nItay Gat\u00b7 \n\nInstitute of Computer Science and Center for Neural Computation \n\nThe Hebrew University, Jerusalem 91904, Israel \n\nNaftali Tishby\u2020 \n\nNEC Research Institute \n\n4 Independence Way \nPrinceton NJ 08540 \n\nAbstract \n\nDetermining the relationship between the activity of a single nerve cell and that of an entire population is a fundamental question that bears on the basic neural computation paradigms. In this paper we apply an information theoretic approach to quantify the level of cooperative activity among cells in a behavioral context. It is possible to discriminate between synergetic and redundant activity of the cells, depending on the difference between the information they provide when measured jointly and the information they provide independently. We define a synergy value that is positive in the first case and negative in the second, and show that the synergy value can be measured by detecting the behavioral mode of the animal from simultaneously recorded activity of the cells. We observe that positive synergy can be found among cortical cells, while cells from the basal ganglia, active during the same task, do not exhibit similar synergetic activity. \n\n{itay,tishby}@cs.huji.ac.il \nPermanent address: Institute of Computer Science and Center for Neural Computation, The Hebrew University, Jerusalem 91904, Israel. \n\n1 Introduction \n\nMeasuring ways by which several neurons in the brain participate in a specific computational task can shed light on fundamental neural information processing mechanisms. 
While it is unlikely that complete information from any macroscopic neural tissue will ever be available, some interesting insight can be obtained from simultaneously recorded cells in the cortex of behaving animals. The question we address in this study is the level of synergy, or the level of cooperation, among brain cells, as determined by the information they provide about the observed behavior of the animal. \n\n1.1 The experimental data \n\nWe analyze simultaneously recorded units from behaving monkeys during a delayed response behavioral experiment. The data was collected at the high brain function laboratory of the Hadassah Medical School of the Hebrew University [1, 2]. In this task the monkey had to remember the location of a visual stimulus and respond by touching that location after a delay of 1-32 sec. Correct responses were rewarded by a drop of juice. In one set of recordings six micro-electrodes were inserted simultaneously into the frontal or prefrontal cortex [1, 3]. In another set of experiments the same behavioral paradigm was used and recordings were taken from the striatum, which is the first station in the basal ganglia (a sub-cortical ganglia) [2]. The cells recorded in the striatum were the tonically active neurons [2], which are known to be the cholinergic inter-neurons of the striatum. These cells are known to respond to reward. \n\nThe monkeys were trained to perform the task in two alternating modes, \"Go\" and \"No-Go\" [1]. Both sets of behavioral modes can be detected from the recorded spike trains using several statistical modeling techniques that include Hidden Markov Models (HMM) and Post Stimulus Time Histograms (PSTH). The details of these detection methods are reported elsewhere [4, 5]. For this paper it is important to know that we can significantly detect the correct behavior; for example, in the \"Go\" vs. 
the \"No-Go\" case, correct detection is achieved about 90% of the time, where the chance level is 50% and the monkey's average performance is 95% correct on this task. \n\n2 Theoretical background \n\nOur measure of synergy level among cells is information theoretic and was recently proposed by Brenner et al. [6] for analysis of spikes generated by a single neuron. This is the first application of this measure to quantify cooperativity among neurons. \n\n2.1 Synergy and redundancy \n\nA fundamental quantity in information theory is the mutual information between two random variables X and Y. It is defined as the cross-entropy (Kullback-Leibler divergence) between the joint distribution of the variables, p(x, y), and the product of the marginal distributions p(x)p(y). As such it measures the statistical dependence of the variables X and Y. It is symmetric in X and Y and has the following familiar relations to their entropies [7]: \n\n$$I(X;Y) = D_{KL}[P(X,Y) \\,\\|\\, P(X)P(Y)] = \\sum_{x,y} P(x,y) \\log \\frac{P(x,y)}{P(x)P(y)} = H(X) + H(Y) - H(X,Y) = H(X) - H(X|Y) = H(Y) - H(Y|X). \\qquad (1)$$ \n\nWhen given three random variables $X_1$, $X_2$ and Y, one can consider the mutual information between the joint variables $(X_1, X_2)$ and the variable Y, $I(X_1, X_2; Y)$ (notice the position of the semicolon), as well as the mutual informations $I(X_1; Y)$ and $I(X_2; Y)$. Similarly, one can consider the mutual information between $X_1$ and $X_2$ conditioned on a given value of Y = y, $I_y(X_1; X_2) \\equiv I(X_1; X_2|y) = D_{KL}[P(X_1, X_2|y) \\,\\|\\, P(X_1|y)P(X_2|y)]$, as well as its average, the conditional mutual information, \n\n$$I(X_1; X_2|Y) = \\sum_y p(y) I_y(X_1; X_2).$$ \n\nFollowing Brenner et al. [6] we define the synergy level of $X_1$ and $X_2$ with respect to the variable Y as \n\n$$\\mathrm{Syn}_Y(X_1, X_2) = I(X_1, X_2; Y) - (I(X_1; Y) + I(X_2; Y)), \\qquad (2)$$ \n\nwith the natural generalization to more than two variables X. 
This expression can be rewritten in terms of entropies and conditional information as follows: \n\n$$\\mathrm{Syn}_Y(X_1, X_2) = H(X_1, X_2) - H(X_1, X_2|Y) - ((H(X_1) - H(X_1|Y)) + (H(X_2) - H(X_2|Y))) = \\underbrace{H(X_1|Y) + H(X_2|Y) - H(X_1, X_2|Y)}_{\\text{depends on } Y} + \\underbrace{H(X_1, X_2) - (H(X_1) + H(X_2))}_{\\text{independent of } Y}. \\qquad (3)$$ \n\nWhen the variables exhibit positive synergy value, with respect to the variable Y, they jointly provide more information on Y than when considered independently, as expected in synergetic cases. Negative synergy values correspond to redundancy: the variables do not provide independent information about Y. Zero synergy value is obtained when the variables are independent of Y or when there is no change in their dependence when conditioned on Y. We claim that this is a useful measure of cooperativity among neurons, in a given computational task. \n\nIt is clear from Eq. (3), whose two bracketed terms equal $I(X_1; X_2|Y)$ and $-I(X_1; X_2)$ respectively, that if \n\n$$I_y(X_1; X_2) = I(X_1; X_2) \\;\\; \\forall y \\in Y \\;\\Rightarrow\\; \\mathrm{Syn}_Y(X_1, X_2) = 0, \\qquad (4)$$ \n\nsince in that case $\\sum_y p(y) I_y(X_1; X_2) = I(X_1; X_2)$. In other words, the synergy value is nonzero only if the statistical dependence, hence the mutual information between the variables, is affected by the value of Y. It is positive when the mutual information increases, on the average, when conditioned on Y, and negative if this conditional mutual information decreases. Notice that the value of synergy can be both positive and negative since information, unlike entropy, is not sub-additive in the X variables. \n\n3 Synergy among neurons \n\nOur measure of synergy among the units is based on the ability to detect the behavioral mode from the recorded activity, as we discuss below. As discussed above, synergy among neurons is possible only if their statistical dependence changes with time. An important case where synergy is not expected is pure \"population coding\" [8]. 
In this case the cells are expected to fire independently, each with its own fixed tuning curve. Our synergy value can thus be used to test if the recorded units are indeed participating in a pure population code of this kind, as hypothesized for certain motor cortical activity. \n\nTheoretical models of the cortex that clearly predict nonzero synergy include attractor neural networks (ANN) [9] and synfire chain models (SFC) [3]. Both these models predict changes in the collective activity patterns, as neurons move between attractors in the ANN case, or when different synfire chains of activity are born or disappear in the SFC case. To the extent that such changes in the collective activity depend on behavior, nonzero synergy values can be detected. It remains an interesting theoretical challenge to estimate the quantitative synergy values for such models and compare them to observed quantities. \n\n3.1 Time-dependent cross correlations \n\nIn our previous studies [4] we demonstrated, using hidden Markov models of the activity, that the pairwise cross-correlations in the same data can change significantly with time, depending on the underlying collective state of activity. These states, revealed by the hidden Markov model, in turn depend on the behavior and enable its prediction. Dramatic and fast changes in the cross-correlation of cells have also been shown by others [10]. This finding indicates directly that the statistical dependence of the neurons can change (rapidly) with time, in a way correlated to behavior. This clearly suggests that nonzero synergy should be observed among these cortical units, relative to this behavior. In the present study this theoretical hypothesis is verified. \n\n3.2 Redundancy cases \n\nIf on the other hand the conditioned mutual information equals zero for all behavioral modes, i.e. 
$I_y(X_1; X_2) = 0 \\;\\; \\forall y \\in Y$, while $I(X_1; X_2) > 0$, we expect to get negative synergy, or redundancy among the cells, with respect to the behavior variable Y. We observed clear redundancy in another part of the brain, the basal ganglia, during the same experiment, when the behavior was the pre-reward and post-reward activity. In this case different cells provide exactly the same information, which yields negative synergy values. \n\n4 Experimental results \n\n4.1 Synergy measurement in practice \n\nTo evaluate the synergy value among different cells, it is necessary to estimate the conditional distribution p(y|x) where y is the current behavior and x represents a single trial of spike trains of the considered cells. Estimating this probability, however, requires an underlying statistical model, or a representation of the spike trains. Otherwise there is never enough data since cortical spike trains are never exactly reproducible. In this work we choose the rate representation, which is the simplest to evaluate. The estimation of p(y|x) goes as follows: \n\n\u2022 For each of the M behavioral modes ($y_1, y_2, \\ldots, y_M$) collect spike train samples (the training data set). \n\n\u2022 Using the training sample, construct a Post Stimulus Time Histogram (PSTH), i.e. the rate as function of time, for each behavioral mode. \n\n\u2022 Given a spike train, outside of the training set, compute its probability to result from each of the M modes. \n\n\u2022 The spike train is considered correctly classified if the most probable mode is in fact the true behavioral mode, and incorrectly classified otherwise. \n\n\u2022 The fraction of correct classification, for all spike trains of a given behavioral mode $y_i$, is taken as the estimate of $p(y_i|x)$, and denoted $P_{c_i}$, where $c_i$ is the identity of the cells used in the computation. 
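A minimal sketch of the estimation steps above, in Python with NumPy. It assumes details the text does not specify: spike trains are binned into counts, each bin is modeled as an independent Poisson variable whose mean is the PSTH of the mode, and the modes have equal priors. The function names (build_psth, log_likelihood, classify, detection_rates) and the synthetic data are hypothetical, chosen only for illustration; this is not the authors' implementation, whose details are deferred to [4, 5].

```python
import numpy as np

def build_psth(trials):
    # trials: array (n_trials, n_bins) of spike counts for one behavioral mode.
    # The PSTH is the mean count per bin; the small floor avoids log(0) for
    # silent bins (an assumption of this sketch).
    return np.maximum(np.mean(trials, axis=0), 1e-3)

def log_likelihood(trial, psth):
    # Poisson log-likelihood of the binned counts. The count-factorial term is
    # dropped because it is identical for every mode and cannot change the argmax.
    return float(np.sum(trial * np.log(psth) - psth))

def classify(trial, psths):
    # Index of the most probable mode, assuming equal prior probabilities.
    return int(np.argmax([log_likelihood(trial, p) for p in psths]))

def detection_rates(train, test):
    # train, test: lists holding one (n_trials, n_bins) array per mode.
    # Returns, for each mode, the fraction of held-out trials classified correctly.
    psths = [build_psth(t) for t in train]
    return [float(np.mean([classify(trial, psths) == mode for trial in trials]))
            for mode, trials in enumerate(test)]

# Synthetic two-mode example: 20 bins, 50 training and 50 test trials per mode.
rng = np.random.default_rng(0)
mode_rates = [np.full(20, 0.5), np.full(20, 2.0)]
train = [rng.poisson(r, size=(50, 20)) for r in mode_rates]
test = [rng.poisson(r, size=(50, 20)) for r in mode_rates]
print(detection_rates(train, test))
```

For several cells, one option under the same independence assumption is to concatenate the binned counts of the cells along the bin axis before calling these functions; a model that captures dependence between cells would need a different likelihood.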
\n\nFor the case of only two categories of behavior and for a uniform distribution of the different categories, the value of the entropy H(Y) is the same for all combinations of cells, and is simply $H(Y) = -\\sum_y p(y) \\log_2 p(y) = \\log_2 2 = 1$. Writing each mutual information as I = H(Y) - H(Y|X), the full expression (in bits) for the synergy value can thus be written as follows: \n\n$$\\mathrm{Syn}_Y(X_1, X_2) = -\\sum_x p(x) \\Big[ -\\sum_y P_{c_{1,2}} \\log_2 P_{c_{1,2}} \\Big] - \\Big( 1 - \\sum_x p(x) \\Big[ -\\sum_y P_{c_1} \\log_2 P_{c_1} \\Big] - \\sum_x p(x) \\Big[ -\\sum_y P_{c_2} \\log_2 P_{c_2} \\Big] \\Big), \\qquad (5)$$ \n\nwhere $P_{c_{1,2}}$, $P_{c_1}$ and $P_{c_2}$ are the detection probabilities obtained from both cells jointly and from each cell alone. If the first expression (the joint term) is larger than the second (the parenthesized single-cell term) then there is (positive) synergy, and vice versa for redundancy. However there is one very important caveat. As we saw, the computation of the mutual information is not done exactly, and what one really computes is only a lower bound. If the bound is tighter for the multiple cell calculation, the method could falsely infer positive synergy, and if the bound is tighter for the single cell computation, the method could falsely infer negative synergy. In previous works we have shown that the method we use for this estimation is quite reasonable and robust [5]; therefore, we believe that we have even a conservative (i.e. less positive) estimate of synergy. \n\n4.2 Observed synergy values \n\nIn the first set of experiments we tried to detect the behavioral mode during the delay period of correct trials. In this case the two types of behavior were the \"Go\" and the \"No-Go\" described in the introduction. An example of this detection problem is given in figure 1A. In this figure there are 100 examples of multi-electrode recording of spike trains during the delay period. On the left is the \"Go mode\" data and on the right the \"No-Go mode\", for two cells. On the lower part there is an example of two single spike trains that need to be classified by the mode models. \n\n[Figure 1 panels: A, \"Go\" mode and \"No-Go\" mode; B, pre-reward and post-reward; single trials no. 1 and no. 2 shown below the rasters.] \n\nFigure 1: Raster displays of simultaneously recorded cells in the 2 different areas; in each area there were 2 behavioral modes. \n\nTable 1 gives some examples of detection results obtained by using 2 cells independently, and by using their joint combination. It can be seen that the synergy is positive and significant. We examined 19 recording sessions of the same behavioral modes for two different animals and evaluated the synergy value. 
In 18 out of the 19 sessions there was at least one example of significant positive synergy among the cells. \n\nFor comparison we analyzed another set of experiments in which the data was recorded from the striatum in the basal ganglia. An example for this detection is shown in figure 1B. The behavioral modes were the \"pre-reward\" vs. the \"post-reward\" periods. Nine recording sessions for the two different monkeys were examined using the same detection technique. Although the detection results improve when the number of cells increases, in none of these recordings was a positive synergy value found. For most of the data the synergy value was close to zero, i.e. the mutual information among two cells jointly was close to the sum of the mutual information of the independent cells, as expected when the cells exhibit (conditionally) independent activity. \n\nThe prevailing difference between the synergy measurements in the cortex and in the tonically active neurons (TANs) of the basal ganglia is also strengthened by the different mechanisms underlying those cells. The TANs are assumed to be global mediators of information in the striatum, a relatively simple task, whereas the information processed in the frontal cortex in this task is believed to be much more collective and complicated. Here we suggest a first handle for quantitative detection of such different neuronal activities. \n\nAcknowledgments \n\nSpecial thanks are due to Moshe Abeles for his encouragement and support, and to William Bialek for suggesting the idea to look for the synergy among cortical cells. We would also like to thank A. Raz, Hagai Bergman, and Eilon Vaadia for sharing their data with us. The research at the Hebrew University was supported in part by a grant from the United States Israeli Binational Science Foundation (BSF). 
\n\nTable 1: Examples of synergy among cortical neurons. For each example the mutual information of each cell separately is given together with the mutual information of the pair. In parentheses the matching detection probability (average over p(y|x)) is also given. The last column gives the percentage of increase from the mutual information of the single cells to the mutual information of the pair. The table gives only those pairs for which the percentage was larger than 20% and the detection rate higher than 60%. \n\nSession  Cells  Cell 1         Cell 2         Both cells     Syn (%) \nb116b    5,6    0.068 (64.84)  0.083 (66.80)  0.209 (76.17)  38 \nbl21b    1,4    0.201 (73.74)  0.118 (69.70)  0.497 (87.88)  56 \nbl21b    3,4    0.082 (66.67)  0.118 (69.70)  0.240 (77.78)  20 \nbl26b    0,3    0.062 (62.63)  0.077 (66.16)  0.198 (75.25)  42 \nbl26b    1,2    0.030 (60.10)  0.051 (63.13)  0.148 (72.22)  82 \ncl77b    2,3    0.054 (62.74)  0.013 (61.50)  0.081 (68.01)  20 \ncr38b    0,2    0.074 (65.93)  0.058 (63.19)  0.160 (73.08)  21 \ncr38b    0,4    0.074 (65.93)  0.042 (62.09)  0.144 (71.98)  24 \ncr38b    3,4    0.051 (62.09)  0.042 (62.09)  0.111 (69.23)  20 \ncr43b    0,1    0.070 (65.00)  0.063 (64.44)  0.181 (74.44)  36 \n\nReferences \n\n[1] M. Abeles, E. Vaadia, H. Bergman, Firing patterns of single unit in the prefrontal cortex and neural-networks models, Network 1 (1990). \n\n[2] E. Raz et al., Neuronal synchronization of tonically active neurons in the striatum of normal and parkinsonian primates, J. Neurophysiol. 76:2083-2088 (1996). \n\n[3] M. Abeles, Corticonics, (Cambridge University Press, 1991). \n\n[4] I. Gat, N. Tishby and M. Abeles, Hidden Markov modeling of simultaneously recorded cells in the associative cortex of behaving monkeys, Network, 8:297-322 (1997). \n\n[5] I. Gat, N. 
Tishby, Comparative study of different supervised detection methods of simultaneously recorded spike trains, in preparation. \n\n[6] N. Brenner, S.P. Strong, R. Koberle, W. Bialek, and R. de Ruyter van Steveninck, The Economy of Impulses and the Stiffness of Spike Trains, NEC Research Institute Technical Note (1998). \n\n[7] T.M. Cover and J.A. Thomas, Elements of Information Theory, (Wiley NY, 1991). \n\n[8] A.P. Georgopoulos, A.B. Schwartz, R.E. Kettner, Neuronal Population Coding of Movement Direction, Science, 233:1416-1419 (1986). \n\n[9] D.J. Amit, Modeling Brain Function, (Cambridge University Press, 1989). \n\n[10] E. Ahissar et al., Dependence of Cortical Plasticity on Correlated Activity of Single Neurons and on Behavioral Context, Science, 257:1412-1415 (1992). \n", "award": [], "sourceid": 1611, "authors": [{"given_name": "Itay", "family_name": "Gat", "institution": null}, {"given_name": "Naftali", "family_name": "Tishby", "institution": null}]}