{"title": "Population Decoding Based on an Unfaithful Model", "book": "Advances in Neural Information Processing Systems", "page_first": 192, "page_last": 198, "abstract": null, "full_text": "Population Decoding Based on \n\nan Unfaithful Model \n\ns. Wu, H. Nakahara, N. Murata and S. Amari \n\nRIKEN Brain Science Institute \n\nHirosawa 2-1, Wako-shi, Saitama, Japan \n\n{phwusi, hiro, mura, amari}@brain.riken.go.jp \n\nAbstract \n\nWe study a population decoding paradigm in which the maximum likeli(cid:173)\nhood inference is based on an  unfaithful decoding model (UMLI). This \nis usually the case for neural population decoding because the encoding \nprocess  of the  brain  is  not exactly  known,  or because a  simplified de(cid:173)\ncoding model  is  preferred for  saving computational cost.  We  consider \nan  unfaithful  decoding model  which  neglects the  pair-wise  correlation \nbetween neuronal activities, and prove that UMLI is asymptotically effi(cid:173)\ncient when the neuronal correlation is uniform or of limited-range.  The \nperformance of UMLI is compared with that of the maximum likelihood \ninference based on  a faithful  model  and  that of the  center of mass de(cid:173)\ncoding  method.  It turns  out  that  UMLI  has  advantages  of decreasing \nthe computational complexity remarkablely and maintaining a high-level \ndecoding  accuracy  at  the  same  time.  The  effect of correlation  on  the \ndecoding accuracy is also discussed. \n\n1 \n\nIntroduction \n\nPopulation coding is  a  method to  encode and decode stimuli  in  a distributed way  by  us(cid:173)\ning the joint activities of a number of neurons (e.g.  Georgopoulos et aI.,  1986; Paradiso, \n1988;  Seung and Sompo1insky,  1993).  
Recently, there has been an expanded interest in understanding population decoding methods, which particularly include the maximum likelihood inference (MLI), the center of mass (COM), the complex estimator (CE) and the optimal linear estimator (OLE) [see (Pouget et al., 1998; Salinas and Abbott, 1994) and the references therein]. Among them, MLI has the advantage of a small decoding error (asymptotic efficiency), but may suffer from a high computational complexity. \n\nLet us consider a population of N neurons coding a variable x. The encoding process of the population code is described by a conditional probability q(r|x) (Anderson, 1994; Zemel et al., 1998), where the components of the vector r = \{r_i\}, for i = 1, ..., N, are the firing rates of the neurons. We study the MLI estimator given by the value of x that maximizes the log likelihood \ln p(r|x), where p(r|x) is the decoding model, which might be different from the encoding model q(r|x). So far, when people study MLI in a population code, it is normally (often implicitly) assumed that p(r|x) is equal to the encoding model q(r|x). This requires that the estimator has full knowledge of the encoding process. Taking account of the complexity of the information processing in the brain, it is more natural to assume p(r|x) \neq q(r|x). Another reason for this choice is to save computational cost. Therefore, a decoding paradigm in which the assumed decoding model is different from the encoding one needs to be studied. In the context of statistical theory, this is called estimation based on an unfaithful or misspecified model. Hereafter, we call the decoding paradigm of using MLI based on an unfaithful model UMLI, to distinguish it from MLI based on the faithful model, which is called FMLI. 
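As an illustration of the computational-cost motivation (a sketch of ours, not part of the paper; the Gaussian forms anticipate the response models introduced below), a faithful Gaussian log likelihood must touch the inverse covariance matrix, costing O(N^2) per evaluation, while an unfaithful one that neglects correlations is a plain sum, costing O(N):

```python
import numpy as np

def log_lik_faithful(r, f_x, A_inv, sigma=0.1):
    """ln q(r|x) up to x-independent constants: O(N^2) per evaluation."""
    d = r - f_x  # residual between observed rates and tuning values
    return -0.5 * d @ A_inv @ d / sigma**2

def log_lik_unfaithful(r, f_x, sigma=0.1):
    """ln p(r|x) with correlations neglected: O(N) per evaluation."""
    d = r - f_x
    return -0.5 * np.sum(d**2) / sigma**2
```

When A is the identity the two coincide; the x-independent normalization terms are dropped since only the argmax over x matters.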
The unfaithful model studied in this paper is the one which neglects the pair-wise correlation between neural activities. It turns out that UMLI has the attractive property of decreasing the computational cost of FMLI remarkably while maintaining a high level of decoding accuracy. \n\n2 The Population Decoding Paradigm of UMLI \n\n2.1 An Unfaithful Decoding Model Neglecting the Neuronal Correlation \n\nLet us consider a pair-wise correlated neural response model in which the neuronal activities are assumed to be multivariate Gaussian, \n\nq(r|x) = \frac{1}{\sqrt{(2\pi\sigma^2)^N \det(A)}} \exp\Big[-\frac{1}{2\sigma^2} \sum_{i,j} (r_i - f_i(x)) A^{-1}_{ij} (r_j - f_j(x))\Big],  (1) \n\nwhere f_i(x) is the tuning function. In the present study, we will only consider radially symmetric tuning functions. \n\nTwo different correlation structures are considered. One is the uniform correlation model (Johnson, 1980; Abbott and Dayan, 1999), with the covariance matrix \n\nA_{ij} = \delta_{ij} + c(1 - \delta_{ij}),  (2) \n\nwhere the parameter c (with -1 < c < 1) determines the strength of correlation. \n\nThe other correlation structure is of limited range (Johnson, 1980; Snippe and Koenderink, 1992; Abbott and Dayan, 1999), with the covariance matrix \n\nA_{ij} = b^{|i-j|},  (3) \n\nwhere the parameter b (with 0 < b < 1) determines the range of correlation. This structure is translation invariant in the sense that A_{ij} = A_{kl} if |i - j| = |k - l|. \n\nThe unfaithful decoding model treated in the present study is the one which neglects the correlation in the encoding process but keeps the tuning functions unchanged, that is, \n\np(r|x) = \frac{1}{\sqrt{(2\pi\sigma^2)^N}} \exp\Big[-\frac{1}{2\sigma^2} \sum_i (r_i - f_i(x))^2\Big].  (4) \n\n2.2 The Decoding Error of UMLI and FMLI \n\nThe decoding error of UMLI has been studied in statistical theory (Akahira and Takeuchi, 1981; Murata et al., 1994). Here we generalize it to the population coding. 
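The encoding model (1) with the two covariance structures (2) and (3) can be simulated directly. A minimal sketch of ours (function names and parameter defaults are our own; a Gaussian tuning curve is assumed here for concreteness):

```python
import numpy as np

def tuning(x, centers, a=1.0):
    """Radially symmetric (Gaussian) tuning functions f_i(x)."""
    return np.exp(-(x - centers) ** 2 / (2 * a ** 2))

def covariance(N, kind="uniform", c=0.5, b=0.5):
    """Covariance structure A: eq. (2) for 'uniform', eq. (3) otherwise."""
    if kind == "uniform":
        return (1 - c) * np.eye(N) + c * np.ones((N, N))
    idx = np.arange(N)
    return b ** np.abs(idx[:, None] - idx[None, :])

def sample_responses(x, centers, A, sigma=0.1, rng=None):
    """Draw r ~ q(r|x) of eq. (1): mean f(x), covariance sigma^2 A."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.multivariate_normal(tuning(x, centers), sigma ** 2 * A)
```

The uniform structure keeps every off-diagonal entry at c, while the limited-range structure decays exponentially with the index distance |i - j|.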
For convenience, some notation is introduced. \nabla f(r, x) denotes df(r, x)/dx. E_q[f(r, x)] and V_q[f(r, x)] denote, respectively, the mean value and the variance of f(r, x) with respect to the distribution q(r|x). Given an observation of the population activity r^*, the UMLI estimate \hat{x} is the value of x that maximizes the log likelihood L_p(r^*, x) = \ln p(r^*|x). \n\nDenote by x_{opt} the value of x satisfying E_q[\nabla L_p(r, x_{opt})] = 0. For the faithful model, where p = q, x_{opt} = x. Hence, (x_{opt} - x) is the error due to the unfaithful setting, whereas (\hat{x} - x_{opt}) is the error due to sampling fluctuations. For the unfaithful model (4), the condition E_q[\nabla L_p(r, x_{opt})] = 0 reads \sum_i [f_i(x) - f_i(x_{opt})] f_i'(x_{opt}) = 0. Hence, x_{opt} = x, and UMLI gives an unbiased estimator in the present case. \n\nLet us consider the expansion of \nabla L_p(r^*, \hat{x}) at x, \n\n\nabla L_p(r^*, \hat{x}) \approx \nabla L_p(r^*, x) + \nabla\nabla L_p(r^*, x)(\hat{x} - x).  (5) \n\nSince \nabla L_p(r^*, \hat{x}) = 0, \n\n\frac{1}{N} \nabla\nabla L_p(r^*, x)(\hat{x} - x) \approx -\frac{1}{N} \nabla L_p(r^*, x),  (6) \n\nwhere N is the number of neurons. Only the large-N limit is considered in the present study. \n\nLet us analyze the properties of the two random variables \frac{1}{N}\nabla\nabla L_p(r^*, x) and \frac{1}{N}\nabla L_p(r^*, x). We consider first the uniform correlation model, for which we can write \n\nr_i^* = f_i(x) + \sigma(\epsilon_i + \eta),  (7) \n\nwhere \eta and \{\epsilon_i\}, for i = 1, ..., N, are independent random variables having zero mean and variances c and 1 - c, respectively; \eta is the common noise for all neurons, representing the uniform character of the correlation. \n\nBy using the expression (7), we get \n\n\frac{1}{N}\nabla L_p(r^*, x) = \frac{1}{N\sigma} \sum_i \epsilon_i f_i'(x) + \frac{\eta}{N\sigma} \sum_i f_i'(x),  (8) \n\n\frac{1}{N}\nabla\nabla L_p(r^*, x) = -\frac{1}{N\sigma^2} \sum_i f_i'(x)^2 + \frac{1}{N\sigma} \sum_i \epsilon_i f_i''(x) + \frac{\eta}{N\sigma} \sum_i f_i''(x).  (9) \n\nWithout loss of generality, we assume that the distribution of the preferred stimuli is uniform. 
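The uniformity assumption can be checked numerically: for uniformly spaced, radially symmetric tuning curves, the averaged sums of f_i'(x) and f_i''(x) appearing in eqs. (8) and (9) are negligible. A sketch of ours (grid sizes are arbitrary choices):

```python
import numpy as np

# Regular array of preferred stimuli in [-D, D], as in the paper's Sec. 3.
N, D, a = 201, 5.0, 1.0
centers = -D + 2 * np.arange(1, N + 1) * D / (N + 1)

x = 0.0
g = np.exp(-(x - centers) ** 2 / (2 * a ** 2))       # f_i(x)
f1 = -(x - centers) / a ** 2 * g                     # f_i'(x)
f2 = ((x - centers) ** 2 / a ** 2 - 1) / a ** 2 * g  # f_i''(x)

# Both averages vanish: odd symmetry cancels f', and the sum over f''
# approximates the integral of a derivative, whose boundary terms vanish.
print(abs(f1.sum() / N), abs(f2.sum() / N))
```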
For radially symmetric tuning functions, \frac{1}{N}\sum_i f_i'(x) and \frac{1}{N}\sum_i f_i''(x) approach zero when N is large. Therefore, the correlation contributions (the terms containing \eta) in the above two equations can be neglected: UMLI performs in this case as if the neuronal signals were uncorrelated. \n\nThus, by the weak law of large numbers, \n\n\frac{1}{N}\nabla\nabla L_p(r^*, x) \to \frac{1}{N} Q_p,  (10) \n\nwhere Q_p \equiv E_q[\nabla\nabla L_p(r, x)] = -\frac{1}{\sigma^2}\sum_i f_i'(x)^2. \n\nAccording to the central limit theorem, \frac{1}{N}\nabla L_p(r^*, x) converges to a Gaussian distribution, \n\n\frac{1}{N}\nabla L_p(r^*, x) \to N\Big(0, \frac{1-c}{N^2\sigma^2}\sum_i f_i'(x)^2\Big) = N\Big(0, \frac{G_p}{N^2}\Big),  (11) \n\nwhere N(0, t) denotes the Gaussian distribution having zero mean and variance t, and G_p \equiv V_q[\nabla L_p(r, x)]. \n\nCombining the results of eqs. (6), (10) and (11), we obtain the decoding error of UMLI, \n\n(\hat{x} - x)_{UMLI} \to N(0, Q_p^{-2} G_p) = N\Big(0, \frac{(1-c)\sigma^2}{\sum_i f_i'(x)^2}\Big).  (12) \n\nIn a similar way, the decoding error of FMLI is obtained, \n\n(\hat{x} - x)_{FMLI} \to N(0, Q_q^{-2} G_q) = N\Big(0, \frac{(1-c)\sigma^2}{\sum_i f_i'(x)^2}\Big),  (13) \n\nwhich has the same form as that of UMLI except that Q_q and G_q are now defined with respect to the faithful decoding model, i.e., p(r|x) = q(r|x). To get eq. (13), the condition \sum_i f_i'(x) = 0 is used. Interestingly, UMLI and FMLI have the same decoding error. This is because the uniform correlation effect is effectively neglected in both UMLI and FMLI. \n\nNote that in FMLI, Q_q = G_q = V_q[\nabla L_q(r|x)] is the Fisher information, and Q_q^{-2} G_q is the Cramér-Rao bound, the optimal accuracy for an unbiased estimator to achieve. Eq. (13) shows that FMLI is asymptotically efficient. For an unfaithful decoding model, Q_p and G_p are usually different from the Fisher information. We call Q_p^{-2} G_p the generalized Cramér-Rao bound, and call UMLI quasi-asymptotically efficient if its decoding error approaches Q_p^{-2} G_p asymptotically. 
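Eq. (12) can be checked by Monte Carlo simulation. The following sketch is ours (the grid-search decoder, seed, and population size are arbitrary choices); it estimates the UMLI error under uniform correlation and compares it with (1-c)\sigma^2 / \sum_i f_i'(x)^2:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, a, sigma, c, x_true = 100, 5.0, 1.0, 0.1, 0.5, 0.0
centers = -D + 2 * np.arange(1, N + 1) * D / (N + 1)
f = lambda x: np.exp(-(x - centers) ** 2 / (2 * a ** 2))

# Uniform correlation, eq. (2); Cholesky factor for correlated sampling.
A = (1 - c) * np.eye(N) + c * np.ones((N, N))
L = np.linalg.cholesky(A)

grid = np.linspace(-0.5, 0.5, 1001)
F = np.stack([f(x) for x in grid])  # tuning curves tabulated on the grid

errs = []
for _ in range(500):
    r = f(x_true) + sigma * (L @ rng.standard_normal(N))
    # UMLI: maximize the correlation-free log likelihood of eq. (4),
    # i.e., minimize the summed squared residual over the grid.
    x_hat = grid[np.argmin(((F - r) ** 2).sum(axis=1))]
    errs.append((x_hat - x_true) ** 2)

fprime = -(x_true - centers) / a ** 2 * f(x_true)
predicted = (1 - c) * sigma ** 2 / np.sum(fprime ** 2)
print(np.mean(errs), predicted)  # the two agree for large N
```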
Eq. (12) shows that UMLI is quasi-asymptotically efficient. \n\nIn the above, we have proved the asymptotic efficiency of FMLI and UMLI when the neuronal correlation is uniform. The result relies on the radial symmetry of the tuning function and the uniform character of the correlation, which make it possible to cancel the correlation contributions from different neurons. For general tuning functions and correlation structures, the asymptotic efficiency of UMLI and FMLI may not hold. This is because the law of large numbers (eq. (10)) and the central limit theorem (eq. (11)) are not in general applicable. \n\nWe note that for the limited-range correlation model, since the correlation is translation invariant and its strength decreases quickly with the dissimilarity in the neurons' preferred stimuli, the correlation effect in the decoding of FMLI and UMLI becomes negligible when N is large. This ensures that the law of large numbers and the central limit theorem hold in the large-N limit. Therefore, UMLI and FMLI are asymptotically efficient. This is confirmed in the simulation in Sec. 3. \n\nWhen UMLI and FMLI are asymptotically efficient, their decoding errors in the large-N limit can be calculated according to the generalized Cramér-Rao bound and the Cramér-Rao bound, respectively, which are \n\n\langle (\hat{x} - x)^2 \rangle_{UMLI} = \frac{\sigma^2 \sum_{ij} A_{ij} f_i'(x) f_j'(x)}{[\sum_i f_i'(x)^2]^2},  (14) \n\n\langle (\hat{x} - x)^2 \rangle_{FMLI} = \frac{\sigma^2}{\sum_{ij} A^{-1}_{ij} f_i'(x) f_j'(x)}.  (15) \n\n3 Performance Comparison \n\nThe performance of UMLI is compared with that of FMLI and of the center of mass decoding method (COM). The neural population model we consider is a regular array of N neurons (Baldi and Heiligenberg, 1988; Snippe, 1996) with the preferred stimuli uniformly distributed in the range [-D, D], that is, c_i = -D + 2iD/(N+1), for i = 1, ..., N. The comparison is done at the stimulus x = 0. \n\nCOM is a simple decoding method which uses no information about the encoding process; its estimate is the average of the neurons' preferred stimuli weighted by the responses (Georgopoulos et al., 1982; Snippe, 1996), i.e., \n\n\hat{x} = \frac{\sum_i r_i c_i}{\sum_i r_i}.  (16) \n\nThe shortcoming of COM is a large decoding error. For the population model we consider, the decoding error of COM is calculated to be \n\n\langle (\hat{x} - x)^2 \rangle_{COM} = \frac{\sigma^2 \sum_{ij} A_{ij} c_i c_j}{[\sum_i f_i(x)]^2},  (17) \n\nwhere the condition \sum_i f_i(x) c_i = 0 is used, due to the regularity of the distribution of the preferred stimuli. \n\nThe tuning function is Gaussian, \n\nf_i(x) = \exp\Big[-\frac{(x - c_i)^2}{2a^2}\Big],  (18) \n\nwhere the parameter a is the tuning width. \n\nWe note that the Gaussian response model does not give zero probability to negative firing rates. To make it more reliable, we set r_i = 0 when f_i(x) < 0.011 (|x - c_i| > 3a), which means that only those neurons which are active enough contribute to the decoding. It is easy to see that this cut-off does not much affect the results of UMLI and FMLI, owing to their nature of decoding by using the derivative of the tuning functions, whereas the decoding error of COM would be greatly enlarged without the cut-off. \n\nFor the tuning width a, there are N = Int[6a/d - 1] neurons involved in the decoding process, where d is the difference in the preferred stimuli between two consecutive neurons and the function Int[·] denotes the integer part of the argument. \n\nIn all experimental settings, the parameters are chosen as a = 1 and \sigma = 0.1. The decoding errors of the three methods are compared for different values of N when the correlation strength is fixed (c = 0.5 for the uniform correlation case and b = 0.5 for the limited-range correlation case), or for different values of the correlation strength when N is fixed to be 50. 
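The comparison setup above can be sketched in simulation (our own code, not the paper's; the cut-off is applied at the known test stimulus x = 0, and COM follows eq. (16)):

```python
import numpy as np

rng = np.random.default_rng(1)
N, D, a, sigma, c, x_true = 50, 5.0, 1.0, 0.1, 0.5, 0.0
centers = -D + 2 * np.arange(1, N + 1) * D / (N + 1)        # regular array c_i
f = lambda x: np.exp(-(x - centers) ** 2 / (2 * a ** 2))    # Gaussian tuning

A = (1 - c) * np.eye(N) + c * np.ones((N, N))  # uniform correlation
Lc = np.linalg.cholesky(A)
grid = np.linspace(-0.5, 0.5, 1001)
F = np.stack([f(x) for x in grid])

active = np.abs(centers - x_true) <= 3 * a  # cut-off: keep active neurons only

def com_estimate(r):
    """Center of mass, eq. (16), over the active neurons."""
    return np.sum(r[active] * centers[active]) / np.sum(r[active])

def umli_estimate(r):
    """UMLI by grid search on the correlation-free likelihood, eq. (4)."""
    return grid[np.argmin(((F - r) ** 2).sum(axis=1))]

e_com, e_umli = [], []
for _ in range(300):
    r = f(x_true) + sigma * (Lc @ rng.standard_normal(N))
    e_com.append((com_estimate(r) - x_true) ** 2)
    e_umli.append((umli_estimate(r) - x_true) ** 2)

print(np.mean(e_umli), np.mean(e_com))
```

Consistent with the paper's Fig. 1, the mean squared error of UMLI comes out below that of COM in this setting.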
\nFig. 1 compares the decoding errors of the three methods for the uniform correlation model. It shows that UMLI has the same decoding error as FMLI, and a lower error than COM. The uniform correlation improves the decoding accuracies of the three methods (Fig. 1b). \n\nIn Fig. 2, the simulation results for the decoding errors of FMLI and UMLI in the limited-range correlation model are compared with those obtained by using the Cramér-Rao bound and the generalized Cramér-Rao bound, respectively. The two results agree very well when the number of neurons, N, is large, which means that FMLI and UMLI are asymptotically efficient, as analyzed above. In the simulation, the standard gradient descent method is used to maximize the log likelihood, and the initial guess for the stimulus is chosen as the preferred stimulus of the most active neuron. The CPU time of UMLI is around 1/5 of that of FMLI; UMLI reduces the computational cost of FMLI significantly. \n\nFig. 3 compares the decoding errors of the three methods for the limited-range correlation model. It shows that UMLI has a lower decoding error than COM. Interestingly, UMLI has a performance comparable to that of FMLI over the whole range of correlation. The limited-range correlation degrades the decoding accuracies of the three methods when the strength is small and improves them when the strength is large (Fig. 3b). \n\n[Plots omitted: decoding error versus N (panel a) and versus c (panel b), with curves for FMLI/UMLI and COM.] \n\nFigure 1: Comparing the decoding errors of UMLI, FMLI and COM for the uniform correlation model. \n\n[Plots omitted: decoding error versus N, with curves CRB/GCRB and SMR for b = 0.5 and b = 0.8.] \n\nFigure 2: Comparing the simulation results of the decoding errors of UMLI and FMLI in the limited-range correlation model with those obtained by using the Cramér-Rao bound and the generalized Cramér-Rao bound, respectively. CRB denotes the Cramér-Rao bound, GCRB the generalized Cramér-Rao bound, and SMR the simulation result. In the simulation, 10 sets of data are generated, each of which is averaged over 1000 trials. (a) FMLI; (b) UMLI. \n\n4 Discussions and Conclusions \n\nWe have studied a population decoding paradigm in which MLI is based on an unfaithful model. This is motivated by the fact that the encoding process of the brain is not exactly known by the estimator. As an example, we consider an unfaithful decoding model which neglects the pair-wise correlation between neuronal activities. Two different correlation structures are considered, namely, the uniform and the limited-range correlations. The performance of UMLI is compared with that of FMLI and COM. It turns out that UMLI has a lower decoding error than COM, and a performance comparable to that of FMLI with much less computational cost. 
It is our future work to understand the biological implications of UMLI. \n\nAs a by-product of the calculation, we also illustrate the effect of correlation on the decoding accuracy. It turns out that the correlation, depending on its form, can either improve or degrade the decoding accuracy. This observation agrees with the analysis of Abbott and Dayan (1999), which is done with respect to the optimal decoding accuracy, i.e., the Cramér-Rao bound. \n\n[Plots omitted: decoding error versus N (panel a) and versus b (panel b), with curves for FMLI, UMLI and COM.] \n\nFigure 3: Comparing the decoding errors of UMLI, FMLI and COM for the limited-range correlation model. \n\nAcknowledgment \n\nWe thank the three anonymous reviewers for their valuable comments and insightful suggestions. S. Wu acknowledges helpful discussions with Danmei Chen. \n\nReferences \n\nL. F. Abbott and P. Dayan. 1999. The effect of correlated variability on the accuracy of a population code. Neural Computation, 11:91-101. \nM. Akahira and K. Takeuchi. 1981. Asymptotic efficiency of statistical estimators: concepts and high order asymptotic efficiency. In Lecture Notes in Statistics 7. \nC. H. Anderson. 1994. Basic elements of biological computational systems. International Journal of Modern Physics C, 5:135-137. \nP. Baldi and W. Heiligenberg. 1988. How sensory maps could enhance resolution through ordered arrangements of broadly tuned receivers. Biol. Cybern., 59:313-318. \nA. P. Georgopoulos, J. F. Kalaska, R. Caminiti, and J. T. Massey. 1982. On the relations between the direction of two-dimensional arm movements and cell discharge in primate motor cortex. J. Neurosci., 2:1527-1537. \nK. O. Johnson. 1980. Sensory discrimination: neural processes preceding discrimination decision. J. Neurophysiol., 43:1793-1815. \nN. Murata, S. Yoshizawa, and S. Amari. 1994. Network information criterion - determining the number of hidden units for an artificial neural network model. IEEE Trans. Neural Networks, 5:865-872. \nA. Pouget, K. Zhang, S. Deneve, and P. E. Latham. 1998. Statistically efficient estimation using population coding. Neural Computation, 10:373-401. \nE. Salinas and L. F. Abbott. 1994. Vector reconstruction from firing rates. Journal of Computational Neuroscience, 1:89-107. \nH. P. Snippe and J. J. Koenderink. 1992. Information in channel-coded systems: correlated receivers. Biological Cybernetics, 67:183-190. \nH. P. Snippe. 1996. Parameter extraction from population codes: a critical assessment. Neural Computation, 8:511-529. \nR. S. Zemel, P. Dayan, and A. Pouget. 1998. Probabilistic interpretation of population codes. Neural Computation, 10:403-430. \n", "award": [], "sourceid": 1752, "authors": [{"given_name": "Si", "family_name": "Wu", "institution": null}, {"given_name": "Hiroyuki", "family_name": "Nakahara", "institution": null}, {"given_name": "Noboru", "family_name": "Murata", "institution": null}, {"given_name": "Shun-ichi", "family_name": "Amari", "institution": null}]}