{"title": "Maximum Likelihood Estimation of a Stochastic Integrate-and-Fire Neural Model", "book": "Advances in Neural Information Processing Systems", "page_first": 1311, "page_last": 1318, "abstract": "", "full_text": "Maximum Likelihood Estimation of a Stochastic\n\nIntegrate-and-Fire Neural Model*\n\nJonathan W. Pillow, Liam Paninski, and Eero P. Simoncelli\n\nHoward Hughes Medical Institute\n\nCenter for Neural Science\n\nNew York University\n\n{pillow, liam, eero}@cns.nyu.edu\n\nAbstract\n\nRecent work has examined the estimation of models of stimulus-driven\nneural activity in which some linear filtering process is followed by\na nonlinear, probabilistic spiking stage. We analyze the estimation\nof one such model for which this nonlinear step is implemented by a\nnoisy, leaky, integrate-and-fire mechanism with a spike-dependent after-\ncurrent. This model is a biophysically plausible alternative to models\nwith Poisson (memory-less) spiking, and has been shown to effectively\nreproduce various spiking statistics of neurons in vivo. However, the\nproblem of estimating the model from extracellular spike train data has\nnot been examined in depth. We formulate the problem in terms of max-\nimum likelihood estimation, and show that the computational problem\nof maximizing the likelihood is tractable. Our main contribution is an\nalgorithm and a proof that this algorithm is guaranteed to find the global\noptimum with reasonable speed. We demonstrate the effectiveness of our\nestimator with numerical simulations.\n\nA central issue in computational neuroscience is the characterization of the functional re-\nlationship between sensory stimuli and neural spike trains. A common model for this re-\nlationship consists of linear filtering of the stimulus, followed by a nonlinear, probabilistic\nspike generation process. 
The linear filter is typically interpreted as the neuron\u2019s \u201creceptive\nfield,\u201d while the spiking mechanism accounts for simple nonlinearities like rectification\nand response saturation. Given a set of stimuli and (extracellularly) recorded spike times,\nthe characterization problem consists of estimating both the linear filter and the parameters\ngoverning the spiking mechanism.\n\nOne widely used model of this type is the Linear-Nonlinear-Poisson (LNP) cascade model,\nin which spikes are generated according to an inhomogeneous Poisson process, with rate\ndetermined by an instantaneous (\u201cmemoryless\u201d) nonlinear function of the filtered input.\nThis model has a number of desirable features, including conceptual simplicity and com-\nputational tractability. Additionally, reverse correlation analysis provides a simple unbi-\nased estimator for the linear filter [5], and the properties of estimators (for both the linear\nfilter and static nonlinearity) have been thoroughly analyzed, even for the case of highly\nnon-symmetric or \u201cnaturalistic\u201d stimuli [12]. One important drawback of the LNP model,\n\n* JWP and LP contributed equally to this work. We thank E.J. Chichilnisky for helpful discussions.\n\n\f[Figure 1 panels, top to bottom: L-NLIF model raster; LNP model raster; P(spike) against time (ms), 0 to 100]\n\nFigure 1: Simulated responses of L-\nNLIF and LNP models to 20 rep-\netitions of a fixed 100-ms stimu-\nlus segment of temporal white noise.\nTop: Raster of responses of L-NLIF\nmodel, where $\\sigma_{noise}/\\sigma_{signal} = 0.5$\nand g gives a membrane time con-\nstant of 15 ms. The top row shows\nthe fixed (deterministic) response of\nthe model with $\\sigma_{noise}$ set to zero.\nMiddle: Raster of responses of LNP\nmodel, with parameters fit with stan-\ndard methods from a long run of\nthe L-NLIF model responses to non-\nrepeating stimuli. 
Bottom: (Black\nline) Post-stimulus time histogram\n(PSTH) of the simulated L-NLIF re-\nsponse.\n(Gray line) PSTH of the\nLNP model. Note that the LNP\nmodel fails to preserve the fine tem-\nporal structure of the spike trains,\nrelative to the L-NLIF model.\n\nhowever, is that Poisson processes do not accurately capture the statistics of neural spike\ntrains [2, 9, 16, 1]. In particular, the probability of observing a spike is not a functional of\nthe stimulus only; it is also strongly affected by the recent history of spiking.\n\nThe leaky integrate-and-fire (LIF) model provides a biophysically more realistic spike\nmechanism with a simple form of spike-history dependence. This model is simple, well-\nunderstood, and has dynamics that are entirely linear except for a nonlinear \u201creset\u201d of the\nmembrane potential following a spike. Although this model\u2019s overriding linearity is often\nemphasized (due to the approximately linear relationship between input current and firing\nrate, and lack of active conductances), the nonlinear reset has significant functional impor-\ntance for the model\u2019s response properties. In previous work, we have shown that standard\nreverse correlation analysis fails when applied to a neuron with deterministic (noise-free)\nLIF spike generation; we developed a new estimator for this model, and demonstrated that a\nchange in leakiness of such a mechanism might underlie nonlinear effects of contrast adap-\ntation in macaque retinal ganglion cells [15]. We and others have explored other \u201cadaptive\u201d\nproperties of the LIF model [17, 13, 19].\n\nIn this paper, we consider a model consisting of a linear filter followed by noisy LIF spike\ngeneration with a spike-dependent after-current; this is essentially the standard LIF model\ndriven by a noisy, filtered version of the stimulus, with an additional current waveform\ninjected following each spike. 
We will refer to this as the \u201cL-NLIF\u201d model. The prob-\nabilistic nature of this model provides several important advantages over the deterministic\nversion we have considered previously. First, an explicit noise model allows us to couch\nthe problem in the terms of classical estimation theory. This, in turn, provides a natural\n\u201ccost function\u201d (likelihood) for model assessment and leads to more efficient estimation of\nthe model parameters. Second, noise allows us to explicitly model neural firing statistics,\nand could provide a rigorous basis for a metric distance between spike trains, useful in\nother contexts [18]. Finally, noise influences the behavior of the model itself, giving rise to\n\n\fphenomena not observed in the purely deterministic model [11].\n\nOur main contribution here is to show that the maximum likelihood estimator (MLE) for\nthe L-NLIF model is computationally tractable. Specifically, we describe an algorithm\nfor computing the likelihood function, and prove that this likelihood function contains no\nnon-global maxima, implying that the MLE can be computed efficiently using standard\nascent techniques. The desirable statistical properties of this estimator (e.g. consistency,\nefficiency) are all inherited \u201cfor free\u201d from classical estimation theory. Thus, we have a\ncompact and powerful model for the neural code, and a well-motivated, efficient way to\nestimate the parameters of this model from extracellular data.\n\nThe Model\n\nWe consider a model for which the (dimensionless) subthreshold voltage variable V evolves\naccording to\n\n$$dV = \\left( -g V(t) + \\vec{k} \\cdot \\vec{x}(t) + \\sum_{j=0}^{i-1} h(t - t_j) \\right) dt + \\sigma N_t, \\qquad (1)$$\n\nand resets to $V_r$ whenever $V = 1$. 
Here, g denotes the leak conductance, $\\vec{k} \\cdot \\vec{x}(t)$ the\nprojection of the input signal $\\vec{x}(t)$ onto the linear kernel $\\vec{k}$, h is an \u201cafterpotential,\u201d a current\nwaveform of fixed amplitude and shape whose value depends only on the time since the last\nspike $t_{i-1}$, and $N_t$ is an unobserved (hidden) noise process with scale parameter $\\sigma$. Without\nloss of generality, the \u201cleak\u201d and \u201cthreshold\u201d potentials are set at 0 and 1, respectively, so the\ncell spikes whenever $V = 1$, and V decays back to 0 with time constant $1/g$ in the absence\nof input. Note that the nonlinear behavior of the model is completely determined by only\na few parameters, namely $\\{g, \\sigma, V_r\\}$, and h (where the function h is allowed to take values\nin some low-dimensional vector space). The dynamical properties of this type of \u201cspike\nresponse model\u201d have been extensively studied [7]; for example, it is known that this class\nof models can effectively capture much of the behavior of apparently more biophysically\nrealistic models (e.g. Hodgkin-Huxley).\n\nFigures 1 and 2 show several simple comparisons of the L-NLIF and LNP models. In\nFig. 1, note the fine structure of spike timing in the responses of the L-NLIF model, which is\nqualitatively similar to in vivo experimental observations [2, 16, 9]. The LNP model fails\nto capture this fine temporal reproducibility. At the same time, the L-NLIF model is much\nmore flexible and representationally powerful, as demonstrated in Fig. 2: by varying $V_r$\nor h, for example, we can match a wide variety of dynamical behaviors (e.g. adaptation,\nbursting, bistability) known to exist in biological neurons.\n\nThe Estimation Problem\n\nOur problem now is to estimate the model parameters $\\{\\vec{k}, \\sigma, g, V_r, h\\}$ from a sufficiently\nrich, dynamic input sequence $\\vec{x}(t)$ together with spike times $\\{t_i\\}$. 
A natural choice is\nthe maximum likelihood estimator (MLE), which is easily proven to be consistent and\nstatistically efficient here. To compute the MLE, we need to compute the likelihood and\ndevelop an algorithm for maximizing it.\n\nThe tractability of the likelihood function for this model arises directly from the linearity\nof the subthreshold dynamics of voltage V(t) during an interspike interval. In the noise-\nless case [15], the voltage trace during an interspike interval $t \\in [t_{i-1}, t_i]$ is given by the\nsolution to equation (1) with $\\sigma = 0$:\n\n$$V_0(t) = V_r e^{-gt} + \\int_{t_{i-1}}^{t} \\left( \\vec{k} \\cdot \\vec{x}(s) + \\sum_{j=0}^{i-1} h(s - t_j) \\right) e^{-g(t-s)} \\, ds, \\qquad (2)$$\n\n\f[Figure 2 panels A, B, and C: each shows an h current plotted against t, and a stimulus with responses plotted against t (sec); panel B is labeled c=1, c=2, c=5]\n\nFigure 2: Illustration of diverse behaviors\nof L-NLIF model.\nA: Firing rate adaptation. A positive\nDC current (top) was injected into three\nmodel cells differing only in their h cur-\nrents (shown on left:\ntop, h = 0; mid-\ndle, h depolarizing; bottom, h hyperpo-\nlarizing). Voltage traces of each cell\u2019s re-\nsponse (right, with spikes superimposed)\nexhibit rate facilitation for depolarizing h\n(middle), and rate adaptation for hyperpo-\nlarizing h (bottom).\nB: Bursting. The response of a model cell\nwith a biphasic h current (left) is shown for\nthree different levels of\nDC current. For small current levels (top),\nthe cell responds rhythmically. For larger\ncurrents (middle and bottom), the cell re-\nsponds with regular bursts of spikes.\nC: Bistability. The stimulus (top) is a\npositive followed by a negative current\npulse. 
Although a cell with no h current\n(middle) responds transiently to the posi-\ntive pulse, a cell with biphasic h (bottom)\nexhibits a bistable response: the positive\npulse puts it into a stable firing regime\nwhich persists until the arrival of a neg-\native pulse.\n\nwhich is simply a linear convolution of the input current with a negative exponential. It\nis easy to see that adding Gaussian noise to the voltage during each time step induces a\nGaussian density over V(t), since linear dynamics preserve Gaussianity [8]. This density is\nuniquely characterized by its first two moments; the mean is given by (2), and its covariance\nis $\\sigma^2 E_g E_g^T$, where $E_g$ is the convolution operator corresponding to $e^{-gt}$. Note that this\ndensity is highly correlated for nearby points in time, since noise is integrated by the linear\ndynamics. Intuitively, smaller leak conductance g leads to stronger correlation in V(t)\nat nearby time points. We denote this Gaussian density $G(\\vec{x}_i, \\vec{k}, \\sigma, g, V_r, h)$, where index\ni indicates the ith spike and the corresponding stimulus chunk $\\vec{x}_i$ (i.e. the stimuli that\ninfluence V(t) during the ith interspike interval).\n\nNow, on any interspike interval $t \\in [t_{i-1}, t_i]$, the only information we have is that V(t)\nis less than threshold for all times before $t_i$, and exceeds threshold during the time bin\ncontaining $t_i$. This translates to a set of linear constraints on V(t), expressed in terms of\nthe set\n\nTherefore, the likelihood that the neuron first spikes at time $t_i$, given a spike at time $t_{i-1}$,\nis the probability of the event $V(t) \\in C_i$, which is given by\n\nCi = \\ti(cid:0)1(cid:20)t