{"title": "Dynamic Ensemble Modeling Approach to Nonstationary Neural Decoding in Brain-Computer Interfaces", "book": "Advances in Neural Information Processing Systems", "page_first": 6089, "page_last": 6098, "abstract": "Brain-computer interfaces (BCIs) have enabled prosthetic device control by decoding motor movements from neural activities. Neural signals recorded from cortex exhibit nonstationary property due to abrupt noises and neuroplastic changes in brain activities during motor control. Current state-of-the-art neural signal decoders such as Kalman filter assume fixed relationship between neural activities and motor movements, thus will fail if this assumption is not satisfied. We propose a dynamic ensemble modeling (DyEnsemble) approach that is capable of adapting to changes in neural signals by employing a proper combination of decoding functions. The DyEnsemble method firstly learns a set of diverse candidate models. Then, it dynamically selects and combines these models online according to Bayesian updating mechanism. Our method can mitigate the effect of noises and cope with different task behaviors by automatic model switching, thus gives more accurate predictions. 
Experiments with neural data demonstrate that the DyEnsemble method outperforms Kalman filters remarkably, and its advantage is more obvious with noisy signals.", "full_text": "Dynamic Ensemble Modeling Approach to Nonstationary Neural Decoding in Brain-Computer Interfaces

Yu Qi1, Bin Liu2, Yueming Wang3,*, Gang Pan1,4,*

1 College of Computer Science and Technology, Zhejiang University
2 School of Computer Science, Nanjing University of Posts and Telecommunications
3 Qiushi Academy for Advanced Studies, Zhejiang University
4 State Key Lab of CAD&CG, Zhejiang University

qiyu@zju.edu.cn, bins@ieee.org, ymingwang@zju.edu.cn, gpan@zju.edu.cn

Abstract

Brain-computer interfaces (BCIs) have enabled prosthetic device control by decoding motor movements from neural activities. Neural signals recorded from the cortex exhibit nonstationary properties due to abrupt noise and neuroplastic changes in brain activity during motor control. Current state-of-the-art neural decoders such as the Kalman filter assume a fixed relationship between neural activities and motor movements, and thus fail if this assumption is not satisfied. We propose a dynamic ensemble modeling (DyEnsemble) approach that adapts to changes in neural signals by employing a proper combination of decoding functions. The DyEnsemble method first learns a set of diverse candidate models. Then, it dynamically selects and combines these models online according to a Bayesian updating mechanism. Our method can mitigate the effect of noise and cope with different task behaviors by automatic model switching, and thus gives more accurate predictions.
Experiments with neural data demonstrate that\nthe DyEnsemble method outperforms Kalman \ufb01lters remarkably, and its advantage\nis more obvious with noisy signals.\n\n1\n\nIntroduction\n\nBrain-computer interfaces (BCIs) decode motor intentions directly from brain signals for external\ndevice control [1\u20133]. Intracortical BCIs (iBCIs) utilize neural signals recorded from implanted\nelectrode arrays to extract information about movement intentions. Advances in iBCIs have enabled\nthe development in control of prosthetic devices or computer cursors by neural activities [4].\nIn iBCI systems, neural decoding algorithm plays an important role. Many algorithms have been\nproposed to decode motor information from neural signals [5\u20137], including population vector [8],\nlinear estimators [9], deep neural networks [10], and recursive Bayesian decoders [11]. Among these\napproaches, Kalman \ufb01lter is considered to provide more accurate trajectory estimation by incorporat-\ning the process of trajectory evolution as a prior knowledge [12], which has been successfully applied\nto online cursor and prosthetic control, achieving the state-of-the-art performance [5, 13].\nOne critical challenge in neural decoding is the nonstationary property of neural signals [14]. Current\niBCI neural decoders mostly assume a static functional relationship between neural signals and\nmovements by using \ufb01xed decoding models. However, in an online decoding process, signals\nfrom neurons can be temporarily noised or even lost, and brain activities can also change due to\n\n\u2217Corresponding authors: Yueming Wang and Gang Pan\n\n33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.\n\n\fneuroplasticity [15]. With the presence of noises and changes, the functional mapping between neural\nsignals and movements can be nonstationary and changes continuously in time [16]. 
Static decoders\nwith \ufb01xed decoding functions can be inaccurate and unstable given nonstationary neural signals [14],\nthus need to be retrained periodically to maintain the performance [17, 18].\nMost existing neural decoders dealing with nonstationary problems can be classi\ufb01ed into two groups.\nThe \ufb01rst group is recalibration-based, which uses a static model, and periodically recalibrates it\n(with of\ufb02ine paradigms) or adaptively updates the parameters online (usually requires true inten-\ntion/trajectory). Most approaches belong to this group [5, 18, 19]. The second group uses dynamic\nmodels to track nonstationary changes in signals [20\u201322]. These approaches can avoid the expense of\nrecalibration, which is potentially more suitable for long-term decoding tasks. However, there are\nvery few studies in this group for the challenge in modeling nonstationary neural signals.\nTo obtain robust decoding performance with nonstationary neural signals, we improve upon the\nKalman \ufb01lter\u2019s measurement function by introducing a dynamic ensemble measurement model,\ncalled DyEnsemble, capable of adapting to changes in neural signals. Different from static models,\nDyEnsemble allows the measurement function to be adaptively adjusted online. DyEnsemble \ufb01rstly\nlearns a set of diverse candidate models. In online prediction, it adaptively adjusts the measurement\nfunction along with changes in neural signals by dynamically selecting and combining these models\naccording to the Bayesian updating mechanism. Experimental results demonstrate that DyEnsemble\nmodel can effectively mitigate the effect of noises and cope with different task behaviors by automatic\nmodel switching, thus gives more accurate predictions. The main contributions of this work are\nsummarized as follows:\n\n\u2022 We propose a novel dynamic ensemble model (DyEnsemble) to cope with nonstationary\nneural signals. 
We propose to use the particle filter algorithm for recursive state estimation in DyEnsemble, which adaptively estimates the posterior probability of each candidate model according to incoming neural signals and combines the models online with the Bayesian updating mechanism. The process of dynamic ensemble modeling is illustrated in Fig. 1.

• We propose a candidate model generation strategy to learn a diverse candidate set from neural signals. The strategy includes a neuron dropout step to deal with noisy neurons, and a weight perturbation step to handle functional changes in neural signals.

Experiments are carried out with both simulation data and neural signal data. It is demonstrated that the DyEnsemble method outperforms Kalman filters remarkably, and its advantage is more obvious with noisy signals.

2 Dynamic Ensemble Modeling Algorithm

2.1 Classic state-space model

A classic state-space model consists of a state transition function f(.) and a measurement function h(.) as follows:

xk = f(xk−1) + vk−1,   (1)
yk = h(xk) + nk,   (2)

where k denotes the discrete time step, xk ∈ Rdx is the state of our interest, yk ∈ Rdy is the measurement or observation, and vk and nk are i.i.d. state transition noise and measurement noise, respectively.

In the context of neural decoding, the state and the measurement represent the movement trajectory and the neural signals, respectively. Given a sequence of neural signals y0:k, the task is to estimate the probability density of xk recursively. For linear Gaussian cases, the Kalman filter provides an analytical optimal solution to the above task.

2.2 DyEnsemble based state-space model

In the classic state-space model mentioned earlier, the measurement function h(.) is assumed to be precisely determined beforehand, and thus cannot adapt to functional changes in neural signals. In DyEnsemble, we allow the measurement function to be adaptively adjusted online.
Figure 1: The dynamic ensemble modeling process.

Specifically, we consider an improved measurement model as follows:

yk = hHk(xk) + nk,   (3)

in which Hk ∈ {1, 2, . . . , M} denotes the index of our hypotheses about the measurement function. Specifically, we use the notation Hk = m to denote the hypothesis that the working measurement function at time k is hm.

Here we adopt a set of functions, i.e. candidate models, to characterize the relationship between the measurement and the state to be estimated. A Bayesian updating mechanism [23–25] is used for dynamically switching among these models in a data-driven manner. Given an observation sequence y0:k, the posterior of the state at time k is given by [26]:

p(xk|y0:k) = Σ_{m=1}^{M} p(xk|Hk = m, y0:k) p(Hk = m|y0:k),   (4)

where p(xk|Hk = m, y0:k) is the posterior of the state corresponding to hypothesis Hk = m, and p(Hk = m|y0:k) denotes the posterior probability of the m-th hypothesis.

Now we consider how to derive p(Hk = m|y0:k) from p(Hk−1 = m|y0:k−1), which is required for developing a recursive algorithm. Following [27], we specify a model transition process in terms of forgetting, to predict the model indicator at k as follows:

p(Hk = m|y0:k−1) = p(Hk−1 = m|y0:k−1)^α / Σ_{j=1}^{M} p(Hk−1 = j|y0:k−1)^α,   (5)

where α (0 < α < 1) denotes the forgetting factor, which controls the rate of reducing the impact of historical data.
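In practice the forgetting step of Eqn. (5) is a one-line operation on the vector of model weights. A minimal sketch (the function name is ours, assuming the weights are stored as a NumPy vector):

```python
import numpy as np

def predict_model_weights(posterior_prev, alpha):
    """Forgetting-based prediction of the model indicator, as in Eqn. (5).

    posterior_prev: p(H_{k-1}=m | y_{0:k-1}) for m = 1..M.
    alpha: forgetting factor in (0, 1); a smaller alpha flattens the
    weights faster, discounting historical evidence more aggressively.
    """
    w = np.asarray(posterior_prev, dtype=float) ** alpha
    return w / w.sum()

# Example: a confident posterior is softened toward uniform.
p_prev = np.array([0.8, 0.15, 0.05])
print(predict_model_weights(p_prev, alpha=0.5))  # ≈ [0.594 0.257 0.149]
```

The exponentiation keeps the ranking of the models while shrinking the gap between them, so a model that loses support can regain weight quickly once the data favor it again.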
Employing Bayes' rule, the posterior probability of the m-th hypothesis at k is obtained as below:

p(Hk = m|y0:k) = p(Hk = m|y0:k−1) pm(yk|y0:k−1) / Σ_{j=1}^{M} p(Hk = j|y0:k−1) pj(yk|y0:k−1).   (6)

The term pm(yk|y0:k−1) is the marginal likelihood of model hm at time k, which is defined as:

pm(yk|y0:k−1) = ∫ pm(yk|xk) p(xk|y0:k−1) dxk,   (7)

where pm(yk|xk) is the likelihood function associated with the m-th hypothesis.

2.3 Particle filter algorithm for state estimation in DyEnsemble

Here we develop a generic particle-based solution to Eqn. (4) by adapting the particle filter (PF) to fit our model. In PF, the posterior distribution at each time step is approximated with a weighted particle set [28]. As shown in Eqn. (4), to estimate p(xk|y0:k) with particles, we need to derive a particle-based solution to: 1) p(xk|Hk = m, y0:k); and 2) p(Hk = m|y0:k).

Assume that we are standing at the beginning of the k-th time step, having at hand p(Hk−1 = m|y0:k−1), m = 1, . . . , M, and a weighted particle set {ωi_{k−1}, xi_{k−1}}, i = 1, . . . , Ns, where Ns denotes the particle size and xi_{k−1} is the i-th particle with importance weight ωi_{k−1}. Assuming that p(xk−1|y0:k−1) ≈ Σ_{i=1}^{Ns} ωi_{k−1} δ(xk−1 − xi_{k−1}), where δ(.) denotes the Dirac delta function, we show how to get a particle solution to p(xk|Hk = m, y0:k) and p(Hk = m|y0:k), for m = 1, . . . , M.

Step 1. Particle based estimation of p(xk|Hk = m, y0:k). To begin with, we draw particles xi_k from the state transition prior p(xk|xi_{k−1}), for i = 1, . . . , Ns. Then, according to the principle of importance sampling, we have:

p(xk|Hk = m, y0:k) ≈ Σ_{i=1}^{Ns} ωi_{m,k} δ(xk − xi_k),   (8)

where ωi_{m,k} ∝ ωi_{k−1} pm(yk|xi_k) with Σ_{i=1}^{Ns} ωi_{m,k} = 1, and ωi_{m,k} denotes the normalized importance weight of the i-th particle under the hypothesis Hk = m.

Step 2. Particle based estimation of p(Hk = m|y0:k). Given p(Hk−1 = m|y0:k−1), first we calculate the predictive probability of Hk = m using Eqn. (5). Then we can calculate p(Hk = m|y0:k) using Eqn. (6), provided that pm(yk|y0:k−1), m = 1, . . . , M, is available. Now we show how to make use of the weighted particle set in Step 1 to estimate pm(yk|y0:k−1), m = 1, . . . , M. Recall that, in Step 1, the state transition prior is adopted as the importance function, namely q(xk|xk−1, y0:k) = p(xk|xk−1). It naturally leads to a particle approximation to the predictive distribution of xk, which is shown to be p(xk|y0:k−1) ≈ Σ_{i=1}^{Ns} ωi_{k−1} δ(xk − xi_k). Then, according to Eqn. (7), we have:

pm(yk|y0:k−1) ≈ Σ_{i=1}^{Ns} ωi_{k−1} pm(yk|xi_k).   (9)

Note that PF usually suffers from the problem of particle degeneracy; that is, after several iterations, only a few particles have large weights. Hence, we adopt a resampling procedure in our method, which is a common practice in the literature for mitigating particle degeneracy by removing particles with negligible weights and duplicating particles with large weights.

2.4 Candidate model generation

Here we propose a candidate model generation strategy to learn a diverse model set from neural signals. The strategy includes two stages: neuron dropout and weight perturbation. To create proper candidate models, we analyze the properties of neural signals; the details of the neural signal data are given in Section 4.1. The candidate model generation strategy is given in Algorithm 1.

Neuron dropout. Firstly, we evaluate the decoding ability of each neuron by the mutual information (MI) between its spike rate and the target trajectory in Fig.
2 (a). It shows that only some of the neurons contain useful information for motor decoding [29]. The activities of uncorrelated neurons can decrease decoding performance, and the informative neurons can also be temporarily noised or even lost. In neuron dropout, we randomly disconnect candidate models from several neurons to improve the noise-resistant ability and increase model diversity. After neuron dropout, each candidate only connects to a neuron subset containing s neurons, where s is the model size parameter.

Weight perturbation. In Fig. 2 (b), we analyze the functional changes over time. Specifically, we fit the linear mapping function between each neuron's firing rate and the target trajectory in every 20-second temporal window with a stride of 1 second, and illustrate the distribution of the slope parameter. The red plus sign indicates the slope estimated with the whole time length. Results show that the mapping function between neuron and motor activity swings slightly around the mean over time.

Figure 2: Neuron activity analysis.

To track the functional changes in neural signals, we propose a weight perturbation process. After model training, we randomly disturb the weights of each candidate model hm (hm ∈ M) in a small range:

w = w + p × ε,   (10)

where ε is randomly sampled from Gaussian(0,1) and p is the weight perturbation factor. The weight perturbation step gives the model set better tolerance of functional changes.

Algorithm 1 Candidate Model Generation Strategy.
1: s: model size, M: model number, p: weight perturbation factor
2: D: training data, N: neuron set
3: Init M = {}
4: for i = 1, . . . , M do
5:   Nsubset = Neuron-dropout(N, s)
6:   hi = Train-model(D, Nsubset)
7:   for w in weights of hi do
8:     w = Weight-perturbation(w, p)
9:   end for
10:   Add hi to M
11: end for

3 Experiments with Simulation Data

The DyEnsemble approach is firstly evaluated with simulation data. In this experiment, we simulate a time series in which the measurement model is formulated by a piecewise function, to see how DyEnsemble tracks changes in functions. The state transition function of the simulation data is given by:

xk+1 = 1 + sin(0.04π × (k + 1)) + 0.5xk + vk,   (11)

where vk is Gamma(3,2) random state process noise. The formulation of the state transition function follows [30]. The measurement function is defined as:

yk = h1(x) = 2x − 3 + nk,    0 < k ≤ 100,
yk = h2(x) = −x + 8 + nk,    100 < k ≤ 200,   (12)
yk = h3(x) = 0.5x + 5 + nk,  200 < k ≤ 300,

where nk is Gaussian(0,1) random measurement noise. The goal is to estimate the state xk given a sequence of measurements y0:k. The length of the simulation data is 300. In DyEnsemble, h1, h2, and h3 are adopted as candidate models. The forgetting factor α is set to 0.5, and the particle number is 200.

Fig. 3 shows the posterior probability of the candidate models over time. In DyEnsemble, the weights of the candidate models switch automatically along with changes in the signals. Candidates h1, h2, and h3 take the dominating weight alternately, which is highly consistent with the piecewise function.
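Under the stated settings, the simulation data itself can be regenerated with a short sketch. The function name and seed handling are ours, Gamma(3,2) is interpreted as shape 3 and scale 2, and the filtering machinery is omitted:

```python
import numpy as np

def simulate(T=300, seed=0):
    """Generate the piecewise simulation data of Eqns. (11)-(12).

    State:       x_{k+1} = 1 + sin(0.04*pi*(k+1)) + 0.5*x_k + v_k,
                 with v_k ~ Gamma(3, 2) process noise.
    Measurement: y_k = h_m(x_k) + n_k, n_k ~ N(0, 1), where the working
                 model h_m switches every 100 steps as in Eqn. (12).
    """
    rng = np.random.default_rng(seed)
    h = [lambda x: 2.0 * x - 3.0,    # h1, active for 0 < k <= 100
         lambda x: -x + 8.0,         # h2, active for 100 < k <= 200
         lambda x: 0.5 * x + 5.0]    # h3, active for 200 < k <= 300
    x = np.zeros(T + 1)
    y = np.zeros(T + 1)
    for k in range(1, T + 1):
        x[k] = 1.0 + np.sin(0.04 * np.pi * k) + 0.5 * x[k - 1] + rng.gamma(3.0, 2.0)
        y[k] = h[(k - 1) // 100](x[k]) + rng.normal()
    return x[1:], y[1:]
```

Running DyEnsemble on such data with h1, h2, and h3 as candidates should reproduce the alternating weight pattern of Fig. 3.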
We\nalso evaluate the in\ufb02uence of forgetting factor \u03b1, which adjusts the smoothness of weight transition.\nWith a higher \u03b1, model weights change more smoothly in time.\n\nFigure 3: Weights of candidate models with different \u03b1.\n\n5\n\n\f4 Experiments with Neural Signals\n\n4.1 Neural signals and experiment settings\n\nNeural signals were recorded from rats during lever-pressing tasks. The rats were trained to use\ntheir right forelimb to press a lever for water rewards. 16-channel microwire electrode arrays (8\u00d72,\ndiameter = 35 \u00b5m) were implanted in the primary motor cortex of rats. Neural signals were recorded\nby a CerebusT M system at a sampling rate of 30 kHz. Spikes were sorted and binned by 100 ms\nwindows without overlap. The forelimb movements were acquired by lever trajectory, which was\nrecorded at a sampling rate of 500 Hz, and downsampled to 10 Hz, to align to the spike bins. The\nexperiments conformed to the Guide for the Care and Use of Laboratory Animals.\nThe neural signal dataset includes two rats, for each rat, the data length is about 400 seconds. We use\nthe \ufb01rst 200 seconds for training and the last 100 seconds for test. After spike sorting, there are 22\nand 58 neurons for rat1 and rat2 respectively. We evaluate the neurons with mutual information (MI)\nbetween the \ufb01ring rate and lever trajectory, and select the top 20 neurons for movement decoding.\nThe movement trajectory at time step k is a 3 \u00d7 1 vector xk = [pk, vk, ak]T , where pk, vk and ak are\nthe position, velocity and acceleration, respectively. The binned neural signal yk is a 20 \u00d7 1 vector\nyk = [y1\n\nk denotes the spike count of the i-th neuron.\n\nk, y2\n\nk, ..., y20\n\nk ]T , where each element yi\n\n4.2 Analysis of dynamic process\n\nAdaptation to changing noises. To analyze the dynamic ensemble process with changing noises, we\nadd noise to neuron 2 and neuron 13 at around the 2nd and 4th second, as in Fig. 4 (a). 
The additional\nnoise is randomly distributed integers in [0,10]. Fig. 4 (b) shows the weights of candidate models\nover time. There are a total of 20 candidate models with model size s = 15 and weight perturbation\nfactor p = 0.1. Specially, we set the forgetting factor \u03b1 = 0.98 to force model weight transition to be\nhighly smooth, for analysis convenience.\nAs shown in Fig. 4 (a) and (b), when there is no additional noise (the \ufb01rst 2 seconds), the 13th, 15th\nand 19th candidate models are with high weights in assembling, as in Fig. 4 (c). When noise occurs\nin the 2nd neuron (2-4 seconds), the weights of the 13th and 19th candidates become small because\nthey both connect to the 2nd neuron. While the 15th candidate, which does not connect to neuron\n2, takes the dominating weight, as in Fig. 4 (d). When noise occurs in neuron 13 at the 4th second,\nthe weight of the 15th candidate decreases due to its connection to neuron 13, and the new winner is\nthe 4th candidate, which does not connect to both noisy neurons (Fig. 4 (e)). The results strongly\nsuggest that, given signals with changing noise, DyEnsemble approach can adaptively switch its\nmodel combination to mitigate the effect of noise.\nAdaptation to task behaviors. Fig. 5 (a) visualizes the model weight transition process with\n\u03b1 = 0.1 and \u03b1 = 0.5. It is interesting to \ufb01nd that, the model assembling behaviors are different in\nlever-pressing and non-pressing periods. As highlighted in the dashed boxes in Fig. 5 (a), during\nlever-pressing, only several certain models are selected. In Fig. 5 (b) and (c), we illustrate the average\nweights of some candidate models in both lever-pressing and non-pressing periods. 
The results\nsuggest that different models show different preferences to task behaviors, and DyEnsemble approach\ncan automatically switch to suitable candidate models to cope with changes of behaviors.\n\n4.3 Performance of neural decoding\n\nExperiments are carried out to compare the neural decoding performance of DyEnsemble with other\nmethods. The neural decoding performance is evaluated by commonly used criteria of correlation\ncoef\ufb01cient (CC) between lever trajectory and estimations. The results are shown in Table. 1.\nTo simulate noisy situations with unpredictable noises, we randomly replace several neurons\u2019 signals\nby noise in the test data. In Table. 1, we replace two (Noisy #2) and four (Noisy #4) neurons\u2019 signals\nby random integer noises in a range of [0, 10].\nEvaluation of neuron dropout and weight perturbation. Here we evaluate the two key parts of\nneuron dropout and weight perturbation. In Table. 1, we compare the baseline approach (without neu-\nron dropout and perturbation), the DyEnsemble with perturbation (p=0.1) alone, and the DyEnsemble\nwith both perturbation (p=0.1) and dropout (DyEnsemble-2 and DyEnsemble-5 drop 2 and 5 neurons\n\n6\n\n\fFigure 4: Dynamic ensemble modeling with changing noises.\n\nFigure 5: Comparison of model weights in lever-pressing and non-pressing periods.\n\nrespectively). Compared with the baseline, weight perturbation improves the performance by about\n10% in noisy situations, and neuron dropout (DyEnsemble-5) leads to a further 20% performance\nimprovement with 4 noisy neurons.\nComparison with other decoders. 
The methods in comparison include: Kalman \ufb01lter, which is a\nbaseline approach in neural motor decoding; long short-term memory (LSTM) [31] recurrent neural\nnetwork, which excels in learning from sequential data in machine learning \ufb01eld [32]; dual decoder\nwith a Kalman \ufb01lter [21, 22], which can be regarded as the current state-of-the-art dynamic modeling\napproach for nonstationary neural signals.\nFor a fair comparison, we use linear functions of f (.) and h(.), zero-mean Gaussian terms of vk and\nnk in Kalman, dual decoder, and DyEnsemble. The f (.) and h(.) are estimated by the least square\nalgorithm. For LSTM, we use a 1-hidden-layer model with 8 hidden neurons. In DyEnsemble, the\nforgetting factor \u03b1 is 0.1, model number M is 20, weight perturbation factor p is 0.1, and the particle\nnumber is 1000. For DyEnsemble-2 and DyEnsemble-5, the model sizes are 18 and 15, respectively.\nAll the methods are carefully tuned by validation and the validation set is the last 400 points of\ntraining data. The results are averaged over three random runs.\nIn Table. 1, we highlighted the top two performances in bold. 
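For reference, the CC criterion reported in Table 1 is the Pearson correlation between the recorded lever trajectory and its decoded estimate. A minimal sketch (the function name is ours):

```python
import numpy as np

def correlation_coefficient(true_traj, decoded_traj):
    """Pearson correlation coefficient (CC) between two trajectories."""
    t = np.asarray(true_traj, dtype=float)
    d = np.asarray(decoded_traj, dtype=float)
    t = t - t.mean()  # center both series so CC is shift-invariant
    d = d - d.mean()
    return float(np.dot(t, d) / (np.linalg.norm(t) * np.linalg.norm(d)))

# A perfectly linear decoder scores 1.0 regardless of scale or offset.
print(correlation_coefficient([0, 1, 2, 3], [1, 3, 5, 7]))  # → 1.0
```

Because CC is invariant to affine rescaling of the estimate, it measures trajectory shape tracking rather than absolute positional error.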
Without additional noises (Original column), the CCs of DyEnsemble-2 and DyEnsemble-5 are 0.799 and 0.775 for Rat1, which are slightly higher than Kalman (0.777) and LSTM (0.753), and comparable to the dual decoder (0.779). For Rat2, the best CC of DyEnsemble is 0.803, which is higher than Kalman (0.798) while slightly lower than LSTM (0.846). Overall, with original neural signals, the performance of DyEnsemble is comparable to state-of-the-art approaches.

Table 1: Correlation coefficient with different numbers of noisy neurons.

                                             Rat 1                                       Rat 2
Method                        Original      Noisy (#2)    Noisy (#4)    Original      Noisy (#2)    Noisy (#4)
Kalman filter                 0.777±0.000   0.696±0.012   0.560±0.009   0.798±0.000   0.580±0.039   0.381±0.093
LSTM                          0.753±0.017   0.687±0.033   0.617±0.045   0.846±0.021   0.551±0.127   0.338±0.050
Dual decoder                  0.779±0.000   0.694±0.010   0.575±0.013   0.803±0.000   0.585±0.025   0.387±0.030
DyEnsemble (w/o P, w/o D)     0.776±0.002   0.684±0.014   0.558±0.009   0.798±0.002   0.579±0.066   0.377±0.155
DyEnsemble (P(0.1), w/o D)    0.780±0.008   0.711±0.004   0.557±0.035   0.780±0.006   0.665±0.024   0.472±0.080
DyEnsemble-2 (P(0.1), D(2))   0.799±0.012   0.735±0.006   0.583±0.090   0.788±0.009   0.633±0.064   0.516±0.092
DyEnsemble-5 (P(0.1), D(5))   0.775±0.015   0.739±0.021   0.671±0.039   0.803±0.009   0.584±0.035   0.596±0.035

* w/o: without; P(k): weight perturbation with p=k; D(l): neuron dropout with l neurons dropped.

Figure 6: Evaluation of key parameters.

With noisy neurons, the performances of Kalman, LSTM and the dual decoder decrease significantly. For Rat1, when there are 2 noisy neurons, the CCs of Kalman, LSTM, and the dual decoder are 0.696, 0.687 and 0.694, respectively.
The CC of DyEnsemble-5 is 0.739, which improves by 6.2%, 7.6%, 6.5%\ncompared with Kalman, LSTM, and dual decoder, respectively. With 4 noisy neurons, DyEnsemble-5\nachieves a CC of 0.671, which is 19.8%, 8.8% and 16.7% higher than Kalman, LSTM and dual\ndecoder, respectively. Similar results are observed with Rat2. DyEnsemble-5 is more stable and\nrobust than DyEnsemble-2 with noisy neurons especially when there are 4 noisy neurons. Since\nEnsemble-2 only drops 2 neurons in candidate models, the performance decreases when more than 2\nnoisy neurons occur. Dropping more neurons can increase the robustness against noises, while it may\nalso harm estimation accuracy.\n\n4.4\n\nIn\ufb02uence of parameters\n\nHere we evaluate the key parameters in DyEnsemble: model size (s), model number (M), forgetting\nfactor (\u03b1), and weight perturbation factor (p). The results are illustrated in Fig. 6.\nModel size. Model size s de\ufb01nes the number of neurons connected to each candidate model. The rest\nof the parameters is set to: \u03b1 = 0.1, p = 0.1, M = 20. As shown in Fig. 6 (a), overall, CC improves\nwith increase of model size. While for Rat2, CC decreases slightly after model size reaches 18.\nCommonly, a larger model size brings more information, while it also decreases the noise-resistant\nability as discussed in Section 4.3.\nModel number. The model number denotes the number of candidate models in M. The rest of the\nparameters is set to: \u03b1 = 0.1, p = 0.1, s = 15. As shown in Fig. 6 (b), the performance increases\nwhen we tune M from 5 to 20, while the improvement is subtle when M is larger than 20.\nForgetting factor. The parameter of forgetting factor from Eqn. (5) controls the inertia in the\ncandidate model transition. With a large \u03b1, the candidate models prefer to keep the weights from the\nlast time step. From Fig. 
6 (c), we \ufb01nd that smaller \u03b1 gives better performance in the neural decoding\ntask, which re\ufb02ects the nonstationary properties of neural signals. The rest of the parameters is set to:\ns = 15, p = 0.1, M = 20.\nWeight perturbation factor. The weight perturbation factor p controls the range that candidate\nmodels can deviate from the mean. A higher p can tolerate larger changes in functional relationships,\nhowever, it also leads to inaccurate predictions. As shown in Fig. 6 (d), the performance improves\nwhen p is adjusted from 0.01 to 0.1, while decreases rapidly when p is bigger than 0.2. The rest of\nthe parameters is set to: s = 15, \u03b1 = 0.1, M = 20.\n\n5 Conclusion\n\nWe proposed a dynamic ensemble modeling approach, called DyEnsemble, for robust movement\ntrajectory decoding from nonstationary neural signals. The DyEnsemble model improved upon the\nclassic state-space model by introducing a dynamic ensemble measurement function, which is capable\nof adapting to changes in neural signals. Experimental results demonstrated that the DyEnsemble\napproach could automatically switch to suitable models to mitigate the effect of noises and cope\nwith different task behaviors. The proposed method can provide valuable solutions for robust neural\ndecoding tasks and nonstationary signal processing problems.\n\n8\n\n\f6 Acknowledgment\n\nThis work was partly supported by grants from the National Key Research and Development Pro-\ngram of China (2018YFA0701400, 2017YFB1002503), National Natural Science Foundation of\nChina (61906166, 61571238, 61906099), Zhejiang Provincial Natural Science Foundation of China\n(LZ17F030001), and the Zhejiang Lab (2018EB0ZX01).\n\nReferences\n[1] J. R. Wolpaw, \u201cBrain\u2013computer interfaces as new brain output pathways,\u201d The Journal of\n\nphysiology, vol. 579, no. 3, pp. 613\u2013619, 2007.\n\n[2] S. Todorova and V. Ventura, \u201cNeural decoding: A predictive viewpoint,\u201d Neural computation,\n\nvol. 