{"title": "Estimating vector fields using sparse basis field expansions", "book": "Advances in Neural Information Processing Systems", "page_first": 617, "page_last": 624, "abstract": "We introduce a novel framework for estimating vector fields using sparse basis field expansions (S-FLEX). The notion of basis fields, which are an extension of scalar basis functions, arises naturally in our framework from a rotational invariance requirement. We consider a regression setting as well as inverse problems. All variants discussed lead to second-order cone programming formulations. While our framework is generally applicable to any type of vector field, we focus in this paper on applying it to solving the EEG/MEG inverse problem. It is shown that significantly more precise and neurophysiologically more plausible location and shape estimates of cerebral current sources from EEG/MEG measurements become possible with our method compared to the state of the art.", "full_text": "Estimating vector fields using sparse basis field expansions\n\nStefan Haufe1, 2, *   Vadim V. Nikulin3, 4   Andreas Ziehe1, 2   Klaus-Robert M\u00fcller1, 2, 4   Guido Nolte2\n\n1TU Berlin, Dept. of Computer Science, Machine Learning Laboratory, Berlin, Germany\n2Fraunhofer Institute FIRST (IDA), Berlin, Germany\n3Charit\u00e9 University Medicine, Dept. of Neurology, Campus Benjamin Franklin, Berlin, Germany\n4Bernstein Center for Computational Neuroscience, Berlin, Germany\n* haufe@cs.tu-berlin.de\n\nAbstract\n\nWe introduce a novel framework for estimating vector fields using sparse basis field expansions (S-FLEX). The notion of basis fields, which are an extension of scalar basis functions, arises naturally in our framework from a rotational invariance requirement. We consider a regression setting as well as inverse problems. All variants discussed lead to second-order cone programming formulations.
While our framework is generally applicable to any type of vector field, we focus in this paper on applying it to solving the EEG/MEG inverse problem. It is shown that significantly more precise and neurophysiologically more plausible location and shape estimates of cerebral current sources from EEG/MEG measurements become possible with our method compared to the state of the art.\n\n1 Introduction\n\nCurrent machine learning is frequently concerned with the estimation of functions with multivariate output. While in many cases the outputs can be treated as mere collections of scalars (e.g. different color channels in image processing), in some contexts there might be a deeper interpretation of them as spatial vectors with a direction and a magnitude. Such \u201ctruly\u201d vectorial functions are called vector fields and become manifest for example in optical flow fields, electromagnetic fields and wind fields in meteorology. Vector field estimators have to take into account that the numerical representation of a vector depends on the coordinate system it is measured in. That is, the estimate should be invariant with respect to a rotation of the coordinate system.\n\nLet v : R^P \u2192 R^Q be a vector field. Mathematically speaking, we are seeking to approximate v by a field \u02c6v using empirical measurements. Here we consider two types of measurements. The first type are direct samples (xn, yn), xn \u2208 R^P, yn \u2208 R^Q, n = 1, . . . , N of v, leading to a regression problem. The second case occurs if only indirect measurements zm \u2208 R, m = 1, . . . , M are available, which we assume to be generated by a known linear1 transformation of the vector field outputs yn belonging to nodes xn, n = 1, . . . , N. This kind of estimation problem is known as an inverse problem. Let z = (z1, . . .
, zM)^T denote the vector of indirect measurements, Y = (y1^T, . . . , yN^T)^T the N \u00d7 Q matrix of vector field outputs, and vec(Y) a column vector containing the stacked transposed rows of Y. The linear relationship between Y and z can be written as z = F vec(Y) using the forward model F \u2208 R^{M\u00d7NQ}.\n\n1 If the true relation is nonlinear, it is here assumed to be linearized.\n\nAs an example of an inverse problem, consider the way humans localize acoustic sources. Here z comprises the signal arriving at the ears, v is the spatial distribution of the sound sources, and F is given by physical equations of sound propagation. Using information from two ears, humans already do very well in estimating the direction of incoming sounds. By further incorporating prior knowledge, e.g. on the loudness of the sources, v can usually be well approximated. The use of prior knowledge (a.k.a. regularization) is indeed the most effective strategy for solving inverse problems [13], which are inherently ambiguous. Hence, the same mechanisms used to avoid overfitting in, e.g., regression may be applied to cope with the ambiguity of inverse problems.\n\nFor the estimation of scalar functions, methods that utilize sparse linear combinations of basis functions have gained considerable attention recently (e.g. the \u201classo\u201d [14]). Apart from the computational tractability that comes with the sparsity of the learned model, the possibility of interpreting the estimates in terms of their basis functions is a particularly appealing feature of these methods. While sparse expansions are also desirable in vector field estimation, the lasso and similar methods cannot be used for that purpose, as they break rotational invariance in the output space R^Q.
This is easily seen, as sparse methods tend to select different basis functions in each of the Q dimensions.\n\nOnly a few attempts have been made at rotation-invariant sparse vector field expansions so far. In [8] a dense expansion is discussed, which could be modified to a sparse version maintaining rotational invariance. Unfortunately, this method is restricted to approximating curl-free fields. In contrast, we here propose a method that can be used to decompose any vector field. We will derive the general framework in section 2. In section 3 we will apply the (appropriately customized) method for solving the EEG/MEG inverse problem. Finally, we will draw a brief conclusion in section 4.\n\n2 Method\n\nOur model is based on the assumption that v can be well approximated by a linear combination of some basis fields. A basis field is defined here (unlike in [8]) as a vector field in which all output vectors point in the same direction, while the magnitudes are proportional to a scalar (basis) function b : R^P \u2192 R. As demonstrated in Fig. 1, this model has an expressive power which is comparable to a basis function expansion of scalar functions. Given a set (dictionary) of basis functions bl(x), l = 1, . . . , L, the basis field expansion is written as\n\nv(x) = \u2211_{l=1}^L cl bl(x) ,   (1)\n\nwith coefficients cl \u2208 R^Q, l = 1, . . . , L to be estimated. Note that by including one coefficient for each output dimension, both orientations and proportionality factors are learned in this model (the term \u201cbasis field\u201d thus refers to a basis function with learned coefficients). In order to select a small set of fields, most of the coefficient vectors cl have to vanish. This can be accomplished by solving a least-squares problem with an additional lasso-like \u21131-norm penalty on the coefficients.
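To make the expansion of Eq. 1 concrete, here is a minimal numpy sketch; the Gaussian bumps, the dimensions and the random coefficients below are illustrative assumptions, not the paper's EEG dictionary:

```python
import numpy as np

P, Q, L = 2, 2, 5                      # input dim, output dim, dictionary size (toy values)
rng = np.random.default_rng(0)
centers = rng.uniform(-1.0, 1.0, (L, P))   # hypothetical basis-function centers

def b(x, l, sigma=0.5):
    # scalar basis function b_l(x): an isotropic Gaussian bump (illustrative choice)
    return np.exp(-0.5 * np.sum((x - centers[l]) ** 2) / sigma ** 2)

C = rng.standard_normal((L, Q))        # one Q-dimensional coefficient vector c_l per basis field

def v_hat(x):
    # basis field expansion v(x) = sum_l c_l * b_l(x)  (Eq. 1)
    return sum(C[l] * b(x, l) for l in range(L))

print(v_hat(np.zeros(P)))              # a Q-dimensional output vector
```

Each basis field contributes its scalar profile b_l(x) scaled by a learned direction/magnitude vector c_l, which is exactly what makes the expansion vectorial rather than Q independent scalar fits.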
However, care has to be taken in order to maintain rotational invariance of the solution. We here propose to use a regularizer that imposes sparsity and is invariant with respect to rotations, namely the \u21131-norm of the magnitudes of the coefficient vectors. Let C = (c1, . . . , cL)^T \u2208 R^{L\u00d7Q} contain the coefficients and let\n\nB \u2208 R^{N\u00d7L} with entries Bnl = bl(xn)   (2)\n\ncontain the basis functions evaluated at the xn. The parameters are estimated using\n\n\u02c6C = arg min_C L(C) + \u03bb R(C) ,   (3)\n\nwhere R(C) = ||C||1,2 = \u2211_{l=1}^L ||cl||2 is the regularizer (the so-called \u21131,2-norm of the matrix C), L(C) is the quadratic loss function, which is defined by L(C) = ||vec(Y \u2212 BC)||2^2 in the regression case and L(C) = ||z \u2212 F vec(BC)||2^2 in the inverse reconstruction case, and \u03bb is a positive constant. In the statistics literature, \u21131,2-norm regularization is already known as a general mechanism for achieving sparsity of grouped predictors [18]. Besides vector field estimation, this concept has natural applications in, e.g., multiple kernel learning [1] and channel selection for brain-computer interfacing [15]. It has also recently been considered in the general multiple output setting [17].\n\nFigure 1: Complicated vector field (SUM) as a sum of three basis fields (1-3).\n\n2.1 Rotational Invariance\n\nRotational invariance, in the sense that the estimates after rotation of the coordinate axes are equal to the rotated estimates, is a desirable property of an estimator. One has to distinguish invariance in input space from invariance in output space. The former requirement may arise in many estimation settings and can be fulfilled by the choice of appropriate basis functions bl(x).
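As a quick numeric illustration of the output-space invariance of the \u21131,2 penalty in Eq. 3 (a numpy sketch; the sizes below are arbitrary assumptions): rotating every coefficient vector by the same orthogonal matrix leaves the penalty unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
L_dict, Q = 8, 3  # illustrative dictionary size and output dimension

def l12_norm(C):
    # R(C) = sum over rows l of ||c_l||_2: the l1-norm of the row magnitudes of C
    return np.linalg.norm(C, axis=1).sum()

C = rng.standard_normal((L_dict, Q))
R, _ = np.linalg.qr(rng.standard_normal((Q, Q)))   # a random orthogonal matrix

# rotating every c_l -> R c_l (i.e. C -> C R^T) preserves each row norm,
# hence the whole penalty
print(l12_norm(C), l12_norm(C @ R.T))
```

The same check would fail for a plain entrywise ℓ1-norm of C, which is the reason lasso-style penalties break rotational invariance here.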
The latter one is specific to vector field estimation and has to be assured by formulating a rotationally invariant cost function. Our proposed estimator Eq. 3 is rotationally invariant. This is due to the use of the \u21132-norm in output space R^Q, which does not change under rotation. I.e., for an orthogonal matrix R \u2208 R^{Q\u00d7Q} with R^T R = I,\n\n\u2211_{l=1}^L ||R cl||2 = \u2211_{l=1}^L \u221a(tr(cl^T R^T R cl)) = \u2211_{l=1}^L ||cl||2 .   (4)\n\nBy the same argument, additional regularizers R*(C) = ||vec(D* C)||2^2 (the well-known Tikhonov regularizer) or R+(C) = ||D+ C||1,2 (promoting sparsity of the linearly transformed vectors) may be introduced without breaking the rotational invariance in R^Q.\n\n2.2 Optimization\n\nEq. 3 is a convex problem, composed of the quadratic term L(C) and the convex nondifferentiable term R(C). It is equivalent to the following program\n\n\u02c6C = arg min_{C,u} \u2211_{l=1}^L ul   s.t.   ||cl||2 \u2264 ul , l = 1, . . . , L ,   L(C) \u2264 \u03b5 ,   (5)\n\nin which a linear function of the variables is minimized subject to quadratic and second-order cone constraints [6]. The latter constraints are obtained by introducing auxiliary variables ul \u2208 R, l = 1, . . . , L encoding upper bounds of the magnitudes of the coefficient vectors. Problem Eq. 5 is an instance of second-order cone programming (SOCP), a standard class of convex programs, for which efficient interior-point based solvers are available. The problem stays inside the SOCP class even if the original formulation is modified in any of the following ways:\n\n\u2022 Additional regularizers R+(C) or R*(C) are used.\n\u2022 The quadratic loss function is replaced by a more robust \u21131-norm based loss (e.g. hinge loss).
In the regression case, this loss should be defined based on the magnitude of the residual vector, which leads to a formulation involving the \u21131,2-norm (and thus additional SOCP constraints).\n\u2022 Complex basis functions (e.g. Fourier bases or Morlet wavelets) are used. This approach also requires complex coefficients, by which it is then possible not only to optimally scale the basis functions, but also to optimally shift their phase. Similarly, it is possible to reconstruct complex vector fields from complex measurements using real-valued basis functions.\n\n3 Application to the EEG/MEG inverse problem\n\nVector fields occur, for example, in the form of electrical currents in the brain, which are produced by postsynaptic neuronal processes. Knowledge of the electrical fields during a certain experimental condition allows one to draw conclusions about the locations in which the cognitive processing takes place and is thus of high value for research and medical diagnosis. Invasive measurements allow very local assessment of neuronal activations, but such a procedure in humans is only possible when electrodes are implanted for treatment or diagnosis of neurological diseases, e.g., epilepsy. In the majority of cases, recordings of cortical activity are performed with non-invasive measures such as electro- and magnetoencephalography (EEG and MEG, respectively). The reconstruction of the current density from such measurements is an inverse problem.\n\n3.1 Method specification\n\nIn the following, the task is to infer the generating cerebral current density given an EEG measurement z \u2208 R^M. The current density is a vector field v : R^3 \u2192 R^3 assigning a vectorial current source to each location in the brain. We obtained a realistic head model from high-resolution MRI (magnetic resonance imaging) slices of a human head [4].
Inside the brain, we arranged 2142 nodes in a regular grid of 1 cm distance. The forward mapping F \u2208 R^{M\u00d72142\u00b73} from these nodes to the electrodes was constructed according to [9], taking into account the realistic geometry and conductive properties of brain, skull and skin.\n\nDictionary\n\nIn most applications the \u201ctrue\u201d sources are expected to be small in number and spatial extent. However, many commonly used methods estimate sources that almost cover the whole brain (e.g. [11]). Another group of methods delivers source estimates that are spatially sparse, but usually not rotationally invariant (e.g. [7]). Here, often too many sources, which are scattered around the true sources, are estimated. Both the very smooth and the very sparse estimates are unrealistic from a physiological point of view. Only very recently, approaches capable of achieving a compromise between these two extremes have been outlined [16, 3]. For achieving a similar effect, we here propose a sparse basis field expansion using radial basis functions. More specifically, we consider spherical Gaussians\n\nbn,s(x) = (2\u03c0\u03c3s)^{-3/2} exp(\u2212(1/2) ||x \u2212 xn||2^2 \u03c3s^{-2}) ,   (6)\n\ns = 1, . . . , 4, having spatial standard deviations \u03c31 = 0.5 cm, \u03c32 = 1 cm, \u03c33 = 1.5 cm, \u03c34 = 2 cm and being centered at nodes xn, n = 1, . . . , N (see Fig. 2 for examples). Using this redundant dictionary, our expectation is that sources of different spatial extent can be reconstructed by selecting the appropriate basis functions.
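Such a multi-scale Gaussian dictionary can be sketched as follows (numpy; the tiny 4\u00d74\u00d74 grid is an illustrative stand-in for the paper's 2142-node head grid, and the normalization constant follows the formula as printed):

```python
import numpy as np

# toy 4 x 4 x 4 grid with 1 cm spacing, standing in for the 2142-node head grid
axis = np.arange(4.0)
grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)
N = len(grid)
sigmas = [0.5, 1.0, 1.5, 2.0]  # spatial standard deviations in cm

def gaussian_block(nodes, sigma):
    # N x N block: the spherical Gaussian of Eq. 6 centered at every node,
    # evaluated at every node
    d2 = ((nodes[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=-1)
    return (2.0 * np.pi * sigma) ** -1.5 * np.exp(-0.5 * d2 / sigma ** 2)

# horizontally stacking one block per scale gives the redundant N x 4N dictionary
B = np.hstack([gaussian_block(grid, s) for s in sigmas])
print(B.shape)
```

Because every node appears at four scales, the sparse selection step can pick a narrow bump for a focal source and a wide one for an extended source, which is the intended compromise between the two extremes described above.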
Unlike the approaches taken in [16, 3], this approach does not require an additional hyperparameter for controlling the tradeoff between sparsity and smoothness.\n\nFigure 2: Gaussian basis functions with fixed center and standard deviations 0.5 cm \u2013 2 cm.\n\nNormalization\n\nOur \u21131,2-norm based regularization is a heuristic for selecting the smallest possible number of basis fields necessary to explain the measurement. Using this approach, however, not only the number of nonzero coefficient vectors, but also their magnitudes enter the cost function. It is therefore important to normalize the basis functions in order not to a priori prefer some of them. Let Bs be the N \u00d7 N matrix containing the basis functions with standard deviation \u03c3s. The large matrix B = (B1/||vec(B1)||1, . . . , B4/||vec(B4)||1) \u2208 R^{N\u00d74N} is then constructed using the normalized Bs. By this means, no length scale is artificially preferred.\n\nAn estimation bias is also introduced by the location of the sources. Due to volume conduction, the signal captured at the sensors is much stronger for superficial sources compared to deep sources. In [10] the variance estimate \u02c6S = \u00afF^T (\u00afF \u00afF^T)^{-1} \u00afF \u2208 R^{3N\u00d73N} is derived for the (least-squares) estimated sources, where \u00afF = HF and H = I \u2212 11^T/1^T1 \u2208 R^{M\u00d7M}. We found that \u02c6S can be used for removing the location bias. This can be done by either penalizing activity at locations with high variance or by penalizing basis functions with high variance in the center. We here employ the former approach, as the latter may be problematic for basis functions with large extent. The per-node weights are collected in the block-diagonal matrix\n\nW = blockdiag(W1, . . . , WN) \u2208 R^{3N\u00d73N} ,   (7)\n\nwhere the blocks Wn are defined below.
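The scale-wise normalization of the dictionary described above can be sketched as follows (numpy; random nonnegative blocks are placeholders standing in for the Gaussian blocks B1..B4):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20
# nonnegative random stand-ins for the four single-scale Gaussian blocks B_1..B_4
blocks = [np.abs(rng.standard_normal((N, N))) for _ in range(4)]

# divide each block by the l1-norm of its entries, so that a priori
# no length scale is preferred over another
B = np.hstack([Bs / np.abs(Bs).sum() for Bs in blocks])  # N x 4N dictionary

# after normalization, every scale contributes the same total mass
print([round(np.abs(B[:, s * N:(s + 1) * N]).sum(), 6) for s in range(4)])
```

Without this step the broader Gaussians, whose columns carry more total mass, would be systematically cheaper or more expensive to select under the magnitude-sensitive ℓ1,2 penalty.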
Using this approach, evaluation of \u02c6v(x) requires knowledge of the forward model for x. Therefore, we restrict ourselves here to the nodes xn, n = 1, . . . , N. Let Wn \u2208 R^{3\u00d73} denote the inverse matrix square root of the part of \u02c6S belonging to node xn. With the block-diagonal matrix W built from these Wn (Eq. 7), the coefficients are estimated using \u02c6C = arg min_C ||C||1,2 s.t. ||z \u2212 FW vec(BC)||2^2 < \u03b5. The estimated current density at node xn is \u02c6v(xn) = Wn \u2211_{l=1}^L \u02c6cl bl(xn).\n\n3.2 Experiments\n\nValidation of methods for inverse reconstruction is generally difficult due to the lack of a \u201cground truth\u201d. The measurements z cannot be used in this respect, as the main goal is not to predict the EEG/MEG measurements, but the vector field v(x) as accurately as possible. Therefore, the only way to evaluate inverse methods is to assess their ability to reconstruct known functions. We do this by reconstructing a) simulated current sources and b) sources of real EEG data that are already well localized by other studies. For each EEG measurement, simulated or not, we conduct a 5 \u00d7 5 crossvalidation, i.e. we perform 25 inverse reconstructions based on different training sets containing 80 % of the electrodes. In each crossvalidation run, we evaluate two criteria. Most important is the reconstruction error, defined as Cy = ||vec(Y)/||vec(Y)||2 \u2212 vec(\u02c6Y^tr)/||vec(\u02c6Y^tr)||2||2, where \u02c6Y^tr are the vector field outputs at nodes xn, n = 1, . . . , N estimated using only the training set. This criterion can only be evaluated for the simulated data. For real and simulated data, we also evaluate the generalization error, i.e. the error in the prediction of the remaining 20% (the test set) of the EEG measurements.
This is defined as Cz = ||z^te \u2212 F^te vec(\u02c6Y^tr)||2^2, where z^te and F^te are the parts of z and F belonging to the test set.\n\nWe compared the sparse basis field expansion (S-FLEX) approach using Gaussian basis functions (see section 3.1) to the commonly used approaches of LORETA [11] and the Minimum Current Estimate (MCE) [7], and the recently proposed Focal Vectorfield Reconstruction (FVR) technique [3]. All three competitors correspond to using unit impulses as basis functions while employing different regularizers. The LORETA solution, e.g., is a Tikhonov-regularized least-squares estimate, while MCE is equivalent to applying the lasso to each dimension separately, yielding current vectors that are biased towards being axes-parallel. We here used a variant of MCE in which the original depth compensation approach was replaced by the approach outlined in section 3.1. Interestingly, FVR can be interpreted as a special case of S-FLEX employing the rotation-invariant regularizer R+(C) to enforce both sparsity and smoothness. The tradeoff parameter \u03b1 of this method was chosen as suggested in [3]. All methods were formulated such that the fitness of the solution was ensured by the constraint ||z \u2212 F vec(\u02c6Y^tr)||2^2 < \u03b5. The optimization was carried out using freely available packages for convex programming [12, 2].\n\nSimulated data\n\nWe simulated current densities in the following way. First, we sampled outputs yn, n = 1, . . . , N from a multivariate standard normal distribution. The function (xn, yn) was then spatially smoothed using a Gaussian lowpass filter with standard deviation 2.5 cm.
Finally, each yn was shortened by the 90th percentile of the magnitudes of all yn, leaving only 10% of the current vectors active. Current densities obtained by this procedure usually feature 2-3 active patches (sources) with small to medium extent and smoothly varying magnitude and orientation (see Fig. 3 for an example). This behaviour was considered consistent with the general belief about the sources. We simulated five densities and computed respective pseudo-measurements for 118 channels using the forward model F. As no noise was injected into the system, \u03b5 was set to zero in the following reconstruction.\n\nReal data\n\nWe recorded 113-channel EEG of one healthy subject (male, 26 years) during electrical median nerve stimulation. The EEG electrodes were positioned according to the international 10-20 system. The exact positions were obtained using a 3D digitizer and mapped onto the surface of the head model. EEG data were recorded with a sampling frequency of 2500 Hz and digitally bandpass-filtered between 15 Hz and 450 Hz. Left and right median nerves were stimulated in separate blocks by applying constant square 0.2 ms current pulses to the respective thenars. Current pulses had intensities above the motor threshold (approx. 9 mA), inducing unintended twitches of the thumbs. The interstimulus interval varied randomly between 500 ms and 700 ms. About 1100 trials were recorded for each hand. Artifactual trials as well as artifactual electrodes were excluded from the analysis. For the remaining data, baseline correction was done based on the mean amplitude in the prestimulus interval (-100 ms to -10 ms). Finally, a single measurement vector was constructed by averaging the EEG amplitudes at 21 ms across 1946 trials (50% left hand, 50% right hand). By this means, the EEG response to somatosensory input at the hands was captured with a high signal-to-noise ratio (SNR).
Based on that, the brain areas representing the left and right hand were to be reconstructed, with \u03b5 set according to the estimated SNR.\n\n3.3 Results\n\nFig. 3 shows a simulated current density along with reconstructions according to LORETA, MCE, FVR and S-FLEX. From the figure it becomes apparent that LORETA and MCE do not approximate the true current density very well. While the LORETA solution is rather blurry, merging the two true sources, the MCE solution exhibits many spikes, which could easily be misinterpreted as different sources. Note that the strong orientation bias of MCE cannot be seen in Fig. 3, as only dipole amplitudes are plotted. The estimates of FVR and S-FLEX approximately recover the shape of the sources. S-FLEX comes closest to the true shape, as its estimates are less focal than the ones of FVR. However, S-FLEX still slightly underestimates the extent of the sources.\n\nThe localization results of the left and right N20 generators are shown in Fig. 4. The solutions of FVR and S-FLEX are almost indistinguishable. Both show activity concentrated in two major patches, one in each contralateral somatosensory cortex. This is in good agreement with the localization of the hand areas reported in the literature (e.g. [5]). LORETA estimates only one large active region over the whole central area, with the maximum lying exactly in between the hand areas. The MCE solution consists of eight spikes scattered across the whole somatosensory area.\n\nTab. 1 shows that S-FLEX generalizes better than its competitors, although insignificantly. More importantly, S-FLEX outperforms its peers in terms of reconstruction accuracy. The distance to the runner-up FVR is, however, larger than expected from Fig. 3. This is due to the fact that the parameter of FVR controlling the tradeoff between sparsity and smoothness was fixed here to a value promoting \u201cmaximally sparse sources which are still smooth\u201d.
While this might be a good assumption in practice, it was not rewarded in our validation setting. We here explicitly required reconstruction rather than shrinkage of the sources.\n\n           Cy SIM          Cz SIM          Cz REAL\nLORETA     1.00 \u00b1 0.01     2.87 \u00b1 0.78     8.18 \u00b1 1.38\nFVR        0.955 \u00b1 0.02    1.21 \u00b1 1.00     8.01 \u00b1 1.79\nS-FLEX     0.71 \u00b1 0.04     0.952 \u00b1 0.28    7.95 \u00b1 1.84\nMCE        1.21 \u00b1 0.01     1.86 \u00b1 0.57     8.13 \u00b1 1.60\n\nTable 1: Ability of LORETA, FVR, S-FLEX and MCE to reconstruct simulated currents (Cy SIM) and generalization performance with respect to the EEG measurements (Cz SIM/REAL). Winning entries (reaching significance) are shown in bold face.\n\nFigure 3: Simulated current density (SIM) and reconstruction according to LORETA, FVR, S-FLEX and MCE. Color encodes current magnitude.\n\nFigure 4: Localization of somatosensory evoked N20 generators according to LORETA, FVR, S-FLEX and MCE. Color encodes current magnitude.\n\n4 Conclusion and Outlook\n\nThis paper contributes a novel and general methodology for obtaining sparse decompositions of vector fields.
An important ingredient of our framework is the insight that the vector field estimate should be invariant with respect to a rotation of the coordinate system. Interestingly, the latter constraint together with sparsity leads to a second-order cone programming formulation.\n\nWe have focused here on solving the EEG/MEG inverse problem, where our proposed S-FLEX approach outperformed the state of the art in approximating the true shape of the current sources. However, other fields might as well benefit from the use of S-FLEX: in meteorology, for example, an improved decomposition of wind fields into their driving components might provide novel insights that could be useful for better weather forecasting.\n\nAcknowledgments\n\nThis work was supported in part by the German BMBF grants BCCNB-A4 (FKZ 01GQ0415), BFNTB-A1 (FKZ 01GQ0850) and FaSor (FKZ 16SV2234). We thank Friederike Hohlefeld and Monika Weber for help in preparing the experiment, and Ryota Tomioka for fruitful discussions.\n\nReferences\n\n[1] F.R. Bach, G.R.G. Lanckriet, and M.I. Jordan. Multiple kernel learning, conic duality and the SMO algorithm. In Proceedings of the Twenty-first International Conference on Machine Learning, 2004.\n[2] M. Grant, S. Boyd, and Y. Ye. CVX: Matlab Software for Disciplined Convex Programming, October 2006. http://www.stanford.edu/~boyd/cvx/, Version 1.0RC.\n[3] S. Haufe, V.V. Nikulin, A. Ziehe, K.-R. M\u00fcller, and G. Nolte. Combining sparsity and rotational invariance in EEG/MEG source reconstruction. NeuroImage, 42(2):726\u2013738, 2008.\n[4] C.J. Holmes, R. Hoge, L. Collins, R. Woods, A.W. Toga, and A.C. Evans. Enhancement of MR images using registration for signal averaging. J. Comput. Assist. Tomogr., 22(2):324\u2013333, 1998.\n[5] J. Huttunen, S. Komssi, and L. Lauronen. Spatial dynamics of population activities at S1 after median and ulnar nerve stimulation revisited: An MEG study.
NeuroImage, 32:1024\u20131031, 2006.\n[6] M.S. Lobo, L. Vandenberghe, S. Boyd, and H. Lebret. Applications of second-order cone programming. Lin. Alg. Appl., 284:193\u2013228, 1998.\n[7] K. Matsuura and Y. Okabe. Selective minimum-norm solution of the biomagnetic inverse problem. IEEE Trans. Biomed. Eng., 42:608\u2013615, 1995.\n[8] F.A. Mussa-Ivaldi. From basis functions to basis fields: vector field approximation from sparse data. Biol. Cybern., 67:479\u2013489, 1992.\n[9] G. Nolte and G. Dassios. Analytic expansion of the EEG lead field for realistic volume conductors. Phys. Med. Biol., 50:3807\u20133823, 2005.\n[10] R.D. Pascual-Marqui. Standardized low-resolution brain electromagnetic tomography (sLORETA): technical details. Meth. Find. Exp. Clin. Pharmacol., 24(1):5\u201312, 2002.\n[11] R.D. Pascual-Marqui, C.M. Michel, and D. Lehmann. Low resolution electromagnetic tomography: a new method for localizing electrical activity in the brain. Int. J. Psychophysiol., 18:49\u201365, 1994.\n[12] J.F. Sturm. Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optim. Method. Softw., 11\u201312:625\u2013653, 1999.\n[13] A. Tarantola. Inverse Problem Theory and Model Parameter Estimation. SIAM, Philadelphia, 2005.\n[14] R. Tibshirani. Regression shrinkage and selection via the lasso. J. Roy. Stat. Soc. B Meth., 58(1):267\u2013288, 1996.\n[15] R. Tomioka and S. Haufe. Combined classification and channel/basis selection with L1-L2 regularization with application to P300 speller system. In Proceedings of the 4th International Brain-Computer Interface Workshop and Training Course 2008. Verlag der Technischen Universit\u00e4t Graz, 2008.\n[16] M. Vega-Hern\u00e1ndez, E. Mart\u00ednez-Montes, J.M. S\u00e1nchez-Bornot, A. Lage-Castellanos, and P.A. Vald\u00e9s-Sosa. Penalized least squares methods for solving the EEG inverse problem. Stat. Sinica, 2008. In press.\n[17] D.P.
Wipf and B.D. Rao. An empirical Bayesian strategy for solving the simultaneous sparse approximation problem. IEEE Trans. Signal Proces., 55(7):3704\u20133716, 2007.\n[18] M. Yuan and Y. Lin. Model selection and estimation in regression with grouped variables. J. Roy. Stat. Soc. B Meth., 68(1):49\u201367, 2006.\n", "award": [], "sourceid": 6, "authors": [{"given_name": "Stefan", "family_name": "Haufe", "institution": null}, {"given_name": "Vadim", "family_name": "Nikulin", "institution": null}, {"given_name": "Andreas", "family_name": "Ziehe", "institution": null}, {"given_name": "Klaus-Robert", "family_name": "M\u00fcller", "institution": null}, {"given_name": "Guido", "family_name": "Nolte", "institution": null}]}