{"title": "Predicting Brain States from fMRI Data: Incremental Functional Principal Component Regression", "book": "Advances in Neural Information Processing Systems", "page_first": 537, "page_last": 544, "abstract": "We propose a method for reconstruction of human brain states directly from functional neuroimaging data. The method extends the traditional multivariate regression analysis of discretized fMRI data to the domain of stochastic functional measurements, facilitating evaluation of brain responses to naturalistic stimuli and boosting the power of functional imaging. The method searches for sets of voxel time courses that optimize a multivariate functional linear model in terms of the R-square statistic. Population-based incremental learning is used to search for spatially distributed voxel clusters, taking into account the variation in hemodynamic lag across brain areas and among subjects by voxel-wise non-linear registration of stimuli to fMRI data. The method captures spatially distributed brain responses to naturalistic stimuli without attempting to localize function. Application of the method for prediction of naturalistic stimuli from new and unknown fMRI data shows that the approach is capable of identifying distributed clusters of brain locations that are highly predictive of specific stimuli.", "full_text": "Predicting Brain States from fMRI Data: Incremental Functional Principal Component Regression

S. Ghebreab
ISLA/HCS lab, Informatics Institute
University of Amsterdam, The Netherlands
ghebreab@science.uva.nl

A.W.M. Smeulders
ISLA lab, Informatics Institute
University of Amsterdam, The Netherlands
smeulders@science.uva.nl

P. Adriaans
HCS lab, Informatics Institute
University of Amsterdam, The Netherlands
pietera@science.uva.nl

Abstract

We propose a method for reconstruction of human brain states directly from functional neuroimaging data. 
The method extends the traditional multivariate regression analysis of discretized fMRI data to the domain of stochastic functional measurements, facilitating evaluation of brain responses to complex stimuli and boosting the power of functional imaging. The method searches for sets of voxel time courses that optimize a multivariate functional linear model in terms of the R² statistic. Population-based incremental learning is used to identify spatially distributed brain responses to complex stimuli without attempting to localize function first. Variation in hemodynamic lag across brain areas and among subjects is taken into account by voxel-wise non-linear registration of the stimulus pattern to fMRI data. Application of the method on an international test benchmark for prediction of naturalistic stimuli from new and unknown fMRI data shows that the method successfully uncovers spatially distributed parts of the brain that are highly predictive of a given stimulus.

1 Introduction

To arrive at a better understanding of human brain function, functional neuroimaging traditionally studies the brain’s responses to controlled stimuli. Controlled stimuli have the benefit of leading to clear and often localized response signals in fMRI as they are specifically designed to affect only certain brain functions. The drawback of controlled stimuli is that they are a reduction of reality: one cannot be certain whether the response is due to the reduction or due to the stimulus. Naturalistic stimuli open the possibility of avoiding this question. Naturalistic stimuli, however, carry a high information content in their spatio-temporal structure that is likely to instigate complex brain states. 
The immediate consequence hereof is that one faces the task of isolating relevant responses amidst complex patterns.

To reveal brain responses to naturalistic stimuli, advanced signal processing methods are required that go beyond conventional mass univariate data analysis. Univariate techniques generally lack sufficient power to capture the spatially distributed response of the brain to naturalistic stimuli. Multivariate pattern techniques, on the other hand, have the capacity to identify patterns of information when they are present across the full spatial extent of the brain without attempting to localize function. Here, we propose a multivariate pattern analysis approach for predicting naturalistic stimuli on the basis of fMRI data. Inverting the task from correlating stimuli with fMRI data to predicting stimuli from fMRI data makes it easier to evaluate brain responses to naturalistic stimuli and may extend the power of functional imaging substantially [1].

Various multivariate approaches for reconstruction of brain states directly from fMRI measurements have recently been proposed. In most of these approaches, a classifier is trained directly on the fMRI data to discriminate between known different brain states. This classifier is then used to predict brain states on the basis of new and unknown fMRI data alone. Such approaches have been used to predict what percept is dominant in a binocular rivalry protocol [2], what the orientation is of structures subjects are viewing [3], and what the semantic category is of objects [4] and words [5] subjects see on a screen. In one competition [6], participants trained pattern analyzers on fMRI of subjects viewing two short movies as well as on the subjects’ movie feature ratings. 
Then participants employed the analyzers to predict the experience of subjects watching a third movie based purely on fMRI data. Very accurate predictions were reported for identifying the presence of specific time-varying movie features (e.g. faces, motion) and the observers who coded the movies [7].

We propose an incremental multivariate linear modeling approach for functional covariates, i.e. where both the fMRI data and the external stimuli are continuous. This approach differs fundamentally from existing multivariate linear approaches (e.g. [8]) that instantly fit a given model to the data within the linear framework under the assumption that both the data and the model are discrete. Contemporary neuroimaging studies increasingly use high-resolution fMRI to accurately capture continuous brain processes, frequently instigated by continuous stimulation. Hence, we propose the use of functional data analysis [9], which treats data, or the processes giving rise to them, as functions. This not only allows one to overcome limitations in neuroimaging studies due to the large number of data points compared to the number of samples, but also allows one to exploit the fact that functions defined on a specific domain form an inner product vector space, and in most circumstances can be treated algebraically like vectors [10].

We extend classical multivariate regression analysis of fMRI data [11] to stochastic functional measurements. We show that, cast into an incremental pattern searching framework, functional multivariate regression provides a powerful technique for fMRI-based prediction of naturalistic stimuli.

2 Method

In the remainder, we consider stimuli data and data produced by fMRI scanners as continuous functions of time, sampled at the scan interval and subject to observational noise. 
We treat the data within a functional linear model where both the predictand and predictor are functional, but where the design matrix that takes care of the linear mapping between the two is vectorial.

2.1 The Predictor

The predictor data are derived directly from the four-dimensional fMRI data I(x, t), where x ∈ ℝ³ denotes the spatial position of a voxel and t denotes its temporal position. We represent each of the S voxel time courses in functional form by f_s(t), with t denoting the continuous path parameter and s = 1, ..., S. Rather than directly using voxel time courses for prediction, we use their principal components to eliminate collinearity in the predictor set. Following [10], we use functional principal component analysis. Viviani et al. [10] showed that functional principal component analysis is more effective than its ordinary counterpart in recovering the signal of interest in fMRI data, even if limited or no prior knowledge of the hemodynamic function or experimental design is specified. In contrast to [10], however, our approach incrementally zooms in on stimuli-related voxel time courses for dimension reduction (see Section 2.5).

Given the set of S voxel time courses represented by the vector of functionals f(t) = [f_1(t), ..., f_S(t)]^T, functional principal component analysis extracts the main modes of variation in f(t). The number of modes to retain is determined from the proportion of the variance that needs to be explained. Assuming this is Q, the central concept is that of taking the linear combination

f_sq = ∫ f_s(t) α_q(t) dt    (1)

where f_sq is the principal component score value of voxel time course f_s(t) in dimension q. Principal
Principal\ncomponents \u03b1q(t), q = 1, .., Q are sought for one-by-one by optimizing\n\n\u03b1q(t) = max\nq(t)\n\u03b1\u2217\n\n1\nS\n\nS\n\nXs=1\n\nf 2\nsq\n\nwhere \u03b1q(t) is subject to the following orthonormal constraints\n\n\u03b1q(t)2dt = 1\n\nZt\n\nZt\n\n\u03b1k(t)\u03b1q(t)dt = 0, k \u2264 q.\n\n(2)\n\n(3)\n\nThe mapping of fs(t) onto the subspace spanned by the \ufb01rst Q principal component curves results in\nthe vector of scalars fs = [ fs1, ..., fsQ]. We de\ufb01ne the S \u00d7 Q matrix F = [f1, ..., fS ]T of principal com-\nponents scores as our predictor data in linear regression. That is, we perform principal component\nregression with F as model, allowing to naturally deal with temporal correlations, multicollinearity\nand systematic signal variation.\n\n2.2 The Predictand\n\nWe represent the stimulus pattern by the functional \u0001(t), t being the continuous time parameter. We\nregister \u0001(t) to each voxel time course fs(t) in order to be able to compare equivalent time points\non stimulus and brain activity data. Alignment reduces to \ufb01nding the warping function \u03c9s(t) that\nproduces the warped stimulus function\n\ngs(t) = \u0001(\u03c9s(t)).\n\n(4)\n\nThe time warping function \u03c9s(t) is strictly monotonic, di\ufb00erentiable up to a certain order and takes\ncare of a small shift and nonlinear transformation. A global alignment criteria and least squares\nestimation is used:\n\ns Zt\n\u03c9s(t) = min\n\u03c9\u2217\n\n(\u0001(\u03c9\u2217\n\ns(t)) \u2212 fs(t))2dt.\n\n(5)\n\nRegistration of \u0001(t) to all voxel time courses S results in predictand data g(t) = [g1(t), ..., gS (t)]T ,\nwhere g(t) is \u0001(t) registered onto voxel times-course f (t). Our motivation for using voxel-wise\nregistration over standard convolution of stimulus \u0001(t) with the hemodynamic reponse function, is\nthe large variability in hemodynamic delays across brain regions and subjects. 
A non-linear warp of η(t) does not guarantee an outcome that is associated with brain physiology; however, it allows one to capture unknown, subtle, localized variations in hemodynamic delays across brain regions and subjects.

2.3 The Model

We employ the predictor data to explain the predictand data within a linear modeling approach, i.e. our multivariate linear model is defined as

g(t) = F β(t) + ε(t)    (6)

with β(t) = [β_1(t), ..., β_Q(t)]^T being the Q × 1 vector of regression functions. The regression functions are estimated by least squares minimization such that

β̂(t) = argmin_{β*(t)} ∫ (g(t) − F β*(t))² dt,    (7)

under the assumption that the residual functions ε(t) = [ε_1(t), ..., ε_S(t)]^T are independent and normally distributed with zero mean. The estimated regression functions provide the best estimate of g(t) in the least squares sense:

ĝ(t) = F β̂(t).    (8)

Given a new (sub)set of voxel time courses, prediction of a stimulus pattern now reduces to computing the matrix of principal component scores from this new set and weighting these scores by the estimated regression functions β̂(t).

2.4 The Objective

The overall fit of the model to the data is expressed in terms of the adjusted R² statistic. The functional counterpart of the traditional R² is computed on the basis of g(t), its mean ḡ(t) and its estimate ĝ(t). For the voxel set S,

ġ_S(t) = Σ_{s=1}^S (g_s(t) − ḡ(t))²    (9)

g̈_S(t) = Σ_{s=1}^S (g_s(t) − ĝ_s(t))²    (10)

are derived, where the first term is the variation of the response about its mean and the second the error sum of squares function. 
The adjusted R-square function is then defined as

R_S(t) = 1 − [g̈_S(t)/(S − Q − 1)] / [ġ_S(t)/(S − 1)]    (11)

where the degrees of freedom S − Q − 1 and S − 1 adjust the R-square. Our objective is to find the set of voxel time courses S defined as

S = argmax_{S* ⊂ S} ∫ R_S*(t) dt    (12)

where S* denotes a subset of the entire collection of voxel time courses S extracted from a single fMRI scan. That is, we aim at finding spatially distributed voxel responses S that best explain the naturalistic stimuli, without making any prior assumptions about the location and size of voxel subsets.

2.5 The Search

In order to efficiently find the subset of voxels that maximizes Equation (12), we use Population-Based Incremental Learning (PBIL) [12], which combines Genetic Algorithms with Competitive Learning. The PBIL algorithm uses a probability vector to explore the space of solutions. It incrementally generates solutions by sampling from that probability vector, evaluates these solutions and selects promising ones to update the probability vector. Here, at increment i, the probability vector p^i = [p^i_1, ..., p^i_S] is used to generate a population of N solutions M^i = [m^i_1, ..., m^i_N], where each member is an S-vector of binary values: m^i_n = [m^i_n1, ..., m^i_nS]. A value of 1 for m_ns means that for solution n the corresponding voxel time course f_s(t) is included in the predictor set, while a value 0 indicates exclusion. Each member m^i_n is evaluated in terms of its adjusted R² value, and the members with the highest values form the joint probability vector p*. A new probability vector is subsequently constructed for the next generation via competitive learning:

p^{i+1} = γ p^i + (1 − γ) p*.    (13)

The learning parameter γ controls the search: a low value focuses the search entirely on the most recently selected voxel subsets, while a high value ensures that previously selected voxel subsets continue to be exploited. In order to ensure spatial coherence and limit computational load, we employ the PBIL algorithm not on single time courses, but on averages of spatial clusters of voxel time courses. That is, we first spatially cluster voxel locations as shown in Figure 1, then compute the average time course for each cluster, and then explore the averages via PBIL for model building.

2.6 The Prediction

The subset of voxel time courses that results from population-based incremental learning defines the most predictive voxel locations and associated regression functions. Given new and spatially normalized fMRI data, represented by f̃(t) = [f̃_1(t), ..., f̃_S(t)]^T, prediction of a stimulus then reduces to computing

g̃(t) = F̃ β̂(t).    (14)

Here, g̃(t) is the vector of predicted stimuli, of which the mean is considered to be the sought stimulus. The matrix F̃ is the principal component scores matrix obtained from performing functional principal component analysis on the subset f̃_S(t), with S referring to the set of most predictive voxels as determined by training.

Figure 1: Examples of K-means clustering of voxel locations using Euclidean distance. Left: 1024-means clustering output. Right: 512-means clustering output. 
Different gray values indicate different clusters in a spatially normalized brain atlas.

3 Experiments and Results

3.1 Experiment

Evaluation of our method is done on a data subset from the 2006 Pittsburgh brain activity interpretation competition (PBAIC) [6, 7], involving fMRI scans of three different subjects and two movie sessions. In each session, a subject viewed a new Home Improvement sitcom movie for approximately 20 minutes. The 20-minute movie contained 5 interruptions where no video was present, only a white fixation cross on a black background. All three subjects watched the same two movies. The scans produced volumes with approximately 35,000 brain voxels, each approximately 3.28mm by 3.28mm by 3.5mm, with one volume produced every 1.75 seconds. These scans were preprocessed (motion correction, slice time correction, linear trend removal) and spatially normalized (non-linear registration to the Montreal Neurological Institute brain atlas).

After fMRI scanning, the three subjects watched the movie again to rate 30 movie features at time intervals corresponding to the fMRI scan rate. In our experiments, we focus on the 13 core movie features: amusement, attention, arousal, body parts, environmental sounds, faces, food, language, laughter, motion, music, sadness and tools. The real-valued ratings were convolved with a hemodynamic response function (HRF) modeled by two gamma functions, then subjected to voxel-wise non-linear registration as described in Section 2.2.

For training and testing our model, we removed the parts corresponding with video presentations of a white fixation cross on a black background. Taking into account the hemodynamic lag, we divided each fMRI scan and each subject rating into 6 parts corresponding with the parts where the movie was playing. On average, each movie part contained 105 discrete measurements. 
We then functionalized these parts by fitting a 30-coefficient B-spline to each voxel’s discrete time course. This resulted in 18 data sets for training (3 subjects × 6 movie parts) and another 18 for testing. We used movie 1 data for training and movie 2 data for prediction, and vice versa. We performed data analysis at two levels. For each feature, first the individual brain scans were analyzed with our method, resulting in a first sifting of voxels. First-level analysis results for a given feature were then subjected to second-level analysis to identify across-subject predictive voxels. The Pearson product-moment correlation coefficient between manual feature rating functions and the automatically predicted feature functions was used as an evaluation measure.

3.2 Results

All results were obtained with Q = 4 principal component dimensions, learning parameter value γ = 0.6 and K-means clustering with 1024 clusters for all movie features. These values for Q and γ produced the overall highest average cross correlation value in a small parameter optimization experiment (data not shown here). 
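Both analysis levels repeatedly run the population-based search of Section 2.5 with these settings. A minimal sketch of PBIL over K cluster-inclusion bits, with the integrated adjusted-R² objective of Eq. (12) abstracted into a fitness callback (all parameter names below are our own, not the paper’s):

```python
import numpy as np

def pbil(fitness, K, n_pop=50, n_elite=5, gamma=0.6, n_iter=100, seed=0):
    # Population-Based Incremental Learning over K voxel clusters (Section 2.5).
    # `fitness` maps a binary inclusion mask m in {0,1}^K to a score; in the
    # paper this is the integrated adjusted R^2 of Eq. (12), abstracted here.
    rng = np.random.default_rng(seed)
    p = np.full(K, 0.5)                                 # probability vector p^i
    best_m, best_f = None, -np.inf
    for _ in range(n_iter):
        M = (rng.random((n_pop, K)) < p).astype(int)    # sample N solutions
        f = np.array([fitness(m) for m in M])
        elite = M[np.argsort(f)[-n_elite:]]             # highest-scoring members
        p_star = elite.mean(axis=0)                     # joint probability vector p*
        p = gamma * p + (1.0 - gamma) * p_star          # competitive update, Eq. (13)
        i = int(np.argmax(f))
        if f[i] > best_f:
            best_f, best_m = f[i], M[i].copy()
    return best_m, best_f
```

On a toy separable objective (matching a hidden target mask), the probability vector converges bit-wise toward the target, illustrating why a low γ tracks the most recent elite subsets while a high γ preserves earlier selections.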
Little performance difference was seen for various numbers of dimensions, indicating that the essential information can be captured with as few as 4 dimensions. Significant performance differences across features, however, were observed for different learning parameter values, indicating considerable variation in brain response to distinct stimuli.

Figure 2: Left: normalized cross correlation values from cross-validation for 13 core movie features. Right: functionalized subject 3 rating (solid red) and predicted rating (dotted blue) for the language feature of part 5 of movie 1.

Figure 2 (left) shows the average of 2 × 18 cross correlation coefficients from cross-validation for all 13 movie features. For the features faces, language and motion, cross correlation values above 0.5 were obtained, meaning that there is a significant degree of match between the subject ratings and the predicted ratings. Reasonable predictions were also obtained for the features arousal and body parts. Our results are consistent with the top 3 ranked entries of the 2006 PBAIC in that the features faces and language are reliably predicted. These entries used recurrent neural networks, ridge regression and dynamic Gaussian Markov Random Field modeling on the entire test data benchmark, yielding across-feature average cross correlations of 0.49, 0.49 and 0.47, respectively. Here, the feature average cross correlation value based on the reduced training data set is 0.36. Note that in the 2006 competition our method ranked first in the actor category [6]. 
We were able to accurately predict which actor the subjects were seeing purely on the basis of fMRI scans [7].

The best single result, with the highest cross correlation value of 0.76, was obtained for the feature language of subject 3 watching part 5 of movie 1. For this feature, first-level analysis of each of the 18 training data sets associated with movie 2 produced a total of 1738 predictive voxels. In the second-level analysis, these voxels were analyzed again to arrive at a reduced data set of 680 voxels for building the multivariate functional linear model and determining the regression functions β̂(t). For prediction of the feature language, the corresponding voxel time courses were extracted from the fMRI data of subject 3 watching movie 1 part 5, and weighted by β̂(t). The manual rating of the feature language of movie 1 part 5 by subject 3 and the average of the automatically predicted feature functions are shown in Figure 2 (right).

Figure 3: Glass view, gray level image with color overlay and surface rendering of 1738 voxels from first-level analysis. Color denotes predictive power and the cross hair shows the most predictive location.

Figure 3 shows a glass view, a gray level image with color overlay and a surface rendering of the 1738 voxels (approximately 40 clusters) from first-level analysis. The cross hair shows the voxel location in Brodmann area 47 that was found to be predictive across most subjects and movie parts: it was selected in 6 out of 18 training items (see color bar). 
The predictive locations correspond with the left and right inferior frontal gyri, which are known to be involved in language processing. The distributed nature of these clusters is consistent with earlier findings that language processing occurs in diffuse brain regions, including primary auditory and visual cortex and frontal regions in the left and right hemisphere, in homologous regions [13].

As we are dealing with curves, the possibility exists to explore additional data characteristics such as curvature. We performed an experiment with 1st-order derivative functions rather than the original functions, to exploit potentially available higher-order structure. Figure 4 (left) shows the cross correlation for 1st-order derivative functions. The cross correlation values are similar to the ones shown in Figure 2. The average cross correlation value is slightly better than for the original data: 0.38. This may indicate that higher-order structures contain more predictive power.

In order to gain insight into the effect of non-linear warping on prediction performance, we conducted an experiment in which we used convolutions of the stimulus η(t) with different forms of an HRF modeled by two gamma functions. The various HRF functions were obtained by varying the delay of response (relative to onset), delay of undershoot (relative to onset), dispersion of response, dispersion of undershoot, and ratio of response to undershoot. To determine g_s(t), we convolved η(t) with 16 different HRF functions, and selected the convolved one with the highest cross correlation with f_s(t) to be g_s(t). Hence, we parametrically modeled the HRF and learned its parameters from the data.

Figure 4 (right) shows the results of the experiments with convolution of stimuli data with HRF models learned from the data. As can be seen, the cross correlation values are much lower compared to the values in Figure 2 (left). 
The average cross correlation value is 0.31. Hence, non-linear warping of the stimulus onto voxel time courses significantly enhances the predictive power of our model. This suggests that non-linear warping is a potential alternative for determining the best possible HRF estimate, overcoming the potential negative consequences of assuming HRF consistency across subjects or brain regions [14].

Figure 4: Left: normalized cross correlation values from cross-validation for 13 core movie features, using 1st-order derivative data. Right: cross correlation values from cross-validation for 13 core movie features, using HRF-convolved rather than warped stimuli data.

4 Conclusion

Functional data analysis provides the possibility to fully exploit structure in inherently continuous data such as fMRI. The advantage of functional data analysis for principal component analysis of fMRI data was recently demonstrated in [10]. Here, we proposed a functional linear model that treats fMRI and stimuli as stochastic functional measurements. Cast into an incremental pattern searching framework, the method provides the ability to identify important covariance structure of spatially distributed brain responses and stimuli, i.e. it directly couples activation across brain regions rather than first localizing and then integrating function. The method is suited for unbiased probing of functional characteristics of brain areas as well as for exposing meaningful relations between complex stimuli and distributed brain responses. This is supported by the good prediction performance of our method in the 2006 PBAIC international competition for brain activity interpretation. We are currently extending the method with new objective functions, dimension reduction techniques and multi-target search techniques to cope with multiple (interacting) stimuli. Also, in this work we made use of spatial clusters at a single hierarchical level. 
Preliminary results with hierarchical clustering to arrive at “supervoxels” at different spatial resolutions appear to further improve prediction power.

References

[1] J. Haynes and G. Rees. Decoding mental states from brain activity in humans. Nature Neuroscience, 7(8):523–534, 2006.

[2] J. Haynes and G. Rees. Predicting the orientation of invisible stimuli from activity in human primary visual cortex. Nature Neuroscience, 7(5):686–691, 2005.

[3] Y. Kamitani and F. Tong. Decoding the visual and subjective contents of the human brain. Nature Neuroscience, 8(5):679–685, 2005.

[4] S.M. Polyn, V.S. Natu, J.D. Cohen, and K.A. Norman. Category-specific cortical activity precedes retrieval during memory search. Science, 310(5756):1963–1966, 2005.

[5] T.M. Mitchell, R. Hutchinson, R.S. Niculescu, F. Pereira, X. Wang, M. Just, and S. Newman. Learning to decode cognitive states from brain images. Machine Learning, 57(1-2), 2004.

[6] W. Schneider, A. Bartels, E. Formisano, J. Haxby, R. Goebel, T. Mitchell, T. Nichols, and G. Siegle. Competition: Inferring experience based cognition from fMRI. In Proceedings of the Organization of Human Brain Mapping, Florence, Italy, June 15, 2006.

[7] Editorial. What’s on your mind. Nature Neuroscience, 6(8):981, 2006.

[8] K.J. Worsley, J.B. Poline, K.J. Friston, and A.C. Evans. Characterizing the response of PET and fMRI data using multivariate linear models. Neuroimage, 6, 1997.

[9] J. Ramsay and B. Silverman. Functional Data Analysis. Springer-Verlag, 1997.

[10] R. Viviani, G. Grohn, and M. Spitzer. Functional principal component analysis of fMRI data. Human Brain Mapping, 24:109–129, 2005.

[11] D.B. Rowe and R.G. Hoffmann. Multivariate statistical analysis in fMRI. IEEE Engineering in Medicine and Biology, 25:60–64, 2006.

[12] Shumeet Baluja. 
Population-based incremental learning: A method for integrating genetic search based function optimization and competitive learning. Technical Report CMU-CS-94-163, Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, 1994.

[13] M.A. Gernsbacher and M.P. Kaschak. Neuroimaging studies of language production and comprehension. Annual Review of Psychology, 54:91–114, 2003.

[14] D.A. Handwerker, J.M. Ollinger, and M. D’Esposito. Variation of BOLD hemodynamic response function across subjects and brain regions and their effects on statistical analyses. NeuroImage, 21(4):1639–1651, 2004.", "award": [], "sourceid": 276, "authors": [{"given_name": "Sennay", "family_name": "Ghebreab", "institution": null}, {"given_name": "Arnold", "family_name": "Smeulders", "institution": null}, {"given_name": "Pieter", "family_name": "Adriaans", "institution": null}]}