{"title": "EEG-GRAPH: A Factor-Graph-Based Model for Capturing Spatial, Temporal, and Observational Relationships in Electroencephalograms", "book": "Advances in Neural Information Processing Systems", "page_first": 5371, "page_last": 5380, "abstract": "This paper presents a probabilistic-graphical model that can be used to infer characteristics of instantaneous brain activity by jointly analyzing spatial and temporal dependencies observed in electroencephalograms (EEG). Specifically, we describe a factor-graph-based model with customized factor-functions defined based on domain knowledge, to infer pathologic brain activity with the goal of identifying seizure-generating brain regions in epilepsy patients. We utilize an inference technique based on the graph-cut algorithm to exactly solve graph inference in polynomial time. We validate the model by using clinically collected intracranial EEG data from 29 epilepsy patients to show that the model correctly identifies seizure-generating brain regions. Our results indicate that our model outperforms two conventional approaches used for seizure-onset localization (5-7% better AUC: 0.72, 0.67, 0.65) and that the proposed inference technique provides 3-10% gain in AUC (0.72, 0.62, 0.69) compared to sampling-based alternatives.", "full_text": "EEG-GRAPH: A Factor-Graph-Based Model for\nCapturing Spatial, Temporal, and Observational\n\nRelationships in Electroencephalograms\n\nYogatheesan Varatharajah \u2217 Min Jin Chong\u2217 Krishnakant Saboo\u2217\n\nBrent Berry\u2020\n\nBenjamin Brinkmann\u2020\n\nGregory Worrell\u2020\n\nRavishankar Iyer\u2217\n\nAbstract\n\nThis paper presents a probabilistic-graphical model that can be used to infer char-\nacteristics of instantaneous brain activity by jointly analyzing spatial and tempo-\nral dependencies observed in electroencephalograms (EEG). 
Speci\ufb01cally, we de-\nscribe a factor-graph-based model with customized factor-functions de\ufb01ned based\non domain knowledge, to infer pathologic brain activity with the goal of identify-\ning seizure-generating brain regions in epilepsy patients. We utilize an inference\ntechnique based on the graph-cut algorithm to exactly solve graph inference in\npolynomial time. We validate the model by using clinically collected intracra-\nnial EEG data from 29 epilepsy patients to show that the model correctly iden-\nti\ufb01es seizure-generating brain regions. Our results indicate that our model out-\nperforms two conventional approaches used for seizure-onset localization (5\u20137%\nbetter AUC: 0.72, 0.67, 0.65) and that the proposed inference technique provides\n3\u201310% gain in AUC (0.72, 0.62, 0.69) compared to sampling-based alternatives.\n\nIntroduction\n\n1\nStudying the neurophysiological processes within the brain is an important step toward understand-\ning the human brain. Techniques such as electroencephalography are exceptional tools for studying\nthe neurophysiological processes, because of their high temporal and spatial resolution. An elec-\ntroencephalogram (EEG) typically contains several types of rhythms and discrete neurophysiolog-\nical events that describe instantaneous brain activity. On the other hand, the neural activity taking\nplace in a brain region is very likely dependent on activities that took place in the same region at pre-\nvious time instances. Furthermore, some EEG channels show inter-channel correlation due to their\nspatial arrangement [1]. Those three characteristics are related, respectively, to the observational,\ntemporal, and spatial dependencies observed in time-series EEG signals.\nThe majority of the literature focuses on identifying and developing detectors for features relating to\nthe different rhythms and discrete neurophysiological events in the EEG signal [2]. 
Some effort has been made to understand the inter-channel correlations [3] and temporal dependencies [4] observed in EEG. Despite these separate efforts, very little effort has been made to combine those dependencies into a single model. Since those dependencies possess complementary information, using only one of them generally results in poor understanding of the underlying neurophysiological phenomena. Hence, a unified framework that jointly captures all three dependencies in EEG addresses an important research problem in electrophysiology. In this paper, we describe a graphical-model-based approach to capture all three dependencies, and we analyze its efficacy by applying it to a critical problem in clinical neurology.\n\n\u2217Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801. Email: {varatha2, mchong6, ksaboo2, rkiyer}@illinois.edu\n\u2020Department of Neurology, Mayo Clinic, Rochester, Minnesota 55904. Email: {Berry.Brent, Brinkmann.Benjamin, Worrell.Gregory}@mayo.edu\n31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.\n\nGraphical models in general are useful for representing dependencies between random variables. Factor graphs are a specific type of graphical model that has random variables and factor functions as the vertices in the graph [5]. A factor function is used to describe the relationship between two or more random variables in the graph. Factor graphs are particularly useful when custom definitions of the dependencies, such as in our case, need to be encoded in the graph. Hence, we have chosen to adopt a factor graph model to represent the three kinds of dependencies described previously. These dependencies are represented via three different factor functions, namely observational, spatial, and temporal factor functions.
We assess the applicability of this model in localization of seizure\nonset zones (SOZ), which is a critical step in treating patients with epilepsy [6]. In particular, our\nmodel is utilized to isolate those neural events in EEG that are associated with the SOZ, and are\neventually used to deduce the location of the SOZ. However, in a general setting, with appropriate\nde\ufb01nitions of factor functions, one can utilize our model to describe other neural events of interest\n(e.g., events related to behavioral states or memory processing). Major contributions of our work\nare the following.\n\n1. A framework based on factor graphs that jointly represents instantaneous observation-\nbased, temporal, and spatial dependencies in EEG. This is the \ufb01rst attempt to combine\nthese three aspects into a single model in the context of EEG analysis.\n\n2. A lightweight and exact graph inference technique based on customized de\ufb01nitions of fac-\ntor functions. Exact graph inference is typically intractable in most graphical model repre-\nsentations because of exponentially growing state spaces.\n\n3. A markedly improved technique for localizing SOZ based on the factor-graph-based model\ndeveloped in this paper. Existing approaches utilize only the observations made in the EEG\nto determine the SOZ and do not utilize spatial and temporal dependencies.\n\nOur study establishes the feasibility of the factor-graph-based model and demonstrates its application\nin SOZ localization on a real EEG dataset collected from epilepsy patients who underwent epilepsy\nsurgery. Our results indicate that utilizing the spatial and temporal dependencies in addition to\nobservations made in the EEG provides a 5\u20137% improvement in the AUC (0.72, 0.67, 0.65) and\noutperforms alternative approaches utilized for SOZ localization. 
Furthermore, our experiments demonstrate that the lightweight graph inference technique provides a considerable improvement (3\u201310%) in SOZ localization compared to sampling-based alternatives (AUC: 0.72, 0.62, 0.69).\n\n2 Related work\nIdentifying features (or biomarkers) that describe underlying neurophysiological phenomena has been a major focus of research in the EEG literature [2]. Spectral features [7], interictal spikes [8], high-frequency oscillations [2], and phase-amplitude coupling [4] are some of the widely used features. Although feature identification is an important step in any electrophysiologic study, features alone often cannot completely describe the underlying physiological phenomena. Researchers have also looked at spatial connectivity between EEG channels as a means of describing neurophysiological activities [3]. In recent times, because of the availability of long-term EEG recordings, understanding of the temporal dependencies within various brain activities has also advanced significantly [4]. A recent attempt at combining spatial and temporal constraints has shown promise despite lacking comprehensive validation [9]. Regardless, a thoroughly validated and general model that captures all the factors and is applicable to a variety of problems has not, to our knowledge, been proposed in the EEG literature. Since the three factors are complementary to each other, a model that jointly represents them addresses an important research gap in the field of electrophysiology.\nGraphical models have been widely used in medical informatics [10], intrusion detection [11], social network modeling [12], and many other areas. Although factor graphs are applicable in all these settings, their applications in practice are still very much dependent on problem-specific custom definitions of factor functions.
Nevertheless, with some level of customization, our work provides a general framework to describe the different dependencies observed in EEG signals. A similar framework for emotion prediction is described in Moodcast [12], for which the authors used a factor graph model to describe the influences of historical information, other users, and dynamic status to predict a user\u2019s emotions in a social network setting. Although our factor functions are derived in a similar fashion, we show that graph inference can be performed exactly using the proposed lightweight algorithm, and that it outperforms the sampling-based inference method utilized in Moodcast. Our algorithm for inference was inspired by [13], in which the authors used an energy-minimization-based approach for performing exact graph inference in a Markov random field-based model.\n\n3 Model description\nHere we provide a mathematical description of the model and the inference procedure. In a nutshell, we are interested in inferring the presence of a neurophysiological phenomenon of interest by observing rhythms and discrete events (referred to as observations) present in the EEG, and by utilizing their spatial and temporal patterns as represented by a probabilistic graph. Since the generality of our model relies on the ability to customize the definitions of specific dependencies described by the model, we have adopted a factor-graph-based setting to represent our model.\nDefinitions: Suppose that EEG data of a subject are recorded through M channels. Initially, the data are discretized by dividing the recording duration into N epochs. We represent the interactions between the channels at an epoch n as a dynamic graph Gn = (V, En), where V is the set of |V| = M channels and En \u2282 V \u00d7 V is the set of undirected links between channels. The state of a channel k in the nth epoch is denoted by Yn(k), which might represent a phenomenon of interest.
For\nexample, in the case of SOZ localization, the state might be a binary value representing whether the\nkth channel in the nth epoch exhibits a SOZ-likely phenomenon. We also use Yn to denote the states\nof all the channels at epoch n, and use Y to denote the set of all possible values that Yn(k) can take.\nWe refer to the EEG rhythm or discrete event present in the EEG as observations and use Xn(k)\nto denote the observation present in the nth epoch of the kth channel. Depending on the number of\nrhythms and/or events, Xn(k) could be a scalar or vector random variable. The observations made\nin all the channels at epoch n are denoted by Xn.\nInference: Given a dynamic network Gn, and the observations Xn, our goal is to infer the states\nof the channels at epoch n, i.e., Yn. In our approach, we derive the inference model using a factor\ngraph with factor functions de\ufb01ned as shown in Table 1. The factor functions are de\ufb01ned using\nexponential relationships so that they attain their maximum values when the exponents are zero, and\nexponentially decay otherwise. 
All factor functions range in [0, 1].\n\nTable 1: Factor functions used in our EEG model and their descriptions, definitions, and notations.\n\nObservational: f(Yn(k), \u03c6(Xn(k)))\n  Description: Measures the direct contribution of the observations made in a channel to the phenomenon of interest.\n  Definition: $e^{-(Y_n(k) - \\phi(X_n(k)))^2}$\n  Notations: \u03c6 : X \u2192 Y is a mapping from the observations to the phenomenon of interest. In general, it is not an accurate map, because it is based on observations alone.\n\nSpatial: g(Yn(k), Yn(l))\n  Description: Measures the correlation between the states of two channels at the same epoch.\n  Definition: $e^{-\\frac{1}{d_{kl}^2}(Y_n(k) - Y_n(l))^2}$\n  Notations: dkl denotes the physical distance between electrodes (or channels) k and l.\n\nTemporal: h(Yn(k), \u2126n\u22121(k))\n  Description: Measures the correlation between a channel\u2019s current state and its previous states.\n  Definition: $e^{-(Y_n(k) - \\Omega_{n-1}(k))^2}$\n  Notations: \u2126n\u22121(k) is a function of all previous states of channel k. E.g., $\\Omega_{n-1}(k) = \\frac{1}{n-1} \\sum_{i=1}^{n-1} Y_i(k)$.\n\nWith these definitions, the state of a channel is spatially related to the states of every other channel, temporally related to a function of all its previous states, and, at the same time, explained by the current observation of the channel. These dependencies and the factor functions that represent them are illustrated in Figs. 1a and 1b, respectively. (Note that Fig. 1b illustrates only the factor functions related to Channel 1 and that similar factor functions exist for other channels although they are not shown in the figure.) Provided with that information, for a particular state vector Y, we can write P(Y|Gn) as in Eq. 1, where Z is a normalizing factor.
In general, it is infeasible to find the normalizing constant Z, because it would require exploration of the space $\\mathcal{Y}^M$, which contains $|\\mathcal{Y}|^M$ states.\n\n$$P(Y \\mid G_n) = \\frac{1}{Z} \\prod_{k=1}^{M} \\left[ \\prod_{i \\neq k} g(Y(k), Y(i)) \\times f(Y(k), \\phi(X_n(k))) \\times h(Y(k), \\Omega_{n-1}(k)) \\right] \\qquad (1)$$\n\n(a) Factors that explain the state of a brain region.\n\n(b) Dependencies as factor functions.\n\nFigure 1: The dependencies observed in brain activity and a representative factor graph model.\n\nTherefore, we define the following predictive function (Eq. 2) for inferring Yn with the highest likelihood per Eq. 1.\n\n$$Y_n = \\arg\\max_{Y \\in \\mathcal{Y}^M} \\prod_{k=1}^{M} \\left[ \\prod_{i \\neq k} g(Y(k), Y(i)) \\times f(Y(k), \\phi(X_n(k))) \\times h(Y(k), \\Omega_{n-1}(k)) \\right] \\qquad (2)$$\n\nStill, finding a Y that maximizes this objective function involves a discrete optimization over the space $\\mathcal{Y}^M$. A brute-force approach to finding an exact solution is infeasible when M is large. Several methods, such as junction trees [14], belief propagation [15], and sampling-based methods such as Markov Chain Monte Carlo (MCMC) [16, 17], have been proposed to find approximate solutions. However, we show that this can be calculated exactly when the aforementioned definitions of the factor functions are utilized. We can rewrite Eq. 2 using the definitions in Table 1 as follows.\n\n$$Y_n = \\arg\\max_{Y \\in \\mathcal{Y}^M} \\prod_{k=1}^{M} \\left[ \\prod_{l \\neq k} e^{-\\frac{1}{d_{kl}^2}(Y(k) - Y(l))^2} \\times e^{-(Y(k) - \\phi(X_n(k)))^2} \\times e^{-(Y(k) - \\Omega_{n-1}(k))^2} \\right] \\qquad (3)$$\n\nNow, representing the product terms as summations inside the exponent and using the facts that the exponential function is monotonically increasing and that maximizing a function is equivalent to minimizing the negative of that function, we can rewrite Eq.
3 as:\n\n$$Y_n = \\arg\\min_{Y \\in \\mathcal{Y}^M} \\sum_{k=1}^{M} \\left[ \\sum_{l \\neq k} \\frac{1}{d_{kl}^2}(Y(k) - Y(l))^2 + (Y(k) - \\phi(X_n(k)))^2 + (Y(k) - \\Omega_{n-1}(k))^2 \\right] \\qquad (4)$$\n\nAlthough the individual components in this objective function are solvable optimization problems, the combination of them makes it difficult to solve. However, the objective function resembles that of a standard graph energy minimization problem and hence can be solved using graph-cut algorithms [18]. In this paper, we describe a solution for minimizing this objective function when |Y| = 2, i.e., the brain states are binary. Although that is a limitation, the majority of the brain state classification problems can be reduced to binary state cases when the time window of classification is appropriately chosen. Regardless, potential solutions for |Y| > 2 are discussed in Section 6.\nGraph inference using min-cut for the binary state case: We constructed the graph shown in Fig. 2a with two special nodes in addition to the EEG channels as vertices. The additional nodes function as source (marked by 1) and sink (marked by 0) nodes in the conventional min-cut/max-flow problem. Weights in this graph are assigned as follows:\n\n\u2022 Every channel is connected with every other channel, and the link between channels k and l is assigned a weight of $\\frac{1}{d_{kl}^2}(Y(k) - Y(l))^2$ based on the distance between them.\n\u2022 Every channel is connected with the source node, and the link between channel k and the source is assigned a weight of $(1 - \\Omega_{n-1}(k))^2 + (1 - \\phi(X_n(k)))^2$.\n\u2022 Every channel is also connected with the sink node, and the link between channel k and the sink is assigned a weight of $\\Omega_{n-1}^2(k) + (\\phi(X_n(k)))^2$.\n\nProposition 1. An optimal min-cut partitioning of the graph shown in Fig. 2a minimizes the objective function given in Eq.
4.\n\n(a) New graphical structure\n\n(b) Min-cut partitioning\n\nFigure 2: Graph inference using the min-cut algorithm.\n\nProof: Suppose that we perform an arbitrary cut on the graph shown in Fig. 2a, resulting in two sets of vertices S and T. The energy of the graph after the cut is performed is:\n\n$$E_{cut} = \\sum_{k=1}^{M} \\left[ (Y(k) - \\Omega_{n-1}(k))^2 + (Y(k) - \\phi(X_n(k)))^2 \\right] + \\sum_{k \\in T} \\sum_{l \\in S} \\left[ \\frac{1}{d_{kl}^2}(Y(k) - Y(l))^2 \\right]$$\n\nIt can be seen that, for the same partition of vertices, the objective function given in Eq. 4 attains the same quantity as Ecut. Therefore, since the optimal min-cut partition minimizes the energy Ecut, it minimizes the objective function given in Eq. 4.\nNow suppose that we are given two sets of nodes {S\u2217, T\u2217} as the optimal partitioning of the graph. Without loss of generality, let us assume that S\u2217 contains the source and T\u2217 contains the sink. Then, the other vertices in S\u2217 and T\u2217 are assigned 1 and 0 as their respective states to obtain the optimal Y that minimizes the objective function given in Eq. 4.\n\n4 Application of the model in seizure onset localization\nBackground: Epilepsy is a neurological disorder characterized by spontaneously occurring seizures. It affects roughly 1% of the world\u2019s population, and many do not respond to drug treatment [19]. Epilepsy surgery, which involves resection of a portion of the patient\u2019s brain, can reduce and often eliminate seizures [20]. The success of resective surgery depends on accurate localization of the seizure-onset zone [21].
The conventional practice is to identify the EEG channels that show the\nearliest seizure discharge via visual inspection of the EEG recorded during seizures, and to remove\nsome tissue around these channels during the resective surgery. This method, despite being the cur-\nrent clinical standard, is very costly, time-consuming, and burdensome to the patients, as it requires\na lengthy ICU stay so that an adequate number of seizures can be captured. One approach, which\nhas recently become a widely researched topic, utilizes between-seizure (interictal) intracranial EEG\n(iEEG) recording to localize the seizure onset zones [22, 6]. This type of localization is preferable\nto the conventional method, as it does not require a lengthy ICU stay.\nInterictal SOZ identi\ufb01cation methodology: Like that of the conventional approach, the goal here\nis to identify a few channels that are likely to be in the SOZ. Channels situated directly on or close to\na SOZ exhibit different forms of transient electrophysiologic events (or abnormal events) between\nseizures [23]. The frequency of such abnormal neural events plays a major role in determining\nthe SOZ. However, capturing these abnormal neural events that occur in distinct locations of the\nbrain alone is often not suf\ufb01cient to establish an area in the brain as the SOZ. The reason is that\ninsigni\ufb01cant artifacts present in the EEG may show characteristics of those abnormal events that are\nassociated with SOZ (referred to as SOZ-likely events). In order to set apart the SOZ-likely events,\ntheir spatial and temporal patterns could be utilized. It is known that SOZ-likely events occur in a\nrepetitive and spatially correlated fashion (i.e., neighboring channels exhibit such events at the same\ntime) [6]. 
Hence, the factor-graph-based model described in Section 3 can be applied to capture and utilize the spatial and temporal correlations in isolating the SOZ-likely events.\nIdentifying abnormal neural events: Spectral characteristics of iEEG measured in the form of power-in-bands (PIB) features have been widely utilized to identify abnormal neural events [24, 6, 7]. In this paper, PIB features are extracted as spectral power in the frequency bands Delta (0\u20133 Hz), Low-Theta (3\u20136 Hz), High-Theta (6\u20139 Hz), Alpha (9\u201314 Hz), Beta (14\u201325 Hz), Low-Gamma (30\u201355 Hz), High-Gamma (65\u2013115 Hz), and Ripple (125\u2013150 Hz) and utilized to make observations from channels. As described in Section 3, a \u03c6 function is used to relate the observations to abnormal events. In Section 6, we evaluate different techniques for obtaining a mapping from extracted PIB features to the presence of an abnormal neural event. However, a mapping obtained using observations alone is not sufficient to deduce the SOZ because, in addition to SOZ-likely events, signal artifacts will also be captured by this mapping. This phenomenon is illustrated in Fig. 3, in which PIB features show similar characteristics for the events related to both SOZ and non-SOZ. Therefore, we utilize the factor graph model presented in this paper to further filter the detected abnormal events based on their spatial and temporal patterns and isolate the SOZ-likely events.\n\nFigure 3: EEG events related to both SOZ and non-SOZ are captured by PIB features because they possess similar spectral characteristics.\nSpatial and temporal dependencies in SOZ localization: Although artifacts show spectral characteristics similar to those of SOZ-likely events, unlike the latter, the former do not occur in a spatially correlated manner. This spatial correlation is measured with respect to the physical distances between the electrodes placed in the brain.
Therefore, the same definition of the spatial factor function described in Section 3 is applicable. If a channel\u2019s observation is classified as an abnormal neural event and the spatial factor function attains a large value with an adjacent channel, it would mean that both channels likely show similar patterns of abnormalities, which therefore must be SOZ-likely events. In addition, the SOZ-likely events show a repetitive pattern, which artifacts usually do not. In Section 3, we described the temporal correlation as a function of all previous states. As such, the temporal correlation here is established with the intuition that a channel that previously exhibited a large number of SOZ-likely events is likely to exhibit more because of the repetitive pattern. Hence, temporal correlation is measured as the correlation between the state of a channel and the observed frequency of SOZ-likely events in that channel until the previous epoch, i.e., $\\Omega_{n-1}(k) = \\frac{1}{n-1} \\sum_{i=1}^{n-1} Y_i(k)$. Therefore, when \u2126n\u22121(k) is close to 1 and the observation made from channel k is classified as an abnormal neural event, the event is more likely to be a SOZ-likely event than an artifact.\n\n5 Experiments\nData: The data used in this work are from a study approved by the Mayo Clinic Institutional Review Board. The dataset consists of iEEG recordings collected from 29 epilepsy patients. The iEEG sensors were surgically implanted in potentially epileptogenic regions in the brain. Patients were implanted with different numbers of sensors, and they all had different SOZs. Ground truth (the true SOZ channels) was established from clinical reports and verified independently through visual inspection of the seizure iEEGs.
During data collection, basic preprocessing was performed to\nremove line-noise and other forms of signal contamination from the data.\n\nFigure 4: A \ufb02ow diagram illustrating the SOZ determination process.\n\nAnalytic scheme: Two-hour between-seizure segments were chosen for each patient to represent\na monitoring duration that could be achieved during surgery. The two-hour iEEG recordings were\ndivided into non-overlapping three-second epochs. This epoch length was chosen because it would\nlikely accommodate at least one abnormal neural event that could be associated with the SOZ [6].\nSpectral domain features (PIB) were extracted in the 3-second epochs to capture abnormal neural\nevents [6]. Based on the features extracted in a 3-second recording of a channel, a binary value\n\u03c6 (Xn(k)) \u2208 {0, 1} was assigned to that channel, indicating whether or not an abnormal event was\npresent. Section 6 provides a comparison of supervised and unsupervised techniques used to create\nthis mapping. In the case of supervised techniques, a classi\ufb01cation model was trained using the PIB\nfeatures extracted from an existing corpus of manually annotated abnormal neural events. In the\ncase of unsupervised techniques, channels were clustered into two groups based on the PIB features\nextracted during an epoch, and the cluster with the larger cluster center (measured as the Euclidean\ndistance from the origin) was labeled as the abnormal cluster. Consequently, the respective epochs\nof those channels in the abnormal cluster were classi\ufb01ed as abnormal neural events. The factor graph\nmodel was then used to \ufb01lter the SOZ-likely events out of all the detected abnormal neural events. A\nfactor graph is generated using the observational, spatial, and temporal factor functions described\nabove speci\ufb01cally for this application. The best combination of states that minimizes the objective\nfunction given in Eq. 4, Yn, is found by using the min-cut algorithm. 
In our approach, we used the Boykov-Kolmogorov algorithm [25] to obtain the optimal partition of the graph. The states Yn here are binary values and represent the presence or absence of SOZ-likely events in the channels. This process is repeated for all the 3-second epochs and the SOZ is deduced at the end using a maximum likelihood (ML) approach (described in the following). This whole process is illustrated in Fig. 4.\nMaximum likelihood SOZ deduction: We model the occurrences of SOZ-likely events in channel k as independent Bernoulli random variables with probability \u03c0(k). Here, \u03c0(k) denotes the true bias of the channel\u2019s being in SOZ. We estimate \u03c0(k) using a maximum likelihood (ML) approach and use $\\hat{\\pi}(k)$ to denote the estimate. Each Yn(k) that results from the factor graph inference is treated as an outcome of a Bernoulli trial and the log-likelihood function after N such trials is defined as:\n\n$$\\log(L(\\pi(k))) = \\log \\left[ \\prod_{n=1}^{N} \\pi(k)^{Y_n(k)} (1 - \\pi(k))^{1 - Y_n(k)} \\right] \\qquad (5)$$\n\nAn estimate for \u03c0(k) that maximizes the above likelihood function (known as MLE, i.e., maximum likelihood estimate) after N epochs is derived as $\\hat{\\pi}(k) = \\frac{1}{N} \\sum_{n=1}^{N} Y_n(k)$.\nEvaluation: The ML approach generates a likelihood probability for each channel k for being in the SOZ. We compared these probabilities against the ground truth (binary values with 1 meaning that the channel is in the SOZ and 0 otherwise) to generate the area under the ROC curve (AUC), sensitivity, specificity, precision, recall, and F1-score metrics. First, we evaluated a number of techniques for generating a mapping from the extracted PIB features to the presence of abnormal events.
We evaluated three unsupervised approaches, namely k-means, spectral, and hierarchical\nclustering methods and two supervised approaches, namely support vector machine (SVM) and\ngeneralized linear model (GLM), for this task. Second, we evaluated the bene\ufb01ts of utilizing the\nmin-cut algorithm for inferring instantaneous states. Here we compared our results using the min-\ncut algorithm against those of two sampling-based techniques [12]: MCMC with random sampling,\nand MCMC with sampling per prior distribution. Belief-propagation-based methods are not suitable\nhere because our factor graph contains cycles [26]. Third, we compared our results against two\nrecent solutions for interictal SOZ localization, including a summation approach [6] and a clustering\napproach [22]. In the summation approach, summation of the features of a channel normalized by the\nmaximum feature summation was used as the likelihood of that channel\u2019s being in the SOZ. In the\nclustering approach, the features of all the channels during the whole 2-hour period were clustered\ninto two classes by a k-means algorithm, and the cluster with the larger cluster mean was chosen\nas the abnormal cluster. For each channel, the fraction of all its features that were in the abnormal\ncluster was used as the likelihood of that channel being in the SOZ. Both of these approaches utilize\nonly the observations and lack the additional information of the spatial and temporal correlations.\n\n6 Results & discussion\nTable 2 lists the results obtained for the experiments explained in Section 5, performed using a\ndataset containing non-seizure (interictal) iEEG data from 29 epilepsy patients. First, a comparison\nof supervised and unsupervised techniques for the mapping from PIB features to the presence of\nabnormal events was performed. 
The results indicate that using a k-means clustering approach for\nmapping PIB features to abnormal events is better than any other supervised or unsupervised ap-\nproach, while other approaches also prove useful. Second, a comparison between sampling-based\nmethods and the min-cut approach was performed for the task of graph inference. Our results in-\ndicate that utilizing the min-cut approach to infer instantaneous states is considerably better than a\nrandom-sampling-based MCMC approach (with a 10% higher AUC and 14% higher F1-score) and\nmarginally better than an MCMC approach with sampling per a prior distribution (with a 3% higher\nAUC and a similar F1-score), when used with k-means algorithm for abnormal event classi\ufb01cation.\nHowever, unlike this approach, our method does not require a prior distribution to sample from.\nThird, we show that our factor-graph-based model for interictal SOZ localization performs signif-\nicantly better than either of the traditional approaches (with 5% and 7% higher AUCs) when used\nwith k-means algorithm for abnormal event classi\ufb01cation and min-cut algorithm for graph inference.\n\nTable 2: Goodness-of-\ufb01t metrics obtained for unsupervised and supervised methods for PIB-to-\nabnormal-event mapping (\u03c6); sampling-based approaches for instantaneous state estimation; and\nconventional approaches utilized for interictal SOZ localization. 
(\u201cFG/kmeans/min-cut\u201d means that\nwe utilized a factor-graph-based method, with a k-means clustering algorithm for mapping PIB\nfeatures to abnormal neural events and the min-cut algorithm for performing graph inference.)\n\nMethod | AUC | Sensitivity | Specificity | Precision | Recall | F1-score\nEvaluation: techniques for PIB to abnormal event mapping (\u03c6)\nFG/kmeans/min-cut | 0.72\u00b10.03 | 0.74\u00b10.03 | 0.61\u00b10.02 | 0.39\u00b10.05 | 0.74\u00b10.03 | 0.46\u00b10.04\nFG/spectral/min-cut | 0.68\u00b10.03 | 0.60\u00b10.07 | 0.48\u00b10.05 | 0.31\u00b10.05 | 0.60\u00b10.07 | 0.36\u00b10.05\nFG/hierarch/min-cut | 0.69\u00b10.03 | 0.52\u00b10.06 | 0.51\u00b10.05 | 0.29\u00b10.05 | 0.52\u00b10.06 | 0.34\u00b10.05\nFG/svm/min-cut | 0.71\u00b10.03 | 0.68\u00b10.06 | 0.54\u00b10.05 | 0.36\u00b10.05 | 0.68\u00b10.06 | 0.43\u00b10.05\nFG/glm/min-cut | 0.69\u00b10.03 | 0.62\u00b10.07 | 0.47\u00b10.05 | 0.31\u00b10.05 | 0.62\u00b10.08 | 0.37\u00b10.05\nEvaluation: sampling vs. min-cut\nFG/kmeans/Random | 0.62\u00b10.03 | 0.51\u00b10.08 | 0.40\u00b10.07 | 0.35\u00b10.06 | 0.51\u00b10.08 | 0.32\u00b10.05\nFG/kmeans/Prior | 0.69\u00b10.03 | 0.65\u00b10.04 | 0.66\u00b10.04 | 0.40\u00b10.04 | 0.65\u00b10.04 | 0.46\u00b10.04\nEvaluation: comparison against conventional approaches\nSummation | 0.67\u00b10.04 | 0.59\u00b10.05 | 0.67\u00b10.03 | 0.38\u00b10.05 | 0.59\u00b10.05 | 0.43\u00b10.05\nClustering | 0.65\u00b10.04 | 0.49\u00b10.06 | 0.72\u00b10.04 | 0.42\u00b10.06 | 0.49\u00b10.06 | 0.44\u00b10.05\n\nSignificance: Overall, the factor-graph-based model with k-means clustering for abnormal event\nclassification and the min-cut algorithm for instantaneous state inference outperforms all other meth-\nods for the application of interictal SOZ localization. Utilization of spatial and temporal factor\nfunctions improves the localization AUC by 5\u20137%, relative to pure observation-based approaches\n(summation and clustering). 
On the other hand, the runtime complexity of instantaneous state infer-\nence is greatly reduced by the min-cut approach. The complexity of a brute-force approach grows\nexponentially with the number of nodes in the graph, while the min-cut approach has a reasonable\nruntime complexity of O(|V||E|^2), where |V| is the number of nodes and |E| is the number of\nedges in the graph. Although sampling-based methods are able to provide approximate solutions\nwith moderate complexity, the min-cut method provided superior performance in our experiments.\nFuture work: Significant domain knowledge is required to come up with manual definitions of\ngraphical models, and in many situations, almost no domain knowledge is available. Hence, the\nmanually defined factor-graphical model and associated factor functions are a potential limitation\nof our work, as a framework that automatically learns the graphical representation might result in\na more generalizable model. Dynamic Bayesian networks [27] may provide a platform that can be\nused to learn dependencies from the data while allowing the types of dependencies we described.\nAnother potential limitation of our work is the binary-brain-state assumption made while solving the\ngraph energy minimization task. We surmise that extensions of the min-cut algorithm such as the\none proposed in [28] are applicable for non-binary cases. In addition, we also believe that optimal\nweighting of the different factor functions could further improve localization accuracy and provide\ninsights on the contributions of spatial, temporal, and observational relationships to a specific appli-\ncation that involves EEG signal analysis. We plan to investigate those in our future work.\n\n7 Conclusion\nWe described a factor-graph-based model to encode observational, temporal, and spatial dependen-\ncies observed in EEG-based brain activity analysis. 
This model utilizes manually defined factor\nfunctions to represent the dependencies, which allowed us to derive a lightweight graph inference\ntechnique. This is a significant advancement in the field of electrophysiology because a general and\ncomprehensively validated model that encodes different forms of dependencies in EEG does not ex-\nist at present. We validated our model for the application of interictal seizure onset zone (SOZ) localization and\ndemonstrated its feasibility in a clinical setting. Our results indicate that our approach outperforms\ntwo widely used conventional approaches for the application of SOZ localization. In addition, the\nfactor functions and the technology for exactly inferring the states described in this paper can be\nextended to other applications of factor graphs in fields such as medical diagnoses, social network\nanalysis, and preemptive attack detection. Therefore, we assert that further investigation is necessary\nto understand the different use cases of this model.\nAcknowledgements: This work was partly supported by National Science Foundation grants CNS-\n1337732 and CNS-1624790, National Institutes of Health grants NINDS-U01-NS073557, NINDS-\nR01-NS92882, NHLBI-HL105355, and NINDS-UH2-NS095495-01, Mayo Clinic and Illinois Al-\nliance Fellowships for Technology-based Healthcare Research and an IBM faculty award. We thank\nSubho Banerjee, Phuong Cao, Jenny Applequist, and the reviewers for their valuable feedback.\n\nReferences\n[1] C. P. Warren, S. Hu, M. Stead, B. H. Brinkmann, M. R. Bower, and G. A. Worrell, \u201cSynchrony in normal\nand focal epileptic brain: The seizure onset zone is functionally disconnected,\u201d Journal of Neurophysiol-\nogy, vol. 104, no. 6, pp. 3530\u20133539, 2010.\n\n[2] G. A. Worrell, A. B. Gardner, S. M. Stead, S. Hu, S. Goerss, G. J. Cascino, F. B. Meyer, R. Marsh,\nand B. 
Litt, \u201cHigh-frequency oscillations in human temporal lobe: Simultaneous microwire and clinical\nmacroelectrode recordings,\u201d Brain, vol. 131, no. 4, pp. 928\u2013937, 2008.\n\n[3] M. Rubinov and O. Sporns, \u201cComplex network measures of brain connectivity: Uses and interpretations,\u201d\nNeuroimage, vol. 52, no. 3, pp. 1059\u20131069, 2010.\n\n[4] C. Alvarado-Rojas, M. Valderrama, A. Fouad-Ahmed, H. Feldwisch-Drentrup, M. Ihle, C. Teixeira,\nF. Sales, A. Schulze-Bonhage, C. Adam, A. Dourado, S. Charpier, V. Navarro, and M. Le Van Quyen,\n\u201cSlow modulations of high-frequency activity (40\u2013140 Hz) discriminate preictal changes in hu-\nman focal epilepsy,\u201d Scientific Reports, vol. 4, 2014.\n\n[5] B. J. Frey, F. R. Kschischang, H.-A. Loeliger, and N. Wiberg, \u201cFactor graphs and algorithms,\u201d in Proceed-\nings of the 35th Annual Allerton Conference on Communication Control and Computing. University of\nIllinois, 1997, pp. 666\u2013680.\n\n[6] Y. Varatharajah, B. M. Berry, Z. T. Kalbarczyk, B. H. Brinkmann, G. A. Worrell, and R. K. Iyer, \u201cInter-\nictal seizure onset zone localization using unsupervised clustering and bayesian filtering,\u201d in 8th Interna-\ntional IEEE/EMBS Conference on Neural Engineering (NER). IEEE, 2017, pp. 533\u2013539.\n\n[7] Y. Varatharajah, R. K. Iyer, B. M. Berry, G. A. Worrell, and B. H. Brinkmann, \u201cSeizure forecasting and\nthe preictal state in canine epilepsy,\u201d International Journal of Neural Systems, vol. 27, p. 1650046, 2017.\n[8] R. Katznelson, \u201cEEG recording, electrode placement, and aspects of generator localization,\u201d Electric\nFields of the Brain, pp. 176\u2013213, 1981.\n\n[9] J. D. Martinez-Vargas, G. Strobbe, K. Vonck, P. van Mierlo, and G. Castellanos-Dominguez, \u201cImproved\nlocalization of seizure onset zones using spatiotemporal constraints and time-varying source connectivity,\u201d\nFrontiers in Neuroscience, vol. 11, p. 
156, 2017.\n\n[10] L. R. Andersen, J. H. Krebs, and J. D. Andersen, \u201cSteno: An expert system for medical diagnosis based\non graphical models and model search,\u201d Journal of Applied Statistics, vol. 18, no. 1, pp. 139\u2013153, 1991.\n[11] P. Cao, E. Badger, Z. Kalbarczyk, R. Iyer, and A. Slagell, \u201cPreemptive intrusion detection: Theoretical\nframework and real-world measurements,\u201d in Proceedings of the 2015 Symposium and Bootcamp on the\nScience of Security. ACM, 2015, pp. 21:1\u201321:2.\n\n[12] Y. Zhang, J. Tang, J. Sun, Y. Chen, and J. Rao, \u201cMoodcast: Emotion prediction via dynamic continuous\nfactor graph model,\u201d in 10th International Conference on Data Mining (ICDM), 2010, pp. 1193\u20131198.\n\n[13] J. Liu, C. Zhang, C. McCarty, P. Peissig, E. Burnside, and D. Page, \u201cHigh-dimensional structured feature\nscreening using binary Markov random fields,\u201d in Artificial Intelligence and Statistics, 2012, pp. 712\u2013721.\n[14] W. Wiegerinck, \u201cVariational approximations between mean field theory and the junction tree algorithm,\u201d\nin Proceedings of the 16th conference on Uncertainty in artificial intelligence, 2000, pp. 626\u2013633.\n\n[15] J. S. Yedidia, W. T. Freeman, Y. Weiss et al., \u201cGeneralized belief propagation,\u201d in Advances in Neural\nInformation Processing Systems, vol. 13, 2000, pp. 689\u2013695.\n\n[16] W. R. Gilks, S. Richardson, and D. Spiegelhalter, Markov chain Monte Carlo in practice. CRC Press,\n1995.\n\n[17] S. Chib and E. Greenberg, \u201cUnderstanding the Metropolis-Hastings algorithm,\u201d The American Statisti-\ncian, vol. 49, no. 4, pp. 327\u2013335, 1995.\n\n[18] V. Kolmogorov and R. Zabih, \u201cWhat energy functions can be minimized via graph cuts?\u201d IEEE Transac-\ntions on Pattern Analysis and Machine Intelligence, vol. 26, no. 2, pp. 147\u2013159, 2004.\n\n[19] R. G. Andrzejak, D. Chicharro, C. E. Elger, and F. 
Mormann, \u201cSeizure prediction: Any better than\nchance?\u201d Clinical Neurophysiology, vol. 120, no. 8, pp. 1465\u20131478, 2009.\n\n[20] R. W. Lee, G. A. Worrell, W. R. Marsh, G. D. Cascino, N. M. Wetjen, F. B. Meyer, E. C. Wirrell, and E. L.\nSo, \u201cDiagnostic outcome of surgical revision of intracranial electrode placements for seizure localization,\u201d\nJournal of Clinical Neurophysiology, vol. 31, no. 3, pp. 199\u2013202, 2014.\n\n[21] N. M. Wetjen, W. R. Marsh, F. B. Meyer, G. D. Cascino, E. So, J. W. Britton, S. M. Stead, and G. A.\nWorrell, \u201cIntracranial electroencephalography seizure onset patterns and surgical outcomes in nonlesional\nextratemporal epilepsy,\u201d Journal of Neurosurgery, vol. 110, no. 6, pp. 1147\u20131152, 2009.\n\n[22] S. Liu, Z. Sha, A. Sencer, A. Aydoseli, N. Bebek, A. Abosch, T. Henry, C. Gurses, and N. F. Ince, \u201cEx-\nploring the time\u2013frequency content of high frequency oscillations for automated identification of seizure\nonset zone in epilepsy,\u201d Journal of Neural Engineering, vol. 13, no. 2, p. 026026, 2016.\n\n[23] M. Stead, M. Bower, B. H. Brinkmann, K. Lee, W. R. Marsh, F. B. Meyer, B. Litt, J. Van Gompel,\nand G. A. Worrell, \u201cMicroseizures and the spatiotemporal scales of human partial epilepsy,\u201d Brain, pp.\n2789\u20132797, 2010.\n\n[24] G. P. Kalamangalam, L. Cara, N. Tandon, and J. D. Slater, \u201cAn interictal EEG spectral metric for temporal\nlobe epilepsy lateralization,\u201d Epilepsy Research, vol. 108, no. 10, pp. 1748\u20131757, 2014.\n\n[25] Y. Boykov and V. Kolmogorov, \u201cAn experimental comparison of min-cut/max-flow algorithms for energy\nminimization in vision,\u201d IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 9,\npp. 1124\u20131137, 2004.\n\n[26] Y. Weiss and W. T. 
Freeman, \u201cOn the optimality of solutions of the max-product belief-propagation algo-\nrithm in arbitrary graphs,\u201d IEEE Transactions on Information Theory, vol. 47, pp. 736\u2013744, 2001.\n\n[27] P. Dagum, A. Galper, and E. Horvitz, \u201cDynamic network models for forecasting,\u201d in Proceedings of the\n8th International Conference on Uncertainty in Artificial Intelligence, 1992, pp. 41\u201348.\n\n[28] A. Delong and Y. Boykov, \u201cGlobally optimal segmentation of multi-region objects,\u201d in 2009 12th IEEE\nInternational Conference on Computer Vision. IEEE, 2009, pp. 285\u2013292.\n", "award": [], "sourceid": 2786, "authors": [{"given_name": "Yogatheesan", "family_name": "Varatharajah", "institution": "University of Illinois at Urbana Champaign"}, {"given_name": "Min Jin", "family_name": "Chong", "institution": "University of Illinois at Urbana-Champaign"}, {"given_name": "Krishnakant", "family_name": "Saboo", "institution": null}, {"given_name": "Brent", "family_name": "Berry", "institution": "Mayo Clinic"}, {"given_name": "Benjamin", "family_name": "Brinkmann", "institution": "Mayo Clinic"}, {"given_name": "Gregory", "family_name": "Worrell", "institution": "Mayo Clinic, Rochester"}, {"given_name": "Ravishankar", "family_name": "Iyer", "institution": null}]}