{"title": "Unmixing Hyperspectral Data", "book": "Advances in Neural Information Processing Systems", "page_first": 942, "page_last": 948, "abstract": null, "full_text": "U nmixing Hyperspectral Data \n\nLucas Parra, Clay Spence, Paul Sajda \n\nSarnoff Corporation, CN-5300, Princeton, NJ 08543, USA \n\n{lparra, cspence,psajda} @sarnoff.com \n\nAndreas Ziehe, Klaus-Robert Miiller \n\nGMD FIRST.lDA, Kekulestr. 7, 12489 Berlin, Germany \n\n{ziehe,klaus}@first.gmd.de \n\nAbstract \n\nIn hyperspectral imagery one pixel typically consists of a mixture \nof the reflectance spectra of several materials, where the mixture \ncoefficients correspond to the abundances of the constituting ma(cid:173)\nterials. We assume linear combinations of reflectance spectra with \nsome additive normal sensor noise and derive a probabilistic MAP \nframework for analyzing hyperspectral data. As the material re(cid:173)\nflectance characteristics are not know a priori, we face the problem \nof unsupervised linear unmixing. The incorporation of different \nprior information (e.g. positivity and normalization of the abun(cid:173)\ndances) naturally leads to a family of interesting algorithms, for \nexample in the noise-free case yielding an algorithm that can be \nunderstood as constrained independent component analysis (ICA). \nSimulations underline the usefulness of our theory. \n\n1 \n\nIntroduction \n\nCurrent hyperspectral remote sensing technology can form images of ground surface \nreflectance at a few hundred wavelengths simultaneously, with wavelengths ranging \nfrom 0.4 to 2.5 J.Lm and spatial resolutions of 10-30 m. The applications of this \ntechnology include environmental monitoring and mineral exploration and mining. \nThe benefit of hyperspectral imagery is that many different objects and terrain \ntypes can be characterized by their spectral signature. 
\nThe first step in most hyperspectral image analysis systems is to perform a spectral unmixing to determine the original spectral signals of some set of prime materials. The basic difficulty is that for a given image pixel the spectral reflectance patterns of the surface materials are in general not known a priori. However, there are general physical and statistical priors which can be exploited to potentially improve spectral unmixing. In this paper we address the problem of unmixing hyperspectral imagery through incorporation of physical and statistical priors within an unsupervised Bayesian framework. \n\nWe begin by first presenting the linear superposition model for the measured reflectances. We then discuss the advantages of unsupervised over supervised systems. We derive a general maximum a posteriori (MAP) framework to find the material spectra and infer the abundances. Interestingly, depending on how the priors are incorporated, the zero noise case yields (i) a simplex approach or (ii) a constrained ICA algorithm. Assuming non-zero noise our MAP estimate utilizes a constrained least squares algorithm. The two latter approaches are new algorithms, whereas the simplex algorithm has been previously suggested for the analysis of hyperspectral data. \n\nLinear Modeling To a first approximation the intensities X (x_{iλ}) measured in each spectral band λ = 1, ..., L for a given pixel i = 1, ..., N are linear combinations of the reflectance characteristics S (s_{mλ}) of the materials m = 1, ..., M present in that area. Possible errors of this approximation and sensor noise are taken into account by adding a noise term N (n_{iλ}). 
In matrix form this can be summarized as \n\nX = AS + N, subject to: A 1_M = 1_N, A ≥ 0, (1) \n\nwhere matrix A (a_{im}) represents the abundance of material m in the area corresponding to pixel i, with positivity and normalization constraints. Note that ground inclination or a changing viewing angle may cause an overall scale factor for all bands that varies with the pixels. This can be incorporated in the model by simply replacing the constraint A 1_M = 1_N with A 1_M ≤ 1_N, which does not affect the discussion in the remainder of the paper. This is clearly a simplified model of the physical phenomena. For example, with spatially fine grained mixtures, called intimate mixtures, multiple reflectance may cause departures from this first order model. Additionally there are a number of inherent spatial variations in real data, such as inhomogeneous vapor and dust particles in the atmosphere, that will cause a departure from the linear model in equation (1). Nevertheless, in practical applications a linear model has produced reasonable results for areal mixtures. \n\nSupervised vs. Unsupervised techniques Supervised spectral unmixing relies on prior knowledge about the reflectance patterns S of candidate surface materials, sometimes called endmembers, or on expert knowledge, and a series of semi-automatic steps to find the constituent materials in a particular scene. Once the user identifies a pixel i containing a single material, i.e. a_{im} = 1 for a given m and i, the corresponding spectral characteristics of that material can be taken directly from the observations, i.e., s_{mλ} = x_{iλ} [4]. Given knowledge about the endmembers one can simply find the abundances by solving a constrained least squares problem. 
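As an illustration of this supervised step, the following sketch (our own, not from the paper) recovers the abundances of a single pixel by projected gradient descent onto the probability simplex, given known endmember spectra; the function names, the sort-based simplex projection, and the synthetic spectra are all illustrative assumptions.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * idx > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def unmix_pixel(x, S, n_iter=2000):
    """Solve min_a ||x - a @ S||^2 s.t. a >= 0, sum(a) = 1,
    by projected gradient descent. S is (M, L), x is (L,)."""
    M = S.shape[0]
    a = np.full(M, 1.0 / M)                           # start at the simplex center
    step = 1.0 / (2.0 * np.linalg.norm(S @ S.T, 2))   # safe step for this quadratic
    for _ in range(n_iter):
        grad = 2.0 * S @ (a @ S - x)                  # gradient of the squared residual
        a = project_simplex(a - step * grad)
    return a

# Demo: three synthetic endmembers over five bands, true abundances 60/30/10.
S = np.array([[1., 0., 0., 1., 0.],
              [0., 1., 0., 0., 1.],
              [0., 0., 1., 1., 1.]])
a_true = np.array([0.6, 0.3, 0.1])
a_est = unmix_pixel(a_true @ S, S)
```

Because the objective is strongly convex whenever the endmember spectra are linearly independent, the iterates converge to the unique constrained solution; in the noise-free demo this is the true abundance vector.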
\nThe problem with such supervised techniques is that finding the correct S may require substantial user interaction, and the result may be error prone, as a pixel that actually contains a mixture can be misinterpreted as a pure endmember. Another approach obtains endmembers directly from a database. This is also problematic because the actual surface material on the ground may not match the database entries, due to atmospheric absorption or other noise sources. Finding close matches is an ambiguous process, as some endmembers have very similar reflectance characteristics and may match several entries in the database. \n\nUnsupervised unmixing, in contrast, tries to identify the endmembers and mixtures directly from the observed data X without any user interaction. There are a variety of such approaches. In one approach a simplex is fit to the data distribution [7, 6, 2]. The resulting vertex points of the simplex represent the desired endmembers, but this technique is very sensitive to noise, as a few boundary points can potentially change the location of the simplex vertex points considerably. Another approach by Szu [9] tries to find abundances that have the highest entropy subject to constraints that the amount of materials is as evenly distributed as possible - an assumption which is clearly not valid in many actual surface material distributions. A relatively new approach considers modeling the statistical information across wavelength as statistically independent AR processes [1]. This leads directly to the contextual linear ICA algorithm [5]. However, the approach in [1] does not take into account constraints on the abundances, noise, or prior information. Most importantly, the method of [1] can only integrate information from a small number of pixels at a time (the same as the number of endmembers). 
Typically, however, we will have only a few endmembers but many thousands of pixels. \n\n2 The Maximum A Posteriori Framework \n\n2.1 A probabilistic model of unsupervised spectral unmixing \n\nOur model has observations or data X and hidden variables A, S, and N that are explained by the noisy linear model (1). We estimate the values of the hidden variables by using MAP, \n\np(A, S|X) = p(X|A, S) p(A, S) / p(X) = p_n(X|A, S) p_a(A) p_s(S) / p(X), (2) \n\nwith p_a(A), p_s(S), p_n(N) as the a priori assumptions of the distributions. With MAP we estimate the most probable values for given priors after observing the data, \n\nA_MAP, S_MAP = argmax_{A,S} p(A, S|X). (3) \n\nNote that for maximization the constant factor p(X) can be ignored. Our first assumption, which is indicated in equation (2), is that the abundances are independent of the reflectance spectra, as their origins are completely unrelated: (A0) A and S are independent. \n\nThe MAP algorithm is entirely defined by the choices of priors, which are guided by the problem of hyperspectral unmixing: (A1) A represents probabilities for each pixel i. (A2) S are independent for different materials m. (A3) N is normal i.i.d. for all i, λ. In summary, our MAP framework includes the assumptions A0-A3. \n\n2.2 Including Priors \n\nPriors on the abundances Positivity and normalization of the abundances can be represented as \n\np_a(A) = ∏_{i=1}^N δ(1 - Σ_{m=1}^M a_{im}) ∏_{m=1}^M θ(a_{im}), (4) \n\nwhere δ(·) represents the Kronecker delta function and θ(·) the step function. With this choice a point not satisfying the constraints will have zero a posteriori probability. This prior introduces no particular bias of the solutions other than the abundance constraints. It does, however, assume the abundances of different pixels to be independent. \n\nPrior on spectra Usually we find systematic trends in the spectra that cause significant correlation. 
However, such an overall trend can be subtracted and/or filtered from the data, leaving only independent signals that encode the variation from that overall trend. For example, one can capture the conditional dependency structure with a linear auto-regressive (AR) model and analyze the resulting \"innovations\" or prediction errors [3]. In our model we assume that the spectra represent independent instances of an AR process having a white innovation process e_{mλ} distributed according to p_e(e). With a Toeplitz matrix T of the AR coefficients we can write e_m = s_m T. The AR coefficients can be found in a preprocessing step on the observations X. If S now represents the innovation process itself, our prior can be represented as \n\np_s(S) = ∏_{m=1}^M ∏_{λ=1}^L p_e(Σ_{λ'=1}^L s_{mλ'} T_{λ'λ}). (5) \n\nAdditionally, p_e(e) is parameterized by a mean and scale parameter and potentially by parameters determining the higher moments of the distribution. For brevity we ignore the details of the parameterization in this paper. \n\nPrior on the noise As outlined in the introduction, there are a number of problems that can cause the linear model X = AS to be inaccurate (e.g. multiple reflections, inhomogeneous atmospheric absorption, and detector noise). As it is hard to treat all these phenomena explicitly, we suggest pooling them into one noise variable that we assume for simplicity to be normally distributed with a wavelength dependent noise variance σ_λ, \n\np(X|A, S) = p_n(N) = N(X - AS, Σ) = ∏_{λ=1}^L N(x_λ - A s_λ, σ_λ I), (6) \n\nwhere N(·, ·) represents a zero mean Gaussian distribution, and I the identity matrix indicating the independent noise at each pixel. \n\n2.3 MAP Solution for Zero Noise Case \n\nLet us consider the noise-free case. 
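The AR detrending used for the spectral prior above can be sketched as follows; this is a minimal illustration (our own, not the paper's implementation) that fits AR(p) coefficients by least squares on lagged values and returns the innovations, i.e. the prediction errors. All names are hypothetical.

```python
import numpy as np

def ar_innovations(s, p):
    """Fit an AR(p) model to the 1-D sequence s by least squares and
    return (coefficients, innovations), where the innovation at t is
    e[t] = s[t] - sum_j coef[j] * s[t - 1 - j]."""
    T = len(s)
    # Design matrix of lagged values: row t holds s[p+t-1], ..., s[t]
    X = np.column_stack([s[p - j : T - j] for j in range(1, p + 1)])
    y = s[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

# Demo: a smooth synthetic "spectrum" (trend plus a slow oscillation) is
# almost fully predictable, so its innovations carry far less variance
# than the raw signal - the correlated trend has been removed.
wl = np.linspace(0.0, 3.0, 200)
spectrum = 0.5 * wl + np.sin(2.0 * np.pi * wl / 1.5)
coef, innov = ar_innovations(spectrum, p=5)
```

The innovations, not the raw spectra, are then what the independence prior p_e(e) is applied to.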
Although this simplification may be inaccurate, it will allow us to greatly reduce the number of free hidden variables - from NM + ML to M^2. In the noise-free case the variables A, S are deterministically dependent on each other through an NL-dimensional δ-distribution, p_n(X|A, S) = δ(X - AS). We can remove one of these variables from our discussion by integrating (2). It is instructive to first consider removing A, which yields the posterior p(S|X); maximizing it recovers the known technique of fitting a simplex to the data. (For L > M the observations X can be mapped into an M-dimensional subspace using the singular value decomposition (SVD), X = U D V^T. The discussion then applies to the reduced observations X' = U_M^T X, with U_M being the first M columns of U.) If we instead integrate out S, the prior on A will restrict the solutions to satisfy the abundance constraints and bias the result depending on the detailed choice of p_a(A), so we are led to constrained ICA. \n\nIn summary, depending on which variable we integrate out we obtain two methods for solving the spectral unmixing problem: the known technique of simplex fitting and a new constrained ICA algorithm. \n\n2.4 MAP Solution for the Noisy Case \n\nCombining the choices for the priors made in section 2.2 (Eqs. (4), (5) and (6)) with (2) and (3) we obtain \n\nA_MAP, S_MAP = argmax_{A,S} ∏_{λ=1}^L { N(x_λ - A s_λ, σ_λ I) ∏_{m=1}^M p_e(Σ_{λ'=1}^L s_{mλ'} T_{λ'λ}) }, (9) \n\nsubject to A 1_M = 1_N, A ≥ 0. The logarithm of the cost function in (9) is denoted by L = L(A, S). Its gradient with respect to the hidden variables is \n\n∂L/∂s_m = a_m^T N diag(σ)^{-1} - f_s(s_m), (10) \n\nwhere N = X - AS, a_m are the M column vectors of A, and f_s(s) = -∂ ln p_e(s)/∂s. In (10) f_s is applied to each element of s_m. \n\nThe optimization with respect to A for given S can be implemented as a standard weighted least squares (LS) problem with a linear constraint and positivity bounds. 
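The alternating scheme of this section can be sketched in a simplified, noise-free form as follows. This is our own illustrative reduction: the Gaussian innovation prior is dropped so only the data term is optimized, the S step is an explicit least squares solve rather than a gradient step, and all names are hypothetical.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * idx > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def alternating_unmix(X, M, outer=100, inner=25, seed=0):
    """Alternate between per-pixel simplex-constrained steps for A and an
    explicit least-squares solution for S (data term only, noise-free)."""
    rng = np.random.default_rng(seed)
    N, L = X.shape
    # Random feasible start for A (rows on the simplex) to break symmetry
    A = np.array([project_simplex(r) for r in rng.uniform(size=(N, M))])
    for _ in range(outer):
        S, *_ = np.linalg.lstsq(A, X, rcond=None)        # explicit solve for S
        step = 1.0 / (2.0 * np.linalg.norm(S @ S.T, 2) + 1e-12)
        for i in range(N):               # N independent constrained problems
            for _ in range(inner):
                grad = 2.0 * S @ (A[i] @ S - X[i])
                A[i] = project_simplex(A[i] - step * grad)
    return A, S

# Demo on synthetic noise-free mixtures of M = 2 endmembers.
rng = np.random.default_rng(1)
S_true = rng.uniform(0.2, 1.0, size=(2, 8))
w = rng.uniform(size=20)
A_true = np.column_stack([w, 1.0 - w])
X = A_true @ S_true
A_est, S_est = alternating_unmix(X, M=2)
```

Both update types are non-increasing in the squared residual, so the sketch illustrates the convergence behavior of the alternation, though not the full MAP objective with the innovation prior.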
\nSince the constraints apply for every pixel independently, one can solve N separate constrained LS problems of M unknowns each. We alternate between gradient steps for S and explicit solutions for A until convergence. Any additional parameters of p_e(e), such as scale and mean, may be obtained in a maximum likelihood (ML) sense by maximizing L. Note that the nonlinear optimization is not subject to constraints; the constraints apply only in the quadratic optimization. \n\n3 Experiments \n\n3.1 Zero Noise Case: Artificial Mixtures \n\nIn our first experiment we use mineral data from the United States Geological Survey (USGS)2 to build artificial mixtures for evaluating our unsupervised unmixing framework. Three target endmembers were chosen (Almandine WS479, Montmorillonite+Illi CM42 and Dickite NMNH106242). A spectral scene of 100 samples was constructed by creating a random mixture of the three minerals. Of the 100 samples, there were no pure samples (i.e. no mineral had more than an 80% abundance in any sample). Figure 1A shows the spectra of the endmembers recovered by the constrained ICA technique of section 2.3, where the constraints were implemented with penalty terms added to the conventional maximum likelihood ICA algorithm. These are nearly identical to the spectra of the true endmembers, shown in figure 1B, which were used for mixing. Interesting to note is the scatter-plot of the 100 samples across two bands. The open circles are the absorption values at these two bands for endmembers found by the MAP technique. Given that each mixed sample consists of no more than 80% of any endmember, the endmember points on the scatter-plot are quite distant from the cluster. A simplex fitting technique would have significant difficulty recovering the endmembers from this clustering. \n\n2 see http://speclab.cr.
usgs.gov/spectral.lib.456.descript/decript04.html \n\nFigure 1: Results for noise-free artificial mixture. A recovered endmembers using MAP technique. B \"true\" target endmembers. C scatter plot of samples across 2 bands showing the absorption of the three endmembers computed by MAP (open circles). \n\n3.2 Noisy Case: Real Mixtures \n\nTo validate the noise model MAP framework of section 2.4 we conducted an experiment using ground-truthed USGS data representing real mixtures. We selected 10x10 blocks of pixels from three different regions3 in the AVIRIS data of the Cuprite, Nevada mining district. We separated these 300 mixed spectra assuming two endmembers, using an AR detrending with 5 AR coefficients and the MAP techniques of section 2.4. Overall brightness was accounted for as explained in the linear modeling of section 1. The endmembers are shown in figure 2A and B in comparison to laboratory spectra from the USGS spectral library for these minerals [8]. Figure 2C shows the corresponding abundances, which match the ground truth; region (III) mainly consists of Muscovite while regions (I)+(II) contain (areal) mixtures of Kaolinite and Muscovite. \n\n4 Discussion \n\nHyperspectral unmixing is a challenging practical problem for unsupervised learning. Our probabilistic approach leads to several interesting algorithms: (1) simplex fitting, (2) constrained ICA and (3) constrained least squares that can efficiently use multi-channel information. An important element of our approach is the explicit use of prior information. 
Our simulation examples show that we can recover the endmembers, even in the presence of noise and model uncertainty. The approach described in this paper does not yet exploit local correlations between neighboring pixels that are well known to exist. Future work will therefore exploit not only spectral but also spatial prior information for detecting objects and materials. \n\nAcknowledgments \n\nWe would like to thank Gregg Swayze at the USGS for assistance in obtaining the data. \n\n3 The regions were from the image plate2.cuprite95.alpha.2um.image.wlocals.gif in ftp://speclab.cr.usgs.gov/pub/cuprite/gregg.thesis.images/, at the coordinates (265,710) and (275,697), which contained Kaolinite and Muscovite 2, and (143,661), which only contained Muscovite 2. \n\nFigure 2: A Spectra of computed endmember (solid line) vs Muscovite sample spectra from the USGS spectral library. Note we show only part of the spectrum since the discriminating features are located only between bands 172 and 220. B Computed endmember (solid line) vs Kaolinite sample spectra from the USGS spectral library. C Abundances for Kaolinite and Muscovite for three regions (lighter pixels represent higher abundance). Region 1 and region 2 have similar abundances for Kaolinite and Muscovite, while region 3 contains more Muscovite. \n\nReferences \n\n[1] J. Bayliss, J. A. Gualtieri, and R. Cromp. Analyzing hyperspectral data with independent component analysis. In J. M. Selander, editor, Proc. SPIE Applied Image and Pattern Recognition Workshop, volume 9, P.O. 
Box 10, Bellingham, WA 98227-0010, 1997. SPIE. \n\n[2] J.W. Boardman and F.A. Kruse. Automated spectral analysis: a geologic example using AVIRIS data, north Grapevine Mountains, Nevada. In Tenth Thematic Conference on Geologic Remote Sensing, pages 407-418, Ann Arbor, MI, 1994. Environmental Research Institute of Michigan. \n\n[3] S. Haykin. Adaptive Filter Theory. Prentice Hall, 1991. \n\n[4] F. Maselli, M. Pieri, and C. Conese. Automatic identification of end-members for the spectral decomposition of remotely sensed scenes. Remote Sensing for Geography, Geology, Land Planning, and Cultural Heritage (SPIE), 2960:104-109, 1996. \n\n[5] B. Pearlmutter and L. Parra. Maximum likelihood blind source separation: A context-sensitive generalization of ICA. In M. Mozer, M. Jordan, and T. Petsche, editors, Advances in Neural Information Processing Systems 9, pages 613-619, Cambridge, MA, 1997. MIT Press. \n\n[6] J.J. Settle. Linear mixing and the estimation of ground cover proportions. International Journal of Remote Sensing, 14:1159-1177, 1993. \n\n[7] M.O. Smith, J.B. Adams, and A.R. Gillespie. Reference endmembers for spectral mixture analysis. In Fifth Australian Remote Sensing Conference, volume 1, pages 331-340, 1990. \n\n[8] U.S. Geological Survey. USGS digital spectral library. Open File Report 93-592, 1993. \n\n[9] H. Szu and C. Hsu. Landsat spectral demixing à la superresolution of blind matrix inversion by constraint MaxEnt neural nets. In Wavelet Applications IV, volume 3078, pages 147-160. SPIE, 1997. 
\n\n\f", "award": [], "sourceid": 1714, "authors": [{"given_name": "Lucas", "family_name": "Parra", "institution": null}, {"given_name": "Clay", "family_name": "Spence", "institution": null}, {"given_name": "Paul", "family_name": "Sajda", "institution": null}, {"given_name": "Andreas", "family_name": "Ziehe", "institution": null}, {"given_name": "Klaus-Robert", "family_name": "M\u00fcller", "institution": null}]}