{"title": "PATTERN CLASS DEGENERACY IN AN UNRESTRICTED STORAGE DENSITY MEMORY", "book": "Neural Information Processing Systems", "page_first": 674, "page_last": 682, "abstract": "", "full_text": "PATTERN CLASS DEGENERACY IN AN UNRESTRICTED STORAGE DENSITY MEMORY

Christopher L. Scofield, Douglas L. Reilly, Charles Elbaum, Leon N. Cooper
Nestor, Inc., 1 Richmond Square, Providence, Rhode Island, 02906.

ABSTRACT

The study of distributed memory systems has produced a number of models which work well in limited domains. However, until recently, the application of such systems to real-world problems has been difficult because of storage limitations and their inherent architectural (and, for serial simulation, computational) complexity. Recent development of memories with unrestricted storage capacity and economical feedforward architectures has opened the way to the application of such systems to complex pattern recognition problems. However, such problems are sometimes underspecified by the features which describe the environment, and thus a significant portion of the pattern environment is often non-separable. We will review current work on high density memory systems and their network implementations. We will discuss a general learning algorithm for such high density memories and review its application to separable point sets. Finally, we will introduce an extension of this method for learning the probability distributions of non-separable point sets.

INTRODUCTION

Information storage in distributed content addressable memories has long been the topic of intense study. Early research focused on the development of correlation matrix memories [1, 2, 3, 4]. Workers in the field found that memories of this sort allowed storage of a number of distinct memories no larger than the number of dimensions of the input space. Further storage beyond this number caused the system to give an incorrect output for a memorized input.

© American Institute of Physics 1988

Recent work on distributed memory systems has focused on single layer, recurrent networks. Hopfield [5, 6] introduced a method for the analysis of the settling of activity in recurrent networks. This method defined the network as a dynamical system for which a global function called the 'energy' (actually a Liapunov function for the autonomous system describing the state of the network) could be defined. Hopfield showed that flow in state space is always toward the fixed points of the dynamical system if the matrix of recurrent connections satisfies certain conditions. With this property, Hopfield was able to define the fixed points as the sites of memories of network activity.

Like its forerunners, the Hopfield network is limited in storage capacity. Empirical study of the system found that for randomly chosen memories, storage capacity was limited to m ≤ 0.15N, where m is the number of memories that could be accurately recalled and N is the dimensionality of the network (this has since been improved to m ≈ N [7, 8]). The degradation of memory recall with increased storage density is directly related to the proliferation in the state space of unwanted local minima which serve as basins of flow.

UNRESTRICTED STORAGE DENSITY MEMORIES

Bachmann et al. [9] have studied another relaxation system similar in some respects to the Hopfield network. However, in contrast to Hopfield, they have focused on defining a dynamical system in which the locations of the minima are explicitly known. In particular, they have chosen a system with a Liapunov function given by

    E = -(1/L) Σ_j Q_j |μ - x_j|^(-L),    (1)

where E is the total 'energy' of the network, μ(0) is a vector describing the initial network activity caused by a test pattern, and x_j is the site of the jth memory, for m memories in R^N. L is a parameter related to the network size. Then μ(0) relaxes to μ(T) = x_j for some memory j by descent along the energy gradient,

    dμ/dt = -∇_μ E.    (2)

This system is isomorphic to the classical electrostatic potential between a positive (unit) test charge and negative charges Q_j at the sites x_j (for a 3-dimensional input space and L = 1). The N-dimensional Coulomb energy function then defines exactly m basins of attraction to the fixed points located at the charge sites x_j. It can be shown that convergence to the closest distinct memory is guaranteed, independent of the number of stored memories m, for proper choice of N and L [9, 10].

Equation 1 shows that each cell receives feedback from the network in the form of a scalar

    Σ_j Q_j |μ - x_j|^(-L).    (3)

Importantly, this quantity is the same for all cells; it is as if a single virtual cell were computing the distance in activity space between the current state and the stored states. The result of the computation is then broadcast to all of the cells in the network. A 2-layer feedforward network implementing such a system has been described elsewhere [10].

The connectivity for this architecture is of order m·N, where m is the number of stored memories and N is the dimensionality of layer 1. This is significant since the addition of a new memory m' = m + 1 will change the connectivity by the addition of N + 1 connections, whereas in the Hopfield network the addition of a new memory requires the addition of 2N + 1 connections.
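The relaxation of equations 1 and 2 can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it follows the negative gradient of the N-dimensional Coulomb energy, with a fixed-length step for numerical stability near the singular charge sites. The step size, iteration limits, and example memories are arbitrary choices.

```python
import numpy as np

def relax(mu0, memories, Q, L=4, step=0.01, max_iter=500, tol=1e-3):
    # Descend E = -(1/L) * sum_j Q_j |mu - x_j|^(-L)  (eq. 1).
    # A fixed-length step along -grad E avoids the blow-up of the
    # gradient magnitude near the charge sites x_j.
    mu = np.asarray(mu0, float).copy()
    for _ in range(max_iter):
        diff = mu - memories                  # shape (m, N)
        r = np.linalg.norm(diff, axis=1)      # distances to the m sites
        if r.min() < tol:                     # settled into a well
            break
        # grad E = sum_j Q_j * r_j^(-L-2) * (mu - x_j)
        grad = (Q * r ** (-L - 2)) @ diff
        mu -= step * grad / np.linalg.norm(grad)
    return mu

memories = np.array([[0.0, 0.0], [1.0, 1.0]])   # two stored states x_j
Q = np.array([1.0, 1.0])                        # equal charges
final = relax([0.2, 0.1], memories, Q)          # flows toward the nearest x_j
```

With equal charges, the test state flows into the basin of the closest memory regardless of how many other memories are stored, which is the property the text attributes to this energy function.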
An equilibrium feedforward network with similar properties has been under investigation for some time [11]. This model does not employ a relaxation procedure, and thus was not originally framed in the language of Liapunov functions. However, it is possible to define a similar system if we identify the locations of the 'prototypes' of this model as the locations in state space of potentials which satisfy the following conditions

    E_j = -Q_j / R_0   for |μ - x_j| < λ_j,
        = 0            for |μ - x_j| > λ_j,    (4)

where R_0 is a constant.

This form of potential is often referred to as the 'square-well' potential. It may be viewed as a limit of the N-dimensional Coulomb potential in which the 1/R well (L = 1) is replaced with a square well (for which L » 1). Equation 4 describes an energy landscape which consists of plateaus of zero potential outside of wells with flat, zero-slope basins. Since the landscape has only flat regions separated by discontinuous boundaries, the state of the network is always at equilibrium, and relaxation does not occur. For this reason, this system has been called an equilibrium model. This model, also referred to as the Restricted Coulomb Energy (RCE) model [14], shares the property of unrestricted storage density.

LEARNING IN HIGH DENSITY MEMORIES

A simple learning algorithm for the placement of the wells has been described in detail elsewhere [11, 12].

Figure 1: 3-layer feedforward network. Cell i computes the quantity |μ - x_i| and compares it to the internal threshold λ_i.

Reilly et al. have employed a three-layer feedforward network (figure 1) which allows the generalization of a content addressable memory to a pattern classification memory. Because the locations of the minima are explicitly known in the equilibrium model, it is possible to dynamically program the energy function for an arbitrary energy landscape. This allows the construction of geographies of basins associated with the classes constituting the pattern environment. Rapid learning of complex, non-linear, disjoint class regions is possible by this method [12, 13].

LEARNING NON-SEPARABLE CLASS REGIONS

Previous studies have focused on the acquisition of the geography and boundaries of non-linearly separable point sets. However, a method by which such high density models can acquire the probability distributions of non-separable sets has not been described.

Non-separable sets are defined as point sets in the state space of a system which are labelled with multiple class affiliations. This can occur because the input space has not carried all of the features in the pattern environment, or because the pattern set itself is not separable. Points may be degenerate with respect to the explicit features of the space; however, they may have different probability distributions within the environment. This structure in the environment is important information for the identification of patterns by such memories in the presence of feature space degeneracies.

We now describe one possible mechanism for the acquisition of the probability distribution of non-separable points. It is assumed that all points in some region R of the state space of the network are the sites of events μ(0, C_i) which are examples of pattern classes C = {C_1, ..., C_M}. A basin of attraction x_k(C_i), defined by equation 4, is placed at each site μ(0, C_i) unless

    |μ(0, C_i) - x_j(C_i)| < λ_j,    (5)

that is, unless a memory at x_j (of the class C_i) already contains μ(0, C_i). The initial values of Q_0 and R_0 at x_k(C_i) are constants for all sites x_j. Thus as events of the classes C_1, ..., C_M occur at a particular site in R, multiple wells are placed at this location.
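As a concrete illustration of the commitment rule in equation 5, the following sketch (ours, not the authors' code) commits a new layer-2 cell for a training event only when no existing well of the same class already covers it. A single fixed radius stands in for the thresholds λ_j, and all names are hypothetical.

```python
import numpy as np

class RCEMemory:
    # Each committed well is a layer-2 cell with a center x_j, a class
    # label, and a coverage radius (the threshold lambda_j of eq. 4).
    def __init__(self, radius):
        self.radius = radius
        self.centers = []
        self.labels = []

    def train(self, x, label):
        # Eq. 5: place a new basin at the event site unless a well of
        # the same class already contains the event.
        x = np.asarray(x, float)
        for c, l in zip(self.centers, self.labels):
            if l == label and np.linalg.norm(x - c) < self.radius:
                return              # already covered; no new cell
        self.centers.append(x)      # commit a new cell
        self.labels.append(label)

    def respond(self, x):
        # Classes of all wells whose square well (eq. 4) is nonzero at x;
        # more than one label signals a non-separable region.
        x = np.asarray(x, float)
        return {l for c, l in zip(self.centers, self.labels)
                if np.linalg.norm(x - c) < self.radius}
```

Training on two nearby examples of the same class commits only one cell, while overlapping wells of different classes respond together at the same input; that overlap is the feature-space degeneracy the following section exploits.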
If a well x_j(C_i) correctly covers an event μ(0, C_i), then the charge at that site (which defines the depth of the well) is incremented by a constant amount ΔQ_0. In this manner, the region R is covered with wells of all classes {C_1, ..., C_M}, with the depth of the well x_j(C_i) proportional to the frequency of occurrence of C_i at x_j.

The architecture of this network is exactly the same as that already described. As before, this network acquires a new cell for each well placed in the energy landscape. Thus we are able to describe the meaning of overlapping wells as the competition by multiple cells in layer 2 in firing for the pattern of activity in the input layer.

APPLICATIONS

This system has been applied to a problem in the area of risk assessment in mortgage lending. The input space consisted of feature detectors with continuous firing rates proportional to the values of 23 variables in the application for a mortgage. For this set of features, a significant portion of the space was non-separable.

Figures 2a and 2b illustrate the probability distributions of high and low risk applications for two of the features. It is clear that in this 2-dimensional subspace, the regions of high and low risk are non-separable but have different distributions.

Figure 2a: Probability distribution for High and Low risk patterns for feature 1 (1000 patterns; probability from 0.5 to 1.0, feature 1 from 0.0 to 1.0).

Figure 2b: Probability distribution for High and Low risk patterns for feature 2 (1000 patterns; probability from 0.5 to 1.0, feature 2 from 0.0 to 1.0).

Figure 3 depicts the probability distributions acquired by the system for this 2-dimensional subspace. In this image, circle radius is proportional to the degree of risk: small circles are regions of low risk, and large circles are regions of high risk.
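The charge-increment rule described above admits a direct sketch (again our illustration, with hypothetical names; dq stands in for the constant ΔQ_0): each event either deepens an existing covering well of its class or places a fresh unit-charge well, so well depth tracks the local frequency of each class.

```python
import numpy as np

def accumulate(events, radius, dq=1.0):
    # wells: [center, class label, charge Q]; the charge of a well grows
    # by dq each time it correctly covers a new event of its class.
    wells = []
    for x, label in events:
        x = np.asarray(x, float)
        for w in wells:
            if w[1] == label and np.linalg.norm(x - w[0]) < radius:
                w[2] += dq                      # deepen the covering well
                break
        else:
            wells.append([x, label, 1.0])       # new well of unit charge
    return wells

def class_odds(wells, x, radius):
    # Relative class frequencies at x, read off from the charges of the
    # wells covering x (depth proportional to frequency of occurrence).
    charges = {}
    for c, label, q in wells:
        if np.linalg.norm(np.asarray(x, float) - c) < radius:
            charges[label] = charges.get(label, 0.0) + q
    total = sum(charges.values())
    return {l: q / total for l, q in charges.items()} if total else {}
```

Three 'hi' events and one 'lo' event at roughly the same site yield two overlapping wells of charge 3 and 1, so the read-out at that site reports the 3:1 frequency ratio of the two classes.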
Figure 3: Probability distribution for Low and High risk. Small circles indicate low risk regions and large circles indicate high risk regions.

Of particular interest is the clear clustering of high and low risk regions in the 2-d map. Note that the regions are in fact non-linearly separable.

DISCUSSION

We have presented a simple method for the acquisition of probability distributions in non-separable point sets. This method generates an energy landscape of potential wells with depths that are proportional to the local probability density of the classes of patterns in the environment. These well depths set the probability of firing of class cells in a 3-layer feedforward network.

Application of this method to a problem in risk assessment has shown that even completely non-separable subspaces may be modeled with surprising accuracy. This method improves pattern classification in such problems with little additional computational burden.

This algorithm has been run in conjunction with the method described by Reilly et al. [11] for separable regions. This combined system is able to generate non-linear decision surfaces between the separable zones, and to approximate the probability distributions of the non-separable zones in a seamless manner. Further discussion of this system will appear in future reports.

Current work is focused on the development of a more general method for modelling the scale of variations in the distributions. Sensitivity to this scale suggests that the transition from separable to non-separable regions is smooth and should not be handled with a 'hard' threshold.

ACKNOWLEDGEMENTS

We would like to thank Ed Collins and Sushmito Ghosh for their significant contributions to this work through the development of the mortgage risk assessment application.

REFERENCES

[1] Anderson, J.A.: A simple neural network generating an interactive memory. Math. Biosci. 14, 197-220 (1972).
[2] Cooper, L.N.: A possible organization of animal memory and learning. In: Proceedings of the Nobel Symposium on Collective Properties of Physical Systems, Lundquist, B., Lundquist, S. (eds.), (24), 252-264. London, New York: Academic Press 1973.
[3] Kohonen, T.: Correlation matrix memories. IEEE Trans. Comput. 21, 353-359 (1972).
[4] Kohonen, T.: Associative memory - a system-theoretical approach. Berlin, Heidelberg, New York: Springer 1977.
[5] Hopfield, J.J.: Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. USA 79, 2554-2558 (April 1982).
[6] Hopfield, J.J.: Neurons with graded response have collective computational properties like those of two-state neurons. Proc. Natl. Acad. Sci. USA 81, 3088-3092 (May 1984).
[7] Hopfield, J.J., Feinstein, D.I., Palmer, R.G.: 'Unlearning' has a stabilizing effect in collective memories. Nature 304, 158-159 (July 1983).
[8] Potter, T.W.: Ph.D. Dissertation in advanced technology, S.U.N.Y. Binghamton (unpublished).
[9] Bachmann, C.M., Cooper, L.N., Dembo, A., Zeitouni, O.: A relaxation model for memory with high density storage. To be published in Proc. Natl. Acad. Sci. USA.
[10] Dembo, A., Zeitouni, O.: ARO Technical Report, Brown University, Center for Neural Science, Providence, R.I. (1987); also submitted to Phys. Rev. A.
[11] Reilly, D.L., Cooper, L.N., Elbaum, C.: A neural model for category learning. Biol. Cybern. 45, 35-41 (1982).
[12] Reilly, D.L., Scofield, C., Elbaum, C., Cooper, L.N.: Learning system architectures composed of multiple learning modules. To appear in Proc. First Int'l. Conf. on Neural Networks (1987).
[13] Rimey, R., Gouin, P., Scofield, C., Reilly, D.L.: Real-time 3-D object classification using a learning system. Intelligent Robots and Computer Vision, Proc. SPIE 726 (1986).
[14] Reilly, D.L., Scofield, C.L., Elbaum, C., Cooper, L.N.: Neural networks with low connectivity and unrestricted memory storage density. To be published.
", "award": [], "sourceid": 21, "authors": [{"given_name": "Christopher", "family_name": "Scofield", "institution": null}, {"given_name": "Douglas", "family_name": "Reilly", "institution": null}, {"given_name": "Charles", "family_name": "Elbaum", "institution": null}, {"given_name": "Leon", "family_name": "Cooper", "institution": null}]}