{"title": "Learning Lateral Interactions for Feature Binding and Sensory Segmentation", "book": "Advances in Neural Information Processing Systems", "page_first": 1009, "page_last": 1016, "abstract": null, "full_text": "Learning Lateral Interactions for Feature Binding and Sensory Segmentation\n\nHeiko Wersing\n\nHONDA R&D Europe GmbH\n\nCarl-Legien-Str.30, 63073 Offenbach/Main, Germany\n\nheiko.wersing@hre-ftr.f.rd.honda.co.jp\n\nAbstract\n\nWe present a new approach to the supervised learning of lateral interactions for the competitive layer model (CLM) dynamic feature binding architecture. The method is based on consistency conditions, which were recently shown to characterize the attractor states of this linear threshold recurrent network. For a given set of training examples the learning problem is formulated as a convex quadratic optimization problem in the lateral interaction weights. An efficient dimension reduction of the learning problem can be achieved by using a linear superposition of basis interactions. We show the successful application of the method to a medical image segmentation problem of fluorescence microscope cell images.\n\n1 Introduction\n\nFeature binding has been proposed to provide elegant solution strategies to the segmentation problem in perception [11, 12, 14]. Many feature binding models have thus tried to reproduce grouping mechanisms like the Gestalt laws of visual perception, e.g. connectedness and good continuation, using temporal synchronization [12] or spatial coactivation [9, 14] for binding. Quite generally in these models, grouping is based on lateral interactions between feature-representing neurons, which characterize the degree of compatibility between features. Currently in most of the approaches this lateral interaction scheme is chosen heuristically, since the experimental data on the corresponding connection patterns in the visual cortex is insufficient. 
However, in more complex feature spaces this heuristic approach becomes infeasible, raising the question of more systematic learning methods for lateral interactions.\n\nMozer et al. [4] suggested supervised learning for a dynamic feature binding model of complex-valued directional units, where the connections to hidden units guiding the grouping dynamics were adapted by recurrent backpropagation learning. The application was limited to synthetic rectangle patterns. Hofmann et al. [2] considered unsupervised texture segmentation by a pairwise clustering approach on feature vectors derived from Gabor filter banks at different frequencies and orientations. In their model the pairwise feature compatibilities are determined by a divergence measure of the local feature distributions, which was shown to achieve good segmentation results for a range of image types. The problem of segmentation can also be phrased as a labeling problem, where relaxation labeling algorithms have been used as a popular tool in a wide range of computer vision applications. Pelillo & Refice [7] suggested a supervised learning method for the compatibility coefficients of relaxation labeling algorithms, based on minimizing the distance between a target labeling vector and the output after iterating a fixed number of relaxation steps. The main problem is the multiple local minima arising in this highly nonlinear optimization problem.\n\nRecent results have shown that linear threshold (LT) networks provide interesting architectures for combining properties of digital selection and analogue context-sensitive amplification [1, 13] with efficient hardware implementation options [1]. Xie et al. [16] demonstrated how these properties can be used to learn winner-take-all competition between groups of neurons in an LT network with lateral inhibition. 
The CLM binding model is implemented by a large-scale topographically organized LT network, and it was shown that this leads to consistency conditions characterizing its binding states [14]. In this contribution we show how these conditions can be used to formulate a learning approach for the CLM as a quadratic optimization problem. In Section 2 we briefly introduce the competitive layer binding model. Our learning approach is elaborated in Section 3. In Section 4 we show application results of the approach to a cell segmentation problem and give a discussion in the final Section 5.\n\n2 The CLM architecture\n\nThe CLM [9, 14] consists of a set of $L$ layers of feature-selective neurons (see Fig. 1). The activity of a neuron at position $r$ in layer $\\alpha$ is denoted by $x_{r\\alpha}$, and a column $r$ denotes the set of the neuron activities $x_{r\\alpha}$, $\\alpha = 1, \\ldots, L$, sharing a common position $r$. With each column a particular \u201cfeature\u201d is associated, which is described by a set of parameters, e.g. local edge elements characterized by position and orientation $(x_r, y_r, \\theta_r)$. A binding between two features, represented by columns $r$ and $r'$, is expressed by simultaneous activities $x_{r\\hat{\\alpha}} > 0$ and $x_{r'\\hat{\\alpha}} > 0$ that share a common layer $\\hat{\\alpha}$. All neurons in a column $r$ are equally driven by an external input $h_r$, which represents the significance of the detection of feature $r$ by a preprocessing step. The afferent input $h_r$ is fed to the activities $x_{r\\alpha}$ with a connection weight $J > 0$. Within each layer $\\alpha$ the activities are coupled via lateral connections $f^{\\alpha}_{rr'}$ which characterize the degree of compatibility between features $r$ and $r'$ and which are a symmetric function of the feature parameters, thus $f^{\\alpha}_{rr'} = f^{\\alpha}_{r'r}$. The purpose of the layered arrangement in the CLM is to enforce an assignment of the input features to the layers by the dynamics, using the contextual information stored in the lateral interactions. The unique assignment to a single layer is realized by a columnar Winner-Take-All (WTA) circuit, which uses mutual symmetric inhibitory interactions with absolute strength $J > 0$ between neural activities $x_{r\\alpha}$ and $x_{r\\beta}$ that share a common column $r$. Due to the WTA coupling, for a stable equilibrium state of the CLM only a neuron from one layer can be active within each column [14]. The number of layers does not predetermine the number of active groups, since for sufficiently many layers only those are active that carry a salient group. The afferent inputs and the lateral and vertical interactions are combined into the standard linear threshold additive activity dynamics\n\n$$\\dot{x}_{r\\alpha} = -x_{r\\alpha} + \\sigma\\Big( J \\big( h_r - \\sum_{\\beta} x_{r\\beta} \\big) + \\sum_{r'} f^{\\alpha}_{rr'} x_{r'\\alpha} \\Big), \\qquad (1)$$\n\nwhere $\\sigma(x) = \\max(0, x)$. For $J$ large compared to the lateral weights $f^{\\alpha}_{rr'}$, the single active neuron in a column reproduces its afferent input, $x_{r\\hat{\\alpha}} \\approx h_r$. As was shown [14], the stable states of (1) satisfy the consistency conditions\n\n$$\\sum_{r'} f^{\\beta}_{rr'} x_{r'\\beta} \\leq \\sum_{r'} f^{\\hat{\\alpha}}_{rr'} x_{r'\\hat{\\alpha}} \\quad \\text{for all } r,\\ \\beta \\neq \\hat{\\alpha}(r), \\qquad (2)$$\n\nwhich express the assignment of a feature $r$ to the layer $\\hat{\\alpha}(r)$ with highest lateral support.\n\nFigure 1: The competitive layer model architecture (see text for description).\n\n3 Learning of CLM Lateral Interactions\n\nFormulation of the Learning Problem. The overall task of the learning algorithm is to adapt the lateral interactions, given by the interaction coefficients $f^{\\alpha}_{rr'}$, such that the CLM architecture performs appropriate segmentation on the labeled training data and also generalizes to new test data. We assume that the training data consists of a set of $M$ labeled training patterns $P^i$, $i = 1, \\ldots, M$, where each pattern $P^i$ consists of a subset $R^i = \\{ r^i_1, \\ldots, r^i_{N^i} \\}$ of $N^i$ different features with their corresponding labels $\\hat{\\alpha}^i(r^i_j)$. 
For each labeled training pattern a target labeling vector $y^i$ is constructed by choosing\n\n$$y^i_{r\\hat{\\alpha}^i(r)} = 1, \\qquad y^i_{r\\beta} = 0 \\quad \\text{for all } r \\in R^i,\\ \\beta \\neq \\hat{\\alpha}^i(r), \\qquad (3)$$\n\nfor the labeled columns $r \\in R^i$ with labels $\\hat{\\alpha}^i(r)$, assuming $h_r = 1$. Columns for features which are not contained in the training pattern are filled with zeroes according to $y^i_{r\\alpha} = 0$ for all $\\alpha$ and $r \\notin R^i$. In the following indices $r, r'$ run over all possible $N$ features, e.g. all edges of different orientations at different image positions, while $r^i_j$ run over the subset of features realized in a particular pattern, e.g. only one oriented edge at each image position. The assignment vectors $y^i$ form the basis of the learning approach since they represent the target activity distribution, which we want to obtain after iterating the CLM with appropriately adjusted lateral interactions. In the following the abbreviation $\\hat{\\alpha}^i_j = \\hat{\\alpha}^i(r^i_j)$ is used to keep the notation readable.\n\nThe goal of the learning process is to make the training patterns consistent, which is in accordance with (2) expressed by the inequalities\n\n$$\\sum_{r'} f^{\\beta}_{r^i_j r'}\\, y^i_{r'\\beta} \\leq \\sum_{r'} f^{\\hat{\\alpha}^i_j}_{r^i_j r'}\\, y^i_{r'\\hat{\\alpha}^i_j} \\quad \\text{for all } i,\\ j,\\ \\beta \\neq \\hat{\\alpha}^i_j. \\qquad (4)$$\n\nThese inequalities define the learning problem that we want to solve in the following. Let us develop a more compact notation. We can rewrite (4) as\n\n$$\\sum_{\\alpha r r'} Z^{ij\\beta}_{\\alpha rr'}\\, f^{\\alpha}_{rr'} \\leq 0 \\quad \\text{for all } i,\\ j,\\ \\beta \\neq \\hat{\\alpha}^i_j, \\qquad (5)$$\n\nwhere $Z^{ij\\beta}_{\\alpha rr'} = \\delta_{r r^i_j} \\big( \\delta_{\\alpha\\beta}\\, y^i_{r'\\beta} - \\delta_{\\alpha\\hat{\\alpha}^i_j}\\, y^i_{r'\\hat{\\alpha}^i_j} \\big)$. The form of the inequalities can be simplified by introducing multiindices $\\mu$ and $\\nu$ which correspond to $(i, j, \\beta)$ and $(\\alpha, r, r')$, respectively, so that $Z^{\\mu}_{\\nu} = Z^{ij\\beta}_{\\alpha rr'}$ and $f_{\\nu} = f^{\\alpha}_{rr'}$. The index $\\mu$ runs over all consistency relations defined for the labeled columns of the assignment vectors. The vectors $Z^{\\mu}$ with components $Z^{\\mu}_{\\nu}$ are called consistency vectors and represent the consistency constraints for the lateral interaction. The index $\\nu$ runs over all entries in the lateral interaction matrix. The vector $f$ with components $f_{\\nu}$ contains the corresponding matrix entries. The inequalities (4) can then be written in the form\n\n$$\\sum_{\\nu} Z^{\\mu}_{\\nu} f_{\\nu} \\leq 0 \\quad \\text{for all } \\mu. \\qquad (6)$$\n\nThis illustrates the nature of the learning problem. 
The problem is to find a weight vector $f$ which leads to a lateral interaction matrix, such that the consistency vectors $Z^{\\mu}$ lie in the opposite half space of the weight state space. Since the conditions (6) determine the attractivity of the training patterns, it is customary to introduce a positive margin $\\kappa > 0$ to achieve greater robustness. This gives the target inequalities\n\n$$\\sum_{\\nu} Z^{\\mu}_{\\nu} f_{\\nu} + \\kappa \\leq 0 \\quad \\text{for all } \\mu, \\qquad (7)$$\n\nwhich we want to solve in $f$ for given training data. If the system of inequalities admits a solution for $f$ it is called compatible. If there is no $f$ satisfying all constraints, the system is called incompatible.\n\nSuperposition of Basis Interactions. If the number of features $N$ is large, the number of parameters in the complete interaction matrix $f^{\\alpha}_{rr'}$ may be too large to be robustly estimated from a limited number of training examples. To achieve generalization from the training data, it is necessary to reduce the number of parameters which have to be adapted during learning. This is also useful to incorporate a priori knowledge into the interaction. An example is to choose basis functions which incorporate invariances such as translation and rotation invariance, or which satisfy the constraint that the interaction is equal in all layers. A simple but powerful approach is to choose a set of $K$ fixed basis interactions $g^k$, $k = 1, \\ldots, K$, with compatibilities $g^{k\\alpha}_{rr'}$, with an interaction $f^{\\alpha}_{rr'}$ obtained by linear superposition\n\n$$f^{\\alpha}_{rr'} = \\sum_{k=1}^{K} c_k\\, g^{k\\alpha}_{rr'} \\qquad (8)$$\n\nwith weight coefficients $c_k$, $k = 1, \\ldots, K$. Now the learning problem of solving the inequalities (7) can be recast in the new free parameters $c_k$. After inserting (8) into (7) we obtain the transformed problem\n\n$$\\sum_{\\nu} Z^{\\mu}_{\\nu} \\sum_{k} c_k\\, g^{k}_{\\nu} + \\kappa = \\sum_{k} c_k\\, \\tilde{Z}^{\\mu}_{k} + \\kappa \\leq 0 \\quad \\text{for all } \\mu, \\qquad (9)$$\n\nwhere $\\tilde{Z}^{\\mu}_{k} = \\sum_{\\nu} Z^{\\mu}_{\\nu} g^{k}_{\\nu}$ is the component of the consistency vector $Z^{\\mu}$ in the basis interaction $g^k$. The basis interactions can thus be used to reduce the dimensionality of the learning problem. To avoid any redundancy, the basis interactions should be linearly independent. Although the functions are here denoted \u201cbasis\u201d functions, they need neither be orthogonal nor span the whole space of interactions $f$.\n\nQuadratic Consistency Optimization. The generic case in any real world application is that the majority of training vectors contains relevant information, while single spurious vectors may be present due to noise or other disturbing factors. Consequently, in most applications the inequalities (7) or (9) will be incompatible and can only be satisfied approximately. This will be especially the case if a low-dimensional embedding is used for the basis function templates as described above. 
We therefore suggest adapting the interactions by minimizing the following convex cost function\n\n$$E_{\\mathrm{QCO}} = \\sum_{\\mu} \\Big( \\sum_{\\nu} Z^{\\mu}_{\\nu} f_{\\nu} + \\kappa \\Big)^2. \\qquad (10)$$\n\nA similar minimization approach was suggested for the imprinting of attractors for the Brain-State-in-a-Box (BSB) model [8], and a recent study has shown that the approach is competitive with other methods for designing BSB associative memories [6].\n\nFor a fixed positive margin $\\kappa > 0$, the cost function (10) is minimized by making the inner products of the weight vector and the consistency vectors negative. The global minimum with $E_{\\mathrm{QCO}} = 0$ is attained if the inner products are all equal to $-\\kappa$, which can be interpreted such that all consistency inequalities are fulfilled in an equal manner. Although this additional regularizing constraint is hard to justify on theoretical grounds, the later application shows that it works quite well for the application examples considered.\n\nIf we insert the expansion of $f$ in the basis of function templates we obtain according to (8)\n\n$$E_{\\mathrm{QCO}} = \\sum_{\\mu} \\Big( \\sum_{k} c_k\\, \\tilde{Z}^{\\mu}_{k} + \\kappa \\Big)^2, \\qquad (11)$$\n\nwhich results in a $K$-dimensional convex quadratic minimization problem in the $c_k$ parameters. The coefficients $\\tilde{Z}^{\\mu}_{k}$, which give the components of the training patterns in the basis interactions, are given for $\\mu = (i, j, \\beta)$ by $\\tilde{Z}^{\\mu}_{k} = \\sum_{r'} \\big( g^{k\\beta}_{r^i_j r'}\\, y^i_{r'\\beta} - g^{k\\hat{\\alpha}^i_j}_{r^i_j r'}\\, y^i_{r'\\hat{\\alpha}^i_j} \\big)$. The quadratic optimization problem is then given by minimizing\n\n$$E_{\\mathrm{QCO}} = \\sum_{kl} F_{kl}\\, c_k c_l + 2\\kappa \\sum_{k} b_k\\, c_k + \\mathrm{const}, \\qquad (12)$$\n\nwhere $F_{kl} = \\sum_{\\mu} \\tilde{Z}^{\\mu}_{k} \\tilde{Z}^{\\mu}_{l}$ and $b_k = \\sum_{\\mu} \\tilde{Z}^{\\mu}_{k}$. If the coefficients $c_k$ are unconstrained, then the minimum of (12) can be obtained by solving the linear system of equations $\\sum_{l} F_{kl}\\, c_l + \\kappa\\, b_k = 0$ for all $k$.\n\n4 Application to Cell Segmentation\n\nThe automatic detection and segmentation of individual cells in fluorescence micrographs is a key technology for high-throughput analysis of immune cell surface proteins [5]. The strong shape variability of cells in tissue, however, poses a major challenge to any automatic recognition approach. Figure 2a shows fluorescence microscopy images from a tissue section containing lymphocyte cells (courtesy W. Schubert). In the bottom row corresponding image patches are displayed, where individual cell regions were manually labeled to obtain training data for the learning process.\n\nFor each of the image patches, a training vector consists of a list of labeled edge features parameterized by $(p, n)$, where $p$ is the position in the image and $n$ is a unit local edge orientation vector computed from the intensity gradient. With one edge feature per image position, a training patch amounts to a set of 1600 labeled edge features. Since the figure-ground separating mechanism as implemented by the CLM [14] is also used for this cell segmentation application, features which are not labeled as part of a cell obtain the corresponding background label. 
Each training pattern contains one additional free layer, to enable the learning algorithm to generalize over the number of layers.\n\nThe lateral interaction to be adapted is decomposed into the following weighted basis components: i) a constant negative interaction between all features, which facilitates group separation, ii) a self-coupling interaction in the background layer, which determines the attractivity of the background for figure-ground segmentation, and iii) an angular interaction with limited range, which is in itself decomposed into templates, capturing the interaction for a particular combination of the relative angles between two edges. This angular decomposition is done using a discretization of the space of orientations, turning the unit-vector representation into an angular orientation variable $\\theta$. To achieve rotation invariance of the interaction, it is only dependent on the edge orientations relative to their mutual position difference vector $d$. The angles $\\theta_1$ and $\\theta_2$ of the two edges are discretized by partitioning the angular interval into 8 subintervals. For each combination of the two discretized edge orientations there is an interaction template generated, which is only responding in this combined orientation interval. Thus the angular templates do not overlap in the combined $(\\theta_1, \\theta_2)$ space, i.e. if one template is nonzero for a particular $(\\theta_1, \\theta_2)$, then all other templates vanish there. Since the interaction must be symmetric under feature exchange, this does not result in $8 \\times 8 = 64$ different combinations, but only 36 independent templates. Apart from the discretization, the interaction represents the most general angular-dependent interaction within the local neighborhood which is symmetric under feature exchange. We use two sets of angular templates, one for each of two ranges of the distance $|d|$ up to the maximal local interaction range. With the abovementioned two components, the resulting optimization problem is 36+36+2=74-dimensional. Figure 3 compares the optimized interaction field to earlier heuristic lateral interactions for contour grouping. See [15] for a more detailed discussion.\n\nFigure 2: a) Original images and manually labeled training patterns from a fluorescence micrograph. b) Test patterns and resulting CLM segmentation with learned lateral interaction. Grayscale represents different layer activations, where a total of 20 layers plus one background layer (black) was used.\n\nThe performance of the learning approach was investigated by choosing a small number of the manually labeled patterns as training patterns. For all the training examples we used, the resulting inequalities (9) were in fact incompatible, rendering a direct solution of (9) infeasible. 
After training was completed by minimizing (12), a new image patch was selected as a test pattern and the CLM grouping was performed with the lateral interaction learned before, using the dynamical model as described in [14]. The quadratic consistency optimization was performed as described in the previous section, exploring the free margin parameter $\\kappa$. For a set of two training patterns as shown in Fig. 2a with a total of 1600 features each, a learning sweep takes about 4 minutes on a standard desktop computer.\n\nTypical segmentation results obtained with the quadratic consistency optimization approach are shown in Figure 2b. The grouping results were not very sensitive to the value of the margin $\\kappa$ within the range explored. The grouping results show a good segmentation performance where most of the salient cells are detected as single groups. There are some spurious groups where a dark image region forms an additional group, and some smaller cells are rejected into the background layer. Apart from these minor errors, the optimization has achieved an adequate balancing of the different lateral interaction components for this segmentation task.\n\nFigure 3 panels: a) Plotting scheme, b) Edge parameters, c) Standard continuity interaction field, d) Learned interaction field.\n\nFigure 3: Comparison between heuristic continuity grouping interaction field and a learned lateral interaction field for cell segmentation. The interaction depends on the difference vector $d$ and two unit vectors $n_1$, $n_2$, shown in b), encoding directed orientation. a) explains the interaction visualizations c) and d) by showing a magnification of the plot c) of the interaction field of a single horizontal edge pointing to the left. The plots are generated by computing the interaction of the central directed edge with directed edges of all directions (like a cylindrical plot) at a spatial grid. Black edges share excitatory, white edges share inhibitory interaction with the central edge, and length codes for interaction strength. The cocircular continuity field in c) depends on position and orientation but is not direction-selective. It supports pairs of edges which are cocircular, i.e. lie tangentially to a common circle, and has been recently used for contour segmentation [3, 14]. The learned lateral interaction field is shown in d). It is direction-selective and supports pairs of edges which \u201cturn right\u201d. The strong local support is balanced by similarly strong long-range inhibition.\n\n5 Discussion\n\nThe presented results show that appropriate lateral interactions can be obtained for the CLM binding architecture from the quadratic consistency optimization approach. The only a priori conditions which were used for the template design were the properties of locality, symmetry, and translation as well as rotation invariance. This supervised learning approach has clear advantages over the manual tuning of feature interactions in complex feature spaces with many parameters. We consider this an important step towards practical applicability of the feature binding concept.\n\nThe presented quadratic consistency optimization method is based on choosing equal margins for all consistency inequalities. 
There exist other approaches to large margin classification, like support vector machines [10], where more sophisticated methods were suggested for appropriate margin determination. The application of similar methods to the supervised learning of CLM interactions provides an interesting field for future work.\n\nAcknowledgments: This work was supported by DFG grant GK-231 and carried out at the Faculty of Technology, University of Bielefeld. The author thanks Helge Ritter and Tim Nattkemper for discussions and Walter Schubert for providing the cell image data.\n\nReferences\n\n[1] R. Hahnloser, R. Sarpeshkar, M. A. Mahowald, R. J. Douglas, and H. S. Seung. Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit. Nature, 405:947\u2013951, 2000.\n[2] T. Hofmann, J. Puzicha, and J. Buhmann. Unsupervised texture segmentation in a deterministic annealing framework. IEEE Trans. Pattern Analysis and Machine Intelligence, 20(8):803\u2013818, 1998.\n[3] Z. Li. A neural model of contour integration in the primary visual cortex. Neural Computation, 10:903\u2013940, 1998.\n[4] M. Mozer, R. S. Zemel, M. Behrmann, and C. K. I. Williams. Learning to segment images using dynamic feature binding. Neural Computation, 4(5):650\u2013665, 1992.\n[5] T. W. Nattkemper, H. Ritter, and W. Schubert. A neural classificator enabling high-throughput topological analysis of lymphocytes in tissue sections. IEEE Trans. Inf. Techn. in Biomed., 5(2):138\u2013149, 2001.\n[6] J. Park, H. Cho, and D. Park. On the design of BSB associative memories using semidefinite programming. Neural Computation, 11:1985\u20131994, 1999.\n[7] M. Pelillo and M. Refice. Learning compatibility coefficients for relaxation labeling processes. IEEE Trans. Pattern Analysis and Machine Intelligence, 16(9):933\u2013945, 1994.\n[8] Renzo Perfetti. A synthesis procedure for Brain-State-in-a-Box neural networks. IEEE Transactions on Neural Networks, 6(5):1071\u20131080, September 1995.\n[9] H. Ritter. A spatial approach to feature linking. In Proc. International Neural Network Conference Paris Vol.2, pages 898\u2013901, 1990.\n[10] V. Vapnik. The nature of statistical learning theory. Springer, New York, 1995.\n[11] C. von der Malsburg. The what and why of binding: The modeler\u2019s perspective. Neuron, 24:95\u2013104, 1999.\n[12] D. Wang and D. Terman. Image segmentation based on oscillatory correlation. Neural Computation, 9(4):805\u2013836, 1997.\n[13] H. Wersing, W.-J. Beyn, and H. Ritter. Dynamical stability conditions for recurrent neural networks with unsaturating piecewise linear transfer functions. Neural Computation, 13(8):1811\u20131825, 2001.\n[14] H. Wersing, J. J. Steil, and H. Ritter. A competitive layer model for feature binding and sensory segmentation. Neural Computation, 13(2):357\u2013387, 2001.\n[15] Heiko Wersing. Spatial Feature Binding and Learning in Competitive Neural Layer Architectures. PhD thesis, University of Bielefeld, 2000. Published by Cuvillier, Goettingen.\n[16] X. Xie, R. Hahnloser, and H. S. Seung. Learning winner-take-all competition between groups of neurons in lateral inhibition networks. In Advances in Neural Information Processing Systems, volume 13. The MIT Press, 2001.\n", "award": [], "sourceid": 2022, "authors": [{"given_name": "Heiko", "family_name": "Wersing", "institution": null}]}