Efficient Out-of-Sample Extension of Dominant-Set Clusters

Part of Advances in Neural Information Processing Systems 17 (NIPS 2004)

Bibtex Metadata Paper


Massimiliano Pavan, Marcello Pelillo


Dominant sets are a new graph-theoretic concept that has proven to be relevant in pairwise data clustering problems, such as image seg- mentation. They generalize the notion of a maximal clique to edge- weighted graphs and have intriguing, non-trivial connections to continu- ous quadratic optimization and spectral-based grouping. We address the problem of grouping out-of-sample examples after the clustering process has taken place. This may serve either to drastically reduce the compu- tational burden associated to the processing of very large data sets, or to efficiently deal with dynamic situations whereby data sets need to be updated continually. We show that the very notion of a dominant set of- fers a simple and efficient way of doing this. Numerical experiments on various grouping problems show the effectiveness of the approach.