An Impossibility Theorem for Clustering

Part of Advances in Neural Information Processing Systems 15 (NIPS 2002)

Authors

Jon Kleinberg

Abstract

Although the study of clustering is centered around an intuitively compelling goal, it has been very difficult to develop a unified framework for reasoning about it at a technical level, and profoundly diverse approaches to clustering abound in the research community. Here we suggest a formal perspective on the difficulty in finding such a unification, in the form of an impossibility theorem: for a set of three simple properties, we show that there is no clustering function satisfying all three. Relaxations of these properties expose some of the interesting (and unavoidable) trade-offs at work in well-studied clustering techniques such as single-linkage, sum-of-pairs, k-means, and k-median.