Martin Szummer, Tommi Jaakkola
Classiﬁcation with partially labeled data requires using a large number of unlabeled examples (or an estimated marginal P (x)), to further con- strain the conditional P (yjx) beyond a few available labeled examples. We formulate a regularization approach to linking the marginal and the conditional in a general way. The regularization penalty measures the information that is implied about the labels over covering regions. No parametric assumptions are required and the approach remains tractable even for continuous marginal densities P (x). We develop algorithms for solving the regularization problem for ﬁnite covers, establish a limiting differential equation, and exemplify the behavior of the new regulariza- tion approach in simple cases.