Structured Machine Learning for 'Soft' Classification with Smoothing Spline ANOVA and Stacked Tuning, Testing and Evaluation

Part of Advances in Neural Information Processing Systems 6 (NIPS 1993)

Grace Wahba, Yuedong Wang, Chong Gu, Ronald Klein, MD, Barbara Klein, MD


We describe the use of smoothing spline analysis of variance (SS-ANOVA) in the penalized log likelihood context, for learning (estimating) the probability p of a '1' outcome, given a training set with attribute vectors and outcomes. p is of the form $p(t) = e^{f(t)} / (1 + e^{f(t)})$, where, if t is a vector of attributes, f is learned as a sum of smooth functions of one attribute plus a sum of smooth functions of two attributes, etc. The smoothing parameters governing f are obtained by an iterative unbiased risk or iterative GCV method. Confidence intervals for these estimates are available.
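
As a minimal sketch of the model form only (not the fitting procedure), the following shows how a logit f built from one-attribute terms plus a two-attribute term maps to a probability through p(t) = e^{f(t)} / (1 + e^{f(t)}). The component functions `f1`, `f2`, `f12` and the constant `mu` are hypothetical placeholders chosen for illustration; in SS-ANOVA they would be smooth functions estimated from the training set by penalized likelihood.

```python
import math


def prob_from_logit(f):
    """Return p = e^f / (1 + e^f), evaluated so that exp() never overflows."""
    if f >= 0:
        return 1.0 / (1.0 + math.exp(-f))
    ef = math.exp(f)
    return ef / (1.0 + ef)


# Hypothetical smooth components (assumptions, not taken from the paper).
def f1(t1):
    return math.sin(t1)            # function of the first attribute only


def f2(t2):
    return 0.5 * t2 ** 2 - 0.5     # function of the second attribute only


def f12(t1, t2):
    return 0.3 * t1 * t2           # two-attribute (interaction) term


def logit(t, mu=0.0):
    """f(t) = mu + f1(t1) + f2(t2) + f12(t1, t2) for a two-attribute vector t."""
    t1, t2 = t
    return mu + f1(t1) + f2(t2) + f12(t1, t2)


p = prob_from_logit(logit((0.0, 1.0)))
print(p)  # 0.5, since every term of f is zero at t = (0, 1)
```

Writing the link function in two branches is a numerical convenience: it keeps the argument of `exp` non-positive, so large |f| yields probabilities near 0 or 1 rather than an overflow error.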

  1. Introduction to 'soft' classification and the bias-variance tradeoff.

In medical risk factor analysis, records of attribute vectors and outcomes (0 or 1) for each of n examples (patients) are available as training data. Based on the training data, it is desired to estimate the probability p of the 1 outcome for any