Grammatical Bigrams

Mark A. Paskin

Advances in Neural Information Processing Systems 14 (NIPS 2001)

Abstract

Unsupervised learning algorithms have been derived for several statistical models of English grammar, but their computational complexity makes applying them to large data sets intractable. This paper presents a probabilistic model of English grammar that is much simpler than conventional models, but which admits an efficient EM training algorithm. The model is based upon grammatical bigrams, i.e., syntactic relationships between pairs of words. We present the results of experiments that quantify the representational adequacy of the grammatical bigram model, its ability to generalize from labelled data, and its ability to induce syntactic structure from large amounts of raw text.
