High-Dimensional Graphical Model Selection Using $\ell_1$-Regularized Logistic Regression

Part of Advances in Neural Information Processing Systems 19 (NIPS 2006)

Bibtex Metadata Paper


Martin J. Wainwright, John Lafferty, Pradeep Ravikumar


We focus on the problem of estimating the graph structure associated with a discrete Markov random field. We describe a method based on 1- regularized logistic regression, in which the neighborhood of any given node is estimated by performing logistic regression subject to an1-constraint. Our framework applies to the high-dimensional setting, in which both the number of nodes p and maximum neighborhood sizes d are allowed to grow as a function of the number of observations n. Our main result is to estab- lish sufficient conditions on the triple (n, p, d) for the method to succeed in consistently estimating the neighborhood of every node in the graph simul- taneously. Under certain mutual incoherence conditions analogous to those imposed in previous work on linear regression, we prove that consistent neighborhood selection can be obtained as long as the number of observa- tions n grows more quickly than 6d6 log d + 2d5 log p, thereby establishing that logarithmic growth in the number of samples n relative to graph size p is sufficient to achieve neighborhood consistency.

Keywords: Graphical models; Markov random fields; structure learning; `1-regularization; model selection; convex risk minimization; high-dimensional asymptotics; concentration.