A Hierarchical Bayesian Markovian Model for Motifs in Biopolymer Sequences

Part of Advances in Neural Information Processing Systems 15 (NIPS 2002)


Authors

Eric Xing, Michael Jordan, Richard Karp, Stuart J. Russell

Abstract

We propose a dynamic Bayesian model for motifs in biopolymer sequences which captures rich biological prior knowledge and positional dependencies in motif structure in a principled way. Our model posits that the position-specific multinomial parameters for monomer distribution are distributed as a latent Dirichlet-mixture random variable, and the position-specific Dirichlet component is determined by a hidden Markov process. Model parameters can be fit on training motifs using a variational EM algorithm within an empirical Bayesian framework. Variational inference is also used for detecting hidden motifs. Our model improves over previous models that ignore biological priors and positional dependence. It has much higher sensitivity to motifs during detection and a notable ability to distinguish genuine motifs from false recurring patterns.
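
The generative process described in the abstract can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `sample_motif_model`, its parameter names, the choice to draw one multinomial parameter vector per position and share it across motif instances, and the example values (two Dirichlet components over a four-letter DNA alphabet) are all hypothetical. It covers only forward sampling, not the variational EM fitting or the motif detection step.

```python
import numpy as np


def sample_motif_model(init, trans, alphas, length, n_instances, seed=0):
    """Sample from a hidden-Markov Dirichlet-mixture motif model (sketch).

    init:   (K,) initial distribution over Dirichlet components (assumed name)
    trans:  (K, K) row-stochastic transition matrix between components
    alphas: (K, A) Dirichlet parameters, one row per component, A = alphabet size
    Returns (states, thetas, instances).
    """
    rng = np.random.default_rng(seed)
    init = np.asarray(init, dtype=float)
    trans = np.asarray(trans, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    n_states, n_symbols = alphas.shape

    states = np.empty(length, dtype=int)
    thetas = np.empty((length, n_symbols))
    for pos in range(length):
        # Hidden Markov process selects the Dirichlet component per position.
        probs = init if pos == 0 else trans[states[pos - 1]]
        states[pos] = rng.choice(n_states, p=probs)
        # Position-specific multinomial parameters come from that component.
        thetas[pos] = rng.dirichlet(alphas[states[pos]])

    # Each motif instance draws one monomer per position from thetas[pos].
    instances = np.array(
        [[rng.choice(n_symbols, p=thetas[pos]) for pos in range(length)]
         for _ in range(n_instances)],
        dtype=int,
    ).reshape(n_instances, length)
    return states, thetas, instances


if __name__ == "__main__":
    # Hypothetical values: component 0 favors the first symbol strongly,
    # component 1 is close to uniform over the alphabet.
    init = [0.5, 0.5]
    trans = [[0.8, 0.2], [0.3, 0.7]]
    alphas = [[8.0, 0.5, 0.5, 0.5], [2.0, 2.0, 2.0, 2.0]]
    states, thetas, instances = sample_motif_model(init, trans, alphas, 6, 4)
    print(states)
    print("".join("ACGT"[s] for s in instances[0]))
```

Because adjacent positions are coupled through the transition matrix, runs of conserved positions (or runs of variable ones) become more likely than under a model that treats every position independently, which is the positional dependence the abstract refers to.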