{"title": "Approximate Inference and Protein-Folding", "book": "Advances in Neural Information Processing Systems", "page_first": 1481, "page_last": 1488, "abstract": null, "full_text": "Approximate Inference and Protein-Folding\n\nChen Yanover and Yair Weiss\n\nSchool of Computer Science and Engineering\nThe Hebrew University of Jerusalem\n91904 Jerusalem, Israel\n{cheny,yweiss}@cs.huji.ac.il\n\nAbstract\n\nSide-chain prediction is an important subtask in the protein-folding problem. We show that finding a minimal energy side-chain configuration is equivalent to performing inference in an undirected graphical model. The graphical model is relatively sparse yet has many cycles. We used this equivalence to assess the performance of approximate inference algorithms in a real-world setting. Specifically, we compared belief propagation (BP), generalized BP (GBP) and naive mean field (MF).\n\nIn cases where exact inference was possible, max-product BP always found the global minimum of the energy (except in a few cases where it failed to converge), while other approximation algorithms of similar complexity did not. On the full protein data set, max-product BP always found a lower energy configuration than the other algorithms, including a widely used protein-folding software package (SCWRL).\n\n1 Introduction\n\nIn general, the cost of exact inference in graphical models scales exponentially with the number of variables. Since many real-world applications involve hundreds of variables, it has been impossible to utilize the powerful mechanism of probabilistic inference in these applications. 
Despite the significant progress achieved in approximate inference, some practical questions remain open: it is not yet known which algorithm to use for a given problem, nor are the advantages and disadvantages of each technique understood. We address these questions in the context of a real-world protein-folding application: the side-chain prediction problem.\n\nPredicting side-chain conformation given the backbone structure is a central problem in protein-folding and molecular design. It arises both in ab-initio protein-folding (which can be divided into two sequential tasks: the generation of native-like backbone folds and the positioning of the side-chains upon these backbones [6]) and in homology modeling schemes (where the backbone and some side-chains are assumed to be conserved among the homologs but the configuration of the rest of the side-chains needs to be found).\n\nFigure 1: Cow actin binding protein (PDB code 1pne, top) and closer view of its 6 carboxyl-terminal residues (bottom-left). Given the protein backbone (black) and amino acid sequence, the native side-chain conformation (gray) is searched for. The problem representation as a graphical model for those carboxyl-terminal residues is shown in the bottom-right figure (nodes located at $C_\\alpha$ atom positions, edges drawn in black).\n\nIn this paper, we show the equivalence between side-chain prediction and inference in an undirected graphical model. We compare the performance of BP, generalized BP and naive mean field on this problem, and also compare them to a widely used protein-folding program called SCWRL.\n\n2 The side-chain prediction problem\n\nProteins are chains of simpler molecules called amino acids. 
All amino acids have a common structure: a central carbon atom ($C_\\alpha$) to which a hydrogen atom, an amino group (NH$_2$) and a carboxyl group (COOH) are bonded. In addition, each amino acid has a chemical group called the side-chain, bound to $C_\\alpha$. This group distinguishes one amino acid from another and gives it its distinctive properties. Amino acids are joined end to end during protein synthesis by the formation of peptide bonds. An amino acid unit in a protein is called a residue. The formation of a succession of peptide bonds generates the backbone (consisting of $C_\\alpha$ and its adjacent atoms, N and CO, of each residue), upon which the side-chains are hung (Figure 1).\n\nWe seek to predict the configuration of all the side-chains relative to the backbone. The standard approach to this problem is to define an energy function and use the configuration that achieves the global minimum of the energy as the prediction.\n\n2.1 The energy function\n\nWe adopted the van der Waals energy function, used by SCWRL [3], which approximates the repulsive portion of the Lennard-Jones 12-6 potential. For a pair of atoms, a and b, the energy of interaction is given by:\n\n$$E(a, b) = \\begin{cases} 0 & d > R_0 \\\\ -k_2 \\frac{d}{R_0} + k_2 & R_0 \\ge d \\ge k_1 R_0 \\\\ E_{max} & k_1 R_0 > d \\end{cases} \\qquad (1)$$\n\nwhere $E_{max} = 10$, $k_1 = 0.8254$ and $k_2 = \\frac{E_{max}}{1 - k_1}$; $d$ denotes the distance between a and b and $R_0$ is the sum of their radii. Constant radii were used for the protein's atoms (carbon: 1.6 \u00c5, nitrogen and oxygen: 1.3 \u00c5, sulfur: 1.7 \u00c5). For two sets of atoms, the interaction energy is the sum of the pairwise atom interactions. The energy surface of a typical protein in the data set has dozens to thousands of local minima. 
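As a minimal sketch of the piecewise energy in equation 1, the following Python fragment evaluates the interaction between two atoms and between two sets of atoms. The function names, the element-keyed radius table and the (element, position) atom representation are hypothetical choices made for illustration, and the value of k2 assumes the reading k2 = Emax / (1 - k1), which makes the energy continuous at d = k1 R0.

```python
import math

E_MAX = 10.0
K1 = 0.8254
# Assumption: k2 = Emax / (1 - k1), so the linear ramp meets Emax at d = k1 * R0.
K2 = E_MAX / (1.0 - K1)

# Atomic radii in Angstrom, as listed in the text.
RADII = {'C': 1.6, 'N': 1.3, 'O': 1.3, 'S': 1.7}


def pair_energy(d, r0):
    # Repulsive energy of two atoms at distance d whose radii sum to r0.
    if d > r0:
        return 0.0                      # atoms do not touch
    if d >= K1 * r0:
        return K2 * (1.0 - d / r0)      # linear ramp, equals -k2*d/r0 + k2
    return E_MAX                        # severe clash, capped


def set_energy(atoms_a, atoms_b):
    # Sum of pairwise energies between two sets of (element, (x, y, z)) atoms.
    total = 0.0
    for elem_a, pos_a in atoms_a:
        for elem_b, pos_b in atoms_b:
            d = math.dist(pos_a, pos_b)
            total += pair_energy(d, RADII[elem_a] + RADII[elem_b])
    return total
```

Because the energy vanishes for every atom pair farther apart than the sum of the radii, most residue pairs in a folded protein contribute nothing, which is what keeps the resulting graphical model sparse.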
\n\n2.2 Rotamers\n\nThe configuration of a single side-chain is represented by at most 4 dihedral angles (denoted $\\chi_1$, $\\chi_2$, $\\chi_3$ and $\\chi_4$). Any assignment of $\\chi$ angles for all the residues defines a protein configuration. Thus the energy minimization problem is a highly nonlinear continuous optimization problem.\n\nIt turns out, however, that side-chains have a small repertoire of energetically preferred conformations, called rotamers. Statistical analysis of those conformations in well-determined protein structures produces a rotamer library. We used a backbone-dependent rotamer library (by Dunbrack and Karplus, July 2001 version). Given the coordinates of the backbone atoms, its dihedral angles $\\phi$ (defined, for the $i$th residue, by $C_{i-1} - N_i - C_{\\alpha,i} - C_i$) and $\\psi$ (defined by $N_i - C_{\\alpha,i} - C_i - N_{i+1}$) were calculated. The library then gives the typical rotamers for each side-chain and their prior probabilities.\n\nBy using the library we convert the continuous optimization problem into a discrete one. The number of discrete variables is equal to the number of residues, and the number of possible values each variable can take lies between 2 and 81.\n\n2.3 Graphical model\n\nSince we have a discrete optimization problem and the energy function is a sum of pairwise interactions, we can transform the problem into a graphical model with pairwise potentials. Each node corresponds to a residue, and the state of each node represents the configuration of the side chain of that residue. Denoting by $\\{r_i\\}$ an assignment of rotamers for all the residues, we have:\n\n$$P(\\{r_i\\}) = \\frac{1}{Z} e^{-\\frac{1}{T} E(\\{r_i\\})} = \\frac{1}{Z} e^{-\\frac{1}{T} \\left[ \\sum_i E(r_i) + \\sum_{i,j} E(r_i, r_j) \\right]} = \\frac{1}{Z} \\prod_i \\Psi_i(r_i) \\prod_{i,j} \\Psi_{ij}(r_i, r_j) \\qquad (2)$$\n\nwhere $Z$ is an explicit normalization factor and $T$ is the system \"temperature\" (used as a free parameter). 
The local potential $\\Psi_i(r_i)$ takes into account the prior probability of the rotamer $p_i(r_i)$ (taken from the rotamer library) and the energy of the interactions between that rotamer and the backbone:\n\n$$\\Psi_i(r_i) = p_i(r_i) e^{-\\frac{1}{T} E(r_i, \\mathrm{backbone})} \\qquad (3)$$\n\nEquation 2 requires multiplying $\\Psi_{ij}$ for all pairs of residues $i, j$, but note that equation 1 gives zero energy for atoms that are sufficiently far away. Thus we only need to calculate the pairwise interactions for nearby residues. To define the topology of the undirected graph, we examine all pairs of residues $i, j$ and check whether there exists an assignment $r_i, r_j$ for which the energy is nonzero. If it exists, we connect nodes $i$ and $j$ in the graph and set the potential to be:\n\n$$\\Psi_{ij}(r_i, r_j) = e^{-\\frac{1}{T} E(r_i, r_j)} \\qquad (4)$$\n\nFigure 1 shows a subgraph of the undirected graph. The graph is relatively sparse (each node is connected to nodes that are close in 3D space) but contains many small loops. A typical protein in the data set gives rise to a model with hundreds of loops of size 3.\n\n3 Experiments\n\nWhen the protein was small enough we used the max-junction tree algorithm [1] to find the most likely configuration of the variables (and hence the global minimum of the energy function). Murphy's implementation of the JT algorithm in his BN toolbox for Matlab was used [10].\n\nThe approximate inference algorithms we tested were loopy belief propagation (BP), generalized BP (GBP) and naive mean field (MF).\n\nBP is an exact and efficient local message passing algorithm for inference in singly connected graphs [15]. Its essential idea is replacing the exponential enumeration (either summation or maximization) over the unobserved nodes with a series of local enumerations (a process called \"elimination\" or \"peeling\"). 
Loopy BP, that is, applying BP to multiply connected graphical models, may not converge due to circulation of messages through the loops [12]. However, many groups have recently reported excellent results using loopy BP as an approximate inference algorithm [4, 11, 5]. We used an asynchronous update schedule and ran for 50 iterations or until numerical convergence.\n\nGBP is a class of approximate inference algorithms that trade complexity for accuracy [15]. A subset of GBP algorithms is equivalent to forming a graph from clusters of nodes and edges in the original graph and then running ordinary BP on the cluster graph. We used two large clusters. Both clusters contained all nodes in the graph but each cluster contained only a subset of the edges. The first cluster contained all edges between pairs of residues whose indices differ by less than a constant $k$ (typically, 6). All other edges were included in the second cluster. It can be shown that the cluster graph BP messages can be computed efficiently using the JT algorithm. Thus this approximation tries to capture dependencies between a large number of nodes in the original graph while maintaining computational feasibility.\n\nThe naive MF approximation tries to approximate the joint distribution in equation 2 as a product of independent marginals $q_i(r_i)$. The marginals $q_i(r_i)$ can be found by iterating:\n\n$$q_i(r_i) \\leftarrow \\alpha \\Psi_i(r_i) \\exp\\left( \\sum_{j \\in N_i} \\sum_{r_j} q_j(r_j) \\log \\Psi_{ij}(r_i, r_j) \\right) \\qquad (5)$$\n\nwhere $\\alpha$ denotes a normalization constant and $N_i$ means all nodes neighboring $i$. We initialized $q_i(r_i)$ to $\\Psi_i(r_i)$ and chose a random update ordering for the nodes. 
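The mean-field iteration of equation 5 can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used in the experiments: the function name, the dictionary-based model layout, the sweep limit and the convergence tolerance are all assumptions, and every potential is assumed to be strictly positive so that its logarithm exists.

```python
import math
import random


def mean_field(psi, psi_pair, n_sweeps=100, tol=1e-8, seed=0):
    # psi:      dict node -> list of positive local potentials Psi_i(r_i)
    # psi_pair: dict (i, j) -> table of positive potentials Psi_ij[r_i][r_j]
    neighbors = {i: [] for i in psi}
    log_pair = {}
    for (i, j), table in psi_pair.items():
        neighbors[i].append(j)
        neighbors[j].append(i)
        log_pair[(i, j)] = [[math.log(v) for v in row] for row in table]
        # Transposed copy so that log_pair[(a, b)][r_a][r_b] works both ways.
        log_pair[(j, i)] = [list(col) for col in zip(*log_pair[(i, j)])]

    def normalize(weights):
        z = sum(weights)
        return [w / z for w in weights]

    # Initialise each marginal to the (normalised) local potential.
    q = {i: normalize(p) for i, p in psi.items()}
    order = list(psi)
    random.Random(seed).shuffle(order)   # one random update ordering per run
    for _ in range(n_sweeps):
        delta = 0.0
        for i in order:
            expo = [sum(q[j][rj] * log_pair[(i, j)][ri][rj]
                        for j in neighbors[i] for rj in range(len(q[j])))
                    for ri in range(len(psi[i]))]
            shift = max(expo)            # subtract the max for numerical stability
            new = normalize([psi[i][ri] * math.exp(expo[ri] - shift)
                             for ri in range(len(psi[i]))])
            delta = max(delta, max(abs(a - b) for a, b in zip(new, q[i])))
            q[i] = new
        if tol > delta:
            break
    return q
```

Calling the function with different seed values gives different update orderings, and a rotamer assignment can be read off by taking the most probable state of each marginal.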
For each protein we repeated this minimization 10 times (each time with a different update order) and chose the local minimum that gave the lowest energy.\n\nIn addition to the approximate inference algorithms described above, we also compared the results to two approaches in use in side-chain prediction: the SCWRL and DEE algorithms. The Side-Chain placement With a Rotamer Library (SCWRL) algorithm is considered one of the leading algorithms for predicting side-chain conformations [3]. It uses the energy function described above (equation 1) and a heuristic search strategy to find a minimal energy conformation in a discrete conformational space (defined using a rotamer library).\n\nDead end elimination (DEE) is a search algorithm that tries to reduce the search space until it becomes suitable for an exhaustive search. It is based on a simple condition that identifies rotamers that cannot be members of the global minimum energy conformation [2]. If enough rotamers can be eliminated, the global minimum energy conformation can be found by an exhaustive search of the remaining rotamers.\n\nThe various inference algorithms were tested on a set of 325 X-ray crystal structures with resolution better than or equal to 2 \u00c5, R factor below 20% and length up to 300 residues. One representative structure was selected from each cluster of homologous structures (50% homology or more). Protein structures were acquired from the Protein Data Bank site (http://www.rcsb.org/pdb).\n\nMany proteins contain Cysteine residues which tend to form strong disulfide bonds with each other. A standard technique in side-chain prediction (used e.g. in SCWRL) is to first search for possible disulfide bonds and, if they exist, to freeze these residues in that configuration. 
This essentially reduces the search space. We repeated our experiments with and without freezing the Cysteine residues.\n\nSide-chain to backbone interaction seems to be much more severe than side-chain to side-chain interaction: the backbone is more rigid than the side-chains and its structure is assumed to be known. Therefore, the parameter $R$ was introduced into the pairwise potential equation, as follows:\n\n$$\\Psi_{ij}(r_i, r_j) = \\left( e^{-\\frac{1}{T} E(r_i, r_j)} \\right)^{\\frac{1}{R}} \\qquad (6)$$\n\nUsing $R > 1$ assigns an increased weight to side-chain to backbone interactions over side-chain to side-chain interactions. We repeated our experiments both with $R = 1$ and $R > 1$. It is worth mentioning that SCWRL implicitly adopts a weighting assumption that assigns an increased weight to side-chain to backbone interactions.\n\n4 Results\n\nIn our first set of experiments we wanted to compare approximate inference to exact inference. In order to make exact inference possible we restricted the possible rotamers of each residue. Out of the 81 possible states we chose a subset that accounted for 90% of the total local probability. We constrained the size of the subset to be at least 2. The resulting graphical model retains only a small fraction of the loops occurring in the full graphical model (about 7% of the loops of size 3). However, it still contains many small loops, and in particular, dozens of loops of size 3.\n\nOn these graphs we found that ordinary max-product BP always found the global minimum of the energy function (except in a few cases where it failed to converge).\n\n
[Figure 2 panels, axis labels as recovered from the plot: E(Sum-product BP) - E(Max-product BP); E(Mean field) - E(Max-product BP); E(SCWRL) - E(Max-product BP); convergence rate (%) for SCWRL, Sum R=1, Sum R>1, Max R=1 and Max R>1.]\n\nFigure 2: The energies found by the sum-product BP (top-left), naive MF (top-right) and SCWRL (bottom-left) algorithms are always higher than or equal to the max-product BP energy. Convergence rates for the various algorithms are shown in the bottom-right chart.\n\nSum-product BP failed to find the sum-JT conformation in only 1% of the graphs. In contrast, the naive MF algorithm found the global minimum conformation for only 38% of the proteins and in only 17% of the runs. The GBP algorithm gave the same result as ordinary BP but it converged more often (e.g. 99.6% and 98.9% for sum-product GBP and BP, respectively).\n\nIn the second set of experiments we used the full graphical models. Since exact inference is impossible we can only compare the relative energies found by the different approximate inference algorithms. Results are shown in Figure 2. Note that, when it converged, max-product BP always found a lower energy configuration compared to the other algorithms. This finding agrees with the observation that the max-product solution is a \"neighborhood optimum\" and therefore guaranteed to be better than all other assignments in a large region around it [13]. 
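Max-product BP on the potentials of equation 2 is equivalent to min-sum message passing on the energies, since each energy is -T times the logarithm of a potential. The sketch below illustrates that log-domain form in plain Python. The asynchronous schedule and the limit of 50 iterations follow the description in Section 3, while the function name, the data layout, the message normalization and the tolerance are assumptions made for illustration and are not taken from the Matlab implementation used in the experiments.

```python
def min_sum_bp(unary, pairwise, n_iters=50, tol=1e-9):
    # unary:    dict node -> list of local energies E(r_i)
    # pairwise: dict (i, j) -> table of pair energies E[r_i][r_j]
    neighbors = {i: [] for i in unary}
    cost = {}
    for (i, j), table in pairwise.items():
        neighbors[i].append(j)
        neighbors[j].append(i)
        cost[(i, j)] = table
        cost[(j, i)] = [list(col) for col in zip(*table)]   # transposed view

    # msgs[(i, j)][r_j] is the message from node i to node j.
    msgs = {(i, j): [0.0] * len(unary[j]) for (i, j) in cost}
    for _ in range(n_iters):
        delta = 0.0
        for (i, j) in msgs:          # asynchronous: updates are used at once
            h = [unary[i][ri] + sum(msgs[(k, i)][ri]
                                    for k in neighbors[i] if k != j)
                 for ri in range(len(unary[i]))]
            new = [min(h[ri] + cost[(i, j)][ri][rj] for ri in range(len(h)))
                   for rj in range(len(unary[j]))]
            low = min(new)
            new = [v - low for v in new]     # keep messages bounded
            delta = max(delta, max(abs(a - b)
                                   for a, b in zip(new, msgs[(i, j)])))
            msgs[(i, j)] = new
        if tol > delta:
            break

    # Each node picks the state with the lowest belief (energy min-marginal).
    assignment = {}
    for i in unary:
        belief = [unary[i][ri] + sum(msgs[(k, i)][ri] for k in neighbors[i])
                  for ri in range(len(unary[i]))]
        assignment[i] = belief.index(min(belief))
    return assignment
```

On a singly connected graph this procedure returns the exact minimum energy assignment when it is unique; on graphs with loops it is only an approximation and may fail to converge within the iteration limit.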
\n\nWe also tried decreasing $T$, the system \"temperature\", for sum-product (in the limit of zero temperature it should approach max-product). In 96% of the cases, using a lower temperature ($T = 0.3$ instead of $T = 1$) indeed gave a lower energy configuration. Even at this reduced temperature, however, max-product always found a lower energy configuration.\n\nAll algorithms converged in more than 90% of the cases. However, sum-product converged more often than max-product (Figure 2, bottom-right). Decreasing the temperature resulted in a lower convergence rate for the sum-product BP algorithm (e.g. 95.7% compared to 98.15% in full size graphs using disulfide bonds). It should be mentioned that SCWRL failed to converge on a single protein in the data set.\n\nApplying the DEE algorithm to the side-chain prediction graphical models dramatically decreased the size of the conformational search space, though, in most cases, the resulting space was still infeasible for an exhaustive search. Moreover, max-product BP was indifferent to that space reduction: it failed to converge for the same models and, when it converged, found the same conformation.\n\nSCWRL buried residues success rates: $\\chi_1$ 85.9%, $\\chi_2$ 62.2%, $\\chi_3$ 40.3%, $\\chi_4$ 25.5%\n\nFigure 3: Inference results: success rate. The SCWRL buried residues success rate is subtracted from the sum-product BP (light gray), max-product BP (dark gray) and MF (black) rates, for $\\chi_1$ to $\\chi_4$, when equally weighting side-chain to backbone and side-chain to side-chain clashes (left) and when assigning an increased weight to side-chain to backbone clashes (right).\n\n4.1 Success rate\n\nIn comparing the performance of the algorithms, we have focused on the energy of the found configuration since this is the quantity the algorithms seek to optimize. A more realistic performance measure is: how well do the algorithms predict the native structure of the protein?\n\nThe dihedral angle $\\chi_i$ is deemed correct when it is within 40\u00b0 of the native (crystal) structure and $\\chi_1$ to $\\chi_{i-1}$ are correct. Success rate is defined as the portion of correctly predicted dihedral angles.\n\nThe success rates of the conformations inferred by both max- and sum-product BP outperformed SCWRL's (Figure 3). For buried residues (residues with relative accessibility lower than 30% [9]) both algorithms added 1% to SCWRL's $\\chi_1$ success rate. Increasing the weight of side-chain to backbone interactions over side-chain to side-chain interactions resulted in better success rates (Figure 3, right). Freezing Cysteine residues to allow the formation of disulfide bonds slightly increased the success rate.\n\n5 Discussion\n\nRecent years have shown much progress in approximate inference. We believe that the comparison of different approximate inference algorithms is best done in the context of a real-world problem. In this paper we have shown that for a real-world problem with many loops, the performance of belief propagation is excellent. In problems where exact inference was possible, max-product BP always found the global minimum of the energy function, and in the full protein data set, max-product BP always found a lower energy configuration compared to the other algorithms tested.\n\nSCWRL is considered one of the leading algorithms for modeling side-chain conformations. 
However, in the last couple of years several groups reported better results due to a more accurate energy function [7], a better search algorithm [8], or an extended rotamer library [14].\n\nAs shown, by using inference algorithms we achieved lower energy conformations than existing algorithms. However, this leads only to a modest increase in prediction accuracy. Using an energy function which gives a better approximation to the \"true\" physical energy (and particularly, assigns lowest energy to the native structure) should significantly improve the success rate. A promising direction for future research is to try to learn the energy function from examples. Inference algorithms such as BP may play an important role in the learning procedure.\n\nReferences\n\n[1] R. Cowell. Introduction to inference in Bayesian networks. In Michael I. Jordan, editor, Learning in Graphical Models. Morgan Kaufmann, 1998.\n\n[2] Johan Desmet, Marc De Maeyer, Bart Hazes, and Ignace Lasters. The dead-end elimination theorem and its use in protein side-chain positioning. Nature, 356:539-542, 1992.\n\n[3] Roland L. Dunbrack, Jr. and Martin Karplus. Backbone-dependent rotamer library for proteins: Application to side-chain prediction. J. Mol. Biol., 230:543-574, 1993. See also http://www.fccc.edu/research/labs/dunbrack/scwrl/.\n\n[4] William T. Freeman and Egon C. Pasztor. Learning to estimate scenes from images. In M.S. Kearns, S.A. Solla, and D.A. Cohn, editors, Adv. Neural Information Processing Systems 11. MIT Press, 1999.\n\n[5] Brendan J. Frey, Ralf Koetter, and Nemanja Petrovic. Very loopy belief propagation for unwrapping phase images. In Adv. Neural Information Processing Systems 14. MIT Press, 2001.\n\n[6] Enoch S. Huang, Patrice Koehl, Michael Levitt, Rohit V. Pappu, and Jay W. Ponder. 
Accuracy of side-chain prediction upon near-native protein backbones generated by ab initio folding methods. Proteins, 33(2):204-217, 1998.\n\n[7] Shide Liang and Nick V. Grishin. Side-chain modeling with an optimized scoring function. Protein Sci, 11(2):322-331, 2002.\n\n[8] Loren L. Looger and Homme W. Hellinga. Generalized dead-end elimination algorithms make large-scale protein side-chain structure prediction tractable: implications for protein design and structural genomics. J Mol Biol, 307(1):429-445, 2001.\n\n[9] Joaquim Mendes, Cl\u00e1udio M. Soares, and Maria Arm\u00e9nia Carrondo. Improvement of side-chain modeling in proteins with the self-consistent mean field theory method based on an analysis of the factors influencing prediction. Biopolymers, 50(2):111-131, 1999.\n\n[10] Kevin Murphy. The Bayes net toolbox for Matlab. Computing Science and Statistics, 33, 2001.\n\n[11] Kevin P. Murphy, Yair Weiss, and Michael I. Jordan. Loopy belief propagation for approximate inference: an empirical study. In Proceedings of Uncertainty in AI, 1999.\n\n[12] Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.\n\n[13] Yair Weiss and William T. Freeman. On the optimality of solutions of the max-product belief propagation algorithm. IEEE Transactions on Information Theory, 47(2):723-735, 2000.\n\n[14] Zhexin Xiang and Barry Honig. Extending the accuracy limits of prediction for side-chain conformations. J Mol Biol, 311(2):421-430, 2001.\n\n[15] Jonathan S. Yedidia, William T. Freeman, and Yair Weiss. Understanding belief propagation and its generalizations. In G. Lakemeyer and B. Nebel, editors, Exploring Artificial Intelligence in the New Millennium. Morgan Kaufmann, 2002. 
", "award": [], "sourceid": 2181, "authors": [{"given_name": "Chen", "family_name": "Yanover", "institution": null}, {"given_name": "Yair", "family_name": "Weiss", "institution": null}]}