May 24, 2010:
Found another bug! When computing the levels probability, I had a for loop iterating over L, and I was supposed to be raising each value to the power of the number of words at that level.
However, I was raising only half of the probability to this power and the other half not at all. ***sigh***

I also implemented a truncated Poisson level distribution. There is one function to sample the level from the Poisson, and another function to update the Poisson parameter
given a Gamma prior.
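The two functions might look like the following sketch (the function names, the level range {0, ..., L}, and the use of the posterior mean as the point update are my assumptions, not the actual implementation; note the conjugate Gamma update below ignores the truncation):

```python
import math
import random

def sample_truncated_poisson(lam, L, rng=random):
    """Sample a level in {0, ..., L} from a Poisson(lam) truncated to that range."""
    # Unnormalized pmf lam^k / k! (the e^{-lam} factor cancels in normalization).
    weights = [lam**k / math.factorial(k) for k in range(L + 1)]
    total = sum(weights)
    u = rng.random() * total
    acc = 0.0
    for k, w in enumerate(weights):
        acc += w
        if u <= acc:
            return k
    return L

def update_poisson_rate(levels, a, b):
    """Gamma(a, b) prior on the Poisson rate gives posterior
    Gamma(a + sum(levels), b + n); return the posterior mean as a point update.
    (This treats the samples as untruncated Poisson draws.)"""
    n = len(levels)
    return (a + sum(levels)) / (b + n)
```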


May 5, 2010:
Found three bugs:
1. When computing the levels probability, I was computing the equation wrongly. I fixed it by putting in a loop over L, which made everything simpler.
2. When computing the log-likelihood, I had a loop over j in the path portion.
3. When MH sampling eta, I was not counting the fixed topics, which I should, since when I compute p(w|fixed_topic) I was using eta (not the eta from training).
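For bug 1, the fixed computation presumably looks something like this sketch in log space (the interface is my assumption): each level's log probability is multiplied in once per word at that level, via a single loop over the L levels.

```python
import math

def log_levels_prob(level_log_probs, word_levels):
    """Log probability of a level assignment: raise each level's probability
    to the power of the number of words at that level, i.e. sum
    count[l] * log p[l] with one loop over the L levels."""
    L = len(level_log_probs)
    counts = [0] * L
    for lev in word_levels:
        counts[lev] += 1
    return sum(counts[l] * level_log_probs[l] for l in range(L))
```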


April 15, 2010:
I made two separate functions, main_sample and main_resample, which perform sampling (training) and resampling (testing) independently.
Another important BUG fix:
I was computing the acceptance probability for the eta MH sampler totally wrong (everything was inside the next-lower for-loop). This is fixed in the directory
stick_break15
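The structural point of the fix can be sketched like this (the function interface and the Gaussian random-walk proposal are my assumptions): the full log-likelihood is accumulated over all the data first, and the accept/reject decision happens once, outside any inner loop.

```python
import math
import random

def mh_step_eta(eta, log_lik, proposal_sd=0.1, rng=random):
    """One Metropolis-Hastings update for eta with a symmetric Gaussian
    random-walk proposal. log_lik(eta) must sum the log-likelihood over ALL
    the data; the accept/reject decision is made ONCE, not per-loop-iteration."""
    eta_new = eta + rng.gauss(0.0, proposal_sd)
    if eta_new <= 0:
        return eta  # proposal outside the support: reject
    log_accept = log_lik(eta_new) - log_lik(eta)
    if math.log(rng.random()) < log_accept:
        return eta_new
    return eta
```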


April 6, 2010:
Comes from stick_break14. I have added sampling the hyperparameters eta and beta using a Metropolis-Hastings sampler. I put an Exponential() prior on each one.
One important bug discovered: the random number generator rgen starts over every time it is created, so we should create one at the beginning of the program
and pass it into the functions that need it.
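The bug and the fix can be illustrated with a small sketch (hypothetical function names; Python's `random.Random` stands in for rgen): a generator constructed inside a function restarts its stream from the same seed on every call, while a generator created once and passed in keeps advancing.

```python
import random

def draws_with_fresh_rng(n, seed=0):
    """BUG pattern: the generator is created inside the function, so every
    call restarts the stream from the same seed and returns identical draws."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

def draws_with_shared_rng(n, rng):
    """FIX: the generator is created once by the caller (at the start of the
    program) and passed into every function that needs randomness."""
    return [rng.random() for _ in range(n)]
```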


March 23, 2010:
I have added sampling the pi variables as well. This means the level-variable sampling equation is simpler, and I use rejection sampling for the pi variables.
Unfortunately, this means I'm sampling from a beta distribution using GSL, which only works on the datalab machines.
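A GSL-free alternative would be a simple rejection sampler for the beta draws. This sketch (my construction, not the code in this directory) uses a Uniform(0,1) proposal with an envelope at the density's mode, which works whenever a, b >= 1 so the density is bounded:

```python
import random

def beta_pdf(x, a, b):
    """Unnormalized Beta(a, b) density."""
    return x ** (a - 1) * (1 - x) ** (b - 1)

def sample_beta_rejection(a, b, rng=random):
    """Rejection sampling from Beta(a, b) for a, b >= 1: propose from
    Uniform(0, 1) and accept with probability pdf(x) / pdf(mode)."""
    assert a >= 1 and b >= 1
    if a == 1 and b == 1:
        return rng.random()  # Beta(1, 1) is uniform; mode formula would divide by 0
    mode = (a - 1) / (a + b - 2)
    m = beta_pdf(mode, a, b)  # maximum of the unnormalized density
    while True:
        x = rng.random()
        if rng.random() * m <= beta_pdf(x, a, b):
            return x
```

The acceptance rate degrades as the density gets peaky (large a, b), but for the mild values typical of stick-breaking priors it is cheap and dependency-free.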


March 9, 2010:
Copied from stick_break11/ directory. I'm going to be running a series of experiments with this set of code that requires hard-coding information. In particular,

1. Given G, learn phi and levels
2. Given which nodes are in which equiv. class, learn the edges, phi, paths, and levels
3. After removing a node, and given G, can we learn the node back?
4. After removing a node, and given the equiv. classes, can we learn the node back (as well as learning the edges, phi, paths, and levels)?



February 26, 2010:
Copied from stick_break10 directory. Here documents have paths of variable length (not necessarily of length L). Words are assigned to levels
starting at the last node and moving upward according to a truncated geometric distribution.
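Under one plausible parameterization (my assumption: the weight decays geometrically with distance above the last node, truncated to the path and renormalized), the level sampler might look like:

```python
import random

def sample_level_truncated_geometric(path_len, p, rng=random):
    """Sample a word's level on a path of length path_len. Level path_len - 1
    is the last (deepest) node; weight p * (1 - p)^d decays geometrically with
    the distance d above it, truncated to the path and normalized."""
    weights = [p * (1 - p) ** (path_len - 1 - lev) for lev in range(path_len)]
    total = sum(weights)
    u = rng.random() * total
    acc = 0.0
    for lev, w in enumerate(weights):
        acc += w
        if u <= acc:
            return lev
    return path_len - 1
```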


February 14, 2010:
Each document has a path. The words are assigned to some level among the path.


February 12, 2010:
I changed it back so that when a word is assigned to a path, it increments only for the exit node.
This is the correct way of doing things (and the previous way didn't really do anything).

February 11, 2010:
Modified so that when a word is assigned to a path, it increments the word distribution
of every node in the path.


February 10, 2010:
This is an offshoot of stick_break7. The difference is that cp is just a vector of 
length W instead of a matrix of size W-by-L. 


January 29, 2010:

START FROM SCRATCH.
PROBABILITY OF WORD FROM EXIT NODE ONLY.

Put a finite depth L.
Sample with exit probabilities.

This means we have to compute the probability of all P paths (i.e., we can't do it a node at a time).
But we don't have to bother with a level distribution.
Compute the probability of every path using a modified DFS.
GOT RID OF PASS NODES, but I'm not ready to code the full cross-level edges yet.
Instead, there are no pass nodes, but a node can only create an edge to the level below it.
(One step at a time!)
I believe there is still a memory leak.
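The path enumeration can be sketched as a DFS that multiplies edge probabilities along the way and records a path wherever a node's exit probability lets it end (the `tree[node] = (exit_prob, children)` structure is a hypothetical stand-in for the actual representation):

```python
def path_probs(tree, root):
    """Enumerate every root-to-exit path with DFS. tree maps
    node -> (exit_prob, [(child, edge_prob), ...]); a path's probability is
    the product of its edge probabilities times the exit probability at the
    node where it ends."""
    results = {}

    def dfs(node, prefix, prob):
        exit_prob, children = tree[node]
        results[prefix + (node,)] = prob * exit_prob  # the path that exits here
        for child, edge_prob in children:
            dfs(child, prefix + (node,), prob * edge_prob)

    dfs(root, (), 1.0)
    return results
```

If the exit probability and outgoing edge probabilities at each node sum to 1, the path probabilities sum to 1 over all paths, which makes a handy sanity check.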

