
Submitted by
Assigned_Reviewer_4
Q1: Comments to author(s).
First provide a summary of the paper, and then address the following
criteria: Quality, clarity, originality and significance. (For detailed
reviewing guidelines, see
http://nips.cc/PaperInformation/ReviewerInstructions)
This paper provides a rigorous mathematical framework
for the limits of memory capacity with complex synapses. Instead of
analyzing individual models with different transition matrices, the
authors developed a general theory which provides an upper bound of memory
capacity for the entire model space. The idea is novel and most parts of
the paper are clearly written. However, whether their toy model captures
enough biological reality is questionable. I would suggest that the
authors explore further the connections of their theory to biology. More
biological evidence is expected. They may also suggest biological
experiments to test their theory. As shown in Figure 5, the numerical
results deviate from the theoretical derivations. The authors should
discuss the origin of this discrepancy in more detail; otherwise, the
theory is not convincing enough. In the mathematical derivation, the
authors assume a constant transition matrix between different states. It
may be important to check how robust their results are against
perturbations of the transition matrix over time due to the noisy
biological environment. Overall, this paper is novel and the results may
provide important guidance for the design of memory storage devices.
Q2: Please summarize your review in 1-2 sentences
The author(s) developed a general theory of memory
capacity with complex synapses, which bounds the functional limits of
memory for the entire space of models. This is novel and interesting;
however, the numerical experiments do not support their theory well.
Submitted by
Assigned_Reviewer_6
Q1: Comments to author(s).
The paper applies the theory of ergodic Markov chains
in continuous time to the analysis of the memory properties of online
learning in synapses with intrinsic states, extending earlier work of
Abbott, Fusi and their coworkers.
The main contribution is quite
interesting: a derivation of an envelope that memory curves (SNR of memory
recall over time) of particular complex binary learning rules cannot
cross.
Comments:
1) It would help the reader to explicitly
describe the derivation of the central equation (3).
2) The original parts of the supplement should somehow be integrated in
the main paper.
Q2: Please summarize your review in 1-2 sentences
The paper applies the theory of ergodic Markov chains
in continuous time to the analysis of the memory properties of online
learning in synapses with intrinsic states.
Submitted by
Assigned_Reviewer_7
Q1: Comments to author(s).
The paper studies the problem of memory storage with
discrete (digital) synapses. Previous work established that memory
capacity can be increased by adding a cascade of (latent) states but the
optimal state transition dynamics was unknown and the actual dynamics was
usually handpicked using some heuristic rules. In this paper the authors
aim to derive the optimal transition dynamics for synaptic cascades. They
first derive an upper bound on achievable memory capacity and show that
simple models with linear chain structures can approach (achieve) this
bound.
The paper is clear, high quality, generally well written
and has a clear contribution to the field.
Minor comments:
line 195: Different systems in the brain might be optimised for storage
at different timescales. Is there any evidence for that?
line 220: When looking at the molecular network of a synapse, it is not
immediately obvious what the "states" of the system are and what the
associated state transition dynamics is. A strong prediction of the paper
is that the molecular network in the synapse has a meaningful structure
for information storage, not like the one in Fig. 2a,b. Is there a simple
way to check this prediction?
line 294: The authors may refer to the supplementary material around
equation 14.
line 299: Isn't it problematic that nu_i depends on M, which we try to
optimize?
line 304: An intuitive description of the process depicted in Fig. 3a,b is
that a potentiation event has to move all synaptic states to more
potentiated states, while a depression event must move them to more
depressed states.
line 326: In all the examples provided in the paper, half of the states
are associated with w=-1 and the other half with w=+1. How does the memory
capacity change if the states are more asymmetric, e.g., in the most
extreme case, there is only a single state with w=-1 and M-1 states with
w=+1?
line 415: Linear chains maximize the area but do not necessarily achieve
the best performance (touch the envelope) at t_0? Is it true that for
high performance at a certain time a chain other than linear may be
optimal? Is the chain shown in Fig. 5b a linear chain?
I appreciate the detailed rebuttal from the
authors. A short definition of the linear chain would be more useful than
repeating Fig 4. A possible definition would be that a chain is linear if
there is a unique ordering of the nodes from depressed to potentiated,
but this is not what the authors used.
Q2: Please summarize your review in 1-2 sentences
The paper is clear, high quality, generally well
written and has a clear contribution to the field.
Q1: Author
rebuttal: Please respond to any concerns raised in the reviews. There are
no constraints on how you want to argue your case, except for the fact
that your text should be limited to a maximum of 6000 characters. Note
however that reviewers and area chairs are very busy and may not read long
vague rebuttals. It is in your own interest to be concise and to the
point.
We thank the reviewers for their thoughtful reviews.
Below are specific comments to each reviewer.
Reviewer 1: We briefly discussed a suggestion for biological experiments
to test the theory in line 428. We could expand on this. If we measured
pre- and postsynaptic spike trains, and also recorded changes in
postsynaptic potentials to measure changes in synaptic weight, we could
use hidden Markov model techniques to find the best-fit synaptic model.
Then, given our theory, we
could match this measured synaptic model to optimal models to infer which
timescales the synapse operates on. Regarding the discrepancy between the
envelope and the numerical results: the envelope is just an upper bound
and we do not claim that it is a tight bound at all times (see
discussion). The reason is that equation (18) is not a complete set of
constraints, as discussed below it. In the paragraph from line 377 to 400,
we discuss this point further. Note that the numerical results are not
always instructive as the numerical procedures can be prevented from
reaching the global maximum by local maxima. This is shown by the fact
that our hand-designed models can outperform the numerical methods at late
times. In fact, the apparent drop-off of the solid red curve at very late
times comes from not allowing small enough epsilon. We will fix this in
fig5. Regarding robustness to perturbations of the transition matrix: yes,
the level of noise tolerance is an important consideration.
Note that the stochasticity of the models is an expression of biological
noise. In fact, the type of noise described by the reviewer could be
included by adding extra states to the model with the different transition
rates. However, the synapses should not be allowed to optimize all of this
noise away. It would be interesting to consider such limits on the noise
levels, but we think this would be beyond the scope of this work.
Reviewer 2: We'll add a derivation of equation 3 to the supplement. We
could add a description along the lines of: "The factor of p^infinity
describes the synapses being in the steady-state distribution before the
memory is encoded. The factor of (M^pot - M^dep) comes from the encoding
of the memory at t=0, with w_ideal being +/-1 in synapses that are
potentiated/depotentiated. The factor of exp(rt W^F) describes the
subsequent evolution of the probability distribution, averaged over all
sequences of plasticity events, and the factor of w indicates the readout
via the synaptic weight." We referred to all of the original parts of the
supplement in the main paper, but we can be more explicit, and would be
happy to modify the paper to do so.
Reviewer 3:
line 195: in the
introduction, we cited [17], which describes diversity in synaptic
structure across the vertebrate brain. This could be related to
optimization for different timescales, but anything more than speculation
would require a better understanding of the relation between structure and
function (which is what we intend to begin with this work).
line 220: Yes, fig2a,b is an example of why it is difficult to map
molecular states to functional states as, despite appearances, these
models actually only have two functional states due to their equivalence
to fig2c. One experimental investigation of this could be along the lines
of the experiment described in line 428 (see first paragraph of our reply
to reviewer 1 above for elaboration). Presumably we will find fewer
functional states than molecular states. Making the link between molecular
and functional states is an important research question for neurobiology.
Our work helps make progress towards this by providing a theory for how
functional states might be related to each other when memory is optimal,
giving experimentalists clues to look for.
line 294: agreed.
line 299: The fact that eta_i depends on M^pot/dep is taken into account.
In equation (57) of the supplement, the term involving c_g comes from this
dependence. If the reviewer is concerned that the order of the eta_i could
change during the maximization procedure: note that necessary conditions
for a maximum only require that there is no infinitesimal perturbation
that increases the area. Therefore we need only consider an infinitesimal
neighborhood of the model, in which the order will not change.
line 304: Yes. We'll add this phrase to the text.
line 326: In fig2(a,b), we need not have an equal number of w=+/-1 states
(we could change the figure to reflect this), but those states are not
functionally relevant, as shown by their equivalence to fig2c. The model
in fig2c, of course, has no room for such asymmetry. The models in
figs4,5b do need to have equal numbers of +/- states; asymmetry would make
them worse. For fig4, the effect of the asymmetry would be reduced in the
limit as epsilon -> 0, as all states other than the end states would have
very small p^infinity. It is important to note that while our figures do
show symmetric models, our proofs are general and apply to asymmetric
models as well.
line 415: It is only for t_0 > sqrt(N)M that the models that nearly touch
the envelope are linear chains. The model in fig5b is not a linear chain,
as it has shortcut transitions, but such models are only the best models
(that we have found) for times t_0 < sqrt(N)M. Would it help if we
repeated fig4 as fig5c?
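
The factor-by-factor reading of equation (3) given in the reply to
Reviewer 2 can be sketched numerically. The following is a minimal
illustration, not the authors' code. It assumes a two-state synapse with
transition probability q, equal fractions of potentiation and depression
events, a forgetting generator W^F = f_pot M^pot + f_dep M^dep - I, and an
overall prefactor sqrt(N) * 2 f_pot f_dep. These choices, and every
function and variable name below, are assumptions made for illustration.

```python
import numpy as np


def expm(a, terms=20):
    """Matrix exponential via scaling and squaring of a truncated Taylor series."""
    norm = np.linalg.norm(a, ord=np.inf)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    scaled = a / (2 ** squarings)
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result


def stationary(w_f):
    """Row vector p with p @ w_f = 0 and sum(p) = 1 (steady-state distribution)."""
    n = w_f.shape[0]
    a = np.vstack([w_f.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    p, *_ = np.linalg.lstsq(a, b, rcond=None)
    return p


def snr(t, m_pot, m_dep, w, n_syn, r=1.0, f_pot=0.5):
    """Memory curve assembled from the four factors named in the rebuttal."""
    f_dep = 1.0 - f_pot
    # Assumed form of the forgetting generator W^F.
    w_f = f_pot * m_pot + f_dep * m_dep - np.eye(len(w))
    p_inf = stationary(w_f)                 # steady state before encoding
    encoded = p_inf @ (m_pot - m_dep)       # encoding of the memory at t = 0
    evolved = encoded @ expm(r * t * w_f)   # averaged later plasticity events
    # Readout via the synaptic weights, with the assumed prefactor.
    return np.sqrt(n_syn) * 2 * f_pot * f_dep * (evolved @ w)


# Hypothetical two-state model: state 0 has w = -1, state 1 has w = +1.
q = 0.5
m_pot = np.array([[1 - q, q], [0.0, 1.0]])
m_dep = np.array([[1.0, 0.0], [q, 1 - q]])
w = np.array([-1.0, 1.0])
curve = [snr(t, m_pot, m_dep, w, n_syn=100) for t in (0.0, 1.0, 2.0)]
```

Under these assumptions the two-state model has the closed form
sqrt(N) q exp(-q r t), which gives a quick sanity check on the sketch.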
 