NeurIPS 2020

Adaptation Properties Allow Identification of Optimized Neural Codes

Review 1

Summary and Contributions: The authors explore optimal, efficient neural codes based on Fisher information of a one-dimensional stimulus. The goal of the paper was to build a general framework to help discover what objective functions neural codes might be optimal for (rather than requiring a hand-selected value).

Strengths: The motivations and setup were well-defined, and much of the theory was well-grounded. In particular, it's a compelling notion to discover what neural codes are doing rather than looking for specific pre-defined optimalities. The generality of the framework may be of interest to those in the NeurIPS community studying neural coding.

Weaknesses: Unfortunately, I found many of the results and methods difficult to follow. In particular, the instructions for applying this methodology to real data were not obvious (sections 2.4-2.5) and lacked a concrete example. For example, how would one perform the fixed-point search on line 255? (And when does such a distribution exist?) If this requires a parametric assumption, as in the Poisson case in section 2.5, what is gained by the general framework that does not require these assumptions (lines 106-109)? UPDATE: The authors' response has helped clarify some of my questions about the fixed-point approach and the modeling assumptions.

Correctness: The approach and methodology used are appropriate and appear correct.

Clarity: Most of the paper is well-written, though I found some of the derivations and presentations of the results unclear. The contents of figure 2 and the caption could have been more clearly labeled and explained to better demonstrate the coding framework. (For 2b, does the “uniform” part only apply to the left panel?)

Relation to Prior Work: The authors discuss how their approach differs from previous efficient coding frameworks.

Reproducibility: Yes

Additional Feedback:

Review 2

Summary and Contributions: This paper derives a formal and general solution to the problem of optimal coding of a 1-dimensional variable, where optimality is defined with respect to a function of the Fisher information (and thus local discriminability) and subject to both explicit resource constraints and implicit constraints of monotone continuity of the encoding function. The result subsumes and generalizes a number of previous findings. The authors also use it to propose a scheme to determine the cost function and constraints

Strengths: While it builds on substantial earlier work in optimal coding theory, this paper offers a new and elegant, general formulation of the problem that is valuable in itself, and may well seed further progress in characterizing neural tuning and adaptation.

Weaknesses: The paper shares some limitations with much previous work (some of which are mentioned in the Discussion): a limitation to 1-D stimuli, a treatment of coding in isolation from computation, and a combination of a local discriminability measure with smooth monotone encoding (rather than a consideration of more general encoding schemes under mutual information). It is unfortunate that there is no direct link to known neural codes, nor a discussion of how known adaptation results may be interpreted in light of these arguments.

Correctness: I did not notice any errors.

Clarity: Yes.

Relation to Prior Work: Yes.

Reproducibility: Yes

Additional Feedback: I have noted the author feedback, but see no reason to revise my review.

Review 3

Summary and Contributions: The submission “Adaptation properties allow identification of optimized neural codes” describes a new formulation of efficient coding theory. This work builds upon several pieces of previous work. It uses a slightly more flexible efficient coding formulation than some of the previous work, with an objective function that is based on the Fisher information while also considering the constraints on the neural activity. This theory does not operate directly on the firing rate; rather, it operates on a more abstract quantity, the Fisher information. I think the model formulation is quite clever. The authors derived a set of results by considering different instantiations of the model. While in some cases it recovers known results in the literature, there are a few instances which have not been considered previously. The authors also consider the problem of recovering the objective function and the constraints from the neural data.

Strengths: Although I didn’t check every single line of math, the theoretical derivations in the paper appear to be sound for the most part. The theoretical formulation is general, yet could be used to make specific predictions. The theory generalizes and helps organize some of the recent results in efficient coding in a coherent framework. Efficient coding is an important topic in neural coding, and is clearly relevant to the Neuroscience community at NeurIPS. While one might consider the paper to be incremental, I do find the general formulation presented in the paper to be quite interesting and informative. The paper also considers how to reverse-engineer the cost function and constraint under certain conditions. This is a previously overlooked problem in my view. Although the assumptions may seem a bit restrictive, these kinds of exercises are still insightful.

Weaknesses: The writing of the manuscript is quite dense, and occasionally it is difficult to follow the arguments and understand the key points the authors would like to deliver. The paper would be stronger if the authors could include a real dataset from physiology.

Correctness: I found the overall claims and method to be sound.

Clarity: Section 2.3 asks the question of “how efficient codes depend on neural noise”. However, I didn’t find a clear answer there. Maybe the authors could rewrite it to make the answer clear. Lines 235-236: it is confusing to say that p(\theta) “starts” to depend on p(s), because the model here does not consider time explicitly. Lines 255-262: this is an important aspect of the results. However, the logic here is a bit difficult to follow. Maybe it would be useful to first describe the key ideas and intuitions, followed by the details.

Relation to Prior Work: Overall, the relation to the previous work is well described. The only thing that is unclear to me is how the results on deviation from activity constraints relate to the results in ref [14].

Reproducibility: Yes

Additional Feedback: A major technical concern I have is how the authors can still recover the objective function and constraints while estimating the neural activity parameters. From the description in Sec. 2.4.3, in order to recover the objective and constraint functions, the code needs to adapt quickly. If that’s the case, how can the neural parameters be reliably estimated while the code is changing? I’d thought that the two things would be disentangled. Added after the rebuttal period: After reading through the other reviews and the rebuttal, I remain positive about this paper. As other reviewers also pointed out, although the results in the paper are a bit incremental, the (slightly) more general framework formulated in the paper might still count as an interesting contribution. Writing: The paper is quite dense, and the authors cover too many things; I hope the authors can further improve the presentation. One weakness is that no experimental data are presented, but if viewed as a theoretical paper, one could still justify it as a solid paper.

Review 4

Summary and Contributions:
1. This paper proposes an efficient coding formulation with adaptation to a changing world, in a 1D parameter (the sufficient statistics of the stimulus) and a 1D stimulus space, with the following three ingredients:
* the objective function (as a function of Fisher information) that the code aims to optimize
* constraints (like firing rate) on neural activities
* the stimulus distribution
2. The authors propose an experimental paradigm that can identify constraints on neural codes, regardless of the objective function, using a fixed-point iterative closed-loop experiment, which might be used to evaluate an animal's efficient coding constraints in certain adaptation settings.

Strengths: This is an interesting paper that links multiple important ideas in a tractable 1D model framework. It is appealing to test the brain's efficient coding strategy while the neural code adapts to a changing stimulus. This paper offers a simple yet clean way to generalize and formulate the brain's optimization algorithm by decomposing the adaptation part into the change of sufficient statistics p(theta|s) while keeping the encoding model p(r|theta) unchanged. An experiment is proposed, at a conceptual level, to test whether a resource constraint exists. Such a prediction is of great importance for experimental validation of the theory.

Weaknesses: This is a hard paper to read. The authors should spend more time justifying and explaining some of their claims. I found the parts related to the flat Fisher space particularly obscure, even though they seem to be a critical aspect of the paper. The fixed-point discussion could also be substantially improved. Within the scope of this paper, it is unclear whether the optimization objective function is identifiable experimentally, although the authors do show that a deviation from a log is testable. Consequently, when the metabolic constraint is tested to be present (non-flat p(\hat{\theta})), it is hard to quantify the deviation, since it will depend on the form of the objective function. There is no discussion of how the adaptation could be implemented in the brain and predicted by this work.

Correctness: The derivations are clear. Basic assumptions are reasonable in the 1D case. Minor: [1] refers to Histogram Equalization, not Histogram Equilibration.

Clarity: The section on information geometry and the flat Fisher space needs more exposition. It was unclear why the authors could move to the flat Fisher space by changing variables, and whether the reparameterization by \hat{\theta} was equivalent to altering the tuning of theta(s). L65: one-dimensional *manifold*, not subspace. Since theta is monotonically related to s, s will already be a sufficient statistic if theta is. Relatedly, this parameterization makes the Fisher information independent of noise, but why is it the right one? \hat{\theta} is not just a reparameterization of theta, but is also a reparameterization of s, and that was the original point of the optimization. Update after rebuttal: I thank the authors for their clarifications. I have decided to increase the score from 6 to 7.
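
For reference, the change of variables questioned above can be sketched with the standard transformation rule for Fisher information under a reparameterization. This is a generic identity, not necessarily the authors' own derivation; the symbols J_\theta, J_{\hat{\theta}}, J_s and the map g are notation assumed here for illustration, with g smooth and strictly increasing and J_\theta > 0.

```latex
% Assumed notation: J_\theta is the Fisher information about \theta,
% and \hat{\theta} = g(\theta) is a smooth, strictly increasing map.
\begin{gather}
J_{\hat{\theta}}(\hat{\theta}) = \frac{J_\theta(\theta)}{g'(\theta)^2}, \\
g'(\theta) = \sqrt{J_\theta(\theta)}
  \quad\Longrightarrow\quad J_{\hat{\theta}}(\hat{\theta}) = 1, \\
J_s(s) = \left(\frac{d\theta}{ds}\right)^{2} J_\theta\bigl(\theta(s)\bigr)
       = \left(\frac{d\hat{\theta}}{ds}\right)^{2}.
\end{gather}
```

Under this choice of g, the Fisher information about s depends only on the slope of \hat{\theta}(s). That is one way to read the remark that \hat{\theta} reparameterizes s as well as theta.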

Relation to Prior Work: Adequate. Mainly discussed in Section 1 and Section 2.1.

Reproducibility: Yes

Additional Feedback: It would be helpful to hold the reader's hand and provide more intuitions. These findings seem interesting but it is difficult for me to piece them together as is.