
Submitted by Assigned_Reviewer_1
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
*Summary*
This paper studies the adaptive stochastic optimization on paths, by imposing some constraints (namely the MLRB and the MLB conditions) on the relationship between the probability of a realization and its utility. While the notion of *adaptive submodularity* states that in expectation, no item will "surprise" us given a longer history, the MLRB / MLB conditions in this paper roughly characterizes the fact that such "surprise" only happens with small probability (i.e., realization with small probability has large utility). By leveraging such constraints, the authors propose a novel recursive algorithm with nearoptimal guarantee on the expected cost needed to reach a certain coverage. Under the MLB conditions, the paper recovers the classical approximation result on bayesian poolbased active learning, up to a constant factor.
*Quality*
The paper is nicely written and well structured. Regarding the experimental results: by Theorem 4, it would be very interesting to see how the proposed RAC algorithm competes with existing algorithms *empirically* on the poolbased active learning problem (e.g., [3][5][7], or [6] under the bounded noise setting), given the fact that the version space reduction function and the Gibbs error function both satisfy MLRB condition, pointwise submodularity condition, and adaptive submodularity condition. Having these results will certainly make the paper stronger.
*Clarity*
Most of the paper is clear. It would be helpful if the authors elaborate more on the intuitive explanations of the two conditions: why are those two conditions more natural than the adaptive submodularity condition for adaptive optimization problems on paths?
*Originality and Significance*
To the best of my knowledge, the technical contribution of this contribution is novel. It provides interesting insights to developing tractable algorithms for the adaptive stochastic optimization problem.
Q2: Please summarize your review in 12 sentences
The paper proposes an interesting solution to the adaptive stochastic optimization problem on paths (and sets as a special case). The authors identify a class of problems that can be nearoptimally approximated using their solution, and provides strong empirical evidence. The presentation is in general clear and easy to read.
Submitted by Assigned_Reviewer_2
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
This paper gives an iterative procedure to solve the Adaptive Stochastic Optimization problem. The problem defined by an objective function f, and the paper develops conditions on the function that make the problem amenable to their solution approach (recursive adaptive convergence).
Quality: The ASO problem is described as a generalization of POMDP and conditions on the solution technique are given in simulation. The choice of baseline algorithms makes sense, but the paper would be much stronger if it the approach was compared more directly to similar approaches, e.g. popular POMDP solutions. Since the paper is rather technical giving more practical/concrete examples of the propositions would be helpful.
Clarity: The layout of the paper, language and flow are excellent, especially given the technical nature of the material.
Connecting the flow to specific problems is difficult. Having a running example, might make the paper easier to follow.
Originality:
This paper seems to capitalize on theoretical developments in submodular function optimization and applying these results to the ASO problem. The original part comes from the particular application.
Significance: Given the generality of the setting, the results could have a significant impact.
However, the current presentation does not make a strong case for this potential. Comparing the numerical results to alternative solution techniques or more directly arguing for the applicably of the MLRB and MLB conditions would make the case of impact stronger.
Q2: Please summarize your review in 12 sentences
This paper gives a tractable
solution algorithm and sufficient conditions for when it's applicable for a general (and generally intractable) planning problem (ASO).
Submitted by Assigned_Reviewer_3
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
This paper is generally clear, with a few exceptions as noted below.
The paper extends current work in a potentially useful way, although the experiments used to illustrate the method are fairly small and simple.
The theoretical work relates this work to prior work (prop 2), demonstrates applicability to a new domain (prop 1 and 3), and analyzes the algorithm.
The usefulness of the algorithm analysis is questionable (e.g. Thm 1 and 2).
The resulting bound grows with the square of the log of the number of actions (where this is actually the number of possible locations on the path, as each action is "visit location x") as compared to the optimal cost.
This does not seem especially tight for application with any reasonablesized problem.
A couple of clarity issues: line 110, should this inequality be reversed as A is a subset of B? line 125, what are the conditions for termination of a policy and how is that determined in practice? line between 151 and 152, I think some mistakes in notation line 216, what are alignment assumptions? experiments: the labels on the plots don't match those in the text.
The captions should be more informative and not repeat information already on the plots (Gibbs error)
Q2: Please summarize your review in 12 sentences
This paper introduces an extension to submodular optimization that extends the method to apply to more than sets.
The contributions include the development and analysis of two conditions that allow this, as well as introducing an algorithm to solve problems meeting these conditions.
These contributions appear novel, well justified, and of interest to the community.
Q1:Author
rebuttal: Please respond to any concerns raised in the reviews. There are
no constraints on how you want to argue your case, except for the fact
that your text should be limited to a maximum of 5000 characters. Note
however, that reviewers and area chairs are busy and may not read long
vague rebuttals. It is in your own interest to be concise and to the
point.
Assigned Reviewer 1:
*Quality*
The
paper is nicely written and well structured. Regarding the experimental
results: by Theorem 4, it would be very interesting to see how the
proposed RAC algorithm competes with existing algorithms *empirically* on
the poolbased active learning problem (e.g., [3][5][7], or [6] under the
bounded noise setting), given the fact that the version space reduction
function and the Gibbs error function both satisfy MLRB condition,
pointwise submodularity condition, and adaptive submodularity condition.
Having these results will certainly make the paper stronger.
A
>>> We thank reviewer for feedback. We intend to compare with
algorithms on active learning problems in an extended version of this
paper.
*Clarity*
Most of the paper is clear. It would be
helpful if the authors elaborate more on the intuitive explanations of the
two conditions: why are those two conditions more natural than the
adaptive submodularity condition for adaptive optimization problems on
paths?
A >>> We do not know of any method to exploit
adaptive submodularity condition for adaptive optimization problems on
PATHS. The two conditions seem natural to us because we can plan a path
assuming the maximum likelihood observation and exploit the condition if
we don't encounter the ML on the path. We will elaborate on this issue in
the revision.
Assigned reviewer 2:
Quality: The ASO
problem is described as a generalization of POMDP and conditions on the
solution technique are given in simulation. The choice of baseline
algorithms makes sense, but the paper would be much stronger if it the
approach was compared more directly to similar approaches, e.g. popular
POMDP solutions. Since the paper is rather technical giving more
practical/concrete examples of the propositions would be helpful.
A >>> We thank reviewer for feedback. We intend to
compare with POMDP algorithms in an extended version of this
paper.
Significance: Given the generality of the setting, the
results could have a significant impact. However, the current presentation
does not make a strong case for this potential. Comparing the numerical
results to alternative solution techniques or more directly arguing for
the applicably of the MLRB and MLB conditions would make the case of
impact stronger.
A >>> Thank you for the suggestion.
We will try to think about meaningful alternatives for
comparison.
Assigned reviewer 3:
This paper is generally
clear, with a few exceptions as noted below. The paper extends current
work in a potentially useful way, although the experiments used to
illustrate the method are fairly small and simple.
The theoretical
work relates this work to prior work (prop 2), demonstrates applicability
to a new domain (prop 1 and 3), and analyzes the algorithm. The usefulness
of the algorithm analysis is questionable (e.g. Thm 1 and 2). The
resulting bound grows with the square of the log of the number of actions
(where this is actually the number of possible locations on the path, as
each action is "visit location x") as compared to the optimal cost. This
does not seem especially tight for application with any reasonablesized
problem.
A >>> We agree that the bounds are not as
tight as we like due to log X^{2+e} factor. Unfortunately, this is
derived from a stateoftheart polynomial time polymatroid path planning
algorithm. We speculate that the algorithm will be good for most practical
cases. Otherwise, we can replace the polynomial time path planning with
general search algorithms
A couple of clarity issues: line 110,
should this inequality be reversed as A is a subset of B?
A
>>> Yes, we will correct that in revision.
line 125,
what are the conditions for termination of a policy and how is that
determined in practice?
A >>> We terminate when the
worst case value of the function given a history is maximum. It will be
easier to evaluate termination if the function satisfies minimal
dependency condition (prop. 5).
line between 151 and 152, I think
some mistakes in notation A >>> I think it is correct.
However, we intend to revise the notations to be more consistent with
Golovin and Krause paper for clarity.
line 216, what are alignment
assumptions?
A >>> We will revise this sentence and
bring the short discussion of alignment forward.
experiments: the
labels on the plots don't match those in the text. The captions should be
more informative and not repeat information already on the plots (Gibbs
error)
A >>> We will correct the mistake in the label.
We agree the captions can be simplified.
Assigned reviewer
4:
Thank you for your encouragement.
Assigned reviewer
6:
We thank reviewer for feedback. We will clarify the notations
especially for the prior and posterior distributions in the
revision. 
