NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center

### Reviewer 1

As discussed above, I believe that the interlacing greedy method is the most novel contribution and the thresholding technique is also a very nice addition which allows for the nearly linear run time. I believe related work is adequately cited. I have verified the proofs and believe them to be technically sound. The submission is very well written and the prose is clear and easy to read. EDIT: The authors have addressed all of my concerns and I keep my score of accept. I believe the significance of this work is the simplicity and speed of the algorithms presented. Of course, this paper does not give improved approximation ratios for these problems (and indeed, better approximation ratios are possible), but these algorithms are more practical than some previously proposed in the literature. Below are some comments regarding the algorithms and experiments: Algorithms 1. Are there instances where the interlacing greedy or the fast variant achieve an approximation ratio of exactly 1/4? In other words, is 1/4 the best approximation to hope for with this technique? 2. The proofs of the algorithms are terse in a few places. In particular, I think the proof of theorem 2 would benefit from a little more discussion on the inequalities required at the end for dealing with the delta term. For instance, I worked out the inequality $(1-2 \delta)/(1 + \delta) >= 1 - 6 \delta$ for $\delta \in [0,1]$ which I think is being used. Although this is easy to verify, I think it’s worth some acknowledgement. Additionally, the authors may want to highlight where they use non-negativity of f in the proofs (although non-negativity is assumed in their definition of submodularity, it is still helpful to see). Finally, the use of submodular inequality f(X \cap Y) + f(X \cup Y) <= f(X) + f(Y) is used in the proof but not discussed in the definitions; it would be nice to include a brief discussion of this equivalent definition of submoularity so that unfamiliar readers may better follow the proof. Experiments: 1. Unfortunately, the link does not contain any code so I could not look at the code. I assume that the author(s) will fix this and it doesn’t greatly affect my review. 2. Can you comment on why might the run time be slowly increasing with k? It’s worth noting that this empirical performance does not contradict your run time analysis. For instance, the # evaluations may be increasing as a function of k, but bounded above by the O(n log n). 3. My understanding is that in this situation, it is incorrect to say that the query complexity is increasing logarithmically with k. Although we see that, empirically, the number of evaluations grows slowly with k, there has been no test to determine that growth is logarithmic. In fact, using the term “logarithmic” implies some precise statement which is probably not true. So, I recommend replacing the discussion about “logarithmic growth” with a discussion about the point above. 4. As per the NeurIPS 2019 reproducability checklist, it would be good to include the language used and computing infrastructure used, even though the metrics of interest (e.g. # function evaluations) are machine independent. General Typos / Suggestions 1. Line 23: Add the epsilon dependence to the work of Badanidiyuru and Vondrak. 2. Lines 100-102: It might be best to describe the problem by the notation $max_{|S| \leq k} f(S)$ rather than the way it is phrased here. 3. Line 102: “the function $f$ is considered to be a value oracle” is awkwardly phrased. The function itself is not a value oracle, but it is “presented as” or “accessed by” a value oracle. 4. Line 119: Type “two standard ;j; greedy procedures” 5. The use of [n] seems to be inconsistent. Sometimes, it refers to the integers 1…n but other times it seems to refer to the integers 0…n-1. For example, Line 1 and Line 15 of Algorithm 1, where [n] = 1…n and [k+1] = 0…k are both used. 6. Line 146: Include delta / epsilon dependence in the running time 7. Line 158: “adds the first on the element” awkward phrasing, might be a typo. 8. Section 2.2 uses epsilon and delta interchangeably. I would recommend using only one for clarity. EDIT: The authors have sufficiently addressed all of my concerns and also (in my opinion) the concerns of other reviewers. I maintain my vote of "accept".