Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
- Even-sized kernels are rarely used in models for discriminative vision tasks. Based on the reported results, the proposed method effectively addresses this problem on image classification tasks, including CIFAR-10, CIFAR-100, and ImageNet.
- Additionally, some results on image generation are demonstrated, where the method consistently leads to a better FID score.
- However, the method is not evaluated on discriminative tasks such as object detection or semantic segmentation, where spatial information is important.
- Overall, the method is interesting and effective, but simple.
Although this is interesting work, its main argument is not convincing. The authors argue that an even-sized kernel shifts the feature map, which results in performance degradation. However, the shift may not be the key reason. One popular explanation is that, when a 2x2 kernel is used in a downsampling layer with stride=2, information is lost because adjacent convolution patches do not overlap. The feature map shift may not be a key issue, as the convolutional operator is invariant to spatial shifts. I am not convinced that asymmetric versus symmetric padding makes a big difference in a DCNN.

====== Post Author Feedback ======

I read the author feedback. The authors provide more experimental results showing the performance improvement from symmetric padding. I raise my score based on the new numerical results, which seem promising.
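The shift under discussion in this review can be illustrated with a small 1D sketch. It is not the authors' implementation: the helper names (`conv1d_same`, `center_of_mass`), the averaging kernels, and the choice to alternate the padding side from layer to layer are all assumptions made for illustration. The sketch only tracks where the response to a single impulse ends up after four stacked same-size layers.

```python
import numpy as np

def conv1d_same(x, kernel, pad_left, pad_right):
    """Stride-1 cross-correlation of x with kernel after zero padding.

    pad_left + pad_right == len(kernel) - 1 keeps the output the same
    length as the input.
    """
    padded = np.pad(x, (pad_left, pad_right))
    k = len(kernel)
    return np.array([np.dot(padded[i:i + k], kernel)
                     for i in range(len(padded) - k + 1)])

def center_of_mass(x):
    """Mean position of the signal, weighted by its values."""
    return float(np.sum(np.arange(len(x)) * x) / np.sum(x))

# Unit impulse in the middle of a length-21 signal (hypothetical input).
x = np.zeros(21)
x[10] = 1.0

even_kernel = np.array([0.5, 0.5])        # 2-tap kernel: needs 1 pad element in total
odd_kernel = np.array([0.25, 0.5, 0.25])  # 3-tap kernel: 1 pad element on each side

asym, alt, odd = x.copy(), x.copy(), x.copy()
for layer in range(4):
    # Even kernel, always padded on the left: the response drifts right.
    asym = conv1d_same(asym, even_kernel, 1, 0)
    # Even kernel, padding side alternated per layer (an assumption made
    # here for illustration): the half-pixel drifts cancel.
    if layer % 2 == 0:
        alt = conv1d_same(alt, even_kernel, 1, 0)
    else:
        alt = conv1d_same(alt, even_kernel, 0, 1)
    # Odd kernel with symmetric padding: no drift.
    odd = conv1d_same(odd, odd_kernel, 1, 1)

com_asym = center_of_mass(asym)
com_alt = center_of_mass(alt)
com_odd = center_of_mass(odd)
print(com_asym, com_alt, com_odd)
```

In this toy setting, each one-sided layer moves the impulse response by half a sample, so the drift accumulates with depth, while the alternating and odd-kernel variants stay centered. Whether such a drift accounts for the accuracy gap in deep networks is exactly the point the reviewer questions, and the sketch does not settle it.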
Originality: Identifying and raising the problem of even-sized kernels is a very interesting direction to take. The method proposed in the paper is not very novel: it is in fact quite similar to what's done in . However, I can also see that the authors arrive at the shift operation from a different direction, with the intention of addressing the asymmetric padding issue. I'm not aware of any work that addresses the asymmetric padding issue this way, and I think this solution is quite novel.

Quality: The experiments do not fully convince me of the claims made in the paper. More details in "Improvements".

Clarity: The paper is written clearly with respect to the methods and experiments. However, the explanations and hypotheses regarding the "edge effect" and the "erosion" are described quite vaguely.

Significance: If the experiments and results were more solid (that is, if my concerns about the experiments can be resolved), I think this would be an insightful finding for CNN architecture design. Given that the paper has a novel motivation and novel ideas, I'm recommending a score of 6 in the hope that the authors can convince me of their claims.