NeurIPS 2019
Sun Dec 8 through Sat Dec 14, 2019, Vancouver Convention Center
Paper ID: 1693
Title: Quality Aware Generative Adversarial Networks

Reviewer 1

Update after the rebuttal: I stand by my review and rating. The additional experiments and explanations in the rebuttal largely clarify the concerns I had.

---

The paper proposes an approach to improving the training of Generative Adversarial Networks (GANs) on images. The idea is to use regularizers based on image-specific similarity metrics (SSIM, NIQE). The method is evaluated on non-progressive GANs trained on three datasets: CIFAR-10, STL-10, and CelebA. The proposed method seems to substantially improve FID and IS relative to baselines.

Pros:
1) A reasonable and, to my knowledge, new idea.
2) A quite clear and complete presentation.
3) The results are good, both qualitatively and quantitatively, especially in terms of FID.
4) Extensive supplementary material with many details and extra results.

Cons:
1) Not a huge technical innovation, but rather an incremental modification of existing techniques.
2) Some questionable statements:
- "The usage of SSIM as loss function has been limited" (Section 3). It has in fact been used in many works; one example is [1] below.
- "Boundedness to [-1, 1] immediately renders SSIM an invalid distance metric" (Sec. 3.1) - why?
- Why does d^Q serve as a good candidate for regularizing GANs?

[1] Hang Zhao, Orazio Gallo, Iuri Frosio, Jan Kautz. Loss Functions for Image Restoration with Neural Networks. IEEE Transactions on Computational Imaging, 2017.

Overall, the paper proposes a reasonable approach, presents it well, and shows that the method performs well empirically. I think the paper can be published.
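[Editor's illustration of the distance-metric question above: SSIM is bounded in [-1, 1], so a dissimilarity such as d = sqrt(1 - SSIM) is bounded and vanishes only for SSIM = 1. The sketch below is a minimal single-window SSIM on grayscale NumPy arrays with the standard SSIM stabilizing constants; it is not the paper's exact d^Q, and `global_ssim`/`ssim_distance` are illustrative names.]

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window (global) SSIM between two grayscale images in [0, 1]."""
    c1 = (0.01 * data_range) ** 2  # standard SSIM stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def ssim_distance(x, y):
    """SSIM-derived dissimilarity: since SSIM lies in [-1, 1],
    d = sqrt(1 - SSIM) lies in [0, sqrt(2)] and is 0 only when SSIM = 1."""
    return float(np.sqrt(max(0.0, 1.0 - global_ssim(x, y))))

rng = np.random.default_rng(0)
img = rng.random((32, 32))
assert ssim_distance(img, img) < 1e-6       # identical images: distance ~0
assert ssim_distance(img, 1.0 - img) > 0.5  # inverted image: far away
```

Note that boundedness alone does not break the metric axioms; whether the triangle inequality holds for an SSIM-derived distance is the substantive question the reviewer raises.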

Reviewer 2

Thanks for the rebuttal. I have read it carefully. The new experiments look good, but the authors do not seem to respond to my concern about the SSIM metric between unpaired images. I keep my original review and rating.

---

I think the paper is clear and the intuition is well argued. Given all the prior works that smooth GAN training, the idea of integrating image quality assessment metrics into GANs sounds interesting. From the experimental samples, the quality-aware GAN does seem to improve sample quality; the generated CelebA and STL images look sharp. I would like to see results of combining QAGAN with large-scale GANs such as PGGAN or BigGAN, since I think semantic details and structures play a central role in the quality of large images. Quantitatively, the paper shows that QAGAN achieves comparable IS and lower FID than the baselines.

Below are some comments.

- Secs. 3.1 and 3.2 discuss how to integrate SSIM into the GAN framework, while Secs. 3.3 and 3.4 discuss how to combine NIQE with a GAN. I suggest the authors put Secs. 3.1 and 3.2 together, and likewise Secs. 3.3 and 3.4.
- The MSCN coefficients for GAN-generated images in the supplementary material are useful; I suggest putting them in the main paper with Figure 1.
- I think the paper [1] is related to the main idea and should be compared against and discussed in detail.
- SSIM was originally designed to compare a distorted image with its pristine counterpart. In QAGAN, however, the distance is calculated between a randomly sampled generated image and a real image, which are not paired. I am not sure whether the SSIM distance still makes sense in this setting: the distance between two real images can be even larger than the distance between a real image and a distorted image, because the two real images may have completely different local luminance, contrast, and structure.

[1] Kancharla, P., & Channappayya, S. S. (2018, October). Improving the Visual Quality of Generative Adversarial Network (GAN)-Generated Images Using the Multi-Scale Structural Similarity Index. In 2018 25th IEEE International Conference on Image Processing (ICIP) (pp. 3908-3912). IEEE.
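[Editor's illustration of the unpaired-image concern above: with a single-window SSIM on synthetic grayscale arrays, two unrelated "real" images are indeed farther apart than a real image and its mildly distorted copy. This is a hypothetical sketch, not from the paper; `global_ssim` and the noise level are illustrative choices.]

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window (global) SSIM between two grayscale images in [0, 1]."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (x.var() + y.var() + c2))

rng = np.random.default_rng(1)
real_a = rng.random((64, 64))   # stand-in for one "real" image
real_b = rng.random((64, 64))   # an unrelated "real" image
distorted_a = np.clip(real_a + 0.05 * rng.standard_normal((64, 64)), 0, 1)

# Two unpaired images share no structure, so their covariance (and hence
# SSIM) is near 0, while a mildly distorted copy stays close to SSIM = 1:
assert global_ssim(real_a, distorted_a) > global_ssim(real_a, real_b)
```

This supports the reviewer's point: an unpaired SSIM distance reflects how dissimilar two particular samples happen to be, not how "natural" the generated sample is.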

Reviewer 3

Update: I thank the authors for their feedback. The additional experiments address my concerns regarding larger-scale experiments, and the discussion about the choice of lambdas clarifies my concerns regarding convergence. Thus, I decided to raise my score.

---

(1) Summary of the Paper

This work advocates the use of image quality objectives as regularizers when training GANs in order to obtain more natural-looking images. In particular, two different metrics are used: (i) a variant of the SSIM index and (ii) a gradient penalty inspired by NIQE, within the WGAN-GP framework, in order to achieve superior performance to several baselines, as demonstrated empirically on three datasets.

(2) Paper Clarity

The writing of the paper is clear and easy to follow. It is well motivated and contains enough background information.

(3) Methodology and Significance

The idea of using image quality metrics as regularizers is simple yet intuitive, which I appreciate. However, while the new regularizers are shown to achieve superior performance in terms of FID/INC, I have several concerns/questions. Specifically:

- How do the regularizers affect the stability of training? How sensitive are they to the choice of the hyperparameters (the lambdas)?
- In Tables 1 and 2, are the FID and INC scores computed using the same trained model (i.e., the same lambdas), or are they tuned separately for each score?

Overall, while the method is simple and intuitive and appears to work well, its contribution is somewhat limited. The main reason is that it is applied only to one particular GAN variant, i.e., WGAN-GP, and it is not clear how one can extend it to other GAN versions. Furthermore, since the main motivation of the submission is to generate more natural-looking images, a proper evaluation should include some higher-resolution datasets, where the regularizers can potentially have a bigger impact.
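[Editor's illustration of the lambda question above: in a WGAN-GP-style objective, the quality term enters as one more weighted summand, so sensitivity to the lambdas is sensitivity to how much that term dominates the total. A schematic sketch with illustrative names and default weights (`lambda_gp=10` is the common WGAN-GP default; `lambda_q` and all values are placeholders, not the paper's):]

```python
def critic_loss(adv_loss, gradient_penalty, quality_reg,
                lambda_gp=10.0, lambda_q=1.0):
    """Composite WGAN-GP-style objective with an added quality regularizer.

    adv_loss         : Wasserstein critic estimate, E[D(fake)] - E[D(real)]
    gradient_penalty : standard WGAN-GP term, E[(||grad D(x_hat)|| - 1)^2]
    quality_reg      : quality-aware term (e.g. an SSIM/NIQE-based penalty)
    The lambda weights are the hyperparameters the review asks about;
    the defaults here are illustrative, not the paper's.
    """
    return adv_loss + lambda_gp * gradient_penalty + lambda_q * quality_reg

# The sensitivity the review probes: sweeping lambda_q shifts how strongly
# the quality term pulls on the total objective.
losses = [critic_loss(-1.2, 0.05, 0.3, lambda_q=lq) for lq in (0.1, 1.0, 10.0)]
assert losses[0] < losses[1] < losses[2]
```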