{"title": "Generative Well-intentioned Networks", "book": "Advances in Neural Information Processing Systems", "page_first": 13098, "page_last": 13109, "abstract": "We propose Generative Well-intentioned Networks (GWINs), a novel framework for increasing the accuracy of certainty-based, closed-world classifiers. A conditional generative network recovers the distribution of observations that the classifier labels correctly with high certainty. We introduce a reject option to the classifier during inference, allowing the classifier to reject an observation instance rather than predict an uncertain label. These rejected observations are translated by the generative network to high-certainty representations, which are then relabeled by the classifier. This architecture allows for any certainty-based classifier or rejection function and is not limited to multilayer perceptrons. The capability of this framework is assessed using benchmark classification datasets and shows that GWINs significantly improve the accuracy of uncertain observations.", "full_text": "Generative Well-intentioned Networks\n\nJustin Cosentino, Jun Zhu\u2217\n\nDept. of Comp. Sci. & Tech., Institute for AI, THBI Lab, BNRist Center,\nState Key Lab for Intell. Tech. & Sys., Tsinghua University, Beijing, China\n\njustin@cosentino.io, dcszj@mail.tsinghua.edu.cn\n\nAbstract\n\nWe propose Generative Well-intentioned Networks (GWINs), a novel framework\nfor increasing the accuracy of certainty-based, closed-world classi\ufb01ers. A condi-\ntional generative network recovers the distribution of observations that the classi\ufb01er\nlabels correctly with high certainty. We introduce a reject option to the classi\ufb01er\nduring inference, allowing the classi\ufb01er to reject an observation instance rather\nthan predict an uncertain label. These rejected observations are translated by the\ngenerative network to high-certainty representations, which are then relabeled by\nthe classi\ufb01er. 
This architecture allows for any certainty-based classi\ufb01er or rejection\nfunction and is not limited to multilayer perceptrons. The capability of this frame-\nwork is assessed using benchmark classi\ufb01cation datasets and shows that GWINs\nsigni\ufb01cantly improve the accuracy of uncertain observations.\n\n1\n\nIntroduction\n\nAn essential aspect of any machine learning system is understanding what the model does not know.\nDespite achieving state-of-the-art performance across a wide array of problem domains, current\ndeep learning techniques do not actually capture model uncertainty. Core settings in which standard\ndeep learning approaches have been deployed, such as medical diagnoses, autonomous vehicles, and\ncritical systems, rely on accurate estimates of uncertainty [16, 10]. Though traditional Bayesian\nprobability theory offers mathematical tools to reason about model uncertainty, such approaches\ndo not scale to the high dimensional feature spaces found in many deep learning tasks. The need\nfor principled uncertainty estimates from deep learning architectures has given rise to the \ufb01eld of\nBayesian deep learning (see e.g., [35]) and many deep learning techniques have been interpreted\nthrough a Bayesian lens with the development of advanced inference algorithms [36, 39], providing\nnovel methods for obtaining uncertainty estimates from deep learning models [21, 11, 12, 13, 22].\nOne may be able to measure epistemic uncertainty \u2013 uncertainty in model prediction due to the lack\nof knowledge \u2013 using Bayesian neural networks [25, 29], but the question of how to best utilize\nuncertainty estimates still remains. In this paper, we propose Generative Well-intentioned Networks\n(GWINs), a novel framework that leverages these uncertainty estimates to increase the generalizability\nand accuracy of certainty-based classi\ufb01ers. 
Rather than make low-certainty predictions, a model can\nreject an observation to achieve an arbitrarily high accuracy [5]. However, a model that refuses to\nclassify is not particularly useful. Borrowing ideas from the \ufb01elds of classi\ufb01cation with rejection\nand generative networks, we allow a classi\ufb01er to reject uncertain observations and then, using a\ngenerative network, transform them into representations that the classi\ufb01er labels correctly with high\ncertainty. Informally, one can view the classi\ufb01er as \u201cintuition\u201d and the generative network as \u201ccritical\nthinking\u201d: given a new observation that we can not quickly reason about with prior knowledge, we\napply critical thinking to reformulate the problem by relating it to information we already know to\nbe true. We show that the generative network G is able to recover the distribution of observations\n\n\u2217Corresponding author.\n\n33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.\n\n\fthat classi\ufb01er C labels correctly with high certainty and that this reformulation process signi\ufb01cantly\nincreases classi\ufb01er accuracy on the rejected observation subset.\nThe rest of this paper is organized as follows. We introduce the necessary background regarding\nGenerative Adversarial Networks (GANs) and rejection-based classi\ufb01cation in Section 2. Our\nproposed GWIN framework is formally de\ufb01ned in Section 3 and a sample GWIN implementation is\ndetailed in Section 4. We then empirically evaluate the effectiveness of the proposed framework in\nSection 5. 
Lastly, we discuss related works in Section 6.\n\n2 Preliminaries\n\n2.1 Generative Adversarial Networks\n\nGenerative Adversarial Networks (GANs) [17] are generative models that use an adversarial process between two networks to learn a distribution: a generator network G produces synthetic data given some noise vector z, while a discriminator network D discriminates between the generator's output and samples from the true data distribution. The goal of the generator is to produce samples that fool the discriminator. Formally, this adversarial game results in the following minimax objective:\n\n\min_G \max_D \; \mathbb{E}_{x \sim P_r}[\log D(x)] + \mathbb{E}_{\tilde{x} \sim P_g}[\log(1 - D(\tilde{x}))], \quad (1)\n\nwhere P_r is the real data distribution and P_g is the generated distribution implicitly defined by x' = G(z), with z a random noise vector sampled from a simple noise distribution p, i.e., z \sim p(z). With enough capacity, the discriminator will reach an optimum given G so that P_r = P_g [17].\n\nIt is well known that GANs suffer from training instability [33], and it has been suggested that the divergences which GANs usually minimize are the cause of such training difficulties [2]. The Wasserstein GAN (WGAN) proposes the use of the Earth-Mover distance to define its objective function:\n\n\min_G \max_{D \in \mathcal{D}} \; \mathbb{E}_{x \sim P_r}[D(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})], \quad (2)\n\nwhere \mathcal{D} is the set of 1-Lipschitz functions. The Wasserstein GAN with gradient penalty (WGAN-GP) [19] further builds on this work, providing a final objective function with desirable properties:\n\n\min_G \max_D \; \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})] - \mathbb{E}_{x \sim P_r}[D(x)] + \lambda \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\big[(\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1)^2\big]. \quad (3)\n\nLastly, GANs can be extended to conditional models by conditioning both the discriminator and generator on auxiliary information y [27].
By providing y as additional input to each network, the original GAN objective function presented in Equation (1) becomes:\n\n\min_G \max_D \; \mathbb{E}_{x \sim P_r}[\log D(x, y)] + \mathbb{E}_{z \sim P_z}[\log(1 - D(G(z, y), y))]. \quad (4)\n\nIn this work, we build upon a conditional implementation of the WGAN with gradient penalty.\n\n2.2 Classification with Reject\n\nEntirely orthogonal to the field of generative networks is the study of classification with rejection. The problem of classification with rejection can be informally defined as giving the classifier the option to reject an observation instance instead of predicting its label. Depending on the setting, the classifier may incur some small cost for rejection, though this cost is typically less than that of a random prediction. The motivation behind rejection-based classification is to avoid misclassification in high-risk situations, such as medical diagnoses, when the classifier has low certainty that its prediction will be correct. Early works explored the inherent tradeoff between error rate and rejection rate [4, 5], while more recent works have explored the binary classification setting [37, 3, 6]. We borrow the basic idea of threshold rejection from these works: given some threshold τ, one rejects an observation instance if certainty in correct prediction is less than τ.\n\nRecent work has also explored the reject option in the context of deep learning [14, 15]. Though we opt for the simplicity of the thresholded reject option described above, it is worth noting that these methodologies could also be used within the Generative Well-intentioned Network framework.\n\nFigure 1: The inference process for some new observation x_i. If classifier C labels the input as y'_i with certainty c_i and the query is rejected, the conditional GWIN translates the given query to the classifier's confident distribution.
The transformed query x'_i is then relabeled by the classifier, i.e., C(G(x_i, z)). The variable z denotes a random noise vector. The top half of this figure outlines the expected interface of the rejection-based classifier. Aside from requiring the model to emit a certainty metric c_i and label y'_i, no strong assumptions are made about the classifier. Since the classifier is fixed during generative training, it need not be a perceptron-based model. The rejection function r : {(c, y')} → {reject, y'} determines whether the given observation is rejected or labeled.\n\n3 Generative Well-intentioned Network Framework\n\nWe propose a novel framework that leverages uncertainty estimates and generative networks to increase the accuracy of certainty-based models during inference. The framework consists of three core components:\n\n1. A pretrained, certainty-based classifier C that emits a prediction y'_i with certainty c_i when labeling a new observation x_i, i.e., (y'_i, c_i) = C(x_i)\n\n2. A rejection function r : {(c, y')} → {reject, y'} that allows the classifier to reject an uncertain instance rather than predicting its label\n\n3. A conditional generative network G that transforms an observation x_i and noise vector z into a new representation x'_i, i.e., x'_i = G(x_i, z)\n\nA key feature of this framework is that it can be used together with any certainty-based classifier and does not modify the classifier structure at any point during the generative training process.
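The three components above compose into a small inference wrapper. Below is a minimal Python sketch of that loop; `classify`, `reject`, and `generate` are hypothetical callables standing in for the paper's concrete models, and the 100-dimensional noise vector is an illustrative choice:

```python
import numpy as np

def gwin_predict(x, classify, reject, generate, rng=None):
    """Label x, routing uncertain queries through the generative network.

    classify: x -> (label, certainty); reject: (certainty, label) -> bool;
    generate: (x, z) -> x'. All three are assumed interfaces, not the
    paper's concrete models.
    """
    rng = rng or np.random.default_rng(0)
    label, certainty = classify(x)
    if not reject(certainty, label):
        return label                      # confident: answer directly
    z = rng.standard_normal(100)          # noise vector for the generator
    x_prime = generate(x, z)              # translate toward the confident region
    new_label, _ = classify(x_prime)      # relabel the transformed query
    return new_label
```

The wrapper touches the generator at most once per query, which is the property contrasted with Defense-GAN later in the paper.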
Assuming that the classifier and rejection function provide the interface illustrated in Figure 1, any classifier or rejection function can be used within this framework.\n\nGiven this fixed, certainty-based classifier C, the conditional GWIN G learns a distribution P_c, where P_c represents the distribution of observations from the original data distribution P_r that C labels correctly with high certainty. The goal of G is to generate, from (x, y) ~ P_r, a new observation x' ~ P_c that the classifier will label as the ground truth y with high certainty. During inference, the classifier can choose to reject an observation x if it is uncertain that it will label x correctly. This observation is then passed to G, along with a noise vector z, to generate a transformed sample for reclassification. The inference process is illustrated in Figure 1 and examples of the transformation process using a Wasserstein GWIN are shown in Figure 2.\n\nSimilarly to the classifier and the rejection function, we do not place any strong restrictions on the generative framework. We propose a Wasserstein GWIN in Section 4 as one potential approach. Though the Wasserstein network makes use of an adversarial procedure, we refer to these generative networks as "well-intentioned" since they aim to maximize the accuracy and certainty of the provided classifier.\n\nFigure 2: A visual representation of the GWIN transformation using example images from the MNIST Digits dataset. With a certainty threshold of τ = 0.8, the classifier rejects the observations on the left, which would have been labeled incorrectly were the classifier forced to predict. These observations are then transformed into the representations on the right using the Wasserstein GWIN described in Section 4.3.
When relabeling the generated images, i.e., C(G(x, z)), the classifier labels them correctly with high certainty.\n\n4 Wasserstein Generative Well-intentioned Network\n\nWe outline a sample GWIN implementation, as defined in Section 3, based on the Wasserstein GAN [2]. We utilize a Bayesian neural network classifier and a simple τ-threshold rejection function. Section 5 evaluates this proposed implementation.\n\n4.1 Classifier\n\nThe GWIN is paired with a Bayesian neural network [29] using a LeNet-5 architecture [23]. A detailed description of the classifier's architecture is in the appendix. The network is implemented using TensorFlow Probability [7], which provides clean abstractions for Bayesian variational inference. The model uses the Flipout estimator [40] to minimize the Kullback-Leibler divergence up to a constant, also known as the negative Evidence Lower Bound (ELBO).\n\nWe approximate prediction certainty using Monte Carlo sampling to draw class probabilities from the model. We treat the median prediction of these draws as the certainty metric for each class and the mean prediction value as the prediction score. The class with the highest prediction value and its certainty metric are then provided to the rejection function.\n\nRecall from Section 3 that the GWIN framework is model-agnostic for certainty-based classifiers. Thus, our experiments do not focus on improving the classifier or rejection function, but rather analyze how the GWIN improves accuracy for a fixed classifier.
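The Monte Carlo certainty procedure just described can be sketched in NumPy; the `(n_draws, n_classes)` array layout is an assumed convention, not code from the paper:

```python
import numpy as np

def mc_certainty(sample_probs):
    """Summarize Monte Carlo draws of class probabilities.

    sample_probs: (n_draws, n_classes) array, one probability vector per
    draw from the Bayesian network. Returns (predicted_class, certainty):
    the class with the highest mean probability (the prediction score) and
    the median probability of that class across draws (the certainty
    handed to the rejection function).
    """
    scores = sample_probs.mean(axis=0)             # mean: prediction score
    certainties = np.median(sample_probs, axis=0)  # median: certainty metric
    pred = int(np.argmax(scores))
    return pred, float(certainties[pred])
```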
In the appendix, we show that the GWIN still improves classifier performance for a stronger Bayesian neural network.\n\n4.2 Rejection Function\n\nWe use a simple τ-threshold rejection rule, where τ ∈ [0, 1]:\n\nr(c_i, y'_i) = \begin{cases} y'_i, & \text{if } c_i \geq \tau \\ \text{reject}, & \text{otherwise.} \end{cases} \quad (5)\n\nThe choice of τ is made at the time of inference, meaning that this rejection function can be tuned for optimal accuracy after the generative network has been trained. Setting τ = 0 rejects no values and is equivalent to using only the base classifier, while setting τ = 1 rejects all values and is equivalent to preprocessing all input with the GWIN.\n\n4.3 Wasserstein GWIN with Gradient Penalty\n\nThe Wasserstein GWIN with gradient penalty (WGWIN-GP) is based on the Wasserstein GAN with gradient penalty [19]. The architectures of both the critic and generator closely follow the original WGAN-GP models and a detailed description of these architectures is in the appendix. In this subsection, we detail core modifications to the original model.\n\n[Figure 2 examples: rejected inputs such as "Old Label: 9, Certainty: 60.19%" are mapped to "New Label: 5, Certainty: 96.60%", and similarly 9 (67.98%) → 7 (99.86%) and 5 (78.60%) → 0 (99.95%).]\n\nLoss with Transformation Penalty. The WGWIN-GP introduces a new loss function with a transformation penalty that encourages the conditional generator to produce images that the classifier will label correctly. Given some training observation (x_i, y_i), the generator should produce an x'_i that the classifier labels as y_i. This penalty is the loss of the classifier when labeling the transformed observations in the current training batch, denoted Loss(C(x')). We include a penalty coefficient λ_Loss.
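The τ-threshold rule of Equation (5) can be sketched as a small Python closure; the string `"reject"` is an illustrative stand-in for the reject symbol:

```python
def make_reject(tau):
    """The τ-threshold rule of Equation (5): keep the label iff c_i >= τ.

    Returns r(c_i, y_i) -> y_i or the string "reject" (an illustrative
    stand-in for the reject symbol).
    """
    def r(c_i, y_i):
        return y_i if c_i >= tau else "reject"
    return r
```

Because τ is only a closure argument, the threshold can be swept at inference time without retraining anything, as noted above.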
All experiments in this paper use λ_Loss = 10, which we found to work well across experiments. Equation (6) shows the loss function for the GWIN:\n\nL = \underbrace{\mathbb{E}_{x' \sim P_g}[D(x', y)] - \mathbb{E}_{x \sim P_c}[D(x, y)]}_{\text{WGAN loss}} + \underbrace{\lambda_{GP} \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\big[(\|\nabla_{\hat{x}} D(\hat{x}, y)\|_2 - 1)^2\big]}_{\text{WGAN-GP penalty}} + \underbrace{\lambda_{Loss} \, \mathbb{E}_{x' \sim P_g}[\text{Loss}(C(x'))]}_{\text{transformation penalty}}. \quad (6)\n\nCritic Training on Confident Subset. The WGAN-GP critic is typically trained on both generated data x' ~ P_g and real data x ~ P_r. However, we want the GWIN to generate images from the classifier's confident distribution. Thus, we prefilter the training data to create a confident distribution P_c containing all images that the classifier labels correctly with certainty of at least τ*. The critic is then trained exclusively on samples drawn from P_c and P_g. Note that τ* is not necessarily the same certainty threshold used in the rejection function. We set τ* to some arbitrarily high certainty, e.g., 0.95, so that the rejection function can be tuned without needing to retrain the generative model. Since the WGWIN-GP will encounter observations from P_r during inference, only the critic samples from P_c; during training, the generator samples from the entire real distribution P_r.\n\nA Conditional Generative Model. The WGWIN-GP is trained as a conditional GAN. Conditional generative networks are often class conditioned to generate an example of a specific class, and the same conditioning information is given to both the critic and generator. However, as the WGWIN-GP will not have access to the ground truth label during inference, the generator is conditioned on the entire observation x.
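The three terms of Equation (6) can be sketched as follows. Here `D`, the classifier loss, and the externally supplied gradient `grad_D_xhat` are all assumptions standing in for the real networks and autodiff machinery:

```python
import numpy as np

def wgwin_critic_loss(D, x_fake, x_real, y, grad_D_xhat, lam_gp=10.0):
    """Critic side of Equation (6): WGAN loss plus gradient penalty.

    D is a hypothetical critic callable; grad_D_xhat stands in for the
    autodiff gradient of D with respect to the interpolates x-hat.
    """
    wgan = np.mean(D(x_fake, y)) - np.mean(D(x_real, y))
    gp = lam_gp * np.mean((np.linalg.norm(grad_D_xhat, axis=-1) - 1.0) ** 2)
    return wgan + gp

def wgwin_generator_loss(D, x_fake, y, clf_loss, lam_loss=10.0):
    """Generator side: fool the critic while minimizing the classifier's
    loss on the transformed samples (the transformation penalty)."""
    return -np.mean(D(x_fake, y)) + lam_loss * np.mean(clf_loss(x_fake))
```

The transformation penalty is what ties the otherwise standard WGAN-GP objective to the fixed classifier C.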
We want the critic to discriminate between certain and uncertain observations. Since x is not guaranteed to be from P_c, we condition the critic on a one-hot representation of the ground truth label y in an effort to generate images that are representative of the original observation's class. Thus the generator is tasked with translating observations into new images that belong to the given class in the confident distribution.\n\nOne can achieve conditioning by concatenating the conditional information with the input [27] or with a feature vector at some hidden layer within the network [32, 42]. Though other conditioning methods exist, such as modifying the discriminator's loss function to also maximize the log likelihood of the correct class [30] or projection-based approaches [28], we opted to condition the generator using input-based concatenation and to condition the critic using hidden-layer concatenation for simplicity. Algorithm 1 shows the new WGWIN-GP training algorithm.\n\n5 Evaluation\n\nWe evaluate the WGWIN-GP using the training procedure outlined in Section 4 and the inference method illustrated in Figure 1. We compare the test accuracy of the base Bayesian neural network, denoted BNN, the Bayesian neural network with reject, denoted BNN w/Reject, and the Bayesian neural network paired with the WGWIN-GP, denoted BNN+GWIN. BNN w/Reject allows the classifier to reject observations without needing to relabel them, while BNN+GWIN uses the WGWIN-GP to transform and relabel the rejected subset.\n\nThe BNN trained for 30 epochs using a learning rate of 0.001 and batch size of 128. The GWIN trained for 200,000 iterations using the default hyperparameters listed in Algorithm 1. Both the generator and critic used a learning rate of 0.0001 and batch size of 128. We perform inference using various certainty thresholds τ ∈ {0.10, 0.30, 0.50, 0.70, 0.80, 0.90, 0.95, 0.99}.
The BNN uses 10 Monte Carlo samples to determine prediction certainty. Given the non-deterministic nature of both the Bayesian neural network and the generative network, all experimental results are averaged over 10 runs. We trained and evaluated the models using NVIDIA GeForce GTX TITAN X GPUs.\n\nAlgorithm 1: WGWIN with gradient and transformation penalty. We use default values of λ_GP = 10, λ_Loss = 10, n_critic = 5, α = 0.0001, β_1 = 0.5, β_2 = 0.9, certainty preprocessing threshold τ* = 0.95, and the fixed classifier C described in Section 4.1.\n\nRequire: The penalty coefficients λ_GP and λ_Loss, the number of critic iterations per generator iteration n_critic, the batch size m, Adam hyperparameters α, β_1, β_2, certainty preprocessing threshold τ*, and classifier C.\nRequire: Initial critic parameters w_0, initial generator parameters θ_0.\n1: Build confident data distribution P_c from training data P_r using classifier C and threshold τ*\n2: while θ has not converged do\n3:   for t = 1, ..., n_critic do\n4:     for i = 1, ..., m do\n5:       Sample confident data (x, y) ~ P_c, latent variable z ~ p(z), and a random number ε ~ U[0, 1]\n6:       x' ← G_θ(x, z)\n7:       x̂ ← εx + (1 − ε)x'\n8:       L^(i) ← D_w(x', y) − D_w(x, y) + λ_GP(||∇_x̂ D_w(x̂, y)||_2 − 1)^2\n9:     end for\n10:    w ← Adam(∇_w (1/m) Σ_{i=1}^m L^(i), w, α, β_1, β_2)\n11:  end for\n12:  Sample a batch of training data {(x, y)^(i)}_{i=1}^m ~ P_r and latent variables {z^(i)}_{i=1}^m ~ p(z)\n13:  θ ← Adam(∇_θ (1/m) Σ_{i=1}^m [−D_w(G_θ(x, z), y) + λ_Loss Loss(C(G_θ(x, z)))], θ, α, β_1, β_2)\n14: end while\n\n5.1 Datasets\n\nWe use two different datasets in our experiments: the MNIST handwritten digits dataset [23] and the Fashion-MNIST clothing dataset [41]. Both datasets consist of 60,000 training images and 10,000 test images. We further split both training sets into a 50,000-image training set and a 10,000-image validation set. Each example is a 28x28x1 grayscale image associated with a label from one of ten classes. Images are preprocessed by normalizing grayscale values to [0, 1].\n\nBuilding the certain distribution P_c filters each dataset by a varying amount. The average size of the high-certainty training dataset is 47,948 for MNIST Digits and 31,760 for MNIST Fashion.\n\n5.2 Results\n\nFigure 3 and Figure 4 illustrate the mean accuracy for varying certainty rejection thresholds on each dataset, while Table 1 and Table 2 present exact accuracy values on the rejected subset. At every certainty threshold, the BNN+GWIN outperforms the BNN on uncertain observations by up to 35% on MNIST Digits and 20% on MNIST Fashion.
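Step 1 of Algorithm 1, building the confident distribution P_c, can be sketched as a simple filter over the training set; `classify` is a hypothetical `(x) -> (label, certainty)` callable, not the paper's concrete model:

```python
def build_confident_set(xs, ys, classify, tau_star=0.95):
    """Step 1 of Algorithm 1: keep the training pairs that the fixed
    classifier labels correctly with certainty >= tau_star."""
    keep = []
    for x, y in zip(xs, ys):
        label, certainty = classify(x)
        if label == y and certainty >= tau_star:
            keep.append((x, y))
    return keep
```

This filter is why the confident training sets shrink to 47,948 (Digits) and 31,760 (Fashion) images on average, as reported above.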
As the certainty threshold increases, we see the size of the rejected subset increase and the relative gains from the GWIN transformation decrease. However, this is expected, as we begin to reject observations that the BNN already labels correctly with higher certainty. Figure 5 shows the change in certainty of the ground truth label at varying certainty rejection thresholds. Though the GWIN increases certainty in the ground truth label for the majority of observations, it is possible for the GWIN to map an observation to a lower-certainty representation. This suggests that one must carefully tune the rejection function and certainty metrics to minimize the number of correct instances that are mistranslated.\n\nTable 1: Test set accuracy for MNIST Digits on rejected observations using GWIN transformation for the given certainty threshold τ. BNN and BNN+GWIN denote accuracy for the rejected subset using only the BNN and the BNN with GWIN reformulation, respectively. With no rejections (τ = 0), the BNN had an accuracy of 98.0%. Overall Acc. ∆ is the change in accuracy, while % Error ∆ denotes the percent change in error rate for the entire subset when the GWIN is applied to rejected queries. All results are presented as the mean over 10 runs.\n\nτ | % Reject | BNN Acc. | BNN+GWIN Acc. | Rejected Acc. ∆ | Overall Acc. ∆ | % Error ∆\n0.50 | 0.39 | 40.23 ± 8.51 | 75.59 ± 4.22 | 35.36 ± 8.66 | 0.14 ± 0.04 | −6.98 ± 2.08\n0.70 | 1.83 | 54.48 ± 2.21 | 85.07 ± 2.63 | 30.59 ± 2.64 | 0.56 ± 0.06 | −27.55 ± 2.66\n0.80 | 2.74 | 58.91 ± 1.49 | 86.30 ± 1.85 | 27.39 ± 2.03 | 0.75 ± 0.06 | −36.36 ± 1.93\n0.90 | 4.39 | 68.79 ± 2.38 | 86.95 ± 0.97 | 18.16 ± 2.55 | 0.80 ± 0.13 | −40.26 ± 4.19\n0.95 | 6.04 | 73.48 ± 1.66 | 89.34 ± 0.85 | 15.86 ± 2.07 | 0.96 ± 0.13 | −47.45 ± 4.09\n0.99 | 11.00 | 83.54 ± 0.88 | 92.55 ± 0.49 | 9.02 ± 0.94 | 0.99 ± 0.10 | −49.45 ± 3.16\n\nFigure 3: Test set accuracy for MNIST Digits using GWIN transformation for the given certainty threshold τ. Figure 3a shows BNN and BNN+GWIN accuracy on the rejected subset; % Reject represents the percent of the 10,000 observations rejected by the classifier at the current certainty threshold. Figure 3b shows the accuracy of the BNN and BNN+GWIN on the entire test set. All results are presented as the mean over 10 runs and error bars show standard deviation.\n\n6 Related Work\n\nClassifiers and inference networks have been paired with generative adversarial networks in the past, but the goal of these models has been either to learn a mapping from data to latent representations or to improve class-conditional generation [8, 9, 24]. Though GWINs also contain an additional classification network, the objective of the generative network is not solely image synthesis or the uncovering of latent factors, but rather to reprocess observations in order to increase the classifier's generalizability and accuracy.\n\nTo the best of our knowledge, Defense-GAN is the only other instance of pairing a GAN with a classification network to increase performance during inference [34]. Defense-GAN serves as a defense against adversarial examples by using a GAN to "denoise" perturbed images prior to classification. A WGAN is first trained to capture the unperturbed training distribution.
Before labeling a new observation x, the image is projected onto the range of the generator by minimizing the reconstruction error,\n\n\min_z \|G(z) - x\|_2^2,\n\nusing L steps of gradient descent for R different samples of z.\n\nTable 2: Test set accuracy for MNIST Fashion on rejected observations using GWIN transformation for the given certainty threshold τ. BNN and BNN+GWIN denote accuracy for the rejected subset using only the BNN and the BNN with GWIN reformulation, respectively. With no rejections (τ = 0), the BNN had an accuracy of 87.4%. Overall Acc. ∆ denotes the change in accuracy, while % Error ∆ denotes the percent change in error rate for the entire subset when the GWIN is applied to rejected queries. All results are presented as the mean over 10 runs.\n\nτ | % Reject | BNN Acc. | BNN+GWIN Acc. | Rejected Acc. ∆ | Overall Acc. ∆ | % Error ∆\n0.50 | 4.18 | 40.52 ± 2.36 | 59.43 ± 2.30 | 18.91 ± 3.61 | 0.79 ± 0.17 | −6.22 ± 1.24\n0.70 | 15.25 | 52.08 ± 1.55 | 66.95 ± 0.67 | 14.87 ± 1.78 | 2.27 ± 0.30 | −18.08 ± 1.98\n0.80 | 21.21 | 57.87 ± 0.89 | 69.16 ± 0.47 | 11.29 ± 0.87 | 2.39 ± 0.19 | −19.25 ± 1.32\n0.90 | 30.29 | 64.14 ± 0.66 | 73.18 ± 0.73 | 9.04 ± 0.83 | 2.74 ± 0.29 | −21.63 ± 1.85\n0.95 | 37.30 | 68.93 ± 0.49 | 76.06 ± 0.43 | 7.14 ± 0.61 | 2.66 ± 0.25 | −21.15 ± 1.61\n0.99 | 51.97 | 76.55 ± 0.30 | 81.34 ± 0.26 | 4.79 ± 0.34 | 2.49 ± 0.19 | −19.94 ± 1.30\n\nFigure 4: Test set accuracy for MNIST Fashion using GWIN transformation for the given certainty threshold τ. Figure 4a shows BNN and BNN+GWIN accuracy on the rejected subset; % Reject represents the percent of the 10,000 observations rejected by the classifier at the current certainty threshold. Figure 4b shows the accuracy of the BNN and BNN+GWIN on the entire test set. All results are presented as the mean over 10 runs and error bars show standard deviation.\n\nThough both Defense-GAN and GWINs use WGAN-based implementations to improve classifier inference, there are a number of differences between these two generative models that stem from the differences in the problems they attempt to solve:\n\n• Defense-GAN aims to denoise adversarial examples by projecting images back onto the real data set while minimizing reconstruction loss. However, this assumes that there exists a denoised equivalent of each observation in the real dataset. GWINs, on the other hand, use a conditional WGAN in order to create high-certainty representations of the same class as the original observation.\n\n• Defense-GAN preprocesses all input to the classifier, incurring the cost of the R × L generations needed to label each observation. GWINs only transform rejected observations and require at most a single pass through the generator. We include notes on transformation latency for the MNIST experiments in the appendix.\n\n• GWINs make stronger assumptions about the classifier than Defense-GAN, requiring a certainty metric and reject function, but can be used for any classification task and are not limited to adversarial robustness.\n\n• GWINs use the fixed classifier during training, while Defense-GAN is trained independently.\n\nThe novel contribution of GWINs is using the generative network to learn the confident distribution P_c of a certainty-based classifier.
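For contrast with the GWIN's single generator pass, the Defense-GAN projection described above (minimizing ||G(z) − x||² over z) can be sketched as follows; the finite-difference gradient and toy step sizes are illustrative assumptions, not Defense-GAN's actual implementation:

```python
import numpy as np

def project_to_generator(x, G, rng, steps=50, restarts=2, lr=0.1):
    """Defense-GAN-style projection: minimize ||G(z) - x||_2^2 over z by
    gradient descent, keeping the best of several random restarts. The
    finite-difference gradient stands in for autodiff."""
    best, best_err = None, np.inf
    for _ in range(restarts):
        z = rng.standard_normal(x.shape)
        for _ in range(steps):
            e0 = np.sum((G(z) - x) ** 2)
            grad = np.zeros_like(z)
            for i in range(z.size):  # numerical gradient in z
                dz = np.zeros_like(z)
                dz.flat[i] = 1e-4
                grad.flat[i] = (np.sum((G(z + dz) - x) ** 2) - e0) / 1e-4
            z -= lr * grad
        err = np.sum((G(z) - x) ** 2)
        if err < best_err:
            best, best_err = G(z), err
    return best
```

The nested `restarts × steps` loop is the R × L cost paid per observation, which the GWIN avoids by transforming in a single forward pass.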
The WGWIN-GP is just one possible implementation of this idea; though Defense-GAN is structured differently to address adversarial examples, one could imagine a similar method being applied as a new GWIN implementation. We leave this for future work.\n\nSimilarly to both Defense-GAN and GWINs, MagNet [26] is a framework that contains a detector network, which learns to differentiate between normal and adversarial examples, and a reformer network, which moves adversarial examples towards the manifold of normal examples in order to protect against adversarial examples with small perturbations. Though this appears to be the second-closest model to GWINs, MagNet relies on auto-encoders and also focuses on increasing a model's robustness to adversarial examples rather than making use of classifier certainty to label novel examples from the normal manifold.\n\nFigure 5: Change in rejected sample certainty of the ground truth label for varying certainty rejection thresholds τ, for (a) MNIST Digits and (b) MNIST Fashion. Outliers are those values that fall outside of 1.5 IQR and are denoted with diamonds.\n\nOther common strategies for denoising adversarial examples do not translate well to the uncertainty-rejection paradigm. Network distillation [31] trains a classifier such that it is nearly impossible to generate adversarial examples using gradient-based attacks. However, novel observations that might make a classifier uncertain in its predictions are not necessarily generated in an adversarial manner, and thus we have no need to mask the network's gradients. Adversarial training [18] is specific to the attack generating the adversarial examples and does not necessarily generalize well to other attacks.
Methods that generate additional training data, similar to hallucination methods in the few-shot learning domain [1, 20, 38], aim to increase the robustness of a classifier during training by generating out-of-distribution training data. In contrast, our method assumes a fixed, pretrained classifier and uses generative methods to translate novel, out-of-distribution examples to the confident distribution during inference. Since the GWIN framework learns representations that the classifier labels correctly with high confidence, these generative denoising methods can easily be paired with our framework: a classifier is trained using the aforementioned techniques, and the GWIN is then used to transform any novel examples to which the new classifier is not entirely robust. As with Defense-GAN and MagNet, the flexibility and additive nature of our framework mean that we can easily build atop these existing denoising methodologies. Since noise represents only a subset of out-of-distribution observations, we cannot rely entirely on denoising techniques to address classifier robustness. GWINs take a step towards a generalizable, principled framework for "rethinking" uncertain examples and leveraging classifier uncertainty.

7 Conclusion

In this work, we outlined Generative Well-intentioned Networks (GWINs), a novel framework leveraging uncertainty and generative networks to increase classifier accuracy. We proposed a high-level architecture making use of certainty-based classifiers, a rejection function, and a generative network. We defined a baseline implementation, the Wasserstein GWIN with gradient penalty (WGWIN-GP), and empirically showed that the WGWIN-GP outperforms the base Bayesian neural network at all certainty thresholds.
This paper has demonstrated the viability of the GWIN framework, and we hope that our work leads to further study of the use of generative networks to aid classifier inference.

Acknowledgements

This work was supported by the National Key Research and Development Program of China (No. 2017YFA0700904), NSFC Projects (Nos. 61620106010, 61621136008, 61571261), Beijing NSF Project (No. L172037), Beijing Academy of Artificial Intelligence (BAAI), Tiangong Institute for Intelligent Computing, the JP Morgan Faculty Research Program, and the NVIDIA NVAIL Program with GPU/DGX Acceleration.

References

[1] Antreas Antoniou, Amos Storkey, and Harrison Edwards. Data augmentation generative adversarial networks, 2017.

[2] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein GAN, 2017.

[3] Peter Bartlett and Marten Wegkamp. Classification with a reject option using a hinge loss. Journal of Machine Learning Research, 9(8):1823–1840, 2008.

[4] Chi-Keung Chow. An optimum character recognition system using decision functions. IRE Transactions on Electronic Computers, (4):247–254, 1957.

[5] Chi-Keung Chow. On optimum recognition error and reject tradeoff. IEEE Transactions on Information Theory, 16(1):41–46, 1970.

[6] Corinna Cortes, Giulia DeSalvo, and Mehryar Mohri. Learning with rejection. In International Conference on Algorithmic Learning Theory, pages 67–82. Springer, 2016.

[7] Joshua V. Dillon, Ian Langmore, Dustin Tran, Eugene Brevdo, Srinivas Vasudevan, Dave Moore, Brian Patton, Alex Alemi, Matt Hoffman, and Rif A. Saurous. TensorFlow Distributions, 2017.

[8] Jeff Donahue, Philipp Krähenbühl, and Trevor Darrell. Adversarial feature learning. In International Conference on Learning Representations, 2017.

[9] Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky, and Aaron Courville.
Adversarially learned inference. In International Conference on Learning Representations, 2017.

[10] Yarin Gal. Uncertainty in Deep Learning. PhD thesis, University of Cambridge, 2016.

[11] Yarin Gal and Zoubin Ghahramani. Bayesian convolutional neural networks with Bernoulli approximate variational inference, 2015.

[12] Yarin Gal and Zoubin Ghahramani. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In International Conference on Machine Learning, pages 1050–1059, 2016.

[13] Yarin Gal and Zoubin Ghahramani. A theoretically grounded application of dropout in recurrent neural networks. In Advances in Neural Information Processing Systems, pages 1019–1027, 2016.

[14] Yonatan Geifman and Ran El-Yaniv. Selective classification for deep neural networks. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS'17, pages 4885–4894, USA, 2017. Curran Associates Inc.

[15] Yonatan Geifman and Ran El-Yaniv. SelectiveNet: A deep neural network with an integrated reject option. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 2151–2159, Long Beach, California, USA, 09–15 Jun 2019. PMLR.

[16] Zoubin Ghahramani. Probabilistic machine learning and artificial intelligence. Nature, 521(7553):452, 2015.

[17] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.

[18] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy.
Explaining and harnessing adversarial examples, 2014.

[19] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C Courville. Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems, pages 5767–5777, 2017.

[20] Bharath Hariharan and Ross Girshick. Low-shot visual recognition by shrinking and hallucinating features. In Proceedings of the IEEE International Conference on Computer Vision, pages 3018–3027, 2017.

[21] Durk P Kingma, Tim Salimans, and Max Welling. Variational dropout and the local reparameterization trick. In Advances in Neural Information Processing Systems, pages 2575–2583, 2015.

[22] Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. In Advances in Neural Information Processing Systems, pages 6402–6413, 2017.

[23] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.

[24] Chongxuan Li, Taufik Xu, Jun Zhu, and Bo Zhang. Triple generative adversarial nets. In Advances in Neural Information Processing Systems, pages 4088–4098, 2017.

[25] David JC MacKay. A practical Bayesian framework for backpropagation networks. Neural Computation, 4(3):448–472, 1992.

[26] Dongyu Meng and Hao Chen. MagNet: A two-pronged defense against adversarial examples. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS '17, pages 135–147, New York, NY, USA, 2017. ACM.

[27] Mehdi Mirza and Simon Osindero. Conditional generative adversarial nets, 2014.

[28] Takeru Miyato and Masanori Koyama. cGANs with projection discriminator. In International Conference on Learning Representations, 2018.

[29] Radford M Neal. Bayesian learning for neural networks.
PhD thesis, University of Toronto, 1995.

[30] Augustus Odena, Christopher Olah, and Jonathon Shlens. Conditional image synthesis with auxiliary classifier GANs. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 2642–2651. JMLR.org, 2017.

[31] Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami. Distillation as a defense to adversarial perturbations against deep neural networks. In 2016 IEEE Symposium on Security and Privacy (SP), pages 582–597. IEEE, 2016.

[32] Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee. Generative adversarial text to image synthesis. In Proceedings of the 33rd International Conference on Machine Learning - Volume 48, ICML'16, pages 1060–1069. JMLR.org, 2016.

[33] Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. Improved techniques for training GANs. In Advances in Neural Information Processing Systems, pages 2234–2242, 2016.

[34] Pouya Samangouei, Maya Kabkab, and Rama Chellappa. Defense-GAN: Protecting classifiers against adversarial attacks using generative models. In International Conference on Learning Representations, 2018.

[35] Jiaxin Shi, Jianfei Chen, Jun Zhu, Shengyang Sun, Yucen Luo, Yihong Gu, and Yuhao Zhou. ZhuSuan: A library for Bayesian deep learning. arXiv:1709.05870, 2017.

[36] Jiaxin Shi, Shengyang Sun, and Jun Zhu. A spectral approach to gradient estimation for implicit distributions. In International Conference on Machine Learning (ICML), 2018.

[37] Francesco Tortorella. An optimal reject rule for binary classifiers. In Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), pages 611–620.
Springer, 2000.

[38] Yu-Xiong Wang, Ross Girshick, Martial Hebert, and Bharath Hariharan. Low-shot learning from imaginary data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7278–7286, 2018.

[39] Ziyu Wang, Tongzheng Ren, Jun Zhu, and Bo Zhang. Function space particle optimization for Bayesian neural networks. In International Conference on Learning Representations (ICLR 2019), 2019.

[40] Yeming Wen, Paul Vicol, Jimmy Ba, Dustin Tran, and Roger Grosse. Flipout: Efficient pseudo-independent weight perturbations on mini-batches. In International Conference on Learning Representations, 2018.

[41] Han Xiao, Kashif Rasul, and Roland Vollgraf. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms, 2017.

[42] Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, and Dimitris N Metaxas. StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 5907–5915, 2017.