{"title": "Avoiding Discrimination through Causal Reasoning", "book": "Advances in Neural Information Processing Systems", "page_first": 656, "page_last": 666, "abstract": "Recent work on fairness in machine learning has focused on various statistical discrimination criteria and how they trade off. Most of these criteria are observational: They depend only on the joint distribution of predictor, protected attribute, features, and outcome. While convenient to work with, observational criteria have severe inherent limitations that prevent them from resolving matters of fairness conclusively.  Going beyond observational criteria, we frame the problem of discrimination based on protected attributes in the language of causal reasoning. This viewpoint shifts attention from \"What is the right fairness criterion?\" to \"What do we want to assume about our model of the causal data generating process?\" Through the lens of causality, we make several contributions. First, we crisply articulate why and when observational criteria fail, thus formalizing what was before a matter of opinion. Second, our approach exposes previously ignored subtleties and why they are fundamental to the problem. 
Finally, we put forward natural causal non-discrimination criteria and develop algorithms that satisfy them.", "full_text": "Avoiding Discrimination through Causal Reasoning\n\nNiki Kilbertus\u2020\u2021\n\nnkilbertus@tue.mpg.de\n\nMateo Rojas-Carulla\u2020\u2021\nmrojas@tue.mpg.de\n\nGiambattista Parascandolo\u2020\u00a7\ngparascandolo@tue.mpg.de\n\nMoritz Hardt\u2217\n\nDominik Janzing\u2020\n\nhardt@berkeley.edu\n\njanzing@tue.mpg.de\n\nBernhard Sch\u00a8olkopf\u2020\n\nbs@tue.mpg.de\n\n\u2020Max Planck Institute for Intelligent Systems\n\u00a7Max Planck ETH Center for Learning Systems\n\n\u2021University of Cambridge\n\n\u2217University of California, Berkeley\n\nAbstract\n\nRecent work on fairness in machine learning has focused on various statistical\ndiscrimination criteria and how they trade off. Most of these criteria are observa-\ntional: They depend only on the joint distribution of predictor, protected attribute,\nfeatures, and outcome. While convenient to work with, observational criteria have\nsevere inherent limitations that prevent them from resolving matters of fairness\nconclusively.\nGoing beyond observational criteria, we frame the problem of discrimination\nbased on protected attributes in the language of causal reasoning. This view-\npoint shifts attention from \u201cWhat is the right fairness criterion?\u201d to \u201cWhat do we\nwant to assume about our model of the causal data generating process?\u201d Through\nthe lens of causality, we make several contributions. First, we crisply articulate\nwhy and when observational criteria fail, thus formalizing what was before a mat-\nter of opinion. Second, our approach exposes previously ignored subtleties and\nwhy they are fundamental to the problem. Finally, we put forward natural causal\nnon-discrimination criteria and develop algorithms that satisfy them.\n\n1\n\nIntroduction\n\nAs machine learning progresses rapidly, its societal impact has come under scrutiny. 
An important\nconcern is potential discrimination based on protected attributes such as gender, race, or religion.\nSince learned predictors and risk scores increasingly support or even replace human judgment, there\nis an opportunity to formalize what harmful discrimination means and to design algorithms that\navoid it. However, researchers have found it difficult to agree on a single measure of discrimination.\nAs of now, there are several competing approaches, representing different opinions and striking\ndifferent trade-offs. Most of the proposed fairness criteria are observational: They depend only\non the joint distribution of predictor R, protected attribute A, features X, and outcome Y. For\nexample, the natural requirement that R and A must be statistically independent is referred to as\ndemographic parity. Some approaches transform the features X to obfuscate the information they\ncontain about A [1]. The recently proposed equalized odds constraint [2] demands that the predictor\nR and the attribute A be independent conditional on the actual outcome Y. All three are examples\nof observational approaches.\n\n31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.\n\nA growing line of work points at the insufficiency of existing definitions. Hardt, Price and Srebro [2]\nconstruct two scenarios with intuitively different social interpretations that admit identical joint\ndistributions over (R, A, Y, X). Thus, no observational criterion can distinguish them. While there\nare non-observational criteria, notably the early work on individual fairness [3], these have not yet\ngained traction. 
So, it might appear that the community has reached an impasse.\n\n1.1 Our contributions\n\nWe assay the problem of discrimination in machine learning in the language of causal reasoning.\nThis viewpoint supports several contributions:\n\n\u2022 Revisiting the two scenarios proposed in [2], we articulate a natural causal criterion that\nformally distinguishes them. In particular, we show that observational criteria are unable\nto determine if a protected attribute has direct causal in\ufb02uence on the predictor that is not\nmitigated by resolving variables.\n\n\u2022 We point out subtleties in fair decision making that arise naturally from a causal perspec-\ntive, but have gone widely overlooked in the past. Speci\ufb01cally, we formally argue for the\nneed to distinguish between the underlying concept behind a protected attribute, such as\nrace or gender, and its proxies available to the algorithm, such as visual features or name.\n\u2022 We introduce and discuss two natural causal criteria centered around the notion of inter-\nventions (relative to a causal graph) to formally describe speci\ufb01c forms of discrimination.\n\u2022 Finally, we initiate the study of algorithms that avoid these forms of discrimination. Under\ncertain linearity assumptions about the underlying causal model generating the data, an\nalgorithm to remove a speci\ufb01c kind of discrimination leads to a simple and natural heuristic.\n\nAt a higher level, our work proposes a shift from trying to \ufb01nd a single statistical fairness criterion\nto arguing about properties of the data and which assumptions about the generating process are\njusti\ufb01ed. Causality provides a \ufb02exible framework for organizing such assumptions.\n\n1.2 Related work\n\nDemographic parity and its variants have been discussed in numerous papers, e.g., [1, 4\u20136]. While\ndemographic parity is easy to work with, the authors of [3] already highlighted its insuf\ufb01ciency as\na fairness constraint. 
In an attempt to remedy the shortcomings of demographic parity, [2] proposed\ntwo notions, equal opportunity and equal odds, that were also considered in [7]. A review of various\nfairness criteria can be found in [8], where they are discussed in the context of criminal justice.\nIn [9, 10] it has been shown that imperfect predictors cannot simultaneously satisfy equal odds and\ncalibration unless the groups have identical base rates, i.e. rates of positive outcomes.\nA starting point for our investigation is the unidentifiability result of [2]. It shows that observa-\ntional criteria are too weak to distinguish two intuitively very different scenarios. However, the\nwork does not provide a formal mechanism to articulate why and how these scenarios should be\nconsidered different. Inspired by Pearl\u2019s causal interpretation of Simpson\u2019s paradox [11, Section 6],\nwe propose causality as a way of coping with this unidentifiability result.\nAn interesting non-observational fairness definition is the notion of individual fairness [3] that as-\nsumes the existence of a similarity measure on individuals, and requires that any two similar individ-\nuals should receive a similar distribution over outcomes. More recent work lends additional support\nto such a definition [12]. From the perspective of causality, the idea of a similarity measure is akin\nto the method of matching in counterfactual reasoning [13, 14]. That is, evaluating approximate\ncounterfactuals by comparing individuals with similar values of covariates excluding the protected\nattribute.\nRecently, [15] put forward one possible causal definition, namely the notion of counterfactual fair-\nness. It requires modeling counterfactuals on a per individual level, which is a delicate task. Even\ndetermining the effect of race at the group level is difficult; see the discussion in [16]. 
The goal of\nour paper is to assay a more general causal framework for reasoning about discrimination in machine\nlearning without committing to a single fairness criterion, and without committing to evaluating in-\ndividual causal effects. In particular, we draw an explicit distinction between the protected attribute\n(for which interventions are often impossible in practice) and its proxies (which sometimes can be\nintervened upon).\n\n2\n\n\fMoreover, causality has already been employed for the discovery of discrimination in existing data\nsets by [14, 17]. Causal graphical conditions to identify meaningful partitions have been proposed\nfor the discovery and prevention of certain types of discrimination by preprocessing the data [18].\nThese conditions rely on the evaluation of path speci\ufb01c effects, which can be traced back all the\nway to [11, Section 4.5.3]. The authors of [19] recently picked up this notion and generalized\nPearl\u2019s approach by a constraint based prevention of discriminatory path speci\ufb01c effects arising\nfrom counterfactual reasoning. Our research was done independently of these works.\n\n1.3 Causal graphs and notation\n\nCausal graphs are a convenient way of organizing assumptions about the data generating process.\nWe will generally consider causal graphs involving a protected attribute A, a set of proxy variables P,\nfeatures X, a predictor R and sometimes an observed outcome Y. For background on causal graphs\nsee [11]. In the present paper a causal graph is a directed, acyclic graph whose nodes represent\nrandom variables. A directed path is a sequence of distinct nodes V1, . . . , Vk, for k \u2265 2, such\nthat Vi \u2192 Vi+1 for all i \u2208 {1, . . . , k \u2212 1}. We say a directed path is blocked by a set of nodes Z,\nwhere V1, Vk /\u2208 Z, if Vi \u2208 Z for some i \u2208 {2, . . . , k \u2212 1}.1\nA structural equation model is a set of equations Vi = fi(pa(Vi), Ni), for i \u2208 {1, . . . 
, n},\nwhere pa(Vi) are the parents of Vi, i.e. its direct causes, and the Ni are independent noise vari-\nables. We interpret these equations as assignments. Because we assume acyclicity, starting from\nthe roots of the graph, we can recursively compute the other variables, given the noise variables.\nThis leads us to view the structural equation model and its corresponding graph as a data gener-\nating model. The predictor R maps inputs, e.g., the features X, to a predicted output. Hence we\nmodel it as a childless node, whose parents are its input variables. Finally, note that given the noise\nvariables, a structural equation model entails a unique joint distribution; however, the same joint\ndistribution can usually be entailed by multiple structural equation models corresponding to distinct\ncausal structures.\n\n2 Unresolved discrimination and limitations of observational criteria\n\nA\n\nTo bear out the limitations of observational criteria, we turn to\nPearl\u2019s commentary on claimed gender discrimination in Berke-\nley college admissions [11, Section 4.5.3]. Bickel [20] had shown\nearlier that a lower college-wide admission rate for women than\nfor men was explained by the fact that women applied in more\ncompetitive departments. When adjusted for department choice,\nwomen experienced a slightly higher acceptance rate compared\nwith men. From the causal point of view, what matters is the di-\nrect effect of the protected attribute (here, gender A) on the deci-\nsion (here, college admission R) that cannot be ascribed to a re-\nsolving variable such as department choice X, see Figure 1. We\nshall use the term resolving variable for any variable in the causal\ngraph that is in\ufb02uenced by A in a manner that we accept as non-\ndiscriminatory. With this convention, the criterion can be stated as\nfollows.\nDe\ufb01nition 1 (Unresolved discrimination). 
A variable V in a causal graph exhibits unresolved dis-\ncrimination if there exists a directed path from A to V that is not blocked by a resolving variable\nand V itself is non-resolving.\n\nFigure 1: The admission decision R does not only directly depend on gender A, but\nalso on department choice X, which in turn is also affected by gender A.\n\nPearl\u2019s commentary is consistent with what we call the skeptic viewpoint. All paths from the pro-\ntected attribute A to R are problematic, unless they are justified by a resolving variable. The pres-\nence of unresolved discrimination in the predictor R is worrisome and demands further scrutiny.\nIn practice, R is not a priori part of a given graph. Instead it is our objective to construct it as a\nfunction of the features X, some of which might be resolving. Hence we should first look for unre-\nsolved discrimination in the features. A canonical way to avoid unresolved discrimination in R is to\nonly input the set of features that do not exhibit unresolved discrimination. However, the remaining\nfeatures might be affected by non-resolving and resolving variables. In Section 4 we investigate\nwhether one can exclusively remove unresolved discrimination from such features. A related notion\nof \u201cexplanatory features\u201d in a non-causal setting was introduced in [21].\n\n1 As it is not needed in our work, we do not discuss the graph-theoretic notion of d-separation.\n\nThe definition of unresolved discrimination in\na predictor has some interesting special cases\nworth highlighting. If we take the set of resolv-\ning variables to be empty, we intuitively get a\ncausal analog of demographic parity. No di-\nrected paths from A to R are allowed, but A\nand R can still be statistically dependent. 
Simi-\nlarly, if we choose the set of resolving variables\nto be the singleton set {Y } containing the true\noutcome, we obtain a causal analog of equal-\nized odds where strict independence is not nec-\nessary. The causal intuition implied by \u201cthe\nprotected attribute should not affect the predic-\ntion\u201d, and \u201cthe protected attribute can only af-\nfect the prediction when the information comes\nthrough the true label\u201d, is neglected by (con-\nditional) statistical independences A\u22a5\u22a5 R, and A\u22a5\u22a5 R | Y , but well captured by only considering\ndependences mitigated along directed causal paths.\nWe will next show that observational criteria are fundamentally unable to determine whether a pre-\ndictor exhibits unresolved discrimination or not. This is true even if the predictor is Bayes optimal.\nIn passing, we also note that fairness criteria such as equalized odds may or may not exhibit unre-\nsolved discrimination, but this is again something an observational criterion cannot determine.\nTheorem 1. Given a joint distribution over the protected attribute A, the true label Y , and some\nfeatures X1, . . . , Xn, in which we have already speci\ufb01ed the resolving variables, no observational\ncriterion can generally determine whether the Bayes optimal unconstrained predictor or the Bayes\noptimal equal odds predictor exhibit unresolved discrimination.\n\nFigure 2: Two graphs that may generate the same\njoint distribution for the Bayes optimal uncon-\nstrained predictor R\u2217. If X1 is a resolving vari-\nable, R\u2217 exhibits unresolved discrimination in the\nright graph (along the red paths), but not in the left\none.\n\nAll proofs for the statements in this paper are in the supplementary material.\nThe two graphs in Figure 2 are taken from [2], which we here reinterpret in the causal context to\nprove Theorem 1. 
We point out that there is an established set of conditions under which unresolved\ndiscrimination can, in fact, be determined from observational data. Note that the two graphs are\nnot Markov equivalent. Therefore, to obtain the same joint distribution we must violate a condition\ncalled faithfulness.2 We later argue that violation of faithfulness is by no means pathological, but\nemerges naturally when designing predictors. In any case, interpreting conditional dependences can\nbe dif\ufb01cult in practice [22].\n\n3 Proxy discrimination and interventions\n\nWe now turn to an important aspect of our framework. Determining causal effects in general requires\nInterventions on deeply rooted individual properties such as gender or\nmodeling interventions.\nrace are notoriously dif\ufb01cult to conceptualize\u2014especially at an individual level, and impossible to\nperform in a randomized trial. VanderWeele et al. [16] discuss the problem comprehensively in an\nepidemiological setting. From a machine learning perspective, it thus makes sense to separate the\nprotected attribute A from its potential proxies, such as name, visual features, languages spoken at\nhome, etc. Intervention based on proxy variables poses a more manageable problem. By deciding on\na suitable proxy we can \ufb01nd an adequate mounting point for determining and removing its in\ufb02uence\non the prediction. Moreover, in practice we are often limited to imperfect measurements of A in any\ncase, making the distinction between root concept and proxy prudent.\nAs was the case with resolving variables, a proxy is a priori nothing more than a descendant of A in\nthe causal graph that we choose to label as a proxy. 
Nevertheless in reality we envision the proxy\nto be a clearly defined observable quantity that is significantly correlated with A, yet in our view\nshould not affect the prediction.\n\n2 If we do assume the Markov condition and faithfulness, then conditional independences determine the\ngraph up to its so called Markov equivalence class.\n\nDefinition 2 (Potential proxy discrimination). A variable V in a causal graph exhibits potential\nproxy discrimination, if there exists a directed path from A to V that is blocked by a proxy variable\nand V itself is not a proxy.\n\nPotential proxy discrimination articulates a causal criterion that is in a sense dual to unresolved\ndiscrimination. From the benevolent viewpoint, we allow any path from A to R unless it passes\nthrough a proxy variable, which we consider worrisome. This viewpoint acknowledges the fact that\nthe influence of A on the graph may be complex and it can be too restraining to rule out all but a few\ndesignated features. In practice, as with unresolved discrimination, we can naively build an uncon-\nstrained predictor based only on those features that do not exhibit potential proxy discrimination.\nThen we must not provide P as input to R; unawareness, i.e. excluding P from the inputs of R,\nsuffices. However, by granting R access to P , we can carefully tune the function R(P, X) to cancel\nthe implicit influence of P on features X that exhibit potential proxy discrimination by the explicit\ndependence on P . Due to this possible cancellation of paths, we called the path based criterion po-\ntential proxy discrimination. When building predictors that exhibit no overall proxy discrimination,\nwe precisely aim for such a cancellation.\nFortunately, this idea can be conveniently expressed by an intervention on P , which is denoted\nby do(P = p) [11]. 
Visually, intervening on P amounts to removing all incoming arrows of P in\nthe graph; algebraically, it consists of replacing the structural equation of P by P = p, i.e. we put\npoint mass on the value p.\nDefinition 3 (Proxy discrimination). A predictor R exhibits no proxy discrimination based on a\nproxy P if for all p, p\u2032\n\nP(R | do(P = p)) = P(R | do(P = p\u2032)) .   (1)\n\nThe interventional characterization of proxy discrimination leads to a simple procedure to remove\nit in causal graphs that we will turn to in the next section. It also leads to several natural variants\nof the definition that we discuss in Section 4.3. We remark that Equation (1) is an equality of\nprobabilities in the \u201cdo-calculus\u201d that cannot in general be inferred by an observational method,\nbecause it depends on an underlying causal graph, see the discussion in [11]. However, in some\ncases, we do not need to resort to interventions to avoid proxy discrimination.\nProposition 1. If there is no directed path from a proxy to a feature, unawareness avoids proxy\ndiscrimination.\n\n4 Procedures for avoiding discrimination\n\nHaving motivated the two types of discrimination that we distinguish, we now turn to building\npredictors that avoid them in a given causal model. First, we remark that a more comprehensive\ntreatment requires individual judgement of not only variables, but the legitimacy of every existing\npath that ends in R, i.e. evaluation of path-specific effects [18, 19], which is tedious in practice.\nThe natural concept of proxies and resolving variables covers most relevant scenarios and allows for\nnatural removal procedures.\n\n4.1 Avoiding proxy discrimination\n\nWhile presenting the general procedure, we illustrate each step in the example shown in Figure 3.\nA protected attribute A affects a proxy P as well as a feature X. Both P and X have additional\nunobserved causes NP and NX, where NP , NX , A are pairwise independent. 
Finally, the proxy also\nhas an effect on the features X and the predictor R is a function of P and X. Given labeled training\ndata, our task is to find a good predictor that exhibits no proxy discrimination within a hypothesis\nclass of functions R\u03b8(P, X) parameterized by a real valued vector \u03b8.\nWe now work out a formal procedure to solve this task under specific assumptions and simultane-\nously illustrate it in a fully linear example, i.e. the structural equations are given by\n\nP = \u03b1P A + NP ,   X = \u03b1X A + \u03b2P + NX ,   R\u03b8 = \u03bbP P + \u03bbX X .\n\nNote that we choose linear functions parameterized by \u03b8 = (\u03bbP , \u03bbX ) as the hypothesis class\nfor R\u03b8(P, X).\n\nFigure 3: A template graph \u02dcG for proxy discrimination (left) with its intervened version G\n(right). While from the benevolent viewpoint we do not generically prohibit any influence from A\non R, we want to guarantee that the proxy P has no overall influence on the prediction, by adjusting\nP \u2192 R to cancel the influence along P \u2192 X \u2192 R in the intervened graph.\n\nFigure 4: A template graph \u02dcG for unresolved discrimination (left) with its intervened version G\n(right). While from the skeptical viewpoint we generically do not want A to influence R, we first\nintervene on E, interrupting all paths through E, and only cancel the remaining influence of A on R.\n\nWe will refer to the terminal ancestors of a node V in a causal graph D, denoted by ta^D(V ), which\nare those ancestors of V that are also root nodes of D. Moreover, in the procedure we clarify the\nnotion of expressibility, which is an assumption about the relation of the given structural equations\nand the hypothesis class we choose for R\u03b8.\nProposition 2. 
If there is a choice of parameters \u03b80 such that R\u03b80 (P, X) is constant with respect\nto its first argument and the structural equations are expressible, the following procedure returns a\npredictor from the given hypothesis class that exhibits no proxy discrimination and is non-trivial in\nthe sense that it can make use of features that exhibit potential proxy discrimination.\n\n1. Intervene on P by removing all incoming arrows and replacing the structural equation for P\nby P = p. For the example in Figure 3,\n\nP = p,   X = \u03b1X A + \u03b2P + NX ,   R\u03b8 = \u03bbP P + \u03bbX X .   (2)\n\n2. Iteratively substitute variables in the equation for R\u03b8 from their structural equations until only\nroot nodes of the intervened graph are left, i.e. write R\u03b8(P, X) as R\u03b8(P, g(ta^G(X))) for some\nfunction g. In the example, ta(X) = {A, P, NX} and\n\nR\u03b8 = (\u03bbP + \u03bbX \u03b2)p + \u03bbX (\u03b1X A + NX ) .   (3)\n\n3. We now require the distribution of R\u03b8 in (3) to be independent of p, i.e. for all p, p\u2032\n\nP((\u03bbP + \u03bbX \u03b2)p + \u03bbX (\u03b1X A + NX )) = P((\u03bbP + \u03bbX \u03b2)p\u2032 + \u03bbX (\u03b1X A + NX )) .   (4)\n\nWe seek to write the predictor as a function of P and all the other roots of G separately. If our\nhypothesis class is such that there exists \u02dc\u03b8 such that R\u03b8(P, g(ta(X))) = R\u02dc\u03b8(P, \u02dcg(ta(X)\\{P})),\nwe call the structural equation model and hypothesis class specified in (2) expressible. In our\nexample, this is possible with \u02dc\u03b8 = (\u03bbP + \u03bbX \u03b2, \u03bbX ) and \u02dcg = \u03b1X A + NX. Equation (4) then\nyields the non-discrimination constraint \u02dc\u03b8 = \u03b80. Here, a possible \u03b80 is \u03b80 = (0, \u03bbX ), which\nsimply yields \u03bbP = \u2212\u03bbX \u03b2.\n\n4. 
Given labeled training data, we can optimize the predictor R\u03b8 within the hypothesis class as given\nin (2), subject to the non-discrimination constraint. In the example\n\nR\u03b8 = \u2212\u03bbX \u03b2P + \u03bbX X = \u03bbX (X \u2212 \u03b2P ) ,\n\nwith the free parameter \u03bbX \u2208 R.\n\nIn general, the non-discrimination constraint (4) is by construction just P(R | do(P = p)) =\nP(R | do(P = p\u2032)), coinciding with Definition 3. Thus Proposition 2 holds by construction of\nthe procedure. The choice of \u03b80 strongly influences the non-discrimination constraint. However, as\nthe example shows, it allows R\u03b8 to exploit features that exhibit potential proxy discrimination.\n\nFigure 5: Left: A generic graph \u02dcG to describe proxy discrimination. Right: The graph corresponding\nto an intervention on P . The circle labeled \u201cDAG\u201d represents any sub-DAG of \u02dcG and G containing\nan arbitrary number of variables that is compatible with the shown arrows. Dashed arrows can, but\ndo not have to be present in a given scenario.\n\n4.2 Avoiding unresolved discrimination\n\nWe proceed analogously to the previous subsection using the example graph in Figure 4. Instead of\nthe proxy, we consider a resolving variable E. The causal dependences are equivalent to the ones in\nFigure 3 and we again assume linear structural equations\n\nE = \u03b1EA + NE,   X = \u03b1X A + \u03b2E + NX ,   R\u03b8 = \u03bbEE + \u03bbX X .\n\nLet us now try to adjust the previous procedure to the context of avoiding unresolved discrimination.\n\n1. Intervene on E by fixing it to a random variable \u03b7 with P(\u03b7) = P(E), the marginal distribution\nof E in \u02dcG, see Figure 4. In the example we find\n\nE = \u03b7,   X = \u03b1X A + \u03b2E + NX ,   R\u03b8 = \u03bbEE + \u03bbX X .   (5)\n\n2. By iterative substitution write R\u03b8(E, X) as R\u03b8(E, g(ta^G(X))) for some function g, i.e. in the\nexample\n\nR\u03b8 = (\u03bbE + \u03bbX \u03b2)\u03b7 + \u03bbX \u03b1X A + \u03bbX NX .   (6)\n\n3. We now demand the distribution of R\u03b8 in (6) be invariant under interventions on A, which coincides\nwith conditioning on A whenever A is a root of \u02dcG. Hence, in the example, for all a, a\u2032\n\nP((\u03bbE + \u03bbX \u03b2)\u03b7 + \u03bbX \u03b1X a + \u03bbX NX ) = P((\u03bbE + \u03bbX \u03b2)\u03b7 + \u03bbX \u03b1X a\u2032 + \u03bbX NX ) .   (7)\n\nHere, the subtle asymmetry between proxy discrimination and unresolved discrimination becomes\napparent. 
Because R\u03b8 is not explicitly a function of A, we cannot cancel implicit influences of A\nthrough X. There might still be a \u03b80 such that R\u03b80 indeed fulfils (7), but there is no princi-\npled way for us to construct it. In the example, (7) suggests the obvious non-discrimination con-\nstraint \u03bbX = 0. We can then proceed as before and, given labeled training data, optimize R\u03b8 = \u03bbEE\nby varying \u03bbE. However, by setting \u03bbX = 0, we also cancel the path A \u2192 E \u2192 X \u2192 R, even\nthough it is blocked by a resolving variable. In general, if R\u03b8 does not have access to A, we cannot\nadjust for unresolved discrimination without also removing resolved influences from A on R\u03b8.\nIf, however, R\u03b8 is a function of A, i.e. we add the term \u03bbAA to R\u03b8 in (5), the non-discrimination\nconstraint is \u03bbA = \u2212\u03bbX \u03b1X and we can proceed analogously to the procedure for proxies.\n\n4.3 Relating proxy discriminations to other notions of fairness\n\nMotivated by the algorithm to avoid proxy discrimination, we discuss some natural variants of the\nnotion in this section that connect our interventional approach to individual fairness and other pro-\nposed criteria. We consider a generic graph structure as shown on the left in Figure 5. 
The proxy P\nand the features X could be multidimensional. The empty circle in the middle represents any num-\nber of variables forming a DAG that respects the drawn arrows. Figure 3 is an example thereof. All\ndashed arrows are optional depending on the specifics of the situation.\nDefinition 4. A predictor R exhibits no individual proxy discrimination, if for all x and all p, p\u2032\n\nP(R | do(P = p), X = x) = P(R | do(P = p\u2032), X = x) .\n\nA predictor R exhibits no proxy discrimination in expectation, if for all p, p\u2032\n\nE[R | do(P = p)] = E[R | do(P = p\u2032)] .\n\nIndividual proxy discrimination aims at comparing examples with the same features X, for different\nvalues of P . Note that this can be individuals with different values for the unobserved non-feature\nvariables. A true individual-level comparison of the form \u201cWhat would have happened to me, if I\nhad always belonged to another group\u201d is captured by counterfactuals and discussed in [15, 19].\nFor an analysis of proxy discrimination, we need the structural equations for P, X, R in Figure 5\n\nP = \u02c6fP (pa(P )) ,\nX = \u02c6fX (pa(X)) = fX (P, ta^G(X) \\ {P}) ,\nR = \u02c6fR(P, X) = fR(P, ta^G(R) \\ {P}) .\n\nFor convenience, we will use the notation ta^G_P(X) := ta^G(X) \\ {P}. We can find fX , fR\nfrom \u02c6fX , \u02c6fR by first rewriting the functions in terms of root nodes of the intervened graph, shown\non the right side of Figure 5, and then assigning the overall dependence on P to the first argument.\nWe now compare proxy discrimination to other existing notions.\nTheorem 2. Let the influence of P on X be additive and linear, i.e.\n\nX = fX (P, ta^G_P(X)) = gX (ta^G_P(X)) + \u00b5X P\n\nfor some function gX and \u00b5X \u2208 R. Then any predictor of the form\n\nR = r(X \u2212 E[X | do(P )])\n\nfor some function r exhibits no proxy discrimination.\nNote that in general E[X | do(P )] \u2260 E[X | P ]. 
Since in practice we only have observational data\nfrom \u02dcG, one cannot simply build a predictor based on the \u201cregressed out features\u201d \u02dcX := X \u2212\nE[X | P ] to avoid proxy discrimination. In the scenario of Figure 3, the direct effect of P on X\nalong the arrow P \u2192 X in the left graph cannot be estimated by E[X | P ], because of the common\nconfounder A. The desired interventional expectation E[X | do(P )] coincides with E[X | P ] only\nif one of the arrows A \u2192 P or A \u2192 X is not present. Estimating direct causal effects is a hard\nproblem that is well studied by the causality community and often involves instrumental variables [23].\nThis cautions against the natural idea of using \u02dcX as a \u201cfair representation\u201d of X, as it implicitly\nneglects that we often want to remove the effect of proxies and not the protected attribute. Never-\ntheless, the notion agrees with our interventional proxy discrimination in some cases.\nCorollary 1. Under the assumptions of Theorem 2, if all directed paths from any ancestor of P\nto X in the graph G are blocked by P , then any predictor based on the adjusted features \u02dcX :=\nX \u2212 E[X | P ] exhibits no proxy discrimination and can be learned from the observational distribu-\ntion P(P, X, Y ) when target labels Y are available.\n\nOur definition of proxy discrimination in expectation (Definition 4) is motivated by a weaker notion proposed\nin [24]. It asks for the expected outcome to be the same across the different populations, E[R | P =\np] = E[R | P = p\u2032]. Again, when talking about proxies, we must be careful to distinguish conditional\nand interventional expectations, which is captured by the following proposition and its corollary.\nProposition 3. Any predictor of the form R = \u03bb(X \u2212 E[X | do(P )]) + c for \u03bb, c \u2208 R exhibits no\nproxy discrimination in expectation.\n\nFrom this and the proof of Corollary 1 we conclude the following corollary.\nCorollary 2. 
If all directed paths from any ancestor of P to X are blocked by P, any predictor of\nthe form R = r(X \u2212 E[X | P]) for linear r exhibits no proxy discrimination in expectation and can\nbe learned from the observational distribution P(P, X, Y) when target labels Y are available.\n\n5 Conclusion\n\nThe goal of our work is to assay fairness in machine learning within the context of causal reasoning.\nThis perspective naturally addresses shortcomings of earlier statistical approaches. Causal fairness\ncriteria are suitable whenever we are willing to make assumptions about the (causal) generating\nprocess governing the data. Whilst not always feasible, the causal approach naturally creates an\nincentive to scrutinize the data more closely and work out plausible assumptions to be discussed\nalongside any conclusions regarding fairness.\nKey concepts of our conceptual framework are resolving variables and proxy variables that play\na dual role in defining causal discrimination criteria. We develop a practical procedure to remove\nproxy discrimination given the structural equation model and analyze a similar approach for\nunresolved discrimination. In the case of proxy discrimination for linear structural equations, the\nprocedure has an intuitive form that is similar to heuristics already used in the regression literature.\nOur framework is limited by the assumption that we can construct a valid causal graph. The removal\nof proxy discrimination moreover depends on the functional form of the causal dependencies. We\nhave focused on the conceptual and theoretical analysis, and experimental validations are beyond\nthe scope of the present work.\nThe causal perspective suggests a number of interesting new directions at the technical, empirical,\nand conceptual level. 
We hope that the framework and language put forward in our work will be a\nstepping stone for future investigations.\n\nReferences\n\n[1] Richard S Zemel, Yu Wu, Kevin Swersky, Toniann Pitassi, and Cynthia Dwork. \u201cLearning Fair Representations\u201d. In: Proceedings of the International Conference of Machine Learning 28 (2013), pp. 325\u2013333.\n\n[2] Moritz Hardt, Eric Price, Nati Srebro, et al. \u201cEquality of opportunity in supervised learning\u201d. In: Advances in Neural Information Processing Systems. 2016, pp. 3315\u20133323.\n\n[3] Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. \u201cFairness Through Awareness\u201d. In: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference. 2012, pp. 214\u2013226.\n\n[4] Michael Feldman, Sorelle A Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. \u201cCertifying and removing disparate impact\u201d. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2015, pp. 259\u2013268.\n\n[5] Muhammad Bilal Zafar, Isabel Valera, Manuel G\u00f3mez Rodriguez, and Krishna P. Gummadi. \u201cFairness Constraints: Mechanisms for Fair Classification\u201d. In: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics. 2017, pp. 962\u2013970.\n\n[6] Harrison Edwards and Amos Storkey. \u201cCensoring Representations with an Adversary\u201d. In: (Nov. 18, 2015). arXiv: 1511.05897v3.\n\n[7] Muhammad Bilal Zafar, Isabel Valera, Manuel G\u00f3mez Rodriguez, and Krishna P. Gummadi. \u201cFairness Beyond Disparate Treatment & Disparate Impact: Learning Classification Without Disparate Mistreatment\u201d. In: Proceedings of the 26th International Conference on World Wide Web. 2017, pp. 1171\u20131180.\n\n[8] Richard Berk, Hoda Heidari, Shahin Jabbari, Michael Kearns, and Aaron Roth. 
\u201cFairness in Criminal Justice Risk Assessments: The State of the Art\u201d. In: (Mar. 27, 2017). arXiv: 1703.09207v1.\n\n[9] Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan. \u201cInherent Trade-Offs in the Fair Determination of Risk Scores\u201d. In: (Sept. 19, 2016). arXiv: 1609.05807v1.\n\n[10] Alexandra Chouldechova. \u201cFair prediction with disparate impact: A study of bias in recidivism prediction instruments\u201d. In: (Oct. 24, 2016). arXiv: 1610.07524v1.\n\n[11] Judea Pearl. Causality. Cambridge University Press, 2009.\n\n[12] Sorelle A. Friedler, Carlos Scheidegger, and Suresh Venkatasubramanian. \u201cOn the (im)possibility of fairness\u201d. In: (Sept. 23, 2016). arXiv: 1609.07236v1.\n\n[13] Paul R Rosenbaum and Donald B Rubin. \u201cThe central role of the propensity score in observational studies for causal effects\u201d. In: Biometrika (1983), pp. 41\u201355.\n\n[14] Bilal Qureshi, Faisal Kamiran, Asim Karim, and Salvatore Ruggieri. \u201cCausal Discrimination Discovery Through Propensity Score Analysis\u201d. In: (Aug. 12, 2016). arXiv: 1608.03735.\n\n[15] Matt J. Kusner, Joshua R. Loftus, Chris Russell, and Ricardo Silva. \u201cCounterfactual Fairness\u201d. In: (Mar. 20, 2017). arXiv: 1703.06856v1.\n\n[16] Tyler J VanderWeele and Whitney R Robinson. \u201cOn causal interpretation of race in regressions adjusting for confounding and mediating variables\u201d. In: Epidemiology 25.4 (2014), p. 473.\n\n[17] Francesco Bonchi, Sara Hajian, Bud Mishra, and Daniele Ramazzotti. \u201cExposing the probabilistic causal structure of discrimination\u201d. In: (Mar. 8, 2017). arXiv: 1510.00552v3.\n\n[18] Lu Zhang and Xintao Wu. \u201cAnti-discrimination learning: a causal modeling-based framework\u201d. In: International Journal of Data Science and Analytics (2017), pp. 1\u201316.\n\n[19] Razieh Nabi and Ilya Shpitser. \u201cFair Inference On Outcomes\u201d. In: (May 29, 2017). 
arXiv: 1705.10378v1.\n\n[20] Peter J Bickel, Eugene A Hammel, J William O\u2019Connell, et al. \u201cSex bias in graduate admissions: Data from Berkeley\u201d. In: Science 187.4175 (1975), pp. 398\u2013404.\n\n[21] Faisal Kamiran, Indr\u0117 \u017dliobait\u0117, and Toon Calders. \u201cQuantifying explainable discrimination and removing illegal discrimination in automated decision making\u201d. In: Knowledge and information systems 35.3 (2013), pp. 613\u2013644.\n\n[22] Nicholas Cornia and Joris M Mooij. \u201cType-II errors of independence tests can lead to arbitrarily large errors in estimated causal effects: An illustrative example\u201d. In: Proceedings of the Workshop on Causal Inference (UAI). 2014, pp. 35\u201342.\n\n[23] Joshua Angrist and Alan B Krueger. Instrumental variables and the search for identification: From supply and demand to natural experiments. Tech. rep. National Bureau of Economic Research, 2001.\n\n[24] Toon Calders and Sicco Verwer. \u201cThree naive Bayes approaches for discrimination-free classification\u201d. In: Data Mining and Knowledge Discovery 21.2 (2010), pp. 277\u2013292.\n", "award": [], "sourceid": 450, "authors": [{"given_name": "Niki", "family_name": "Kilbertus", "institution": "MPI Tuebingen & Cambridge"}, {"given_name": "Mateo", "family_name": "Rojas Carulla", "institution": "University of Cambridge, Max Planck for Intelligent Systems"}, {"given_name": "Giambattista", "family_name": "Parascandolo", "institution": "Max Planck Institute for Intelligent Systems and and Max Planck ETH CLS"}, {"given_name": "Moritz", "family_name": "Hardt", "institution": "UC Berkeley"}, {"given_name": "Dominik", "family_name": "Janzing", "institution": "MPI T\u00fcbingen"}, {"given_name": "Bernhard", "family_name": "Sch\u00f6lkopf", "institution": "MPI for Intelligent Systems"}]}