NeurIPS 2019
Sun Dec 8th through Sat Dec 14th, 2019, at the Vancouver Convention Center
Paper ID: 8591
Title: On Human-Aligned Risk Minimization

Reviewer 1


The paper presents a novel alternative to ERM based on cumulative prospect theory (CPT), an economic theory built around an inverse S-shaped probability weighting function that transforms the CDF so that small probabilities are inflated and large probabilities are deflated. The only similar works considering a human loss/reward/risk have been studied in bandits and RL; I am not aware of other literature that studies this in the context of supervised learning for classification, although I might be missing something here. The paper is well written and very clear in most of the arguments it makes. There are few to no typos except the ones noted below.

Weaknesses:

0. My first concern is the assumption that a human risk measure is the gold standard when it comes to fairness. There are many reasons to question this assumption. First, humans are poor random number generators; for example, when asked for random integers from 1 to 10, the resulting distribution is highly skewed toward the center. Similarly, if humans perceive higher risk in the tails of a distribution, it does not necessarily mean that minimizing such risk makes the model fair. This still needs to be discussed and demonstrated.

1. The paper suggests that using EHRM has fairness implications. These fairness implications are obtained as a side effect of using different hyperparameter settings for the skewness of the human risk distribution. There is no direct relationship between the fairness considerations and the risk metric used.

2. In the Introduction, the authors over-sell their work by presenting it as a "very natural if simple solution to addressing these varied desiderata", where the desiderata include "fairness, safety, and robustness". This is a strong statement and an incorrect one. The paper lacks any connection between these objectives and the proposed risk metric. One could try to investigate these connections before claiming to address them.

3. One example of such a connection would be to take the definition of calibration used in, for example, Kleinberg et al., connect it to a human calibration measure, and derive a human risk objective from there as well. It is a straightforward application, but the work lacks it.

4. There are no comparison baselines, even though the method is applied to a fairness problem for which a number of software packages are available. Agarwal et al. 2018, "A Reductions Approach to Fair Classification", seems relevant, as it reduces fair classification to cost-sensitive learning. Here the weighting is done on the basis of the loss rather than group identities or class values, but it may be the reason for the slight improvement in fairness outcomes: since EHRM weights minorities higher, its weights might be correlated with those of a fair classification reduction, hence the slight improvements in fairness metrics.

5. A few typos and other mistakes:
- Line 50: doomed -> deemed
- Line 74: remove "hence"; the preceding line doesn't imply this sentence, which seems independent.
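For illustration of the inverse S-shaped weighting function described at the start of this review, here is a minimal sketch (an editorial addition, not taken from the paper); the Tversky-Kahneman functional form and the parameter gamma = 0.61 are assumptions, since the review does not specify the exact function the paper uses:

```python
import numpy as np

def cpt_weight(p, gamma=0.61):
    """Inverse S-shaped probability weighting (Tversky-Kahneman form, assumed):
    inflates small probabilities and deflates large ones."""
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

p = np.linspace(0.0, 1.0, 11)
print(np.round(cpt_weight(p), 3))  # e.g. w(0.1) ≈ 0.19 > 0.1, while w(0.9) ≈ 0.71 < 0.9
```

Applied to a loss CDF, such a weighting over-weights tail events that ERM would average away, which is the mechanism that weaknesses 0 and 1 above call into question.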

Reviewer 2


Thank you for your response. I would love to see this paper published, but feel that there are still issues that should be addressed:
- More convincing experimental results (mostly in terms of effect size)
- Better justification as to why fairness is expected to improve under HRM (since, as noted in the response, all subgroups should be affected)
- Decoupling the part risk aversion plays in model selection vs. prediction
While my score remains unchanged, I truly hope that a revised and improved version of this paper will be published in the near future.

--------------------------------------------------------------------------------------------------------

The paper explores a conceptually intriguing question: what if predictive machines were trained with imbued human biases (e.g., loss aversion)? The authors propose that this would trade off accuracy for other measures of interest such as fairness, which they aim to show empirically. Overall the paper makes an interesting point and offers a novel perspective, and it generally reads well. There are, however, several issues I hope the authors can elaborate on.

1. I'm having trouble untangling which aspect of the machine learning process the analogy to human decision making is drawn to. In some parts of the paper, human preferences are linked to model selection (as in the example in Sec. 2.1, and in general in its relation to the ERM setup). However, the paper is motivated by the increasing use of machines in real-world decision making, which relates not to model selection but to per-instance decisions (made on the basis of predictions).

2. In Def. 1, F is portrayed as 'given'. Is it indeed a given object, or is it the CDF induced by the data distribution (through \ell), which makes more sense here?

3. How do the authors propose to optimize the EHRM in Eq. 5? Specifically, how is F and/or F_n modeled and parameterized? Is it computationally efficient to train? In general, F can be a complex function of \theta, and possibly non-differentiable (e.g., if it bins values of \ell).

4. It is not clear which parts of Sec. 4 are novel and which are not.

5. The link between human-aligned risk and fairness is presented in line 176: "... we expect minimizers of HRM to avoid drastic losses for minority groups." I understand why HRM avoids drastic losses, but why would these necessarily be the losses of minority groups? This is an important argument that the authors should clarify.

6. The empirical section (which is stated as being a main contribution) is somewhat discouraging. Specifically:
- the effects seem small and noisy (especially in the figures)
- the table is missing many of the stated model configurations
- the table lacks significance tests (which should correct for the multiple hypotheses, of which there are fairly many)
- there is no comparison to other fairness-promoting baselines
- Sec. 5.3 considers only one configuration (without an explanation)
- feature weights are not a robust (or meaningful) criterion for comparing models
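On point 3 above, one common way to optimize distortion-type risk measures, offered here only as a hedged sketch (the paper's actual parameterization of F_n may well differ): sort the per-sample losses of a batch, weight the i-th largest loss by w(i/n) - w((i-1)/n), where w is a CPT-style weighting function applied to the empirical CDF, and backpropagate through the weighted sum while treating the ranking as fixed. The function names and the Tversky-Kahneman form of w are assumptions for illustration.

```python
import torch

def cpt_weight(p, gamma=0.61):
    # Inverse S-shaped probability weighting (Tversky-Kahneman form, assumed)
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

def ehrm_batch_loss(losses, gamma=0.61):
    """Weighted sum of sorted per-sample losses; weights are discrete
    differences of the weighting function on the empirical CDF. The ranking
    is piecewise constant in the parameters, so gradients flow only through
    the loss values themselves."""
    n = losses.shape[0]
    sorted_losses, _ = torch.sort(losses, descending=True)  # worst losses first
    grid = torch.arange(n + 1, dtype=losses.dtype, device=losses.device) / n
    w = cpt_weight(grid, gamma)
    weights = w[1:] - w[:-1]  # w(i/n) - w((i-1)/n); the weights sum to 1
    return (weights * sorted_losses).sum()
```

Whether such a construction matches Eq. 5 of the paper, and whether it is efficient at scale, is exactly what the reviewer asks the authors to spell out.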

Reviewer 3


Originality: Moderate to high. Using better, more human-aligned risks is a nice idea that extends recent interest in more risk-averse losses and training procedures. Approaching the problem through the lens of behavioral economics is different from many of the machine-learning-based ideas.

Quality: Moderate. The evaluation and comparisons to prior work are sound (with some minor omissions, like `Fairness Risk Measures' by Williamson, which covers similar ideas about risk measures and fairness). The paper has minor conceptual issues that prevent it from being a great paper. The main problem is that CPT is justified relatively briefly in Section 2, and then a large part of the remaining paper is devoted to the analysis of CPT. While CPT is clearly a reasonable framework for behavioral economics, it is not clear that the (surrogate) losses used in machine learning are at all similar to monetary losses and gains. In that case, should we still assume that CPT is the right way to weight losses? What is missing in this paper is the 'human' in human risk minimization: quantifying whether ML losses combined with CPT match human utility would go a long way toward convincing the reader. Additionally, many of the goals of CPT can be achieved by simply using more risk-averse losses like CVaR. Is the upweighting of tail gains necessary? I would think that it can in fact result in unfairness when one tries to optimize tail gains at the expense of losses. It might be worthwhile to include something like CVaR in your experiments as well.

Clarity: Moderate to high. Section 2.1, which covers CPT, was clear and motivated the reasons for using CPT. I do think Section 4 is a bit low level, and cutting some material there to show examples of how CPT would work on toy problems would be helpful. I did not follow the phrasing of the three bullet points in lines 126 to 128. Typo (?) at line 50: ``doomed larger than gains''?

Significance: Moderate to high. Despite some minor complaints about whether CPT truly matches human risk, I think the ideas and approach are new, and the writing makes a fairly good case for CPT as a way to adjust losses. The paper may serve to interest other researchers in more human-aligned loss measures as well.
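On the CVaR comparison suggested above, a minimal sketch of that baseline (an editorial illustration, not from the paper; the level alpha = 0.1 is an arbitrary choice): empirical CVaR at level alpha is the mean of the worst alpha-fraction of per-sample losses, so it drops into the same training loop as a purely risk-averse objective with no reweighting of gains.

```python
import torch

def cvar_batch_loss(losses, alpha=0.1):
    """Empirical CVaR at level alpha: mean of the worst alpha-fraction of
    per-sample losses. Unlike a CPT weighting, it ignores everything outside
    the upper tail and does not upweight tail gains."""
    k = max(1, int(alpha * losses.shape[0]))
    worst_k, _ = torch.topk(losses, k)  # the k largest losses in the batch
    return worst_k.mean()
```

Training with this objective alongside the CPT-weighted one would directly test whether upweighting tail gains contributes anything beyond plain risk aversion.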