Risk Aversion in Markov Decision Processes via Near Optimal Chernoff Bounds

Moldovan, Teodor; Abbeel, Pieter

Teodor M. Moldovan, Pieter Abbeel

Advances in Neural Information Processing Systems 25 (NIPS 2012)

Abstract

The expected return is a widely used objective in decision making under uncertainty. Many algorithms, such as value iteration, have been proposed to optimize it. In risk-aware settings, however, the expected return is often not an appropriate objective to optimize. We propose a new optimization objective for risk-aware planning and show that it has desirable theoretical properties. We also draw connections to previously proposed objectives for risk-aware planning: minmax, exponential utility, percentile and mean minus variance. Our method applies to an extended class of Markov decision processes: we allow costs to be stochastic as long as they are bounded. Additionally, we present an efficient algorithm for optimizing the proposed objective. Synthetic and real-world experiments illustrate the effectiveness of our method at scale.
