Prediction and Change Detection

Part of Advances in Neural Information Processing Systems 18 (NIPS 2005)

Bibtex Metadata Paper


Mark Steyvers, Scott Brown


We measure the ability of human observers to predict the next datum in a sequence that is generated by a simple statistical process undergoing change at random points in time. Accurate performance in this task requires the identification of changepoints. We assess individual differences between observers both empirically, and using two kinds of models: a Bayesian approach for change detection and a family of cognitively plausible fast and frugal models. Some individuals detect too many changes and hence perform sub-optimally due to excess variability. Other individuals do not detect enough changes, and perform sub-optimally because they fail to notice short-term temporal trends.

1 I n t r o d u c t i o n

Decision-making often requires a rapid response to change. For example, stock analysts need to quickly detect changes in the market in order to adjust investment strategies. Coaches need to track changes in a player’s performance in order to adjust strategy. When tracking changes, there are costs involved when either more or less changes are observed than actually occurred. For example, when using an overly conservative change detection criterion, a stock analyst might miss important short-term trends and interpret them as random fluctuations instead. On the other hand, a change may also be detected too readily. For example, in basketball, a player who makes a series of consecutive baskets is often identified as a “hot hand” player whose underlying ability is perceived to have suddenly increased [1,2]. This might lead to sub-optimal passing strategies, based on random fluctuations.

We are interested in explaining individual differences in a sequential prediction task. Observers are shown stimuli generated from a simple statistical process with the task of predicting the next datum in the sequence. The latent parameters of the statistical process change discretely at random points in time. Performance in this task depends on the accurate detection of those changepoints, as well as inference about future outcomes based on the outcomes that followed the most recent inferred changepoint. There is much prior research in statistics on the problem of identifying changepoints [3,4,5]. In this paper, we adopt a Bayesian approach to the changepoint identification problem and develop a simple inference procedure to predict the next datum in a sequence. The Bayesian model serves as an ideal observer model and is useful to characterize the ways in which individuals deviate from optimality.

The plan of the paper is as follows. We first introduce the sequential prediction task and discuss a Bayesian analysis of this prediction problem. We then discuss the results from a few individuals in this prediction task and show how the Bayesian approach can capture individual differences with a single “twitchiness” parameter that describes how readily changes are perceived in random sequences. We will show that some individuals are too twitchy: their performance is too variable because they base their predictions on too little of the recent data. Other individuals are not twitchy enough, and they fail to capture fast changes in the data. We also show how behavior can be explained with a set of fast and frugal models [6]. These are cognitively realistic models that operate under plausible computational constraints.

2 A p r e d i c t i o n t a s k w i t h m u l t i p l e c h a n g e p o i n t s

In the prediction task, stimuli are presented sequentially and the task is to predict the next stimulus in the sequence. After t trials, the observer has been presented with stimuli y1, y2, …, yt and the task is to make a prediction about yt+1. After the prediction is made, the actual outcome yt+1 is revealed and the next trial proceeds to the prediction of yt+2. This procedure starts with y1 and is repeated for T trials.

The observations yt are D-dimensional vectors with elements sampled from binomial distributions. The parameters of those distributions change discretely at random points in time such that the mean increases or decreases after a change point. This generates a sequence of observation vectors, y1, y2, …, yT, where each yt = {yt,1 … yt,D}. Each of the yt,d is sampled from a binomial distribution Bin(θt,d,K), so 0 ≤ yt,d ≤ K. The parameter vector θt ={θt,1 … θt,D} changes depending on the locations of the changepoints. At each time step, x is a binary indicator for the occurrence of a


changepoint occurring at time t+1. The parameter α determines the probability of a change occurring in the sequence. The generative model is specified by the following algorithm:

  1. For d=1..D sample θ1,d from a Uniform(0,1) distribution