{"title": "How memory biases affect information transmission: A rational analysis of serial reproduction", "book": "Advances in Neural Information Processing Systems", "page_first": 1809, "page_last": 1816, "abstract": "Many human interactions involve pieces of information being passed from one person to another, raising the question of how this process of information transmission is affected by the capacities of the agents involved. In the 1930s, Sir Frederic Bartlett explored the influence of memory biases in “serial reproduction” of information, in which one person’s reconstruction of a stimulus from memory becomes the stimulus seen by the next person. These experiments were done using relatively uncontrolled stimuli such as pictures and stories, but suggested that serial reproduction would transform information in a way that reflected the biases inherent in memory. We formally analyze serial reproduction using a Bayesian model of reconstruction from memory, giving a general result characterizing the effect of memory biases on information transmission. We then test the predictions of this account in two experiments using simple one-dimensional stimuli. Our results provide theoretical and empirical justification for the idea that serial reproduction reflects memory biases.", "full_text": "How memory biases affect information transmission:\n\nA rational analysis of serial reproduction\n\nJing Xu Thomas L. 
Grif\ufb01ths\n\nDepartment of Psychology\n\nUniversity of California, Berkeley\n\nBerkeley, CA 94720-1650\n\n{jing.xu,tom griffiths}@berkeley.edu\n\nAbstract\n\nMany human interactions involve pieces of information being passed from one\nperson to another, raising the question of how this process of information trans-\nmission is affected by the capacities of the agents involved.\nIn the 1930s, Sir\nFrederic Bartlett explored the in\ufb02uence of memory biases in \u201cserial reproduction\u201d\nof information, in which one person\u2019s reconstruction of a stimulus from memory\nbecomes the stimulus seen by the next person. These experiments were done us-\ning relatively uncontrolled stimuli such as pictures and stories, but suggested that\nserial reproduction would transform information in a way that re\ufb02ected the biases\ninherent in memory. We formally analyze serial reproduction using a Bayesian\nmodel of reconstruction from memory, giving a general result characterizing the\neffect of memory biases on information transmission. We then test the predic-\ntions of this account in two experiments using simple one-dimensional stimuli.\nOur results provide theoretical and empirical justi\ufb01cation for the idea that serial\nreproduction re\ufb02ects memory biases.\n\n1 Introduction\n\nMost of the facts that we know about the world are not learned through \ufb01rst-hand experience, but\nare the result of information being passed from one person to another. This raises a natural question:\nhow are such processes of information transmission affected by the capacities of the agents involved?\nDecades of memory research have charted the ways in which our memories distort reality, changing\nthe details of experiences and introducing events that never occurred (see [1] for an overview). 
We\nmight thus expect that these memory biases would affect the transmission of information, since such\na process relies on each person remembering a fact accurately.\n\nThe question of how memory biases affect information transmission was \ufb01rst investigated in detail\nin Sir Frederic Bartlett\u2019s \u201cserial reproduction\u201d experiments [2]. Bartlett interpreted these studies\nas showing that people were biased by their own culture when they reconstruct information from\nmemory, and that this bias became exaggerated through serial reproduction. Serial reproduction\nhas become one of the standard methods used to simulate the process of cultural transmission, and\nseveral subsequent studies have used this paradigm (e.g., [3, 4]). However, this phenomenon has not\nbeen systematically and formally analyzed, and most of these studies have used complex stimuli that\nare semantically rich but hard to control. In this paper, we formally analyze and empirically evaluate\nhow information is changed by serial reproduction and how this process relates to memory biases.\nIn particular, we provide a rational analysis of serial reproduction (in the spirit of [5]), considering\nhow information should change when passed along a chain of rational agents.\n\nBiased reconstructions are found in many tasks. For example, people are biased by their knowledge\nof the structure of categories when they reconstruct simple stimuli from memory. One common\n\n\feffect of this kind is that people judge stimuli that cross boundaries of two different categories to\nbe further apart than those within the same category, although the distances between the stimuli\nare the same in the two situations [6]. However, biases need not re\ufb02ect suboptimal performance.\nIf we assume that memory is solving the problem of extracting and storing information from the\nnoisy signal presented to our senses, we can analyze the process of reconstruction from memory as\na Bayesian inference. 
Under this view, reconstructions should combine prior knowledge about the world with the information provided by noisy stimuli. Use of prior knowledge will result in biases, but these biases ultimately make memory more accurate [7].\n\nIf this account of reconstruction from memory is true, we would expect the same inference process to occur at every step of serial reproduction, so the effects of memory biases should accumulate. Assuming all participants share the same prior knowledge about the world, serial reproduction should ultimately reveal the nature of this knowledge. Drawing on recent work exploring other processes of information transmission [8, 9], we show that a rational analysis of serial reproduction makes exactly this prediction. To test the predictions of this account, we explore the special case where the task is to reconstruct a one-dimensional stimulus using the information that it is drawn from a fixed Gaussian distribution. In this case we can precisely characterize behavior at every step of serial reproduction. Specifically, we show that this defines a simple first-order autoregressive, or AR(1), process, allowing us to draw on a variety of results characterizing such processes. We use these predictions to test the Bayesian model of serial reproduction in two laboratory experiments, and show that its predictions hold for serial reproduction both between and within subjects.\n\nThe plan of the paper is as follows. Section 2 lays out the Bayesian account of serial reproduction. In Section 3 we show how this account corresponds to an AR(1) process. Sections 4 and 5 present two experiments testing the model’s prediction that serial reproduction reveals memory biases. 
Section 6 concludes the paper.\n\n2 A Bayesian view of serial reproduction\n\nWe will outline our Bayesian approach to serial reproduction by first considering the problem of reconstruction from memory, and then asking what happens when the solution to this problem is repeated many times, as in serial reproduction.\n\n2.1 Reconstruction from memory\n\nOur goal is to give a rational account of reconstruction from memory, considering the underlying computational problem and finding the optimal solution to that problem. We will formulate the problem of reconstruction from memory as a problem of inferring and storing accurate information about the world from noisy sensory data. Given a noisy stimulus x, we seek to recover the true state of the world µ that generated that stimulus, storing an estimate ˆµ in memory. The optimal solution to this problem is provided by Bayesian statistics. Previous experience provides a “prior” distribution on possible states of the world, p(µ). On observing x, this can be updated to a “posterior” distribution p(µ|x) by applying Bayes’ rule\n\np(µ|x) = p(x|µ) p(µ) / ∫ p(x|µ) p(µ) dµ   (1)\n\nwhere p(x|µ) – the “likelihood” – indicates the probability of observing x if µ is the true state of the world. Having computed p(µ|x), a number of schemes could be used to select an estimate ˆµ to store. Perhaps the simplest such scheme is sampling from the posterior, with ˆµ ∼ p(µ|x).\n\nThis analysis provides a general schema for modeling reconstruction from memory, applicable for any form of x and µ. A simple example is the special case where x and µ vary along a single continuous dimension. 
In the experiment presented later in the paper we take this dimension to be the width of a fish, showing people a fish and asking them to reconstruct its width from memory, but the dimension of interest could be any subjective quantity such as the perceived length, loudness, duration, or brightness of a stimulus. Assume that previous experience establishes that µ has a Gaussian distribution, with µ ∼ N(µ0, σ0²), and that the noise process means that x has a Gaussian distribution centered on µ, x|µ ∼ N(µ, σx²). In this case, we can use standard results from Bayesian statistics [10] to show that the outcome of Equation 1 is also a Gaussian distribution, with p(µ|x) being N(λx + (1 − λ)µ0, λσx²), where λ = 1/(1 + σx²/σ0²).\n\nThe analysis presented in the previous paragraph makes a clear prediction: the reconstruction ˆµ should be a compromise between the observed value x and the mean of the prior µ0, with the terms of the compromise being set by the ratio of the noise in the data σx² to the uncertainty in the prior σ0². This model thus predicts a systematic bias in reconstruction that is not a consequence of an error of memory, but the optimal solution to the problem of extracting information from a noisy stimulus. Huttenlocher and colleagues [7] have conducted several experiments testing this account of memory biases, showing that people’s reconstructions interpolate between observed stimuli and the mean of a trained distribution as predicted. 
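The Gaussian reconstruction rule just described is easy to sketch numerically. The following is a minimal illustration (Python with NumPy; the parameter values are arbitrary choices for illustration, not the values used in the experiments):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (arbitrary) parameters: prior N(mu0, sigma0^2), perceptual noise sd sigma_x
mu0, sigma0, sigma_x = 4.0, 1.3, 0.8

# lambda = 1 / (1 + sigma_x^2 / sigma_0^2)
lam = 1.0 / (1.0 + sigma_x**2 / sigma0**2)

x = 5.5                                  # observed noisy stimulus
post_mean = lam * x + (1 - lam) * mu0    # posterior mean: lambda*x + (1 - lambda)*mu0
post_sd = np.sqrt(lam) * sigma_x         # posterior sd: sqrt(lambda * sigma_x^2)
mu_hat = rng.normal(post_mean, post_sd)  # store a sample from p(mu | x)

# The reconstruction is pulled from x toward the prior mean mu0
print(round(lam, 3), round(post_mean, 3))  # prints: 0.725 5.088
```

The posterior mean lies strictly between the stimulus and the prior mean, reproducing the systematic bias described above.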
Using a similar notion of reconstruction from memory, Hemmer and Steyvers [11] have conducted experiments showing that people form appropriate Bayesian reconstructions for realistic stimuli such as images of fruit, and seem capable of drawing on prior knowledge at multiple levels of abstraction in doing so.\n\n2.2 Serial reproduction\n\nWith a model of how people might approach the problem of reconstruction from memory in hand, we are now in a position to analyze what happens in serial reproduction, where the stimuli that people receive on one trial are the results of a previous reconstruction. On the nth trial, a participant sees a stimulus xn. The participant then computes p(µ|xn) as outlined in the previous section, and stores a sample ˆµ from this distribution in memory. When asked to produce a reconstruction, the participant generates a new value xn+1 from a distribution that depends on ˆµ. If the likelihood, p(x|µ), reflects perceptual noise, then it is reasonable to assume that xn+1 will be sampled from this distribution, substituting ˆµ for µ. This value of xn+1 is the stimulus for the next trial.\n\nViewed from this perspective, serial reproduction defines a stochastic process: a sequence of random variables evolving over time. In particular, it is a Markov chain, since the reconstruction produced on the current trial depends only on the value produced on the preceding trial (e.g., [12]). The transition probabilities of this Markov chain are\n\np(xn+1|xn) = ∫ p(xn+1|µ) p(µ|xn) dµ   (2)\n\nbeing the probability that xn+1 is produced as a reconstruction for the stimulus xn. If this Markov chain is ergodic (see [12] for details) it will converge to a stationary distribution π(x), with p(xn|x1) tending to π(xn) as n → ∞. 
That is, after many reproductions, we should expect the probability of seeing a particular stimulus being produced as a reproduction to stabilize to a fixed distribution. Identifying this distribution will help us understand the consequences of serial reproduction.\n\nThe transition probabilities given in Equation 2 have a special form, being the result of sampling a value from the posterior distribution p(µ|xn) and then sampling a value from the likelihood p(xn+1|µ). In this case, it is possible to identify the stationary distribution of the Markov chain [8, 9]. The stationary distribution of this Markov chain is the prior predictive distribution\n\nπ(x) = ∫ p(x|µ) p(µ) dµ   (3)\n\nbeing the probability of observing the stimulus x when µ is sampled from the prior. This happens because this Markov chain is a Gibbs sampler for the joint distribution on x and µ defined by multiplying p(x|µ) and p(µ) [9]. This gives a clear characterization of the consequences of serial reproduction: after many reproductions, the stimuli being produced will be samples from the prior predictive distribution determined by the prior assumed by the participants. Convergence to the prior predictive distribution provides a formal justification for the traditional claims that serial reproduction reveals cultural biases, since those biases would be reflected in the prior.\n\nIn the special case of reconstruction of stimuli that vary along a single dimension, we can also analytically compute the probability density functions for the transition probabilities and stationary distribution. Applying Equation 2 using the results summarized in the previous section, we have xn+1|xn ∼ N(µn, σx² + σn²), where µn = λxn + (1 − λ)µ0 and σn² = λσx². Likewise, Equation 3 indicates that the stationary distribution is N(µ0, σx² + σ0²). The rate at which the Markov chain converges to the stationary distribution depends on the value of λ. When λ is close to 1, convergence is slow since µn is close to xn. As λ gets closer to 0, µn is more influenced by µ0 and convergence is faster. Since λ = 1/(1 + σx²/σ0²), the convergence rate thus depends on the ratio of the participant’s perceptual noise σx² to the variance of the prior distribution σ0². More perceptual noise results in faster convergence, since the specific value of xn is trusted less, while more uncertainty in the prior results in slower convergence, since xn is given greater weight.\n\n3 Serial reproduction of one-dimensional stimuli as an AR(1) process\n\nThe special case of serial reproduction of one-dimensional stimuli can also give us further insight into the consequences of modifying our assumptions about storage and reconstruction from memory, by exploiting a further property of the underlying stochastic process: that it is a first-order autoregressive process, abbreviated to AR(1). The general form of an AR(1) process is\n\nxn+1 = c + φ xn + εn+1   (4)\n\nwhere εn+1 ∼ N(0, σε²). Equation 4 has the familiar form of a regression equation, predicting one variable as a linear function of another, plus Gaussian noise. It defines a stochastic process because each variable is being predicted from that which precedes it in sequence. AR(1) models are widely used to model time-series data, being one of the simplest models for capturing temporal dependency.\n\nJust as showing that a stochastic process is a Markov chain provides information about its dynamics and asymptotic behavior, showing that it reduces to an AR(1) process provides access to a number of results characterizing the properties of these processes. 
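As a check on the analysis above, the serial reproduction chain is simple to simulate directly; under the Gaussian assumptions it should converge to the prior predictive distribution N(µ0, σx² + σ0²) regardless of where it starts. A minimal sketch (arbitrary illustrative parameter values, not the experimental procedure):

```python
import numpy as np

rng = np.random.default_rng(1)
mu0, sigma0, sigma_x = 4.0, 1.3, 0.8          # arbitrary illustrative values
lam = 1.0 / (1.0 + sigma_x**2 / sigma0**2)
post_sd = np.sqrt(lam) * sigma_x              # sd of the posterior p(mu | x_n)

x, samples = 10.0, []                         # start the chain far from the prior mean
for n in range(20000):
    mu_hat = rng.normal(lam * x + (1 - lam) * mu0, post_sd)  # store a posterior sample
    x = rng.normal(mu_hat, sigma_x)                          # reproduce from the likelihood
    if n >= 100:                              # discard burn-in iterations
        samples.append(x)

# Compare the chain with the prior predictive N(mu0, sigma_x^2 + sigma_0^2)
print(np.mean(samples), np.std(samples), np.sqrt(sigma_x**2 + sigma0**2))
```

After the burn-in, the empirical mean and standard deviation of the chain match µ0 and sqrt(σx² + σ0²) closely, illustrating the convergence result in Equation 3.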
If |φ| < 1 the process has a stationary distribution that is Gaussian with mean c/(1 − φ) and variance σε²/(1 − φ²). The autocovariance at a lag of n is φⁿ σε²/(1 − φ²), and thus decays geometrically in φ. An AR(1) process thus converges to its stationary distribution at a rate determined by φ.\n\nIt is straightforward to show that the stochastic process defined by serial reproduction, where a sample from the posterior distribution on µ is stored in memory and a new value x is sampled from the likelihood, is an AR(1) process. Using the results in the previous section, at the (n + 1)th iteration\n\nxn+1 = (1 − λ)µ0 + λxn + εn+1   (5)\n\nwhere λ = 1/(1 + σx²/σ0²) and εn+1 ∼ N(0, σx² + σn²) with σn² = λσx². This is an AR(1) process with c = (1 − λ)µ0, φ = λ, and σε² = σx² + σn². Since λ is less than 1 for any σ0² and σx², we can find the stationary distribution by substituting these values into the expressions given above.\n\nIdentifying serial reproduction for single-dimensional stimuli as an AR(1) process allows us to relax our assumptions about the way that people are storing and reconstructing information. The AR(1) model can accommodate different assumptions about memory storage and reconstruction.1 All these ways of characterizing serial reproduction lead to the same basic prediction: that repeatedly reconstructing stimuli from memory will result in convergence to a distribution whose mean corresponds to the mean of the prior. In the remainder of the paper we test this prediction.\n\nIn the following sections, we present two serial reproduction experiments conducted with stimuli that vary along only one dimension (width of fish). 
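Conversely, the AR(1) characterization suggests a simple way to analyze chain data: regressing xn+1 on xn estimates φ (and hence λ), and the stationary mean follows as c/(1 − φ). A sketch on simulated data (arbitrary parameter values; this is an illustration, not the analysis code used in the experiments below):

```python
import numpy as np

rng = np.random.default_rng(2)
c, phi, sigma_eps = 1.1, 0.725, 1.0          # arbitrary AR(1) parameters

# Simulate the chain x_{n+1} = c + phi * x_n + eps_{n+1}
x = np.empty(5000)
x[0] = c / (1 - phi)                         # start at the stationary mean
for n in range(len(x) - 1):
    x[n + 1] = c + phi * x[n] + rng.normal(0.0, sigma_eps)

# A least-squares fit of x_{n+1} against x_n recovers phi and c
phi_hat, c_hat = np.polyfit(x[:-1], x[1:], 1)
print(phi_hat, c_hat / (1 - phi_hat))        # estimated phi and stationary mean
```

The fitted slope approximates φ and the implied stationary mean approximates c/(1 − φ), which is the logic behind the autoregression plots reported for both experiments.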
The first experiment follows previous research in using a between-subjects design, with the reconstructions of one participant serving as the stimuli for the next. The second experiment uses a within-subjects design in which each person reconstructs stimuli that they themselves produced on a previous trial, testing the potential of this design to reveal the memory biases of individuals.\n\n4 Experiment 1: Between-subjects serial reproduction\n\nThis experiment directly tested the basic prediction that the outcome of serial reproduction will reflect people’s priors. Two groups of participants were trained on different distributions of a one-dimensional quantity – the width of a schematic fish – that would serve as a prior for reconstructing similar stimuli from memory. \n\n1 In the memorization phase, the participant’s memory ˆµ can be 1) a sample from the posterior distribution p(µ|xn), as assumed above, or 2) a value such that ˆµ = argmax_µ p(µ|xn), which is also the expected value of the Gaussian posterior p(µ|xn). In the reproduction phase, the participant’s reproduction xn+1 can be 1) a noisy reconstruction, which is a sample from the likelihood p(xn+1|ˆµ), as assumed above, or 2) a perfect reconstruction from memory, such that xn+1 = ˆµ. This defines four different models of serial reproduction, all of which correspond to AR(1) processes that differ only in the variance σε² (although maximizing p(µ|xn) and then storing a perfect reconstruction is degenerate, with σε² = 0). In all four cases serial reproduction thus converges to a Gaussian stationary distribution with mean µ0, but with different variances.\n\n
The two distributions differed in their means, allowing us to examine whether the mean of the distribution produced by serial reproduction is affected by the prior.\n\n4.1 Method\n\nThe experiment followed the same basic procedure as Bartlett’s classic experiments [2]. Participants were 46 members of the university community. Stimuli were the same as those used in [7]: fish with elliptical bodies and fan-shaped tails. All the fish stimuli varied only in one dimension, the width of the fish, ranging from 2.63cm to 5.76cm. The stimuli were presented on an Apple iMac computer by a Matlab script using PsychToolBox extensions [13, 14].\n\nParticipants were first trained to discriminate fish-farm fish from ocean fish. The width of the fish-farm fish was normally distributed and that of the ocean fish was uniformly distributed between 2.63 and 5.75cm. Two groups of participants were trained on one of two distributions of fish-farm fish (prior distributions A and B), with different means but the same standard deviation. In condition A, µ0 = 3.66cm, σ0 = 1.3cm; in condition B, µ0 = 4.72cm, σ0 = 1.3cm.\n\nIn the training phase, participants first received a block of 60 trials. On each trial, a stimulus was presented at the center of a computer monitor, participants tried to predict which type of fish it was by pressing one of the keys on the keyboard, and they received feedback about the correctness of the prediction. The participants were then tested for 20 trials on their knowledge of the two types of fish. The procedure was the same as the training block except there was no feedback. The training-testing loop was repeated until the participants reached 80% correct in using the optimal decision strategy. 
If a participant could not pass the test after \ufb01ve iterations, the experiment halted.\n\nIn the reproduction phase, the participants were told that they were to record \ufb01sh sizes for the \ufb01sh\nfarm. On each trial, a \ufb01sh stimulus was \ufb02ashed at the center of the screen for 500ms and then\ndisappeared. Another \ufb01sh of random size appeared at one of four possible positions near the center\nof screen and the participants used the up and down arrow keys to adjust the width of the \ufb01sh until\nthey thought it matched the \ufb01sh they just saw. The \ufb01sh widths seen by the \ufb01rst participant in each\ncondition were 120 values randomly sampled from a uniform distribution from 2.63 to 5.75cm.\nThe \ufb01rst participant tried to memorize these random samples and then gave the reconstructions.\nEach subsequent participant in each condition was then presented with the data generated by the\nprevious participant and they again tried to reconstruct those \ufb01sh widths. Thus, each participant\u2019s\ndata constitute one slice of time in 120 serial reproduction chains.\n\nAt the end of the experiment, the participants were given a \ufb01nal 50-trial test to check if their prior\ndistributions had drifted. Ten participants\u2019 data were excluded from the chains based on three cri-\nteria: 1) \ufb01nal testing score was less than 80% of optimal performance; 2) the difference between\nthe reproduced value and stimulus shown was greater than the difference between the largest and\nthe smallest stimuli in the training distribution on any trial; 3) there were no adjustments from the\nstarting value of the \ufb01sh width for more than half of the trials.\n\n4.2 Results and Discussion\n\nThere were 18 participants in each condition, resulting in 18 generations of serial reproduction. Fig-\nure 1 shows the initial and \ufb01nal distributions of the reconstructions, together with the autoregression\nplots for the two conditions. 
The mean reconstructed fish widths produced by the first participants in conditions A and B were 4.22 and 4.21cm respectively, which were not statistically significantly different (t(238) = 0.09, p = 0.93). For the final participants in each chain, the mean reconstructed fish widths were 3.20 and 3.68cm respectively, a statistically significant difference (t(238) = 6.93, p < 0.001). The difference in means matches the direction of the difference in the training provided in conditions A and B, although the overall size of the difference is reduced and the means of the stationary distributions were lower than those of the distributions used in training.\n\nThe autoregression plots provide a further quantitative test of the predictions of our Bayesian model. The basic prediction of the model is that reconstruction should look like regression, and this is exactly what we see in Figure 1. The correlation between the stimulus xn and its reconstruction xn+1 is the correlation between the AR(1) model’s predictions and the data, and this correlation was high in both conditions, being 0.91 and 0.86 (p < 0.001) for conditions A and B respectively. Finally, we examined whether the Markov assumption underlying our analysis was valid, by computing the correlation between xn+1 and xn−1 given xn. The resulting partial correlation was low for both conditions, being 0.04 and 0.01 in conditions A and B respectively (both p > 0.05).\n\nFigure 1: Initial and final distributions for the two conditions in Experiment 1. (a) The distribution of stimuli and Gaussian fits to reconstructions for the first participants in the two conditions. (b) Gaussian fits to reconstructions generated by the 18th participants in each condition. (c) Autoregression plot for xn+1 as a function of xn for the two conditions.\n\n5 Experiment 2: Within-subjects serial reproduction\n\nThe between-subjects design allows us to reproduce the process of information transmission, but our analysis suggests that serial reproduction might also have promise as a method for investigating the memory biases of individuals. To explore the potential of this method, we tested the model with a within-subjects design, in which a participant’s reproduction in the current trial became the stimulus for that same participant in a later trial. Each participant’s responses over the entire experiment thus produced a chain of reproductions. Each participant produced three such chains, starting from widely separated initial values. Control trials and careful instructions were used so that the participants would not realize that some of the stimuli were their own reproductions.\n\n5.1 Method\n\nForty-six undergraduates from the university research participation pool participated in the experiment. The basic procedure was the same as Experiment 1, except in the reproduction phase. Each participant’s responses in this phase formed three chains of 40 trials. The chains started with three original stimuli with width values of 2.63cm, 4.19cm, and 5.76cm; then in the following trials, the stimuli participants saw were their own reproductions in the previous trials in the same chain. 
To prevent participants from realizing this fact, chain order was randomized and the Markov chain trials were intermixed with 40 control trials in which widths were drawn from the prior distribution.\n\n5.2 Results and Discussion\n\nParticipants’ data were excluded based on the same criteria as used in Experiment 1, with a lower testing score of 70% of optimal performance and one additional criterion relevant to the within-subjects case: participants were also excluded if the three chains did not converge, with the criterion for convergence being that the lower and upper chains must cross the middle chain. After these screening procedures, 40 participants’ data were accepted, with 21 in condition A and 19 in condition B. It took most participants about 20 trials for the chains to converge, so only the second half of the chains (trials 21-40) were analyzed further.\n\nThe locations of the stationary distributions were measured by computing the means of the reproduced fish widths for each participant. For conditions A (3.66cm) and B (4.72cm), the average of these means was 3.32 and 4.01cm respectively (t(38) = 2.41, p = 0.021). The right panel of Figure 2 shows the mean values for these two conditions. The basic prediction of the model was borne out: participants converged to distributions that differed significantly in their means when they were exposed to data suggesting a different prior. However, the means were in general lower than those of the prior. This effect was less prominent in the control trials, which produced means of 3.63 and 4.53cm respectively.2\n\nFigure 2: Stimuli, training distributions and stationary distributions for Experiment 2. Each data point in the right panel shows the mean of the last 20 iterations for a single participant. Boxes show the 95% confidence interval around the mean for each condition.\n\nFigure 3 shows the chains, training distributions, the Gaussian fits and the autoregression for the second half of the Markov chains for two participants in the two conditions. Correlation analysis showed that the AR(1) model’s predictions are highly correlated with the data generated by each participant, with mean correlations being 0.90 and 0.81 for conditions A and B respectively. The correlations are significant for all participants. The mean partial correlation between xt+1 and xt−1 given xt was low, being 0.07 and 0.11 for conditions A and B respectively, suggesting that the Markov assumption was satisfied. The partial correlations were significant (p < 0.05) for only one participant in condition B.\n\nFigure 3: Chains and stationary distributions for individual participants from the two conditions. (a) The three Markov chains generated by each participant, starting from three different values. (b) Training distributions for each condition. (c) Gaussian fits for the last 20 iterations of each participant’s data. (d) Autoregression for the last 20 iterations of each participant’s data.\n\n2 Since both experiments produced stationary distributions with means lower than those of the training distributions, we conducted a separate experiment examining the reconstructions that people produced without training. The mean fish width produced by 20 participants was 3.43cm, significantly less than the mean of the initial values of each chain, 4.19cm (t(19) = 3.75, p < 0.01). This result suggests that people have an a priori expectation that fish will have widths smaller than those used as our category means, and that people in the experiments were using a prior that is a compromise between this expectation and the training data.\n\n6 Conclusion\n\nWe have presented a Bayesian account of serial reproduction, and tested the basic predictions of this account using two strictly controlled laboratory experiments. The results of these experiments are consistent with the predictions of our account, with serial reproduction converging to a distribution that is influenced by the prior distribution established through training. Our analysis connects the biases revealed by serial reproduction with the more general Bayesian strategy of combining prior knowledge with noisy data to achieve higher accuracy [7]. It also shows that serial reproduction can be analyzed using Markov chains and first-order autoregressive models, providing the opportunity to draw on a rich body of work on the dynamics and asymptotic behavior of such processes. 
These connections allow us to provide a formal justification for the idea that serial reproduction changes the information being transmitted in a way that reflects the biases of the people transmitting it, establishing that this result holds under several different characterizations of the processes involved in storage and reconstruction from memory.\n\nAcknowledgments\n\nThis work was supported by grant number 0704034 from the National Science Foundation.\n\nReferences\n\n[1] D. L. Schacter, J. T. Coyle, G. D. Fischbach, M. M. Mesulam, and L. E. Sullivan, editors. Memory distortion: How minds, brains, and societies reconstruct the past. Harvard University Press, Cambridge, MA, 1995.\n\n[2] F. C. Bartlett. Remembering: A study in experimental and social psychology. Cambridge University Press, Cambridge, 1932.\n\n[3] A. Bangerter. Transformation between scientific and social representations of conception: The method of serial reproduction. British Journal of Social Psychology, 39:521–535, 2000.\n\n[4] J. Barrett and M. Nyhof. Spreading nonnatural concepts: The role of intuitive conceptual structures in memory and transmission of cultural materials. Journal of Cognition and Culture, 1:69–100, 2001.\n\n[5] J. R. Anderson. The adaptive character of thought. Erlbaum, Hillsdale, NJ, 1990.\n\n[6] A. M. Liberman, F. S. Cooper, D. P. Shankweiler, and M. Studdert-Kennedy. Perception of the speech code. Psychological Review, 74:431–461, 1967.\n\n[7] J. Huttenlocher, L. V. Hedges, and J. L. Vevea. Why do categories affect stimulus judgment? Journal of Experimental Psychology: General, pages 220–241, 2000.\n\n[8] T. L. Griffiths and M. L. Kalish. A Bayesian view of language evolution by iterated learning. In B. G. Bara, L. Barsalou, and M. Bucciarelli, editors, Proceedings of the Twenty-Seventh Annual Conference of the Cognitive Science Society, pages 827–832. Erlbaum, Mahwah, NJ, 2005.\n\n[9] T. L. 
Griffiths and M. L. Kalish. Language evolution by iterated learning with Bayesian agents. Cognitive Science, 31:441–480, 2007.\n\n[10] A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin. Bayesian data analysis. Chapman & Hall, New York, 1995.\n\n[11] P. Hemmer and M. Steyvers. A Bayesian account of reconstructive memory. In Proceedings of the 30th Annual Conference of the Cognitive Science Society, 2008.\n\n[12] J. R. Norris. Markov Chains. Cambridge University Press, Cambridge, UK, 1997.\n\n[13] D. H. Brainard. The Psychophysics Toolbox. Spatial Vision, 10:433–436, 1997.\n\n[14] D. G. Pelli. The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10:437–442, 1997.\n", "award": [], "sourceid": 275, "authors": [{"given_name": "Jing", "family_name": "Xu", "institution": null}, {"given_name": "Thomas", "family_name": "Griffiths", "institution": null}]}