{"title": "Can We Learn to Beat the Best Stock", "book": "Advances in Neural Information Processing Systems", "page_first": 345, "page_last": 352, "abstract": "", "full_text": "Can We Learn to Beat the Best Stock\n\nAllan Borodin1 Ran El-Yaniv2 Vincent Gogan1\n\nDepartment of Computer Science\n\nUniversity of Toronto1 Technion - Israel Institute of Technology2\n\n{bor,vincent}@cs.toronto.edu rani@cs.technion.ac.il\n\nAbstract\n\nA novel algorithm for actively trading stocks is presented. While tradi-\ntional universal algorithms (and technical trading heuristics) attempt to\npredict winners or trends, our approach relies on predictable statistical\nrelations between all pairs of stocks in the market. Our empirical results\non historical markets provide strong evidence that this type of techni-\ncal trading can \u201cbeat the market\u201d and moreover, can beat the best stock\nin the market. In doing so we utilize a new idea for smoothing critical\nparameters in the context of expert learning.\n\n1 Introduction: The Portfolio Selection Problem\n\nThe portfolio selection (PS) problem is a challenging problem for machine learning, online\nalgorithms and, of course, computational \ufb01nance. As is well known (e.g. see Lugosi [1])\nsequence prediction under the log loss measure can be viewed as a special case of portfo-\nlio selection, and perhaps more surprisingly, from a certain worst case minimax criterion,\nportfolio selection is not essentially any harder (than prediction) as shown in [2] (see also\n[1], Thm. 20 & 21). But there seems to be a qualitative difference between the practical\nutility of \u201cuniversal\u201d sequence prediction and universal portfolio selection. Simply stated,\nuniversal sequence prediction algorithms under various probabilistic and worst-case mod-\nels work very well in practice whereas the known universal portfolio selection algorithms\ndo not seem to provide any substantial bene\ufb01t over a naive investment strategy (see Sec. 
4).\n\nA major pragmatic question is whether or not a computer program can consistently\noutperform the market. A closer inspection of the interesting ideas developed in information\ntheory and online learning suggests that a promising approach is to exploit the natural\nvolatility in the market and in particular to benefit from simple and rather persistent\nstatistical relations between stocks rather than to try to predict stock prices or \u201cwinners\u201d. We\npresent a non-universal portfolio selection algorithm1, which does not try to predict\nwinners. The motivation behind our algorithm is the rationale behind constant rebalancing\nalgorithms and the worst case study of universal trading introduced by Cover [3]. Not only\ndoes our proposed algorithm substantially \u201cbeat the market\u201d on historical markets, it also\nbeats the best stock. So why are we presenting this algorithm and not just simply making\nmoney? There are, of course, some caveats and obstacles to utilizing the algorithm. But for\nlarge investors the possibility of a goose laying silver (if not golden) eggs is not impossible.\n\n1Any PS algorithm can be modified to be universal by investing any fixed fraction of the initial\nwealth in a universal algorithm.\n\nAssume a market with m stocks. Let vt = (vt(1), . . . , vt(m)) be the closing prices of the\nm stocks for the tth day, where vt(j) is the price of the jth stock. It is convenient to work\nwith relative prices xt(j) = vt(j)/vt\u22121(j) so that an investment of $d in the jth stock just\nbefore the tth period yields dxt(j) dollars. We let xt = (xt(1), . . . , xt(m)) denote the\nmarket vector of relative prices corresponding to the tth day. A portfolio b is an allocation\nof wealth in the stocks, specified by the proportions b = (b(1), . . . , b(m)) of current dollar\nwealth invested in each of the stocks, where b(j) \u2265 0 and $\\sum_j b(j) = 1$. 
The daily return\nof a portfolio b w.r.t. a market vector x is $b \\cdot x = \\sum_j b(j)x(j)$ and the (compound) total\nreturn, retX(b1, . . . , bn), of a sequence of portfolios b1, . . . , bn w.r.t. a market sequence\nX = x1, . . . , xn is $\\prod_{t=1}^{n} b_t \\cdot x_t$. A portfolio selection algorithm is any deterministic or\nrandomized rule for specifying a sequence of portfolios.\n\nThe simplest strategy is to \u201cbuy-and-hold\u201d stocks using some portfolio b. We denote\nthis strategy by BAHb and let U-BAH denote the uniform buy-and-hold when b =\n(1/m, . . . , 1/m). We say that a portfolio selection algorithm \u201cbeats the market\u201d when\nit outperforms U-BAH on a given market sequence although in practice \u201cthe market\u201d can\nbe represented by some non-uniform BAH (e.g. DJIA). Buy-and-hold strategies rely on the\ntendency of successful markets to grow. Much of modern portfolio theory focuses on how\nto choose a good b for the buy-and-hold strategy. The seminal ideas of Markowitz in [4]\nyield an algorithmic procedure for choosing the weights of the portfolio b so as to minimize\nthe variance for any feasible expected return. This variance minimization is possible\nby placing appropriate larger weights on subsets of anti-correlated stocks, an idea which\nwe shall also utilize. We denote the optimal in hindsight buy-and-hold strategy (i.e. invest\nonly in the best stock) by BAH\u2217.\n\nAn alternative approach to the static buy-and-hold is to dynamically change the portfolio\nduring the trading period. This approach is often called \u201cactive trading\u201d. One example\nof active trading is constant rebalancing; namely, fix a portfolio b and (re)invest your\ndollars each day according to b. We denote this constant rebalancing strategy by CBALb\nand let CBAL\u2217 denote the optimal (in hindsight) CBAL. 
A constant rebalancing strategy\ncan often take advantage of market fluctuations to achieve a return significantly greater\nthan that of BAH\u2217. CBAL\u2217 is always at least as good as the best stock BAH\u2217 and in some real\nmarket sequences a constant rebalancing strategy will take advantage of market fluctuations\nand significantly outperform the best stock (see Table 1). For now, consider Cover and\nGluss\u2019 [5] classic (but contrived) example of a market consisting of cash and one stock and\nthe market sequence of price relatives (1, 1/2), (1, 2), (1, 1/2), (1, 2), . . . Now consider the CBALb\nwith b = (1/2, 1/2). On each odd day the daily return of CBALb is (1/2) \u00b7 1 + (1/2) \u00b7 (1/2) = 3/4 and on\neach even day, it is 3/2. The total return over n days is therefore $(9/8)^{n/2}$, illustrating\nhow a constant rebalancing strategy can yield exponential returns in a \u201cno-growth market\u201d.\nUnder the assumption that the daily market vectors are observations of independent and\nidentically distributed (i.i.d.) random variables, it is shown in [6] that CBAL\u2217 performs\nat least as well (in the sense of expected total return) as the best online portfolio selection\nalgorithm. However, many studies (see e.g. [7]) argue that stock price sequences do have\nlong term memory and are not i.i.d.\n\nA non-traditional objective (in computational finance) is to develop online trading strategies\nthat are in some sense always guaranteed to perform well. Within a line of research\npioneered by Cover [5, 3, 2] one attempts to design portfolio selection algorithms that\ncan provably do well (in terms of their total return) with respect to some online or offline\nbenchmark algorithms. 
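The Cover and Gluss example above can be checked numerically. The following is a minimal sketch, not code from the paper: the function names (total_return, cover_gluss_market, cbal_return, bah_return) are hypothetical, and exact rational arithmetic is used only so that the comparison with the closed form is exact.

```python
from fractions import Fraction

def total_return(portfolios, market):
    # Compound total return: product over days of the daily return b_t . x_t
    wealth = Fraction(1)
    for b, x in zip(portfolios, market):
        wealth *= sum(bj * xj for bj, xj in zip(b, x))
    return wealth

def cover_gluss_market(n):
    # Cash (price relative 1) and one stock alternating 1/2, 2, 1/2, 2, ...
    half, two = Fraction(1, 2), Fraction(2)
    return [(Fraction(1), half if t % 2 == 0 else two) for t in range(n)]

def cbal_return(b, market):
    # Constant rebalancing: the same portfolio b is reinstated every day
    return total_return([b] * len(market), market)

def bah_return(b, market):
    # Buy-and-hold: each asset compounds on its own from its initial share
    growth = [Fraction(1)] * len(b)
    for x in market:
        growth = [g * xj for g, xj in zip(growth, x)]
    return sum(bj * gj for bj, gj in zip(b, growth))

n = 10
market = cover_gluss_market(n)
uniform = (Fraction(1, 2), Fraction(1, 2))
cbal = cbal_return(uniform, market)  # grows by a factor 9/8 every two days
bah = bah_return(uniform, market)    # neither asset grows over an even n
```

For an even number of days n the buy-and-hold wealth returns to its starting value, while the uniform CBAL compounds by 9/8 every two days, which is the effect that the rest of the paper tries to exploit.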
Two natural online benchmark algorithms are the uniform buy and\nhold U-BAH, and the uniform constant rebalancing strategy U-CBAL, which is CBALb with\nb = (1/m, . . . , 1/m). A natural offline benchmark is BAH\u2217 and a more challenging offline\nbenchmark is CBAL\u2217.\n\nCover and Ordentlich\u2019s Universal Portfolios algorithm [3, 2], denoted here by UNIVERSAL,\nwas proven to be universal against CBAL\u2217, in the sense that for every market sequence X of\nm stocks over n days, it guarantees a sub-exponential (indeed polynomial) ratio in n,\n\n$$\\mathrm{ret}_X(\\mathrm{CBAL}^{*}) / \\mathrm{ret}_X(\\mathrm{UNIVERSAL}) \\le O\\left(n^{\\frac{m-1}{2}}\\right) \\qquad (1)$$\n\nFrom a theoretical perspective this is surprising as the ratio is a polynomial in n (for fixed\nm) whereas CBAL\u2217 is capable of exponential returns. From a practical perspective, while the\nratio $n^{\\frac{m-1}{2}}$ is not very useful, the motivation that underlies the potential of CBAL algorithms\nis useful! We follow this motivation and develop a new algorithm which we call ANTICOR.\nBy attempting to systematically follow the constant rebalancing philosophy, ANTICOR is\ncapable of some extraordinary performance in the absence of transaction costs, or even\nwith very small transaction costs.\n\n2 Trying to Learn the Winners\n\nThe most direct approach to expert learning and portfolio selection is a \u201c(reward based)\nweighted average prediction\u201d algorithm which adaptively computes a weighted average of\nexperts by gradually increasing (by some multiplicative or additive update rule) the relative\nweights of the more successful experts. For example, in the context of the PS problem\nconsider the \u201cexponentiated gradient\u201d EG(\u03b7) algorithm proposed by Helmbold et al. 
[8].\nThe EG(\u03b7) algorithm computes the next portfolio to be\n\n$$b_{t+1}(j) = \\frac{b_t(j) \\exp\\{\\eta x_t(j)/(b_t \\cdot x_t)\\}}{\\sum_{k=1}^{m} b_t(k) \\exp\\{\\eta x_t(k)/(b_t \\cdot x_t)\\}}$$\n\nwhere \u03b7 is a \u201clearning rate\u201d parameter. EG was designed to greedily choose the best\nportfolio for yesterday\u2019s market xt while at the same time paying a penalty for moving\nfar from yesterday\u2019s portfolio. For a universal bound on EG, Helmbold et al. set\n$\\eta = 2 x_{min} \\sqrt{2(\\log m)/n}$ where xmin is a lower bound on any price relative.2 It is easy\nto see that as n increases, \u03b7 decreases to 0 so that we can think of \u03b7 as being very small in\norder to achieve universality. When \u03b7 = 0, the algorithm EG(\u03b7) degenerates to the uniform\nCBAL which is not a universal algorithm. It is also the case that if each day the price relatives\nfor all stocks were identical, then EG (as well as other PS algorithms) will converge to the\nuniform CBAL. Combining a small learning rate with a \u201creasonably balanced\u201d market we\nexpect the performance of EG to be similar to that of the uniform CBAL and this is confirmed\nby our experiments (see Table 1).3\n\nCover\u2019s universal algorithms adaptively learn each day\u2019s portfolio by increasing the weights\nof successful CBALs. The update rule for these universal algorithms is\n\n$$b_{t+1} = \\frac{\\int b \\cdot \\mathrm{ret}_t(\\mathrm{CBAL}_b) \\, d\\mu(b)}{\\int \\mathrm{ret}_t(\\mathrm{CBAL}_b) \\, d\\mu(b)},$$\n\nwhere \u00b5(\u00b7) is some prior distribution over portfolios. Thus, the weight of a possible portfolio\nis proportional to its total return rett(b) thus far times its prior. The particular universal\nalgorithm we consider in our experiments uses the Dirichlet prior (with parameters\n(1/2, . . . , 1/2)) [2]. Within a constant factor, this algorithm attains the optimal ratio (1) with\nrespect to CBAL\u2217.4 The algorithm is equivalent to a particular static distribution over the\n\n2Helmbold et al. 
show how to eliminate the need to know xmin and n. While EG can be made\nuniversal, its performance ratio is only sub-exponential (and not polynomial) in n.\n\n3Following Helmbold et al. we fix \u03b7 = 0.01 in our experiments.\n\n4Experimentally (on our datasets) there is a negligible difference between the uniform universal\nalgorithm in [3] and the above Dirichlet universal algorithm.\n\nclass of all CBALs. This equivalence helps to demystify the universality result and also\nshows that the algorithm can never outperform CBAL\u2217.\n\nA different type of \u201cwinner learning\u201d algorithm can be obtained from any sequence prediction\nstrategy. For each stock, a (soft) sequence prediction algorithm provides a probability\np(j) that the next symbol will be j \u2208 {1, . . . , m}. We view this as a prediction that stock\nj will have the best price relative for the next day and set bt+1(j) = p(j). We consider\npredictions made using the prediction component of the well-known Lempel-Ziv (LZ) lossless\ncompression algorithm [9]. This prediction component is nicely described in Langdon [10]\nand in Feder [11]. As a prediction algorithm, LZ is provably powerful in various senses.\nFirst it can be shown that it is asymptotically optimal with respect to any stationary and\nergodic finite order Markov source (Rissanen [12]). Moreover, Feder shows that LZ is also\nuniversal in a worst case sense with respect to the (offline) benchmark class of all finite\nstate prediction machines. To summarize, the common approach to devising PS algorithms\nhas been to attempt to learn winners using winner learning schemes.\n\n3 The Anticor Algorithm\n\nWe propose a different approach, motivated by the CBAL \u201cphilosophy\u201d. How can we interpret\nthe success of the uniform CBAL on the Cover and Gluss example of Sec. 1? 
Clearly,\nthe uniform CBAL here is taking advantage of price fluctuation by constantly transferring\nwealth from the high performing stock to the anti-correlated low performing stock. Even\nin a less contrived market, we should be able to take advantage when a stock is currently\noutperforming other stocks especially if this strong performance is anti-correlated with the\nperformance of these other stocks. Our ANTICORw algorithm considers a short market history\n(consisting of two consecutive \u201cwindows\u201d, each of w trading days) so as to model\nstatistical relations between each pair of stocks. Let\n\n$$LX_1 = (\\log(x_{t-2w+1}), \\ldots, \\log(x_{t-w}))^T \\quad \\text{and} \\quad LX_2 = (\\log(x_{t-w+1}), \\ldots, \\log(x_t))^T,$$\n\nwhere log(xk) denotes (log(xk(1)), . . . , log(xk(m))). Thus, LX1 and LX2 are the two\nvector sequences (equivalently, two w \u00d7 m matrices) constructed by taking the logarithm\nover the market subsequences corresponding to the time windows [t \u2212 2w + 1, t \u2212 w]\nand [t \u2212 w + 1, t], respectively. We denote the jth column of LXk by LXk(j). Let\n\u00b5k = (\u00b5k(1), . . . , \u00b5k(m)), be the vectors of averages of columns of LXk (that is,\n\u00b5k(j) = E{LXk(j)}). Similarly, let \u03c3k, be the vector of standard deviations of columns\nof LXk. The cross-correlation matrix (and its normalization) between column vectors in\nLX1 and LX2 are defined as:\n\n$$M_{cov}(i, j) = (LX_1(i) - \\mu_1(i))^T (LX_2(j) - \\mu_2(j));$$\n\n$$M_{cor}(i, j) = \\begin{cases} \\dfrac{M_{cov}(i, j)}{\\sigma_1(i)\\sigma_2(j)} & \\sigma_1(i), \\sigma_2(j) \\neq 0; \\\\ 0 & \\text{otherwise.} \\end{cases}$$\n\nMcor(i, j) \u2208 [\u22121, 1] measures the correlation between log-relative prices of stock i over\nthe first window and stock j over the second window. For each pair of stocks i and j we\ncompute claimi\u2192j, the extent to which we want to shift our investment from stock i to\nstock j. 
Namely, there is such a claim iff \u00b52(i) > \u00b52(j) and Mcor(i, j) > 0 in which case\nclaimi\u2192j = Mcor(i, j) + A(i) + A(j) where A(h) = |Mcor(h, h)| if Mcor(h, h) < 0,\nelse 0. Following our interpretation for the success of a CBAL, Mcor(i, j) > 0 is used\nto predict that stocks i and j will be correlated in consecutive windows (i.e. the current\nwindow and the next window based on the evidence for the last two windows) and\nMcor(h, h) < 0 predicts that stock h will be anti-correlated with itself over consecutive\nwindows. Finally, $b_{t+1}(i) = \\tilde{b}_t(i) + \\sum_{j \\neq i} [\\mathrm{transfer}_{j \\to i} - \\mathrm{transfer}_{i \\to j}]$ where\n$\\mathrm{transfer}_{i \\to j} = \\tilde{b}_t(i) \\cdot \\mathrm{claim}_{i \\to j} / \\sum_j \\mathrm{claim}_{i \\to j}$ and $\\tilde{b}_t$ is the resulting portfolio just after\nmarket closing (on day t).\n\nFigure 1: ANTICORw\u2019s total return (per $1 investment) vs. window size 2 \u2264 w \u2264 30 for\nNYSE (left) and SP500 (right).\n\nOur ANTICORw algorithm has one critical parameter, the window size w. In Figure 1 we\ndepict the total return of ANTICORw on two historical datasets as a function of the window\nsize w = 2, . . . , 30. As we might expect, the performance of ANTICORw depends significantly\non the window size. However, for all w, ANTICORw beats the uniform market and,\nmoreover, it beats the best stock using most window sizes. Of course, in online trading we\ncannot choose w in hindsight. Viewing the ANTICORw algorithms as experts, we can try to\nlearn the best expert. But the windows, like individual stocks, induce a rather volatile set\nof experts and standard expert combination algorithms [13] tend to fail.\n\nAlternatively, we can adaptively learn and invest in some weighted average of all ANTICORw\nalgorithms with w less than some maximum W . The simplest case is a uniform investment\non all the windows; that is, a uniform buy-and-hold investment on the algorithms\nANTICORw, w \u2208 [2, W ], denoted by BAHW (ANTICOR). 
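The single-window ANTICORw update described in this section (log windows, cross-correlation, claims, transfers) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the names column_stats and anticor_step are hypothetical, the covariance and standard deviations are normalized by w - 1 (an assumption made here so that the correlation lies in [-1, 1]), and the portfolio is simply held when fewer than 2w days of history exist (also an assumption).

```python
import math

def column_stats(LX):
    # Per-stock mean and sample standard deviation over one window (w >= 2)
    w, m = len(LX), len(LX[0])
    mu = [sum(row[j] for row in LX) / w for j in range(m)]
    sd = [math.sqrt(sum((row[j] - mu[j]) ** 2 for row in LX) / (w - 1))
          for j in range(m)]
    return mu, sd

def anticor_step(history, b_tilde, w):
    # history: market vectors x_1..x_t (price relatives), newest last
    # b_tilde: portfolio just after market closing on day t
    m = len(b_tilde)
    if len(history) < 2 * w:
        return list(b_tilde)  # assumption: hold until two full windows exist
    logs = [[math.log(v) for v in x] for x in history[-2 * w:]]
    LX1, LX2 = logs[:w], logs[w:]
    mu1, sd1 = column_stats(LX1)
    mu2, sd2 = column_stats(LX2)
    # mcor[i][j]: stock i over the first window vs. stock j over the second
    mcor = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if sd1[i] > 0 and sd2[j] > 0:
                cov = sum((LX1[k][i] - mu1[i]) * (LX2[k][j] - mu2[j])
                          for k in range(w)) / (w - 1)  # assumed normalization
                mcor[i][j] = cov / (sd1[i] * sd2[j])

    def a(h):
        # Extra claim weight when stock h is anti-correlated with itself
        return abs(mcor[h][h]) if mcor[h][h] < 0 else 0.0

    claim = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if i != j and mu2[i] > mu2[j] and mcor[i][j] > 0:
                claim[i][j] = mcor[i][j] + a(i) + a(j)
    # Move wealth out of stock i in proportion to its claims
    b_next = list(b_tilde)
    for i in range(m):
        total = sum(claim[i])
        if total > 0:
            for j in range(m):
                transfer = b_tilde[i] * claim[i][j] / total
                b_next[i] -= transfer
                b_next[j] += transfer
    return b_next
```

Because every unit of wealth transferred out of one stock is added to another, the returned vector still sums to one whenever b_tilde does. Note that, following the transfer formula in the text, a stock with any positive claim moves all of its current weight to the stocks it has claims on.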
Figure 2 (left) graphs the total return\nof BAHW (ANTICOR) as a function of W for all values of 2 \u2264 W \u2264 50 with respect to the\nNYSE dataset (see details below). Similar graphs for the other datasets we consider appear\nqualitatively the same and the choice W = 30 is clearly not optimal. However, for all\nW \u2265 3, BAHW (ANTICOR) beats the best stock in all our experiments.\n\nFigure 2: Left: BAHW (ANTICOR)\u2019s total return (per $1 investment) as a function of the\nmaximal window W . Right: Cumulative returns for last month of the DJIA dataset (Dec 14, 2002 \u2212 Jan 14, 2003): stocks\n(left panel); ANTICORw algorithms trading the stocks (denoted ANTICOR1, middle panel);\nANTICORw algorithms trading the ANTICOR algorithms (denoted ANTICOR2, right panel).\n\nSince we now consider the various algorithms as stocks (whose prices are determined by\nthe cumulative returns of the algorithms), we are back to our original portfolio selection\nproblem and if the ANTICOR algorithm performs well on stocks it may also perform well on\nalgorithms. We thus consider active investment in the various ANTICORw algorithms using\nANTICOR. We again consider all windows w \u2264 W . Of course, we can continue to compound\nthe algorithm any number of times. Here we compound twice and then use a buy-and-hold\ninvestment. 
The resulting algorithm is denoted BAHW (ANTICOR(ANTICOR)). One impact\nof this compounding, depicted in Figure 2 (right), is to smooth out the anti-correlations\nexhibited in the stocks. It is evident that after compounding twice the returns become\nalmost completely correlated thus diminishing the possibility that additional compounding\nwill substantially help.5 This idea for eliminating critical parameters may be applicable in\nother learning applications. The challenge is to understand the conditions and applications\nin which the process of compounding algorithms will have this smoothing effect!\n\n4 Experimental Results\n\nWe present an experimental study of the ANTICOR algorithm and the three online learning\nalgorithms described in Sec. 2. We focus on BAH30(ANTICOR), abbreviated by ANTI1\nand BAH30(ANTICOR(ANTICOR)), abbreviated by ANTI2. Four historical datasets are used.\nThe first, the NYSE dataset, is the one used in [3, 2, 8, 14]. This dataset contains 5651 daily\nprices for 36 stocks in the New York Stock Exchange (NYSE) for the twenty-two-year period\nJuly 3rd, 1962 to Dec 31st, 1984. The second, the TSE dataset, consists of 88 stocks from\nthe Toronto Stock Exchange (TSE), for the five-year period Jan 4th, 1994 to Dec 31st,\n1998. The third dataset consists of the 25 stocks from SP500 which (as of Apr. 2003) had\nthe largest market capitalization. This set spans 1276 trading days for the period Jan 2nd,\n1998 to Jan 31st, 2003. The fourth dataset consists of the thirty stocks composing the Dow\nJones Industrial Average (DJIA) for the two-year period (507 days) from Jan 14th, 2001 to\nJan 14th, 2003.6\n\nThese four datasets are quite different in nature (the market returns for these datasets appear\nin the first row of Table 1). While every stock in the NYSE increased in value, 32 of the\n88 stocks in the TSE lost money, 7 of the 25 stocks in the SP500 lost money and 25 of\nthe 30 stocks in the \u201cnegative market\u201d DJIA lost money. 
All these sets include only highly\nliquid stocks with huge market capitalizations. In order to maximize the utility of these\ndatasets and yet present rather different markets, we also ran each market in reverse. This\nis simply done by reversing the order and inverting the relative prices. The reverse datasets\nare denoted by a \u2018-1\u2019 superscript. Some of the reverse markets are particularly challenging.\nFor example, all of the NYSE\u22121 stocks are going down. Note that the forward and reverse\nmarkets (i.e. U-BAH) for the TSE are both increasing but that the TSE\u22121 is also a challenging\nmarket since so many stocks (56 of 88) are declining.\n\n5This smoothing effect also allows for the use of simple prediction algorithms such as \u201cexpert\nadvice\u201d algorithms [13], which can now better predict a good window size. We have not explored\nthis direction.\n\n6The four datasets, including their sources and individual stock compositions can be downloaded\nfrom http://www.cs.technion.ac.il/\u223crani/portfolios.\n\nTable 1 reports on the total returns of the various algorithms for all eight datasets. We see\nthat prediction algorithms such as LZ can do quite well but the more aggressive ANTI1 and\nANTI2 have excellent and sometimes fantastic returns. Note that these active strategies beat\nthe best stock and even CBAL\u2217 in all markets with the exception of the TSE\u22121 in which\nthey still significantly outperform the market. The reader may well be distrustful of what\nappears to be such unbelievable returns for ANTI1 and ANTI2 especially when applied to\nthe NYSE dataset. However, recall that the NYSE dataset consists of n = 5651 trading\ndays and the y such that $y^n$ = the total NYSE return is approximately 1.0029511 for ANTI1\n(respectively, 1.0074539 for ANTI2); that is, the average daily increase is less than .3%\n(respectively, .75%). 
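The per-day growth factor quoted above is the geometric mean of the total return, that is, the y solving y ** n = total return. A minimal sketch of that arithmetic for the ANTI1 entry follows; the function name daily_growth_factor is hypothetical, and the inputs are the NYSE figures reported in Table 1 (total return 17,059,811.56 over n = 5651 days).

```python
import math

def daily_growth_factor(total_return, n_days):
    # Solve y ** n_days == total_return for y (geometric mean daily factor)
    return math.exp(math.log(total_return) / n_days)

# ANTI1 on the NYSE dataset, using the figures reported in Table 1
y_anti1 = daily_growth_factor(17059811.56, 5651)
daily_increase_percent = (y_anti1 - 1.0) * 100.0
```

Working through the logarithm avoids taking a very large root directly, and it makes clear how a modest daily edge compounds into a very large total over several thousand trading days.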
Thus a transaction cost of 1% can present a significant challenge\nto such active trading strategies (see also Sec. 5). We observe that UNIVERSAL and EG\nhave no substantial advantage over U-CBAL. Some previous expositions of these algorithms\nhighlighted particular combinations of stocks where the returns significantly outperformed\nUNIVERSAL and the best stock. But the same can be said for U-CBAL.\n\nAlgorithm                 NYSE      TSE    SP500     DJIA   NYSE\u22121    TSE\u22121  SP500\u22121   DJIA\u22121\nMARKET (U-BAH)           14.49     1.61     1.34     0.76     0.11     1.67     0.87     1.43\nBEST STOCK               54.14     6.27     3.77     1.18     0.32    37.64     1.65     2.77\nCBAL\u2217                   250.59     6.77     4.06     1.23     2.86    58.61     1.91     2.97\nU-CBAL                   27.07     1.59     1.64     0.81     0.22     1.18     1.09     1.53\nANTI1            17,059,811.56    26.77     5.56     1.59   246.22     7.12     6.61     3.67\nANTI2           238,820,058.10    39.07     5.88     2.28  1383.78     7.27     9.69     4.60\nLZ                       79.78     1.32     1.67     0.89     5.41     4.80     1.20     1.83\nEG                       27.08     1.59     1.64     0.81     0.22     1.19     1.09     1.53\nUNIVERSAL                26.99     1.59     1.62     0.80     0.22     1.19     1.07     1.53\n\nTable 1: Monetary returns in dollars (per $1 investment) of various algorithms for four\ndifferent datasets and their reversed versions. The winner and runner-up for each market\nappear in boldface. All figures are truncated to two decimals.\n\n5 Concluding Remarks\n\nWhen handling a portfolio of m stocks our algorithm may perform up to m transactions\nper day. A major concern is therefore the commissions it will incur. Within\nthe proportional commission model (see e.g. [14] and [15], Sec. 14.5.4) there exists\na fraction \u03b3 \u2208 (0, 1) such that an investor pays at a rate of \u03b3/2 for each buy and\nfor each sell. Therefore, the return of a sequence b1, . . . , bn of portfolios with respect\nto a market sequence x1, . . . , xn is\n\n$$\\prod_t \\left( b_t \\cdot x_t \\left(1 - \\sum_j \\frac{\\gamma}{2} |b_t(j) - \\tilde{b}_t(j)|\\right) \\right),$$\n\nwhere $\\tilde{b}_t = \\frac{1}{b_t \\cdot x_t} (b_t(1)x_t(1), \\ldots, b_t(m)x_t(m))$. 
Our investment algorithm in its simplest form\ncan tolerate very small proportional commission rates and still beat the best stock.7 We\nnote that Blum and Kalai [14] showed that the performance guarantee of UNIVERSAL still\nholds (and gracefully degrades) in the case of proportional commissions. Many current\nonline brokers only charge a small per share commission rate. A related problem that one\nmust face when actually trading is the difference between bid and ask prices. These bid-ask\nspreads (and the availability of stocks for both buying and selling) are typically functions\nof stock liquidity and are typically smaller for large market capitalization stocks. We consider\nhere only very large market cap stocks. As a final caveat, we note that we assume\nthat any one portfolio selection algorithm has no impact on the market! But just like any\ngoose laying golden eggs, widespread use will soon lead to the end of the goose; that is,\nthe market will quickly react.\n\nAny report of abnormal returns using historical markets should be suspected of \u201cdata\nsnooping\u201d. In particular, when a dataset is excessively mined by testing many strategies\nthere is a substantial chance that one of the strategies will be successful by simple overfitting.\nAnother data snooping hazard is stock selection. For example, the 36 stocks selected\nfor the NYSE dataset were all known to have survived for 22 years. Our ANTICOR\nalgorithms were fully developed using only the NYSE and TSE datasets. The DJIA and\nSP500 sets were obtained (from public domain sources) after the algorithms were fixed.\nFinally, our algorithm has one parameter (the maximal window size W ). Our experiments\nindicate that the algorithm\u2019s performance is robust with respect to W (see Figure 2).\n\n7For example, with \u03b3 = 0.1% we can typically beat the best stock. 
These results will be presented\nin the full paper.\n\nA number of well-respected works report on statistically robust \u201cabnormal\u201d returns for\nsimple \u201ctechnical analysis\u201d heuristics, which slightly beat the market. For example, the\nlandmark study of Brock et al. [16] applies 26 simple trading heuristics to the DJIA index\nfrom 1897 to 1986 and provides strong support for technical analysis heuristics. While\nconsistently beating the market is considered a great (if not impossible) challenge, our\napproach to portfolio selection indicates that beating the best stock is an achievable goal.\nWhat is missing at this point in time is an analytical model which better explains why\nour active trading strategies are so successful. In this regard, we are investigating various\n\u201cstatistical adversary\u201d models along the lines suggested by [17, 18]. Namely, we would\nlike to show that an algorithm performs well (relative to some benchmark) for any market\nsequence that satisfies certain constraints on its empirical statistics.\n\nReferences\n\n[1] G. Lugosi. Lectures on prediction of individual sequences. URL:http://www.econ.upf.es/\u223clugosi/ihp.ps, 2001.\n\n[2] T.M. Cover and E. Ordentlich. Universal portfolios with side information. IEEE Transactions\non Information Theory, 42(2):348\u2013363, 1996.\n\n[3] T.M. Cover. Universal portfolios. Mathematical Finance, 1:1\u201329, 1991.\n\n[4] H. Markowitz. Portfolio Selection: Efficient Diversification of Investments. John Wiley and\nSons, 1959.\n\n[5] T.M. Cover and D.H. Gluss. Empirical Bayes stock market portfolios. Advances in Applied\nMathematics, 7:170\u2013181, 1986.\n\n[6] T.M. Cover and J.A. Thomas. Elements of Information Theory. John Wiley & Sons, Inc., 1991.\n\n[7] A. Lo and C. MacKinlay. A Non-Random Walk Down Wall Street. Princeton University Press,\n1999.\n\n[8] D.P. Helmbold, R.E. Schapire, Y. Singer, and M.K. Warmuth. 
Portfolio selection using multiplicative\nupdates. Mathematical Finance, 8(4):325\u2013347, 1998.\n\n[9] J. Ziv and A. Lempel. Compression of individual sequences via variable rate coding. IEEE\nTransactions on Information Theory, 24:530\u2013536, 1978.\n\n[10] G.G. Langdon. A note on the Lempel-Ziv model for compressing individual sequences. IEEE\nTransactions on Information Theory, 29:284\u2013287, 1983.\n\n[11] M. Feder. Gambling using a finite state machine. IEEE Transactions on Information Theory,\n37:1459\u20131465, 1991.\n\n[12] J. Rissanen. A universal data compression system. IEEE Transactions on Information Theory,\n29:656\u2013664, 1983.\n\n[13] N. Cesa-Bianchi, Y. Freund, D. Haussler, D.P. Helmbold, R.E. Schapire, and M.K. Warmuth.\nHow to use expert advice. Journal of the ACM, 44(3):427\u2013485, May 1997.\n\n[14] A. Blum and A. Kalai. Universal portfolios with and without transaction costs. Machine Learning,\n30(1):23\u201330, 1998.\n\n[15] A. Borodin and R. El-Yaniv. Online Computation and Competitive Analysis. Cambridge University\nPress, 1998.\n\n[16] L. Brock, J. Lakonishok, and B. LeBaron. Simple technical trading rules and the stochastic\nproperties of stock returns. Journal of Finance, 47:1731\u20131764, 1992.\n\n[17] P. Raghavan. A statistical adversary for on-line algorithms. DIMACS Series in Discrete Mathematics\nand Theoretical Computer Science, 7:79\u201383, 1992.\n\n[18] A. Chou, J.R. Cooperstock, R. El-Yaniv, M. Klugerman, and T. Leighton. The statistical adversary\nallows optimal money-making trading strategies. In Proceedings of the 6th Annual\nACM-SIAM Symposium on Discrete Algorithms, 1995.\n", "award": [], "sourceid": 2453, "authors": [{"given_name": "Allan", "family_name": "Borodin", "institution": null}, {"given_name": "Ran", "family_name": "El-Yaniv", "institution": null}, {"given_name": "Vincent", "family_name": "Gogan", "institution": null}]}