{"title": "Estimating Robust Query Models with Convex Optimization", "book": "Advances in Neural Information Processing Systems", "page_first": 329, "page_last": 336, "abstract": "Query expansion is a long-studied approach for improving retrieval effectiveness by enhancing the user\u00e2\u0080\u0099s original query with additional related terms. Current algorithms for automatic query expansion have been shown to consistently improve retrieval accuracy on average, but are highly unstable and have bad worst-case performance for individual queries. We introduce a novel risk framework that formulates query model estimation as a constrained metric labeling problem on a graph of term relations. Themodel combines assignment costs based on a baseline feedback algorithm, edge weights based on term similarity, and simple constraints to enforce aspect balance, aspect coverage, and term centrality. Results across multiple standard test collections show consistent and dramatic reductions in the number and magnitude of expansion failures, while retaining the strong positive gains of the baseline algorithm.", "full_text": "Estimating Robust Query Models\n\nwith Convex Optimization\n\nKevyn Collins-Thompson\u2217\n\nMicrosoft Research\n1 Microsoft Way\n\nRedmond, WA U.S.A. 98052\nkevynct@microsoft.com\n\nAbstract\n\nQuery expansion is a long-studied approach for improving retrieval effectiveness\nby enhancing the user\u2019s original query with additional related words. Current\nalgorithms for automatic query expansion can often improve retrieval accuracy\non average, but are not robust: that is, they are highly unstable and have poor\nworst-case performance for individual queries. To address this problem, we in-\ntroduce a novel formulation of query expansion as a convex optimization problem\nover a word graph. 
The model combines initial weights from a baseline feedback\nalgorithm with edge weights based on word similarity, and integrates simple\nconstraints to enforce set-based criteria such as aspect balance, aspect coverage,\nand term centrality. Results across multiple standard test collections show consistent\nand significant reductions in the number and magnitude of expansion failures,\nwhile retaining the strong positive gains of the baseline algorithm. Our approach\ndoes not assume a particular retrieval model, making it applicable to a broad class\nof existing expansion algorithms.\n\n1 Introduction\n\nA major goal of current information retrieval research is to develop algorithms that can improve\nretrieval effectiveness by inferring a more complete picture of the user\u2019s information need, beyond\nthat provided by the user\u2019s query text. A query model captures a richer representation of the context\nand goals of a particular information need. For example, in the language modeling approach to\nretrieval [9], a simple query model may be a unigram language model, with higher probability given\nto terms related to the query text. Once estimated, a query model may be used for such tasks as\nquery expansion, suggesting alternate query terms to the user, or personalizing search results [11].\nIn this paper, we focus on the problem of automatically inferring a query model from the top-ranked\ndocuments obtained from an initial query. This task is known as pseudo-relevance feedback or blind\nfeedback, because we do not assume any direct input from the user other than the initial query text.\nDespite decades of research, even state-of-the-art methods for inferring query models \u2013 and in particular,\npseudo-relevance feedback \u2013 still suffer from some serious drawbacks. First, past research\nefforts have focused largely on achieving good average performance, without regard for the stability\nof individual retrieval results. 
The result is that current models are highly unstable and have bad\nworst-case performance for individual queries. This is one significant reason that Web search engines\nstill make little or no use of automatic feedback methods. In addition, current methods do not\n\n\u2217This work was primarily done while the author was at the Language Technologies Institute, School of\n\nComputer Science, Carnegie Mellon University.\n\n\fadequately capture the relationships or tradeoffs between competing objectives, such as maximizing\nthe expected relevance weights of selected words versus the risks of those choices. This in turn leads\nto several problems.\nFirst, when term risk is ignored, the result will be less reliable algorithms for query models, as we\nshow in Section 3. Second, selection of expansion terms is typically done in a greedy fashion by\nrank or score, which ignores the properties of the terms as a set and leads to the problem of aspect\nimbalance, a major source of retrieval failures [2]. Third, few existing expansion algorithms can\noperate selectively; that is, automatically detect when a query is risky to expand, and then avoid or\nreduce expansion in such cases. The few algorithms we have seen that do attempt selective expansion\nare not especially effective, and rely on sometimes complex heuristics that are integrated in a way\nthat is not easy to untangle, modify or refine. Finally, for a given task there may be additional\nfactors that must be constrained, such as the computational cost of sending many expansion terms\nto the search engine. 
To our knowledge such situations are not handled by any current query model\nestimation methods in a principled way.\nTo remedy these problems, we need a better theoretical framework for query model estimation: one\nthat incorporates both risk and reward data about terms, that detects risky situations and expands\nselectively, that can incorporate arbitrary additional problem constraints such as a computational\nbudget, and that has fast practical implementations.\nOur solution is to develop a novel formulation of query model estimation as a convex optimization\nproblem [1], by casting the problem in terms of constrained graph labeling. Informally, we seek\nquery models that use a set of terms with high expected relevance but low expected risk. This idea\nhas close connections with models of risk in portfolio optimization [7]. An optimization approach\nfrees us from the need to provide a closed-form formula for term weighting. Instead, we specify a\n(convex) objective function and a set of constraints that a good query model should satisfy, letting\nthe solver do the work of searching the space of feasible query models. This approach gives a natural\nway to perform selective expansion: if there is no feasible solution to the optimization problem, we\ndo not attempt to expand the original query. More generally, it gives a very flexible framework for\nintegrating different criteria for expansion as optimization constraints or objectives.\nOur risk framework consists of two key parts. First, we seek to minimize an objective function that\nconsists of two criteria: term relevance, and term risk. Term risk in turn has two subcomponents:\nthe individual risk of a term, and the conditional risk of choosing one term given that we have already\nchosen another. Second, we specify constraints on what \u2018good\u2019 sets of terms should look like. These\nconstraints are chosen to address traditional reasons for query drift. 
With these two parts, we obtain\na simple convex program for solving for the relative term weights in a query model.\n\n2 Theoretical model\nOur aim in this section is to develop a constrained optimization program to find stable, effective\nquery models. Typically, our optimization will embody a basic tradeoff between wanting to use\nevidence that has strong expected relevance, such as expansion terms with high relevance model\nweights, and the risk or confidence in using that evidence. We begin by describing the objectives\nand constraints over term sets that might be of interest for estimating query models. We then describe\na set of (sometimes competing) constraints whose feasible set reflects query models that are likely to\nbe effective and reliable. Finally, we put all these together to form the convex optimization problem.\n\n2.1 Query model estimation as graph labeling\nWe can gain some insight into the problem of query model estimation by viewing the process of\nbuilding a query as a two-class labeling problem over terms. Given a vocabulary V , for each term\nt \u2208 V we decide to either add term t to the query (assign label \u20181\u2019 to the term), or to leave it out\n(assign label \u20180\u2019). The initial query terms are given a label of \u20181\u2019. Our goal is to find a function\nf : V \u2192 {0, 1} that classifies the finite set V of |V | = K terms, choosing one of the two labels for\neach term. The terms are typically related, so that the pairwise similarity \u03c3(i, j) between any two\nterms wi, wj is represented by the weight of the edge connecting wi and wj in the undirected graph\nG = (V, E), where E is the set of all edges. 
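As a concrete sketch of this setup (a toy vocabulary with hypothetical postings and a Jaccard-style similarity, not the paper's actual data or implementation), the term graph and a candidate binary labeling can be written as:

```python
from itertools import combinations

def jaccard(docs_i, docs_j):
    """Jaccard coefficient over the sets of documents containing each term."""
    inter = len(docs_i & docs_j)
    union = len(docs_i | docs_j)
    return inter / union if union else 0.0

# Toy data: term -> ids of top-ranked documents containing it (hypothetical).
postings = {
    "jaguar": {1, 2, 3},
    "car":    {2, 3, 4},
    "cat":    {1, 5},
}
vocab = sorted(postings)

# Undirected graph G = (V, E): edge weight sigma(i, j) is pairwise term similarity.
edges = {
    (wi, wj): jaccard(postings[wi], postings[wj])
    for wi, wj in combinations(vocab, 2)
}

# A labeling f: V -> {0, 1}; label 1 means "add this term to the query".
f = {"jaguar": 1, "car": 1, "cat": 0}

# Penalty for giving different labels to similar terms, with d(u, v) = 1 if u != v.
pair_penalty = sum(w for (wi, wj), w in edges.items() if f[wi] != f[wj])
```

Only the penalty side of the cost is shown here; the per-term assignment costs are discussed below.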
The cost function L(f ) captures our displeasure with a\ngiven f, according to how badly the following two criteria are violated by the labeling produced by f.\n\n\f[Figure 1 appears here.]\n\nFigure 1: Query model estimation as a constrained graph labeling problem using two labels (relevant,\nnon-relevant) on a graph of pairwise term relations. The square nodes X, Y, and Z represent\nquery terms, and circular nodes represent potential expansion terms. Dark nodes represent terms\nwith high estimated label weights that are likely to be added to the initial query. Additional constraints\ncan select sets of terms having desirable properties for stable expansion, such as a bias\ntoward relevant labels related to multiple query terms (right).\n\n\u2022 The cost ci:k gives the cost of labeling term ti with label k \u2208 {0, 1}.\n\n\u2022 The cost \u03c3i,j \u00b7 d(f (i), f (j)) gives the penalty for assigning labels f (i) and f (j) to items\ni and j when their similarity is \u03c3i,j. The function d(u, v) is a metric that is the same for\nall edges. Typically, similar items are expected to have similar labels, and thus a penalty is\nassigned to the degree this expectation is violated.\n\nFor this study, we assume a very simple metric in which d(i, j) = 1 if i \u2260 j and 0 otherwise. In\na probabilistic setting, finding the most probable labeling can be viewed as a form of maximum a\nposteriori (MAP) estimation over the Markov random field defined by the term graph.\nAlthough this problem is NP-hard for arbitrary configurations, various approximation algorithms\nexist that run in polynomial time by relaxing the constraints. Here we relax the condition that the\nlabels be integers in {0, 1} and allow real values in [0, 1]. A review of relaxations for the more\ngeneral metric labeling problem is given by Ravikumar and Lafferty [10]. 
The basic relaxation we\nuse is\n\nmaximize \u2211s;j cs;j xs;j + \u2211s,t;j,k \u03c3s,j;t,k xs;j xt;k (1)\nsubject to \u2211j xs;j = 1 for each s,\n0 \u2264 xs;j \u2264 1.\n\nThe variable xs;j denotes the assignment value of label j for term s. Our method obtains its initial\nassignment costs cs;j from a baseline feedback method, given an observed query and corresponding\nset of query-ranked documents. For our baseline expansion method, we use the strong default feedback\nalgorithm included in Indri 2.2, based on Lavrenko\u2019s Relevance Model [5]. Further details are\navailable in [4].\nIn the next section, we discuss how to specify values for cs;j and \u03c3s,j;t,k that make sense for query\nmodel estimation. For a two-label problem where j \u2208 {0, 1}, the values of xi for one label completely\ndetermine the values for the other, since they must sum to 1, so it suffices to optimize over\nonly the xi;1, and for simplicity we simply refer to xi instead of xi;1.\nOur goal is to find a set of weights x = (x1, . . . , xK) where each xi corresponds to the weight\nin the final query model of term wi and thus is the relative value of each word in the expanded\nquery. The graph labeling formulation may be interpreted as combining two natural objectives:\nthe first maximizes the expected relevance of the selected terms, and the second minimizes the\nrisk associated with the selection. We now describe each of these in more detail, followed by a\ndescription of additional set-based constraints that are useful for query expansion.\n\n\f2.2 Relevance objectives\nGiven an initial set of term weights from a baseline expansion method c = (c1, . . . , cK), the expected\nrelevance over the vocabulary V of a solution x is given by the weighted sum c \u00b7 x = \u2211k ck xk.\nEssentially, maximizing expected relevance biases the \u2018relevant\u2019 labels toward those words with the\nhighest ci values. 
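For the two-label case, a candidate relaxed model is just a vector x in [0, 1]^K, and the expected-relevance part of the objective is the dot product c · x. A minimal sketch (the weights below are hypothetical toy numbers, not baseline feedback output):

```python
def expected_relevance(c, x):
    """Expected relevance of a relaxed labeling x: the weighted sum sum_k c_k * x_k."""
    assert len(c) == len(x)
    return sum(ck * xk for ck, xk in zip(c, x))

# Hypothetical baseline weights c and two candidate relaxed labelings.
c = [0.9, 0.6, 0.2]
greedy = [1.0, 0.0, 0.0]  # bet everything on the top-scoring term
spread = [0.6, 0.5, 0.2]  # distribute weight across terms

# On relevance alone the greedy model scores slightly higher; the risk
# objective and set-based constraints described next are what favor spread.
```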
Other relevance objective functions are also possible, as long as they are convex.\nFor example, if c and x represent probability distributions over terms, then we could replace c \u00b7 x\nwith KL(c||x) as an objective, since KL-divergence is also convex in c and x.\nThe initial assignment costs (label values) c can be set using a number of methods, depending on\nhow scores from the baseline expansion model are normalized. In the case of Indri\u2019s language\nmodel-based expansion, we are given estimates of the Relevance Model p(w|R) over the highest-ranking\nk documents1. We can also estimate a non-relevance model p(w|N ) using the collection to\napproximate non-relevant documents, or using the lowest-ranked k documents out of the top 1000\nretrieved by the initial query Q. To set cs:1, we first compute p(R|w) for each word w via Bayes\u2019\nTheorem,\n\np(R|w) = p(w|R) / (p(w|R) + p(w|N ))\n\n(2)\n\nassuming p(R) = p(N ) = 1/2. Using the notation p(R|Q) and p(R| \u00afQ) to denote our belief that\nany query word or non-query word respectively should have label 1, the initial expected label value\nis then\n\ncs:1 = p(R|Q) + (1 \u2212 p(R|Q)) \u00b7 p(R|ws) if s \u2208 Q; cs:1 = p(R| \u00afQ) \u00b7 p(R|ws) if s \u2209 Q\n\n(3)\n\nfor the \u2018relevant\u2019 label. We use p(R|Q) = 0.75 and p(R| \u00afQ) = 0.5. Since the label values must sum\nto one, for binary labels we have cs:0 = 1 \u2212 cs:1.\n\n2.3 Risk objectives\nOptimizing for expected term relevance only considers one dimension of the problem. A second\ncritical objective is minimizing the risk associated with a particular term labeling. We adapt an\ninformal definition of risk here in which the variance of the expected relevance is a proxy for uncertainty,\nencoded in the matrix \u03a3 with entries \u03c3ij. Using a betting analogy, the weights x = {xi}\nrepresent wagers on the utility of the query model terms. A risky strategy would place all bets on the\nsingle term with highest relevance score. 
A lower-risk strategy would distribute bets among terms\nthat had both a large estimated relevance and low redundancy, to cover all aspects of the query.\n\nConditional term risk. First, we consider the conditional risk \u03c3ij between pairs of terms wi and\nwj. To quantify conditional risk, we measure the redundancy of choosing word wi given that wj\nhas already been selected. This relation is expressed by choosing a symmetric similarity measure\n\u03c3(wi, wj) between wi and wj, which is rescaled into a distance-like measure d(wi, wj) with the\nformula\n\n\u03c3ij = d(wi, wj) = \u03b3 exp(\u2212\u03c1 \u00b7 \u03c3(wi, wj))\n\n(4)\n\nThe quantities \u03b3 and \u03c1 are scaling constants that depend on the output scale of \u03c3, and the choice\nof \u03b3 also controls the relative importance of individual vs. conditional term risk. In this study, our\n\u03c3(wi, wj) measure is based on term associations over the 2 \u00d7 2 contingency table of term document\ncounts. For this experiment we used the Jaccard coefficient; future work will examine others.\n\nIndividual risk. We say that a term related to multiple query terms exhibits term centrality. Previous\nwork has shown that central terms are more likely to be effective for expansion than terms\nrelated to few query terms [3] [12]. We use term centrality to quantify a term\u2019s individual risk, and\ndefine it for a term wi in terms of the vector di of all similarities of wi with all query terms. 
The\ncovariance matrix \u03a3 then has diagonal entries\n\n\u03c3ii = ||di||\u00b2 = \u2211wq\u2208Q d\u00b2(wi, wq)\n\n(5)\n\n1We use the symbols R and N to represent relevance and non-relevance respectively.\n\n\f[Figure 2 appears here: (a) Aspect balance, (b) Aspect coverage, (c) Term centering.]\n\nFigure 2: Three complementary criteria for expansion term weighting on a graph of candidate terms,\nand two query terms X and Y. The aspect balance constraint (left) prefers sets of expansion terms\nthat balance the representation of X and Y. The aspect coverage constraint (center) increases recall\nby allowing more expansion candidates within a distance threshold of each term. Term centering\n(right) prefers terms near the center of the graph, and thus more likely to be related to both terms,\nwith minimum variation in the distances to X and Y.\n\nOther definitions of centrality are certainly possible, e.g. depending on generative assumptions for\nterm distributions.\nWe can now combine relevance and risk into a single objective, and control the tradeoff with a single\nparameter \u03ba, by minimizing the function\n\nL(x) = \u2212cT x + (\u03ba/2) xT \u03a3x.\n\n(6)\n\nIf \u03a3 is estimated from term co-occurrence data in the top-retrieved documents, then the condition\nto minimize xT \u03a3x also encodes the fact that we want to select expansion terms that are not all in\nthe same co-occurrence cluster. 
Rather, we prefer a set of expansion terms that are more diverse,\ncovering a larger range of potential topics.\n\n2.4 Set-based constraints\nOne limitation of current query model estimation methods is that they typically make greedy term-by-term\ndecisions using a threshold, without considering the qualities of the set of terms as a whole.\nA one-dimensional greedy selection by term score, especially for a small number of terms, has the\nrisk of emphasizing terms related to one aspect and not others. This in turn increases the risk of\nquery drift after expansion. We now define several useful constraints on query model terms: aspect\nbalance, aspect coverage, and query term support. Figure 2 gives graphical examples of aspect\nbalance, aspect coverage, and the term centrality objective.\n\nAspect balance. We make the simplistic assumption that each of a query\u2019s terms represents a\nseparate and unique aspect of the user\u2019s information need. We create the matrix A from the vectors\n\u03c6k(wi) for each query term qk, by setting Aki = \u03c6k(wi) = \u03c3ik. In effect, Ax gives the projection\nof the solution model x on each query term\u2019s feature vector \u03c6k. We define the requirement that x be\nin balance to be that the vector Ax be element-wise close to the mean vector \u00b5 of the \u03c6k, within a\ntolerance \u03b6\u00b5, which we denote (with some flexibility in notation) by\n\nAx \u2aaf \u00b5 + \u03b6\u00b5.\n\n(7)\n\nTo demand an exact solution, we set \u03b6\u00b5 = 0. In reality, some slack is desirable for slightly better\nresults, and so we use a small positive value for \u03b6\u00b5 such as 1.0.\nQuery term support. Another important constraint is that the set of initial query terms Q be\npredicted by the solution labeling. 
We express this mathematically by requiring that the weights\nfor the \u2018relevant\u2019 label on the query terms xi:1 lie in a range li \u2264 xi \u2264 ui and, in particular, be above\nthe threshold li for xi \u2208 Q. Currently li is set to a default value of 0.95 for all query terms, and zero\nfor all other terms. ui is set to 1.0 for all terms. Term-specific values for li may also be desirable to\nreflect the rarity or ambiguity of individual query terms.\n\n\fminimize \u2212cT x + (\u03ba/2) xT \u03a3x    Relevance, term centrality & risk (9)\nsubject to Ax \u2aaf \u00b5 + \u03b6\u00b5    Aspect balance (10)\ngiT x \u2265 \u03b6i, wi \u2208 Q    Aspect coverage (11)\nli \u2264 xi \u2264 ui, i = 1, . . . , K    Query term support, positivity (12)\n\nFigure 3: The basic constrained quadratic program QMOD used for query model estimation.\n\nAspect coverage. One of the strengths of query expansion is its potential for solving the vocabulary\nmismatch problem by finding different words to express the same information need. Therefore,\nwe can also require a minimal level of aspect coverage. That is, we may require more than just that\nterms are balanced evenly among all query terms: we may care about the absolute level of support\nthat exists. For example, suppose our information sources are feedback terms, and we have two\npossible term weightings that are otherwise feasible solutions. The first weighting has only enough\nterms selected to give a minimal non-zero but even covering to all aspects. The second weighting\nscheme has three times as many terms, but also gives an even covering. Assuming no conflicting\nconstraints such as maximum query length, we may prefer the second weighting because it increases\nthe chance we find the right alternate words for the query, potentially improving recall.\nWe denote the set of distances to neighboring words of query term qi by the vector gi. 
The projection giT x gives us the aspect coverage, or how well the words selected by the solution x \u2018cover\u2019 term\nqi. The more expansion terms near qi that are given higher weights, the larger this value becomes.\nWhen only the query term is covered, the value of giT x = \u03c3ii. We want the aspect coverage for\neach of the vectors gi to exceed a threshold \u03b6i, and this is expressed by the constraint\n\ngiT x \u2265 \u03b6i.\n\n(8)\n\nPutting together the relevance and risk objectives, and constraining by the set properties, results in\nthe following complete quadratic program for query model estimation, which we call QMOD and\nshow in Figure 3. The role of each constraint is given in italics.\n\n3 Evaluation\n\nIn this section we summarize the effectiveness of using the QMOD convex programs to estimate\nquery models, and examine how well the QMOD feasible set is calibrated to the empirical risk of\nexpansion. For space reasons we are unable to include a complete sensitivity analysis of the effect\nof the various constraints. The best risk-reward tradeoff is generally obtained with a strong query\nsupport constraint (li near 1.0) and moderate balance between individual and conditional term risk.\nWe used the following default values for the control parameters: \u03ba = 1.0, \u03b3 = 0.75, \u03b6\u00b5 = 1.0,\n\u03b6i = 0.1, ui = 1.0, and li = 0.95 for query terms and li = 0 for non-query terms.\n\n3.1 Robustness of Model Estimation\nIn this section we evaluate the robustness of the query models estimated using the convex program\nin Fig. 3 over several TREC collections. We created a histogram of MAP improvement across sets\nof topics. This is a fine-grained look that shows the distribution of gain or loss in MAP for a given\nfeedback method. Using these histograms we can distinguish between two systems that might have\nthe same number of failures, but which help or hurt queries by very different magnitudes. 
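To make the QMOD program of Figure 3 concrete, the sketch below evaluates the objective L(x) = -c·x + (κ/2)x'Σx and checks the aspect balance, aspect coverage, and box constraints for a candidate model; if no candidate is feasible, selective expansion reverts to the original query. All matrices and thresholds here are hypothetical toy values, not the tuned settings reported in this section.

```python
def qmod_objective(c, Sigma, x, kappa=1.0):
    """L(x) = -c.x + (kappa/2) x' Sigma x: relevance reward plus quadratic risk penalty."""
    K = len(x)
    linear = -sum(c[i] * x[i] for i in range(K))
    quad = sum(Sigma[i][j] * x[i] * x[j] for i in range(K) for j in range(K))
    return linear + 0.5 * kappa * quad

def qmod_feasible(x, A, mu, zeta_mu, G, zeta, lo, hi):
    """Check QMOD constraints: aspect balance (Ax <= mu + zeta_mu element-wise),
    aspect coverage (g_i . x >= zeta_i per query term), and bounds l_i <= x_i <= u_i."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    balance = all(dot(row, x) <= m + zeta_mu for row, m in zip(A, mu))
    coverage = all(dot(g, x) >= z for g, z in zip(G, zeta))
    box = all(l <= xi <= u for l, xi, u in zip(lo, x, hi))
    return balance and coverage and box

# Toy instance with two query terms. Rows of A and G hold each query term's
# similarity profile over the vocabulary (hypothetical numbers).
c = [0.8, 0.7]
Sigma = [[0.3, 0.1], [0.1, 0.3]]
A = G = [[1.0, 0.2], [0.2, 1.0]]
mu, zeta_mu = [0.7, 0.7], 0.3
zeta = [0.2, 0.2]
lo, hi = [0.5, 0.5], [1.0, 1.0]

balanced = [0.8, 0.8]   # even support for both aspects: satisfies all constraints
lopsided = [1.0, 0.5]   # overweights the first aspect: violates aspect balance
```

In this framing, selective expansion is simply the feasibility test: among feasible candidates, the one minimizing `qmod_objective` is kept; if none is feasible, the original query is used unchanged.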
The\nnumber of queries helped or hurt by expansion is shown, binned by the loss or gain in average\nprecision by using feedback. The baseline feedback here was Indri 2.2 (Modified Relevance Model\nwith stoplist) [8]. The robustness histogram with results combined for all collections is shown in\nFig. 4. Both algorithms achieve the same gain in average precision over all collections (15%). Yet,\nconsidering the expansion failures whose loss in average precision is more than 10%, the robust\nversion hurts more than 60% fewer queries.\n\n\f[Figure 4 appears here: (a) Queries hurt, (b) Queries helped.]\n\nFigure 4: Comparison of expansion robustness for four TREC collections combined (TREC 1&2,\nTREC 7, TREC 8, wt10g). The histograms show counts of queries, binned by percent change\nin average precision. The dark bars show robust expansion performance using the QMOD convex\nprogram with default control parameters. The light bars show baseline expansion performance using\nterm relevance weights only. Both methods improve average precision by an average of 15%, but\nthe robust version hurts significantly fewer queries, as evidenced by the greatly reduced tail on the left\nhistogram (queries hurt).\n\n3.2 Calibration of Feasible Set\nIf the constraints of a convex program are well-designed for stable query expansion, the odds of an\ninfeasible solution should be much greater than 50% for queries that are risky. In those cases, the\nalgorithm will not attempt to enhance the query. 
Conversely, the odds of finding a feasible query\nmodel should ideally increase for those queries that are more amenable to expansion. Overall, 17%\nof all queries had infeasible programs. We binned these queries according to the actual gain or loss\nthat would have been achieved with the baseline expansion, normalized by the original number of\nqueries appearing in each bin when the (non-selective) baseline expansion is used. This gives the\nlog-odds of reverting to the original query for any given gain/loss level.\nThe results are shown in Figure 5. As predicted, the QMOD algorithm is more likely to decide\ninfeasibility for the high-risk zones at the extreme ends of the scale. Furthermore, the odds of finding\na feasible solution do indeed increase directly with the actual benefits of using expansion, up to a\npoint where we reach an average precision gain of 75% and higher. At this point, such high-reward\nqueries are considered high risk by the algorithm, and the likelihood of reverting to the original\nquery increases dramatically again. This analysis makes clear that the selective expansion behavior\nof the convex algorithm is well-calibrated to the true expansion benefit.\n\n4 Conclusions\n\nWe have presented a new research approach to query model estimation, showing how to adapt convex\noptimization methods to the problem by casting it as constrained graph labeling. By integrating\nrelevance and risk objectives with additional constraints to selectively reduce expansion for the most\nrisky queries, our approach is able to significantly reduce the downside risk of a strong baseline\nalgorithm while retaining its strong gains in average precision.\nOur expansion framework is quite general and easily accommodates further extensions and refinements. 
For example, similar to methods used for portfolio optimization [6], we can assign a computational\ncost to each term having non-zero weight, and add budget constraints to prefer more efficient\nexpansions. In addition, sensitivity analysis of the constraints is likely to provide useful information\nfor active learning; interesting extensions to semi-supervised learning are possible to incorporate\nadditional observations such as relevance feedback from the user. Finally, there are a number of\n\n\fFigure 5: The log-odds of reverting to the original query as a result of selective expansion. Queries\nare binned by the percent change in average precision if baseline expansion were used. Columns\nabove the line indicate greater-than-even odds that we revert to the original query.\n\nhigher-level control parameters, and it would be interesting to determine the optimal settings. The\nvalues we use have not been extensively tuned, so further performance gains may be possible.\n\nAcknowledgments\nWe thank Jamie Callan, John Lafferty, William Cohen, and Susan Dumais for their valuable feedback\non many aspects of this work.\n\nReferences\n[1] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.\n\n[2] C. Buckley. Why current IR engines fail. In Proceedings of the 27th Annual International ACM SIGIR\nConference on Research and Development in Information Retrieval (SIGIR 2004), pages 584\u2013585, 2004.\n\n[3] K. Collins-Thompson and J. Callan. Query expansion using random walk models. In Proc. of the 14th\nInternational Conf. on Information and Knowledge Management (CIKM 2005), pages 704\u2013711, 2005.\n\n[4] K. Collins-Thompson and J. Callan. Estimation and use of uncertainty in pseudo-relevance feedback. In\nProceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in\nInformation Retrieval (SIGIR 2007), pages 303\u2013310, 2007.\n\n[5] V. Lavrenko. A Generative Theory of Relevance. 
PhD thesis, Univ. of Massachusetts, Amherst, 2004.\n\n[6] M. S. Lobo, M. Fazel, and S. Boyd. Portfolio optimization with linear and fixed transaction costs. Annals\nof Operations Research, 152(1):376\u2013394, 2007.\n\n[7] H. M. Markowitz. Portfolio selection. Journal of Finance, 7(1):77\u201391, 1952.\n\n[8] D. Metzler and W. B. Croft. Combining the language model and inference network approaches to retrieval.\nInformation Processing and Management, 40(5):735\u2013750, 2004.\n\n[9] J. M. Ponte and W. B. Croft. A language modeling approach to information retrieval. In Proc. of the 1998\nACM SIGIR Conference on Research and Development in Information Retrieval, pages 275\u2013281, 1998.\n\n[10] P. Ravikumar and J. Lafferty. Quadratic programming relaxations for metric labeling and Markov random\nfield MAP estimation. In Proceedings of the 23rd International Conference on Machine Learning (ICML\n2006), pages 737\u2013744, 2006.\n\n[11] J. Teevan, S. T. Dumais, and E. Horvitz. Personalizing search via automated analysis of interests and\nactivities. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and\nDevelopment in Information Retrieval (SIGIR 2005), pages 449\u2013456, New York, NY, USA, 2005. ACM.\n\n[12] J. Xu and W. B. Croft. Query expansion using local and global document analysis. In Proceedings of\nthe 1996 Annual International ACM SIGIR Conference on Research and Development in Information\nRetrieval, pages 4\u201311, 1996.\n\n\f", "award": [], "sourceid": 813, "authors": [{"given_name": "Kevyn", "family_name": "Collins-thompson", "institution": null}]}