{"title": "Bayesian Image Super-resolution, Continued", "book": "Advances in Neural Information Processing Systems", "page_first": 1089, "page_last": 1096, "abstract": null, "full_text": "Bayesian Image Super-resolution, Continued\n\nLyndsey C. Pickup, David P. Capel\u2020, Stephen J. Roberts Andrew Zisserman\n\nInformation Engineering Building, Dept. of Eng. Science, Parks Road, Oxford, OX1 3PJ, UK\n\n{elle,sjrob,az}@robots.ox.ac.uk\n\n\u2020 2D3, d.capel@2d3.com\n\nAbstract\n\nThis paper develops a multi-frame image super-resolution approach from a\nBayesian view-point by marginalizing over the unknown registration parameters\nrelating the set of input low-resolution views. In Tipping and Bishop\u2019s Bayesian\nimage super-resolution approach [16], the marginalization was over the super-\nresolution image, necessitating the use of an unfavorable image prior. By inte-\ngrating over the registration parameters rather than the high-resolution image, our\nmethod allows for more realistic prior distributions, and also reduces the dimen-\nsion of the integral considerably, removing the main computational bottleneck of\nthe other algorithm. In addition to the motion model used by Tipping and Bishop,\nillumination components are introduced into the generative model, allowing us\nto handle changes in lighting as well as motion. We show results on real and\nsynthetic datasets to illustrate the ef\ufb01cacy of this approach.\n\n1 Introduction\n\nMulti-frame image super-resolution refers to the process by which a group of images of the same\nscene are fused to produce an image or images with a higher spatial resolution, or with more visible\ndetail in the high spatial frequency features [7]. 
Such problems are common, with everything from\nholiday snaps and DVD frames to satellite terrain imagery providing collections of low-resolution\nimages to be enhanced, for instance to produce a more aesthetic image for media publication [15],\nor for higher-level vision tasks such as object recognition or localization [5].\n\nThe resolution limits of the original imaging device can be surpassed by exploiting the relative\nsub-pixel motion between the scene and the imaging plane. No matter how accurate the registration\nestimate, there will be some residual uncertainty associated with the parameters [13]. We propose a\nscheme to deal with this uncertainty by integrating over the registration parameters, and demonstrate\nimproved results on synthetic and real digital image data.\n\nImage registration and super-resolution are often treated as distinct processes, to be considered\nsequentially [1, 3, 7]. Hardie et al. demonstrated that the low-resolution image registration can be\nupdated using the super-resolution image estimate, and that this improves a Maximum a Posteriori\n(MAP) super-resolution image estimate [5]. More recently, Pickup et al. used a similar joint MAP\napproach to learn more general geometric and photometric registrations, the super-resolution image,\nand values for the prior's parameters simultaneously [12]. Tipping and Bishop's Bayesian image\nsuper-resolution work [16] uses a Maximum Likelihood (ML) point estimate of the registration\nparameters and the camera imaging blur, found by integrating the high-resolution image out of the\nregistration problem and optimizing the marginal probability of the observed low-resolution images\ndirectly. 
This gives an improvement in the accuracy of the recovered registration (measured against\nknown truth on synthetic data) compared to the MAP approach.\n\nThe image-integrating Bayesian super-resolution method [16] is extremely costly in terms of\ncomputation time, requiring operations that scale with the cube of the total number of high-resolution\npixels, severely limiting the size of the image patches over which they perform the registration (they\nuse 9 × 9 pixel patches). The marginalization also requires a form of prior on the super-resolution\nimage that renders the integral tractable, though priors such as Tipping and Bishop's chosen Gaussian\nform are known to be poor for tasks such as edge preservation, and much super-resolution work\nhas employed other more favorable priors [2, 3, 4, 11, 14].\n\nIt is generally more desirable to integrate over the registration parameters rather than the super-resolution\nimage, because it is the registration that constitutes the “nuisance parameters”, and the\nsuper-resolution image that we wish to estimate. We derive a new view of Bayesian image super-resolution\nin which a MAP high-resolution image estimate is found by marginalizing over the\nuncertain registration parameters. Memory requirements are considerably lower than the image-integrating\ncase; while the algorithm is more costly than a simple MAP super-resolution estimate, it\nis not infeasible to run on images of several hundred pixels in size.\n\nSections 2 and 3 develop the model and the proposed objective function. Section 4 evaluates\nresults on synthetically-generated sequences (with ground truth for comparison), and on a real data\nexample. A discussion of this approach and concluding remarks can be found in section 5.\n\n2 Generative model\n\nThe generative model for multi-frame super-resolution assumes a known scene x (vectorized, size\nN × 1), and a given registration vector θ^{(k)}. 
These are used to generate a vectorized low-resolution\nimage y^{(k)} with M pixels through a system matrix W^{(k)}. Gaussian i.i.d. noise with precision β is\nthen added to y^{(k)},\n\ny^{(k)} = λ_α^{(k)} W(θ^{(k)}) x + λ_β^{(k)} + ε^{(k)}, (1)\n\nε^{(k)} ~ N(0, β^{−1} I). (2)\n\nPhotometric parameters λ_α and λ_β provide a global affine correction for the scene illumination, and\nλ_β is simply an M × 1 vector filled out with the value of λ_β. Each row of W^{(k)} constructs a single\npixel in y^{(k)}, and the row's entries are the vectorized point-spread function (PSF) response for that\nlow-resolution pixel, expressed in the frame of the super-resolution image [2, 3, 16]. The PSF is\nusually assumed to be an isotropic Gaussian on the imaging plane, though for some motion models\n(e.g. planar projective) this does not necessarily lead to a Gaussian distribution on the frame of x.\n\nFor an individual low-resolution image, given registrations and x, the data likelihood is\n\np(y^{(k)} | x, θ^{(k)}, λ^{(k)}) = (β/2π)^{M/2} exp{ −(β/2) ||y^{(k)} − λ_α^{(k)} W(θ^{(k)}) x − λ_β^{(k)}||_2^2 }. (3)\n\nWhen the registration is known approximately, for instance by pre-registering inputs, the uncertainty\ncan be modeled as a Gaussian perturbation about the mean estimate θ̄^{(k)} for each image's parameter\nset, with covariance C, which we restrict to be a diagonal matrix,\n\n[θ^{(k)}; λ_α^{(k)}; λ_β^{(k)}] = [θ̄^{(k)}; λ̄_α^{(k)}; λ̄_β^{(k)}] + δ^{(k)}, (4)\n\nδ^{(k)} ~ N(0, C), (5)\n\np(θ^{(k)}, λ^{(k)}) = (|C^{−1}|/(2π)^n)^{1/2} exp{ −(1/2) δ^{(k)T} C^{−1} δ^{(k)} }. (6)\n\nA Huber prior is assumed for the directional image gradients Dx in the super-resolution image x\n(in the horizontal, vertical, and two diagonal directions),\n\np(x) = (1/Z_x) exp{ −(ν/2) ρ(Dx, α) }, (7)\n\nρ(z, α) = z^2 if |z| < α; 2α|z| − α^2 otherwise, (8)\n\nwhere α is a parameter of the Huber potential function, and ν is the prior strength parameter. This\nbelongs to a family of functions often favored over Gaussians for super-resolution image priors [2,\n3, 14] because the Huber distribution's heavy tails mean image edges are penalized less severely.\nThe difficulty in computing the partition function Z_x is a consideration when marginalizing over x\nas in [16], though for the MAP image estimate, a value for this scale factor is not required.\n\nRegardless of the exact forms of these probability distributions, probabilistic super-resolution\nalgorithms can usually be interpreted in one of the following ways.\n\nThe most popular approach to super-resolution is to obtain a MAP estimate, typically using an\niterative scheme to maximize p(x | {y^{(k)}, θ^{(k)}, λ^{(k)}}) with respect to x, where\n\np(x | {y^{(k)}, θ^{(k)}, λ^{(k)}}) = p(x) ∏_{k=1}^{K} p(y^{(k)} | x, θ^{(k)}, λ^{(k)}) / p({y^{(k)}} | {θ^{(k)}, λ^{(k)}}), (9)\n\nand the denominator is an unknown scaling factor.\n\nTipping and Bishop's approach takes an ML estimate of the registration by marginalizing over x,\nthen calculates the super-resolution estimate as in (9). 
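The Huber potential of (8) is straightforward to state in code; the following sketch is our illustrative Python (not the authors' implementation), vectorized over an array of gradient values:

```python
import numpy as np

# Sketch of the Huber potential rho(z, alpha) from Eq. (8): quadratic for
# |z| < alpha, linear beyond, which gives the prior the heavy tails that
# penalize image edges less severely than a Gaussian would.
def huber(z, alpha):
    z = np.asarray(z, dtype=float)
    return np.where(np.abs(z) < alpha,
                    z ** 2,
                    2.0 * alpha * np.abs(z) - alpha ** 2)
```

The two branches agree at |z| = α (both give α²), so the potential is continuous, and the negative log-prior of (7) is ν/2 times the sum of this potential over the directional gradients Dx.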
While Tipping and Bishop did not include a\nphotometric model, the equivalent expression to be maximized with respect to θ and λ is\n\np({y^{(k)}} | {θ^{(k)}, λ^{(k)}}) = ∫ p(x) ∏_{k=1}^{K} p(y^{(k)} | x, θ^{(k)}, λ^{(k)}) dx. (10)\n\nNote that Tipping and Bishop's work does employ the same data likelihood expression as in (3),\nwhich forced them to select a Gaussian form for p(x), rather than a more suitable image prior, in\norder to keep the integral tractable.\n\nFinally, in this paper we find x through marginalizing over θ and λ, so that a MAP estimate of x can\nbe obtained by maximizing p(x | {y^{(k)}}) directly with respect to x. This is achieved by finding\n\np(x | {y^{(k)}}) = (p(x) / p({y^{(k)}})) ∫ ∏_{k=1}^{K} p(θ^{(k)}, λ^{(k)}) p(y^{(k)} | x, θ^{(k)}, λ^{(k)}) d{θ, λ}, (11)\n\nwhich is developed further in the next section. Note that the integral does not involve the prior, p(x).\n\n3 Marginalizing over registration parameters\n\nIn order to obtain an expression for p(x | {y^{(k)}}) from expressions (3), (6) and (7) above, the\nparameter variations δ^{(k)} must be integrated out of the problem. Registration estimates θ̄^{(k)}, λ̄_α^{(k)}\nand λ̄_β^{(k)} can be obtained using classical registration methods, either intensity-based [8] or estimation\nfrom image points [6], and the diagonal matrix C is constructed to reflect the confidence in each\nparameter estimate. 
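For concreteness, the forward model of (1) and the log of the likelihood (3), which the marginalization below evaluates repeatedly, can be sketched as follows (our illustrative Python, with a small dense W standing in for the true PSF-based system matrix):

```python
import numpy as np

# Sketch of Eq. (1): y = lam_a * W x + lam_b + noise, noise ~ N(0, 1/beta).
# W is a hypothetical M x N stand-in for the PSF-based system matrix.
def generate_low_res(x, W, lam_a, lam_b, beta, rng):
    noise = rng.normal(0.0, beta ** -0.5, size=W.shape[0])
    return lam_a * (W @ x) + lam_b + noise

# Log of the Gaussian likelihood in Eq. (3) for one low-resolution image.
def log_likelihood(y, x, W, lam_a, lam_b, beta):
    r = y - lam_a * (W @ x) - lam_b
    return 0.5 * y.size * np.log(beta / (2.0 * np.pi)) - 0.5 * beta * (r @ r)
```

As a sanity check, the likelihood of a simulated view should peak near the photometric parameters used to generate it.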
This might mean a standard deviation of a tenth of a low-resolution pixel on\nimage translation parameters, or a few gray levels' shift on the illumination model, for instance.\n\nThe integral performed is\n\np(x | {y^{(k)}}) = (1 / (p({y^{(k)}}) Z_x)) (β/2π)^{KM/2} ((2π)^{−n} |C^{−1}|)^{K/2} exp{ −(ν/2) ρ(Dx, α) }\n× ∫ exp{ −Σ_{k=1}^{K} [ (β/2) ||y^{(k)} − λ_α^{(k)} W(θ^{(k)}) x − λ_β^{(k)}||_2^2 + (1/2) δ^{(k)T} C^{−1} δ^{(k)} ] } dδ, (12)\n\nwhere δ^T = [δ^{(1)T}, δ^{(2)T}, . . . , δ^{(K)T}] and all the λ and θ parameters are functions of δ as in\n(4). Expanding the data error term in the exponent for each low-resolution image as a second-order\nTaylor series about the estimated geometric registration parameters yields\n\ne^{(k)}(δ) = ||y^{(k)} − λ_α^{(k)}(θ^{(k)}) W^{(k)}(θ^{(k)}) x − λ_β^{(k)}(θ^{(k)})||_2^2 (13)\n\n= F^{(k)} + G^{(k)T} δ^{(k)} + (1/2) δ^{(k)T} H^{(k)} δ^{(k)}. (14)\n\nValues for F, G and H can be found numerically (for geometric registrations) or analytically (for\nthe photometric parameters) from x and {y^{(k)}, θ^{(k)}, λ_α^{(k)}, λ_β^{(k)}}. Thus the whole exponent of (12),\nf, becomes\n\nf = Σ_{k=1}^{K} ( −(β/2) F^{(k)} − (β/2) G^{(k)T} δ^{(k)} − (1/2) δ^{(k)T} [ (β/2) H^{(k)} + C^{−1} ] δ^{(k)} ) (15)\n\n= −(β/2) F − (β/2) G^T δ − (1/2) δ^T [ (β/2) H + V^{−1} ] δ, (16)\n\nwhere the omission of image superscripts indicates stacked matrices, and H is therefore a block-diagonal\nnK × nK sparse matrix, and V is comprised of the repeated diagonal of C.\n\nFinally, letting S = (β/2) H + V^{−1},\n\n∫ exp{f} dδ = exp{ −(β/2) F } ∫ exp{ −(β/2) G^T δ − (1/2) δ^T S δ } dδ (17)\n\n= exp{ −(β/2) F } (2π)^{nK/2} |S|^{−1/2} exp{ (β²/8) G^T S^{−1} G }. (18)\n\nThe objective function, L, to be minimized with respect to x is obtained by taking the negative log\nof (12), using the result from (18), and neglecting the constant terms:\n\nL = (ν/2) ρ(Dx, α) + (β/2) F + (1/2) log|S| − (β²/8) G^T S^{−1} G. (19)\n\nThis can be optimized using Scaled Conjugate Gradients (SCG) [9], noting that the gradient can be\nexpressed\n\ndL/dx = (ν/2) D^T ρ′(Dx) + (β/2) dF/dx − (β²/4) G^T S^{−1} dG/dx\n+ [ (β/4) vec(S^{−1})^T − (β³/16) (G^T S^{−1} ⊗ G^T S^{−1}) ] dvec(H)/dx, (20)\n\nwhere derivatives of F, G and H with respect to x can be found analytically for photometric\nparameters, and numerically (using the analytic gradient of e^{(k)}(δ^{(k)}) with respect to x) with respect\nto the geometric parameters.\n\n3.1 Implementation notes\n\nNotice that the value F from (16) is simply the reprojection error of the current estimate of x at\nthe mean registration parameter values, and that gradients of this expression with respect to the λ\nparameters, and with respect to x, can both be found analytically. 
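Given F, G and H from the Taylor expansion (14), stacked across images as in (16), evaluating the objective (19) reduces to a few dense linear-algebra calls. This is an illustrative sketch rather than the authors' code; huber_term stands for (ν/2) ρ(Dx, α) and Vinv for V⁻¹:

```python
import numpy as np

# Sketch of Eq. (19): L = huber_term + (beta/2) F + (1/2) log|S|
#                         - (beta^2/8) G^T S^{-1} G,  with S = (beta/2) H + V^{-1}.
def objective(huber_term, F, G, H, Vinv, beta):
    S = 0.5 * beta * H + Vinv
    sign, logdet = np.linalg.slogdet(S)   # assumes S is positive definite
    quad = G @ np.linalg.solve(S, G)      # G^T S^{-1} G without forming S^{-1}
    return huber_term + 0.5 * beta * F + 0.5 * logdet - 0.125 * beta ** 2 * quad
```

Solving S z = G rather than inverting S is the usual numerically safer choice; in the full algorithm S is block-diagonal, so this solve decomposes per image.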
To find the gradient with respect to a geometric registration parameter θ_i^{(k)}, and elements of the\nHessian involving it, a central difference scheme involving only the kth image is used.\n\nMean values for the registration are computed by standard registration techniques, and x is initialized\nusing around 10 iterations of SCG to find the maximum likelihood solution evaluated at these mean\nparameters. Additionally, pixel values are scaled to lie between −1/2 and 1/2, and the ML solution is\nbounded to lie within these values in order to curb the severe overfitting usually observed in ML\nsuper-resolution results.\n\nIn our implementation, the parameters representing the λ values are scaled so that they share the\nsame standard deviations as the θ parameters, which represent the sub-pixel geometric registration\nshifts; this makes the matrix V a multiple of the identity. The scale factors are chosen so that one\nstandard deviation in λ_β gives a 10-gray-level shift, and one standard deviation in λ_α varies pixel\nvalues by around 10 gray levels at mean image intensity.\n\n4 Results\n\nThe first experiment takes a sixteen-image synthetic dataset created from an eyechart image. Data is\ngenerated at a zoom factor of 4, using a 2D translation-only motion model, and the two-parameter\nglobal affine illumination model described above, giving a total of four registration parameters per\nlow-resolution image. Gaussian noise with standard deviation equivalent to 5 gray levels is added\nto each low-resolution pixel independently. The sub-pixel perturbations are evenly spaced over a\ngrid up to plus or minus one half of a low-resolution pixel, giving a similar setup to that described\nin [10], but with additional lighting variation. 
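The central-difference scheme mentioned above can be sketched as follows (illustrative Python; e is a stand-in for the single-image reprojection error e^{(k)} of (13), and the step size h is an arbitrary choice):

```python
import numpy as np

# Estimate the first and second derivatives of a scalar error function e
# with respect to the i-th registration parameter by central differences,
# as used for the geometric entries of G and H.
def central_diff(e, theta, i, h=1e-4):
    up, dn = theta.copy(), theta.copy()
    up[i] += h
    dn[i] -= h
    grad = (e(up) - e(dn)) / (2.0 * h)
    curv = (e(up) - 2.0 * e(theta) + e(dn)) / h ** 2
    return grad, curv
```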
The ground truth image and two of the low-resolution\nimages appear in the first row of Figure 1.\n\nGeometric and photometric registration parameters were initialized to the identity, and the images\nwere registered using an iterative intensity-based scheme. The resulting parameter values were used\nto recover two sets of super-resolution images: one using the standard Huber MAP algorithm, and\nthe second using our extension integrating over the registration uncertainty. The Huber parameter α\nwas fixed at 0.01 for all runs, and ν was varied over a range of possible values representing ratios\nbetween ν and the image noise precision β.\n\nThe images giving lowest RMS error from each set are displayed in the second row of Figure 1.\nVisually, the differences between the images are subtle, though the bottom row of letters is better\ndefined in the output from the new algorithm. Plotting the RMSE as a function of ν in Figure 2,\nwe see that the proposed registration-integrating approach achieves a lower error, compared to the\nground truth high-resolution image, than the standard Huber MAP algorithm for any choice of the\nprior strength ν in the optimal region.\n\n[Figure 1 panels: (a) ground truth high-res; (b) input 1/16; (c) input 16/16; (d) best Huber (err = 15.6); (e) best int-θ-λ (err = 14.8)]\n\nFigure 1: (a) Ground truth image. Only the central recoverable part is shown; (b,c) low-resolution\nimages. The variation in intensity is clearly visible, and the sub-pixel displacements necessary for\nmulti-frame image super-resolution are most apparent on the “D” characters to the right of each image;\n(d) The best (i.e. minimum MSE – see Figure 2) image from the regular Huber MAP algorithm,\nhaving super-resolved the dataset multiple times with different prior strength settings; (e) The best\nresult using our approach of integrating over θ and λ. 
As well as having a lower RMSE, note the\nimprovement in black-white edge detail on some of the letters on the bottom line.\n\nThe second experiment uses real data with a 2D translation motion model and an affine lighting\nmodel exactly as above. The first and last images appear on the top row of Figure 3. Image registration\nwas carried out in the same manner as before, and the geometric parameters agree with the\nprovided homographies to within a few hundredths of a pixel. Super-resolution images were created\nfor a number of ν values; the values equivalent to those quoted in [3] were found subjectively to be\nthe most suitable.\n\n[Plot omitted: RMSE (in gray levels) against the ratio of the prior strength parameter ν to the noise precision β, for the standard Huber MAP method and the method integrating over registrations and illumination.]\n\nFigure 2: Plot showing the variation of RMSE with prior strength for the standard Huber-prior MAP\nsuper-resolution method and our approach integrating over θ and λ. The images corresponding to\nthe minima of the two curves are shown in Figure 1.\n\nThe covariance of the registration values was chosen to be similar to that used in the synthetic\nexperiments. Finally, Tipping and Bishop's method was extended to cover the illumination model\nand used to register and super-resolve the dataset, using the same PSF standard deviation (0.4 low-resolution\npixels) as the other methods.\n\nThe three sets of results on the real data sequence are shown in the middle and bottom rows of\nFigure 3. To facilitate a better comparison, a sub-region of each is expanded to make the letter\ndetails clearer. The Huber prior tends to make the edges unnaturally sharp, though it is very successful\nat regularizing the solution elsewhere. 
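For reference, the RMSE values quoted in gray levels amount to the following computation (an illustrative sketch, assuming images stored as float arrays on a 0-255 gray-level scale):

```python
import numpy as np

# Root-mean-square error between a reconstruction and the ground truth,
# reported in gray levels as in Figure 2.
def rmse_gray_levels(estimate, ground_truth):
    diff = np.asarray(estimate, dtype=float) - np.asarray(ground_truth, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```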
Between the Tipping and Bishop image and the\nregistration-integrating approach, the text appears clearer with our method, and the regularization\nin the constant background regions is slightly more successful.\n\n5 Discussion\n\nIt is possible to interpret the extra terms introduced into the objective function in the derivation\nof this method as an extra regularizer term or image prior. Considering (19), the first two terms\nare identical to the standard MAP super-resolution problem using a Huber image prior. The two\nadditional terms constitute an additional distribution over x in the cases where S is not dominated\nby V; as the distribution over θ and λ tightens to a single point, the terms tend to constant values.\n\nThe intuition behind the method's success is that this extra prior resulting from the final two terms\nof (19) will favor image solutions which are not acutely sensitive to minor adjustments in the image\nregistration. The images of Figure 4 illustrate the type of solution which would score poorly. To\ncreate the figure, one dataset was used to produce two super-resolved images, using two independent\nsets of registration parameters which were randomly perturbed by an i.i.d. Gaussian vector with a\nstandard deviation of only 0.04 low-resolution pixels. The checker-board pattern typical of ML\nsuper-resolution images can be observed, and the difference image on the right shows the drastic\ncontrast between the two image estimates.\n\n[Figure 3 panels: (a) input 1/10; (b) input 10/10; (c) integrating θ, λ; (d) integrating θ, λ (detailed region); (e) regular Huber (detailed region); (f) Tipping & Bishop (detailed region)]\n\nFigure 3: (a,b) First and last images from a real data sequence containing 10 images acquired on a\nrig which constrained the motion to be pure translation in 2D. (c) The full super-resolution output\nfrom our algorithm. 
(d) Detailed region of the central letters, again with our algorithm. (e) Detailed\nregion of the regular Huber MAP super-resolution image, using parameter values suggested in [3],\nwhich were also found to be subjectively good choices. The edges are slightly artificially crisp, but the\nlarge smooth regions are well regularized. (f) Close-up of letter detail for comparison with Tipping\nand Bishop's method of marginalization. The Gaussian form of their prior leads to a more blurred\noutput, or one that over-fits to the image noise on the input data if the prior's influence is decreased.\n\n5.1 Conclusion\n\nThis work has developed an alternative approach for Bayesian image super-resolution with several\nadvantages over Tipping and Bishop's original algorithm. These are namely a formal treatment of\nregistration uncertainty, the use of a much more realistic image prior, and the computational speed\nand memory efficiency relating to the smaller dimension of the space over which we integrate.\nThe results on real and synthetic images with this method show an advantage over the popular\nMAP approach, and over the result from Tipping and Bishop's method, largely owing to our more\nfavorable prior over the super-resolution image.\n\nIt will be a straightforward extension of the current approach to incorporate learning for the point-spread\nfunction covariance, though it will result in a less sparse Hessian matrix H, because each\nrow and column associated with the PSF parameter(s) has the potential to be full, assuming a\ncommon camera configuration is shared across all the frames.\n\nFinally, the best way of learning the appropriate covariance values for the distribution over θ given\nthe observed data, and how to assess the trade-off between its “prior-like” effects and the need for a\nstandard Huber-style image prior, are still open questions.\n\nAcknowledgements\n\nThe real dataset used in the results 
section is due to Tomas Pajdla and Daniel Martinec, CMP, Prague,\nand is available at http://www.robots.ox.ac.uk/~vgg/data4.html.\n\n[Figure 4 panels: (a) truth; (b) ML image 1; (c) ML image 2; (d) difference]\n\nFigure 4: An example of the effect of tiny changes in the registration parameters. (a) Ground truth\nimage from which a 16-image low-resolution dataset was generated. (b,c) Two ML super-resolution\nestimates. In both cases, the same dataset was used, but the registration parameters were perturbed\nby an i.i.d. vector with standard deviation of just 0.04 low-resolution pixels. (d) The difference\nbetween the two solutions. In all these images, values outside the valid image intensity range have\nbeen rounded to white or black values.\n\nThis work was funded in part by EC Network of Excellence PASCAL.\n\nReferences\n\n[1] S. Baker and T. Kanade. Limits on super-resolution and how to break them. IEEE Transactions\non Pattern Analysis and Machine Intelligence, 24(9):1167–1183, 2002.\n\n[2] S. Borman. Topics in Multiframe Superresolution Restoration. PhD thesis, University of Notre\nDame, Notre Dame, Indiana, May 2004.\n\n[3] D. Capel. Image Mosaicing and Super-resolution (Distinguished Dissertations). Springer,\nISBN: 1852337710, 2004.\n\n[4] S. Farsiu, M. Elad, and P. Milanfar. A practical approach to super-resolution. In Proc. of the\nSPIE: Visual Communications and Image Processing, San Jose, 2006.\n\n[5] R. C. Hardie, K. J. Barnard, and E. A. Armstrong. Joint MAP registration and high-resolution\nimage estimation using a sequence of undersampled images. IEEE Transactions on Image\nProcessing, 6(12):1621–1633, 1997.\n\n[6] R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge\nUniversity Press, ISBN: 0521540518, second edition, 2004.\n\n[7] M. Irani and S. Peleg. Super resolution from image sequences. ICPR, 2:115–120, June 1990.\n\n[8] M. Irani and S. Peleg. Improving resolution by image registration. Graphical Models and\nImage Processing, 53:231–239, 1991.\n\n[9] I. Nabney. Netlab: Algorithms for Pattern Recognition. Springer, 2002.\n\n[10] N. Nguyen, P. Milanfar, and G. Golub. Efficient generalized cross-validation with applications\nto parametric image restoration and resolution enhancement. IEEE Transactions on Image\nProcessing, 10(9):1299–1308, September 2001.\n\n[11] L. C. Pickup, S. J. Roberts, and A. Zisserman. A sampled texture prior for image super-resolution.\nIn Advances in Neural Information Processing Systems, pages 1587–1594, 2003.\n\n[12] L. C. Pickup, S. J. Roberts, and A. Zisserman. Optimizing and learning for super-resolution.\nIn Proceedings of the British Machine Vision Conference, 2006. To appear.\n\n[13] D. Robinson and P. Milanfar. Fundamental performance limits in image registration. IEEE\nTransactions on Image Processing, 13(9):1185–1199, September 2004.\n\n[14] R. R. Schultz and R. L. Stevenson. A Bayesian approach to image expansion for improved\ndefinition. IEEE Transactions on Image Processing, 3(3):233–242, 1994.\n\n[15] Salient Stills. http://www.salientstills.com/.\n\n[16] M. E. Tipping and C. M. Bishop. Bayesian image super-resolution. In S. Thrun, S. Becker, and\nK. Obermayer, editors, Advances in Neural Information Processing Systems, volume 15, pages\n1279–1286, Cambridge, MA, 2003. MIT Press.\n", "award": [], "sourceid": 3037, "authors": [{"given_name": "Lyndsey", "family_name": "Pickup", "institution": null}, {"given_name": "David", "family_name": "Capel", "institution": null}, {"given_name": "Stephen", "family_name": "Roberts", "institution": null}, {"given_name": "Andrew", "family_name": "Zisserman", "institution": null}]}