Part of Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
Pierre Glaser, Michael Arbel, Arthur Gretton
We study the gradient flow for a relaxed approximation to the Kullback-Leibler (KL) divergence between a moving source and a fixed target distribution. This approximation, termed the KALE (KL approximate lower-bound estimator), solves a regularized version of the Fenchel dual problem defining the KL over a restricted class of functions. When using a Reproducing Kernel Hilbert Space (RKHS) to define the function class, we show that the KALE continuously interpolates between the KL and the Maximum Mean Discrepancy (MMD). Like the MMD and other Integral Probability Metrics, the KALE remains well defined for mutually singular distributions. Nonetheless, the KALE inherits from the limiting KL a greater sensitivity to mismatch in the support of the distributions, compared with the MMD. These two properties make the KALE gradient flow particularly well suited when the target distribution is supported on a low-dimensional manifold. Under an assumption of sufficient smoothness of the trajectories, we show the global convergence of the KALE flow. We propose a particle implementation of the flow given initial samples from the source and the target distribution, which we use to empirically confirm the KALE's properties.
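
For orientation, the Fenchel dual problem mentioned above can be written out as follows. The first line is the standard variational form of the KL divergence, valid when P is absolutely continuous with respect to Q. The second line is only a sketch of what a regularized, RKHS-restricted version might look like: the symbols P, Q, h, the RKHS \mathcal{H}, the regularization strength \lambda > 0, the squared-norm penalty, and the omission of any normalizing constant are assumptions made for illustration, not notation taken from the abstract.

```latex
% Standard Fenchel dual (variational) form of the KL divergence;
% the supremum is attained at h = \log(dP/dQ).
\mathrm{KL}(P \,\|\, Q)
  = \sup_{h} \Big\{ 1 + \int h \, dP - \int e^{h} \, dQ \Big\}

% Assumed sketch of a regularized version over an RKHS \mathcal{H}
% (penalty form and scaling are illustrative assumptions).
\sup_{h \in \mathcal{H}} \Big\{ 1 + \int h \, dP - \int e^{h} \, dQ
  - \frac{\lambda}{2} \, \| h \|_{\mathcal{H}}^{2} \Big\}
```

Under this reading, a small \lambda leaves the objective close to the unrestricted KL dual, while a large \lambda penalizes h heavily and keeps it small in RKHS norm, which is consistent with the interpolation between the KL and the MMD described in the abstract.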