Part of Advances in Neural Information Processing Systems 36 (NeurIPS 2023) Main Conference Track
David Durfee
In this work we consider the problem of differentially private computation ofquantiles for the data, especially the highest quantiles such as maximum, butwith an unbounded range for the dataset. We show that this can be doneefficiently through a simple invocation of $\texttt{AboveThreshold}$, asubroutine that is iteratively called in the fundamental Sparse VectorTechnique, even when there is no upper bound on the data. In particular, weshow that this procedure can give more accurate and robust estimates on thehighest quantiles with applications towards clipping that is essential fordifferentially private sum and mean estimation. In addition, we show how twoinvocations can handle the fully unbounded data setting. Within our study, weshow that an improved analysis of $\texttt{AboveThreshold}$ can improve theprivacy guarantees for the widely used Sparse Vector Technique that is ofindependent interest. We give a more general characterization of privacy lossfor $\texttt{AboveThreshold}$ which we immediately apply to our method forimproved privacy guarantees. Our algorithm only requires one $O(n)$ passthrough the data, which can be unsorted, and each subsequent query takes $O(1)$time. We empirically compare our unbounded algorithm with the state-of-the-artalgorithms in the bounded setting. For inner quantiles, we find that our methodoften performs better on non-synthetic datasets. For the maximal quantiles,which we apply to differentially private sum computation, we find that ourmethod performs significantly better.