Shivani Agarwal, Thore Graepel, Ralf Herbrich, Dan Roth
The area under the ROC curve (AUC) has been advocated as an evalu- ation criterion for the bipartite ranking problem. We study large devi- ation properties of the AUC; in particular, we derive a distribution-free large deviation bound for the AUC which serves to bound the expected accuracy of a ranking function in terms of its empirical AUC on an inde- pendent test sequence. A comparison of our result with a corresponding large deviation result for the classiﬁcation error rate suggests that the test sample size required to obtain an -accurate estimate of the expected ac- curacy of a ranking function with δ-conﬁdence is larger than that required to obtain an -accurate estimate of the expected error rate of a classiﬁ- cation function with the same conﬁdence. A simple application of the union bound allows the large deviation bound to be extended to learned ranking functions chosen from ﬁnite function classes.