Sample complexity of rank regression using pairwise comparisons

Bibliographic Details
Published in:	Pattern recognition Vol. 130; p. 108688
Main Authors:	Kadıoğlu, Berkan, Tian, Peng, Dy, Jennifer, Erdoğmuş, Deniz, Ioannidis, Stratis
Format:	Journal Article
Language:	English
Published:	Elsevier Ltd 01-10-2022
Subjects:	Features; Pairwise comparisons; Rank regression; Sample complexity

Description
Summary:	• We propose an estimator for the parameters of a generalized linear parametric model, which encompasses classical preference models such as Bradley-Terry [13] and Thurstone [14].
• We overcome the violation of independence and prove a sample complexity guarantee on model parameters. In particular, assuming Gaussian-distributed features, we characterize the convergence of the estimator to a rescaled version of the model parameters with respect to the ambient dimension d, the number of samples N, and the number of comparisons M presented to the oracle.
• We show that to attain an accuracy ϵ > 0 in model parameters, it suffices to conduct Ω(dN log³N/ϵ²) comparisons when the number of samples is Ω(d/ϵ²).
• Finally, we confirm this dependence with experiments on synthetic data.
We consider a rank regression setting, in which a dataset of N samples with features in ℝᵈ is ranked by an oracle via M pairwise comparisons. Specifically, there exists a latent total ordering of the samples; when presented with a pair of samples, a noisy oracle identifies the one ranked higher with respect to the underlying total ordering. A learner observes a dataset of such comparisons and wishes to regress sample ranks from their features. We show that to learn the model parameters with accuracy ϵ > 0, it suffices to conduct M ∈ Ω(dN log³N/ϵ²) comparisons uniformly at random when N is Ω(d/ϵ²).
ISSN:	0031-3203 1873-5142
DOI:	10.1016/j.patcog.2022.108688
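
The setting described in the summary can be illustrated with a small synthetic simulation. The sketch below is only an illustration under stated assumptions: it draws Gaussian features, uses a Bradley-Terry (logistic) oracle as one instance of the generalized linear model, samples pairs uniformly at random, and fits an ordinary logistic regression on feature differences. That fit is a stand-in chosen for simplicity, not the estimator proposed in the article, and all names (`simulate_comparisons`, `fit_logistic`, `cosine`) and parameter values are hypothetical. Because the summary states convergence to a rescaled version of the parameters, the sketch measures direction (cosine similarity) rather than exact recovery.

```python
import numpy as np

def simulate_comparisons(N, d, M, beta, rng):
    """Draw N Gaussian feature vectors and M noisy pairwise comparisons."""
    X = rng.standard_normal((N, d))             # features in R^d (assumed standard Gaussian)
    i = rng.integers(0, N, size=M)              # first sample of each pair
    j = (i + rng.integers(1, N, size=M)) % N    # second sample, always different from i
    diff = X[i] - X[j]
    # Assumed Bradley-Terry oracle: P(i preferred over j) = sigmoid(beta . (x_i - x_j))
    p = 1.0 / (1.0 + np.exp(-(diff @ beta)))
    y = np.where(rng.random(M) < p, 1.0, -1.0)  # +1 if the oracle prefers i, else -1
    return X, diff, y

def fit_logistic(diff, y, steps=500, lr=0.5):
    """Gradient descent on the mean logistic loss over feature differences."""
    w = np.zeros(diff.shape[1])
    for _ in range(steps):
        margin = y * (diff @ w)
        # Gradient of mean(log(1 + exp(-margin))) with respect to w
        grad = -(diff * (y / (1.0 + np.exp(margin)))[:, None]).mean(axis=0)
        w -= lr * grad
    return w

def cosine(a, b):
    """Cosine similarity, which ignores any rescaling of the parameters."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
d, N, M = 5, 200, 5000                          # hypothetical problem sizes
beta = rng.standard_normal(d)
beta /= np.linalg.norm(beta)                    # unit-norm ground-truth parameters

X, diff, y = simulate_comparisons(N, d, M, beta, rng)
w_hat = fit_logistic(diff, y)
alignment = cosine(w_hat, beta)
print(f"cosine similarity between estimate and true parameters: {alignment:.3f}")
```

Since M is much larger than N here, each sample appears in many comparisons, so the comparison outcomes share features and are not independent draws; this is the kind of dependence the summary refers to. Varying `d`, `N`, and `M` in this sketch gives a rough feel for how the alignment changes, though it says nothing about the constants in the stated bounds.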