Best position algorithms for efficient top-k query processing

Bibliographic Details
Published in: Information Systems (Oxford), Vol. 36, no. 6, pp. 973-989
Main Authors: Akbarinia, Reza; Pacitti, Esther; Valduriez, Patrick
Format: Journal Article
Language: English
Published: Elsevier Ltd 01-09-2011
Description
Summary: The general problem of answering top-k queries can be modeled using lists of data items sorted by their local scores. The main algorithm proposed so far for answering top-k queries over sorted lists is the Threshold Algorithm (TA). However, TA may still incur many useless accesses to the lists. In this paper, we propose two algorithms that are much more efficient than TA. First, we propose the best position algorithm (BPA). For any database instance (i.e., set of sorted lists), we prove that BPA stops as early as TA, and that its execution cost is never higher than that of TA. We show that there are databases over which BPA executes top-k queries O(m) times faster than TA, where m is the number of lists. We also show that the execution cost of our algorithm can be (m−1) times lower than that of TA. Second, we propose the BPA2 algorithm, which is much more efficient than BPA. We show that the number of accesses to the lists done by BPA2 can be about (m−1) times lower than that of BPA. We evaluated the performance of our algorithms through extensive experimental tests. The results show that over our test databases, BPA and BPA2 achieve significant performance gains in comparison with TA.
► We propose two new algorithms for processing top-k queries over sorted lists.
► We propose the BPA algorithm, which is much more efficient than TA.
► The execution cost of BPA is up to (m−1) times lower than that of TA.
► We also propose the BPA2 algorithm, which is much more efficient than BPA.
► The number of accesses done by BPA2 can be up to (m−1) times lower than that of BPA.
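For readers unfamiliar with the baseline that the summary compares against, the following is a minimal Python sketch of the Threshold Algorithm as it is commonly described in the literature. It is not taken from this record or from the paper. The function name `threshold_algorithm`, the `(item, local_score)` list layout, the access counters, and the default scoring function `sum` are all assumptions made for illustration. The sketch also assumes that every item appears in every list and that all lists have the same length. BPA and BPA2 are not sketched because the summary does not describe how they work.

```python
import heapq


def threshold_algorithm(lists, k, f=sum):
    """Hypothetical TA sketch.

    lists: m lists of (item, local_score) pairs, each sorted by score,
           highest first; every item is assumed to occur in every list.
    k:     number of top items wanted.
    f:     monotone function combining the m local scores (assumed: sum).
    Returns (top_k, accesses), where top_k is a list of (item, overall_score).
    """
    m = len(lists)
    # One dict per list stands in for random access by item.
    lookup = [dict(lst) for lst in lists]
    overall = {}  # item -> overall score, for every item seen so far
    accesses = {"sorted": 0, "random": 0}

    for pos in range(len(lists[0])):
        last_seen = []  # local scores read at this depth under sorted access
        for i in range(m):
            item, score = lists[i][pos]
            accesses["sorted"] += 1
            last_seen.append(score)
            # Simplification: items already scored are not looked up again.
            if item not in overall:
                scores = []
                for j in range(m):
                    if j == i:
                        scores.append(score)
                    else:
                        scores.append(lookup[j][item])
                        accesses["random"] += 1
                overall[item] = f(scores)

        # No unseen item can score higher than this threshold.
        threshold = f(last_seen)
        top = heapq.nlargest(k, overall.items(), key=lambda kv: kv[1])
        if len(top) >= k and top[-1][1] >= threshold:
            return top, accesses

    # Lists exhausted before the stopping condition was met.
    top = heapq.nlargest(k, overall.items(), key=lambda kv: kv[1])
    return top, accesses
```

Called with two small lists, for example `[("a", 10), ("b", 8), ("c", 5), ("d", 1)]` and `[("b", 9), ("a", 8), ("d", 6), ("c", 2)]`, and `k=1`, the sketch stops at depth 2 of 4. The access counters make the cost the summary refers to visible: sorted and random accesses are tallied separately.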
ISSN: 0306-4379, 1873-6076
DOI: 10.1016/j.is.2011.03.010