Best position algorithms for efficient top-k query processing
Published in: | Information systems (Oxford) Vol. 36; no. 6; pp. 973 - 989 |
---|---|
Main Authors: | , , |
Format: | Journal Article |
Language: | English |
Published: | Elsevier Ltd, 01-09-2011 |
Subjects: | |
Summary: | The general problem of answering top-k queries can be modeled using lists of data items sorted by their local scores. The main algorithm proposed so far for answering top-k queries over sorted lists is the Threshold Algorithm (TA). However, TA may still incur a lot of useless accesses to the lists. In this paper, we propose two algorithms that are much more efficient than TA. First, we propose the best position algorithm (BPA). For any database instance (i.e. set of sorted lists), we prove that BPA stops as early as TA, and that its execution cost is never higher than that of TA. We show that there are databases over which BPA executes top-k queries O(m) times faster than TA, where m is the number of lists. We also show that the execution cost of our algorithm can be (m−1) times lower than that of TA. Second, we propose the BPA2 algorithm, which is much more efficient than BPA. We show that the number of accesses to the lists done by BPA2 can be about (m−1) times lower than that of BPA. We evaluated the performance of our algorithms through extensive experimental tests. The results show that over our test databases, BPA and BPA2 achieve significant performance gains in comparison with TA.
► We propose two new algorithms for processing top-k queries over sorted lists.
► We propose the BPA algorithm, which is much more efficient than TA.
► The execution cost of BPA is up to (m−1) times lower than that of TA.
► We also propose the BPA2 algorithm, which is much more efficient than BPA.
► The number of accesses done by BPA2 can be up to (m−1) times lower than that of BPA. |
---|---|
ISSN: | 0306-4379 1873-6076 |
DOI: | 10.1016/j.is.2011.03.010 |
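
The summary contrasts TA with BPA but does not spell out either algorithm. The Python sketch below is only an illustration of the general idea, not the authors' implementation. It assumes that every item appears in every list, that the overall score is a monotonic function (here `sum`), and that each sorted access is followed by random accesses to the other lists. The function `top_k`, the flag `use_best_positions`, the access-counting convention, and the sample lists are all hypothetical. In TA mode the stopping threshold is computed from the scores at the current sorted-access depth. In best-position mode it is computed from each list's best position, taken here to mean the deepest position such that every position up to it has already been seen under sorted or random access.

```python
import heapq


def top_k(lists, k, use_best_positions=False, f=sum):
    """Return (top-k [(item, score)], number of list accesses).

    lists: m lists of (item, local_score), each sorted by descending score.
    Assumption: every item appears in every list and f is monotonic.
    """
    m = len(lists)
    # Hypothetical random-access index: item -> (position, local score) per list.
    index = [{item: (pos, s) for pos, (item, s) in enumerate(lst)} for lst in lists]
    seen = [set() for _ in range(m)]  # positions seen so far in each list
    best = [-1] * m                   # best position per list (0-based)
    overall = {}                      # item -> overall score
    accesses = 0

    for depth in range(len(lists[0])):
        for i in range(m):
            item, _ = lists[i][depth]
            accesses += 1             # one sorted access
            seen[i].add(depth)
            for j in range(m):
                if j != i:
                    accesses += 1     # one random access per other list
                    seen[j].add(index[j][item][0])
            overall[item] = f(index[j][item][1] for j in range(m))

        if use_best_positions:
            # Advance each best position over the contiguous run of seen positions.
            for i in range(m):
                while best[i] + 1 in seen[i]:
                    best[i] += 1
            threshold = f(lists[i][best[i]][1] for i in range(m))
        else:
            # TA-style threshold: scores at the last position read under sorted access.
            threshold = f(lists[i][depth][1] for i in range(m))

        top = heapq.nlargest(k, overall.items(), key=lambda kv: kv[1])
        if len(top) == k and top[-1][1] >= threshold:
            return top, accesses

    return heapq.nlargest(k, overall.items(), key=lambda kv: kv[1]), accesses


# Made-up example: two lists over four items.
L1 = [("a", 10), ("c", 9), ("b", 5), ("d", 2)]
L2 = [("c", 10), ("a", 8), ("d", 4), ("b", 3)]

ta_top, ta_accesses = top_k([L1, L2], k=2)
bpa_top, bpa_accesses = top_k([L1, L2], k=2, use_best_positions=True)
```

On this toy input both modes return the same two items, but the best-position threshold drops sooner because the random accesses already cover the second position of each list, so the best-position mode stops one round earlier. This is a single hand-built case and says nothing about the bounds or experimental results reported in the article.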