Evaluation Infrastructures for Academic Shared Tasks: Requirements and Concept Design for Search and Recommendation Scenarios

Bibliographic Details
Published in: Datenbank-Spektrum: Zeitschrift für Datenbanktechnologie: Organ der Fachgruppe Datenbanken der Gesellschaft für Informatik e.V., Vol. 20, no. 1, pp. 29-36
Main Authors: Schaible, Johann, Breuer, Timo, Tavakolpoursaleh, Narges, Müller, Bernd, Wolff, Benjamin, Schaer, Philipp
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg, 01-03-2020
Description
Summary: Academic search systems aid users in finding information covering specific topics of scientific interest and have evolved from early catalog-based library systems to modern web-scale systems. However, evaluating the performance of the underlying retrieval approaches remains a challenge. An increasing number of requirements for producing accurate retrieval results have to be considered, e.g., the close integration of the system's users. Due to these requirements, small to mid-size academic search systems cannot evaluate their retrieval system in-house. Evaluation infrastructures for shared tasks alleviate this situation: they allow researchers to experiment with retrieval approaches in specific search and recommendation scenarios without having to build their own infrastructure. In this paper, we elaborate on the benefits and shortcomings of four state-of-the-art evaluation infrastructures for search and recommendation tasks with respect to the following requirements: support for online and offline evaluations, domain specificity of shared tasks, and reproducibility of experiments and results. In addition, we introduce an evaluation infrastructure concept design aimed at reducing these shortcomings in shared tasks for search and recommender systems.
ISSN: 1618-2162; 1610-1995
DOI: 10.1007/s13222-020-00335-x