Integrated Personalized and Diversified Search Based on Search Logs

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, Vol. 36, no. 2, pp. 1-14
Main Authors: Liu, Jiongnan; Dou, Zhicheng; Nie, Jian-Yun; Wen, Ji-Rong
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01-02-2024
Description
Summary: Personalized search and search result diversification are two possible solutions to the query ambiguity problem in search engines. Most existing studies have investigated them separately, but intuitively they address the problem from two complementary perspectives and should be combined. Some recent work has tried to combine them by restricting result diversification to the subtopics corresponding to the user's personal profile. However, diversification can be required even when the subtopics are outside the user's profile. In this paper, we propose a more general approach that integrates them based on users' implicit feedback in query logs. The proposed approach, PER+DIV, dynamically aggregates a document's novelty score and personal relevance score according to how much the query falls within the user's interests. To train the model on user clicks in the logs, we treat a click as the result of both personal relevance and result diversity, and we propose a new method to isolate and model these two factors. To evaluate the model, we design several diversified and personalized metrics in addition to the traditional click-based metrics. Experimental results on a large-scale query log dataset show that the proposed integrated method significantly outperforms existing personalization and diversification approaches.
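Note: The summary describes aggregating a novelty score and a personal relevance score with a weight that depends on how far the query falls within the user's interests. The following is a minimal sketch of that general idea only, not the PER+DIV model itself. It assumes a simple convex combination; the names `Candidate`, `aggregate_score`, `rank`, and `interest_weight` are hypothetical, and the scores are treated as fixed inputs in [0, 1], whereas a real diversification model would typically recompute novelty as results are selected.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate document with two precomputed, hypothetical scores."""
    doc_id: str
    personal_relevance: float  # assumed to lie in [0, 1]
    novelty: float             # assumed to lie in [0, 1]


def aggregate_score(candidate: Candidate, interest_weight: float) -> float:
    """Convex combination of the two scores.

    interest_weight is an assumed gate in [0, 1]: values near 1 mean the
    query falls inside the user's interests (favor personal relevance),
    values near 0 mean it falls outside them (favor novelty).
    """
    if not 0.0 <= interest_weight <= 1.0:
        raise ValueError("interest_weight must lie in [0, 1]")
    return (interest_weight * candidate.personal_relevance
            + (1.0 - interest_weight) * candidate.novelty)


def rank(candidates: list[Candidate], interest_weight: float) -> list[Candidate]:
    """Sort candidates by aggregated score, highest first."""
    return sorted(
        candidates,
        key=lambda c: aggregate_score(c, interest_weight),
        reverse=True,
    )
```

Under these assumptions, a high `interest_weight` ranks the personally relevant document first, while a low one promotes the more novel document, which mirrors the dynamic trade-off the summary describes.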
ISSN: 1041-4347, 1558-2191
DOI: 10.1109/TKDE.2023.3291006