Integration of multiple evidences based on a query type for web search
Published in: | Information processing & management Vol. 40; no. 3; pp. 459 - 478 |
---|---|
Main Authors: | , |
Format: | Journal Article |
Language: | English |
Published: | Oxford: Elsevier Ltd, 01-05-2004; Elsevier Science; Elsevier Science Ltd |
Subjects: | |
Summary: | The massive, heterogeneous Web exacerbates IR problems, and short user queries make them worse. The content of web pages alone is not enough to find answer pages. PageRank compensates for the insufficiency of content information, and the two are combined to get better results. However, a static combination of multiple sources of evidence may lower retrieval performance; different strategies are needed to meet different user needs. User queries can be classified into three categories according to the user's intent: the topic relevance task, the homepage finding task, and the service finding task. In this paper, we present a user query classification method. The classification uses the difference of distribution, mutual information, the usage rate as anchor text, and part-of-speech (POS) information. Once a query is classified, we apply different algorithms and information to improve the results: for the topic relevance task we emphasize content information, whereas for the homepage finding task we emphasize link information and URL information. The best performance was obtained when the proposed classification method was used with the OKAPI scoring algorithm. |
---|---|
ISSN: | 0306-4573 1873-5371 |
DOI: | 10.1016/S0306-4573(03)00053-0 |
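
The summary describes weighting content, link, and URL evidence differently depending on the query type. The sketch below illustrates that idea only. The linear combination, the weight values, the `PageEvidence` fields, and the function names are all assumptions made for illustration, not the authors' method. The service finding task is omitted because the summary does not say which evidence it favors.

```python
from dataclasses import dataclass

# Hypothetical weights, chosen only to illustrate the idea; the summary gives no values.
# Topic relevance leans on content; homepage finding leans on link and URL evidence.
WEIGHTS = {
    "topic_relevance": {"content": 0.8, "link": 0.1, "url": 0.1},
    "homepage_finding": {"content": 0.3, "link": 0.4, "url": 0.3},
}


@dataclass
class PageEvidence:
    """Per-page scores, assumed to be normalized to [0, 1] beforehand."""
    name: str
    content: float    # e.g. a content-based retrieval score
    link: float       # e.g. a link-based score such as PageRank
    url_score: float  # e.g. a score derived from the URL form


def combined_score(page: PageEvidence, query_type: str) -> float:
    """Linear combination of evidence, weighted by the (already classified) query type."""
    w = WEIGHTS[query_type]
    return (w["content"] * page.content
            + w["link"] * page.link
            + w["url"] * page.url_score)


def rank(pages: list[PageEvidence], query_type: str) -> list[PageEvidence]:
    """Return pages ordered from highest to lowest combined score."""
    return sorted(pages, key=lambda p: combined_score(p, query_type), reverse=True)


if __name__ == "__main__":
    pages = [
        PageEvidence("article", content=0.9, link=0.1, url_score=0.1),
        PageEvidence("homepage", content=0.4, link=0.9, url_score=0.9),
    ]
    for qtype in WEIGHTS:
        print(qtype, [p.name for p in rank(pages, qtype)])
```

With these illustrative weights, the same two pages swap rank when the query type changes, which is the behavior a static combination cannot provide.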