Automating keyphrase extraction with multi-objective genetic algorithms
Published in: | Proceedings of the 37th Annual Hawaii International Conference on System Sciences, 2004, 8 pp. |
---|---|
Main Authors: | , |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE, 2004 |
Subjects: | |
Summary: | Keyphrases have been used extensively in IR systems to facilitate information exchange, organize information and assist information retrieval. Automation of keyphrase generation is essential for the timely creation of keyphrases for large repositories in new domains where previous thesauri do not exist or for metacollections in which keyphrases that are meaningful across disparate collections are needed. In this paper we propose an automated keyphrase extraction algorithm using a non-dominated sorting multi-objective genetic algorithm. The "clumping" property of keyphrases is used to judge the appropriateness of a phrase and is quantified by a condensation clustering measure proposed by Bookstein. The objective is to find the smallest phrase set that has the best precision, as measured by average condensation clustering. Keyphrases were retrieved from a collection of design conference papers and the results were presented to domain experts for evaluation. Ninety percent of the generated phrases were deemed appropriate for use in a thesaurus for engineering design. |
---|---|
ISBN: | 0769520561 9780769520568 |
DOI: | 10.1109/HICSS.2004.1265278 |
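
The summary describes a two-objective search: keep the phrase set small while maximizing its average condensation clustering. As a rough illustration of the non-dominated sorting step only, the Python sketch below ranks candidate phrase sets into Pareto fronts. It is not the authors' implementation. The `Candidate` tuple, the function names, and the numeric scores are all hypothetical, and the clustering values are placeholders rather than outputs of Bookstein's measure.

```python
from typing import List, Tuple

# Hypothetical candidate: (phrase_set_size, avg_clustering_score).
# Assumption: smaller size is better, higher score is better.
Candidate = Tuple[int, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def non_dominated_sort(pop: List[Candidate]) -> List[List[int]]:
    """Return Pareto fronts as lists of indices into pop, best front first."""
    remaining = set(range(len(pop)))
    fronts: List[List[int]] = []
    while remaining:
        # A candidate belongs to the current front if no remaining one dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(pop[j], pop[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Placeholder scores for illustration only.
population = [(5, 0.80), (3, 0.60), (5, 0.60), (8, 0.90), (8, 0.50)]
fronts = non_dominated_sort(population)
print(fronts)
```

In a full genetic algorithm, the front index would typically feed selection, so that candidates on earlier fronts are more likely to survive into the next generation. This simple version compares every pair in each pass, which is adequate for small populations.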