Consensus Similarity Measure for Short Text Clustering
Measuring semantic similarity between short texts is challenging because the meaning of short texts may vary dramatically even by a few words due to their limited lengths. In this paper, we propose a novel similarity measure for terms that allows better clustering performance than the state-of-the-a...
Saved in:
Published in: | 2015 26th International Workshop on Database and Expert Systems Applications (DEXA) pp. 264 - 268 |
---|---|
Main Authors: | , , , |
Format: | Conference Proceeding Journal Article |
Language: | English |
Published: |
IEEE
01-09-2015
|
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Measuring semantic similarity between short texts is challenging because the meaning of short texts may vary dramatically even by a few words due to their limited lengths. In this paper, we propose a novel similarity measure for terms that allows better clustering performance than the state-of-the-art method. To achieve such performance, we incorporate knowledge-based and corpus-based term similarity measures in order to exploit advantages of both approaches. We apply our method to a dialog-utterance dataset, which consists of short dialog texts. Empirical study shows that the proposed method outperforms one of the state-of-the-art clustering algorithms for short text clustering. |
---|---|
Bibliography: | ObjectType-Article-2 SourceType-Scholarly Journals-1 ObjectType-Conference-1 ObjectType-Feature-3 content type line 23 SourceType-Conference Papers & Proceedings-2 |
ISBN: | 1467375810 9781467375818 |
ISSN: | 1529-4188 2378-3915 |
DOI: | 10.1109/DEXA.2015.65 |