Consensus Similarity Measure for Short Text Clustering

Bibliographic Details
Published in: 2015 26th International Workshop on Database and Expert Systems Applications (DEXA), pp. 264-268
Main Authors: Shin, Youhyun; Ahn, Yeonchan; Jeon, Heesik; Lee, Sang-goo
Format: Conference Proceeding; Journal Article
Language: English
Published: IEEE 01-09-2015
Description
Summary: Measuring semantic similarity between short texts is challenging because, with so few words available, changing even a few of them can alter the meaning dramatically. In this paper, we propose a novel term similarity measure that yields better clustering performance than a state-of-the-art method. To achieve this, we combine knowledge-based and corpus-based term similarity measures so as to exploit the advantages of both approaches. We apply our method to a dialog-utterance dataset consisting of short dialog texts. An empirical study shows that the proposed method outperforms one of the state-of-the-art clustering algorithms for short text clustering.
ISBN: 1467375810, 9781467375818
ISSN: 1529-4188, 2378-3915
DOI: 10.1109/DEXA.2015.65