Consensus Similarity Measure for Short Text Clustering

Bibliographic Details
Published in:	2015 26th International Workshop on Database and Expert Systems Applications (DEXA), pp. 264-268
Main Authors:	Shin, Youhyun; Ahn, Yeonchan; Jeon, Heesik; Lee, Sang-goo
Format:	Conference Proceeding; Journal Article
Language:	English
Published:	IEEE 01-09-2015
Subjects:	Batteries; Clustering; Context; Expert systems; Knowledge base; Knowledge based systems; Length measurement; Natural language processing; semantic similarity; Semantics; short text; Similarity; State of the art; Taxonomy; Texts; Workshops

Description
Summary:	Measuring semantic similarity between short texts is challenging because their limited length means that changing even a few words can alter the meaning dramatically. In this paper, we propose a novel similarity measure for terms that yields better clustering performance than the state-of-the-art method. To achieve this, we combine knowledge-based and corpus-based term similarity measures to exploit the advantages of both approaches. We apply our method to a dialog-utterance dataset consisting of short dialog texts. An empirical study shows that the proposed method outperforms one of the state-of-the-art clustering algorithms for short text clustering.
ISBN:	1467375810 9781467375818
ISSN:	1529-4188 2378-3915
DOI:	10.1109/DEXA.2015.65
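
Illustration (not from the paper): the summary says the measure combines knowledge-based and corpus-based term similarity, but this record does not say how. The sketch below assumes a simple weighted average, a toy child-to-parent taxonomy as the knowledge base, and co-occurrence cosine similarity as the corpus-based measure. The names TAXONOMY, knowledge_sim, corpus_sim, combined_sim and the weight alpha are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy taxonomy (child -> parent), standing in for a knowledge base.
TAXONOMY = {
    "dog": "animal", "cat": "animal", "animal": "entity",
    "car": "vehicle", "vehicle": "entity",
}

def ancestors(term):
    """Return the path from term up to its root, starting with term itself."""
    path = [term]
    while path[-1] in TAXONOMY:
        path.append(TAXONOMY[path[-1]])
    return path

def knowledge_sim(a, b):
    """Path-based similarity: 1 / (1 + edges between a and b through
    their lowest common ancestor); 0.0 if they share no ancestor."""
    pa, pb = ancestors(a), ancestors(b)
    common = [node for node in pa if node in pb]
    if not common:
        return 0.0
    lca = common[0]
    return 1.0 / (1.0 + pa.index(lca) + pb.index(lca))

def context_vectors(corpus, window=2):
    """Build co-occurrence count vectors for every token in the corpus."""
    vecs = {}
    for text in corpus:
        tokens = text.lower().split()
        for i, tok in enumerate(tokens):
            ctx = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            vecs.setdefault(tok, Counter()).update(ctx)
    return vecs

def corpus_sim(a, b, vecs):
    """Cosine similarity between the co-occurrence vectors of a and b."""
    va, vb = vecs.get(a), vecs.get(b)
    if not va or not vb:
        return 0.0
    dot = sum(va[k] * vb[k] for k in va if k in vb)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def combined_sim(a, b, vecs, alpha=0.5):
    """Assumed combination: weighted average of the two measures."""
    return alpha * knowledge_sim(a, b) + (1 - alpha) * corpus_sim(a, b, vecs)

# Two short utterances in which "music" and "songs" share the same context.
vecs = context_vectors(["play music now", "play songs now"])
print(combined_sim("music", "songs", vecs))
print(combined_sim("dog", "cat", vecs))
```

In this toy setup, "music" and "songs" are missing from the taxonomy, so only the corpus-based part contributes to their score, while "dog" and "cat" never occur in the corpus, so only the knowledge-based part contributes. Covering each measure's gaps with the other is one plausible reading of "exploit the advantages of both approaches".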