Discovering topics in text datasets by visualizing relevant words
Main Authors: | , , , , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 18-07-2017 |
Summary: | When dealing with large collections of documents, it is imperative to quickly
get an overview of the texts' contents. In this paper we show how this can be
achieved by using a clustering algorithm to identify topics in the dataset and
then selecting and visualizing relevant words, which distinguish a group of
documents from the rest of the texts, to summarize the contents of the
documents belonging to each topic. We demonstrate our approach by discovering
trending topics in a collection of New York Times article snippets. |
---|---|
DOI: | 10.48550/arxiv.1707.06100 |
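
The word-selection step described in the summary can be illustrated with a minimal sketch. It assumes the clustering has already been done and that each document carries a cluster label; the labels, the toy documents, the `tokenize` and `relevant_words` helpers, and the scoring rule (in-cluster document frequency minus out-of-cluster document frequency) are all illustrative assumptions, not the scoring used in the paper.

```python
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase, split on whitespace, keep alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]


def relevant_words(docs, labels, top_n=3):
    """Rank words per cluster by how much more often they occur in documents
    inside the cluster than in documents outside it (assumed scoring rule)."""
    result = {}
    for cluster in sorted(set(labels)):
        inside = [set(tokenize(d)) for d, l in zip(docs, labels) if l == cluster]
        outside = [set(tokenize(d)) for d, l in zip(docs, labels) if l != cluster]
        # Document frequencies: number of documents containing each word.
        df_in = Counter(w for doc in inside for w in doc)
        df_out = Counter(w for doc in outside for w in doc)
        scores = {}
        for word, count in df_in.items():
            rate_in = count / len(inside)
            rate_out = df_out[word] / len(outside) if outside else 0.0
            scores[word] = rate_in - rate_out
        # Highest score first; ties broken alphabetically for a stable result.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[cluster] = ranked[:top_n]
    return result


# Toy documents with hand-assigned cluster labels (assumed, not from the paper).
docs = [
    "the election results surprised voters",
    "voters await the election debate",
    "the team won the football match",
    "football fans celebrate the match",
]
labels = [0, 0, 1, 1]

topics = relevant_words(docs, labels, top_n=2)
print(topics)
```

A word such as "the" appears in every document, so its score is zero in both clusters and it drops out of the ranking, while words concentrated in one cluster rise to the top.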