Discovering topics in text datasets by visualizing relevant words

Bibliographic Details
Main Authors: Horn, Franziska, Arras, Leila, Montavon, Grégoire, Müller, Klaus-Robert, Samek, Wojciech
Format: Journal Article
Language: English
Published: 18-07-2017
Description
Summary: When dealing with large collections of documents, it is imperative to quickly get an overview of the texts' contents. In this paper we show how this can be achieved by using a clustering algorithm to identify topics in the dataset and then selecting and visualizing relevant words, which distinguish a group of documents from the rest of the texts, to summarize the contents of the documents belonging to each topic. We demonstrate our approach by discovering trending topics in a collection of New York Times article snippets.
DOI: 10.48550/arxiv.1707.06100
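
Illustrative sketch: the summary describes selecting words that distinguish one group of documents from the rest. The Python snippet below is a minimal sketch of that idea and not the authors' implementation. It assumes cluster labels are already available from some clustering algorithm, and it uses a simple scoring rule chosen here for illustration (the share of in-cluster documents containing a word minus the share of out-of-cluster documents containing it). The function names, the toy documents, and the labels are all hypothetical.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def relevant_words(docs, labels, cluster, top_n=3):
    """Rank words by how much more often they occur in `cluster` than elsewhere.

    Score = (fraction of in-cluster documents containing the word)
          - (fraction of out-of-cluster documents containing the word).
    This scoring rule is an illustrative assumption, not taken from the paper.
    """
    inside = [set(tokenize(d)) for d, l in zip(docs, labels) if l == cluster]
    outside = [set(tokenize(d)) for d, l in zip(docs, labels) if l != cluster]

    # Document frequencies: in how many documents does each word appear?
    df_in = Counter(w for doc in inside for w in doc)
    df_out = Counter(w for doc in outside for w in doc)

    scores = {
        w: df_in[w] / len(inside) - df_out[w] / max(len(outside), 1)
        for w in df_in
    }
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


# Toy data; the labels stand in for the output of any clustering algorithm.
docs = [
    "the election results surprised the senate",
    "the senate vote follows the election",
    "the team won the match",
    "the match ended after the team scored",
]
labels = [0, 0, 1, 1]

for cluster in sorted(set(labels)):
    print(cluster, relevant_words(docs, labels, cluster, top_n=2))
```

A word such as "the", which appears in every document, scores zero under this rule and is never selected ahead of cluster-specific words. The selected words could then be passed to any visualization of choice to summarize each topic.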