LDA Analyzer: A Tool for Exploring Topic Models

Bibliographic Details
Published in: 2014 IEEE International Conference on Software Maintenance and Evolution, pp. 593-596
Main Authors: Chunyao Zou, Daqing Hou
Format: Conference Proceeding
Language: English
Published: IEEE 01-09-2014
Description
Summary: Online technical forums are valuable sources for mining useful software engineering information. LDA (Latent Dirichlet Allocation) is an unsupervised machine learning method that can be used to extract the underlying topics from such large forums. However, the main output of LDA on a forum is usually a set of huge matrices containing millions of numbers, which makes it impossible for researchers to directly scrutinize the numerical distributions or semantically evaluate the relationship between the extracted topics and a large collection of unorganized documents. In this paper, we present LDA Analyzer, an LDA visualization tool that brings the hidden topic-document structures to the surface. Functionally, LDA Analyzer consists of (1) LDA modeling, (2) LDA output analysis, and (3) new corpus training. With the help of LDA Analyzer, our semantic evaluation of topic models built from large technical forums becomes feasible.
ISSN: 1063-6773, 2576-3148
DOI: 10.1109/ICSME.2014.103
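
Illustration: The summary notes that LDA output consists of large numeric matrices that are hard to read directly. The sketch below is not taken from the paper or from LDA Analyzer. It is a minimal Python example, under the assumption that the output takes the usual form of a topic-word matrix and a document-topic matrix, of how such matrices can be reduced to readable lists of top words and top documents per topic. The function names `top_words` and `top_documents`, the vocabulary, and all numbers are hypothetical.

```python
from typing import List, Sequence


def top_words(topic_word: Sequence[Sequence[float]],
              vocab: Sequence[str], k: int = 3) -> List[List[str]]:
    """For each topic (row), return the k highest-probability words."""
    result = []
    for row in topic_word:
        # Rank vocabulary indices by descending probability in this topic.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        result.append([vocab[j] for j in ranked[:k]])
    return result


def top_documents(doc_topic: Sequence[Sequence[float]],
                  topic: int, k: int = 2) -> List[int]:
    """Return indices of the k documents with the largest share of `topic`."""
    ranked = sorted(range(len(doc_topic)),
                    key=lambda d: doc_topic[d][topic], reverse=True)
    return ranked[:k]


# Hypothetical toy data: 2 topics, 5 vocabulary words, 3 forum posts.
vocab = ["thread", "lock", "button", "layout", "panel"]
topic_word = [
    [0.50, 0.40, 0.04, 0.03, 0.03],  # topic 0: word probabilities
    [0.02, 0.03, 0.35, 0.32, 0.28],  # topic 1: word probabilities
]
doc_topic = [
    [0.90, 0.10],  # post 0: topic proportions
    [0.20, 0.80],  # post 1
    [0.60, 0.40],  # post 2
]

if __name__ == "__main__":
    for t, words in enumerate(top_words(topic_word, vocab, k=2)):
        docs = top_documents(doc_topic, t, k=2)
        print(f"topic {t}: words={words} top posts={docs}")
```

On a real forum corpus the two matrices would have thousands of rows and columns, which is the scale at which a visual tool becomes more practical than printed lists like these.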