A Machine Learning Based Topic Exploration and Categorization on Surveys
This paper describes an automatic topic extraction, categorization, and relevance ranking model for multi-lingual surveys and questions that exploits machine learning algorithms such as topic modeling and fuzzy clustering. Automatically generated question and survey categories are used to build ques...
Saved in:
Published in: | 2012 11th International Conference on Machine Learning and Applications Vol. 2; pp. 7 - 12 |
---|---|
Main Authors: | , , , , , |
Format: | Conference Proceeding |
Language: | English |
Published: |
IEEE
01-12-2012
|
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | This paper describes an automatic topic extraction, categorization, and relevance ranking model for multi-lingual surveys and questions that exploits machine learning algorithms such as topic modeling and fuzzy clustering. Automatically generated question and survey categories are used to build question banks and category-specific survey templates. First, we describe different pre-processing steps we considered for removing noise in the multilingual survey text. Second, we explain our strategy to automatically extract survey categories from surveys based on topic models. Third, we describe different methods to cluster questions under survey categories and group them based on relevance. Last, we describe our experimental results on a large group of unique, real-world survey datasets from the German, Spanish, French, and Portuguese languages and our refining methods to determine meaningful and sensible categories for building question banks. We conclude this document with possible enhancements to the current system and impacts in the business domain. |
---|---|
ISBN: | 1467346519 9781467346511 |
DOI: | 10.1109/ICMLA.2012.132 |