Topic Modelling on Consumer Financial Protection Bureau Data: An Approach Using BERT Based Embeddings

Bibliographic Details
Published in: 2022 IEEE 7th International Conference for Convergence in Technology (I2CT), pp. 1-6
Main Authors: Vasudeva Raju, S.; Kumar Bolla, Bharath; Nayak, Deepak Kumar; Kh, Jyothsna
Format: Conference Proceeding
Language: English
Published: IEEE 07-04-2022
Description
Summary: Customer reviews and comments help businesses understand how users feel about their products and services. To provide efficient customer assistance, however, this data must be analyzed to assess the sentiment associated with individual topics or aspects. Latent Dirichlet Allocation (LDA) and Latent Semantic Analysis (LSA) fail to capture semantic relationships and are not specific to any domain. In this study, we evaluate BERTopic, a novel method that generates topics using sentence embeddings, on Consumer Financial Protection Bureau (CFPB) data. Our work shows that BERTopic is flexible and provides more meaningful and diverse topics than LDA and LSA. Furthermore, domain-specific pre-trained embeddings (FinBERT) yield even better topics. We evaluated the topics using two coherence scores, c_v and UMass.
DOI: 10.1109/I2CT54291.2022.9824873
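
The summary names UMass as one of the two coherence scores used for evaluation. As a rough illustration of what such a score measures, the sketch below computes one common formulation of UMass coherence: for each pair of top words in a topic, the log of the smoothed co-document count divided by the document count of the higher-ranked word. This is not the authors' implementation. The function name `umass_coherence`, the toy complaint tokens in `TOY_DOCS`, the smoothing constant of 1, and the choice to sum rather than average over word pairs are all assumptions made for illustration.

```python
import math


def umass_coherence(topic_words, documents):
    """Sum of log((D(w_m, w_l) + 1) / D(w_l)) over ranked word pairs.

    topic_words: top words of one topic, highest-ranked first.
    documents:   tokenized documents (lists of strings).
    D(...) counts the documents that contain all of the given words.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(topic_words)):
        for l in range(m):
            w_m, w_l = topic_words[m], topic_words[l]
            d_l = doc_freq(w_l)
            if d_l == 0:
                # Assumption: skip pairs whose conditioning word never occurs.
                continue
            score += math.log((doc_freq(w_m, w_l) + 1) / d_l)
    return score


# Hypothetical toy corpus; not taken from the CFPB data.
TOY_DOCS = [
    ["credit", "report", "error"],
    ["credit", "report", "dispute"],
    ["loan", "payment", "late"],
    ["credit", "card", "fee"],
]

if __name__ == "__main__":
    print(umass_coherence(["credit", "report"], TOY_DOCS))
    print(umass_coherence(["credit", "loan"], TOY_DOCS))
```

Scores closer to zero indicate that a topic's top words tend to appear in the same documents. In this toy corpus, the pair that co-occurs ("credit", "report") scores higher than the pair that never does ("credit", "loan").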