A Roman Urdu Corpus for sentiment analysis
Abstract Sentiment analysis is a dynamic field focused on understanding and predicting emotional sentiments in text or images. With the prevalence of smartphones, e-commerce and social networks, individuals readily express opinions, aiding businesses, political analysts and organizations in decision...
Saved in:
Published in: | Computer journal Vol. 67; no. 9; pp. 2864 - 2876 |
---|---|
Main Authors: | , , , |
Format: | Journal Article |
Language: | English |
Published: |
Oxford University Press
09-10-2024
|
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Abstract
Sentiment analysis is a dynamic field focused on understanding and predicting emotional sentiments in text or images. With the prevalence of smartphones, e-commerce and social networks, individuals readily express opinions, aiding businesses, political analysts and organizations in decision-making. Despite extensive research in sentiment analysis for various languages, challenges persist in low-resource languages like Roman Urdu. Roman Urdu, the use of Roman script to write Urdu, has gained popularity, yet limited linguistic resources hinder sentiment analysis research. This study addresses this gap by developing a bidirectional long short-term memory network with FastText embeddings and additional layers. A large Roman Urdu corpus for sentiment analysis, consisting of over 51 000 reviews, is crated and the proposed model is trained and compared with 14 other models, demonstrating an accuracy of 0.854 and an F1-score of 0.84. |
---|---|
ISSN: | 0010-4620 1460-2067 |
DOI: | 10.1093/comjnl/bxae052 |