Detection of Hate-Speech Text on Indonesian Twitter Social Media Using IndoBERTweet-BiLSTM-CNN

Bibliographic Details
Published in:	2024 12th International Conference on Information and Communication Technology (ICoICT), pp. 374-381
Main Authors:	Hakim, Atalla Naufal; Sibaroni, Yuliant; Prasetyowati, Sri Suryani
Format:	Conference Proceeding
Language:	English
Published:	IEEE 07-08-2024
Subjects:	Accuracy; BiLSTM; Blogs; CNN; Data models; Hate speech; IndoBERTweet; Noise; Optimization; Reviews; Social networking (online); Testing; Text categorization; Text Classification

Description
Summary:	Twitter has become a second space in people's lives for self-expression. Users can comment on whatever they want, and comments containing hate-speech are not uncommon. Left unchecked, hate-speech can spread quickly, so it is necessary to detect it. In this research, hate-speech detection was carried out using IndoBERTweet, a BERT-based model pre-trained on Indonesian-language Twitter data, which makes it suitable for classifying Indonesian-language texts. BiLSTM and CNN are deep-learning methods that can also be used for text classification. This research aims to detect hate-speech texts using each of these three methods and then a combination of them. For optimization, experiments were carried out on batch size and learning rate values. With a batch size of 8 and a learning rate of 0.001, the best accuracy is 85.45% and the F1-score is 85.06%.
DOI:	10.1109/ICoICT61617.2024.10698615