Extractive Summarization on Twi Language Using Text-Rank and Fuzzy Clustering Algorithm

Bibliographic Details
Published in:	2023 1st International Conference on Circuits, Power and Intelligent Systems (CCPIS), pp. 1-6
Main Authors:	Dhole, Sayak; Sagnika, Santwana; Dominic, Ampofo; Mishra, Bhabani Shankar Prasad; Fernando, Aloka; Dash, Satya Ranjan
Format:	Conference Proceeding
Language:	English
Published:	IEEE 01-09-2023
Subjects:	Clustering algorithms; Fuzzy Clustering; Linguistics; Measurement; Natural language processing; Sociology; Statistics; Summarization; Task analysis; Text-Rank; TF-IDF; Twi Language

Description
Summary:	Twi is a dialect of the Akan language spoken in southern and central Ghana by several million people. About 80% of the Ghanaian population speaks Twi as a first or second language. Summarization is a basic natural language processing task that serves both as a step toward more advanced tasks and as an aid in reviewing information. This paper tackles the problem of summarizing Twi texts. Three extractive summarization techniques (TF-IDF, Text-Rank, and Fuzzy C-means clustering) are implemented. We find that Text-Rank performs best, while TF-IDF, despite being fairly simple, performs comparably to Text-Rank. Our work suggests that embedding-based approaches are potent and that further work should employ them.
DOI:	10.1109/CCPIS59145.2023.10291396
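
For readers unfamiliar with Text-Rank, the technique the summary reports as performing best, the following is a minimal sketch of graph-based extractive summarization. It is not the authors' implementation. The function names (`split_sentences`, `tokenize`, `similarity`, `text_rank`), the regex-based sentence splitter, the word-overlap similarity, and the damping and iteration values are all assumptions made for illustration. The record does not say how the paper tokenizes Twi text or which sentence representation it uses.

```python
import math
import re


def split_sentences(text):
    # Naive split after ., ! or ? followed by whitespace.
    # Assumption: a real system would use a Twi-aware sentence splitter.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    # Lowercased word tokens; \w is Unicode-aware for str in Python 3.
    return re.findall(r"\w+", sentence.lower())


def similarity(a, b):
    # Shared distinct words, normalised by the log of each sentence length.
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom <= 0:  # both sentences have a single token
        return 0.0
    return len(set(a) & set(b)) / denom


def text_rank(text, k=2, damping=0.85, iterations=50):
    """Return the k highest-ranked sentences, in their original order."""
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    if n == 0:
        return []

    # Weighted, undirected sentence graph stored as an adjacency matrix.
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]

    # Power iteration of the weighted PageRank update.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `text_rank(document, k=3)` on a plain-text string returns the three sentences with the highest graph centrality. Swapping `similarity` for cosine similarity over TF-IDF or embedding vectors would change only the edge weights and leave the ranking loop untouched.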