Extractive Summarization on Twi Language Using Text-Rank and Fuzzy Clustering Algorithm

Bibliographic Details
Published in: 2023 1st International Conference on Circuits, Power and Intelligent Systems (CCPIS), pp. 1-6
Main Authors: Dhole, Sayak; Sagnika, Santwana; Dominic, Ampofo; Mishra, Bhabani Shankar Prasad; Fernando, Aloka; Dash, Satya Ranjan
Format: Conference Proceeding
Language: English
Published: IEEE 01-09-2023
Description
Summary: Twi is a dialect of the Akan language spoken in southern and central Ghana by several million people. About 80% of the Ghanaian population speaks Twi as a first or second language. Summarization is a basic natural language processing task that serves both as a step toward more advanced tasks and as an aid in reviewing information. This paper tackles the problem of summarizing Twi texts. Three extractive summarization techniques are implemented: TF-IDF, Text-Rank and Fuzzy C-means Clustering. We find that Text-Rank performs best, while TF-IDF, despite being fairly simple, performs comparably well. Our work suggests that embedding-based approaches are potent and that further work should employ them.
DOI: 10.1109/CCPIS59145.2023.10291396
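
Illustration: the summary names TF-IDF and Text-Rank as extractive techniques but the record does not show how they work. The following is a minimal sketch of the general idea, not the authors' implementation. It assumes sentences are already split, uses TF-IDF cosine similarity as the edge weight between sentences (the paper may use a different similarity or embeddings), and leaves out Fuzzy C-means. All function names (`tokenize`, `tfidf_vectors`, `cosine`, `textrank_scores`, `summarize`) and parameter values are hypothetical, and the sample sentences are English placeholders rather than Twi data.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Lowercase word tokens; \w is Unicode-aware, so letters such as ɛ and ɔ are kept.
    return re.findall(r"\w+", sentence.lower())


def tfidf_vectors(sentences):
    # Treat each sentence as a "document" and build sparse TF-IDF vectors (smoothed IDF).
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def textrank_scores(sentences, damping=0.85, iterations=50):
    # Weighted PageRank over the sentence-similarity graph (assumed parameter values).
    n = len(sentences)
    if n == 0:
        return []
    vectors = tfidf_vectors(sentences)
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(sim[j][i] / out_weight[j] * scores[j]
                       for j in range(n) if out_weight[j] > 0)
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def summarize(sentences, k=2):
    # Keep the k highest-scoring sentences, returned in their original order.
    scores = textrank_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    sample = [
        "The cat sat on the mat.",
        "The cat chased the mouse.",
        "Stock prices fell sharply today.",
        "The mouse hid under the mat.",
    ]
    for sentence in summarize(sample, k=2):
        print(sentence)
```

In this sketch a sentence that shares no vocabulary with the rest of the text receives only the baseline score, so it is the last to be selected. A TF-IDF-only baseline could instead rank each sentence by the sum of its own term weights, without building the graph.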