Extractive Summarization on Twi Language Using Text-Rank and Fuzzy Clustering Algorithm
Published in: | 2023 1st International Conference on Circuits, Power and Intelligent Systems (CCPIS) pp. 1 - 6 |
---|---|
Main Authors: | , , , , , |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE 01-09-2023 |
Subjects: | |
Summary: | Twi is a dialect of the Akan language spoken in southern and central Ghana by several million people. About 80% of the Ghanaian population speaks Twi as a first or second language. Summarization is a basic natural language processing task that serves both as a step toward more advanced tasks and as an aid in reviewing information. This paper tackles the problem of summarizing Twi texts. Three extractive summarization techniques (TF-IDF, Text-Rank, and Fuzzy C-means Clustering) are implemented. We find that Text-Rank performs best, while TF-IDF, despite being fairly simple, performs comparably to Text-Rank. Our work suggests that embedding-based approaches are potent and that further work should employ them. |
---|---|
DOI: | 10.1109/CCPIS59145.2023.10291396 |