Towards hate speech detection in low-resource languages: Comparing ASR to acoustic word embeddings on Wolof and Swahili
Main Authors: | , , , , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 01-06-2023 |
Subjects: | |
Summary: | We consider hate speech detection through keyword spotting on radio broadcasts. One approach is to build an automatic speech recognition (ASR) system for the target low-resource language. We compare this to using acoustic word embedding (AWE) models that map speech segments to a space where matching words have similar vectors. We specifically use a multilingual AWE model trained on labelled data from well-resourced languages to spot keywords in data in the unseen target language. In contrast to ASR, the AWE approach only requires a few keyword exemplars. In controlled experiments on Wolof and Swahili where training and test data are from the same domain, an ASR model trained on just five minutes of data outperforms the AWE approach. But in an in-the-wild test on Swahili radio broadcasts with actual hate speech keywords, the AWE model (using one minute of template data) is more robust, giving similar performance to an ASR system trained on 30 hours of labelled data. |
---|---|
DOI: | 10.48550/arxiv.2306.00410 |
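
The summary describes keyword spotting with acoustic word embeddings: speech segments are mapped to vectors, and a segment is flagged when its vector lies close to that of a keyword exemplar. The Python sketch below illustrates only that matching step, under stated assumptions. The embeddings are hand-written toy vectors rather than outputs of a real AWE model, cosine similarity is assumed as the comparison measure, and the function names and the 0.8 threshold are hypothetical choices, not taken from the paper.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Return pairwise cosine similarities between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def spot_keyword(template_embeddings, segment_embeddings, threshold=0.8):
    """Flag segments whose best match against any keyword template reaches the threshold.

    The threshold is an illustrative value; in practice it would be tuned on held-out data.
    """
    sims = cosine_similarity_matrix(segment_embeddings, template_embeddings)
    best = sims.max(axis=1)  # best-matching template for each segment
    return np.flatnonzero(best >= threshold), best


# Toy stand-ins for AWE vectors (assumption: a real model would produce these).
# Two exemplars of the same keyword:
templates = np.array([
    [1.0, 0.1, 0.0, 0.0],
    [0.9, 0.0, 0.1, 0.0],
])
# Three candidate speech segments; only the second resembles the keyword:
segments = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.05, 0.05, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

hits, scores = spot_keyword(templates, segments)
```

Taking the maximum over templates means a segment needs to match only one exemplar, which fits the setting where just a few keyword exemplars are available; averaging over templates would be an equally plausible alternative.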