Classifying the Mexican epidemiological semaphore colour from the Covid-19 text Spanish news
Published in: | Journal of information science Vol. 50; no. 3; pp. 568 - 589 |
---|---|
Format: | Journal Article |
Language: | English |
Published: | London, England; SAGE Publications; 01-06-2024; Bowker-Saur Ltd |
Summary: | This work aims to build classification models that help determine the colour of an epidemiological semaphore (ES) by analysing online news, with the goal of being better prepared for changes in the evolution of the pandemic. To this end, we introduce the Cov-NES-Mex corpus, a collection of 77,983 Covid-19 news items, labelled with the Mexican ES system, covering the 32 regions of Mexico. We report measures showing that the corpus is imbalanced and has a high vocabulary overlap between classes, and we propose measurements for evaluating the pandemic by region. A classification model based on a transformer architecture specialised for the Spanish language achieved an F-measure of up to 0.83. This work thus provides evidence that the news contains essential information that can be used to determine the colour of the ES up to 4 weeks in advance. Finally, the results could be applied to other Spanish-speaking countries that do not have an ES system, allowing their situation to be inferred and compared against the Mexican ES. |
---|---|
ISSN: | 0165-5515, 1741-6485 |
DOI: | 10.1177/01655515221100952 |