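Kada se portali „zaraze” koronom: Razvoj i usporedna analiza članaka portala Index.hr 2019. i 2020. godine [When Portals Get "Infected" with Corona: Development and Comparative Analysis of Index.hr Portal Articles from 2019 and 2020]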

Bibliographic Details
Published in: Medijske studije, Vol. 13, no. 25, pp. 27-49
Main Author: Bago, Petra
Format: Journal Article Paper
Language: Croatian
Published: Zagreb: Fakultet političkih znanosti u Zagrebu (Faculty of Political Science, University of Zagreb), 2022
Description
Summary: This paper presents the methodology, tools and results of a comparative computational analysis of online newspaper articles: from collecting the documents and cleaning the language data in order to build specialized corpora of newspaper articles, to describing the tools used and the comparative statistical analysis of the corpora. The research was conducted on two specialized corpora developed specifically for this study, based on 500 newspaper articles from the “News” category of the Index.hr news portal. One corpus is based on articles published in the pre-pandemic year 2019, and the other on articles published in the pandemic year 2020. The analysis shows that the vocabulary of the pandemic corpus is significantly poorer than that of the pre-pandemic corpus, that less was written about the states neighboring the Republic of Croatia in 2020 than in 2019, and that the pre-pandemic corpus mentions domestic cities more often than foreign ones, while the opposite can be argued for the pandemic corpus. Finally, the paper also investigates whether automatic term extraction is adequate for identifying the specific topics covered in the two corpora.
Bibliography: 281477
ISSN: 1847-9758, 1848-5030
DOI: 10.20901/ms.13.25.2