Search Results - "Klubička, Filip"
1
Shapley Idioms: Analysing BERT Sentence Embeddings for General Idiom Token Identification
Published in Frontiers in artificial intelligence (14-03-2022)“…This article examines the basis of Natural Language Understanding of transformer based language models, such as BERT. It does this through a case study on…”
Journal Article
2
Quantitative fine-grained human evaluation of machine translation systems: a case study on English to Croatian
Published in Machine translation (01-09-2018)“…This paper presents a quantitative fine-grained manual evaluation approach to comparing the performance of different machine translation (MT) systems. We build…”
Journal Article
3
Size Matters: The Impact of Training Size in Taxonomically-Enriched Word Embeddings
Published in Open computer science (11-10-2019)“…Word embeddings trained on natural corpora (e.g., newspaper collections, Wikipedia or the Web) excel in capturing thematic similarity (“topical relatedness”)…”
Journal Article
4
Crawl and crowd to bring machine translation to under-resourced languages
Published in Language Resources and Evaluation (01-12-2017)“…We present a widely applicable methodology to bring machine translation (MT) to under-resourced languages in a cost-effective and rapid manner. Our proposal…”
Journal Article
5
SHARING HIGH-QUALITY LANGUAGE RESOURCES IN THE LEGAL DOMAIN TO DEVELOP NEURAL MACHINE TRANSLATION FOR UNDER-RESOURCED EUROPEAN LANGUAGES
Published in Revista de llengua i dret (01-12-2022)“…This article reports some of the main achievements of the European Union-funded PRINCIPLE project in collecting high-quality language resources (LRs) in the…”
Journal Article
6
Probing Taxonomic and Thematic Embeddings for Taxonomic Information
Published 25-01-2023“…Modelling taxonomic and thematic relatedness is important for building AI with comprehensive natural language understanding. The goal of this paper is to learn…”
Journal Article
7
Probing with Noise: Unpicking the Warp and Weft of Embeddings
Published 21-10-2022“…Improving our understanding of how information is encoded in vector space can yield valuable interpretability insights. Alongside vector dimensions, we argue…”
Journal Article
8
Idioms, Probing and Dangerous Things: Towards Structural Probing for Idiomaticity in Vector Space
Published 27-04-2023“…The goal of this paper is to learn more about how idiomatic information is structurally encoded in embeddings, using a structural probing method. We repurpose…”
Journal Article
9
Examining a hate speech corpus for hate speech detection and popularity prediction
Published 12-05-2018“…In Proceedings of 4REAL Workshop 9-16 (2018) As research on hate speech becomes more and more relevant every day, most of it is still focused on hate speech…”
Journal Article
10
Is it worth it? Budget-related evaluation metrics for model selection
Published 18-07-2018“…Creating a linguistic resource is often done by using a machine learning model that filters the content that goes through to a human annotator, before going…”
Journal Article
11
Quantitative Fine-Grained Human Evaluation of Machine Translation Systems: a Case Study on English to Croatian
Published 02-02-2018“…Machine Translation, pp 1-21, (2018), http://rdcu.be/GIkb This paper presents a quantitative fine-grained manual evaluation approach to comparing the…”
Journal Article
12
Fine-grained human evaluation of neural versus phrase-based machine translation
Published 14-06-2017“…The Prague Bulletin of Mathematical Linguistics No. 108, pp. 121-132 (2017) We compare three approaches to statistical machine translation (pure phrase-based,…”
Journal Article
13
Queer In AI: A Case Study in Community-Led Participatory AI
Published 08-06-2023“…2023 ACM Conference on Fairness, Accountability, and Transparency We present Queer in AI as a case study for community-led participatory design in AI. We…”
Journal Article
14
Collaborative development of a rule-based machine translator between Croatian and Serbian
Published in Baltic Journal of Modern Computing (01-01-2016)“…This paper describes the development and current state of a bidirectional Croatian-Serbian machine translation system based on the open-source Apertium…”
Journal Article
15
Dealing with Data Sparseness in SMT with Factored Models and Morphological Expansion: a Case Study on Croatian
Published in Baltic Journal of Modern Computing (01-01-2016)“…This paper describes our experience using available linguistic resources for Croatian in order to address data sparseness when building an English-to-Croatian…”
Journal Article
16
Semantic Relatedness and Taxonomic Word Embeddings
Published 14-02-2020“…This paper connects a series of papers dealing with taxonomic word embeddings. It begins by noting that there are different types of semantic relatedness and…”
Journal Article
17
Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP
Published 02-05-2023“…We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human…”
Journal Article