Search Results - "Bojar, Ondřej"
1
Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals
Published in Nature communications (01-09-2020) “…The quality of human translation was long thought to be unattainable for computer translation systems. In this study, we present a deep-learning system,…”
Journal Article
2
Keyphrase Generation: A Multi-Aspect Survey
Published in 2019 25th Conference of Open Innovations Association (FRUCT) (01-11-2019) “…Extractive keyphrase generation research has been around since the nineties, but the more advanced abstractive approach based on the encoder-decoder framework…”
Conference Proceeding Journal Article
3
Understanding the role of FFNs in driving multilingual behaviour in LLMs
Published 21-04-2024 “…Multilingualism in Large Language Models (LLMs) is an yet under-explored area. In this paper, we conduct an in-depth analysis of the multilingual capabilities…”
Journal Article
4
Quality and Quantity of Machine Translation References for Automatic Metrics
Published 02-01-2024 “…Automatic machine translation metrics typically rely on human translations to determine the quality of system translations. Common wisdom in the field dictates…”
Journal Article
5
Boosting Unsupervised Machine Translation with Pseudo-Parallel Data
Published 22-10-2023 “…Ivana Kvapilíková, Ondřej Bojar (2023): Boosting Unsupervised Machine Translation with Pseudo-Parallel Data. In: Proceedings of Machine Translation…”
Journal Article
6
Long-Form End-to-End Speech Translation via Latent Alignment Segmentation
Published 20-09-2023 “…Current simultaneous speech translation models can process audio only up to a few seconds long. Contemporary datasets provide an oracle segmentation into…”
Journal Article
7
Minuteman: Machine and Human Joining Forces in Meeting Summarization
Published 11-09-2023 “…Many meetings require creating a meeting summary to keep everyone up to date. Creating minutes of sufficient quality is however very cognitively demanding…”
Journal Article
8
Character-level NMT and language similarity
Published 08-08-2023 “…We explore the effectiveness of character-level neural machine translation using Transformer architecture for various levels of language similarity and size of…”
Journal Article
9
Breeding Machine Translations: Evolutionary approach to survive and thrive in the world of automated evaluation
Published 30-05-2023 “…We propose a genetic algorithm (GA) based method for modifying n-best lists produced by a machine translation (MT) system. Our method offers an innovative…”
Journal Article
10
Sequence Length is a Domain: Length-based Overfitting in Transformer Models
Published 15-09-2021 “…Transformer-based sequence-to-sequence architectures, while achieving state-of-the-art results on a large number of NLP tasks, can still suffer from…”
Journal Article
11
Coarse-To-Fine And Cross-Lingual ASR Transfer
Published 02-09-2021 “…End-to-end neural automatic speech recognition systems achieved recently state-of-the-art results, but they require large datasets and extensive computing…”
Journal Article
12
Adversarial Testing as a Tool for Interpretability: Length-based Overfitting of Elementary Functions in Transformers
Published 17-10-2024 “…The Transformer model has a tendency to overfit various aspects of the training data, such as the overall sequence length. We study elementary string edit…”
Journal Article
13
Unsupervised Pretraining for Neural Machine Translation Using Elastic Weight Consolidation
Published 19-10-2020 “…This work presents our ongoing research of unsupervised pretraining in neural machine translation (NMT). In our method, we initialize the weights of the…”
Journal Article
14
Presenting Simultaneous Translation in Limited Space
Published 18-09-2020 “…ITAT WAFNL 2020 Some methods of automatic simultaneous translation of a long-form speech allow revisions of outputs, trading accuracy for low latency…”
Journal Article
15
Automating Text Naturalness Evaluation of NLG Systems
Published 23-06-2020 “…Automatic methods and metrics that assess various quality criteria of automatically generated texts are important for developing NLG systems because they…”
Journal Article
16
Turning Whisper into Real-Time Transcription System
Published 27-07-2023 “…Whisper is one of the recent state-of-the-art multilingual speech recognition and translation models, however, it is not designed for real time transcription…”
Journal Article
17
Human or Machine: Automating Human Likeliness Evaluation of NLG Texts
Published 04-06-2020 “…Automatic evaluation of various text quality criteria produced by data-driven intelligent methods is very common and useful because it is cheap, fast, and…”
Journal Article
18
Assessing Word Importance Using Models Trained for Semantic Tasks
Published 31-05-2023 “…Many NLP tasks require to automatically identify the most significant words in a text. In this work, we derive word significance from models trained to solve…”
Journal Article
19
Two Huge Title and Keyword Generation Corpora of Research Articles
Published 11-02-2020 “…Recent developments in sequence-to-sequence learning with neural networks have considerably improved the quality of automatically generated text summaries and…”
Journal Article
20
Multimodal Shannon Game with Images
Published 20-03-2023 “…The Shannon game has long been used as a thought experiment in linguistics and NLP, asking participants to guess the next letter in a sentence based on its…”
Journal Article