An analysis of Google Translate and DeepL translation of source text typographical errors in the economic and legal fields

Bibliographic Details
Published in: Revista de Llengua i Dret, pp. 88-105
Main Author: Santiago Rodríguez-Rubio Mediavilla
Format: Journal Article
Language: Aragonese; Spanish; English
Published: Escola d'Administració Pública de Catalunya, 01-06-2024
Description
Summary: Training neural machine translation systems with noisy data has been shown to improve robustness (Heigold et al., 2018). The objective of the present study is to test the performance of Google Translate and DeepL in detecting and correcting typographical errors, by introducing 1,820 source text typos found in previous work on specialised Spanish-English dictionaries (Rodríguez-Rubio & Fernández-Quesada, 2020a, 2020b; Rodríguez-Rubio Mediavilla, 2021). Typos were introduced both in isolation and in co-text. Results showed that Google Translate clearly outperformed DeepL. Moreover, repetition of the source typo was the most frequent phenomenon in the machine translation output of both systems. By shedding light on the capacity of these systems to deal with source text typographical errors, our study aims to provide a starting point for their refinement.
ISSN: 0212-5056; 2013-1453
DOI: 10.58992/rld.i81.2024.4188