Search Results - "Gallé, Matthias"
-
1
The Smallest Grammar Problem as Constituents Choice and Minimal Grammar Parsing
Published in Algorithms (01-12-2011)“…The smallest grammar problem—namely, finding a smallest context-free grammar that generates exactly one sequence—is of practical and theoretical importance in…”
Journal Article -
2
The Rediscovery Hypothesis: Language Models Need to Meet Linguistics
Published in The Journal of artificial intelligence research (01-01-2021)“…There is an ongoing debate in the NLP community whether modern language models contain linguistic knowledge, recovered through so-called probes. In this paper,…”
Journal Article -
3
xkcd-repeats: A new taxonomy of repeats defined by their context diversity
Published in Journal of discrete algorithms (Amsterdam, Netherlands) (01-01-2018)“…The context in which a substring appears is an important notion to identify – for example – its semantic meaning. However, existing definitions from…”
Journal Article -
4
BigScience: A Case Study in the Social Construction of a Multilingual Large Language Model
Published in Psychofenia (09-12-2022)“…The BigScience Workshop was a value-driven initiative that spanned one and half years of interdisciplinary research and culminated in the creation of ROOTS, a…”
Conference Proceeding -
5
Searching for smallest grammars on large sequences and application to DNA
Published in Journal of discrete algorithms (Amsterdam, Netherlands) (01-02-2012)“…Motivated by the inference of the structure of genomic sequences, we address here the smallest grammar problem. In previous work, we introduced a new…”
Journal Article -
6
LLMCRIT: Teaching Large Language Models to Use Criteria
Published 01-03-2024“…Humans follow criteria when they execute tasks, and these criteria are directly used to assess the quality of task completion. Therefore, having models learn…”
Journal Article -
7
What Can I Do Now? Guiding Users in a World of Automated Decisions
Published 13-01-2017“…More and more processes governing our lives use in some part an automatic decision step, where -- based on a feature vector derived from an applicant -- an…”
Journal Article -
8
Improving Reward Models with Synthetic Critiques
Published 31-05-2024“…Reward models (RMs) play a critical role in aligning language models through the process of reinforcement learning from human feedback. RMs are trained to…”
Journal Article -
9
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs
Published 22-02-2024“…AI alignment in the shape of Reinforcement Learning from Human Feedback (RLHF) is increasingly treated as a crucial ingredient for high performance large…”
Journal Article -
10
Speeding Up Entmax
Published 12-11-2021“…Softmax is the de facto standard in modern neural networks for language processing when it comes to normalizing logits. However, by producing a dense…”
Journal Article -
11
Unsupervised and Distributional Detection of Machine-Generated Text
Published 04-11-2021“…The power of natural language generation models has provoked a flurry of interest in automatic methods to detect if a piece of text is human or…”
Journal Article -
12
Multilingual Unsupervised Neural Machine Translation with Denoising Adapters
Published 20-10-2021“…We consider the problem of multilingual unsupervised machine translation, translating to and from languages that only have monolingual data by using auxiliary…”
Journal Article -
13
The Generalized Smallest Grammar Problem
Published 31-08-2016“…The Smallest Grammar Problem -- the problem of finding the smallest context-free grammar that generates exactly one given sequence -- has never been…”
Journal Article -
14
BigScience: A Case Study in the Social Construction of a Multilingual Large Language Model
Published 09-12-2022“…The BigScience Workshop was a value-driven initiative that spanned one and half years of interdisciplinary research and culminated in the creation of ROOTS, a…”
Journal Article -
15
On Leakage of Code Generation Evaluation Datasets
Published 10-07-2024“…In this paper, we consider contamination by code generation test sets, in particular in their use in modern large language models. We discuss three possible…”
Journal Article -
16
Self-Supervised and Controlled Multi-Document Opinion Summarization
Published 30-04-2020“…We address the problem of unsupervised abstractive summarization of collections of user generated reviews with self-supervision and control. We propose a…”
Journal Article -
17
The Rediscovery Hypothesis: Language Models Need to Meet Linguistics
Published 03-01-2022“…Journal of Artificial Intelligence Vol. 72 (2021) 1343-1384 There is an ongoing debate in the NLP community whether modern language models contain linguistic…”
Journal Article -
18
"Roles for the boys?" Mining cast lists for gender and role distributions over time
Published 11-03-2015“…Film and television play an important role in popular culture, however studies that require watching and annotating video are time-consuming and expensive to…”
Journal Article -
19
Character-based NMT with Transformer
Published 12-11-2019“…Character-based translation has several appealing advantages, but its performance is in general worse than a carefully tuned BPE baseline. In this paper we…”
Journal Article -
20
On the Evaluation of Machine Translation for Terminology Consistency
Published 22-06-2021“…As neural machine translation (NMT) systems become an important part of professional translator pipelines, a growing body of work focuses on combining NMT with…”
Journal Article