Search Results - "Yogatama, Dani"
1
Adaptive Semiparametric Language Models
Published in Transactions of the Association for Computational Linguistics (01-01-2021)
“…We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component in an…”
Journal Article
2
Questions Are All You Need to Train a Dense Passage Retriever
Published in Transactions of the Association for Computational Linguistics (20-06-2023)
“…We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data. Dense retrieval is…”
Journal Article
3
Relational Memory-Augmented Language Models
Published in Transactions of the Association for Computational Linguistics (04-05-2022)
“…We present a memory-augmented approach to condition an autoregressive language model on a knowledge graph. We represent the graph as a collection of relation…”
Journal Article
4
Grandmaster level in StarCraft II using multi-agent reinforcement learning
Published in Nature (London) (01-11-2019)
“…Many real-world applications require artificial agents to compete and coordinate with other agents in complex environments. As a stepping stone to this goal,…”
Journal Article
5
Syntactic Structure Distillation Pretraining for Bidirectional Encoders
Published in Transactions of the Association for Computational Linguistics (01-01-2020)
“…Textual representation learners trained on large amounts of data have achieved notable success on downstream tasks; intriguingly, they have also performed well…”
Journal Article
6
Understanding In-Context Learning with a Pelican Soup Framework
Published 15-02-2024
“…Many existing theoretical analyses of in-context learning for natural language processing are based on latent variable models that leave gaps between theory…”
Journal Article
7
The Distributional Hypothesis Does Not Fully Explain the Benefits of Masked Language Model Pretraining
Published 24-10-2023
“…We analyze the masked language modeling pretraining objective function from the perspective of the distributional hypothesis. We investigate whether better…”
Journal Article
8
Modelling Latent Skills for Multitask Language Generation
Published 21-02-2020
“…We present a generative model for multitask conditional language generation. Our guiding hypothesis is that a shared set of latent skills underlies many…”
Journal Article
9
Causal Interventions on Causal Paths: Mapping GPT-2's Reasoning From Syntax to Semantics
Published 28-10-2024
“…While interpretability research has shed light on some internal algorithms utilized by transformer-based LLMs, reasoning in natural language, with its deep…”
Journal Article
10
LocateBench: Evaluating the Locating Ability of Vision Language Models
Published 17-10-2024
“…The ability to locate an object in an image according to natural language instructions is crucial for many real-world applications. In this work we propose…”
Journal Article
11
DeLLMa: Decision Making Under Uncertainty with Large Language Models
Published 04-02-2024
“…The potential of large language models (LLMs) as decision support tools is increasingly being explored in fields such as business, engineering, and medicine,…”
Journal Article
12
Relational Memory Augmented Language Models
Published 24-01-2022
“…We present a memory-augmented approach to condition an autoregressive language model on a knowledge graph. We represent the graph as a collection of relation…”
Journal Article
13
Balancing Average and Worst-case Accuracy in Multitask Learning
Published 12-10-2021
“…When training and evaluating machine learning models on a large number of tasks, it is important to not only look at average task accuracy -- which may be…”
Journal Article
14
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
Published 07-11-2024
“…Modern language models can process inputs across diverse languages and modalities. We hypothesize that models acquire this capability through learning a shared…”
Journal Article
15
IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations
Published 01-04-2024
“…Current foundation models exhibit impressive capabilities when prompted either with text only or with both image and text inputs. But do their capabilities…”
Journal Article
16
On Retrieval Augmentation and the Limitations of Language Model Training
Published 16-11-2023
“…Augmenting a language model (LM) with $k$-nearest neighbors ($k$NN) retrieval on its training data alone can decrease its perplexity, though the underlying…”
Journal Article
17
Interpretable Diffusion via Information Decomposition
Published 11-10-2023
“…Denoising diffusion models enable conditional generation and density modeling of complex relationships like images and text. However, the nature of the learned…”
Journal Article
18
On the Cross-lingual Transferability of Monolingual Representations
Published 26-05-2020
“…State-of-the-art unsupervised multilingual models (e.g., multilingual BERT) have been shown to generalize in a zero-shot cross-lingual setting. This…”
Journal Article
19
Dynamic Language Models for Streaming Text
Published in Transactions of the Association for Computational Linguistics (01-12-2014)
“…We present a probabilistic language model that captures temporal dynamics and conditions on arbitrary context features. These context features serve as…”
Journal Article
20
Adaptive Semiparametric Language Models
Published 04-02-2021
“…We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component in an…”
Journal Article