Search Results - "Yogatama, Dani"

  1.

    Adaptive Semiparametric Language Models by Yogatama, Dani, de Masson d’Autume, Cyprien, Kong, Lingpeng

    “…We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component in an…”
    Journal Article
  2.

    Questions Are All You Need to Train a Dense Passage Retriever by Sachan, Devendra Singh, Lewis, Mike, Yogatama, Dani, Zettlemoyer, Luke, Pineau, Joelle, Zaheer, Manzil

    “…We introduce a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data. Dense retrieval is…”
    Journal Article
  3.

    Relational Memory-Augmented Language Models by Liu, Qi, Yogatama, Dani, Blunsom, Phil

    “…We present a memory-augmented approach to condition an autoregressive language model on a knowledge graph. We represent the graph as a collection of relation…”
    Journal Article
  4.
  5.

    Syntactic Structure Distillation Pretraining for Bidirectional Encoders by Kuncoro, Adhiguna, Kong, Lingpeng, Fried, Daniel, Yogatama, Dani, Rimell, Laura, Dyer, Chris, Blunsom, Phil

    “…Textual representation learners trained on large amounts of data have achieved notable success on downstream tasks; intriguingly, they have also performed well…”
    Journal Article
  6.

    Understanding In-Context Learning with a Pelican Soup Framework by Chiang, Ting-Rui, Yogatama, Dani

    Published 15-02-2024
    “…Many existing theoretical analyses of in-context learning for natural language processing are based on latent variable models that leave gaps between theory…”
    Journal Article
  7.

    The Distributional Hypothesis Does Not Fully Explain the Benefits of Masked Language Model Pretraining by Chiang, Ting-Rui, Yogatama, Dani

    Published 24-10-2023
    “…We analyze the masked language modeling pretraining objective function from the perspective of the distributional hypothesis. We investigate whether better…”
    Journal Article
  8.

    Modelling Latent Skills for Multitask Language Generation by Cao, Kris, Yogatama, Dani

    Published 21-02-2020
    “…We present a generative model for multitask conditional language generation. Our guiding hypothesis is that a shared set of latent skills underlies many…”
    Journal Article
  9.

    Causal Interventions on Causal Paths: Mapping GPT-2's Reasoning From Syntax to Semantics by Lee, Isabelle, Lum, Joshua, Liu, Ziyi, Yogatama, Dani

    Published 28-10-2024
    “…While interpretability research has shed light on some internal algorithms utilized by transformer-based LLMs, reasoning in natural language, with its deep…”
    Journal Article
  10.

    LocateBench: Evaluating the Locating Ability of Vision Language Models by Chiang, Ting-Rui, Robinson, Joshua, Yu, Xinyan Velocity, Yogatama, Dani

    Published 17-10-2024
    “…The ability to locate an object in an image according to natural language instructions is crucial for many real-world applications. In this work we propose…”
    Journal Article
  11.

    DeLLMa: Decision Making Under Uncertainty with Large Language Models by Liu, Ollie, Fu, Deqing, Yogatama, Dani, Neiswanger, Willie

    Published 04-02-2024
    “…The potential of large language models (LLMs) as decision support tools is increasingly being explored in fields such as business, engineering, and medicine,…”
    Journal Article
  12.

    Relational Memory Augmented Language Models by Liu, Qi, Yogatama, Dani, Blunsom, Phil

    Published 24-01-2022
    “…We present a memory-augmented approach to condition an autoregressive language model on a knowledge graph. We represent the graph as a collection of relation…”
    Journal Article
  13.

    Balancing Average and Worst-case Accuracy in Multitask Learning by Michel, Paul, Ruder, Sebastian, Yogatama, Dani

    Published 12-10-2021
    “…When training and evaluating machine learning models on a large number of tasks, it is important to not only look at average task accuracy -- which may be…”
    Journal Article
  14.

    The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities by Wu, Zhaofeng, Yu, Xinyan Velocity, Yogatama, Dani, Lu, Jiasen, Kim, Yoon

    Published 07-11-2024
    “…Modern language models can process inputs across diverse languages and modalities. We hypothesize that models acquire this capability through learning a shared…”
    Journal Article
  15.

    IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations by Fu, Deqing, Guo, Ruohao, Khalighinejad, Ghazal, Liu, Ollie, Dhingra, Bhuwan, Yogatama, Dani, Jia, Robin, Neiswanger, Willie

    Published 01-04-2024
    “…Current foundation models exhibit impressive capabilities when prompted either with text only or with both image and text inputs. But do their capabilities…”
    Journal Article
  16.

    On Retrieval Augmentation and the Limitations of Language Model Training by Chiang, Ting-Rui, Yu, Xinyan Velocity, Robinson, Joshua, Liu, Ollie, Lee, Isabelle, Yogatama, Dani

    Published 16-11-2023
    “…Augmenting a language model (LM) with $k$-nearest neighbors ($k$NN) retrieval on its training data alone can decrease its perplexity, though the underlying…”
    Journal Article
  17.

    Interpretable Diffusion via Information Decomposition by Kong, Xianghao, Liu, Ollie, Li, Han, Yogatama, Dani, Steeg, Greg Ver

    Published 11-10-2023
    “…Denoising diffusion models enable conditional generation and density modeling of complex relationships like images and text. However, the nature of the learned…”
    Journal Article
  18.

    On the Cross-lingual Transferability of Monolingual Representations by Artetxe, Mikel, Ruder, Sebastian, Yogatama, Dani

    Published 26-05-2020
    “…State-of-the-art unsupervised multilingual models (e.g., multilingual BERT) have been shown to generalize in a zero-shot cross-lingual setting. This…”
    Journal Article
  19.

    Dynamic Language Models for Streaming Text by Yogatama, Dani, Wang, Chong, Routledge, Bryan R., Smith, Noah A., Xing, Eric P.

    “…We present a probabilistic language model that captures temporal dynamics and conditions on arbitrary context features. These context features serve as…”
    Journal Article
  20.

    Adaptive Semiparametric Language Models by Yogatama, Dani, de Masson d'Autume, Cyprien, Kong, Lingpeng

    Published 04-02-2021
    “…We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component in an…”
    Journal Article