Search Results - "Thrush, Tristan"

1
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality by Thrush, Tristan, Jiang, Ryan, Bartolo, Max, Singh, Amanpreet, Williams, Adina, Kiela, Douwe, Ross, Candace

Published in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (01-06-2022)
“…We present a novel task and dataset for evaluating the ability of vision and language models to conduct visio-linguistic compositional reasoning, which we call…”

Conference Proceeding

2
Compositional Neural Machine Translation by Removing the Lexicon from Syntax by Thrush, Tristan

Published 06-02-2020
“…The meaning of a natural language utterance is largely determined from its syntax and words. Additionally, there is evidence that humans process an utterance…”

Journal Article

3
Rover Relocalization for Mars Sample Return by Virtual Template Synthesis and Matching by Pham, Tu-Hoa, Seto, William, Daftry, Shreyansh, Ridge, Barry, Hansen, Johanna, Thrush, Tristan, Van der Merwe, Mark, Maggiolino, Gerard, Brinkman, Alexander, Mayo, John, Cheng, Yang, Padgett, Curtis, Kulczycki, Eric, Detry, Renaud

Published in IEEE robotics and automation letters (01-04-2021)
“…We consider the problem of rover relocalization in the context of the notional Mars Sample Return campaign. In this campaign, a rover (R1) needs to be capable…”

Journal Article

4
Improving Pretraining Data Using Perplexity Correlations by Thrush, Tristan, Potts, Christopher, Hashimoto, Tatsunori

Published 09-09-2024
“…Quality pretraining data is often seen as the key to high-performance language models. However, progress in understanding pretraining data has been slow due to…”

Journal Article

5
ColorSwap: A Color and Word Order Dataset for Multimodal Evaluation by Burapacheep, Jirayu, Gaur, Ishan, Bhatia, Agam, Thrush, Tristan

Published 06-02-2024
“…This paper introduces the ColorSwap dataset, designed to assess and improve the proficiency of multimodal models in matching objects with their colors. The…”

Journal Article

6
I am a Strange Dataset: Metalinguistic Tests for Language Models by Thrush, Tristan, Moore, Jared, Monares, Miguel, Potts, Christopher, Kiela, Douwe

Published 10-01-2024
“…Statements involving metalinguistic self-reference ("This paper has six sections.") are prevalent in many domains. Can current large language models (LLMs)…”

Journal Article

7
Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language by Berrios, William, Mittal, Gautam, Thrush, Tristan, Kiela, Douwe, Singh, Amanpreet

Published 28-06-2023
“…We propose LENS, a modular approach for tackling computer vision problems by leveraging the power of large language models (LLMs). Our system uses a language…”

Journal Article

8
Nearest Neighbor Normalization Improves Multimodal Retrieval by Chowdhury, Neil, Wang, Franklin, Shenoy, Sumedh, Kiela, Douwe, Schwettmann, Sarah, Thrush, Tristan

Published 31-10-2024
“…EMNLP 2024. Multimodal models leverage large-scale pre-training to achieve strong but still imperfect performance on tasks such as image captioning, visual…”

Journal Article

9
Investigating Novel Verb Learning in BERT: Selectional Preference Classes and Alternation-Based Syntactic Generalization by Thrush, Tristan, Wilcox, Ethan, Levy, Roger

Published 04-11-2020
“…Previous studies investigating the syntactic abilities of deep learning models have not targeted the relationship between the strength of the grammatical…”

Journal Article

10
ANLIzing the Adversarial Natural Language Inference Dataset by Williams, Adina, Thrush, Tristan, Kiela, Douwe

Published 23-10-2020
“…We perform an in-depth error analysis of Adversarial NLI (ANLI), a recently introduced large-scale human-and-model-in-the-loop natural language inference…”

Journal Article

11
Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection by Vidgen, Bertie, Thrush, Tristan, Waseem, Zeerak, Kiela, Douwe

Published 31-12-2020
“…We present a human-and-model-in-the-loop process for dynamically generating datasets and training better performing and more robust hate detection models. We…”

Journal Article

12
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality by Thrush, Tristan, Jiang, Ryan, Bartolo, Max, Singh, Amanpreet, Williams, Adina, Kiela, Douwe, Ross, Candace

Published 06-04-2022
“…We present a novel task and dataset for evaluating the ability of vision and language models to conduct visio-linguistic compositional reasoning, which we call…”

Journal Article

13
Improving Question Answering Model Robustness with Synthetic Adversarial Data Generation by Bartolo, Max, Thrush, Tristan, Jia, Robin, Riedel, Sebastian, Stenetorp, Pontus, Kiela, Douwe

Published 15-03-2022
“…Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 8830-8848. Association for Computational Linguistics. Despite recent…”

Journal Article

14
Models in the Loop: Aiding Crowdworkers with Generative Annotation Assistants by Bartolo, Max, Thrush, Tristan, Riedel, Sebastian, Stenetorp, Pontus, Jia, Robin, Kiela, Douwe

Published 16-12-2021
“…In Dynamic Adversarial Data Collection (DADC), human annotators are tasked with finding examples that models struggle to predict correctly. Models trained on…”

Journal Article

15
Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-based Hate by Kirk, Hannah Rose, Vidgen, Bertram, Röttger, Paul, Thrush, Tristan, Hale, Scott A

Published 12-08-2021
“…2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022). Detecting online hate is a complex task, and…”

Journal Article

16
Measuring Data by Mitchell, Margaret, Luccioni, Alexandra Sasha, Lambert, Nathan, Gerchick, Marissa, McMillan-Major, Angelina, Ozoani, Ezinwanne, Rajani, Nazneen, Thrush, Tristan, Jernite, Yacine, Kiela, Douwe

Published 09-12-2022
“…We identify the task of measuring data to quantitatively characterize the composition of machine learning data and datasets. Similar to an object's height,…”

Journal Article

17
Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking by Ma, Zhiyi, Ethayarajh, Kawin, Thrush, Tristan, Jain, Somya, Wu, Ledell, Jia, Robin, Potts, Christopher, Williams, Adina, Kiela, Douwe

Published 20-05-2021
“…We introduce Dynaboard, an evaluation-as-a-service framework for hosting benchmarks and conducting holistic model comparison, integrated with the Dynabench…”

Journal Article

18
Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks by Thrush, Tristan, Tirumala, Kushal, Gupta, Anmol, Bartolo, Max, Rodriguez, Pedro, Kane, Tariq, Rojas, William Gaviria, Mattson, Peter, Williams, Adina, Kiela, Douwe

Published 04-04-2022
“…We introduce Dynatask: an open source system for setting up custom NLP tasks that aims to greatly lower the technical knowledge and effort required for hosting…”

Journal Article

19
Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements by von Werra, Leandro, Tunstall, Lewis, Thakur, Abhishek, Luccioni, Alexandra Sasha, Thrush, Tristan, Piktus, Aleksandra, Marty, Felix, Rajani, Nazneen, Mustar, Victor, Ngo, Helen, Sanseviero, Omar, Šaško, Mario, Villanova, Albert, Lhoest, Quentin, Chaumond, Julien, Mitchell, Margaret, Rush, Alexander M, Wolf, Thomas, Kiela, Douwe

Published 30-09-2022
“…Evaluation is a key part of machine learning (ML), yet there is a lack of support and tooling to enable its informed and systematic practice. We introduce…”

Journal Article

20
Dynabench: Rethinking Benchmarking in NLP by Kiela, Douwe, Bartolo, Max, Nie, Yixin, Kaushik, Divyansh, Geiger, Atticus, Wu, Zhengxuan, Vidgen, Bertie, Prasad, Grusha, Singh, Amanpreet, Ringshia, Pratik, Ma, Zhiyi, Thrush, Tristan, Riedel, Sebastian, Waseem, Zeerak, Stenetorp, Pontus, Jia, Robin, Bansal, Mohit, Potts, Christopher, Williams, Adina

Published 07-04-2021
“…We introduce Dynabench, an open-source platform for dynamic dataset creation and model benchmarking. Dynabench runs in a web browser and supports…”

Journal Article
