Search Results - "Longpre, Shayne"

  1.

    MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering by Longpre, Shayne, Lu, Yi, Daiber, Joachim

    “…Progress in cross-lingual modeling depends on challenging, realistic, and diverse evaluation sets. We introduce Multilingual Knowledge Questions and Answers…”
    Journal Article
  2.

    How Big Data Confers Market Power to Big Tech: Leveraging the Perspective of Data Science by Santesteban, Cristian, Longpre, Shayne

    Published in Antitrust bulletin (01-09-2020)
    “…Data-hungry applications are central to the largest online platforms. Using a novel approach that leverages data science to inform the economics, we…”
    Journal Article
  3.
  4.

    Future and AI-Ready Data Strategies: Response to DOC RFI on AI and Open Government Data Assets by Oderinwale, Hamidah, Longpre, Shayne

    Published 26-07-2024
    “…The following is a response to the US Department of Commerce's Request for Information (RFI) regarding AI and Open Government Data Assets. First, we commend…”
    Journal Article
  5.

    A large-scale audit of dataset licensing and attribution in AI by Longpre, Shayne, Mahari, Robert, Chen, Anthony, Obeng-Marnu, Naana, Sileo, Damien, Brannon, William, Muennighoff, Niklas, Khazam, Nathan, Kabbara, Jad, Perisetla, Kartik, Wu, Xinyi (Alexis), Shippole, Enrico, Bollacker, Kurt, Wu, Tongshuang, Villa, Luis, Pentland, Sandy, Hooker, Sara

    Published in Nature machine intelligence (30-08-2024)
    “…The race to train language models on vast, diverse and inconsistently documented datasets raises pressing legal and ethical concerns. To improve data…”
    Journal Article
  6.

    A Systematic Review of NeurIPS Dataset Management Practices by Wu, Yiwei, Ajmani, Leah, Longpre, Shayne, Li, Hanlin

    Published 31-10-2024
    “…As new machine learning methods demand larger training datasets, researchers and developers face significant challenges in dataset management. Although ethics…”
    Journal Article
  7.

    AI-Powered Autonomous Weapons Risk Geopolitical Instability and Threaten AI Research by Simmons-Edler, Riley, Badman, Ryan, Longpre, Shayne, Rajan, Kanaka

    Published 03-05-2024
    “…The recent embrace of machine learning (ML) in the development of autonomous weapons systems (AWS) creates serious risks to geopolitical stability and the free…”
    Journal Article
  8.

    The Foundation Model Transparency Index v1.1: May 2024 by Bommasani, Rishi, Klyman, Kevin, Kapoor, Sayash, Longpre, Shayne, Xiong, Betty, Maslej, Nestor, Liang, Percy

    Published 17-07-2024
    “…Foundation models are increasingly consequential yet extremely opaque. To characterize the status quo, the Foundation Model Transparency Index was launched in…”
    Journal Article
  9.

    Combining Compressions for Multiplicative Size Scaling on Natural Language Tasks by Movva, Rajiv, Lei, Jinhao, Longpre, Shayne, Gupta, Ajay, DuBois, Chris

    Published 20-08-2022
    “…Quantization, knowledge distillation, and magnitude pruning are among the most popular methods for neural network compression in NLP. Independently, these…”
    Journal Article
  10.

    Foundation Model Transparency Reports by Bommasani, Rishi, Klyman, Kevin, Longpre, Shayne, Xiong, Betty, Kapoor, Sayash, Maslej, Nestor, Narayanan, Arvind, Liang, Percy

    Published 25-02-2024
    “…Published in AIES 2024 Foundation models are critical digital technologies with sweeping societal impact that necessitates transparency. To codify how…”
    Journal Article
  11.

    How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers? by Longpre, Shayne, Wang, Yu, DuBois, Christopher

    Published 04-10-2020
    “…Task-agnostic forms of data augmentation have proven widely effective in computer vision, even on pretrained models. In NLP similar results are reported most…”
    Journal Article
  12.

    On the Transferability of Minimal Prediction Preserving Inputs in Question Answering by Longpre, Shayne, Lu, Yi, DuBois, Christopher

    Published 17-09-2020
    “…Recent work (Feng et al., 2018) establishes the presence of short, uninterpretable input fragments that yield high confidence and accuracy in neural models. We…”
    Journal Article
  13.

    MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering by Longpre, Shayne, Lu, Yi, Daiber, Joachim

    Published 29-07-2020
    “…Progress in cross-lingual modeling depends on challenging, realistic, and diverse evaluation sets. We introduce Multilingual Knowledge Questions and Answers…”
    Journal Article
  14.

    The Foundation Model Transparency Index by Bommasani, Rishi, Klyman, Kevin, Longpre, Shayne, Kapoor, Sayash, Maslej, Nestor, Xiong, Betty, Zhang, Daniel, Liang, Percy

    Published 19-10-2023
    “…Foundation models have rapidly permeated society, catalyzing a wave of generative AI applications spanning enterprise and consumer-facing contexts. While the…”
    Journal Article
  15.

    Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them? by Longpre, Shayne, Mahari, Robert, Obeng-Marnu, Naana, Brannon, William, South, Tobin, Gero, Katy, Pentland, Sandy, Kabbara, Jad

    Published 19-04-2024
    “…Proceedings of ICML 2024, in PMLR 235:32711-32725. URL: https://proceedings.mlr.press/v235/longpre24b.html New capabilities in foundation models are owed in…”
    Journal Article
  16.

    Evaluating Entity Disambiguation and the Role of Popularity in Retrieval-Based NLP by Chen, Anthony, Gudipati, Pallavi, Longpre, Shayne, Ling, Xiao, Singh, Sameer

    Published 12-06-2021
    “…Retrieval is a core component for open-domain NLP tasks. In open-domain tasks, multiple entities can share a name, making disambiguation an inherent yet…”
    Journal Article
  17.

    To Err is AI : A Case Study Informing LLM Flaw Reporting Practices by McGregor, Sean, Ettinger, Allyson, Judd, Nick, Albee, Paul, Jiang, Liwei, Rao, Kavel, Smith, Will, Longpre, Shayne, Ghosh, Avijit, Fiorelli, Christopher, Hoang, Michelle, Cattell, Sven, Dziri, Nouha

    Published 15-10-2024
    “…In August of 2024, 495 hackers generated evaluations in an open-ended bug bounty targeting the Open Language Model (OLMo) from The Allen Institute for AI. A…”
    Journal Article
  18.

    Leveraging Query Resolution and Reading Comprehension for Conversational Passage Retrieval by Vakulenko, Svitlana, Voskarides, Nikos, Tu, Zhucheng, Longpre, Shayne

    Published 17-02-2021
    “…This paper describes the participation of UvA.ILPS group at the TREC CAsT 2020 track. Our passage retrieval pipeline consists of (i) an initial retrieval…”
    Journal Article
  19.

    A Comparison of Question Rewriting Methods for Conversational Passage Retrieval by Vakulenko, Svitlana, Voskarides, Nikos, Tu, Zhucheng, Longpre, Shayne

    Published 18-01-2021
    “…Conversational passage retrieval relies on question rewriting to modify the original question so that it no longer depends on the conversation history. Several…”
    Journal Article
  20.

    Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models by Kim, Seungone, Suk, Juyoung, Longpre, Shayne, Lin, Bill Yuchen, Shin, Jamin, Welleck, Sean, Neubig, Graham, Lee, Moontae, Lee, Kyungjae, Seo, Minjoon

    Published 02-05-2024
    “…Proprietary LMs such as GPT-4 are often employed to assess the quality of responses from various LMs. However, concerns including transparency,…”
    Journal Article