Search Results - "Raffel, Colin"

  1.

    An Empirical Survey of Data Augmentation for Limited Data Learning in NLP by Chen, Jiaao, Tam, Derek, Raffel, Colin, Bansal, Mohit, Yang, Diyi

    “…NLP has achieved great progress in the past decade through the use of neural models and large labeled datasets. The dependence on abundant data prevents NLP…”
    Journal Article
  2.
  3.

    ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models by Xue, Linting, Barua, Aditya, Constant, Noah, Al-Rfou, Rami, Narang, Sharan, Kale, Mihir, Roberts, Adam, Raffel, Colin

    “…Most widely used pre-trained language models operate on sequences of tokens corresponding to word or subword units. By comparison, models that operate directly…”
    Journal Article
  4.

    Learning Hard Alignments with Variational Inference by Lawson, Dieterich, Chiu, Chung-Cheng, Tucker, George, Raffel, Colin, Swersky, Kevin, Jaitly, Navdeep

    “…There has recently been significant interest in hard attention models for tasks such as object recognition, visual captioning and speech recognition. Hard…”
    Conference Proceeding
  5.

    Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model by Deng, Haikang, Raffel, Colin

    Published 02-01-2024
    “…While large language models have proven effective in a huge range of downstream applications, they often generate text that is problematic or lacks a desired…”
    Journal Article
  6.

    Optimizing DTW-based audio-to-MIDI alignment and matching by Raffel, Colin, Ellis, Daniel P. W.

    “…Dynamic time warping (DTW) has proven to be an extremely effective method for both aligning and matching recordings of music to corresponding MIDI…”
    Conference Proceeding Journal Article
  7.

    NPEFF: Non-Negative Per-Example Fisher Factorization by Matena, Michael, Raffel, Colin

    Published 06-10-2023
    “…As deep learning models are deployed in more and more settings, it becomes increasingly important to be able to understand why they produce a given prediction,…”
    Journal Article
  8.

    Learning-Based Methods for Comparing Sequences, with Applications to Audio-to-MIDI Alignment and Matching by Raffel, Colin

    Published 01-01-2016
    “…Sequences of feature vectors are a natural way of representing temporal data. Given a database of sequences, a fundamental task is to find the database entry…”
    Dissertation
  9.

    A Combinatorial Perspective on the Optimization of Shallow ReLU Networks by Matena, Michael, Raffel, Colin

    Published 30-09-2022
    “…The NP-hard problem of optimizing a shallow ReLU network can be characterized as a combinatorial search over each training example's activation pattern…”
    Journal Article
  10.

    Merging Models with Fisher-Weighted Averaging by Matena, Michael, Raffel, Colin

    Published 18-11-2021
    “…Averaging the parameters of models that have the same architecture and initialization can provide a means of combining their respective capabilities. In this…”
    Journal Article
  11.
  12.

    DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows by Patel, Ajay, Raffel, Colin, Callison-Burch, Chris

    Published 15-02-2024
    “…Large language models (LLMs) have become a dominant and important tool for NLP researchers in a wide range of tasks. Today, many researchers use LLMs in…”
    Journal Article
  13.

    Pruning subsequence search with attention-based embedding by Raffel, Colin, Ellis, Daniel P. W.

    “…Searching a large database to find a sequence that is most similar to a query can be prohibitively expensive, particularly if individual sequence comparisons…”
    Conference Proceeding Journal Article
  14.

    Merging by Matching Models in Task Parameter Subspaces by Tam, Derek, Bansal, Mohit, Raffel, Colin

    Published 07-12-2023
    “…Model merging aims to cheaply combine individual task-specific models into a single multitask model. In this work, we view past merging methods as leveraging…”
    Journal Article
  15.

    Soft Merging of Experts with Adaptive Routing by Muqeeth, Mohammed, Liu, Haokun, Raffel, Colin

    Published 06-06-2023
    “…Sparsely activated neural networks with conditional computation learn to route their inputs through different "expert" subnetworks, providing a form of…”
    Journal Article
  16.

    Improving Few-Shot Generalization by Exploring and Exploiting Auxiliary Data by Albalak, Alon, Raffel, Colin, Wang, William Yang

    Published 01-02-2023
    “…Few-shot learning is valuable in many real-world applications, but learning a generalizable model without overfitting to the few labeled datapoints is…”
    Journal Article
  17.

    Realistic Evaluation of Model Merging for Compositional Generalization by Tam, Derek, Kant, Yash, Lester, Brian, Gilitschenski, Igor, Raffel, Colin

    Published 26-09-2024
    “…Merging has become a widespread way to cheaply combine individual models into a single model that inherits their capabilities and attains better performance…”
    Journal Article
  18.

    Estimating timing and channel distortion across related signals by Raffel, Colin, Ellis, Daniel P. W.

    “…We consider the situation where there are multiple audio signals whose relationship is of interest. If these signals have been differently captured, the…”
    Conference Proceeding
  19.

    Compositional Generalization in Unsupervised Compositional Representation Learning: A Study on Disentanglement and Emergent Language by Xu, Zhenlin, Niethammer, Marc, Raffel, Colin

    Published 02-10-2022
    “…Deep learning models struggle with compositional generalization, i.e. the ability to recognize or generate novel combinations of observed elementary concepts…”
    Journal Article
  20.

    Learning to Route Among Specialized Experts for Zero-Shot Generalization by Muqeeth, Mohammed, Liu, Haokun, Liu, Yufan, Raffel, Colin

    Published 08-02-2024
    “…Recently, there has been a widespread proliferation of "expert" language models that are specialized to a specific task or domain through parameter-efficient…”
    Journal Article