Search Results - "Kirchenbauer, John"
1. What is Your Metric Telling You? Evaluating Classifier Calibration under Context-Specific Definitions of Reliability
Published 23-05-2022. “…Classifier calibration has received recent attention from the machine learning community due both to its practical utility in facilitating decision making, as…”
Journal Article
2. Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust
Published 31-05-2023. “…Watermarking the outputs of generative models is a crucial technique for tracing copyright and preventing potential harm from AI-generated content. In this…”
Journal Article
3. GenQA: Generating Millions of Instructions from a Handful of Prompts
Published 14-06-2024. “…Most public instruction finetuning datasets are relatively small compared to the closed source datasets used to train industry models. To study questions about…”
Journal Article
4. LMD3: Language Model Data Density Dependence
Published 10-05-2024. “…We develop a methodology for analyzing language model task performance at the individual example level based on training data density estimation. Experiments…”
Journal Article
5. OPTune: Efficient Online Preference Tuning
Published 11-06-2024. “…Reinforcement learning with human feedback (RLHF) is critical for aligning Large Language Models (LLMs) with human preference. Compared to the widely studied…”
Journal Article
6. Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery
Published 07-02-2023. “…The strength of modern generative models lies in their ability to be controlled through text-based prompts. Typical "hard" prompts are made from interpretable…”
Journal Article
7. A Watermark for Large Language Models
Published 24-01-2023. “…Potential harms of large language models can be mitigated by watermarking model output, i.e., embedding signals into generated text that are invisible to…”
Journal Article
8. Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
Published 14-06-2024. “…Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle…”
Journal Article
9. Transformers Can Do Arithmetic with the Right Embeddings
Published 27-05-2024. “…The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit…”
Journal Article
10. Bring Your Own Data! Self-Supervised Evaluation for Large Language Models
Published 23-06-2023. “…With the rise of Large Language Models (LLMs) and their ubiquitous deployment in diverse domains, measuring language model behavior on realistic data is…”
Journal Article
11. On the Reliability of Watermarks for Large Language Models
Published 07-06-2023. “…As LLMs become commonplace, machine-generated text has the potential to flood the internet with spam, social media bots, and valueless content. Watermarking is…”
Journal Article
12. Baseline Defenses for Adversarial Attacks Against Aligned Language Models
Published 01-09-2023. “…As Large Language Models quickly become ubiquitous, it becomes critical to understand their security vulnerabilities. Recent work shows that text optimizers…”
Journal Article
13. NEFTune: Noisy Embeddings Improve Instruction Finetuning
Published 09-10-2023. “…We show that language model finetuning can be improved, sometimes dramatically, with a simple augmentation. NEFTune adds noise to the embedding vectors during…”
Journal Article