Search Results - "Pfau, Jacob"

1
Stress testing reveals gaps in clinic readiness of image-based diagnostic artificial intelligence models by Young, Albert T., Fernandez, Kristen, Pfau, Jacob, Reddy, Rasika, Cao, Nhat Anh, von Franque, Max Y., Johal, Arjun, Wu, Benjamin V., Wu, Rachel R., Chen, Jennifer Y., Fadadu, Raj P., Vasquez, Juan A., Tam, Andrew, Keiser, Michael J., Wei, Maria L.

Published in NPJ digital medicine (21-01-2021)
“…Artificial intelligence models match or exceed dermatologists in melanoma image classification. Less is known about their robustness against real-world…”

Journal Article

2
Artificial Intelligence in Dermatology: A Primer by Young, Albert T., Xiong, Mulin, Pfau, Jacob, Keiser, Michael J., Wei, Maria L.

Published in Journal of investigative dermatology (01-08-2020)
“…Artificial intelligence is becoming increasingly important in dermatology, with studies reporting accuracy matching or exceeding dermatologists for the…”

Journal Article

3
Artificial Intelligence in Teledermatology by Xiong, Mulin, Pfau, Jacob, Young, Albert T., Wei, Maria L.

Published in Current dermatology reports (15-09-2019)
“…Purpose of Review: This review summarizes current and prospective applications of artificial intelligence (AI) and smartphone technologies to automated…”

Journal Article

4
Let's Think Dot by Dot: Hidden Computation in Transformer Language Models by Pfau, Jacob, Merrill, William, Bowman, Samuel R

Published 24-04-2024
“…Chain-of-thought responses from language models improve performance across most benchmarks. However, it remains unclear to what extent these performance gains…”

Journal Article

5
Steering Without Side Effects: Improving Post-Deployment Control of Language Models by Stickland, Asa Cooper, Lyzhov, Alexander, Pfau, Jacob, Mahdi, Salsabila, Bowman, Samuel R

Published 20-06-2024
“…Language models (LMs) have been shown to behave unexpectedly post-deployment. For example, new jailbreaks continually arise, allowing model misuse, despite…”

Journal Article

6
Taking AI Welfare Seriously by Long, Robert, Sebo, Jeff, Butlin, Patrick, Finlinson, Kathleen, Fish, Kyle, Harding, Jacqueline, Pfau, Jacob, Sims, Toni, Birch, Jonathan, Chalmers, David

Published 04-11-2024
“…In this report, we argue that there is a realistic possibility that some AI systems will be conscious and/or robustly agentic in the near future. That means…”

Journal Article

7
Self-Consistency of Large Language Models under Ambiguity by Bartsch, Henning, Jorgensen, Ole, Rosati, Domenic, Hoelscher-Obermaier, Jason, Pfau, Jacob

Published 20-10-2023
“…Large language models (LLMs) that do not give consistent answers across contexts are problematic when used for tasks with expectations of consistency, e.g.,…”

Journal Article

8
Goal Misgeneralization in Deep Reinforcement Learning by Langosco, Lauro, Koch, Jack, Sharkey, Lee, Pfau, Jacob, Orseau, Laurent, Krueger, David

Published 28-05-2021
“…We study goal misgeneralization, a type of out-of-distribution generalization failure in reinforcement learning (RL). Goal misgeneralization failures occur…”

Journal Article

9
Robust Semantic Interpretability: Revisiting Concept Activation Vectors by Pfau, Jacob, Young, Albert T, Wei, Jerome, Wei, Maria L, Keiser, Michael J

Published 06-04-2021
“…Interpretability methods for image classification assess model trustworthiness by attempting to expose whether the model is systematically biased or attending…”

Journal Article

10
Global Saliency: Aggregating Saliency Maps to Assess Dataset Artefact Bias by Pfau, Jacob, Young, Albert T, Wei, Maria L, Keiser, Michael J

Published 16-10-2019
“…In high-stakes applications of machine learning models, interpretability methods provide guarantees that models are right for the right reasons. In medical…”

Journal Article

11
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback by Casper, Stephen, Davies, Xander, Shi, Claudia, Gilbert, Thomas Krendl, Scheurer, Jérémy, Rando, Javier, Freedman, Rachel, Korbak, Tomasz, Lindner, David, Freire, Pedro, Wang, Tony, Marks, Samuel, Segerie, Charbel-Raphaël, Carroll, Micah, Peng, Andi, Christoffersen, Phillip, Damani, Mehul, Slocum, Stewart, Anwar, Usman, Siththaranjan, Anand, Nadeau, Max, Michaud, Eric J, Pfau, Jacob, Krasheninnikov, Dmitrii, Chen, Xin, Langosco, Lauro, Hase, Peter, Bıyık, Erdem, Dragan, Anca, Krueger, David, Sadigh, Dorsa, Hadfield-Menell, Dylan

Published 27-07-2023
“…Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals. RLHF has emerged as the central method used…”

Journal Article
