A multimodal generative AI copilot for human pathology

Bibliographic Details
Published in: Nature (London) Vol. 634; no. 8033; pp. 466-473
Main Authors: Lu, Ming Y; Chen, Bowen; Williamson, Drew F K; Chen, Richard J; Zhao, Melissa; Chow, Aaron K; Ikemura, Kenji; Kim, Ahrong; Pouli, Dimitra; Patel, Ankush; Soliman, Amr; Chen, Chengkuan; Ding, Tong; Wang, Judy J; Gerber, Georg; Liang, Ivy; Le, Long Phi; Parwani, Anil V; Weishaupt, Luca L; Mahmood, Faisal
Format: Journal Article
Language: English
Published: London: Nature Publishing Group, 10-10-2024
Description
Summary: Computational pathology (refs. 1,2) has witnessed considerable progress in the development of both task-specific predictive models and task-agnostic self-supervised vision encoders (refs. 3,4). However, despite the explosive growth of generative artificial intelligence (AI), there have been few studies on building general-purpose multimodal AI assistants and copilots (ref. 5) tailored to pathology. Here we present PathChat, a vision-language generalist AI assistant for human pathology. We built PathChat by adapting a foundational vision encoder for pathology, combining it with a pretrained large language model and fine-tuning the whole system on over 456,000 diverse visual-language instructions consisting of 999,202 question and answer turns. We compare PathChat with several multimodal vision-language AI assistants and GPT-4V, which powers the commercially available multimodal general-purpose AI assistant ChatGPT-4 (ref. 6). PathChat achieved state-of-the-art performance on multiple-choice diagnostic questions from cases with diverse tissue origins and disease models. Furthermore, using open-ended questions and human expert evaluation, we found that overall PathChat produced more accurate and pathologist-preferable responses to diverse queries related to pathology. As an interactive vision-language AI copilot that can flexibly handle both visual and natural language inputs, PathChat may potentially find impactful applications in pathology education, research and human-in-the-loop clinical decision-making.
ISSN: 0028-0836
1476-4687
DOI: 10.1038/s41586-024-07618-3