A multimodal generative AI copilot for human pathology.

Overview

abstract

Computational pathology¹,² has witnessed considerable progress in the development of both task-specific predictive models and task-agnostic self-supervised vision encoders³,⁴. However, despite the explosive growth of generative artificial intelligence (AI), there have been few studies on building general-purpose multimodal AI assistants and copilots⁵ tailored to pathology. Here we present PathChat, a vision-language generalist AI assistant for human pathology. We built PathChat by adapting a foundational vision encoder for pathology, combining it with a pretrained large language model and fine-tuning the whole system on over 456,000 diverse visual-language instructions consisting of 999,202 question and answer turns. We compare PathChat with several multimodal vision-language AI assistants and GPT-4V, which powers the commercially available multimodal general-purpose AI assistant ChatGPT-4 (ref. ⁶). PathChat achieved state-of-the-art performance on multiple-choice diagnostic questions from cases with diverse tissue origins and disease models. Furthermore, using open-ended questions and human expert evaluation, we found that overall PathChat produced more accurate and pathologist-preferable responses to diverse queries related to pathology. As an interactive vision-language AI copilot that can flexibly handle both visual and natural language inputs, PathChat may potentially find impactful applications in pathology education, research and human-in-the-loop clinical decision-making.

authors

Ikemura, Kenji
Kim, Ahrong
Pouli, Dimitra
Patel, Ankush
Soliman, Amr
Chen, Chengkuan
Ding, Tong
Wang, Judy J
Gerber, Georg
Liang, Ivy
Le, Long Phi
Parwani, Anil V
Weishaupt, Luca L
Mahmood, Faisal

publication date

June 12, 2024

published in

Nature Journal

Research

keywords

Artificial Intelligence
Clinical Decision-Making
Diagnostic Imaging
Pathology

Identity

PubMed Central ID

PMC11464372

Scopus Document Identifier

85199394091

Digital Object Identifier (DOI)

10.1038/s41586-024-07618-3

PubMed ID

38866050

Additional Document Info

has global citation frequency

301

volume

634

issue

8033

VIVO Weill Cornell Medical College

A multimodal generative AI copilot for human pathology. Academic Article
