A multimodal generative AI copilot for human pathology

Ming Y Lu; Bowen Chen; Drew F K Williamson; Richard J Chen; Melissa Zhao; Aaron K Chow; Kenji Ikemura; Ahrong Kim; Dimitra Pouli; Ankush Patel; Amr Soliman; Chengkuan Chen; Tong Ding; Judy J Wang; Georg Gerber; Ivy Liang; Long Phi Le; Anil V Parwani; Luca L Weishaupt; Faisal Mahmood

doi:10.1038/s41586-024-07618-3

Nature. 2024 Oct;634(8033):466-473. doi: 10.1038/s41586-024-07618-3. Epub 2024 Jun 12.

Authors

Ming Y Lu^#^{1,2,3,4}, Bowen Chen^#^{1,2}, Drew F K Williamson^#^{1,2,3}, Richard J Chen^{1,2,3}, Melissa Zhao^{1,2}, Aaron K Chow⁵, Kenji Ikemura^{1,2}, Ahrong Kim^{1,6}, Dimitra Pouli^{1,2}, Ankush Patel⁷, Amr Soliman⁵, Chengkuan Chen¹, Tong Ding^{1,8}, Judy J Wang¹, Georg Gerber¹, Ivy Liang^{1,8}, Long Phi Le², Anil V Parwani⁵, Luca L Weishaupt^{1,9}, Faisal Mahmood^{10,11,12,13}

Affiliations

¹ Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
² Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
³ Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
⁴ Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA.
⁵ Department of Pathology, Wexner Medical Center, Ohio State University, Columbus, OH, USA.
⁶ Department of Pathology, Pusan National University, Busan, South Korea.
⁷ Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA.
⁸ Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA.
⁹ Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA.
¹⁰ Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA. faisalmahmood@bwh.harvard.edu.
¹¹ Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA. faisalmahmood@bwh.harvard.edu.
¹² Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA. faisalmahmood@bwh.harvard.edu.
¹³ Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA. faisalmahmood@bwh.harvard.edu.

^# Contributed equally.

Abstract

Computational pathology^{1,2} has witnessed considerable progress in the development of both task-specific predictive models and task-agnostic self-supervised vision encoders^{3,4}. However, despite the explosive growth of generative artificial intelligence (AI), there have been few studies on building general-purpose multimodal AI assistants and copilots⁵ tailored to pathology. Here we present PathChat, a vision-language generalist AI assistant for human pathology. We built PathChat by adapting a foundational vision encoder for pathology, combining it with a pretrained large language model and fine-tuning the whole system on over 456,000 diverse visual-language instructions consisting of 999,202 question and answer turns. We compare PathChat with several multimodal vision-language AI assistants and GPT-4V, which powers the commercially available multimodal general-purpose AI assistant ChatGPT-4 (ref. ⁶). PathChat achieved state-of-the-art performance on multiple-choice diagnostic questions from cases with diverse tissue origins and disease models. Furthermore, using open-ended questions and human expert evaluation, we found that overall PathChat produced more accurate and pathologist-preferable responses to diverse queries related to pathology. As an interactive vision-language AI copilot that can flexibly handle both visual and natural language inputs, PathChat may find impactful applications in pathology education, research and human-in-the-loop clinical decision-making.

MeSH terms

Artificial Intelligence*
Clinical Decision-Making* / methods
Diagnostic Imaging* / methods
Diagnostic Imaging* / trends
Female
Humans
Male
Natural Language Processing
Pathology* / education
Pathology* / methods
Pathology* / trends