Performance and risks of ChatGPT used in drug information: an exploratory real-world analysis

Eur J Hosp Pharm. 2024 Oct 25;31(6):491-497. doi: 10.1136/ejhpharm-2023-003750.

Abstract

Objectives: To investigate the performance of, and risks associated with, using Chat Generative Pre-trained Transformer (ChatGPT) to answer drug-related questions.

Methods: A sample of 50 drug-related questions was collected consecutively and entered into the artificial intelligence software application ChatGPT. Answers were documented and rated in a standardised consensus process by six senior hospital pharmacists across the domains of content (correct, incomplete, false), patient management (possible, insufficient, not possible) and risk (no risk, low risk, high risk). As a reference, answers were researched in adherence to the German guideline on drug information and stratified into four categories according to the sources used. In addition, the reproducibility of ChatGPT's answers was analysed by entering three questions repeatedly at different timepoints (day 1, day 2, week 2, week 3).
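In the study, this reproducibility check was performed by re-entering the questions manually into ChatGPT. Purely as an illustration of that kind of check, the minimal sketch below automates repeated queries and counts literally identical answers; it assumes the openai Python client (v1.x), an OPENAI_API_KEY in the environment and a hypothetical example question, none of which come from the study itself.

# Illustrative sketch only: the study entered questions manually into the
# ChatGPT interface; this script merely repeats a query and checks whether
# the returned answer text is identical across repetitions.
from openai import OpenAI  # assumes the openai Python SDK v1.x (not used in the study)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical example question, not taken from the study sample.
QUESTION = "Can ibuprofen and apixaban be taken together?"

def ask(question: str) -> str:
    """Submit one drug-related question and return the model's answer text."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model; the study used the public ChatGPT application
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content.strip()

# The study re-entered the same questions on day 1, day 2, week 2 and week 3;
# here the four timepoints are simply four repeated calls.
answers = [ask(QUESTION) for _ in range(4)]

# Crude reproducibility check: count how many distinct answer texts were returned.
print(f"{len(set(answers))} distinct answers out of {len(answers)} repeated entries")

Note that literal string comparison is a deliberately strict criterion; the study's manual rating also allowed answers to be judged as content-equivalent despite different wording.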

Results: Overall, only 13 of 50 answers provided correct content, contained enough information to initiate patient management and carried no risk of patient harm. The majority of answers were either false (38%, n=19) or only partly correct (36%, n=18), and no references were provided. A high risk of patient harm was judged likely in 26% (n=13) of cases and the risk was judged low in 28% (n=14) of cases. In all high-risk cases, actions could have been initiated based on the information provided. ChatGPT's answers varied over time when the same questions were entered repeatedly, and only three of 12 answers were identical, indicating no to low reproducibility.

Conclusion: In a real-world sample of 50 drug-related questions, ChatGPT answered the majority of questions incorrectly or only partly correctly. The use of artificial intelligence applications in drug information is not feasible as long as barriers such as incorrect content, missing references and poor reproducibility remain.

Keywords: Evidence-Based Medicine; Health Services Administration; Journalism, Medical; Medical Informatics; Pharmacy Service, Hospital.

MeSH terms

  • Artificial Intelligence
  • Drug Information Services / standards
  • Drug-Related Side Effects and Adverse Reactions / epidemiology
  • Drug-Related Side Effects and Adverse Reactions / prevention & control
  • Humans
  • Pharmacists / standards
  • Pharmacy Service, Hospital* / methods
  • Pharmacy Service, Hospital* / standards
  • Reproducibility of Results
  • Risk Assessment / methods
  • Software