The performance of ChatGPT in day surgery and pre-anesthesia risk assessment: a case-control study of 150 simulated patient presentations

Perioper Med (Lond). 2024 Nov 21;13(1):111. doi: 10.1186/s13741-024-00469-6.

Abstract

Background: Day surgery has developed rapidly in China in recent years, but there is still a shortage of anesthesiologists to carry out routine pre-anesthesia assessment. We hypothesized that ChatGPT may assist anesthesia practitioners in preoperative assessment and in answering patients' questions and concerns. The aims of this study were to examine the ability of ChatGPT to assess preoperative risk and to determine its accuracy in answering questions regarding knowledge and management of day surgery anesthesia.

Methods: One hundred fifty patient profiles were generated to simulate day surgery patient presentations that involved complications of varying acuity and severity. Both the ChatGPT group and the expert group evaluated all 150 simulated patient profiles to determine each patient's ASA-PS classification and whether day surgery was recommended. ChatGPT was then asked to answer 131 questions about day surgery anesthesia that represented the most common issues encountered in clinical practice. The performance of ChatGPT was assessed and graded independently by two experienced anesthesiologists.

Results: A total of 150 patient profiles were included in the study (75 males [50.0%] and 75 females [50.0%]). There was no significant difference between the ChatGPT group and the expert group in ASA-PS classification or assessment of anesthesia risk in the patient profiles (P > 0.05). Regarding the recommendation for day surgery in patients with certain comorbidities (ASA ≥ II), the expert group was more inclined to require further examination or treatment. The distribution of conclusions also differed between ChatGPT and the experts (ChatGPT n (%) vs. expert n (%): day surgery can be performed, 67 (47.9) vs. 31 (25.4); needs further treatment and evaluation, 56 (37.3) vs. 66 (44.0); and day surgery is not recommended, 18 (12.9) vs. 29 (9.3); P < 0.05). ChatGPT showed extensive knowledge related to day surgery anesthesia (94.0% correct), with most of the points (70%) considered comprehensive. The performance of ChatGPT was also better in the domains of peri-anesthesia concerns, lifestyle, and emotional support.
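
As a rough illustration of the three-category comparison reported above, the sketch below computes a Pearson chi-square statistic from the counts exactly as printed in this abstract. The abstract does not name the statistical test the authors used, so the choice of test, the helper name chi_square_statistic, and the critical value of 5.991 (2 degrees of freedom, alpha = 0.05) are assumptions made for illustration only. The printed counts do not sum to 150 per group, so the result should not be read as a reproduction of the published analysis.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and recommendation were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


# Counts as printed in the abstract (assumption: used here unmodified).
# Columns: can be performed, needs further evaluation, not recommended.
counts = [
    [67, 56, 18],  # ChatGPT
    [31, 66, 29],  # experts
]

# Chi-square critical value for df = (2 - 1) * (3 - 1) = 2 at alpha = 0.05.
CRITICAL_VALUE_DF2 = 5.991

statistic = chi_square_statistic(counts)
differs = statistic > CRITICAL_VALUE_DF2
print(f"chi-square = {statistic:.2f}, exceeds critical value: {differs}")
```

A statistic above the critical value is consistent with the reported P < 0.05 for this comparison, under the assumptions stated above.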

Conclusions: ChatGPT can assist anesthesia practitioners and surgeons by alerting them to the ASA-PS classification and assessing perioperative risk in day surgery patients. ChatGPT can also be trusted to answer questions and concerns related to pre-anesthesia and therefore has the potential to provide important assistance in clinical work.

Keywords: ASA-PS classification; ChatGPT; Day surgery; Day-case surgery; Peri-anesthesia concerns; Pre-anesthesia; Risk assessment.