Evaluation of the Appropriateness and Readability of ChatGPT-4 Responses to Patient Queries on Uveitis

Ophthalmol Sci. 2024 Aug 8;5(1):100594. doi: 10.1016/j.xops.2024.100594. eCollection 2025 Jan-Feb.

Abstract

Purpose: To compare the utility of ChatGPT-4 as an online uveitis patient education resource with existing patient education websites.

Design: Evaluation of technology.

Participants: Not applicable.

Methods: The term "uveitis" was entered into the Google search engine, and the first 8 nonsponsored websites were selected for the study. Patient information on uveitis was extracted from the Healthline, Mayo Clinic, WebMD, National Eye Institute, Ocular Uveitis and Immunology Foundation, American Academy of Ophthalmology, Cleveland Clinic, and National Health Service websites. ChatGPT-4 was then prompted to generate responses about uveitis in both standard and simplified formats. To generate the simplified response, the following request was added to the prompt: "Please provide a response suitable for the average American adult, at a sixth-grade comprehension level." Three dual fellowship-trained specialists, all masked to the sources, graded the appropriateness of the contents (extracted from the existing websites) and the responses (generated by ChatGPT-4) in terms of personal preference, comprehensiveness, and accuracy. Additionally, 5 readability indices (Flesch Reading Ease, Flesch-Kincaid Grade Level, Gunning Fog Index, Coleman-Liau Index, and Simple Measure of Gobbledygook index) were calculated using an online calculator, Readable.com, to assess the ease of comprehension of each answer.
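For readers unfamiliar with the readability indices, the following is a minimal sketch of how two of them, Flesch Reading Ease and Flesch-Kincaid Grade Level, are computed from sentence, word, and syllable counts using their standard published formulas. It is not the Readable.com implementation used in the study: the function names and the syllable heuristic (counting vowel groups) are assumptions made for illustration, so the scores will likely differ somewhat from those of a commercial calculator.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption, not Readable.com's method):
    count vowel groups and drop one for a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text):
    """Return (sentence count, word count, syllable count) for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text):
    """Higher scores indicate text that is easier to read."""
    s, w, syl = text_stats(text)
    return 206.835 - 1.015 * (w / s) - 84.6 * (syl / w)


def flesch_kincaid_grade(text):
    """Approximate US school grade level needed to understand the text."""
    s, w, syl = text_stats(text)
    return 0.39 * (w / s) + 11.8 * (syl / w) - 15.59


if __name__ == "__main__":
    sample = "Uveitis is swelling inside the eye. It can make the eye red and sore."
    print(round(flesch_reading_ease(sample), 1))
    print(round(flesch_kincaid_grade(sample), 1))
```

Both indices depend on the same two ratios, words per sentence and syllables per word, which is why shortening sentences and choosing shorter words (as a sixth-grade-level prompt encourages) moves both scores toward easier reading.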

Main outcome measures: Personal preference, accuracy, comprehensiveness, and readability of contents and responses about uveitis.

Results: A total of 497 contents and responses, including 71 contents from existing websites, 213 standard responses, and 213 simplified responses from ChatGPT-4, were recorded and graded. Standard ChatGPT-4 responses were preferred and perceived to be more comprehensive by dually trained (uveitis and retina) specialist ophthalmologists while maintaining a similar level of accuracy compared with existing websites. Moreover, simplified ChatGPT-4 responses matched almost all existing websites in terms of personal preference, accuracy, and comprehensiveness. Notably, almost all readability indices suggested that standard ChatGPT-4 responses demand a higher educational level for comprehension, whereas simplified responses required a lower level of education, compared with the existing websites.

Conclusions: This study shows that ChatGPT can provide patients with an avenue to access comprehensive and accurate information about uveitis, tailored to their educational level.

Financial disclosures: The author(s) have no proprietary or commercial interest in any materials discussed in this article.

Keywords: Artificial intelligence; ChatGPT; Patient education; Uveitis.