Background: This study aims to evaluate whether Chat Generative Pretrained Transformer (ChatGPT) can be recommended as an information resource for patients planning rotator cuff repair, and to assess how ChatGPT versions 3.5 and 4.0 differ in information content and readability.
Methods: In August 2023, 3 surgeons experienced in rotator cuff surgery posed 13 questions commonly asked by patients with rotator cuff disease to ChatGPT 3.5 and ChatGPT 4.0, using computers with different internet protocol (IP) addresses. The answers from both versions were converted to text and then examined for quality and readability.
Results: The average Journal of the American Medical Association score for both versions was 0, and the average DISCERN score was 61.6. A statistically significant and strong correlation was found between ChatGPT 3.5 and 4.0 DISCERN scores. There was excellent agreement in DISCERN scores for both versions among the 3 evaluators. ChatGPT 3.5 was found to be less readable than ChatGPT 4.0.
Conclusion: The information provided by the ChatGPT conversational system was rated as high quality, but it had significant shortcomings in reliability because it lacked citations. Although ChatGPT 4.0 had higher readability scores, both versions were considered difficult to read.
Keywords: Arthroscopy; Artificial intelligence; ChatGPT; OpenAI; Rotator cuff; Shoulder.
© 2024 The Author(s).