Individualized decision making in on-scene resuscitation time for out-of-hospital cardiac arrest using reinforcement learning

Dong Hyun Choi; Min Hyuk Lim; Ki Jeong Hong; Young Gyun Kim; Jeong Ho Park; Kyoung Jun Song; Sang Do Shin; Sungwan Kim

doi:10.1038/s41746-024-01278-3

NPJ Digit Med. 2024 Oct 9;7(1):276. doi: 10.1038/s41746-024-01278-3.

Authors

Dong Hyun Choi^#¹, Min Hyuk Lim^#²,³, Ki Jeong Hong⁴,⁵, Young Gyun Kim⁶, Jeong Ho Park⁷,⁸, Kyoung Jun Song⁸,⁹, Sang Do Shin⁷,⁸, Sungwan Kim¹⁰

Affiliations

¹ Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, South Korea.
² Graduate School of Health Science and Technology, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea.
³ Department of Biomedical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea.
⁴ Department of Emergency Medicine, Seoul National University College of Medicine and Hospital, Seoul, South Korea. emkjhong@gmail.com.
⁵ Laboratory of Emergency Medical Services, Seoul National University Hospital Biomedical Research Institute, Seoul, South Korea. emkjhong@gmail.com.
⁶ Interdisciplinary Program in Bioengineering, Graduate School, Seoul National University, Seoul, South Korea.
⁷ Department of Emergency Medicine, Seoul National University College of Medicine and Hospital, Seoul, South Korea.
⁸ Laboratory of Emergency Medical Services, Seoul National University Hospital Biomedical Research Institute, Seoul, South Korea.
⁹ Department of Emergency Medicine, Seoul National University Boramae Medical Center, Seoul, South Korea.
¹⁰ Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, South Korea. sungwan@snu.ac.kr.

^# Contributed equally.

Abstract

On-scene resuscitation time is associated with out-of-hospital cardiac arrest (OHCA) outcomes. We developed and validated reinforcement learning models for individualized on-scene resuscitation times, leveraging nationwide Korean data. Adult OHCA patients with a medical cause of arrest were included (N = 73,905). The optimal policy was derived from conservative Q-learning to maximize survival. The on-scene return of spontaneous circulation hazard rates estimated from the Random Survival Forest were used as intermediate rewards to handle sparse rewards, while patients' historical survival was reflected in the terminal rewards. The optimal policy increased the survival to hospital discharge rate from 9.6% to 12.5% (95% CI: 12.2-12.8) and the good neurological recovery rate from 5.4% to 7.5% (95% CI: 7.3-7.7). The recommended maximum on-scene resuscitation times for patients demonstrated a bimodal distribution, varying with patient, emergency medical services, and OHCA characteristics. Our survival analysis-based approach generates explainable rewards, reducing subjectivity in reinforcement learning.
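
The reward design described in the abstract can be illustrated with a small tabular sketch. This is not the authors' implementation: the six-interval time grid, the hazard values, the toy outcome model, the random behaviour policy, and every function name below are assumptions made for illustration only. It shows the general idea of a conservative Q-learning update (a squared temporal-difference error plus a penalty that lowers the value of actions not seen in the logged data), with per-interval ROSC hazard rates as intermediate rewards and observed survival as the terminal reward.

```python
import math
import random

N_STEPS = 6                  # assumption: on-scene time split into 6 intervals
CONTINUE, TRANSPORT = 0, 1   # assumption: two actions per interval

# Hypothetical ROSC hazard per interval, standing in for the output of a
# survival model such as a Random Survival Forest. Not real estimates.
HAZARD = [0.05, 0.08, 0.06, 0.03, 0.01, 0.005]


def make_episode(rng):
    """Simulate one logged episode as (state, action, reward, next_state, done)."""
    stop_at = rng.randrange(N_STEPS)  # toy behaviour policy: random stop time
    # Toy outcome model (assumption): survival is most likely near interval 2.
    survived = rng.random() < 0.2 * math.exp(-0.3 * abs(stop_at - 2))
    episode = []
    for t in range(stop_at):
        # Intermediate reward: hazard of on-scene ROSC in this interval.
        episode.append((t, CONTINUE, HAZARD[t], t + 1, False))
    # Terminal reward: 1 if the logged patient survived, else 0.
    episode.append((stop_at, TRANSPORT, 1.0 if survived else 0.0, stop_at, True))
    return episode


def cql_update(q, transition, alpha=0.5, gamma=0.99, lr=0.1):
    """One gradient step on 0.5 * TD^2 + alpha * (logsumexp Q(s, .) - Q(s, a))."""
    s, a, r, s_next, done = transition
    target = r if done else r + gamma * max(q[s_next])
    td_error = q[s][a] - target
    # The gradient of logsumexp over actions is the softmax of the Q-values.
    m = max(q[s])
    exps = [math.exp(v - m) for v in q[s]]
    z = sum(exps)
    for b in range(len(q[s])):
        grad = alpha * exps[b] / z        # penalty pushes every action down
        if b == a:
            grad += td_error - alpha      # logged action: TD fit, penalty relief
        q[s][b] -= lr * grad


def train(n_episodes=2000, epochs=20, seed=0):
    """Fit a tabular Q-function to simulated logged episodes."""
    rng = random.Random(seed)
    data = [tr for _ in range(n_episodes) for tr in make_episode(rng)]
    q = [[0.0, 0.0] for _ in range(N_STEPS)]
    for _ in range(epochs):
        rng.shuffle(data)
        for tr in data:
            cql_update(q, tr)
    return q


def recommended_max_time(q):
    """First interval at which the learned policy prefers transport."""
    for t in range(N_STEPS):
        if q[t][TRANSPORT] >= q[t][CONTINUE]:
            return t
    return N_STEPS - 1


if __name__ == "__main__":
    q_table = train()
    print("recommended maximum on-scene interval:", recommended_max_time(q_table))
```

In this toy data set, continuing past the last interval is never observed, so the conservative penalty drives its value below that of transport. That mirrors the motivation for conservative methods in offline settings, where actions unsupported by historical records should not be overvalued.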
