Background: Delirium in hospitalized patients is a syndrome of acute brain dysfunction. Diagnostic (International Classification of Diseases [ICD]) codes are often used in studies using electronic health records (EHRs), but they are inaccurate.
Objective: We sought to develop a more accurate method using natural language processing (NLP) to detect delirium episodes on the basis of unstructured clinical notes.
Methods: We collected 1.5 million notes from >10,000 patients from among 9 hospitals. Seven experts iteratively labeled 200,471 sentences. Using these, we trained three NLP classifiers: Support Vector Machine, Recurrent Neural Networks, and Transformer. Testing was performed using an external data set. We also evaluated associations with delirium billing (ICD) codes, medications, orders for restraints and sitters, direct assessments (Confusion Assessment Method [CAM] scores), and in-hospital mortality. F1 scores, confusion matrices, and areas under the receiver operating characteristic curve (AUCs) were used to compare NLP models. We used the φ coefficient to measure associations with other delirium indicators.
Results: The transformer NLP performed best on the following parameters: micro F1=0.978, macro F1=0.918, positive AUC=0.984, and negative AUC=0.992. NLP detections exhibited higher correlations (φ) than ICD codes with deliriogenic medications (0.194 vs 0.073 for ICD codes), restraints and sitter orders (0.358 vs 0.177), mortality (0.216 vs 0.000), and CAM scores (0.256 vs -0.028).
Conclusions: Clinical notes are an attractive alternative to ICD codes for EHR delirium studies but require automated methods. Our NLP model detects delirium with high accuracy, similar to manual chart review. Our NLP approach can provide more accurate determination of delirium for large-scale EHR-based studies regarding delirium, quality improvement, and clinical trails.
Keywords: clinical notes; delirium; electronic health records; machine learning; natural language processing.
©Wendong Ge, Haitham Alabsi, Aayushee Jain, Elissa Ye, Haoqi Sun, Marta Fernandes, Colin Magdamo, Ryan A Tesh, Sarah I Collens, Amy Newhouse, Lidia MVR Moura, Sahar Zafar, John Hsu, Oluwaseun Akeju, Gregory K Robbins, Shibani S Mukerji, Sudeshna Das, M Brandon Westover. Originally published in JMIR Formative Research (https://formative.jmir.org), 24.06.2022.