A hybrid model to identify fall occurrence from electronic health records

Sunyang Fu; Bjoerg Thorsteinsdottir; Xin Zhang; Guilherme S Lopes; Sandeep R Pagali; Nathan K LeBrasseur; Andrew Wen; Hongfang Liu; Walter A Rocca; Janet E Olson; Jennifer St Sauver; Sunghwan Sohn

doi:10.1016/j.ijmedinf.2022.104736

Int J Med Inform. 2022 Mar 7:162:104736. doi: 10.1016/j.ijmedinf.2022.104736. Online ahead of print.

Affiliations

¹ Department of AI and Informatics, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA; University of Minnesota, Minneapolis, MN 55455, USA.
² Department of Medicine, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA.
³ Department of Quantitative Health Sciences, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA.
⁴ Department of Physical Medicine & Rehabilitation, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA; Department of Physiology & Biomedical Engineering, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA.
⁵ Department of AI and Informatics, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA.
⁶ Department of Quantitative Health Sciences, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA; Department of Neurology, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA; Women's Health Research Center, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA.
⁷ Department of AI and Informatics, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA. Electronic address: Sohn.Sunghwan@mayo.edu.

Abstract

Introduction: Falls are a leading cause of unintentional injury in the elderly. Electronic health records (EHRs) offer a unique opportunity to develop models that can identify fall events. However, identifying fall events in clinical notes requires advanced natural language processing (NLP) that can address multiple issues simultaneously, because the word "fall" is a typical homonym.

Methods: We implemented a context-aware language model, Bidirectional Encoder Representations from Transformers (BERT), to identify falls from EHR text, and then fused the BERT model into a hybrid architecture coupled with post-hoc heuristic rules to enhance performance. The models were evaluated on real-world EHR data and compared with conventional rule-based and deep learning models (CNN and Bi-LSTM). To better understand the ability of each approach to identify falls, we further categorized fall-related concepts (i.e., risk of fall, prevention of fall, homonym) and performed a detailed error analysis.
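The hybrid idea described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `hybrid_predict`, the patterns in `NON_EVENT_PATTERNS`, the 0.5 threshold, and the stand-in scorer `toy_prob` are all hypothetical, and the rules only loosely mirror the categories named above (risk of fall, prevention of fall, homonym). Any sentence-level classifier that returns a probability, such as a fine-tuned BERT model, could take the place of the scorer.

```python
import re
from typing import Callable

# Hypothetical post-hoc rules (assumptions, not the rules used in the study).
# Each pattern marks a context in which "fall" is usually not a fall event.
NON_EVENT_PATTERNS = [
    # Risk of fall
    re.compile(r"\b(risk of|at risk for) fall(s|ing)?\b|\bfall risk\b", re.I),
    # Prevention of fall
    re.compile(r"\bfall (precautions?|prevention)\b", re.I),
    # Homonym usages
    re.compile(r"\b(fall(ing)? asleep|fell asleep|fall of \d{4})\b", re.I),
]


def hybrid_predict(
    sentence: str,
    model_prob: Callable[[str], float],
    threshold: float = 0.5,
) -> bool:
    """Return True if the sentence is judged to describe a fall event.

    The model makes the first decision; the post-hoc rules can only
    overturn a positive prediction, never create one.
    """
    predicted = model_prob(sentence) >= threshold
    if predicted and any(p.search(sentence) for p in NON_EVENT_PATTERNS):
        return False
    return predicted


def toy_prob(sentence: str) -> float:
    """Stand-in scorer for illustration only; a real system would call a trained model."""
    return 0.9 if re.search(r"\bf(a|e)ll", sentence, re.I) else 0.1
```

In this sketch the rules act purely as a filter on positive model output, which is one simple way to apply a custom fix after classification; other designs, such as rules that also recover missed events, are equally plausible readings of "post-hoc heuristic rules".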

Results: The hybrid model achieved the highest F1-score at the sentence (0.971), document (0.985), and patient (0.954) levels. At the sentence level (the basic data unit in the model), the hybrid model achieved a sensitivity of 0.954, specificity of 1.000, positive predictive value of 0.988, and negative predictive value of 0.999. The error analysis showed that machine learning-based approaches outperformed the rule-based approach in challenging cases that required contextual understanding. The context-aware language model (BERT) slightly outperformed the word embedding approach trained with a Bi-LSTM. No single model yielded the best performance for all fall-related semantic categories.
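For readers less familiar with these measures, the short Python sketch below shows the standard definitions in terms of confusion-matrix counts. The function names are illustrative, and the study's actual counts are not given in the abstract, so none are assumed here. The last function computes F1 directly from positive predictive value and sensitivity, which allows a consistency check against the reported sentence-level figures.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall: share of true fall sentences found
    specificity = tn / (tn + fp)   # share of non-fall sentences correctly rejected
    ppv = tp / (tp + fp)           # precision: share of predicted falls that are real
    npv = tn / (tn + fn)           # share of predicted non-falls that are real
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1_from(ppv, sensitivity),
    }


def f1_from(ppv: float, sensitivity: float) -> float:
    """F1-score as the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Consistency check using the sentence-level values reported in the abstract.
sentence_f1 = f1_from(ppv=0.988, sensitivity=0.954)
print(round(sentence_f1, 3))  # 0.971
```

The harmonic mean of the reported positive predictive value (0.988) and sensitivity (0.954) rounds to 0.971, which agrees with the sentence-level F1-score stated in the Results.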

Conclusion: A context-aware language model (BERT) was able to identify challenging fall events that require contextual understanding in EHR free text. The hybrid model, which combines BERT with post-hoc rules, allowed custom fixes to the BERT outcomes and further improved fall detection performance.

Keywords: BERT; EHR; Fall; NLP.
