Machine learning algorithms to classify self-harm behaviours in New South Wales Ambulance electronic medical records: A retrospective study

Alexander Burnett; Nicola Chen; Stephanie Zeritis; Sandra Ware; Lauren McGillivray; Fiona Shand; Michelle Torok

doi:10.1016/j.ijmedinf.2022.104734

Int J Med Inform. 2022 May:161:104734. doi: 10.1016/j.ijmedinf.2022.104734. Epub 2022 Mar 8.

Authors

Alexander Burnett¹, Nicola Chen², Stephanie Zeritis³, Sandra Ware⁴, Lauren McGillivray⁵, Fiona Shand⁵, Michelle Torok⁵

Affiliations

¹ Black Dog Institute, Australia. Electronic address: alexander.burnett@blackdog.org.au.
² Orygen, Australia; University of Melbourne, Australia.
³ Black Dog Institute, Australia.
⁴ NSW Ambulance, Australia.
⁵ Black Dog Institute, Australia; University of New South Wales, Australia.

PMID: 35287099
DOI: 10.1016/j.ijmedinf.2022.104734

Abstract

Background: There is increasing interest in suicide surveillance solutions to identify non-fatal suicidal and self-harming behaviours in the Australian community not currently captured through national administrative datasets.

Objective: The aim of the present study was to develop machine learning models to classify self-harm related behaviours using unstructured clinical note text from New South Wales (NSW) Ambulance data and compare their performance via traditional methods.

Methods: Primary data were derived from NSW Ambulance electronic medical records (eMRs) for potential self-harm related NSW Ambulance attendances for the period 2013-2019. Data included paramedic clinical notes detailing the nature of the attendance, clinical outcome, and narrative information. We assessed sensitivity, specificity, positive predictive value, negative predictive value, F-score, and the Matthews correlation coefficient (MCC) for four algorithms (Support Vector Machine, random forest, decision tree, and logistic regression).
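The six evaluation measures named in the Methods can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' code: the function name and the example counts are hypothetical, and the abstract's "F-score" is assumed here to be the F1 score (the harmonic mean of sensitivity and positive predictive value).

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts.

    Assumes both classes are present in the data and at least one
    prediction falls in each class, so no ratio has a zero denominator.
    """
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    # F1 is an assumption: the abstract only says "F-score".
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    # Matthews correlation coefficient, ranging from -1 to +1.
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only; not taken from the study.
example = classification_metrics(tp=80, fp=10, tn=90, fn=20)
```

Unlike sensitivity or specificity alone, the MCC uses all four cells of the confusion matrix, which makes it a reasonable single measure for comparing models when the two classes differ in size.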

Results: The performance of these algorithms was compared using the MCC measure. In a test sample of 3157 ambulance attendances (1349 self-harm related behaviours and 1808 unrelated), the MCC for classification of self-harm related behaviour ranged from +0.681 to +0.730. The Support Vector Machine (sensitivity = 82.7%, specificity = 89.6%, MCC = 0.730) and the logistic regression (sensitivity = 83.1%, specificity = 89.3%, MCC = 0.727) models performed best.
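As a rough consistency check, the reported MCC values can be approximately recovered from the reported sensitivity, specificity, and class sizes. This sketch is an illustration only: the confusion-matrix counts are reconstructed from rounded percentages rather than taken from the paper, and the function name is hypothetical, so small differences from the published MCC are expected.

```python
from math import sqrt

# Class sizes in the test sample, as reported in the abstract.
N_SELF_HARM = 1349
N_UNRELATED = 1808


def approx_mcc(sensitivity, specificity,
               n_pos=N_SELF_HARM, n_neg=N_UNRELATED):
    """Approximate the MCC from rounded sensitivity and specificity.

    Counts are reconstructed by rounding, so they are estimates of the
    true confusion matrix, not the study's actual figures.
    """
    tp = round(sensitivity * n_pos)   # estimated true positives
    fn = n_pos - tp                   # estimated false negatives
    tn = round(specificity * n_neg)   # estimated true negatives
    fp = n_neg - tn                   # estimated false positives
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom


svm_mcc = approx_mcc(0.827, 0.896)      # Support Vector Machine
logreg_mcc = approx_mcc(0.831, 0.893)   # logistic regression
```

Under these assumptions, both reconstructed values fall close to the reported 0.730 and 0.727, with the gap attributable to rounding in the published percentages.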

Conclusions: This study demonstrates that machine learning models can be applied to paramedic notes within unstructured medical records to classify self-harm related behaviours. The resulting model could be used to complement current manual abstraction of self-harm behaviours and provide more timely approximations for use in self-harm surveillance.

Keywords: Epidemiology; Machine learning; Natural language processing; Population surveillance; Suicidal behaviour.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Ambulances
Australia
Electronic Health Records*
Humans
Machine Learning
New South Wales / epidemiology
Retrospective Studies
Self-Injurious Behavior* / diagnosis
Self-Injurious Behavior* / epidemiology