Contextualized medication information extraction using Transformer-based deep learning architectures

J Biomed Inform. 2023 Jun;142:104370. doi: 10.1016/j.jbi.2023.104370. Epub 2023 Apr 24.

Abstract

Objective: To develop a natural language processing (NLP) system that extracts medications and the contextual information needed to understand medication changes. This project is part of the 2022 n2c2 challenge.

Materials and methods: We developed NLP systems for three subtasks: medication mention extraction, event classification (determining whether a medication change is discussed), and context classification (categorizing each medication change along five orthogonal dimensions related to drug changes). We explored six state-of-the-art pretrained transformer models for the three subtasks, including GatorTron, a large language model pretrained on more than 90 billion words of text (including more than 80 billion words from more than 290 million clinical notes at University of Florida Health). We evaluated our NLP systems using the annotated data and evaluation scripts provided by the 2022 n2c2 organizers.
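To make the medication mention extraction subtask concrete, the sketch below frames it as token classification with a pretrained transformer using the Hugging Face transformers library. This is an illustration under stated assumptions, not the authors' exact pipeline: the checkpoint name and the BIO label set are placeholders, and the model would need to be fine-tuned on the 2022 n2c2 annotations before its predictions are meaningful.

    # Minimal sketch: medication mention extraction as token classification.
    # The checkpoint id and BIO label set below are assumptions for illustration.
    import torch
    from transformers import AutoTokenizer, AutoModelForTokenClassification

    MODEL_NAME = "UFNLP/gatortron-base"          # placeholder; any clinical transformer checkpoint works
    LABELS = ["O", "B-Medication", "I-Medication"]  # assumed BIO scheme for medication mentions

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME, num_labels=len(LABELS))

    text = "Lisinopril was increased to 20 mg daily for better blood pressure control."
    inputs = tokenizer(text, return_tensors="pt", truncation=True)

    with torch.no_grad():
        logits = model(**inputs).logits          # shape: (1, sequence_length, num_labels)

    # Assign the highest-scoring label to each word piece; fine-tuning on the
    # n2c2 training data is required before these labels are useful.
    predictions = logits.argmax(dim=-1)[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    for token, label_id in zip(tokens, predictions):
        print(token, LABELS[label_id])

The two classification subtasks (event and context) can be handled analogously with a sequence classification head over the sentence or note segment containing each extracted medication mention.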

Results: Our GatorTron models achieved F1-scores of 0.9828 for medication extraction (ranked 3rd) and 0.9379 for event classification (ranked 2nd), and the best micro-average accuracy of 0.9126 for context classification. GatorTron outperformed existing transformer models pretrained on smaller general-English and clinical text corpora, indicating the advantage of large language models.

Conclusion: This study demonstrated the advantage of large pretrained transformer models for extracting contextualized medication information from clinical narratives.

Keywords: Clinical natural language processing; Deep learning; Medication information extraction; Named entity recognition; Text classification.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.
  • Research Support, N.I.H., Extramural

MeSH terms

  • Deep Learning*
  • Information Storage and Retrieval
  • Natural Language Processing