Background and objective: Despite recent performance advancements, deep learning models have not yet been widely adopted in clinical practice. The intrinsic opacity of such systems is commonly cited as a major reason for this reluctance, which has motivated methods that aim to explain model functioning. Known limitations of feature-based explanations have led to increased interest in concept-based interpretability. Testing with Concept Activation Vectors (TCAV) employs human-understandable, abstract concepts to explain model behavior. The method has previously been applied in the medical domain to electronic health records, retinal fundus images, and magnetic resonance imaging.
Methods: We explore the use of TCAV for building interpretable models on physiological time series, using abnormality detection in electroencephalography (EEG) as an example. For this purpose, we adopt the XceptionTime model, which is suited to multi-channel physiological data of variable length, provides state-of-the-art performance on raw EEG data, and is publicly available. We propose and test several strategies for concept definition: mining metadata, using additional labeled EEG data, and extracting interpretable signal characteristics in the form of frequencies. By including our own, analogously labeled hospital data, we further evaluate the robustness of our approach.
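The TCAV scoring procedure referenced above can be illustrated with a minimal sketch. All data here are synthetic stand-ins: in a real application, the activations would come from a hidden layer of the trained XceptionTime model, and the gradients from backpropagating the class logit to that layer. The dimensionality, sample counts, and concept direction below are illustrative assumptions only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
D = 32  # hypothetical dimensionality of a hidden layer

# Synthetic layer activations: concept examples (e.g. EEG epochs
# containing epileptiform discharges) vs. random counterexamples.
concept_direction = rng.normal(size=D)
concept_direction /= np.linalg.norm(concept_direction)
concept_acts = rng.normal(size=(100, D)) + 2.0 * concept_direction
random_acts = rng.normal(size=(100, D))

# Step 1: the Concept Activation Vector (CAV) is the normal of a
# linear classifier separating concept from random activations.
X = np.vstack([concept_acts, random_acts])
y = np.array([1] * 100 + [0] * 100)
clf = LogisticRegression().fit(X, y)
cav = clf.coef_[0] / np.linalg.norm(clf.coef_[0])

# Step 2: conceptual sensitivity of a class for one input is the
# directional derivative of the class logit along the CAV. Synthetic
# per-example gradients stand in for real model gradients here.
grads = rng.normal(size=(200, D)) + 0.5 * concept_direction
sensitivities = grads @ cav

# Step 3: the TCAV score is the fraction of class examples whose
# conceptual sensitivity is positive.
tcav_score = float(np.mean(sensitivities > 0))
print(f"TCAV score: {tcav_score:.2f}")
```

A score well above 0.5 (as produced by this construction, where the gradients are biased along the concept direction) indicates that the concept positively influences the class prediction; statistical testing against CAVs trained on random splits is used in practice to reject spurious scores.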
Results: The tested concepts show a TCAV score distribution in line with clinical expectations, i.e. concepts known to have strong links with EEG pathologies (such as epileptiform discharges) received higher scores than neutral concepts (e.g. sex). The scores were consistent across the applied concept generation strategies.
Conclusions: TCAV has the potential to improve the interpretability of deep learning applied to multi-channel signals and to detect possible biases in the data. Still, further work on concept-definition strategies and on validation with clinical physiological time series is needed to better understand how to extract clinically relevant information from concept sensitivity scores.
Keywords: Artificial neural networks; Decision support systems; Electroencephalography; Explainable artificial intelligence; Supervised learning; TCAV.
Copyright © 2024 The Author(s). Published by Elsevier B.V. All rights reserved.