Assessing the transportability of clinical prediction models for cognitive impairment using causal models

Jana Fehr; Marco Piccininni; Tobias Kurth; Stefan Konigorski

doi:10.1186/s12874-023-02003-6

BMC Med Res Methodol. 2023 Aug 19;23(1):187. doi: 10.1186/s12874-023-02003-6.

Authors

Jana Fehr¹ ², Marco Piccininni³ ⁴, Tobias Kurth³, Stefan Konigorski⁵ ⁶ ⁷

Affiliations

¹ Digital Engineering Faculty, University of Potsdam, Potsdam, Germany. jana.fehr@hpi.de.
² Digital Health and Machine Learning, Hasso-Plattner-Institute, Potsdam, Germany. jana.fehr@hpi.de.
³ Institute of Public Health, Charité - Universitätsmedizin Berlin, Berlin, Germany.
⁴ Center for Stroke Research Berlin, Charité - Universitätsmedizin Berlin, Berlin, Germany.
⁵ Digital Engineering Faculty, University of Potsdam, Potsdam, Germany. stefan.konigorski@hpi.de.
⁶ Digital Health and Machine Learning, Hasso-Plattner-Institute, Potsdam, Germany. stefan.konigorski@hpi.de.
⁷ Icahn School of Medicine at Mount Sinai, Hasso Plattner Institute for Digital Health at Mount Sinai, New York, NY, USA. stefan.konigorski@hpi.de.

Abstract

Background: Machine learning models promise to support diagnostic predictions, but may not perform well in new settings. Selecting the best model for a new setting without available data is challenging. We aimed to investigate the transportability, in terms of calibration and discrimination, of prediction models for cognitive impairment in simulated external settings with different distributions of demographic and clinical characteristics.

Methods: We mapped and quantified relationships between variables associated with cognitive impairment using causal graphs, structural equation models, and data from the ADNI study. These estimates were then used to generate datasets and evaluate prediction models with different sets of predictors. We assessed transportability to external settings under guided interventions on age, APOE ε4, and tau-protein as the difference in performance between internal and external settings, quantified by calibration metrics and the area under the receiver operating characteristic curve (AUC).
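The workflow described in the Methods can be illustrated with a minimal sketch. It is not the study's model: the three-variable chain (age → tau → impairment → cognitive score), every coefficient, the size of the age shift, and the function names are assumptions chosen only for illustration, and calibration is summarized here by calibration-in-the-large (mean predicted risk minus observed prevalence), which may differ from the calibration metrics used in the paper. The sketch simulates an internal setting and an external setting in which the age distribution has been shifted, then compares a model that predicts from a cause of the outcome with one that predicts from a consequence.

```python
import numpy as np

def simulate(n, age_shift=0.0, seed=0):
    """Draw n samples from a toy SEM (hypothetical): age -> tau -> y -> score."""
    rng = np.random.default_rng(seed)
    age = rng.normal(age_shift, 1.0, n)              # intervention shifts mean age
    tau = 0.6 * age + rng.normal(0.0, 1.0, n)        # cause (parent) of the outcome
    risk = 1.0 / (1.0 + np.exp(-(-1.0 + 1.2 * tau)))
    y = rng.binomial(1, risk)                        # cognitive impairment (0/1)
    score = 1.5 * y + rng.normal(0.0, 1.0, n)        # consequence of the outcome
    return {"tau": tau, "score": score, "y": y}

def fit_logistic(x, y, iters=25):
    """Single-predictor logistic regression fitted by Newton's method."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hessian, X.T @ (y - p))
    return beta

def predict(beta, x):
    return 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * x)))

def auc(y, p):
    """Rank-based (Mann-Whitney) AUC; assumes no ties among predictions."""
    ranks = np.argsort(np.argsort(p)) + 1
    n1 = int(y.sum())
    n0 = len(y) - n1
    return (ranks[y == 1].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)

def calibration_in_the_large(y, p):
    """Mean predicted risk minus observed prevalence (0 is ideal)."""
    return float(p.mean() - y.mean())

def compare_settings(n=20000, age_shift=1.5):
    train = simulate(n, seed=0)
    settings = {
        "internal": simulate(n, seed=1),
        "external": simulate(n, age_shift=age_shift, seed=2),
    }
    results = {}
    for predictor in ("tau", "score"):               # cause vs. consequence
        beta = fit_logistic(train[predictor], train["y"])
        for name, data in settings.items():
            p = predict(beta, data[predictor])
            results[(predictor, name)] = {
                "auc": float(auc(data["y"], p)),
                "citl": calibration_in_the_large(data["y"], p),
            }
    return results

results = compare_settings()
for key, metrics in results.items():
    print(key, {k: round(v, 3) for k, v in metrics.items()})
```

In this toy setup the conditional distribution of the outcome given tau is unchanged by the age shift, so the cause-based model should stay close to calibrated in the external setting, whereas the consequence-based model inherits the internal prevalence and is expected to drift. Transportability in the sense of the Methods corresponds to the internal-minus-external difference of each metric.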

Results: Calibration differences indicated that models predicting with causes of the outcome were more transportable than those predicting with consequences. AUC differences indicated inconsistent trends of transportability between the different external settings. Models predicting with consequences tended to show higher AUC in the external settings compared to internal settings, while models predicting with parents or all variables showed similar AUC.

Conclusions: Using a practical prediction task as an example, we demonstrated that predicting with causes of the outcome results in better transportability than anti-causal prediction when considering calibration differences. We conclude that calibration performance is crucial when assessing model transportability to external settings.

Keywords: Alzheimer’s Disease; Causality; Clinical risk prediction; DAG; Transportability.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Benchmarking
Calibration
Cognitive Dysfunction* / diagnosis
Humans
Models, Statistical*
Prognosis

Grants and funding

U01 AG024904/AG/NIA NIH HHS/United States