In evolving clinical environments, the accuracy of prediction models deteriorates over time. Guidance on designing model updating policies is limited, and the impact of different policies on future model performance, and across different model types, has received little attention. We implemented a new data-driven updating strategy based on a nonparametric testing procedure and compared it to two baseline approaches in which models are either never updated or fully refit annually. The test-based strategy generally recommended intermittent recalibration and delivered better-calibrated predictions than either baseline strategy. It also highlighted differences in the updating requirements of logistic regression, L1-regularized logistic regression, random forest, and neural network models, in both the extent and the timing of updates. These findings underscore the potential of a data-driven maintenance approach, rather than a one-size-fits-all policy, to sustain more stable and accurate model performance over time.
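A minimal sketch of the general idea is given below, assuming a hypothetical bootstrap test of calibration-in-the-large and hypothetical helpers (`needs_recalibration`, `recalibrate_fn`, `test_based_maintenance`); the paper's actual nonparametric testing procedure and update rules may differ.

```python
import numpy as np

def needs_recalibration(y_true, y_prob, n_boot=2000, alpha=0.05, seed=0):
    """Illustrative bootstrap test of calibration-in-the-large: flag drift when
    the observed event rate is inconsistent with the mean predicted risk."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    # Resample cases with replacement to approximate the sampling distribution
    # of the gap between observed and predicted risk; a CI excluding 0 flags drift.
    idx = rng.integers(0, len(y_true), size=(n_boot, len(y_true)))
    gaps = y_true[idx].mean(axis=1) - y_prob[idx].mean(axis=1)
    lo, hi = np.quantile(gaps, [alpha / 2, 1 - alpha / 2])
    return not (lo <= 0.0 <= hi)

def test_based_maintenance(model, yearly_batches, recalibrate_fn):
    """Sketch of the test-based strategy: score each year's new batch and
    recalibrate only when the test fires. The two baselines in the abstract
    would instead never update, or refit unconditionally every year."""
    for X, y in yearly_batches:
        p = model.predict_proba(X)[:, 1]
        if needs_recalibration(y, p):
            model = recalibrate_fn(model, X, y)  # e.g., update intercept/slope
    return model
```

In this sketch, recalibration is triggered only when evidence of miscalibration accumulates, which is how a data-driven policy can recommend intermittent updates rather than a fixed annual schedule.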