The ability to predict outcomes from neuroimaging data has the potential to answer important clinical questions, such as which depressed patients will respond to treatment, which abstinent drug users will relapse, or which patients will convert to dementia. However, many prediction analyses require methods and techniques that are not typically used in neuroimaging to accurately assess a model's predictive ability. Regression models tend to fit the idiosyncratic characteristics of a particular sample and consequently perform worse on unseen data. Failure to account for this inherent optimism is especially pernicious when the number of possible predictors is high relative to the number of participants, a common scenario in psychiatric neuroimaging. We show via simulated data that models can appear predictive even when data and outcomes are random, and we note examples of optimistic prediction in the literature. We provide some recommendations for assessment of model performance.
Keywords: Addiction; imaging; machine learning; methods; prediction; simulation.
Copyright © 2014 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
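
The abstract's central claim, that a regression model can appear predictive when both the data and the outcomes are random, can be illustrated with a small simulation. The sketch below is not the authors' simulation: the sample size (50 participants), the number of predictors (30), ordinary least squares as the model, 5-fold cross-validation as the out-of-sample check, and all function names are assumptions chosen only for illustration. It compares the in-sample R² with the R² obtained when every prediction comes from a model that never saw that participant.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Proportion of variance explained; can be negative for poor predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_ols(X, y):
    """Ordinary least squares with an intercept column prepended."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def predict_ols(coef, X):
    """Apply fitted OLS coefficients (intercept first) to new data."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def simulate(n_participants=50, n_predictors=30, n_folds=5, seed=0):
    """Return (in-sample R^2, cross-validated R^2) for pure-noise data.

    All sizes are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_participants, n_predictors))  # random "imaging" predictors
    y = rng.standard_normal(n_participants)                  # random outcome, unrelated to X

    # Optimistic estimate: fit and evaluate on the same participants.
    in_sample = r_squared(y, predict_ols(fit_ols(X, y), X))

    # Honest estimate: each participant is predicted by a model fit without them.
    shuffled = rng.permutation(n_participants)
    y_pred = np.empty(n_participants)
    for test_idx in np.array_split(shuffled, n_folds):
        train_idx = np.setdiff1d(shuffled, test_idx)
        coef = fit_ols(X[train_idx], y[train_idx])
        y_pred[test_idx] = predict_ols(coef, X[test_idx])
    cross_validated = r_squared(y, y_pred)

    return in_sample, cross_validated


if __name__ == "__main__":
    in_sample, cross_validated = simulate()
    print(f"in-sample R^2:       {in_sample:.2f}")
    print(f"cross-validated R^2: {cross_validated:.2f}")
```

Under these assumptions the in-sample R² is substantial even though no true relationship exists, while the cross-validated R² falls well below it. Increasing `n_participants` relative to `n_predictors` shrinks the gap, which is consistent with the abstract's point that optimism is worst when predictors are numerous relative to participants.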