A concern in the field of autism electroencephalography (EEG) biomarker discovery is the lack of reproducibility of proposed biomarkers. In the present study, we considered the problem of learning reproducible associations between multiple features of resting state (RS) neural activity and autism, using EEG data collected during an RS paradigm from 36- to 96-month-old children diagnosed with autism (N = 224) and neurotypical children (N = 69). Specifically, EEG spectral power and functional connectivity features were used as inputs to a regularized generalized linear model trained to predict diagnostic group (autism versus neurotypical). To evaluate our model, we proposed a procedure that quantified both the predictive generalization of the model and the reproducibility of the associations it learned. When both predictive performance and reproducibility of associations were prioritized, a highly reproducible profile of associations emerged. This profile revealed a distinct pattern of increased gamma power and connectivity in occipital and posterior midline regions associated with an autism diagnosis. Conversely, model selection based on predictive performance alone resulted in non-robust associations. Finally, we built a custom machine learning model that empirically further improved the robustness of the learned associations. Our results highlight the need for model selection criteria that maximize the scientific utility provided by reproducibility instead of predictive performance.
Keywords: Autism; EEG; Electroencephalography; Reproducibility; Reproducible; Resting state.
© 2024. The Author(s).