Machine learning algorithms to the early diagnosis of fetal alcohol spectrum disorders

Front Neurosci. 2024 May 6:18:1400933. doi: 10.3389/fnins.2024.1400933. eCollection 2024.

Abstract

Introduction: Fetal alcohol spectrum disorders (FASD) include a variety of physical and neurocognitive disorders caused by prenatal alcohol exposure. Although their overall prevalence is around 0.77%, FASD remain underdiagnosed and little known, partly because their diagnosis is complex and they share some symptoms with other pathologies such as autism spectrum, depression, or hyperactivity disorders.

Methods: This study included 73 controls and 158 patients diagnosed with FASD. Variables were selected based on the 2016 IOM classification and included sociodemographic, clinical, and psychological characteristics. Statistical analysis included the Kruskal-Wallis test for quantitative variables, the Chi-square test for qualitative variables, and Machine Learning (ML) algorithms for predictions.
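As an illustration of the Chi-square test named in the Methods, the following minimal sketch computes the Pearson statistic for a contingency table in plain Python. Only the group sizes (73 controls, 158 patients) come from the abstract; the split of each group by a binary variable is invented for illustration, and the function names `chi_square_statistic` and `p_value_one_dof` are hypothetical. No continuity correction is applied, and this is not presented as the authors' actual analysis code.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_one_dof(stat):
    """Upper-tail p-value for a chi-square statistic with 1 degree of
    freedom (the 2 x 2 case), using P(Z^2 > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(stat / 2.0))


# HYPOTHETICAL counts: rows are groups (control, FASD), columns are
# absence/presence of some binary clinical feature. Row sums match the
# group sizes in the abstract (73 and 158); the split is made up.
example_table = [
    [63, 10],   # controls: feature absent, feature present
    [58, 100],  # FASD:     feature absent, feature present
]

stat = chi_square_statistic(example_table)
print("chi-square:", stat)
print("p-value (1 dof):", p_value_one_dof(stat))
```

For tables larger than 2 x 2 the degrees of freedom are (rows - 1) x (columns - 1), and the p-value would need a general chi-square distribution function rather than the one-degree-of-freedom shortcut used here.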

Results: This study explores the application of ML in diagnosing FASD and its subtypes: Fetal Alcohol Syndrome (FAS), partial FAS (pFAS), and Alcohol-Related Neurodevelopmental Disorder (ARND). ML constructed a profile for FASD based on sociodemographic, clinical, and psychological data from children with FASD compared to a control group. The Random Forest (RF) model was the most efficient for predicting FASD, achieving the highest metrics in accuracy (0.92), precision (0.96), sensitivity (0.92), F1 Score (0.94), specificity (0.92), and AUC (0.92). For FAS, the XGBoost model obtained the highest accuracy (0.94), precision (0.91), sensitivity (0.91), F1 Score (0.91), specificity (0.96), and AUC (0.93). For pFAS, the RF model performed best, with accuracy (0.90), precision (0.86), sensitivity (0.96), F1 Score (0.91), specificity (0.83), and AUC (0.90). For ARND, the RF model obtained the best levels of accuracy (0.87), precision (0.76), sensitivity (0.93), F1 Score (0.84), specificity (0.83), and AUC (0.88). Our study identified key variables for efficient FASD screening, including traditional clinical characteristics such as maternal alcohol consumption, lip-philtrum, microcephaly, and height and weight impairment, as well as neuropsychological variables such as the Working Memory Index (WMI), aggressive behavior, IQ, somatic complaints, and depressive problems.
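The metrics reported in the Results follow standard definitions based on confusion-matrix counts. The sketch below, in plain Python, shows those definitions and checks that each reported F1 Score agrees with the harmonic mean of the reported precision and sensitivity. The reported values are taken from the abstract; the function names `metrics_from_counts` and `f1_from` and the confusion-matrix counts in the second half are hypothetical, since the abstract does not give the underlying counts.

```python
def metrics_from_counts(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def f1_from(precision, sensitivity):
    """F1 Score as the harmonic mean of precision and sensitivity."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# (precision, sensitivity, reported F1) as stated in the abstract.
reported = {
    "FASD (RF)": (0.96, 0.92, 0.94),
    "FAS (XGBoost)": (0.91, 0.91, 0.91),
    "pFAS (RF)": (0.86, 0.96, 0.91),
    "ARND (RF)": (0.76, 0.93, 0.84),
}

for label, (precision, sensitivity, reported_f1) in reported.items():
    recomputed = round(f1_from(precision, sensitivity), 2)
    print(f"{label}: recomputed F1 = {recomputed}, reported F1 = {reported_f1}")

# HYPOTHETICAL counts, for illustration only (not from the study).
print(metrics_from_counts(tp=9, fp=1, tn=8, fn=2))
```

Recomputing F1 from the rounded precision and sensitivity reproduces the reported value for all four models, which is a quick consistency check a reader can apply to any such table of results.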

Discussion: Our findings emphasize the importance of ML analyses for early diagnoses of FASD, allowing a better understanding of FASD subtypes to potentially improve clinical practice and avoid misdiagnosis.

Keywords: PAE; Random Forest (RF); eXtreme Gradient Boosting (XGB); early diagnosis; fetal alcohol spectrum disorders; machine learning; neurodevelopment.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This study has been funded by Instituto de Salud Carlos III (ISCIII) through the projects PI19/01853, PI21/01415 and PI23/01220 and co-funded by the European Union. Projects RD21/0012/0017 and RD21/0012/0023 were financed by Instituto de Salud Carlos III (ISCIII) and Unión Europea NextGenerationEU/Mecanismo para la Recuperación y la Resiliencia (MRR)/Plan de Recuperación, Transformación y Resiliencia (PRTR). This research was also funded by Fundación Mutua Madrileña (AP183662023). This study has also been carried out thanks to the support of the Departament de Recerca i Universitats de la Generalitat de Catalunya to the Grup de Recerca Infància i Entorn (GRIE) (2021 SGR 01290). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.