Identifying at-risk patients for congenital heart disease using integrated predictive models and fuzzy clustering analysis: A cross-sectional study

Heliyon. 2024 Oct 18;10(20):e39609. doi: 10.1016/j.heliyon.2024.e39609. eCollection 2024 Oct 30.

Abstract

Congenital heart disease (CHD) remains a significant global health concern, affecting approximately 1% of newborns worldwide. Although its exact causes often remain elusive, a combination of genetic and environmental factors is implicated. In this cross-sectional study, we propose a comprehensive prediction framework that uses Machine Learning (ML) and Multi-Attribute Decision Making (MADM) techniques to improve CHD diagnosis and forecasting. The framework integrates supervised and unsupervised learning to remove data noise and to handle imbalanced datasets. By combining imbalance ensemble methods with clustering algorithms such as K-means, we improve predictive accuracy, particularly on non-clinical datasets, where class imbalance is prevalent. Our results show an 8% improvement in recall compared to the existing literature. The framework also uses MADM techniques to identify the clusters of patients at highest risk, providing insight into susceptibility to CHD. Fuzzy clustering then assesses the degree of risk for each individual within a cluster, enabling personalized risk evaluation. Our analysis identifies unhealthy lifestyle factors, annual per capita income, nutrition, and folic acid supplementation as crucial predictors of CHD occurrence. Environmental risk factors and maternal illnesses also contribute significantly to the predictive model. These findings underscore the multifactorial nature of CHD development and the importance of considering socioeconomic and lifestyle factors alongside medical variables in CHD risk assessment and prevention strategies. The proposed framework offers a promising avenue for early identification and intervention, potentially reducing the burden of CHD on affected individuals and healthcare systems globally.

Keywords: Clustering; Congenital heart disease; Machine learning; Multi-attribute decision-making; Risk assessment.
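
The abstract states that fuzzy clustering assigns each individual a degree of risk within a cluster, but it does not say which algorithm is used. As an illustration only, the sketch below assumes the standard fuzzy c-means membership formula, in which a point's membership in cluster i is proportional to its distance to that cluster's center raised to the power -2/(m-1). The function name `fuzzy_memberships`, the fuzzifier default `m=2.0`, and the two example centers are hypothetical and are not taken from the study.

```python
import math

def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means style membership of `point` in each cluster (sums to 1).

    Assumption: standard fuzzy c-means membership formula; the study's
    actual fuzzy clustering method is not specified in the abstract.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** -exponent for d in dists]  # closer center, larger weight
    total = sum(inverse)
    return [w / total for w in inverse]

# Hypothetical cluster centers in a two-feature space (not study data).
centers = [(0.0, 0.0), (4.0, 0.0)]
memberships = fuzzy_memberships((1.0, 0.0), centers)
# Distances are 1 and 3, so the memberships are approximately [0.9, 0.1].
```

Unlike a hard K-means assignment, the result is a graded membership per cluster, which is what allows a per-individual degree of risk to be read off rather than a single cluster label.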