Background: Machine learning (ML) approaches have been broadly applied to predicting length of stay and mortality in hospitalized patients. ML may also reduce societal health burdens, assist in health resource planning, and improve health outcomes. However, the fairness of these ML models across ethnoracial or socioeconomic subgroups is rarely assessed or discussed. In this study, we aim (1) to quantify the algorithmic bias of ML models when predicting the probability of long-term hospitalization or in-hospital mortality for different heart failure (HF) subpopulations, and (2) to propose a novel method that improves the fairness of our models without compromising their predictive power.
Methods: We built 5 ML classifiers to predict the composite outcome of hospitalization length of stay and in-hospital mortality for 210 368 HF patients extracted from the Get With The Guidelines-Heart Failure registry data set. To mitigate algorithmic bias, we integrated 15 social determinants of health variables, including the Social Deprivation Index and the Area Deprivation Index, into the feature space of the ML models based on patients' geographic locations.
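As an illustrative sketch only (not the registry pipeline itself), the geography-based feature integration described above could look like the following, where the file names and the columns zip, sdi, adi, and outcome are hypothetical stand-ins for the registry's actual fields:

```python
# Sketch: join geography-linked SDOH variables onto patient records,
# then fit one candidate classifier on the augmented feature space.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

patients = pd.read_csv("hf_registry.csv")   # clinical features + "zip" + "outcome" (hypothetical)
sdoh = pd.read_csv("sdoh_by_zip.csv")       # e.g., "sdi", "adi" keyed by "zip" (hypothetical)

# Merge SDOH variables onto each patient record by geographic unit.
data = patients.merge(sdoh, on="zip", how="left")

X = data.drop(columns=["outcome", "zip"])
y = data["outcome"]                          # composite long-stay/in-hospital-mortality label
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=500, random_state=0)
model.fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```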
Results: The best-performing random forest model demonstrated modest predictive power but selectively underdiagnosed underserved subpopulations, for example, female, Black, and socioeconomically disadvantaged patients. Integrating social determinants of health variables significantly improved fairness without compromising model performance.
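One way the selective underdiagnosis reported above can be quantified is by comparing group-wise false-negative rates on held-out predictions; a minimal sketch on toy data, assuming binary predictions and a demographic attribute per patient (the sex column and values are hypothetical):

```python
import numpy as np
import pandas as pd

def false_negative_rate(y_true, y_pred):
    """FNR = FN / (FN + TP): the share of true events the model misses."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    positives = y_true == 1
    return float(np.mean(y_pred[positives] == 0)) if positives.any() else float("nan")

# Hypothetical held-out predictions with a demographic attribute per patient.
df = pd.DataFrame({
    "sex":  ["F", "F", "F", "M", "M", "M", "F", "M"],
    "y":    [1,   1,   0,   1,   1,   0,   1,   0],
    "pred": [0,   1,   0,   1,   1,   0,   0,   0],
})

# A persistent FNR gap between subgroups signals selective underdiagnosis.
fnr_by_group = df.groupby("sex").apply(lambda g: false_negative_rate(g["y"], g["pred"]))
print(fnr_by_group)  # here: F ≈ 0.67, M = 0.00
```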
Conclusions: We quantified algorithmic bias against underserved subpopulations in the prediction of the composite outcome for HF patients. We provide a potential direction for reducing disparities in ML-based predictive models by integrating social determinants of health variables. We urge fellow researchers to strongly consider ML fairness when developing predictive models for HF patients.
Keywords: bias; healthcare disparities; heart failure; machine learning; social determinants of health.