Comparison between machine learning methods for mortality prediction for sepsis patients with different social determinants

Hanyin Wang; Yikuan Li; Andrew Naidech; Yuan Luo

doi:10.1186/s12911-022-01871-0

BMC Med Inform Decis Mak. 2022 Jun 16;22(Suppl 2):156. doi: 10.1186/s12911-022-01871-0.

Authors

Hanyin Wang¹, Yikuan Li¹, Andrew Naidech², Yuan Luo³

Affiliations

¹ Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
² Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
³ Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA. Yuan.Luo@northwestern.edu.

Abstract

Background: Sepsis is one of the most life-threatening conditions for critically ill patients in the United States, yet its diagnosis remains challenging because standardized criteria for sepsis identification are still under development. Disparities in the social determinants of sepsis patients can interfere with the performance of machine learning-based risk prediction.

Methods: We analyzed a cohort of critical care patients from the Medical Information Mart for Intensive Care (MIMIC)-III database. Disparities in social determinants, including race, sex, marital status, insurance type, and language, among patients identified by six available sepsis criteria were examined using forest plots with 95% confidence intervals. Sepsis patients were then identified by the Sepsis-3 criteria. Sixteen machine learning classifiers were trained to predict in-hospital mortality for sepsis patients on a randomly selected training set. Performance was measured by the area under the receiver operating characteristic curve (AUC). The trained model was tested on the entire randomly constructed test set and on each sub-population defined by each of the following social determinants: race, sex, marital status, insurance type, and language. The fluctuations in performance were further examined by permutation tests.
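The subgroup evaluation described in the Methods can be illustrated with a minimal sketch. The abstract does not specify how the permutation tests were constructed, so the scheme below is an assumption: it compares the AUC inside one subgroup with the AUC on the remaining patients and shuffles subgroup membership to build the null distribution. The function names (`auc`, `subgroup_auc_permutation_test`) and the toy data are hypothetical and are not taken from the study.

```python
import random


def auc(labels, scores):
    """AUC from the Mann-Whitney U statistic; tied scores get average ranks."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank over the tie block
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")  # AUC is undefined without both outcome classes
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def subgroup_auc_permutation_test(labels, scores, in_group, n_perm=1000, seed=0):
    """Assumed scheme: permute subgroup membership, compare AUC(group) - AUC(rest)."""
    rng = random.Random(seed)

    def auc_difference(mask):
        group = [(l, s) for l, s, m in zip(labels, scores, mask) if m]
        rest = [(l, s) for l, s, m in zip(labels, scores, mask) if not m]
        return auc(*zip(*group)) - auc(*zip(*rest))

    observed = auc_difference(in_group)
    mask = list(in_group)
    extreme = 0
    valid = 0
    for _ in range(n_perm):
        rng.shuffle(mask)
        diff = auc_difference(mask)
        if diff != diff:  # NaN: a permuted split lacked one outcome class
            continue
        valid += 1
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (valid + 1)
    return observed, p_value


# Toy data (hypothetical): 1 = died in hospital, scores = predicted risk.
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.1, 0.2, 0.8, 0.9, 0.6, 0.4, 0.7, 0.3]
toy_group = [False, False, False, False, True, True, True, True]
observed_diff, p = subgroup_auc_permutation_test(toy_labels, toy_scores, toy_group)
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Shuffling membership keeps the subgroup size fixed, so a small subgroup is compared against random subsets of the same size rather than against the full test set.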

Results: We analyzed a total of 11,791 critical care patients from the MIMIC-III database. Within the population identified by each sepsis identification method, significant differences were observed among sub-populations regarding race, marital status, insurance type, and language. Among the 5,783 sepsis patients identified by the Sepsis-3 criteria, statistically significant decreases in mortality prediction performance were observed when the trained machine learning model was applied to Asian and Hispanic patients, as well as to Spanish-speaking patients. Pairwise comparisons revealed performance discrepancies in mortality prediction between Asian and White patients, between Asian patients and patients of other races, and between English-speaking and Spanish-speaking patients.

Conclusions: Disparities in the proportions of patients identified by various sepsis criteria were detected among the different social determinant groups. The performance of mortality prediction for sepsis patients can be compromised when a universally trained model is applied to each subpopulation. To achieve accurate diagnosis, a versatile diagnostic system for sepsis is needed to overcome the social determinant disparities of patients.

Keywords: Disparity; Machine learning; Mortality prediction; Sepsis; Social determinants.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Critical Illness
Hospital Mortality
Humans
Machine Learning
Retrospective Studies
Sepsis* / diagnosis
Social Determinants of Health*
