A scoping review on the use of machine learning in research on social determinants of health: Trends and research prospects

Shiho Kino; Yu-Tien Hsu; Koichiro Shiba; Yung-Shin Chien; Carol Mita; Ichiro Kawachi; Adel Daoud

doi:10.1016/j.ssmph.2021.100836

SSM Popul Health. 2021 Jun 5;15:100836. doi: 10.1016/j.ssmph.2021.100836. eCollection 2021 Sep.

Authors

Shiho Kino¹,², Yu-Tien Hsu¹, Koichiro Shiba³, Yung-Shin Chien¹, Carol Mita⁴, Ichiro Kawachi¹, Adel Daoud⁵,⁶,⁷,⁸

Affiliations

¹ Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
² Department of Social Epidemiology, Kyoto University, Kyoto, Japan.
³ Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
⁴ Countway Library of Medicine, Harvard University, Boston, MA, USA.
⁵ Center for Population and Development Studies, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA.
⁶ Department of Sociology and Work Science, University of Gothenburg, Sweden.
⁷ The Division of Data Science and Artificial Intelligence of the Department of Computer Science and Engineering, Chalmers University of Technology, Sweden.
⁸ Institute for Analytical Sociology, Linköping University, Sweden.

Abstract

Background: Machine learning (ML) has spread rapidly from computer science to several disciplines. Given the predictive capacity of ML, it offers new opportunities for health, behavioral, and social scientists. However, it remains unclear how and to what extent ML is being used in studies of social determinants of health (SDH).

Methods: Using four search engines, we conducted a scoping review of studies that used ML to study SDH (published before May 1, 2020). Two independent reviewers analyzed the relevant studies. For each study, we identified the research questions, results, data, and algorithms. We synthesized our findings in a narrative report.

Results: Of the initial 8097 hits, we identified 82 relevant studies. The number of publications has risen during the past decade. More than half of the studies (n = 46) used US data. About 80% (n = 66) utilized surveys, and 70% (n = 57) employed ML for common prediction tasks. Although the number of studies in ML and SDH is growing rapidly, only a few studies used ML to improve causal inference, curate data, or identify social bias in predictions (i.e., algorithmic fairness).

Conclusions: While ML equips researchers with new ways to measure health outcomes and their determinants from non-conventional sources such as text, audio, and image data, most studies still rely on traditional surveys. Although there are no guarantees that ML will lead to better social epidemiological research, harnessing its predictive power for causal inference, data curation, or algorithmic fairness offers clear potential for innovation in SDH research.

Keywords: Machine learning; Review; Social determinants of health.

Publication types

Review