Environmental exposures in machine learning and data mining approaches to diabetes etiology: A scoping review

Artif Intell Med. 2023 Jan:135:102461. doi: 10.1016/j.artmed.2022.102461. Epub 2022 Nov 30.

Abstract

Background: Environmental exposures are implicated in diabetes etiology, but are poorly understood due to disease heterogeneity, complexity of exposures, and analytical challenges. Machine learning and data mining are artificial intelligence methods that can address these limitations. Despite their increasing adoption in research on diabetes etiology and prediction, the types of methods and exposures analyzed have not been thoroughly reviewed.

Objective: We aimed to review articles that implemented machine learning and data mining methods to understand environmental exposures in diabetes etiology and disease prediction.

Methods: On September 19, 2022, we queried the PubMed and Scopus databases for machine learning and data mining studies that used environmental exposures to understand diabetes etiology. Exposures were classified as specific external, general external, or internal. We reviewed machine learning and data mining methods and characterized the scope of environmental exposures studied in the etiology of general diabetes, type 1 diabetes, type 2 diabetes, and other types of diabetes.

Results: We identified 44 articles for inclusion. Specific external exposures were the most common exposures studied, and supervised models were the most common methods used. Well-established specific external exposures of low physical activity, high cholesterol, and high triglycerides were predictive of general diabetes, type 2 diabetes, and prediabetes, while novel metabolic and gut microbiome biomarkers were implicated in type 1 diabetes.

Discussion: The use of machine learning and data mining methods to elucidate environmental triggers of diabetes was largely limited to well-established risk factors identified using easily explainable and interpretable models. Future studies should seek to leverage machine learning and data mining to explore the temporality and co-occurrence of multiple exposures and further evaluate the role of general external and internal exposures in diabetes etiology.

Keywords: Data mining; Diabetes mellitus; Environmental exposures; Machine learning.

Publication types

  • Review
  • Research Support, N.I.H., Extramural

MeSH terms

  • Artificial Intelligence
  • Data Mining / methods
  • Diabetes Mellitus, Type 1*
  • Diabetes Mellitus, Type 2* / epidemiology
  • Diabetes Mellitus, Type 2* / etiology
  • Environmental Exposure / adverse effects
  • Humans
  • Machine Learning