Leveraging Internet Search Data to Improve the Prediction and Prevention of Noncommunicable Diseases: Retrospective Observational Study

J Med Internet Res. 2020 Nov 12;22(11):e18998. doi: 10.2196/18998.

Abstract

Background: As society enters an era of vast and easily accessible social media, a growing number of people are using the internet to search for and exchange medical information. Because internet search data can reflect population interest in particular health topics, they offer a new way to understand health concerns about noncommunicable diseases (NCDs) and the role these data could play in NCD prevention.

Objective: We aimed to explore the association of internet search data for NCDs with published disease incidence and mortality rates in the United States and to characterize public health concerns about NCDs.

Methods: We tracked NCDs by examining the correlations among the incidence rates, mortality rates, and internet searches in the United States from 2004 to 2017, and we established forecast models based on the relationship between the disease rates and internet searches.
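
As a rough illustration of the correlation step described in the Methods, the sketch below computes a correlation coefficient between a yearly search-volume series and a yearly disease-rate series. The abstract does not say which coefficient was used; Pearson's r is assumed here. The function name `pearson_r` and the yearly values are hypothetical and are not study data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Synthetic yearly values for illustration only (NOT taken from the study)
rsv = [40, 45, 50, 55, 60]                        # relative search volume per year
incidence = [100.0, 108.0, 121.0, 129.0, 142.0]   # incidence rate per year
print(f"r = {pearson_r(rsv, incidence):.3f}")
```

A significance test (such as the P<.05 threshold reported in the Results) would need an additional step, for example a t test on r, which is omitted here.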

Results: Incidence and mortality rates of 29 diseases in the United States were statistically significantly correlated with the relative search volumes (RSVs) of their search terms (P<.05). Judged by the goodness of fit of the multiple regression prediction models, the coefficients of determination were closest to 1 for diabetes mellitus, stroke, atrial fibrillation and flutter, Hodgkin lymphoma, and testicular cancer: the linear regression models for predicting incidence had coefficients of determination of 80%, 88%, 96%, 80%, and 78%, respectively, and the models for predicting mortality had coefficients of determination of 82%, 62%, 94%, 78%, and 62%, respectively.
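
To make the coefficient of determination concrete, the sketch below fits a least-squares line and computes R² as 1 minus the ratio of residual to total sum of squares. It is a simplification under stated assumptions: it uses a single predictor, whereas the abstract refers to multiple regression models, and the helper names `fit_line` and `r_squared` and all numbers are hypothetical rather than taken from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x (one predictor)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor is constant; slope is undefined")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("outcome is constant; R^2 is undefined")
    return 1 - ss_res / ss_tot

# Synthetic yearly values for illustration only (NOT taken from the study)
search_volume = [40, 45, 50, 55, 60]
mortality = [20.0, 21.5, 22.0, 24.5, 25.0]
slope, intercept = fit_line(search_volume, mortality)
print(f"R^2 = {r_squared(search_volume, mortality, slope, intercept):.2f}")
```

An R² of 0.96, for example, corresponds to the 96% figure format used in the Results: the fitted model accounts for 96% of the variance in the observed rate.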

Conclusions: A better understanding of search behaviors could augment traditional epidemiologic surveillance and could serve as a reference to aid in disease prediction and prevention.

Keywords: Google Trends; United States; early warning model; infodemiology; infoveillance; internet searches; noncommunicable diseases.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Female
  • Humans
  • Incidence
  • Internet
  • Male
  • Mortality / trends*
  • Noncommunicable Diseases / epidemiology*
  • Noncommunicable Diseases / mortality
  • Noncommunicable Diseases / prevention & control*
  • Retrospective Studies
  • Search Engine / trends*
  • Social Media / trends*