Objectives: An environment-wide association study (EWAS) may be useful to comprehensively test and validate associations between environmental factors and cardiovascular disease (CVD) in an unbiased manner.
Approach and results: Data from National Health and Nutrition Examination Survey (1999-2014) were randomly 50:50 spilt into training set and testing set. CVD was ascertained by a self-reported diagnosis of myocardial infarction, coronary heart disease or stroke. We performed multiple linear regression analyses associating 203 environmental factors and 132 clinical phenotypes with CVD in training set (false discovery rate < 5%) and significant factors were validated in the testing set (P < 0.05). Random forest (RF) model was used for multicollinearity elimination and variable importance ranking. Discriminative power of factors for CVD was calculated by area under the receiver operating characteristic (AUROC). Overall, 43,568 participants with 4084 (9.4%) CVD were included. After adjusting for age, sex, race, body mass index, blood pressure and socio-economic level, we identified 5 environmental variables and 19 clinical phenotypes associated with CVD in training and testing dataset. Top five factors in RF importance ranking were: waist, glucose, uric acid, and red cell distribution width and glycated hemoglobin. AUROC of the RF model was 0.816 (top 5 factors) and 0.819 (full model). Sensitivity analyses reveal no specific moderators of the associations.
Conclusion: Our systematic evaluation provides new knowledge on the complex array of environmental correlates of CVD. These identified correlates may serve as a complementary approach to CVD risk assessment. Our findings need to be probed in further observational and interventional studies.
Keywords: Cardiovascular disease; Environment-wide association study; False discovery rate; Random forest; Receiver operating characteristic.
Copyright © 2018. Published by Elsevier Ltd.