Mapping of air temperature (Ta) at high spatiotemporal resolution is critical to reducing exposure assessment errors in epidemiological studies on the health effects of air temperature. In this study, we applied a three-stage ensemble model to estimate daily mean Ta from satellite-based land surface temperature (Ts) over Sweden during 2001–2019 at a high spatial resolution of 1 × 1 km². The ensemble model incorporated four base models: a generalized additive model (GAM), a generalized additive mixed model (GAMM), and two machine learning models (random forest [RF] and extreme gradient boosting [XGBoost]). It allowed the weights for each model to vary over space, with the best-performing model for each grid cell assigned the highest weight. Various spatial predictors were included as adjustment variables in all the base models, including land cover type, normalized difference vegetation index (NDVI), and elevation. The ensemble model showed high performance, with an overall R² of 0.98 and a root mean square error of 1.38 °C in the ten-fold cross-validation, and outperformed each of the four base models. Although each base model performed well, the two machine learning models (RF [R² = 0.97], XGBoost [R² = 0.98]) performed better than the two regression models (GAM [R² = 0.95], GAMM [R² = 0.96]). In the machine learning models, Ts was the dominant predictor of Ta, followed by day of year, NDVI, latitude, elevation, and longitude. The highly spatiotemporally resolved Ta can improve temperature exposure assessment in future epidemiological studies.
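The spatially varying weighting described above can be illustrated with a minimal sketch. The abstract does not state how the weights were derived, so the inverse-mean-squared-error rule used here is an assumption chosen only because it gives the best-performing base model in each grid cell the highest weight; it is not necessarily the authors' method. The function names (`cell_weights`, `ensemble_predict`, `rmse`, `r2`) and the toy residuals are hypothetical.

```python
import numpy as np


def cell_weights(errors, cell_ids, n_cells, eps=1e-9):
    """Per-grid-cell weights for each base model (assumed inverse-MSE rule).

    errors   : (n_obs, n_models) held-out residuals, e.g. from cross-validation
    cell_ids : (n_obs,) integer grid-cell index of each observation
    Returns an (n_cells, n_models) array whose rows sum to 1.
    """
    n_models = errors.shape[1]
    sq_sum = np.zeros((n_cells, n_models))
    counts = np.zeros(n_cells)
    # Accumulate squared errors and observation counts per grid cell.
    np.add.at(sq_sum, cell_ids, errors ** 2)
    np.add.at(counts, cell_ids, 1)
    mse = sq_sum / np.maximum(counts, 1)[:, None]
    inv = 1.0 / (mse + eps)
    # Assumption: cells without any observations fall back to equal weights.
    inv[counts == 0] = 1.0
    return inv / inv.sum(axis=1, keepdims=True)


def ensemble_predict(preds, cell_ids, weights):
    """Weighted average of base-model predictions, using each cell's weights."""
    return (preds * weights[cell_ids]).sum(axis=1)


def rmse(y, yhat):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y - yhat) ** 2)))


def r2(y, yhat):
    """Coefficient of determination."""
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)


if __name__ == "__main__":
    # Toy residuals for two base models in two grid cells (made-up numbers).
    errors = np.array([[0.1, 1.0], [0.2, 1.5], [2.0, 0.1], [1.0, 0.3]])
    cells = np.array([0, 0, 1, 1])
    print(cell_weights(errors, cells, n_cells=2))
```

In this sketch the first model dominates the weights in cell 0 and the second in cell 1, which mirrors the idea that no single base model has to be best everywhere.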
Keywords: Air temperature; Ensemble model; Extreme gradient boosting; Land surface temperature; Random forest.
Copyright © 2021 Elsevier Inc. All rights reserved.