      Is Open Access

      An ensemble-based model of PM2.5 concentration across the contiguous United States with high spatiotemporal resolution


          Abstract

          Various approaches have been proposed to model PM2.5 over the past decade, with satellite-derived aerosol optical depth, land-use variables, chemical transport model predictions, and several meteorological variables as major predictor variables. Our study used an ensemble model that integrated multiple machine learning algorithms and predictor variables to estimate daily PM2.5 at a resolution of 1 km × 1 km across the contiguous United States. We used a generalized additive model that accounted for geographic differences to combine PM2.5 estimates from a neural network, a random forest, and gradient boosting. The three machine learning algorithms were based on multiple predictor variables, including satellite data, meteorological variables, land-use variables, elevation, chemical transport model predictions, several reanalysis datasets, and others. The model training results from 2000 to 2015 indicated good model performance, with a 10-fold cross-validated R² of 0.86 for daily PM2.5 predictions. For annual PM2.5 estimates, the cross-validated R² was 0.89. Our model demonstrated good performance up to 60 μg/m³. Using the trained PM2.5 model and predictor variables, we predicted daily PM2.5 from 2000 to 2015 at every 1 km × 1 km grid cell in the contiguous United States. We also used localized land-use variables within 1 km × 1 km grids to downscale PM2.5 predictions to 100 m × 100 m grid cells. To characterize uncertainty, we used meteorological variables, land-use variables, and elevation to model the monthly standard deviation of the difference between daily monitored and predicted PM2.5 for every 1 km × 1 km grid cell. This PM2.5 prediction dataset, including the downscaled and uncertainty predictions, allows epidemiologists to accurately estimate the adverse health effects of PM2.5. Compared with the performance of the individual base learners, an ensemble model would achieve a better overall estimation. It is worth exploring other ensemble model formats to synthesize estimations from different models or from different groups to improve overall performance.
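          The 10-fold cross-validated R² reported above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' procedure: the abstract does not say how folds were formed or how scores were pooled, so the sketch splits observations at random and scores the pooled out-of-fold predictions once. The helper names (`r_squared`, `fit_line`, `cross_validated_r2`) are hypothetical, the data are synthetic, and a one-variable least-squares line stands in for the ensemble model.

```python
import random


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for a real model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cross_validated_r2(xs, ys, k=10, seed=0):
    """Pool out-of-fold predictions from k random folds, then score once."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    predictions = [0.0] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        intercept, slope = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        # Each observation is predicted by a model that never saw it.
        for i in held_out:
            predictions[i] = intercept + slope * xs[i]
    return r_squared(ys, predictions)


# Synthetic data only: a hypothetical predictor and a noisy response.
rng = random.Random(42)
xs = [rng.uniform(0, 60) for _ in range(500)]
ys = [2.0 + 0.9 * x + rng.gauss(0, 3) for x in xs]
cv_r2 = cross_validated_r2(xs, ys, k=10)
print(f"10-fold cross-validated R^2: {cv_r2:.3f}")
```

          Scoring held-out predictions rather than fitted values is what makes the statistic an estimate of out-of-sample performance. For spatial data, a split by monitoring site instead of by observation is a common alternative; the abstract does not say which was used.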


          Author and article information

          Journal
          Title: Environment International
          Publisher: Elsevier BV
          ISSN: 01604120
          Publication date: September 2019
          Volume: 130
          Article number: 104909
          Article
          DOI: 10.1016/j.envint.2019.104909
          PMC ID: 7063579
          PubMed ID: 31272018
          © 2019

          https://www.elsevier.com/tdm/userlicense/1.0/

          http://creativecommons.org/licenses/by-nc-nd/4.0/
