Spatiotemporal air quality inference of low-cost sensor data: Evidence from multiple sensor testbeds
Published in: | Environmental modelling & software : with environment data news Vol. 149; p. 105306 |
---|---|
Main Authors: | , , , , , , |
Format: | Journal Article |
Language: | English |
Published: | Oxford; Elsevier Ltd; 01-03-2022; Elsevier Science Ltd |
Summary: | Recent advances in sensor and IoT technologies allow for denser and mobile air quality measurements. These measurements are still spatiotemporally sparse at the city level, but can be interpolated using data-driven techniques. This work presents validation results for two machine-learning models that infer air quality sensor data in both space and time. Temporal validation exercises are performed at available regulatory monitoring stations following the FAIRMODE protocol. Both models prove scalable to different mobile datasets, with comparable prediction performance for PM2.5 (R² = 0.68–0.75, MAE = 2.99–2.82 μg m⁻³) and NO₂ (R² = 0.8–0.82, MAE = 8.81–9.83 μg m⁻³) in Utrecht and Antwerp. In Oakland (US), we observed lower performance for NO₂ (R² = 0.46–0.41, MAE = 4.06–5.07) and BC (R² = 0.31–0.28, MAE = 0.48–0.27), likely caused by the less representative monitoring coverage. Although the two models are comparable in prediction performance, the Geographical Random Forest (GRF) model seems to achieve slightly better accuracy, while correlations are typically higher for the Air Variational Graph Autoencoder (AVGAE) model. This work demonstrates the potential of data-driven techniques for spatiotemporal air quality inference of complementary sensor data. The observed performance metrics approach those of current state-of-the-art chemical transport models while requiring far fewer resources and less computational power, infrastructure and processing time.
• Machine learning techniques can interpolate spatiotemporally sparse regulatory- and sensor-derived air quality data.
• We present model validation results on different mobile datasets from Antwerp (BE), Utrecht (NL) and Oakland (US).
• Following the FAIRMODE protocol, both models are shown to perform across different mobile datasets.
• This work demonstrates the potential of data-driven techniques for spatiotemporal air quality inference of sensor data.
• Ultimately, model performance still depends on the applied sensor performance and spatiotemporal monitoring coverage. |
---|---|
ISSN: | 1364-8152, 1873-6726 |
DOI: | 10.1016/j.envsoft.2022.105306 |