Assessment of soil salinity using explainable machine learning methods and Landsat 8 images

Bibliographic Details
Published in: International journal of applied earth observation and geoinformation Vol. 130; p. 103879
Main Authors: Aksoy, Samet; Sertel, Elif; Roscher, Ribana; Tanik, Aysegul; Hamzehpour, Nikou
Format: Journal Article
Language: English
Published: Elsevier B.V., 01-06-2024
Description
Summary:
• ML-based soil salinity estimations were explained using Shapley additive explanations.
• The importance of features derived from Landsat 8 images for soil salinity was investigated.
• Over-sampling methods were implemented to mitigate the effects of unbalanced data on the models.
• Feature importance varies from model to model, but not as substantially as from area to area.

The aim of this study is to compare the performance of machine learning (ML) algorithms for modeling soil salinity using field-based electrical conductivity (EC) data and Landsat-8 OLI satellite images with derived environmental covariates. We also aim to interpret and explain the ML models, with and without over-sampling methods, using Shapley additive explanations (SHAP) values, an explainable ML approach that has not previously been applied to soil salinity estimation. We investigate two case study areas in the western and southeastern Lake Urmia Playas (LUP) in northwestern Iran. Our study uses 26 environmental covariates; two ML models, namely extreme gradient boosting (XGBoost) and random forest (RF); and two over-sampling methods, namely the synthetic minority over-sampling technique (SMOTE) and random over-sampling (ROS). Results indicate that XGBoost outperforms RF in terms of both R² and RMSE. Additionally, visual interpretation of the soil salinity maps demonstrated the superiority of XGBoost. SMOTE produced better results than both ROS and the test cases without over-sampling. Finally, SHAP analysis showed that vegetation indices made a greater contribution to the soil salinity prediction in the West LUP, while visible bands contributed more in the Southeast LUP region.
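As a rough illustration of what a Shapley-value attribution computes, the following minimal sketch evaluates exact Shapley values for a three-feature toy function. It is not the authors' pipeline: `toy_model`, its coefficients, the feature names (`ndvi`, `red`, `blue`), and the input and baseline values are all hypothetical, and replacing "absent" features with a single baseline point is a simplification of how SHAP libraries average over background data.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline point.

    Features outside a coalition take their baseline value (an illustrative
    simplification, not the estimator used by SHAP libraries).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):  # coalition sizes 0 .. n-1
            # Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_model(features):
    # Hypothetical stand-in for a trained salinity model (not from the paper).
    ndvi, red, blue = features
    return 10.0 - 6.0 * ndvi + 4.0 * red + 1.0 * red * blue


x = [0.2, 0.5, 0.4]          # hypothetical pixel to explain
baseline = [0.5, 0.3, 0.3]   # hypothetical reference pixel
phi = shapley_values(toy_model, x, baseline)

for name, value in zip(["ndvi", "red", "blue"], phi):
    print(f"{name}: {value:+.4f}")
```

The contributions sum to the difference between the prediction at `x` and the prediction at `baseline`, which is the property that lets per-feature contributions be compared across models and study areas.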
ISSN: 1569-8432
DOI: 10.1016/j.jag.2024.103879