Oil palm yield prediction across blocks from multi-source data using machine learning and deep learning
Published in: | Earth science informatics Vol. 15; no. 4; pp. 2349 - 2367 |
---|---|
Format: | Journal Article |
Language: | English |
Published: | Berlin/Heidelberg: Springer Berlin Heidelberg (Springer Nature B.V), 01-12-2022 |
Summary: | Crop yield estimates are affected by various factors, including weather, nutrients and management practices. Predicting yields on a large scale in a timely and accurate manner by considering these factors is essential for preventing climate risk and ensuring food security, particularly in light of climate change and the escalation of extreme climatic events. In this study, we integrated multi-source data (satellite-derived vegetation indices (VIs), satellite-derived climatic variables (land surface temperature (LST) and rainfall), weather station records and field surveys) to build one multiple linear regression (MLR) model, three machine learning models (XGBoost, support vector regression (SVR) and random forest (RF)) and one deep learning model (deep neural network, DNN) to predict oil palm yield at block level within an oil palm plantation. A time-series moving average and backward elimination feature selection were applied at the pre-processing stage. The models were developed and tested, their performance was compared using evaluation metrics, and the final spatial prediction map was generated from the best-performing model. The DNN achieved the best performance with both the selected predictors (R² = 0.91; RMSE = 2.92 t ha⁻¹; MAE = 2.56 t ha⁻¹; MAPE = 0.09 t ha⁻¹) and the full predictors (R² = 0.76; RMSE = 3.03 t ha⁻¹; MAE = 2.88 t ha⁻¹; MAPE = 0.10 t ha⁻¹). In addition, advanced ensemble machine learning (ML) techniques such as XGBoost may be used as a supplementary approach for oil palm yield prediction at the block level. MLR recorded the lowest performance of the five models. Using backward elimination to identify the most significant predictors improved all models: R² increased by 5–26%, while RMSE decreased by 3–31%, MAE by 7–34% and MAPE by 1–15%. After backward elimination, the DNN achieved the highest prediction accuracy of all the models, with a 14% increase in R², an 11% decrease in RMSE, a 32% decrease in MAE and a 1% decrease in MAPE. Our study developed efficient and accurate models for the timely prediction of oil palm yield over a large area by integrating data from multiple sources. These models would help plantation managers estimate oil palm yields and speed up decision-making for sustainable production. |
---|---|
ISSN: | 1865-0473 1865-0481 |
DOI: | 10.1007/s12145-022-00882-9 |
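
The summary compares models by R², RMSE, MAE and MAPE. As a reading aid, the sketch below shows how these four metrics are commonly computed from paired observed and predicted yields. It is not the authors' code: the function name `regression_metrics` and the sample yields are hypothetical, and MAPE is computed here as a unitless fraction, which is an assumption about the convention used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE and MAPE for paired observed/predicted values.

    Hypothetical helper for illustration; not taken from the article.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        # MAPE as a fraction; observed values must be non-zero.
        "mape": sum(abs(r / t) for r, t in zip(residuals, y_true)) / n,
    }


# Made-up block-level yields in t/ha, for illustration only.
observed = [20.0, 25.0, 30.0, 35.0]
predicted = [22.0, 24.0, 29.0, 37.0]
metrics = regression_metrics(observed, predicted)
```

RMSE and MAE carry the units of the target (here t ha⁻¹), whereas R² and MAPE, as computed in this sketch, are dimensionless.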