Multi-step forecasting for big data time series based on ensemble learning
Published in: | Knowledge-based systems Vol. 163; pp. 830 - 841 |
---|---|
Main Authors: | , , , , |
Format: | Journal Article |
Language: | English |
Published: | Amsterdam: Elsevier B.V (Elsevier Science Ltd), 01-01-2019 |
Summary: | This paper presents ensemble models for forecasting big data time series. An ensemble composed of three methods (decision tree, gradient boosted trees and random forest) is proposed due to the good results these methods have achieved in previous big data applications. The weights of the ensemble are computed by a weighted least squares method. Two strategies for updating the weights are considered, leading to a static or a dynamic ensemble model. The predictions of each ensemble member are obtained by dividing the forecasting problem into h sub-problems, one for each value of the prediction horizon. These sub-problems are solved using machine learning algorithms from the big data engine Apache Spark, ensuring the scalability of the methodology. The performance of the proposed ensemble models is evaluated on Spanish electricity consumption data covering 10 years measured with a 10-minute frequency. The results show that both the dynamic and the static ensembles perform well, outperforming the individual members they combine. The dynamic ensemble is the most accurate model, achieving an MRE of 2%, which is a very promising result for the prediction of big time series. The proposed ensembles are also evaluated on Australian solar power data covering two years measured with a 30-minute frequency. The results compare favorably with Artificial Neural Networks, Pattern Sequence-based Forecasting and Deep Learning, improving on their results. |
---|---|
ISSN: | 0950-7051 1872-7409 |
DOI: | 10.1016/j.knosys.2018.10.009 |
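
The summary above outlines two concrete ideas: solving the h-step-ahead problem as h separate regression sub-problems, and combining decision tree, gradient boosted trees and random forest forecasts with weights fitted by least squares on held-out data. The sketch below illustrates both on a synthetic series using scikit-learn rather than the Apache Spark MLlib implementation described in the paper; names such as `make_lagged_matrix`, `window` and `horizon` are illustrative choices, and plain least squares stands in for the paper's weighted least squares.

```python
# Minimal sketch (not the authors' Spark code): multi-step forecasting by
# splitting the horizon into h sub-problems and combining three tree-based
# learners with per-step weights fitted by least squares on a validation set.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor


def make_lagged_matrix(series, window, horizon):
    """Build (X, Y): each row uses `window` past values to predict the next `horizon` values."""
    X, Y = [], []
    for t in range(len(series) - window - horizon + 1):
        X.append(series[t:t + window])
        Y.append(series[t + window:t + window + horizon])
    return np.array(X), np.array(Y)


def fit_members(X, Y):
    """Train one model per ensemble member and per horizon step (h sub-problems)."""
    members = []
    for make_model in (lambda: DecisionTreeRegressor(max_depth=8),
                       lambda: GradientBoostingRegressor(),
                       lambda: RandomForestRegressor(n_estimators=100)):
        per_step = [make_model().fit(X, Y[:, k]) for k in range(Y.shape[1])]
        members.append(per_step)
    return members


def member_predictions(members, X):
    """Stack member forecasts into shape (n_samples, horizon, n_members)."""
    preds = [np.column_stack([m.predict(X) for m in per_step]) for per_step in members]
    return np.stack(preds, axis=-1)


def least_squares_weights(P, Y):
    """Combination weights per horizon step, fitted by least squares on validation data."""
    horizon, n_members = P.shape[1], P.shape[2]
    W = np.zeros((horizon, n_members))
    for k in range(horizon):
        W[k], *_ = np.linalg.lstsq(P[:, k, :], Y[:, k], rcond=None)
    return W


# Toy usage on a synthetic daily-seasonal series (a stand-in for the electricity data):
# 144 samples per day mimics a 10-minute sampling frequency.
rng = np.random.default_rng(0)
series = np.sin(np.arange(5000) * 2 * np.pi / 144) + 0.1 * rng.standard_normal(5000)
X, Y = make_lagged_matrix(series, window=144, horizon=24)
split = int(0.8 * len(X))
members = fit_members(X[:split], Y[:split])
P_val = member_predictions(members, X[split:])
W = least_squares_weights(P_val, Y[split:])       # weights fitted once: "static" ensemble
combined = np.einsum('nhm,hm->nh', P_val, W)      # weighted sum of the three members' forecasts
print("Validation MAE:", np.mean(np.abs(combined - Y[split:])))
```

In the paper's terms, the block above corresponds to the static ensemble, where the combination weights are fitted once; a dynamic variant would refit `W` periodically on a sliding window of the most recent observations as new data arrive.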