Ensembles of Neural Networks for Robust Reinforcement Learning

Bibliographic Details
Published in:	2010 Ninth International Conference on Machine Learning and Applications, pp. 401-406
Main Authors:	Hans, A.; Udluft, S.
Format:	Conference Proceeding
Language:	English
Published:	IEEE 01-12-2010
Subjects:	Approximation algorithms; Artificial neural networks; Ensemble methods; Function approximation; Network topology; Neural fitted Q-iteration; Neural networks; Neurons; Reinforcement learning; Robustness; Training

Description
Summary:	Reinforcement learning algorithms that employ neural networks as function approximators have proven to be powerful tools for solving optimal control problems. However, their training and the validation of final policies can be cumbersome, as neural networks can suffer from problems like local minima or overfitting. When using iterative methods, such as neural fitted Q-iteration, the problem becomes even more pronounced, since the network has to be trained multiple times and the training process in one iteration builds on the network trained in the previous iteration. Therefore, errors can accumulate. In this paper we propose to use ensembles of networks to make the learning process more robust and produce near-optimal policies more reliably. We name various ways of combining single networks into an ensemble that yields a final ensemble policy and show the potential of the approach using a benchmark application. Our experiments indicate that majority voting is superior to Q-averaging and that using heterogeneous ensembles (different network topologies) is advisable.
ISBN:	1424492114, 9781424492114
DOI:	10.1109/ICMLA.2010.66
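
Illustration:	The summary contrasts two ways of turning an ensemble of Q-networks into a single policy: Q-averaging and majority voting. The sketch below shows one common reading of those two rules. It is not taken from the paper. The function names, the toy Q-values, and the tie-breaking rule (lowest action index wins) are all assumptions made for illustration.

```python
import numpy as np


def q_averaging_action(q_values):
    """Pick the action with the highest mean Q-value across networks.

    q_values: array of shape (n_networks, n_actions) holding each
    network's Q-estimates for one state (hypothetical input format).
    """
    return int(np.argmax(q_values.mean(axis=0)))


def majority_voting_action(q_values):
    """Let each network vote for its own greedy action; most votes wins.

    Ties go to the lowest action index (an assumption of this sketch).
    """
    greedy = np.argmax(q_values, axis=1)  # one vote per network
    votes = np.bincount(greedy, minlength=q_values.shape[1])
    return int(np.argmax(votes))


# Toy Q-values for one state, three networks, two actions (made up).
# The third network strongly overestimates action 1.
q = np.array([
    [1.0, 0.9],
    [1.0, 0.9],
    [0.0, 5.0],
])

avg_choice = q_averaging_action(q)       # mean Q is about [0.67, 2.27]
vote_choice = majority_voting_action(q)  # votes are [2, 1]
```

In this made-up case a single outlier network flips the averaged decision to action 1, while the vote stays with action 0. That is one plausible intuition for why voting might be more robust, not a result reported in the record above.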