Learning Options From Demonstrations: A Pac-Man Case Study

Bibliographic Details
Published in:	IEEE Transactions on Games, Vol. 10, no. 1, pp. 91-96
Main Authors:	Tamassia, Marco; Zambetta, Fabio; Raffe, William L.; Mueller, Florian; Li, Xiaodong
Format:	Journal Article
Language:	English
Published:	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01-03-2018
Subjects:	Algorithm design and analysis; Algorithms; Computer & video games; Data mining; Decisions; Games; Learning (artificial intelligence); Learning from demonstration; Machine learning; Options framework; Performance enhancement; Reinforcement learning (RL); Robotics; Robots; Temporal difference learning; Training; Trajectory
Description
Summary:	Reinforcement learning (RL) is a machine learning paradigm behind many successes in games, robotics, and control applications. RL agents improve through trial-and-error, therefore undergoing a learning phase during which they perform suboptimally. Research effort has been put into optimizing behavior during this period, to reduce its duration and to maximize after-learning performance. We introduce a novel algorithm that extracts useful information from expert demonstrations (traces of interactions with the target environment) and uses it to improve performance. The algorithm detects unexpected decisions made by the expert and infers what goal the expert was pursuing. Goals are then used to bias decisions while learning. Our experiments in the video game Pac-Man provide statistically significant evidence that our method can improve final performance compared to a state-of-the-art approach.
ISSN:	2475-1502, 2475-1510
DOI:	10.1109/TCIAIG.2017.2658659