Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games?
Main Authors: | , , , , , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 07-11-2017 |
Subjects: | |
Summary: | Deep reinforcement learning has achieved many recent successes, but our
understanding of its strengths and limitations is hampered by the lack of rich
environments in which we can fully characterize optimal behavior, and
correspondingly diagnose individual actions against such a characterization.
Here we consider a family of combinatorial games, arising from work of Erdos,
Selfridge, and Spencer, and we propose their use as environments for evaluating
and comparing different approaches to reinforcement learning. These games have
a number of appealing features: they are challenging for current learning
approaches, but they form (i) a low-dimensional, simply parametrized
environment where (ii) there is a linear closed-form solution for optimal
behavior from any state, and (iii) the difficulty of the game can be tuned by
changing environment parameters in an interpretable way. We use these
Erdos-Selfridge-Spencer games not only to compare different algorithms, but
also to test for generalization, make comparisons to supervised learning,
analyse multiagent play, and even develop a self-play algorithm. Code can be found at:
https://github.com/rubai5/ESS_Game |
---|---|
DOI: | 10.48550/arxiv.1711.02301 |
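
The summary mentions a linear closed-form solution for optimal behavior but does not state it. The sketch below illustrates one way to read that claim, assuming the commonly described attacker-defender formulation of the game: pieces sit on levels 0 to K, the attacker splits them into two sets, the defender destroys one set, the survivors each move up one level, and the attacker wins if any piece reaches level K. Under that assumption a piece at level l is given weight 2^-(K-l), and the defender destroys the set with the larger total weight. The function names are hypothetical and are not taken from the linked repository.

```python
from fractions import Fraction

def potential(counts):
    """Total weight of a state, where counts[l] is the number of pieces at level l.

    Assumes levels run from 0 to K = len(counts) - 1, and that a piece at
    level l has weight 2^-(K - l). The result is linear in the counts.
    """
    K = len(counts) - 1
    return sum(Fraction(n, 2 ** (K - l)) for l, n in enumerate(counts))

def defender_choice(set_a, set_b):
    """Return "A" or "B": the set to destroy, chosen as the one with larger potential."""
    return "A" if potential(set_a) >= potential(set_b) else "B"

def advance(counts):
    """Move every surviving piece up one level (assumes no piece is already at level K)."""
    return [0] + counts[:-1]

# Example with K = 3: four pieces at level 0 and two pieces at level 1.
state = [4, 2, 0, 0]
print(potential(state))                 # total weight of the starting state

# The attacker proposes a split; the defender removes the heavier set.
set_a, set_b = [4, 0, 0, 0], [0, 1, 0, 0]
print(defender_choice(set_a, set_b))    # which set the defender destroys
print(advance(set_b))                   # survivors after moving up one level
```

Exact fractions are used so that comparisons against the threshold value 1 are not affected by floating-point rounding. Under the stated assumptions, advancing a surviving set doubles its potential, which is why removing the heavier half is the natural defender move.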