Map-based experience replay: a memory-efficient solution to catastrophic forgetting in reinforcement learning

Bibliographic Details
Published in:	Frontiers in neurorobotics Vol. 17; p. 1127642
Main Authors:	Hafez, Muhammad Burhan; Immisch, Tilman; Weber, Tom; Wermter, Stefan
Format:	Journal Article
Language:	English
Published:	Switzerland: Frontiers Research Foundation, 27-06-2023; Frontiers Media S.A
Subjects:	Algorithms; Catastrophic forgetting; Cognitive ability; Cognitive robotics; Continual learning; Decision making; Experience replay; Growing self-organizing maps; Learning; Memory; Neuroscience; Reinforcement; Reinforcement learning

Description
Summary:	Deep reinforcement learning (RL) agents often suffer from catastrophic forgetting: they forget previously found solutions in parts of the input space when training on new data. Replay memories are a common solution to the problem, decorrelating and shuffling old and new training samples. They naively store state transitions as they arrive, without regard for redundancy. We introduce a novel cognitive-inspired replay memory approach based on the Grow-When-Required (GWR) self-organizing network, which resembles a map-based mental model of the world. Our approach organizes stored transitions into a concise environment-model-like network of state nodes and transition edges, merging similar samples to reduce the memory size and increase the pairwise distance among samples, which increases the relevancy of each sample. Overall, our study shows that map-based experience replay allows for significant memory reduction with only small decreases in performance.
Bibliography:	Edited by: Nicolás Navarro-Guerrero, L3S Research Center, Germany. These authors share first authorship. Reviewed by: Andreas Schweiger, Airbus, Netherlands; Tongle Zhou, Nanjing University of Aeronautics and Astronautics, China; Guangda Chen, Zhejiang University, China.
ISSN:	1662-5218
DOI:	10.3389/fnbot.2023.1127642
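
Illustration (not part of the record): the summary describes a replay memory that organizes transitions into state nodes and transition edges and merges similar samples. The following is a minimal sketch of that general idea only, not the authors' GWR-based algorithm. It assumes discrete (hashable) actions, and replaces the Grow-When-Required update with a plain nearest-node merge under a fixed distance threshold. The class name `MapReplayMemory` and the parameter `merge_radius` are hypothetical.

```python
import math
import random


class MapReplayMemory:
    """Toy map-style replay memory: state nodes plus transition edges.

    Simplified stand-in for illustration; not the GWR network from the paper.
    """

    def __init__(self, merge_radius):
        self.merge_radius = merge_radius  # hypothetical similarity threshold
        self.nodes = []   # node index -> representative state (list of floats)
        self.counts = []  # node index -> number of states merged into the node
        self.edges = {}   # (src, action, dst) -> (visit count, mean reward)

    def _nearest(self, state):
        # Linear scan for the closest node by Euclidean distance.
        best, best_dist = None, math.inf
        for i, node in enumerate(self.nodes):
            d = math.dist(node, state)
            if d < best_dist:
                best, best_dist = i, d
        return best, best_dist

    def _assign(self, state):
        # Create a new node if nothing is close enough, otherwise merge.
        idx, dist = self._nearest(state)
        if idx is None or dist > self.merge_radius:
            self.nodes.append(list(state))
            self.counts.append(1)
            return len(self.nodes) - 1
        # Merge: move the node to the running mean of its assigned states.
        self.counts[idx] += 1
        n = self.counts[idx]
        self.nodes[idx] = [w + (s - w) / n for w, s in zip(self.nodes[idx], state)]
        return idx

    def add(self, state, action, reward, next_state):
        # Map both endpoints to nodes, then update the edge statistics.
        src = self._assign(state)
        dst = self._assign(next_state)
        key = (src, action, dst)
        count, mean_r = self.edges.get(key, (0, 0.0))
        count += 1
        self.edges[key] = (count, mean_r + (reward - mean_r) / count)

    def sample(self, batch_size, rng=random):
        # Draw edges uniformly (with replacement) and rebuild transitions
        # from the node representatives.
        keys = list(self.edges)
        batch = []
        for src, action, dst in rng.choices(keys, k=batch_size):
            _, reward = self.edges[(src, action, dst)]
            batch.append((self.nodes[src], action, reward, self.nodes[dst]))
        return batch
```

In this sketch, two nearby transitions such as `([0.0, 0.0], 1, 1.0, [1.0, 0.0])` and `([0.1, 0.0], 1, 3.0, [1.0, 0.1])` with `merge_radius=0.5` collapse into two nodes and a single edge, which is the source of the memory saving the summary refers to. A larger threshold merges more aggressively and stores fewer, more widely spaced samples.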