So You Think You Can Scale Up Autonomous Robot Data Collection?
Main Authors: | , , , , , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 04-11-2024 |
Subjects: | |
Summary: | A long-standing goal in robot learning is to develop methods for robots to
acquire new skills autonomously. While reinforcement learning (RL) comes with
the promise of enabling autonomous data collection, it remains challenging to
scale in the real world, partly due to the significant effort required for
environment design and instrumentation, including the need for designing reset
functions or accurate success detectors. On the other hand, imitation learning
(IL) methods require little to no environment design effort, but instead
require significant human supervision in the form of collected demonstrations.
To address these shortcomings, recent works in autonomous IL start with an
initial seed dataset of human demonstrations that an autonomous policy can
bootstrap from. While autonomous IL approaches come with the promise of
addressing the challenges of autonomous RL as well as pure IL strategies, in
this work, we posit that such techniques do not deliver on this promise and are
still unable to scale up autonomous data collection in the real world. Through
a series of real-world experiments, we demonstrate that these approaches, when
scaled up to realistic settings, face many of the same scaling challenges as
prior attempts in RL in terms of environment design. Further, we perform a
rigorous study of autonomous IL methods across different data scales and 7
simulation and real-world tasks, and demonstrate that while autonomous data
collection can modestly improve performance, simply collecting more human data
often provides significantly more improvement. Our work suggests a negative
result: that scaling up autonomous data collection for learning robot policies
for real-world tasks is more challenging and impractical than what is suggested
in prior work. We hope these insights about the core challenges of scaling up
data collection help inform future efforts in autonomous learning. |
---|---|
DOI: | 10.48550/arxiv.2411.01813 |