Planning in Learned Latent Action Spaces for Generalizable Legged Locomotion
Published in: IEEE Robotics and Automation Letters, Vol. 6, No. 2, pp. 2682-2689
Main Authors: , , , , ,
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01-04-2021
Summary: Hierarchical learning has been successful at learning generalizable locomotion skills on walking robots in a sample-efficient manner. However, the low-dimensional "latent" action used to communicate between the two layers of the hierarchy is typically user-designed. In this letter, we present a fully-learned hierarchical framework that jointly learns the low-level controller and the high-level latent action space. Once this latent space is learned, we plan over continuous latent actions in a model-predictive control fashion, using a learned high-level dynamics model. The framework generalizes to multiple robots, and we present results on a Daisy hexapod simulation, an A1 quadruped simulation, and Daisy robot hardware. We compare against a range of learned hierarchical approaches from the literature, and show that our framework outperforms these baselines on multiple tasks in both simulations. In addition to learning-based approaches, we also compare to inverse kinematics (IK) acting on desired robot motion, and show that our fully-learned framework outperforms IK in adverse settings in both the A1 and Daisy simulations. On hardware, we show the Daisy hexapod achieving multiple locomotion tasks in an unstructured outdoor setting with only 2000 hardware samples, reinforcing the robustness and sample-efficiency of our approach.
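The summary describes planning over continuous latent actions in a model-predictive-control fashion using a learned high-level dynamics model. The sketch below illustrates one common way such a planner can be realized, via cross-entropy-method (CEM) sampling over latent-action sequences; the names `latent_dynamics`, `cost_fn`, and all hyperparameters are illustrative assumptions for this sketch, not taken from the paper.

```python
import numpy as np

# Hypothetical stand-ins for the paper's learned components (assumptions):
#   latent_dynamics(state, z) -> predicted next high-level state
#   cost_fn(state, goal)      -> scalar task cost (e.g., distance to goal)

def plan_latent_mpc(state, goal, latent_dynamics, cost_fn,
                    latent_dim=2, horizon=5, n_samples=200,
                    n_elites=20, n_iters=3):
    """Cross-entropy-method planner over a continuous latent action space.

    Samples latent-action sequences, rolls them through the learned
    high-level dynamics model, and refits a Gaussian to the lowest-cost
    ("elite") sequences. Returns the first latent action of the fitted
    plan, to be executed by the learned low-level controller.
    """
    mu = np.zeros((horizon, latent_dim))
    sigma = np.ones((horizon, latent_dim))
    for _ in range(n_iters):
        # Sample candidate latent-action sequences around the current mean.
        z_seqs = mu + sigma * np.random.randn(n_samples, horizon, latent_dim)
        costs = np.zeros(n_samples)
        for i in range(n_samples):
            s = state
            for t in range(horizon):
                s = latent_dynamics(s, z_seqs[i, t])  # learned-model rollout
                costs[i] += cost_fn(s, goal)
        # Refit the sampling distribution to the elite sequences.
        elites = z_seqs[np.argsort(costs)[:n_elites]]
        mu, sigma = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mu[0]  # first latent action of the (approximately) best plan
```

In a receding-horizon loop, the returned latent action would be handed to the learned low-level controller, the robot advances one high-level step, and planning repeats from the new state.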
ISSN: 2377-3766
DOI: 10.1109/LRA.2021.3062342