Deep Whole-Body Control: Learning a Unified Policy for Manipulation and Locomotion

Zipeng Fu*         Xuxin Cheng*         Deepak Pathak
Carnegie Mellon University
CoRL 2022 (Oral)

Whole-body control via a unified learned policy achieves agile and dynamic behaviors on our custom-built low-cost legged manipulator


An attached arm can significantly extend the applicability of legged robots to mobile manipulation tasks that are not possible for their wheeled or tracked counterparts. The standard control pipeline for such legged manipulators decouples the controller into separate manipulation and locomotion modules. However, this is ineffective: it requires immense engineering to coordinate the arm and legs, and errors can propagate across modules, causing non-smooth, unnatural motions. It is also biologically implausible, given the evidence for strong motor synergies across limbs. In this work, we propose to learn a unified policy for whole-body control of a legged manipulator using reinforcement learning. We propose Regularized Online Adaptation to bridge the Sim2Real gap for high-DoF control, and Advantage Mixing, which exploits the causal dependency in the action space to overcome local minima when training the whole-body system. We also present a simple design for a low-cost legged manipulator, and find that our unified policy can demonstrate dynamic and agile behaviors across several task setups.
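One way to picture Advantage Mixing is as a reweighting of per-task advantages inside a policy-gradient objective. The following is a minimal sketch, not the authors' implementation. It assumes separate manipulation and locomotion advantage estimates, a policy whose log-probability factors into an arm part and a leg part, and a weight `beta` annealed from 0 to 1. The function names, the linear schedule, and the plain (unclipped) surrogate are all illustrative assumptions.

```python
def mixing_weight(step, anneal_steps):
    """Linearly ramp beta from 0 to 1 over anneal_steps (assumed schedule)."""
    if anneal_steps <= 0:
        return 1.0
    return min(1.0, max(0.0, step / anneal_steps))


def mixed_advantages(adv_manip, adv_loco, beta):
    """Return per-sample (arm_advantage, leg_advantage).

    Arm actions are credited mainly with the manipulation advantage and
    leg actions mainly with the locomotion advantage; beta controls how
    much of the other task's advantage is blended in.
    """
    arm = [m + beta * l for m, l in zip(adv_manip, adv_loco)]
    leg = [beta * m + l for m, l in zip(adv_manip, adv_loco)]
    return arm, leg


def surrogate_loss(logp_arm, logp_leg, adv_manip, adv_loco, beta):
    """Plain policy-gradient surrogate, negated so it can be minimized."""
    arm_adv, leg_adv = mixed_advantages(adv_manip, adv_loco, beta)
    total = sum(
        la * aa + ll * al
        for la, ll, aa, al in zip(logp_arm, logp_leg, arm_adv, leg_adv)
    )
    return -total / len(logp_arm)
```

With `beta = 0` each action group is trained only on its own task's advantage, which gives a simpler credit-assignment problem early in training. As `beta` approaches 1, both groups see the sum of the two advantages, so the arm and legs are optimized jointly for both tasks.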

Project Video


Given only the gripper end-effector command and the velocity command for the quadruped, our trained unified policy controls all the joints of the legs and arm at 50 Hz without any fine-tuning in the real world.
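The interface described above (two commands in, targets for every joint out, at a fixed 50 Hz) can be sketched as a simple fixed-rate loop. This is a hedged illustration only. The joint counts, the command dimensions, the `read_state` and `send_targets` callbacks, and `placeholder_policy` are hypothetical stand-ins, not the robot's actual software stack.

```python
import time

NUM_LEG_JOINTS = 12  # assumption: 4 legs x 3 joints
NUM_ARM_JOINTS = 6   # assumption: a 6-DoF arm
CONTROL_HZ = 50      # control rate stated in the text


def placeholder_policy(observation):
    """Stand-in for the trained policy: zero targets for every joint."""
    return [0.0] * (NUM_LEG_JOINTS + NUM_ARM_JOINTS)


def build_observation(proprioception, ee_command, base_velocity_command):
    """Concatenate robot state with the two user commands (assumed layout)."""
    return list(proprioception) + list(ee_command) + list(base_velocity_command)


def run_control_loop(policy, read_state, send_targets, ee_command,
                     base_velocity_command, num_steps, hz=CONTROL_HZ,
                     sleep=time.sleep, clock=time.monotonic):
    """Query one policy for all joints at a fixed rate."""
    period = 1.0 / hz
    next_tick = clock()
    for _ in range(num_steps):
        obs = build_observation(read_state(), ee_command, base_velocity_command)
        action = policy(obs)
        # A single action vector covers both legs and arm; split it only
        # at the point where targets are sent to the hardware.
        send_targets(action[:NUM_LEG_JOINTS], action[NUM_LEG_JOINTS:])
        # Sleep until the next tick so the loop does not drift.
        next_tick += period
        remaining = next_tick - clock()
        if remaining > 0:
            sleep(remaining)
```

The point of the sketch is that there is a single policy call per tick. The leg and arm targets come from one forward pass rather than from two separately tuned controllers.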



Visual Tracking

The robot is tasked with tracking an AR tag, and the unified policy automatically adjusts its whole-body control to do so, for example by bending the legs in coordination with the arm to keep the camera near the tag.



Demonstration Replay

Here we show the agility of our robot, controlled by the unified policy, as it uses demonstrations to perform tasks.



  author    = {Fu, Zipeng and Cheng, Xuxin and Pathak, Deepak},
  title     = {Deep Whole-Body Control: Learning a Unified Policy for Manipulation and Locomotion},
  booktitle = {Conference on Robot Learning ({CoRL})},
  year      = {2022},