Autonomous Humanoid Workers
In the Focus Group “Autonomous Humanoid Workers”, Hans Fischer Senior Fellow Cordelia Schmid (Inria, France) works with her hosts Prof. Majid Khadiv (AI Planning in Dynamic Environments, TUM School of Computation, Information and Technology) and Prof. Daniel Cremers (Computer Vision and Artificial Intelligence, TUM School of Computation, Information and Technology).
The goal is for a humanoid robot to execute tasks successfully from natural language instructions alone, such as “clean the kitchen”, “remove debris”, or “unload building materials”, without the need for detailed step-by-step guidance. Humanoid robots have the potential to replace humans in such dangerous and repetitive tasks, yet despite more than 30 years of research, we still do not see a single example of such a deployment. What is still missing to realize an autonomous humanoid worker? We hypothesize that the main issue is the lack of a framework that scales to generate the wide range of skills these tasks require and that can quickly re-plan motions if something goes wrong.
Our key observation is that, since the morphologies of humans and humanoids are similar, we can leverage the vast amount of available human motion data to approximate planning from both an instruction-based and a visual perspective. Using these solutions as guidance will enable humanoid robots to plan long sequences of loco-manipulation tasks given high-level instructions. Hence, the main goal of this project is to leverage human videos from the Internet and retarget them to a humanoid robot. These retargeted trajectories are then used to train a vision-language-action model for humanoid robots.