Toyota Research Institute Unveils Breakthrough in Teaching Robots New Behaviors

Today, Toyota Research Institute (TRI) announced a breakthrough generative AI approach based on Diffusion Policy to quickly and confidently teach robots new, dexterous skills. This advancement significantly improves robot utility and is a step towards building "Large Behavior Models (LBMs)" for robots, analogous to the Large Language Models (LLMs) that have recently revolutionized conversational AI.


"Our research in robotics is aimed at amplifying people rather than replacing them," said Gill Pratt, CEO of TRI and Chief Scientist for Toyota Motor Corporation. "This new teaching technique is both very efficient and produces very high performing behaviors, enabling robots to much more effectively amplify people in many ways."


Previous state-of-the-art techniques to teach robots new behaviors were slow, inconsistent, inefficient, and often limited to narrowly defined tasks performed in highly constrained environments. Roboticists needed to spend many hours writing sophisticated code and/or using numerous trial and error cycles to program behaviors.


TRI has already taught robots more than 60 difficult, dexterous skills using the new approach, including pouring liquids, using tools, and manipulating deformable objects. These achievements were realized without writing a single line of new code; the only change was supplying the robot with new data. Building on this success, TRI has set an ambitious target of teaching hundreds of new skills by the end of the year and 1,000 by the end of 2024.


Today's news also highlights that robots can be taught to function in new scenarios and perform a wide range of behaviors. These skills are not limited to "pick and place," that is, simply picking up objects and putting them down in new locations. TRI's robots can now interact with the world in varied and rich ways, which will one day allow robots to support people in everyday situations and unpredictable, ever-changing environments.


"The tasks that I'm watching these robots perform are simply amazing – even one year ago, I would not have predicted that we were close to this level of diverse dexterity," remarked Russ Tedrake, Vice President of Robotics Research at TRI. Dr. Tedrake, who is also the Toyota Professor of Electrical Engineering and Computer Science, Aeronautics and Astronautics, and Mechanical Engineering at MIT, explained, "What is so exciting about this new approach is the rate and reliability with which we can add new skills. Because these skills work directly from camera images and tactile sensing, using only learned representations, they are able to perform well even on tasks that involve deformable objects, cloth, and liquids — all of which have traditionally been extremely difficult for robots."


Technical details:

TRI's robot behavior model learns from haptic demonstrations from a teacher, combined with a language description of the goal. It then uses an AI-based Diffusion Policy to learn the demonstrated skill. This process allows a new behavior to be deployed autonomously from dozens of demonstrations. Not only does this approach produce consistent, repeatable, and performant results, but it does so with tremendous speed.
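The core idea of diffusion-based action sampling can be illustrated with a toy sketch: start from pure noise and iteratively refine it toward a demonstrated action. The code below is a deliberately simplified, hypothetical illustration using NumPy; TRI's actual Diffusion Policy uses a learned neural denoiser conditioned on camera images and tactile sensing, not the hand-coded update shown here.

```python
import numpy as np

def sample_action(target_action, num_steps=50, rng=None):
    """Toy diffusion-style sampler: refine random noise into an action.

    `target_action` stands in for what a learned denoiser would predict;
    in a real Diffusion Policy, that prediction comes from a neural network
    conditioned on the robot's observations.
    """
    rng = rng or np.random.default_rng(0)
    action = rng.normal(size=target_action.shape)  # start from pure noise
    for t in range(num_steps, 0, -1):
        # Move partway toward the (stand-in) denoised prediction.
        action = action + (target_action - action) / t
        # Re-inject a little noise, annealed to zero by the final step.
        action = action + rng.normal(size=action.shape) * 0.1 * (t - 1) / num_steps
    return action
```

At the final step (t = 1) the update lands exactly on the denoiser's prediction, mirroring how diffusion samplers converge to a clean sample as the noise schedule reaches zero.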


Key achievements of TRI's research behind this development include:

  • Diffusion Policy: TRI and our collaborators in Professor Song's group at Columbia University developed a new, powerful generative-AI approach to behavior learning. This approach, called Diffusion Policy, enables easy and rapid behavior teaching from demonstration. 
  • Customized Robot Platform: TRI's robot platform is custom-built for dexterous dual-arm manipulation tasks with a special focus on enabling haptic feedback and tactile sensing. 
  • Pipeline: TRI robots have learned 60 dexterous skills already, with a target of hundreds by the end of the year and 1,000 by the end of 2024. 
  • Drake: Part of our (not so) secret sauce is Drake, a model-based design framework for robotics that provides us with a cutting-edge toolbox and simulation platform. Drake's high degree of realism allows us to develop in both simulation and reality at a dramatically greater scale and velocity than would otherwise be possible. Our internal robot stack is built using Drake's optimization and systems frameworks, and we have made Drake open source to catalyze work across the entire robotics community.
  • Safety: Safety is core to our robotics efforts at TRI. We have designed our system with strong safeguards, powered by Drake and our custom robot control stack, to ensure our robots respect safety guarantees, such as not colliding with themselves or their environment.
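The kind of pre-execution safeguard described in the safety bullet can be sketched as a simple gate that rejects commands violating joint limits or clearance requirements. The snippet below is purely illustrative: the limit values, the function name, and the scalar clearance input are all assumptions, and TRI's real system relies on Drake's collision queries and its own control stack rather than anything this simple.

```python
import numpy as np

# Illustrative limits for a 7-joint arm (assumed values, in radians).
JOINT_LIMITS = np.array([[-2.9, 2.9]] * 7)
# Required clearance to the nearest obstacle (assumed value, in meters).
MIN_CLEARANCE = 0.02

def command_is_safe(joint_command, clearance_to_nearest_obstacle):
    """Return True only if the command respects joint limits and keeps
    the arm a minimum distance from obstacles."""
    within_limits = np.all(
        (joint_command >= JOINT_LIMITS[:, 0])
        & (joint_command <= JOINT_LIMITS[:, 1])
    )
    has_clearance = clearance_to_nearest_obstacle >= MIN_CLEARANCE
    return bool(within_limits and has_clearance)
```

In a production stack, such a check would run inside the real-time control loop, with the clearance value supplied by a collision-distance query against a geometric model of the robot and its workspace.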


Diffusion Policy was published at the 2023 Robotics: Science and Systems (RSS) conference. Additional technical information can be found on TRI's Medium blog.

Please join our LinkedIn Live Q&A session on October 4th from 1 pm - 2 pm ET / 10 am - 11 am PT, for an opportunity to learn more and hear directly from the TRI robotics research team. Sign up for the event on TRI's LinkedIn page.


About Toyota Research Institute

Toyota Research Institute (TRI) conducts research to amplify human ability, focusing on making our lives safer and more sustainable. Led by Dr. Gill Pratt, TRI's team of researchers develops technologies to advance energy and materials, human-centered artificial intelligence, human interactive driving, machine learning, and robotics. Established in 2015, TRI has offices in Los Altos, California, and Cambridge, Massachusetts. For more information about TRI, please visit http://tri.global.

