Robot see, Robot do: Bots learn by watching human behaviour
Robots following coded instructions to complete a task? Old school. Robots learning to do things by watching how humans do them? That’s the future.

Stanford’s Animesh Garg and Marynel Vázquez shared their research in a talk on “Generalisable Autonomy for Robotic Mobility and Manipulation” at the GPU Technology Conference recently. In lay terms, generalisable autonomy is the idea that a robot can observe human behaviour and learn to imitate it in a way that’s applicable to a variety of tasks and situations.

What kinds of situations? Learning to cook by watching YouTube videos, for one. And figuring out how to cross a crowded room, for another.

Cooking 101

Garg, a postdoctoral researcher at the Stanford Vision and Learning Lab (CVGL), likes to cook. He also likes robots. But what he’s not so keen on is a future full of robots who can only cook one recipe each. While the present is increasingly full of robots that excel at single tasks, Garg is working toward what he calls “the dream of general-purpose robots.”

The path to that dream may lie in neural task programming (NTP), a new approach to meta-learning. NTP leverages hierarchy and learns to program with a modular robot API to perform unseen tasks, working from only a single test example.

For instance, a robot chef would take a cooking video as input and use a hierarchical neural program to break the video data down into what Garg calls a structured representation of the task, based on visual cues as well as temporal sequence. Instead of learning a single recipe that’s only good for making spaghetti with meatballs, the robot understands all the subroutines, or components, that make up the task. From there, the budding mechanical chef can apply skills like boiling water, frying meatballs and simmering sauce to other situations; a rough sketch of this decomposition idea appears below.

Solving […]
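To make that decomposition idea a little more concrete, here is a minimal, non-authoritative sketch. In the real NTP system a learned network decides which sub-program to invoke next based on the demonstration and the robot’s current observation; in this toy version the hierarchy is hard-coded purely to show the structure, and the task names, recipe steps and RobotAPI methods are invented for illustration, not taken from Garg’s work.

```python
# Toy sketch: a task recursively expands into sub-tasks until only primitive
# calls on a modular robot API remain. In NTP this expansion is predicted by
# a learned hierarchical program from a single demonstration; here it is a
# hand-written dictionary (all names below are hypothetical).

class RobotAPI:
    """Stand-in for the modular low-level robot API that programs call into."""
    def execute(self, primitive, target):
        print(f"  [primitive] {primitive}({target})")

# Hypothetical hierarchy: each task maps to sub-tasks, bottoming out in
# (primitive, target) pairs the API can execute directly.
TASK_HIERARCHY = {
    "make_spaghetti": ["boil_water", "cook_pasta", "fry_meatballs", "simmer_sauce"],
    "boil_water":     [("fill", "pot"), ("heat", "pot")],
    "cook_pasta":     [("add", "pasta"), ("stir", "pot"), ("drain", "pot")],
    "fry_meatballs":  [("heat", "pan"), ("add", "meatballs"), ("flip", "meatballs")],
    "simmer_sauce":   [("add", "sauce"), ("heat", "pan")],
}

def run_task(task, api, depth=0):
    """Recursively expand a task into sub-tasks until primitives remain."""
    print("  " * depth + f"task: {task}")
    for step in TASK_HIERARCHY.get(task, []):
        if isinstance(step, tuple):   # leaf: call the robot API directly
            api.execute(*step)
        else:                         # internal node: recurse into the sub-task
            run_task(step, api, depth + 1)

if __name__ == "__main__":
    run_task("make_spaghetti", RobotAPI())
```

Running the script prints the nested task structure followed by the primitive API calls. The point of the hierarchy is reuse: a subroutine like boil_water, once learned, can be slotted into any other recipe rather than being baked into one monolithic spaghetti-with-meatballs policy.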