Robot Control: Stanford Shows the Way …With Images

The control of robotic systems typically involves writing a program in a suitable language – such as C++ or Java – that specifies the robot's exact movements so that it knows where to go. This is how robotic operations have been carried out since the advent of robots, but all of that is now about to change with a new system being developed by Stanford University and the Google DeepMind team.

The proposal is a system in which suitably designed robots follow sketches that depict their path and operations, rather than lines of code. If it works, it will be very exciting.

Follow the Lines, Not the Code

Following months of research, the team has developed a system called RT-Sketch, which accepts its goals in a logical, diagrammatic form rather than through the traditional coded method.

The team decided to investigate this mode of control because the demands being placed on robot systems are increasingly complex, and it has reached the point where they are difficult to express in code alone. Getting robots to do the things we now need them to do would take huge amounts of code, which is not only tedious to write but also more prone to bugs and other problems.

However, if a robot can be trained to follow a diagram or schematic, then it opens up a whole new swath of possibilities in robotics.

Imitation is the Key

The research team started off by developing a new imitation learning (IL) system that allowed them to show the robot what was being asked of it, rather than programming it in explicitly. And the images did not need to be highly precise: the team found that they could control the system with simple hand-drawn sketches.

To train RT-Sketch, the group selected episodes from a collection of previously recorded robot trajectories. The final observation of each trajectory was treated as the goal image and converted into a goal sketch – an edge-detected version, a colourised version, or one produced by a generative adversarial network (GAN). These sketches were then linked to the recorded trajectories to give the robot a set of tasks, and RT-Sketch learned to take a goal sketch as input and produce the corresponding actions. Training on the several input types was intended to make the system tolerant of different levels of detail in the drawings.
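
To make this concrete, here is a minimal Python sketch of the simplest of those conversions: turning the final camera frame of a recorded trajectory into an edge-only goal drawing. It uses OpenCV's Canny edge detector; the file names and thresholds are illustrative placeholders of our own, not values taken from the RT-Sketch work.

```python
import cv2

def observation_to_goal_sketch(goal_image_path: str, out_path: str) -> None:
    """Convert the final camera frame of a recorded trajectory into a
    rough, edge-only 'goal sketch' that a policy could be trained against."""
    frame = cv2.imread(goal_image_path)              # final observation (BGR)
    if frame is None:
        raise FileNotFoundError(goal_image_path)
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # edge detection works on greyscale
    blurred = cv2.GaussianBlur(grey, (5, 5), 0)      # suppress sensor noise
    edges = cv2.Canny(blurred, 50, 150)              # illustrative thresholds
    sketch = cv2.bitwise_not(edges)                  # dark strokes on a white background
    cv2.imwrite(out_path, sketch)

if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    observation_to_goal_sketch("final_observation.png", "goal_sketch.png")
```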

By working in this way, it is possible to get the robot to replicate a number of tasks that would be very time-consuming to specify in written code. It is simplicity itself.

Tasks Made Easy

Once the training tasks had been compiled and completed, the team found that the robot could identify shapes from even simple drawings, and could use those to spatially align objects within the area. For example, suppose that you wanted a robotic arm to lay a dinner table. Once you have shown the robot what the table looks like, and defined its boundaries with a simple sketch, the next tasks follow on quite easily. 

First, you may want to remove some items from a drawer at the front. So, you make a sketch of the drawer. The robot locates and opens the drawer, and can then follow the next series of tasks aimed at removing things from the drawer and setting them on the table in fairly specific locations.
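
To give a feel for how a single sketch-conditioned step might be wired up, here is a hypothetical Python control loop. The policy class, success check, and robot interfaces below are stand-in stubs of our own, not the actual RT-Sketch code, but they show the shape of the loop: read the camera, ask the policy for an action conditioned on the goal sketch, and repeat until the goal looks reached.

```python
import numpy as np

class SketchConditionedPolicy:
    """Hypothetical stand-in for a goal-sketch-conditioned policy.
    The real RT-Sketch model is a learned network; this stub returns
    zero actions so the loop structure below is runnable."""

    def act(self, camera_image: np.ndarray, goal_sketch: np.ndarray) -> np.ndarray:
        return np.zeros(7)  # e.g. a 6-DoF arm delta plus a gripper command

def goal_reached(camera_image: np.ndarray, goal_sketch: np.ndarray) -> bool:
    """Placeholder success check; a real system would compare the observed
    scene against the sketched goal."""
    return False

def execute_sketch_goal(policy, get_camera_image, send_action, goal_sketch, max_steps=200):
    """Run the policy until the sketched goal is judged reached or we time out."""
    for _ in range(max_steps):
        image = get_camera_image()
        if goal_reached(image, goal_sketch):
            return True
        send_action(policy.act(image, goal_sketch))
    return False

if __name__ == "__main__":
    # Dummy camera and actuator so the loop can be exercised without hardware.
    dummy_camera = lambda: np.zeros((256, 256, 3), dtype=np.uint8)
    dummy_actuator = lambda action: None
    drawer_sketch = np.zeros((256, 256), dtype=np.uint8)  # stand-in for the drawn drawer
    execute_sketch_goal(SketchConditionedPolicy(), dummy_camera, dummy_actuator,
                        drawer_sketch, max_steps=5)
```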

From that point it is possible to build up increasingly complex scenarios and eventually end up with a fully set table. The robotic arm completes each task based only on a simple diagram. Because the drawings use only rudimentary shapes, the robot relies on its cameras to work out what each drawn shape represents. Provided that there are not too many conflicting shapes, the robot usually gets it right.
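
Chaining the sketched goals could then be as simple as running them one after another. The sketch below assumes a helper like the execute loop shown above, already bound to a policy, camera, and actuator, and uses made-up file names for the goal drawings.

```python
import cv2

def run_sketch_sequence(execute_goal, sketch_paths):
    """Work through a list of hand-drawn goal sketches in order, e.g.
    'drawer open', 'cutlery on the table', 'table fully set'.
    `execute_goal` is any callable that takes a goal sketch and returns
    True once the scene matches it."""
    for path in sketch_paths:
        goal = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        if goal is None:
            raise FileNotFoundError(path)
        if not execute_goal(goal):
            raise RuntimeError(f"Did not reach the sketched goal in: {path}")

# Hypothetical usage with made-up sketch files:
# run_sketch_sequence(my_execute_goal, ["drawer_open.png", "cutlery_out.png", "table_set.png"])
```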

The system has now been tried and tested and found to be both consistent and accurate, leading the team to consider increasingly complex tasks. It is expected that the same system could be used to unpack and assemble flat-pack furniture, or to place items in packing boxes prior to delivery. As the team expands the imitation learning system, the robot will be able to carry out a greater array of tasks. And because it is relatively simple, the system could be used to train robots in all manner of tasks.

We at Unity Developers are very excited about where this technology is going and are looking forward to trying it out for ourselves. Keep checking back for updates.