Showing posts with label Sergey Levine.

Monday, May 25, 2015

Robots Learn on Their Own Through Trial and Error

 Artificial Intelligence
Robotics researchers have engineered new algorithms that enable robots to learn motor tasks by trial and error, using a process that more closely approximates the way people learn.





Researchers have developed algorithms that enable robots to learn motor tasks through trial and error using a process that more closely approximates the way humans learn, marking a major milestone in the field of artificial intelligence.

They demonstrated their technique, a type of reinforcement learning, by having a robot complete various tasks — putting a clothes hanger on a rack, assembling a toy plane, screwing a cap on a water bottle, and more — without pre-programmed details about its surroundings.

“What we’re reporting on here is a new approach to empowering a robot to learn,” said Professor Pieter Abbeel of UC Berkeley’s Department of Electrical Engineering and Computer Sciences. “The key is that when a robot is faced with something new, we won’t have to reprogram it. The exact same software, which encodes how the robot can learn, was used to allow the robot to learn all the different tasks we gave it.”


This advance was presented at the International Conference on Robotics and Automation (ICRA). Abbeel is leading the project with fellow UC Berkeley faculty member Trevor Darrell, director of the Berkeley Vision and Learning Center. Other members of the research team are postdoctoral researcher Sergey Levine and Ph.D. student Chelsea Finn.

The work is part of a new People and Robots Initiative at UC’s Center for Information Technology Research in the Interest of Society (CITRIS). The new multi-campus, multidisciplinary research initiative seeks to keep the dizzying advances in artificial intelligence, robotics and automation aligned to human needs.


“Most robotic applications are in controlled environments where objects are in predictable positions,” said Darrell. “The challenge of putting robots into real-life settings, like homes or offices, is that those environments are constantly changing. The robot must be able to perceive and adapt to its surroundings.”

Conventional, but impractical, approaches to helping a robot make its way through a 3D world include pre-programming it to handle the vast range of possible scenarios or creating simulated environments within which the robot operates.

The UC Berkeley researchers instead turned to a new branch of artificial intelligence known as deep learning, which is loosely inspired by the neural circuitry of the human brain when it perceives and interacts with the world.

"The challenge of putting robots into real-life settings, like homes or offices, is that those environments are constantly changing. The robot must be able to perceive and adapt to its surroundings."


“For all our versatility, humans are not born with a repertoire of behaviors that can be deployed like a Swiss army knife, and we do not need to be programmed,” said Levine. “Instead, we learn new skills over the course of our life from experience and from other humans. This learning process is so deeply rooted in our nervous system, that we cannot even communicate to another person precisely how the resulting skill should be executed. We can at best hope to offer pointers and guidance as they learn it on their own.”

In the experiments, the UC Berkeley researchers worked with a Willow Garage Personal Robot 2 (PR2), which they nicknamed BRETT, or Berkeley Robot for the Elimination of Tedious Tasks.

They presented BRETT with a series of motor tasks, such as placing blocks into matching openings or stacking Lego blocks. The algorithm controlling BRETT’s learning included a reward function that provided a score based upon how well the robot was doing with the task (see video below).
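
To make the reward function concrete, here is a minimal sketch in Python (with hypothetical inputs, not the Berkeley team's actual code): each moment of a block-insertion task is scored by how close the held block is to the target opening, with a small penalty for wasted effort.

import numpy as np

def insertion_reward(block_pos, target_pos, torques, torque_penalty=1e-3):
    """Score one time step: closer to the target and less effort is better."""
    distance = np.linalg.norm(block_pos - target_pos)     # distance to the opening
    effort = torque_penalty * np.sum(np.square(torques))  # discourage wasted motion
    return -distance - effort                             # higher score = doing better

# A movement that brings the block closer to the opening scores higher.
far  = insertion_reward(np.array([0.40, 0.0, 0.20]), np.zeros(3), np.zeros(7))
near = insertion_reward(np.array([0.10, 0.0, 0.05]), np.zeros(3), np.zeros(7))
assert near > far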

BRETT takes in the scene, including the position of its own arms and hands, as viewed by the camera. The algorithm provides real-time feedback via the score based upon the robot’s movements. Movements that bring the robot closer to completing the task will score higher than those that do not. The score feeds back through the neural net, so the robot can learn which movements are better for the task at hand.

This end-to-end training process underlies the robot’s ability to learn on its own. As the PR2 moves its joints and manipulates objects, the algorithm calculates good values for the 92,000 parameters of the neural net it needs to learn.
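
The toy sketch below illustrates that feedback loop in miniature: a tiny Gaussian policy is nudged toward movements that were followed by higher scores. It is a generic REINFORCE-style update built around a hypothetical env_step callback, with only a handful of parameters rather than 92,000, and is not the team's actual algorithm.

import numpy as np

obs_dim, act_dim, noise = 4, 2, 0.1
W = np.zeros((act_dim, obs_dim))                 # toy policy parameters to learn

def rollout(W, env_step, horizon=50):
    """Run one trial; env_step is a hypothetical callback returning (next_obs, reward)."""
    obs = np.zeros(obs_dim)
    grads, rewards = [], []
    for _ in range(horizon):
        mean = W @ obs
        action = mean + noise * np.random.randn(act_dim)
        # gradient of the Gaussian policy's log-probability with respect to W
        grads.append(np.outer(action - mean, obs) / noise ** 2)
        obs, reward = env_step(obs, action)
        rewards.append(reward)
    return grads, rewards

def update(W, grads, rewards, lr=1e-2):
    """Push the parameters toward movements that were followed by higher scores."""
    returns = np.cumsum(rewards[::-1])[::-1]     # total score that followed each step
    for g, ret in zip(grads, returns):
        W = W + lr * ret * g
    return W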

With this approach, when given the relevant coordinates for the beginning and end of the task, the PR2 could master a typical assignment in about 10 minutes. When the robot is not given the location for the objects in the scene and needs to learn vision and control together, the learning process takes about three hours.

Abbeel says the field will likely see significant improvements as the ability to process vast amounts of data improves.

“With more data, you can start learning more complex things,” he said. “We still have a long way to go before our robots can learn to clean a house or sort laundry, but our initial results indicate that these kinds of deep learning techniques can have a transformative effect in terms of enabling robots to learn complex tasks entirely from scratch. In the next five to 10 years, we may see significant advances in robot learning capabilities through this line of work.”

In the world of artificial intelligence, deep learning programs create “neural nets” in which layers of artificial neurons process overlapping raw sensory data, whether it be sound waves or image pixels. This helps the robot recognize patterns and categories among the data it is receiving. People who use Siri on their iPhones, Google’s speech-to-text program or Google Street View might already have benefited from the significant advances deep learning has provided in speech and vision recognition.
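
The toy example below shows that layered idea in miniature: raw pixel values pass through stacked layers of artificial neurons, each building on the output of the one below, ending in scores over a few made-up categories. The weights are random placeholders, purely illustrative.

import numpy as np

def relu(x):
    return np.maximum(0.0, x)                     # simple artificial-neuron nonlinearity

rng = np.random.default_rng(0)
pixels = rng.random(32 * 32)                      # a flattened 32x32 grayscale image

layer1 = rng.standard_normal((256, 1024)) * 0.01  # low-level feature detectors
layer2 = rng.standard_normal((64, 256)) * 0.01    # mid-level patterns
layer3 = rng.standard_normal((10, 64)) * 0.01     # scores for 10 hypothetical categories

h1 = relu(layer1 @ pixels)
h2 = relu(layer2 @ h1)
scores = layer3 @ h2
print("most likely category:", scores.argmax())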

Applying deep reinforcement learning to motor tasks has been far more challenging, however, since the task goes beyond the passive recognition of images and sounds.

“Moving about in an unstructured 3D environment is a whole different ballgame,” said Finn. “There are no labeled directions, no examples of how to solve the problem in advance. There are no examples of the correct solution like one would have in speech and vision recognition programs.”




SOURCE  UC Berkeley

By 33rd Square

Tuesday, April 28, 2015


 Machine Intelligence
Videos of robots moving very slowly, or sped up for effect, are common. That may become a thing of the past as researchers like Sergey Levine develop neural networks to control robots. In this impressive work, perception and control together allow the robots to operate quickly and in real time.





A remarkable feature of human and animal intelligence is the ability to autonomously acquire new behaviors. Sergey Levine, a researcher at UC Berkeley, is concerned with designing algorithms that aim to bring this ability to robots and simulated characters.

Readers of this website may be familiar with videos of robots like the PR2 above, but may also have noticed that the footage is often sped up to show what is going on; the actual robot moves very slowly.

In the video above and in Levine's lecture below, the researchers have managed to get the robot to perform complicated tasks in real time.

The New Techniques That Will Power Robot Intelligence and Control


This is perhaps some of the most impressive real-world robot motion released to date.


"The reason it is fast is because it is optimized on the real physical system."


"The reason it is fast," says Levin of the robot's motion, "is because it is optimized on the real physical system."

According to Levine, a central challenge in this field is to learn behaviors with representations that are sufficiently general and expressive to handle the wide range of motion skills that are necessary for real-world applications, such as general-purpose household robots.


These representations must also be able to operate on raw, high-dimensional inputs and outputs, such as camera images, joint torques, and muscle activations. In the lecture below, Levine describes a class of guided policy search algorithms that tackle this challenge by transforming the task of learning control policies into a supervised learning problem, with supervision provided by simple, efficient trajectory-centric methods.
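
A schematic sketch of that idea: simple per-instance controllers solve individual versions of a task, and one general-purpose policy is then fit to their actions by ordinary supervised regression. The controller and the fitting step below are stand-ins (Levine's method uses more sophisticated trajectory optimization and a neural network), so treat this as an outline rather than the actual algorithm.

import numpy as np

rng = np.random.default_rng(0)

def trajectory_controller(instance, horizon=20):
    """Hypothetical per-instance controller standing in for a trajectory optimizer;
    returns the (state, action) pairs it would execute for one task instance."""
    states = rng.standard_normal((horizon, 6)) + instance
    actions = -0.5 * states[:, :3]               # placeholder control law
    return states, actions

# 1. Collect supervision from several task instances.
X, Y = [], []
for instance in rng.standard_normal((5, 6)):
    states, actions = trajectory_controller(instance)
    X.append(states)
    Y.append(actions)
X, Y = np.vstack(X), np.vstack(Y)

# 2. Fit one global policy to imitate all of the controllers -- here by least
#    squares in place of neural-network training.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
policy = lambda state: state @ W                 # maps a state to an action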



Levine shows how this approach can be applied to a wide range of tasks, from locomotion and push recovery to robotic manipulation. This includes generalization beyond the exact conditions the robot was trained on.

For instance, the robot is trained to hang a coat hanger on a rack without clothes on the hanger, but is then tested with clothes on the hanger. In another test, the robot learns how to screw a cap onto one bottle and then successfully applies what it learned to other bottles.

The researchers achieved over 50 percent accuracy on the coat hanger task and nearly 90 percent on the bottle exercise.

He also presents new results on using deep convolutional neural networks to directly learn policies that combine visual perception and control, learning the entire mapping from rich visual stimuli to motor torques on a real robot.
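
As a rough illustration of what such a visuomotor policy looks like, the PyTorch sketch below runs a camera image through convolutional layers and combines the resulting features with the robot's joint state to produce motor torques. The layer sizes are arbitrary and the code is illustrative only, not the architecture from the talk.

import torch
import torch.nn as nn

class VisuomotorPolicy(nn.Module):
    def __init__(self, joint_dim=7, torque_dim=7):
        super().__init__()
        self.vision = nn.Sequential(             # camera pixels -> visual features
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.control = nn.Sequential(            # features + joint state -> torques
            nn.Linear(32 * 4 * 4 + joint_dim, 64), nn.ReLU(),
            nn.Linear(64, torque_dim),
        )

    def forward(self, image, joints):
        features = self.vision(image)
        return self.control(torch.cat([features, joints], dim=1))

policy = VisuomotorPolicy()
torques = policy(torch.zeros(1, 3, 64, 64), torch.zeros(1, 7))   # one example call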

Levine concludes his talk by discussing future directions in deep sensorimotor learning and how advances in this emerging field can be applied to a range of other areas, including the potential of combining big data with reinforcement learning to further improve robot control and movement.

Sergey Levine is a postdoctoral researcher working with Professor Pieter Abbeel at UC Berkeley. He completed his PhD in 2014 with Vladlen Koltun at Stanford University. His research focuses on robotics, machine learning, and computer graphics. In his PhD thesis, he developed a novel guided policy search algorithm for learning rich, expressive locomotion policies. In later work, this method enabled learning a range of robotic manipulation tasks, as well as end-to-end training of policies for perception and control. He has also developed algorithms for learning from demonstration, inverse reinforcement learning, and data-driven character animation.

The lecture below is highly technical, covering some of the foundational mathematics behind the robot control algorithms, but at around the 25-minute mark the impressive PR2 robot demonstrations are shown.




SOURCE  University of Washington

By 33rd Square