Showing posts with label ImageNet.

Tuesday, July 5, 2016

Gill Pratt Looks At The Road to Artificial Intelligence


Artificial Intelligence

Once the realm of science fiction, smart machines are rapidly becoming part of our world—and these technologies offer amazing potential to improve the way we live. Imagine intelligent, autonomous vehicles that reduce crashes and alleviate congestion in crowded cities.


Imagine robots that can help your aging grandmother move around safely, or robotic instructors that can assist special-needs children in classrooms. Gill Pratt, former head of the Robotics Challenge at DARPA, now heads up the $1 billion, Silicon Valley-based Toyota Research Institute, where he and his team are pushing the boundaries of human knowledge in autonomous vehicles and robotics.

In this discussion, recorded at the Aspen Ideas Festival, Pratt and New York Times science writer John Markoff explore the breakthrough technologies on the horizon and the unprecedented issues we will face in this brave new world.

The Road to Artificial Intelligence
ATLAS, one of the key competitors at the DARPA Robotics Challenge (DRC)
Despite the development of neural networks and other artificial intelligence systems, Pratt says the biggest instrument of change in the last five years has been the mobile phone. "What does the cell phone have to do with neural networks? Cell phones allow all of us to take lots of pictures... As a result millions of pictures end up every day in the cloud."

One of the results of all these pictures has been the development of ImageNet, a huge database of images cataloged by crowdsourcing. This database has given the decades-old technology of deep convolutional neural networks the training material it needed; these are systems that match inputs with outputs. For instance, a picture of a cow is matched to the word 'cow.'
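
To make that concrete, here is a minimal sketch of input-to-output matching, assuming PyTorch; the tiny architecture, the TinyClassifier name, and the two-class 'cow' example are illustrative stand-ins, not any system Pratt describes.

```python
# Minimal sketch of "matching inputs to outputs" with a small
# convolutional network, assuming PyTorch. Architecture and names
# are illustrative, not anyone's production system.
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    def __init__(self, num_classes=2):  # e.g. 'cow' vs. 'not cow'
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                 # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),         # global average pool
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        h = self.features(x)
        return self.head(h.flatten(1))       # class scores

model = TinyClassifier()
image = torch.randn(1, 3, 32, 32)            # stand-in for a photo
label = model(image).argmax(dim=1)           # picture -> word index
```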

"What has been happening in the last four to five years has been tremendous progress on that idea of matching input to output."
"What has been happening in the last four to five years has been tremendous progress on that idea of matching input to output," states Pratt. It turns out that the systems based on this approach are already about as good or better than humans at this image recognition/object classification task.

Pratt further explores this idea in terms of what it means for our understanding of human cognition. 

Pratt was a program manager in the Defense Sciences Office at the US Defense Advanced Research Projects Agency (DARPA) from 2010 to 2015, a professor and associate dean of faculty affairs and research at Olin College, and an associate professor and director of MIT’s Leg Lab. Pratt’s primary interest is in robotics and intelligent systems, particularly in interfaces that enhance human/machine collaboration and mechanisms for enhanced mobility and manipulation, among others. He holds several patents in series elastic actuation and adaptive control. 

SOURCE  The Aspen Institute


By 33rd Square


Tuesday, February 10, 2015

Microsoft Achieves Beyond-Human-Level Deep Learning Advance

 Deep Learning
A new computer vision system based on deep convolutional neural networks has for the first time eclipsed the abilities of humans to classify objects. The Microsoft researchers claim their system achieved a 4.94 percent error rate on the ImageNet database. Humans tested on the same data averaged 5.1 percent.




Researchers at Microsoft claim their latest deep learning computer vision system can outperform humans in image recognition.

In their paper, "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification," the developers at Microsoft Research Asia say their system achieved a 4.94 percent error rate for the correct classification of images in the 2012 version of the widely recognized ImageNet data set, compared with a 5.1 percent error rate among humans.

The challenge involved identifying objects in the images and then correctly selecting the most accurate categories for the images, out of 1,000 options. Categories included “hatchet,” “geyser,” and “microwave.”
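
For context, ImageNet results like these are conventionally reported as top-5 error: a prediction counts as correct if the true category is among the model's five highest-scoring guesses out of the 1,000. A minimal sketch of the metric, assuming NumPy and random stand-in scores:

```python
# Top-5 error: a prediction is wrong only if the true label is
# absent from the model's five highest-scoring categories.
# Illustrative sketch; real scores would come from the network's
# final layer over the 1,000 ImageNet classes.
import numpy as np

def top5_error(scores, true_labels):
    # scores: (n_images, 1000); true_labels: (n_images,)
    top5 = np.argsort(scores, axis=1)[:, -5:]        # 5 best guesses
    hits = (top5 == true_labels[:, None]).any(axis=1)
    return 1.0 - hits.mean()

scores = np.random.randn(8, 1000)                    # fake model outputs
labels = np.random.randint(0, 1000, size=8)
print(f"top-5 error: {top5_error(scores, labels):.2%}")
```
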
“To the best of our knowledge, our result surpasses for the first time the reported human-level performance on this visual recognition challenge,” Microsoft researchers Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun wrote in the paper.

Deep learning involves training artificial neural networks on lots of information derived from images, audio, and other inputs, and then presenting the systems with new information and receiving inferences about it in response.
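
The "rectifiers" of the paper's title are the Parametric ReLU (PReLU) activations the authors propose, which learn the slope applied to negative inputs rather than fixing it at zero. A minimal sketch of the idea, assuming PyTorch, which also ships an equivalent nn.PReLU module:

```python
# Parametric ReLU: f(x) = x for x > 0, a * x otherwise, where the
# negative slope a is learned per channel instead of fixed at zero.
# Sketch for illustration; PyTorch provides an equivalent nn.PReLU.
import torch
import torch.nn as nn

class PReLU(nn.Module):
    def __init__(self, num_channels, init=0.25):  # 0.25 is the paper's init
        super().__init__()
        self.a = nn.Parameter(torch.full((num_channels,), init))

    def forward(self, x):                         # x: (batch, channels, H, W)
        a = self.a.view(1, -1, 1, 1)
        return torch.where(x > 0, x, a * x)
```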

"To the best of our knowledge, our result surpasses for the first time the reported human-level performance on this visual recognition challenge."


The research builds on the company's other impressive deep learning work, including Project Adam, which was first demonstrated last year.

Along with surpassing human capability, the new system improves on Google's award-winning GoogLeNet, which performed with a 6.66 percent error rate, by a relative 26 percent, the Microsoft researchers claim.

In a bit of modesty, the researchers noted that they don’t feel computer vision trumps human vision.

“While our algorithm produces a superior result on this particular dataset, this does not indicate that machine vision outperforms human vision on object recognition in general,” they wrote. “On recognizing elementary object categories (i.e., common objects or concepts in daily lives) such as the Pascal VOC task, machines still have obvious errors in cases that are trivial for humans. Nevertheless, we believe that our results show the tremendous potential of machine algorithms to match human-level performance on visual recognition.”

There is no word yet from Microsoft on whether this development will be used in Cortana or in the upcoming release of Windows 10.


SOURCE  Microsoft Research

By 33rd Square

Thursday, November 20, 2014

Major Advances in Computer Vision Made

 Computer Vision
An essential element of robotic systems that can navigate on their own will be the ability to see and make sense of the world around them. New advancements in machine learning are greatly extending computer vision.




Computer software only recently became smart enough to recognize objects in photographs. Now, Stanford and Google researchers using machine learning have created a system that takes the next step, writing a simple story of what's happening in any digital image.

"The system can analyze an unknown image and explain it in words and phrases that make sense," said  Fei-Fei Li, a professor of computer science and director of the Stanford Artificial Intelligence Lab.

"This is an important milestone," Li said. "It's the first time we've had a computer vision system that could tell a basic story about an unknown image by identifying discrete objects and also putting them into some context."

"It's the first time we've had a computer vision system that could tell a basic story about an unknown image by identifying discrete objects and also putting them into some context."


The research, which has been published online, details how the team used a novel combination of convolutional neural networks over image regions, bidirectional recurrent neural networks over sentences, and a structured objective that aligns the two modes.
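
The alignment idea can be sketched as a simple compatibility score: each word vector produced by the sentence network is matched against every region vector produced by the image network, and the best match per word is summed. This NumPy sketch uses random stand-in embeddings; the actual model learns both embeddings jointly and trains with a ranking objective over competing image-sentence pairs.

```python
# Image-sentence alignment score in the spirit of the paper's
# structured objective: each word is matched to its best image
# region, and the per-word scores are summed. Embeddings here are
# random stand-ins for learned CNN region and RNN word vectors.
import numpy as np

def alignment_score(regions, words):
    # regions: (n_regions, d) image features; words: (n_words, d) states
    sims = words @ regions.T                       # word-region dot products
    return np.maximum(sims, 0).max(axis=1).sum()   # best region per word

rng = np.random.default_rng(0)
regions = rng.normal(size=(20, 128))               # e.g. 20 detected regions
words = rng.normal(size=(7, 128))                  # e.g. "girl rides horse ..."
print(alignment_score(regions, words))
```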

Humans, Li said, create mental stories that put what we see into context. "Telling a story about a picture turns out to be a core element of human visual intelligence, but so far it has proven very difficult to do this with computer algorithms," she said. Li goes so far as to say that vision was the key factor in the development of intelligence in animals after the Cambrian Explosion.

At the heart of the Stanford system are algorithms that enable the system to improve its accuracy by scanning scene after scene, looking for patterns, and then using the accumulation of previously described scenes to extrapolate what is being depicted in the next unknown image.

"It's almost like the way a baby learns," Li, who is featured in a video below, said.

She and her collaborators, including Andrej Karpathy, a graduate student in computer science, describe their approach in a paper submitted in advance of a forthcoming conference on cutting edge research in the field of computer vision.


Eventually these advances will lead to robotic systems that can navigate unknown situations. In the near term, machine-based systems that can discern the story in a picture will enable people to search photo or video archives and find specific images. The possibilities for computer surveillance are nothing short of chilling.

"Most of the traffic on the Internet is visual data files, and this might as well be dark matter as far as current search tools are concerned," Li said. "Computer vision seeks to illuminate that dark matter."

The new Stanford paper describes two years of effort that flows from research that Li has been pursuing for a decade. Her work builds on advances that have come, slowly at times, over the last 50 years since MIT scientist Seymour Papert convened a "summer project" to create computer vision in 1966.

Conceived during the early days of artificial intelligence, that one-summer timeline proved exceedingly optimistic, as computer scientists struggled to replicate in machines what took millions of years to evolve in living beings. It took researchers 20 years to create systems that could take the relatively simple first step of recognizing discrete objects in photographs.

More recently the emergence of the Internet has helped to propel computer vision. On one hand, the growth of photo and video uploads has created a demand for tools to sort, search and sift visual information. On the other, sophisticated algorithms running on powerful computers have led to electronic systems that can train themselves by performing repetitive tasks, improving as they go.

Computer scientists call this machine learning, and Li likened this to how a child learns soccer by getting out and kicking the ball. A coach might demonstrate how to kick, and comment on the child's technique. But improvement occurs from within as the child's eyes, brain, nerves and muscles make tiny adjustments.

Researchers such as Li are developing ways to create positive feedback loops in machines by inserting mathematical instructions into software. Her latest algorithms incorporate work that her researchers and others have done. This includes training their system on a visual dictionary, using a database of more than 14 million objects.

Each object is described by a mathematical term, or vector, that enables the machine to recognize the shape the next time it is encountered. Those mathematical definitions are linked to the words humans would use to describe the objects, be they cars, carrots, men, mountains or zebras.
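
In its simplest form, linking vectors to words is a nearest-neighbor lookup: a new image's feature vector is compared against one stored vector per category, and the closest label wins. A toy sketch, assuming NumPy, with random vectors standing in for learned features:

```python
# Toy nearest-neighbor labeling: an image's feature vector is
# matched to the closest stored category vector by cosine
# similarity. Vectors are random stand-ins for learned features.
import numpy as np

labels = ["car", "carrot", "man", "mountain", "zebra"]
rng = np.random.default_rng(1)
prototypes = rng.normal(size=(len(labels), 64))  # one vector per category

def classify(feature):
    sims = prototypes @ feature
    sims /= np.linalg.norm(prototypes, axis=1) * np.linalg.norm(feature)
    return labels[int(np.argmax(sims))]

print(classify(rng.normal(size=64)))
```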

Li played a leading role in creating this training tool, the ImageNet project, but her current work goes well beyond memorizing this visual dictionary.

 Her team's new computer vision algorithm trained itself by looking for patterns in a visual dictionary, but this time a dictionary of scenes, a more complicated task than looking just at objects.

 This was a smaller database, made up of tens of thousands of images. Each scene is described in two ways: in mathematical terms that the machine could use to recognize similar scenes and also in a phrase that humans would understand. For instance, one image might be "cat sits on keyboard" while another could be "girl rides on horse in field."

These two databases – one of objects and the other of scenes – served as training material.  Li's machine-learning algorithm analyzed the patterns in these predefined pictures and then applied its analysis to unknown images and used what it had learned to identify individual objects and provide some rudimentary context. In other words, it told a simple story about the image.



SOURCE  Stanford University

By 33rd Square

Tuesday, September 9, 2014

Google Explains Its Award-Winning Image Detection System

 Computer Vision
Google has explained their new award-winning image detection system that can identify multiple objects in a scene, even if they're partly obscured. The key is a neural network that can rapidly refine the criteria it's looking for without requiring a lot of extra computing power.




During this year's annual ImageNet computer vision competition, the winning techniques continued the field's rapid progress, blowing last year's entries out of the water.

John Markoff of the New York Times recently published a piece on the competition and some of those improvements.

“We see innovation and creativity exploding,” said Fei-Fei Li, the director of the Stanford Artificial Intelligence Laboratory and one of the creators of a vast set of labeled digital images that is the basis for the contest. “The algorithms are more complex and they are just more interesting.”

"These technological advances will enable even better image understanding on our side and the progress is directly transferable to Google products such as photo search, image search, YouTube, self-driving cars, and any place where it is useful to understand what is in an image as well as where things are."


In the five years that the contest has been held, the organizers have twice, once in 2012 and again this year, seen striking improvements in accuracy, accompanied by more sophisticated algorithms along with larger and faster computers.

Now, Google has published a blog post explaining some of their techniques, including deep learning networks. The team of researchers used these methods to win in a few different categories at the competition.

The deeper scanning system Google used can both identify more objects and make better guesses, correctly labeling items in a living room and, in one example, a jumping cat.
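
Google's winning entry, known as GoogLeNet, is built from "Inception" modules: parallel 1x1, 3x3, and 5x5 convolutions whose outputs are concatenated, with cheap 1x1 convolutions shrinking channel counts before the expensive filters, which is how the network refines what it looks for without a large compute bill. A simplified sketch, assuming PyTorch; the filter counts here are illustrative rather than GoogLeNet's actual configuration:

```python
# Simplified Inception module in the GoogLeNet style: parallel
# 1x1, 3x3, and 5x5 convolutions plus pooling, concatenated along
# the channel axis. The 1x1 "bottleneck" convs cut channel counts
# before the costly 3x3/5x5 filters. Filter counts are illustrative.
import torch
import torch.nn as nn

class Inception(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, 16, 1)                        # 1x1 branch
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, 8, 1), nn.ReLU(),
                                nn.Conv2d(8, 24, 3, padding=1))  # bottleneck + 3x3
        self.b5 = nn.Sequential(nn.Conv2d(in_ch, 4, 1), nn.ReLU(),
                                nn.Conv2d(4, 8, 5, padding=2))   # bottleneck + 5x5
        self.bp = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_ch, 8, 1))          # pool branch

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x),
                          self.b5(x), self.bp(x)], dim=1)        # 56 channels out

x = torch.randn(1, 32, 28, 28)
print(Inception(32)(x).shape)                                    # (1, 56, 28, 28)
```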

Despite the incredible increases in computer vision accuracy, the systems still cannot match human vision, according to the researchers, and much progress remains before they equal a human looking at an image.


According to the post, "These technological advances will enable even better image understanding on our side and the progress is directly transferable to Google products such as photo search, image search, YouTube, self-driving cars, and any place where it is useful to understand what is in an image as well as where things are."


SOURCE  Google Research

By 33rd Square