
Image recognition: A job for smart software or an average human

This article appears in the issue October 2011, [Vol 20 Issue 9]
Page 1 of 2 next >>



Can an enterprise use software to figure out what a digital image or a video is "about"? (About, as I am using the word, means looking at a snapshot of a farm and recognizing the pigs, the cows and the chickens.)

Visualize your office building monitored by surveillance cameras. Instead of a human security guard watching for an intrusion, software "watches" the digital video and makes a decision about a specific individual attempting to enter the building. The image recognition system plucks a person's face from the real-time video stream, matches it to a database and determines whether he or she is a vice president or a stranger without access permission. The system "recognizes" the executive and unlocks the door.
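The match-and-decide step in that scenario can be sketched in a few lines. This is a minimal illustration, not how any commercial product works: it assumes some upstream component has already reduced each face to a short list of numbers, and the names `identify` and `enrolled`, the sample vectors and the 0.95 threshold are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def identify(face_vector, enrolled, threshold=0.95):
    """Return the best-matching enrolled name, or None if nobody clears the threshold."""
    best_name, best_score = None, threshold
    for name, reference in enrolled.items():
        score = cosine_similarity(face_vector, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical enrolled staff; real systems use far longer vectors.
enrolled = {
    "vice_president": [0.9, 0.1, 0.4],
    "receptionist": [0.2, 0.8, 0.5],
}

executive = identify([0.88, 0.12, 0.41], enrolled)  # close to the vice president's vector
stranger = identify([0.1, 0.2, 0.95], enrolled)     # close to nobody on file

# The door unlocks only when a match is found.
door_unlocked = executive is not None
```

The hard part is not this comparison but producing vectors that stay stable across lighting, angle and age, which is where the reliability problem lies.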

For many years, security professionals have funded, tested and tweaked commercial systems to make image recognition of faces a reliable reality, not a science fiction fantasy. Alas, software sufficiently "smart" to figure out the identity of an individual or to determine the "aboutness" of a digital image is a pot of gold long sought after but not yet found.

Google and celebrity facial recognition

But advances are being made. In May 2011, Google's patent application "Automatically Mining Person Models of Celebrities for Visual Search Applications" set off a flurry of commentary on blogs and in mainstream publications like Forbes. Patent application US20110116690 was being downloaded at the same time that Google's chairman, Eric Schmidt, was explaining that image recognition was "too creepy." (See "Facial Recognition: Google Chairman Warns US Govt", May 20, 2011, at http://goo.gl/DPOuj.)

When I want some insight into next-generation search technology, I navigate to Google Research's Publications by Googlers at http://research.google.com/pubs/papers.html. Although not a comprehensive archive, the technical papers provide a useful glimpse into search technologies from some of the world's most sophisticated engineers and scientists. In the category Audio, Video and Image Processing, there were more than 100 technical papers, last I checked.

One research report suggested that Google's experts were testing a taxonomy with more than 1,000 categories. The idea was to use "smart software" to figure out what a video is "about." To me, the Google method echoes Autonomy's (autonomy.com) approach, and it demonstrates that Google algorithms can categorize video without metadata at an acceptable level of accuracy.

A 2009 article indicated that Google is working to figure out the "what" in imagery. Yet another report suggested that Google has powerful image functionality that remains, for now, on the sidelines. Is that due to a decision dictated by financial, legal or technical factors? There is scant information about Google's plans for its image recognition technology. What is clear is that Google has invested time and effort in figuring out the content of static images and digital video. When Google does move, the impact on the market could be significant due to its near-monopolistic control of search and retrieval.

Current examples of what's available

My view is that Google's consumer image search is useful, probably as good as, if not better than, comparable systems from Bing, TinEye and Flickr.

I prefer the image search function of Exalead (a division of Dassault Systèmes), which returns relevant images without the malware-attracting iFrames used by Google. What few of my colleagues in the field of enterprise search know is that Exalead's system has for several years offered image search features only now becoming available on Google. For example, it automatically recognizes an image suitable for desktop wallpaper and displays a hot link to it. Exalead's portrait or landscape option has been available for a long time, and the company has also pushed ahead in video search.

Autonomy also offers image and video search systems. Other vendors include such companies as OpenText's Nstein unit, which uses technology from Imprezzeo. Nstein employs content-based image retrieval and facial recognition. Its system has been tailored to the needs of those engaged in publishing. The user inputs or identifies a sample image. The system then displays matches. With some clicking, the result set can be narrowed to the image the user requires. Nstein provides a software development kit for the system.

A firm called IQ Engines offers an image recognition system that performs "computer vision search." You upload an image to the system. After a minute of processing, the system either displays matches or reports that the image was not in the database.

Kooaba is a visual recognition startup. The company offers a photo management system for licensees and an iPhone application. The user takes a picture of an object and uploads it to Kooaba. The system then "finds" similar images.

A key point is that these systems are using metadata like the date, time, file type and user-generated description of an image. Algorithms create a "fingerprint" for color, shapes and other discernible characteristics. If an image appears in a PowerPoint, the name of the PowerPoint "author" may be attached to the digital object. These systems are not figuring out whether the image is a prize-winning heifer or a Volkswagen Jetta.
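To make the "fingerprint" idea concrete, here is a minimal sketch of one common approach, a coarse color histogram compared by histogram intersection. It is an assumption for illustration, not the method any vendor named here is known to use; the function names and the tiny hand-made "images" (lists of RGB pixels) are hypothetical.

```python
from collections import Counter

def color_fingerprint(pixels, levels=4):
    """Quantize each RGB channel into `levels` buckets and return bucket frequencies."""
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: n / total for bucket, n in counts.items()}

def similarity(fp_a, fp_b):
    """Histogram intersection: 1.0 for identical color mixes, 0.0 for no shared colors."""
    return sum(min(share, fp_b.get(bucket, 0.0)) for bucket, share in fp_a.items())

# Hypothetical toy images: each is just a list of (red, green, blue) pixels.
heifer_in_field = [(120, 80, 40)] * 6 + [(30, 140, 30)] * 4      # brown animal, green grass
brown_jetta_on_lawn = [(110, 75, 35)] * 6 + [(40, 130, 40)] * 4  # brown car, green lawn
red_jetta_on_road = [(200, 20, 20)] * 7 + [(90, 90, 90)] * 3     # red car, gray asphalt

heifer_fp = color_fingerprint(heifer_in_field)
same_colors = similarity(heifer_fp, color_fingerprint(brown_jetta_on_lawn))
different_colors = similarity(heifer_fp, color_fingerprint(red_jetta_on_road))
```

In this toy data the brown car on a lawn matches the heifer perfectly, while the red car on a road does not match at all. The fingerprint compares color mixes; it has no notion of cow or car.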

Image recognition applications

Confusion about image search, image recognition and image systems is flourishing. One reason is the failure to distinguish between the different applications to which image recognition can be applied.

Certain types of image processing work well, are well understood and have a measurable impact. A good example is the machine vision sector of image recognition. Cognex is one of the leaders in machine vision. The company's products make it possible to process barcodes for inventory control. Its technology can "look at" a stream of manufactured components and "see" those with defects. You may want to check out Orpix Computer Vision, Pattern Recognition Company and Microscan, among others.

Cognex, despite the soft economy, reported record revenue in its first quarter of 2011. The firm seems likely to push beyond $300 million in revenues. One indication of the strength of this company is its cash position. In May 2011, the firm had a war chest of more than $300 million in cash and investments. At a time when traditional enterprise search vendors are struggling to stay afloat or tap investors for additional cash, Cognex is flying high.
