Image recognition: A job for smart software or an average human
There are some important differences between the image recognition needs in the markets served by Cognex and the image recognition needs of marketing, sales and business development people. A Cognex machine vision solution can be focused on a well-defined domain, often with specific attributes or "tells." A defective chip, for example, may exhibit a different refractive index or a discernible color variation. The technology to recognize a defect in a production line setting is extremely sophisticated, and the return on investment can be calculated. Even at competitive labor rates, machine vision can pay for itself by delivering greater speed and accuracy at a lower cost than manual methods.
In marketing and sales, however, the person putting together a slide presentation needs an image of a product (relatively easy to find if there is metadata attached to the available pictures), or an image to show an intangible quality such as vigor (relatively hard to find, even if someone has indexed an in-house image collection). Several vendors offer image management systems based on metadata provided by the camera or by a human indexer. One can use the InMagic (inmagic.com) system for image retrieval. Clever system administrators can make a traditional database like Oracle (oracle.com) or SQL Server (microsoft.com/sqlserver) provide access to images.
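To make the metadata point concrete, here is a minimal sketch of keyword lookup over an indexed image collection, using Python's built-in sqlite3 module as a stand-in for a traditional database. The table layout, file paths, captions and keywords are all hypothetical, invented for illustration; this is not how InMagic, Oracle or SQL Server organize images. Note that the search for "vigor" succeeds only because a human indexer supplied that keyword.

```python
import sqlite3


def build_catalog(records):
    """Create an in-memory catalog; each record is (path, caption, keywords)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE images (path TEXT, caption TEXT, keywords TEXT)")
    conn.executemany("INSERT INTO images VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


def find_images(conn, term):
    """Return paths whose caption or keywords contain the term, ignoring case."""
    pattern = f"%{term.lower()}%"
    rows = conn.execute(
        "SELECT path FROM images "
        "WHERE lower(caption) LIKE ? OR lower(keywords) LIKE ? "
        "ORDER BY path",
        (pattern, pattern),
    )
    return [row[0] for row in rows]


# Hypothetical sample records; the keywords column is what a human indexer adds.
catalog = build_catalog([
    ("products/widget_front.jpg", "Widget, front view", "product widget"),
    ("stock/runner_sunrise.jpg", "Runner at sunrise", "vigor energy outdoors"),
    ("stock/handshake.jpg", "Two people shaking hands", "partnership trust"),
])

print(find_images(catalog, "widget"))
print(find_images(catalog, "vigor"))
```

The product query matches on the caption alone, while the query for an intangible quality depends entirely on the indexer's keyword. Without that entry, the runner photograph would be invisible to the search.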
But for larger collections of digital images (what used to be called 35-mm slide collections), one needs a specialized digital asset management (DAM) system from such vendors as Adobe, Canto or Microsoft iView, among others. Those systems offer version management and support for different file types such as Adobe Photoshop, PDF, TIFF and vector drawing files. The systems include access controls, essential if an organization is doing work for certain government agencies. They focus on reducing bottlenecks in workflows.
Even with fancy systems, the amount of time required to find a specific image or a specific segment of digital video is indeterminate. Exalead's video search system does allow the user to view a video at the point at which the query matches the content of a digital video.
And what about video?
Video can pose some additional challenges. Digital video is an unwieldy beast with an appetite for storage and a generous side dish of bandwidth. One company that has received accolades from industry groups and analysts is Altus, whose flagship product is vSearch. The company offers on-demand rich media solutions for a range of enterprise applications. The system can be used for knowledge sharing within an organization, a sales enabling service, an educational service or a system to deliver video from a conference with multiple, simultaneous presentations.
Altus has positioned itself as providing a service that "transforms enterprise video into a valuable asset for any organization. vSearch creates a cloud-based learning environment that combines enterprise video with PowerPoint slide synchronization and scrolling transcripts into an accessible video content archive that is searchable down to the spoken word or specific point of interest. Content can be viewed as streaming media or on-demand presentations from any computer, tablet or smart phone – allowing instant access to knowledge anytime or anywhere." The Altus approach is to deliver video search as software as a service (SaaS).
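The idea of a video archive that is "searchable down to the spoken word" can be illustrated with a small sketch: if each transcript segment carries a start time, a text match can be turned into a playback position. The segment data and the function name find_moments below are hypothetical, and the matching is plain word lookup. It is not a description of how vSearch or Exalead's system is built.

```python
import re


def find_moments(segments, query):
    """Return start times (in seconds) of segments containing every query word."""
    wanted = set(re.findall(r"[a-z0-9']+", query.lower()))
    hits = []
    for start, text in segments:
        spoken = set(re.findall(r"[a-z0-9']+", text.lower()))
        # A segment matches only if all query words were spoken in it.
        if wanted and wanted <= spoken:
            hits.append(start)
    return hits


# Hypothetical timed transcript: (start time in seconds, spoken text).
segments = [
    (0.0, "Welcome to the quarterly sales briefing."),
    (42.5, "Our new pricing model starts in March."),
    (97.0, "Questions about pricing go to the regional team."),
]

print(find_moments(segments, "pricing"))
print(find_moments(segments, "pricing model"))
```

A player could then jump to each returned offset. The search operates on the words in the transcript, not on what appears in the frames themselves.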
Still, the question that interests me is, "Are these systems from sophisticated technology companies able to look at an image or a frame in the video and 'figure out' what the picture represents?" The sci-fi version of image recognition is out of reach. The meaning of a picture depends on a context that, at this time, requires a human to discern. For now, humans still have a role to play in finding just the right image for any given situation. We will not see the end of that good old-fashioned function called indexing for rich media for at least a few years.